# Fio (POSIX) Benchmark

## Fio Tests Overview

Fio (Flexible I/O Tester) is a powerful open-source tool for benchmarking and stress-testing storage systems. It supports a variety of I/O patterns, including sequential and random reads/writes, and allows highly customizable workloads. Fio is cross-platform, running on Linux, Windows, and macOS, and reports detailed performance metrics such as IOPS, bandwidth, and latency.

This page presents fio test results for Alluxio. The same test cases can be applied to other storage systems.

## Results summary

### Single Worker & Client Throughput (256k block size)

| Bandwidth/Threads | Single Thread | 32 Threads |
| ----------------- | ------------- | ---------- |
| Sequential Read   | 2182 MB/s     | 8580 MB/s  |
| Random Read       | 148 MB/s      | 7869 MB/s  |

### Single Worker & Client IOPS (4k block size)

| IOPS/Threads    | Single Thread | 32 Threads | 128 Threads |
| --------------- | ------------- | ---------- | ----------- |
| Sequential Read | 55.9k         | 244k       | 179k        |
| Random Read     | 1.6k          | 70.1k      | 162k        |

\* All numbers above measure reads of data already cached in Alluxio.

## Test details

### Test environments

All instances are in the same AWS availability zone.

Alluxio Worker

* 1 [i3en.metal](https://aws.amazon.com/ec2/instance-types/i3en/) instance
* RAID 0 across 8 NVMe SSDs (created with the `mdadm` command)
* 100Gbps network bandwidth
* Ubuntu 24.04
* An Alluxio worker process
* A single etcd node
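
The worker's RAID 0 array can be assembled with `mdadm` roughly as follows (a sketch: the NVMe device names and the `/data1` mount point are assumptions; check `lsblk` on the actual instance):

```shell
# Assumed device names for the 8 NVMe instance-store SSDs on i3en.metal;
# verify the actual names with `lsblk` before running.
sudo mdadm --create /dev/md0 --level=0 --raid-devices=8 \
    /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 \
    /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1

# Format the array and mount it where the worker page store
# directory (/data1/worker, see the Appendix) expects it.
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /data1
sudo mount /dev/md0 /data1
```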

Alluxio Client

* 1 [c5n.metal](https://aws.amazon.com/ec2/instance-types/c5/) instance
* 100Gbps network bandwidth
* Ubuntu 24.04
* Fuse 3.16.2
* An Alluxio FUSE process

### Single Worker & Client Test

This scenario tests the read performance against a single 100GB large file. Only 1 client and 1 worker are involved in the test.

#### Installing fio

Fio can be installed through the system package manager, e.g. `sudo apt install fio` on Ubuntu or `sudo yum install fio` on RHEL-based distributions. Source releases and alternative downloads can be found on its [GitHub repository](https://github.com/axboe/fio).

#### Test preparation

Place a single 100GB file in the UFS. In this benchmark, we use an S3 bucket in the same region as the workers and clients.
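
One way to generate and upload the test file is sketched below. The bucket name is a placeholder, and the AWS CLI is assumed to be installed and configured:

```shell
# Generate a 100 GiB file of random data (102400 x 1 MiB blocks).
dd if=/dev/urandom of=100gb bs=1M count=102400

# Upload it to the S3 bucket that backs the UFS (bucket name is a placeholder).
aws s3 cp 100gb s3://<your-ufs-bucket>/100gb
```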

#### Sequential read

Run the following commands on the Alluxio client:

```bash
fio --iodepth=1 --rw=read --ioengine=libaio --bs=<block_size> --numjobs=<numjobs> --group_reporting --size=100G --filename=/mnt/alluxio/100gb --name=read_test --readonly --direct=1 --runtime=60
```

The `numjobs` param specifies the number of concurrent fio jobs performing reads; in this benchmark it is set to `1`, `32`, or `128`. The `bs` param specifies the block size used in the test: `256k` for throughput testing and `4k` for IOPS testing.
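
As a quick sanity check relating the IOPS and bandwidth tables above, throughput is roughly IOPS times block size. For example, 244k IOPS at a 4k block size corresponds to roughly 1 GB/s:

```shell
# bandwidth (MB/s) ~= IOPS * block size (bytes) / 1e6
# 244k IOPS at 4 KiB blocks:
echo $(( 244000 * 4096 / 1000000 ))   # prints 999, i.e. ~1 GB/s
```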

#### Random read

Same as the sequential read test, but with the `rw` param changed to `randread`.

```bash
fio --iodepth=1 --rw=randread --ioengine=libaio --bs=<block_size> --numjobs=<numjobs> --group_reporting --size=100G --filename=/mnt/alluxio/100gb --name=read_test --readonly --direct=1 --runtime=60
```

## Appendix - Alluxio Configurations

### Cluster configuration (`alluxio-site.properties`)

```
alluxio.master.hostname=localhost
alluxio.master.journal.type=NOOP
alluxio.security.authorization.permission.enabled=false
alluxio.worker.membership.manager.type=ETCD
alluxio.mount.table.source=ETCD
alluxio.etcd.endpoints=<endpoints>
alluxio.client.list.status.from.ufs.enabled=false
alluxio.worker.page.store.sizes=2TB
alluxio.worker.page.store.page.size=4M
alluxio.worker.page.store.dirs=/data1/worker
alluxio.user.metadata.cache.max.size=2000000
alluxio.dora.client.ufs.fallback.enabled=false
alluxio.user.position.reader.streaming.async.prefetch.thread=256
```

### JVM options (`alluxio-env.sh`)

```
ALLUXIO_WORKER_JAVA_OPTS="$ALLUXIO_WORKER_JAVA_OPTS -Xms24G -Xmx24G -XX:+UseG1GC"
ALLUXIO_FUSE_JAVA_OPTS="$ALLUXIO_FUSE_JAVA_OPTS -Xms48G -Xmx48G -XX:MaxDirectMemorySize=24g -XX:+UseG1GC"
```

### Fuse mount options

```
-max_background=256 -max_idle_threads=256 -entry_timeout=60 -attr_timeout=60
```
