# Fio (POSIX) Benchmark

## Fio Tests Overview

Fio (Flexible I/O Tester) is a powerful open-source tool for benchmarking and testing the performance of storage systems. It supports a variety of I/O operations, including sequential and random reads and writes, and allows highly customizable workloads. Fio is cross-platform, running on Linux, Windows, and macOS, and reports detailed performance metrics such as IOPS, bandwidth, and latency.

This page presents fio test results for Alluxio. The same test cases can be applied to other storage systems.

## Results summary

### Single Worker & Client Throughput (256k block size)

| Bandwidth/Threads | Single Thread | 32 Threads |
| ----------------- | ------------- | ---------- |
| Sequential Read   | 2182 MB/s     | 8580 MB/s  |
| Random Read       | 148 MB/s      | 7869 MB/s  |

### Single Worker & Client IOPS (4k block size)

| IOPS/Threads    | Single Thread | 32 Threads | 128 Threads |
| --------------- | ------------- | ---------- | ----------- |
| Sequential Read | 55.9k         | 244k       | 179k        |
| Random Read     | 1.6k          | 70.1k      | 162k        |

* All results are for cached data.
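The IOPS figures can be put on the same scale as the bandwidth table by multiplying them by the block size. The sketch below does that arithmetic for a few of the 4k results, assuming `4k` means 4096 bytes and 1 MB means 1,000,000 bytes (this page does not state which unit convention was used). `iops_to_mbps` is a hypothetical helper name.

```bash
# Convert an IOPS figure to MB/s: bandwidth = IOPS * block size.
# Assumes decimal megabytes (1 MB = 1,000,000 bytes).
iops_to_mbps() {
  awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1000000 }'
}

iops_to_mbps 55900 4096    # sequential read, single thread
iops_to_mbps 244000 4096   # sequential read, 32 threads
iops_to_mbps 162000 4096   # random read, 128 threads
```

Under these assumptions the 32-thread sequential result works out to roughly 1000 MB/s, well below the 8580 MB/s measured with 256k blocks, which suggests the 4k runs are limited by per-request cost rather than raw bandwidth.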

## Test details

### Test environments

All instances are in the same AWS availability zone.

Alluxio Worker

* 1 [i3en.metal](https://aws.amazon.com/ec2/instance-types/i3en/) instance
* RAID 0 of 8 NVMe SSDs (created with `mdadm`)
* 100Gbps network bandwidth
* Ubuntu 24.04
* An Alluxio worker process
* A single ETCD node
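A RAID 0 array like the one on the worker could be assembled roughly as follows. This is a minimal sketch, not the exact procedure used for this benchmark: the device names, the `ext4` filesystem, and the `/data1` mount point are assumptions (check the real device names with `lsblk`), and `run` is a hypothetical helper that only prints each command unless `APPLY=1` is set.

```bash
# Print each command by default; execute it with sudo only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    sudo "$@"
  else
    echo "+ $*"
  fi
}

# Stripe the 8 NVMe instance-store devices into one RAID 0 array.
# Device names are assumptions and differ between instances.
run mdadm --create /dev/md0 --level=0 --raid-devices=8 /dev/nvme{1..8}n1

# Create a filesystem and mount it (ext4 and /data1 are assumptions).
run mkfs.ext4 /dev/md0
run mkdir -p /data1
run mount /dev/md0 /data1
```

Running the script as-is prints the commands for review; rerun it with `APPLY=1` once the device list matches the machine.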

Alluxio Client

* 1 [c5n.metal](https://aws.amazon.com/ec2/instance-types/c5/) instance
* 100Gbps network bandwidth
* Ubuntu 24.04
* FUSE 3.16.2
* An Alluxio FUSE process

### Single Worker & Client Test

This scenario tests read performance against a single large 100GB file. Only one client and one worker are involved in the test.

#### Installing fio

Fio can be installed with `yum` on RPM-based Linux distributions (e.g. `sudo yum install fio`) or with `apt` on Debian-based distributions such as Ubuntu (e.g. `sudo apt install fio`). Alternative download locations are listed in its [GitHub repository](https://github.com/axboe/fio).

#### Test preparation

Place a single 100GB file in the UFS. In this benchmark, we use an S3 bucket in the same region as the worker and client.

#### Sequential read

Run the following command on the Alluxio client node with the FUSE mount:

```bash
fio --iodepth=1 --rw=read --ioengine=libaio --bs=<block_size> --numjobs=<numjobs> --group_reporting --size=100G --filename=/mnt/alluxio/100gb --name=read_test --readonly --direct=1 --runtime=60
```

`filename` should point to a file under the FUSE mount point. `numjobs` sets the number of concurrent fio read jobs; it can be `1`, `32`, or `128`. `bs` sets the block size used in the test; we use `256k` for throughput tests and `4k` for IOPS tests.

#### Random read

Same as the sequential read, but with the `rw` parameter changed from `read` to `randread`:

```bash
fio --iodepth=1 --rw=randread --ioengine=libaio --bs=<block_size> --numjobs=<numjobs> --group_reporting --size=100G --filename=/mnt/alluxio/100gb --name=read_test --readonly --direct=1 --runtime=60
```
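To cover every combination of access pattern, block size, and job count without editing the command by hand, the two commands above can be wrapped in a loop. The following is a minimal sketch: `fio_cmd`, the `TARGET` variable, and the `RUN` switch are illustrative names rather than part of fio or Alluxio, and the loop only prints the commands unless `RUN=1` is set.

```bash
# File under the FUSE mount to read from (assumed default; override as needed).
TARGET="${TARGET:-/mnt/alluxio/100gb}"

# Print the fio command line for one access pattern, block size, and job count.
fio_cmd() {
  local rw="$1" bs="$2" jobs="$3"
  echo "fio --iodepth=1 --rw=${rw} --ioengine=libaio --bs=${bs}" \
       "--numjobs=${jobs} --group_reporting --size=100G" \
       "--filename=${TARGET} --name=read_test --readonly --direct=1 --runtime=60"
}

for rw in read randread; do
  for bs in 256k 4k; do
    for jobs in 1 32 128; do
      cmd="$(fio_cmd "$rw" "$bs" "$jobs")"
      if [ "${RUN:-0}" = "1" ]; then
        # Intentionally unquoted so the string splits into arguments;
        # this is safe only while no argument contains whitespace.
        $cmd
      else
        echo "$cmd"   # dry run: show what would be executed
      fi
    done
  done
done
```

With `RUN=1` each combination runs for up to 60 seconds, so the full sweep of 12 runs takes about 12 minutes.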

## Appendix - Alluxio Configurations

### Cluster configuration (`alluxio-site.properties`)

```
alluxio.master.hostname=localhost
alluxio.master.journal.type=NOOP
alluxio.security.authorization.permission.enabled=false
alluxio.worker.membership.manager.type=ETCD
alluxio.mount.table.source=ETCD
alluxio.etcd.endpoints=<endpoints>
alluxio.client.list.status.from.ufs.enabled=false
alluxio.worker.page.store.sizes=2TB
alluxio.worker.page.store.page.size=4M
alluxio.worker.page.store.dirs=/data1/worker
alluxio.user.metadata.cache.max.size=2000000
alluxio.dora.client.ufs.fallback.enabled=false
alluxio.user.position.reader.streaming.async.prefetch.thread=256
```

### JVM options (`alluxio-env.sh`)

```
ALLUXIO_WORKER_JAVA_OPTS="$ALLUXIO_WORKER_JAVA_OPTS -Xms24G -Xmx24G -XX:+UseG1GC"
ALLUXIO_FUSE_JAVA_OPTS="$ALLUXIO_FUSE_JAVA_OPTS -Xms48G -Xmx48G -XX:MaxDirectMemorySize=24g -XX:+UseG1GC"
```

### FUSE mount options

```
-max_background=256 -max_idle_threads=256 -entry_timeout=60 -attr_timeout=60
```

