# FIO

The [Flexible I/O tester](https://fio.readthedocs.io/en/latest/fio_doc.html), or fio, is a microbenchmarking tool for testing the performance of storage systems. It lets the user describe a repeatable workload through parameters such as total file size, block size, read type (sequential or random), and number of concurrent clients. For an example of how others use the same benchmark, see [this blog from Nvidia](https://developer.nvidia.com/blog/storage-performance-basics-for-deep-learning/).

{% hint style="info" %}
If this is your first time running this demo, watch the following tutorial video.
{% endhint %}

{% embed url="https://www.youtube.com/watch?v=StErjysOCeE" %}

## Setup

This demo cluster recreates the testing environment and benchmark results described in the [fio tests documentation page](https://docs.alluxio.io/ee-ai/user/stable/en/performance/Fio-Tests.html). Launch this demo cluster by selecting the `fioLargeFileDemo` cluster type; deployment takes about 25 minutes.

An Alluxio worker is deployed on an [i3en.metal EC2 instance](https://aws.amazon.com/ec2/instance-types/i3en/), with its 8 NVMe SSD storage devices mounted as a single RAID 0 device. An Alluxio FUSE client is deployed on a [c5n.metal EC2 instance](https://aws.amazon.com/ec2/instance-types/c5/), mounting the path `/mnt/alluxio` to the Alluxio filesystem root. An S3 bucket is created to serve as Alluxio's UFS, mounted on the `/s3` path in the Alluxio filesystem. The Alluxio master and a single etcd node reside on a [t3.large EC2 instance](https://aws.amazon.com/ec2/instance-types/t3/).

To prepare for benchmarking read performance, a single 20GB file is written to the S3 bucket mounted to Alluxio, at `s3://<bucketName>/single_20G/20G`. From the client node where the FUSE process is running, this file can be accessed at the path `/mnt/alluxio/s3/single_20G/20G`.
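The path mapping described above can be sketched in a few lines of Python. This is only an illustration of how the two mount points compose, assuming the `/s3` and `/mnt/alluxio` mounts from this setup; the function name and the bucket name `my-demo-bucket` are hypothetical and not part of Alluxio.

```python
from pathlib import PurePosixPath

# Mount points taken from the setup described on this page.
FUSE_MOUNT = PurePosixPath("/mnt/alluxio")  # FUSE mount of the Alluxio filesystem root
ALLUXIO_S3_MOUNT = PurePosixPath("/s3")     # where the S3 bucket is mounted inside Alluxio


def s3_uri_to_fuse_path(uri: str, bucket: str) -> str:
    """Translate an object URI in the mounted bucket to its FUSE client path.

    Hypothetical helper for illustration only.
    """
    prefix = f"s3://{bucket}/"
    if not uri.startswith(prefix):
        raise ValueError(f"{uri!r} is not inside bucket {bucket!r}")
    key = uri[len(prefix):]
    # Path inside the Alluxio namespace, e.g. /s3/single_20G/20G
    alluxio_path = ALLUXIO_S3_MOUNT / key
    # Re-root the Alluxio path under the FUSE mount point.
    return str(FUSE_MOUNT / alluxio_path.relative_to("/"))


# "my-demo-bucket" stands in for <bucketName>.
print(s3_uri_to_fuse_path("s3://my-demo-bucket/single_20G/20G", "my-demo-bucket"))
```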

## Running the benchmark

The three main steps we will run in this experiment are:

1. A cold read run, as a baseline comparison
2. A hot read run, demonstrating the performance when leveraging Alluxio
3. A cache eviction, resetting the cluster state for a new comparison

The key comparison is between the cold read performance of reading the large file from S3, before any data is cached in Alluxio, and the hot read performance of the same operation once the file is cached in Alluxio. Note that the data is cached into Alluxio as a result of the cold read; no separate data load step is required.

Most of these steps will be executed by [running a job](https://documentation.alluxio.io/rad/get-started/running-jobs). All jobs are initiated from the Jobs link on the left navigation bar, following the sequence of selecting the cluster, selecting the job type, and setting parameters for the job.

### Cold read

Run the [fio benchmark job](https://documentation.alluxio.io/rad/running-jobs/fio-benchmark#fiobenchmark). The default parameters used are a block size of 256K and 32 jobs. If you change any of the parameters, note their values so that they can be reused for the subsequent run in order to have comparable results. Take note of the highlighted bandwidth and duration of the benchmark.
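For readers who want a sense of what these parameters correspond to on the fio command line, the sketch below assembles a roughly equivalent invocation. The exact command the benchmark job runs is not shown on this page, so treat this as an assumption: the job name `alluxio_read` and the helper function are hypothetical, while the flags themselves (`--filename`, `--rw`, `--bs`, `--numjobs`, `--readonly`, `--group_reporting`) are standard fio options. The script only prints the command; it does not run fio.

```python
import shlex


def build_fio_read_command(filename, block_size="256k", num_jobs=32, random_read=False):
    """Assemble a read-only fio command line (hypothetical helper, not the job's actual command)."""
    rw = "randread" if random_read else "read"  # fio names for random vs sequential reads
    return [
        "fio",
        "--name=alluxio_read",      # hypothetical job name
        f"--filename={filename}",   # existing file to read through the FUSE mount
        f"--rw={rw}",
        f"--bs={block_size}",       # default block size on this page: 256K
        f"--numjobs={num_jobs}",    # default number of jobs on this page: 32
        "--readonly",               # guard against accidental writes
        "--group_reporting",        # aggregate the per-job statistics
    ]


cmd = build_fio_read_command("/mnt/alluxio/s3/single_20G/20G")
print(shlex.join(cmd))
```

Keeping the parameters in one place like this makes it easy to reuse identical values for the cold and hot runs.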

At the end of the first benchmark run, navigate to the [metrics dashboard](https://documentation.alluxio.io/rad/deploy/view-cluster-details#metrics-dashboard) of the cluster. Notice that the cache filled steadily with data throughout the run.

<figure><img src="https://3120376371-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FmrwbXqYrN8NJJz04XpWG%2Fuploads%2Fgit-blob-b5c7f064f69ac2989c3bb5e4f74a968bf152bbce%2Fimage%20(1).png?alt=media" alt=""><figcaption></figcaption></figure>

There should be 20GB of data in Alluxio cache at the conclusion of the run, matching the size of the file in S3. With the file cached in Alluxio, a subsequent read operation on the same file will be served from Alluxio rather than the UFS.

### Hot read

Rerun the fio benchmark with the same parameters. The only difference is that this is a hot read: the data is served from Alluxio rather than from S3. Take note of the highlighted bandwidth and duration again to compare this hot read with the previous run's cold read.

### Evict Alluxio cache to reset

To run another cold read for a new comparison, the cached data must first be evicted from Alluxio. To do this, run the [Alluxio free job](https://documentation.alluxio.io/rad/get-started/running-jobs/alluxio-free-cache). After the job completes, check the [metrics dashboard](https://documentation.alluxio.io/rad/deploy/view-cluster-details#metrics-dashboard) to confirm there is no cached data in Alluxio.

## Example sequence to benchmark sequential and random reads

1. Run the fio benchmark with default parameters. This cold sequential read will take about **5 minutes** to complete.
   * Block size = 256K
   * Number of jobs = 32
   * Read type = Sequential read
2. Inspect the metrics dashboard to confirm data is cached.
3. Run the fio benchmark again with the same parameters. This hot sequential read will take about **1 minute** to complete.
4. Run the Alluxio free job to evict cached data.
5. Inspect the metrics dashboard to confirm the cache is empty.
6. Run the fio benchmark with the same parameters as before except change the read type to `Random read`. This cold random read will take about **2 minutes** to complete.
7. Inspect the metrics dashboard to confirm data is cached.
8. Run the fio benchmark again with the same parameters. This hot random read will take about **90 seconds** to complete.
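To put the approximate durations in the sequence above side by side, the short sketch below computes the cold-to-hot speedup for each read type. The numbers are the rough estimates quoted in the steps, not measured results; substitute the durations from your own runs.

```python
# Approximate durations in seconds, taken from the example sequence above.
durations = {
    "sequential": {"cold": 5 * 60, "hot": 1 * 60},
    "random": {"cold": 2 * 60, "hot": 90},
}


def speedup(cold_seconds, hot_seconds):
    """Return how many times faster the hot read is than the cold read."""
    return cold_seconds / hot_seconds


for read_type, d in durations.items():
    print(f"{read_type}: {speedup(d['cold'], d['hot']):.2f}x faster when hot")
```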


