FIO

The Flexible I/O tester, or fio, is a benchmarking tool for testing the performance of storage systems. It is a straightforward microbenchmarking tool that lets the user describe the parameters of a repeatable workload, such as the total file size, block size, read type (sequential or random), and number of concurrent clients. You can read more about how others use the same benchmark, such as in this blog from Nvidia.

View the following tutorial video if it's your first time

Setup

This demo cluster recreates the testing environment and benchmark results described in the fio tests documentation page. Launch this demo cluster by selecting the fioLargeFileDemo cluster type; it will take about 25 minutes to complete its deployment.

An Alluxio worker is deployed on an i3en.metal EC2 instance, with the instance's 8 NVMe SSD storage devices mounted as a single RAID 0 device. An Alluxio FUSE client is deployed on a c5n.metal EC2 instance, mounting the path /mnt/alluxio to the Alluxio filesystem root. An S3 bucket is created to serve as Alluxio's UFS, mounted on the /s3 path in the Alluxio filesystem. The Alluxio master and a single etcd node reside on a t3.large EC2 instance.

To prepare for benchmarking read performance, a single 20GB file is written to the S3 bucket mounted to Alluxio, at path s3://<bucketName>/single_20G/20G. From the client node where the FUSE process is running, this file can be accessed at the path /mnt/alluxio/s3/single_20G/20G.
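Before starting a benchmark, it can be useful to confirm from the client node that the file is visible through the FUSE mount. The following is a minimal sketch, not part of the demo itself: the helper name `check_test_file` is hypothetical, and it assumes "20GB" means 20 GiB (20 * 1024^3 bytes), which may not match how the file was actually generated.

```python
import os

# Path taken from the text above.
FUSE_PATH = "/mnt/alluxio/s3/single_20G/20G"
# Assumption: "20GB" is interpreted as 20 GiB; adjust if the file was sized differently.
EXPECTED_BYTES = 20 * 1024**3


def check_test_file(path: str = FUSE_PATH, expected_bytes: int = EXPECTED_BYTES) -> bool:
    """Return True if `path` exists and its size equals `expected_bytes`."""
    try:
        size = os.stat(path).st_size
    except OSError:
        # Missing file, unmounted FUSE path, or permission problem.
        return False
    return size == expected_bytes


if __name__ == "__main__":
    print("test file ok:", check_test_file())
```

A size check like this only reads file metadata, so it should not pull file data into the cache before the cold read.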

Running the benchmark

The three main steps we will run in this experiment are:

A cold read run, as a baseline comparison
A hot read run, demonstrating the performance when leveraging Alluxio
A cache eviction, resetting the cluster state for a new comparison

The key comparison is between the cold read performance of reading the large file from S3 before any data is cached in Alluxio and the hot read performance of the same operation with the file cached in Alluxio. Note that the data is cached into Alluxio as a result of the cold read; no separate data load step is required.

Most of these steps will be executed by running a job. All jobs are initiated from the Jobs link on the left navigation bar, following the sequence of selecting the cluster, selecting the job type, and setting parameters for the job.

Cold read

Run the fio benchmark job. The default parameters used are a block size of 256K and 32 jobs. If you change any of the parameters, note their values so that they can be reused for the subsequent run in order to have comparable results. Take note of the highlighted bandwidth and duration of the benchmark.
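For readers who want to see how these parameters map onto fio itself, the sketch below assembles a comparable fio command line. It is an illustration only: the exact options used by the demo's benchmark job are not stated here, the function name `build_fio_read_command` and the job name `alluxio_read` are hypothetical, and the file path is the FUSE path described in the Setup section.

```python
import shlex


def build_fio_read_command(
    filename: str = "/mnt/alluxio/s3/single_20G/20G",  # FUSE path from the Setup section
    block_size: str = "256k",   # default block size mentioned above
    num_jobs: int = 32,         # default number of jobs mentioned above
    random_read: bool = False,  # False = sequential read, True = random read
    job_name: str = "alluxio_read",  # hypothetical job name
) -> list[str]:
    """Build a fio argument list for a read-only benchmark (assumed options)."""
    rw = "randread" if random_read else "read"
    return [
        "fio",
        f"--name={job_name}",
        f"--filename={filename}",
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--numjobs={num_jobs}",
        "--readonly",         # fio safety check that prevents writes
        "--group_reporting",  # report aggregate results instead of per-job results
    ]


# Print the command rather than running it, so the sketch is safe to execute anywhere.
print(shlex.join(build_fio_read_command()))
```

Keeping the parameters in one function makes it easy to reuse identical values for the cold and hot runs, changing only `random_read` when switching read types.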

At the end of the first benchmark run, navigate to the metrics dashboard of the cluster. Notice that the cache filled steadily with data over the course of the run.

There should be 20GB of data in Alluxio cache at the conclusion of the run, matching the size of the file in S3. With the file cached in Alluxio, a subsequent read operation on the same file will be served from Alluxio rather than the UFS.

Hot read

Rerun the fio benchmark with the same parameters. The only difference is that this is a hot read, since the data is now served from Alluxio rather than from S3. Take note of the highlighted bandwidth and duration again to compare this hot read with the previous run's cold read.
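If fio is run by hand with `--output-format=json` instead of through the demo's job page, the bandwidth and duration can be pulled out of the report programmatically. The sketch below assumes the report was produced with group reporting, so the first entry in `jobs` holds the aggregate, and it relies on the `bw_bytes` (bytes per second) and `runtime` (milliseconds) fields of fio's JSON read statistics. The function name `summarize_read` is hypothetical.

```python
import json


def summarize_read(fio_json_text: str) -> dict:
    """Extract read bandwidth (MiB/s) and runtime (s) from a fio JSON report."""
    report = json.loads(fio_json_text)
    # Assumption: group reporting was enabled, so jobs[0] is the aggregate entry.
    read_stats = report["jobs"][0]["read"]
    return {
        "bandwidth_mib_s": read_stats["bw_bytes"] / (1024**2),
        "runtime_s": read_stats["runtime"] / 1000,
    }


# Illustrative input only; these numbers are made up and are not benchmark results.
sample_report = json.dumps(
    {"jobs": [{"jobname": "example", "read": {"bw_bytes": 1048576000, "runtime": 60000}}]}
)
print(summarize_read(sample_report))
```

Running the same function over the cold and hot reports gives two summaries that can be compared directly.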

Evict Alluxio cache to reset

Before running another cold read for a new comparison, the cached data must first be evicted from Alluxio. To do this, run the Alluxio free job. After the job completes, check the metrics dashboard to confirm there is no cached data in Alluxio.

Example sequence to benchmark sequential and random reads

Run the fio benchmark with default parameters. This cold sequential read will take about 5 minutes to complete.
- Block size = 256K
- Number of jobs = 32
- Read type = Sequential read
Inspect the metrics dashboard to confirm data is cached.
Run the fio benchmark again with the same parameters. This hot sequential read will take about 1 minute to complete.
Run the Alluxio free job to evict cached data.
Inspect the metrics dashboard to confirm the cache is empty.
Run the fio benchmark with the same parameters as before except change the read type to Random read. This cold random read will take about 2 minutes to complete.
Inspect the metrics dashboard to confirm data is cached.
Run the fio benchmark again with the same parameters. This hot random read will take about 90 seconds to complete.
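The approximate durations quoted in the sequence above can be turned into a cold-to-hot speedup ratio. This is a small sketch using only those rough figures; actual durations will vary from run to run, and the function name `speedup` is chosen here for illustration.

```python
def speedup(cold_seconds: float, hot_seconds: float) -> float:
    """Return how many times faster the hot read was than the cold read."""
    if cold_seconds <= 0 or hot_seconds <= 0:
        raise ValueError("durations must be positive")
    return cold_seconds / hot_seconds


# Approximate durations from the example sequence above: (cold, hot) in seconds.
approximate_runs = {
    "sequential": (5 * 60, 1 * 60),  # about 5 minutes cold, about 1 minute hot
    "random": (2 * 60, 90),          # about 2 minutes cold, about 90 seconds hot
}

for read_type, (cold, hot) in approximate_runs.items():
    print(f"{read_type}: {speedup(cold, hot):.2f}x")
# prints:
# sequential: 5.00x
# random: 1.33x
```

When recording your own runs, substitute the durations noted from each benchmark job in place of the approximate values.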
