Benchmarks
Benchmarking is a critical step to validate that your Alluxio cluster is configured optimally and delivering the expected performance for your workloads. This section provides guides for using industry-standard tools to measure Alluxio's performance for different use cases.
These guides provide standardized methodologies for testing, but your results will vary based on your unique hardware, network, and cluster configuration.
Hardware Selection
Worker Nodes
Three principles determine whether your worker hardware can produce meaningful benchmark results:
Storage must not be the bottleneck: Alluxio serves cached data directly from the page store. Workers should have local NVMe SSDs — network-attached or spinning disk storage will become the bottleneck before Alluxio does.
Network must not be the bottleneck: The worker's network bandwidth caps the observable throughput. Use instances with at least 25 Gbps networking; 100 Gbps or higher is recommended for saturation tests.
To measure peak performance, eliminate disk I/O: If you want to isolate Alluxio's serving overhead from storage latency, configure the page store on a RAM-backed filesystem (/dev/shm, which is tmpfs on Linux; no extra configuration needed):

    worker:
      pagestore:
        hostPath: /dev/shm/alluxio  # tmpfs: RAM-backed, no disk I/O
        size: 120Gi
        reservedSize: 10Gi

RAM cache is volatile: data is lost on pod restart. It is suitable for benchmarking and for AI inference workloads where the dataset can be reloaded from UFS.
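Before relying on a RAM-backed page store, it can help to confirm that the chosen path really sits on tmpfs. The following is a minimal sketch, not part of Alluxio: the helper `filesystem_type` and the `SAMPLE_MOUNTS` text are hypothetical, and the parser assumes mount points without escaped spaces.

```python
import os

# Hypothetical sample in /proc/mounts format, used only for illustration.
SAMPLE_MOUNTS = """\
/dev/nvme0n1p1 / ext4 rw,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
"""


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format. The longest matching
    mount point wins; for equal lengths the later entry wins.
    Returns None if no mount point matches.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/")  # "/" becomes "" so every path matches it
        inside = path == mount_point or path.startswith(prefix + "/")
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


print(filesystem_type("/dev/shm/alluxio", SAMPLE_MOUNTS))
```

On a worker node, the same function could be fed the real contents of /proc/mounts; a result other than tmpfs would suggest the benchmark is still touching a disk-backed filesystem.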
Common instance types used as Alluxio workers in benchmarks:
AWS i3en.metal: 8× NVMe SSD, 100 Gbps networking
OCI BM.DenseIO.E5.128: 12× NVMe SSD, 100 Gbps networking
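The network figures in this section are in gigabits per second, while benchmark tools usually report bytes per second. As a rough sketch of the conversion, the hypothetical helpers below compute the theoretical ceiling per worker and per cluster. They use decimal units and ignore protocol overhead, so real results should be expected to land below these numbers.

```python
def gbps_to_gigabytes_per_sec(gbps):
    """Convert a link rate in gigabits/s to gigabytes/s (8 bits per byte)."""
    return gbps / 8.0


def cluster_ceiling_gigabytes_per_sec(workers, nic_gbps):
    """Upper bound on aggregate throughput if every worker NIC is saturated."""
    return workers * gbps_to_gigabytes_per_sec(nic_gbps)


for rate in (25, 100):
    print(rate, "Gbps per worker is at most", gbps_to_gigabytes_per_sec(rate), "GB/s")
print("4 workers at 100 Gbps:", cluster_ceiling_gigabytes_per_sec(4, 100), "GB/s at most")
```

A measured throughput close to the ceiling indicates the network, not Alluxio, is the limiting factor for that run.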
Workload Generator Nodes
Workload generators (COSBench drivers, Warp clients, fio clients) do not need fast local storage — they only generate requests and receive data over the network. Network bandwidth and CPU are the relevant constraints.
Common instance type used for workload generators: AWS c5n.metal (100 Gbps, no local NVMe). A typical setup uses 4× c5n.metal drivers against a 4-worker Alluxio cluster.
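One way to sanity-check a generator fleet is to compare its combined network bandwidth with that of the workers. The sketch below is a rule of thumb under that single assumption, not a sizing method from Alluxio: `min_generators` is a hypothetical helper, and it deliberately ignores CPU, which the text above names as the other constraint.

```python
import math


def min_generators(workers, worker_gbps, generator_gbps):
    """Smallest number of generator nodes whose combined NIC bandwidth
    is at least the combined NIC bandwidth of the workers."""
    if workers <= 0 or worker_gbps <= 0 or generator_gbps <= 0:
        raise ValueError("all arguments must be positive")
    return math.ceil(workers * worker_gbps / generator_gbps)


# 4 workers at 100 Gbps driven by 100 Gbps generators
print(min_generators(4, 100, 100))
# The same workers driven by 25 Gbps generators
print(min_generators(4, 100, 25))
```

With 100 Gbps on both sides, the result for a 4-worker cluster matches the 4-driver setup described above; slower generator NICs raise the count proportionally.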
Benchmarking Guides
POSIX API Benchmarks: Use the Flexible I/O Tester (fio) to measure the read/write throughput and IOPS of an Alluxio FUSE mount. This is ideal for general-purpose POSIX workloads.
S3 API Benchmarks: Reference baselines and tool selection for stress-testing Alluxio's S3-compatible API. Three tool-specific recipes are covered separately:
  with COSBench: complex multi-stage mixed workloads.
  with Warp: a single-binary tool for quick bucket-wide reads; redirect-disabled only.
  with httpbench: a ~50-line Go tool for per-worker, redirect-aware measurement.
MLPerf Benchmarks: Use the MLPerf Storage benchmark to simulate the I/O patterns of machine learning training jobs and evaluate how well Alluxio accelerates them.
From Benchmarking to Optimization
After running benchmarks, the next step is to analyze the results and tune your cluster. For detailed guidance on configuring Alluxio for maximum performance, please see our guide on Performance Optimization.