# Benchmarks

Benchmarking is a critical step to validate that your Alluxio cluster is configured optimally and delivering the expected performance for your workloads. This section provides guides for using industry-standard tools to measure Alluxio's performance for different use cases.

These guides provide standardized methodologies for testing, but your results will vary based on your unique hardware, network, and cluster configuration.

## Hardware Selection

### Worker Nodes

Three principles determine whether your worker hardware can produce meaningful benchmark results:

1. **Storage must not be the bottleneck**: Alluxio serves cached data directly from the page store. Workers should have local NVMe SSDs — network-attached or spinning disk storage will become the bottleneck before Alluxio does.
2. **Network must not be the bottleneck**: The worker's network bandwidth caps the observable throughput. Use instances with at least 25 Gbps networking; 100 Gbps or higher is recommended for saturation tests.
3. **To measure peak performance, eliminate disk I/O**: If you want to isolate Alluxio's serving overhead from storage latency, configure the page store on a RAM-backed filesystem (`/dev/shm`, which is tmpfs on Linux — no extra configuration needed):

   ```yaml
   worker:
     pagestore:
       hostPath: /dev/shm/alluxio   # tmpfs — RAM-backed, no disk I/O
       size: 120Gi
       reservedSize: 10Gi
   ```

   > RAM cache is volatile — data is lost on pod restart. It is suitable for benchmarking and for AI inference workloads where the dataset can be reloaded from UFS.
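Principle 2 implies a hard ceiling you can compute before running anything: divide the NIC line rate in Gbps by 8 to get bytes per second. A quick sketch of that arithmetic:

```shell
# Convert NIC line rate (Gbps) into a per-worker throughput ceiling.
# Decimal units: 1 Gbps = 1000 Mb/s; divide bits by 8 to get bytes.
gbps=100
mbytes_per_sec=$(( gbps * 1000 / 8 ))
echo "${gbps} Gbps NIC caps observable throughput at ~${mbytes_per_sec} MB/s"
# A 25 Gbps NIC tops out around 3125 MB/s (~3 GB/s), so a few NVMe drives
# can already outrun it; 100 Gbps raises the ceiling to ~12500 MB/s.
```

If a benchmark reports numbers above this ceiling, suspect client-side caching (e.g. kernel page cache hits) rather than real network transfer.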

Common instance types used as Alluxio workers in benchmarks:

| Cloud | Instance            | Storage      | Network  |
| ----- | ------------------- | ------------ | -------- |
| AWS   | `i3en.metal`        | 8× NVMe SSD  | 100 Gbps |
| OCI   | `BM.DenseIO.E5.128` | 12× NVMe SSD | 100 Gbps |

### Workload Generator Nodes

Workload generators (COSBench drivers, Warp clients, fio clients) do not need fast local storage — they only generate requests and receive data over the network. Network bandwidth and CPU are the relevant constraints.

Common instance type used for workload generators: AWS `c5n.metal` (100 Gbps, no local NVMe). A typical setup uses 4× `c5n.metal` drivers against a 4-worker Alluxio cluster.
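A useful sizing check, sketched below with the 4-driver, 4-worker setup above as illustrative numbers: the generators' aggregate network bandwidth should meet or exceed the workers', otherwise the clients, not Alluxio, become the bottleneck.

```shell
# Hypothetical sizing check: aggregate client vs. aggregate worker bandwidth.
drivers=4;  driver_gbps=100   # e.g. 4x c5n.metal generators
workers=4;  worker_gbps=100   # e.g. 4x 100 Gbps Alluxio workers
client_total=$(( drivers * driver_gbps ))
worker_total=$(( workers * worker_gbps ))
if [ "$client_total" -ge "$worker_total" ]; then
  echo "OK: generators (${client_total} Gbps) can saturate workers (${worker_total} Gbps)"
else
  echo "WARNING: clients will bottleneck at ${client_total} Gbps; add drivers"
fi
```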

## Benchmarking Guides

* [**POSIX API Benchmarks**](https://documentation.alluxio.io/ee-ai-en/benchmark/benchmarking-posix-performance): Use the Flexible I/O Tester (fio) to measure the read/write throughput and IOPS of an Alluxio FUSE mount. This is ideal for general-purpose POSIX workloads.
* [**S3 API Benchmarks**](https://documentation.alluxio.io/ee-ai-en/benchmark/benchmarking-s3-api-performance): Use COSBench or Warp to stress-test Alluxio's S3-compatible API. This is useful for evaluating the performance of object storage workloads.
* [**MLPerf Benchmarks**](https://documentation.alluxio.io/ee-ai-en/benchmark/benchmarking-ml-training-performance-with-mlperf): Use the MLPerf Storage benchmark to simulate the I/O patterns of machine learning training jobs and evaluate how well Alluxio accelerates them.
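To give a flavor of the first guide, here is a sketch of a fio sequential-read run against an Alluxio FUSE mount. The mount path and job parameters are assumptions, not the guide's exact invocation; adapt them to your cluster. `--direct=1` bypasses the kernel page cache so results reflect Alluxio rather than local RAM (drop it if your FUSE mount rejects `O_DIRECT`):

```shell
# Illustrative fio options for sequential reads through an Alluxio FUSE mount.
# MOUNT_DIR is an assumed path; replace it with your actual mount point.
MOUNT_DIR=${MOUNT_DIR:-/mnt/alluxio-fuse}
FIO_OPTS="--name=seqread --directory=${MOUNT_DIR} \
  --rw=read --bs=1M --size=4G --numjobs=8 \
  --ioengine=libaio --direct=1 --group_reporting"
echo "fio ${FIO_OPTS}"
# Run only if fio is installed on this host:
command -v fio >/dev/null && fio ${FIO_OPTS} || true
```

`--numjobs=8` runs eight parallel readers so a single client thread does not cap throughput; scale it (and `--bs`) to match the access pattern you care about.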

## From Benchmarking to Optimization

After running benchmarks, the next step is to analyze the results and tune your cluster. For detailed guidance on configuring Alluxio for maximum performance, please see our guide on [**Performance Optimization**](https://documentation.alluxio.io/ee-ai-en/performance).
