S3 API Benchmarks

Scope

TL;DR

  • Two performance baselines for Alluxio's S3-compatible API: a 4-node COSBench run on AWS and a 6-node Warp run on OCI

  • Run COSBench or Warp to benchmark Alluxio's S3-compatible API

  • Key potential performance bottlenecks: HTTP redirects, network bandwidth, TCP connection reuse, kernel tuning

For how Alluxio's S3 API works (request flow, consistent hashing, redirects), see Architecture Overview.

Baselines: 4-Node COSBench on AWS

  • Performance baselines below assume data is fully cached in Alluxio. If data is served from the underlying UFS (e.g., S3), throughput will be significantly lower.

  • All throughput numbers are total cluster throughput measured with a 4-driver COSBench cluster against a 4-worker Alluxio cluster.

  • 1GB files — throughput is bandwidth-bound. Scales with concurrency until network saturation.

  • 100KB files — throughput is IOPS-bound. Scales with concurrency until CPU saturation.

  • Expect near-linear scaling up to hardware limits (typically 64–128 threads per driver).

  • Cluster scaling: total throughput scales approximately linearly with the number of workers.

Test Environment

The following results were achieved using a 4-driver COSBench cluster testing a 4-worker Alluxio cluster. The tests measured read throughput against data fully cached in Alluxio.

| Component | Configuration |
| --- | --- |
| COSBench Controller | 1x c5n.metal |
| COSBench Drivers | 4x c5n.metal |
| Alluxio Coordinator | 1 node |
| Alluxio Workers | 4x i3en.metal (8 NVMe SSDs each) |
| Load Balancer | AWS ELB across 4 workers. See LB setup guide. |

Large File Read Throughput (1GB files)

| Concurrency per Driver | Total Throughput |
| --- | --- |
| 1 Thread | 2.35 GB/s |
| 16 Threads | 20.44 GB/s |
| 128 Threads | 36.94 GB/s |

Small File Read IOPS (100KB files)

| Concurrency per Driver | Total Throughput | Total Operations/sec |
| --- | --- | --- |
| 1 Thread | 50.26 MB/s | 502 op/s |
| 16 Threads | 1.10 GB/s | 11,302 op/s |
| 128 Threads | 4.69 GB/s | 46,757 op/s |
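To make the scaling behaviour in the two tables easier to read, the following minimal sketch computes speedup and scaling efficiency relative to the 1-thread row. The figures are copied from the tables above; the function names are illustrative only and are not part of COSBench or Alluxio.

```python
# Baseline figures copied from the tables above: threads per driver -> cluster total.
LARGE_FILE_GBPS = {1: 2.35, 16: 20.44, 128: 36.94}  # 1GB objects, GB/s
SMALL_FILE_OPS = {1: 502, 16: 11302, 128: 46757}    # 100KB objects, op/s


def scaling_efficiency(baseline, measured, threads):
    """Speedup over the 1-thread baseline divided by the thread count.

    1.0 means perfectly linear scaling; lower values mean diminishing returns.
    """
    return (measured / baseline) / threads


def report(name, series):
    """Print speedup and efficiency for each concurrency level in `series`."""
    base = series[1]
    for threads, value in sorted(series.items()):
        speedup = value / base
        eff = scaling_efficiency(base, value, threads)
        print(f"{name:>10} {threads:>4} threads: "
              f"speedup {speedup:6.1f}x, efficiency {eff:5.2f}")


if __name__ == "__main__":
    report("1GB read", LARGE_FILE_GBPS)
    report("100KB read", SMALL_FILE_OPS)
```

Running the same calculation on your own results gives a quick way to see at which concurrency level the cluster stops scaling.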

Baselines: 6-Node Warp on OCI

In Warp GET tests on OCI BM.DenseIO.E5.128 nodes (100 Gbps networking, 12× NVMe in RAID 0), Alluxio achieved 11.2 GiB/s on a single node (0.3 ms average latency, 0.4 ms P99) and 33.3 GiB/s on 6 nodes (0.6 ms average, 0.9 ms P99). Warp does not support HTTP 307 redirects, so throughput is approximately half of what a redirect-capable client would achieve. See Alluxio on OCI for full results.
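Because the throughput figures are in GiB/s while network speeds are quoted in Gbps, a unit conversion helps when comparing them. The sketch below is a rough aid, not part of the published results: it assumes traffic is spread evenly across nodes and that each node has a single 100 Gbps link, and the helper names are hypothetical.

```python
GIB = 2**30  # bytes per GiB


def gib_per_s_to_gbps(gib_per_s):
    """Convert GiB/s (binary) to Gbit/s (decimal), the unit NIC speeds are quoted in."""
    return gib_per_s * GIB * 8 / 1e9


def nic_utilization(gib_per_s, nodes, nic_gbps=100.0):
    """Average fraction of each node's NIC line rate, assuming an even spread."""
    return gib_per_s_to_gbps(gib_per_s) / nodes / nic_gbps


if __name__ == "__main__":
    # Throughput figures taken from the Warp baseline paragraph above.
    for label, throughput, nodes in [("1 node", 11.2, 1), ("6 nodes", 33.3, 6)]:
        print(f"{label}: {gib_per_s_to_gbps(throughput):.1f} Gbps total, "
              f"{nic_utilization(throughput, nodes):.0%} of line rate per node")
```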

COSBench vs Warp

| | COSBench | Warp |
| --- | --- | --- |
| Best for | Complex, multi-stage workloads | Quick single-operation validation |
| Setup | Controller + driver nodes | Single binary |
| Workload definition | XML config files | CLI flags |
| Redirect support | Yes | No (requires alluxio.worker.s3.redirect.enabled=false) |
| Results UI | Web dashboard | Terminal output |

Benchmarking with COSBench

Prerequisites

  • Operating System: CentOS 7 (kernel 3.10) or later. COSBench has known compatibility issues on Ubuntu and is not recommended.

  • Pre-loaded Data: For benchmarking hot reads (cache hit), ensure your test dataset is fully loaded into the Alluxio cache. Use bin/alluxio job load --path /path --submit to load data and bin/alluxio fs check-cached /path to verify.

Installation

On all COSBench controller and driver nodes:

  1. Download COSBench: Download COSBench version 0.4.2.c4 and unzip it.

  2. Install Dependencies:

  3. Disable MD5 Validation: Edit the cosbench-start.sh script and add the following Java property to disable MD5 validation for S3 GET requests. This is necessary for compatibility with Alluxio's S3 API.

  4. Start COSBench Services: From the COSBench root directory, start the controller and all drivers.

1. Configure the Workload

Create an XML file (e.g., s3-benchmark.xml) to define the test workload. A COSBench workload consists of several stages:

| Stage | Purpose |
| --- | --- |
| init | Creates the test buckets |
| prepare | Writes the initial data for testing |
| main | Runs the read/write operations for a set duration |
| cleanup | Deletes the objects created during the test |
| dispose | Deletes the buckets |

Note: If you are using Alluxio mount points, you cannot create new buckets via the S3 API. You must skip the init and dispose stages and use pre-existing buckets that match your mount configuration.

COSBench config syntax:

  • r(1,10) (range): sequentially iterates over items 1 through 10. Used in the init, prepare, cleanup, and dispose stages.

  • u(1,10) (uniform random): randomly selects an item between 1 and 10. Used in the main stage for realistic access patterns.

  • c(64)KB (constant): fixed size of 64KB per object.
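The three selector forms above can be illustrated with a small interpreter. This is a minimal sketch for intuition only: `parse_selector` and `sample` are hypothetical helpers, not COSBench code, and COSBench's own handling (for example, how a range is divided among workers) may differ.

```python
import random
import re
from typing import List, NamedTuple, Optional

_SELECTOR = re.compile(r"^([ruc])\((\d+)(?:,(\d+))?\)([A-Za-z]+)?$")


class Selector(NamedTuple):
    kind: str            # "r" (range), "u" (uniform random), or "c" (constant)
    low: int
    high: int
    unit: Optional[str]  # e.g. "KB" for sizes; None for bucket/object indices


def parse_selector(expr: str) -> Selector:
    """Parse expressions such as r(1,10), u(1,10), or c(64)KB."""
    match = _SELECTOR.match(expr.strip())
    if match is None:
        raise ValueError(f"unrecognized selector: {expr!r}")
    kind, first, second, unit = match.groups()
    if kind == "c":
        if second is not None:
            raise ValueError("c(...) takes a single value")
        return Selector(kind, int(first), int(first), unit)
    if second is None:
        raise ValueError(f"{kind}(...) needs a lower and an upper bound")
    low, high = int(first), int(second)
    if low > high:
        raise ValueError("lower bound must not exceed upper bound")
    return Selector(kind, low, high, unit)


def sample(selector: Selector, count: int, rng: Optional[random.Random] = None) -> List[int]:
    """Return up to `count` values, picked the way each selector kind describes."""
    if selector.kind == "r":
        # Sequential: walk the range in order, stopping at its end.
        return list(range(selector.low, selector.high + 1))[:count]
    if selector.kind == "u":
        # Uniform random: each draw is independent, bounds inclusive.
        rng = rng or random.Random()
        return [rng.randint(selector.low, selector.high) for _ in range(count)]
    # Constant: always the same value.
    return [selector.low] * count


if __name__ == "__main__":
    for expr in ("r(1,10)", "u(1,10)", "c(64)KB"):
        sel = parse_selector(expr)
        print(expr, "->", sample(sel, 5), sel.unit or "")
```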

Example: Basic Read/Write Workload

Goal: Verify basic S3 read/write functionality with a small dataset.

Creates two buckets, writes 10 objects of 64KB to each, runs a 30-second test with an 80/20 read-write ratio, and then cleans up.

Note: The accesskey and secretkey below are placeholder values. Replace them with your Alluxio S3 API credentials.
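As a rough illustration of how the described workload maps onto the five stages, the sketch below generates a COSBench-style workload file with Python's standard library. It is not the original example file: the endpoint, credentials, bucket prefix, worker counts, and the choice to write to objects 11 through 20 are all placeholder assumptions, and the attribute names follow the commonly published COSBench S3 sample configuration, so verify them against your COSBench version.

```python
import xml.etree.ElementTree as ET

# Placeholders, not real values: substitute your own endpoint and credentials.
STORAGE_CONFIG = (
    "accesskey=YOUR_ACCESS_KEY;secretkey=YOUR_SECRET_KEY;"
    "endpoint=http://ALLUXIO_S3_HOST:PORT;path_style_access=true"
)
PREFIX = "cprefix=s3test"         # hypothetical bucket-name prefix
BUCKETS = "containers=r(1,2)"     # two buckets, visited sequentially


def add_stage(workflow, stage_name, **work_attrs):
    """Append <workstage><work .../></workstage> and return the <work> element."""
    stage = ET.SubElement(workflow, "workstage", name=stage_name)
    return ET.SubElement(stage, "work", **work_attrs)


def build_basic_workload():
    workload = ET.Element("workload", name="s3-basic",
                          description="basic read/write check")
    ET.SubElement(workload, "storage", type="s3", config=STORAGE_CONFIG)
    workflow = ET.SubElement(workload, "workflow")

    add_stage(workflow, "init", type="init", workers="1",
              config=f"{PREFIX};{BUCKETS}")
    # 10 objects of 64KB in each bucket.
    add_stage(workflow, "prepare", type="prepare", workers="1",
              config=f"{PREFIX};{BUCKETS};objects=r(1,10);sizes=c(64)KB")

    # 30-second run, 80% reads and 20% writes; the worker count is an assumption.
    main = add_stage(workflow, "main", name="main", workers="8", runtime="30")
    ET.SubElement(main, "operation", type="read", ratio="80",
                  config=f"{PREFIX};containers=u(1,2);objects=u(1,10)")
    ET.SubElement(main, "operation", type="write", ratio="20",
                  config=f"{PREFIX};containers=u(1,2);objects=u(11,20);sizes=c(64)KB")

    add_stage(workflow, "cleanup", type="cleanup", workers="1",
              config=f"{PREFIX};{BUCKETS};objects=r(1,20)")
    add_stage(workflow, "dispose", type="dispose", workers="1",
              config=f"{PREFIX};{BUCKETS}")
    return workload


if __name__ == "__main__":
    root = build_basic_workload()
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    print(ET.tostring(root, encoding="unicode"))
```

When using Alluxio mount points, the init and dispose calls would be dropped and the prefix and bucket range adjusted to match the pre-existing buckets.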

Example: High-Concurrency Read Test

Goal: Measure maximum read IOPS/throughput under heavy concurrency. Key difference from the basic example: uses 4 distributed drivers with 128 threads each.

Prepares 10,000 small objects (100KB) and uses four drivers, each with 128 worker threads, to read concurrently for 300 seconds.

Note: Each <work> block is repeated per driver because COSBench requires explicit driver assignment for distributed workloads. Each block sends traffic from a separate driver node to maximize aggregate concurrency.
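The per-driver repetition described in the note can be sketched as a loop that emits one work block per driver. This is an illustration under assumptions rather than the original example: the driver names, bucket prefix, and single-bucket layout are hypothetical, and the `driver` attribute is assumed to be how a work block is pinned to a driver, so check it against your COSBench version and controller configuration.

```python
import xml.etree.ElementTree as ET

# Assumed driver names; use the names defined in your controller configuration.
DRIVERS = ["driver1", "driver2", "driver3", "driver4"]
# Hypothetical prefix and bucket; 10,000 prepared objects read at random.
READ_CONFIG = "cprefix=s3test;containers=u(1,1);objects=u(1,10000)"


def build_main_stage(drivers, workers_per_driver=128, runtime_s=300):
    """Build a main <workstage> with one read-only <work> block per driver."""
    stage = ET.Element("workstage", name="main")
    for driver in drivers:
        work = ET.SubElement(
            stage, "work",
            name=f"read-{driver}",
            driver=driver,                    # assumed attribute for driver assignment
            workers=str(workers_per_driver),
            runtime=str(runtime_s),
        )
        ET.SubElement(work, "operation", type="read", ratio="100", config=READ_CONFIG)
    return stage


if __name__ == "__main__":
    stage = build_main_stage(DRIVERS)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(stage)
    print(ET.tostring(stage, encoding="unicode"))
```

With four drivers and 128 workers each, the stage drives 512 concurrent readers in total.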

2. Submit the Workload

3. Monitor the Results

View benchmark status and results from the COSBench web interface at http://<CONTROLLER_IP>:19088/controller/index.html.

4. Stop COSBench Services

Benchmarking with Warp

MinIO Warp is a lightweight S3 performance evaluation tool for measuring GET, PUT, and mixed workload performance.

Important: Warp does not support HTTP 307 redirection. You must set alluxio.worker.s3.redirect.enabled=false in alluxio-site.properties. For details on redirect behavior and which clients are affected, see HTTP Redirects and Client Compatibility.

Installation

Running Warp

Ensure your warp client has network access to the Alluxio S3 endpoint.

PUT Throughput (Write)

GET Throughput (Read)

For expected throughput ranges under similar hardware, refer to the Baselines sections above.

Performance Tuning and Troubleshooting

For suggested Alluxio configuration parameters and Linux kernel tuning, see S3 API — Performance Tuning.

  • Throughput ~50% lower than expected: the benchmark client probably does not follow HTTP 307 redirects, so data is proxied through an intermediate worker instead of being served directly by the owning worker. Enable redirects and use a client that supports 307. See HTTP Redirects.

  • Throughput far below baselines: most likely the data is not fully cached. Before testing, verify with bin/alluxio fs check-cached that the files show as cached in Alluxio.

  • Low throughput despite high concurrency: network bottleneck or unbalanced load balancer. Verify 100 Gbps connectivity, same-AZ deployment, and that the load balancer distributes requests evenly across all Alluxio workers.

  • No scaling with added concurrency: CPU or connection pool bottleneck. Check worker CPU utilization and ensure connection.keep.alive is enabled.

  • High tail latency: TCP port exhaustion. Apply kernel tuning (tcp_tw_reuse, tcp_fin_timeout).

  • Throughput plateaus at a low level: health check overhead. Disable alluxio.worker.s3.redirect.health.check.enabled for benchmarks.

  • Warp returns errors: Warp does not support HTTP 307 redirects. Set alluxio.worker.s3.redirect.enabled=false.

  • Inconsistent or highly variable results across runs: data not fully cached, or a noisy environment (cross-AZ traffic, shared network). Pre-load the data and re-run in a dedicated, same-AZ setup.

  • COSBench init stage fails: Alluxio mount points do not support bucket creation via the S3 API. Skip the init and dispose stages and use pre-existing buckets.
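When investigating the TCP-related items above, it can help to record the current kernel settings on each benchmark node before changing anything. The following minimal sketch reads the two parameters named in the list from the standard Linux /proc/sys location; it only reports values and does not recommend any, and the function name is illustrative.

```python
from pathlib import Path

# Parameters named in the troubleshooting list; paths follow the Linux /proc/sys layout.
PARAMS = ("tcp_tw_reuse", "tcp_fin_timeout")


def read_tcp_params(base="/proc/sys/net/ipv4"):
    """Return {param: value as a string, or None if the file is missing or unreadable}."""
    values = {}
    for name in PARAMS:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tcp_params().items():
        shown = value if value is not None else "unavailable"
        print(f"net.ipv4.{name} = {shown}")
```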
