# httpbench Benchmarks

`httpbench` is a purpose-built, \~50-line Go tool for measuring Alluxio S3 API read throughput. Use it when you need **per-worker isolation** (e.g., CPU-profiling one worker, or measuring a single NIC) or when your cluster runs in **redirect mode**, where [Warp](/ee-ai-en/benchmark/s3-api/warp.md) does not work.

Compared to Warp, it follows HTTP 307 redirects transparently (the Go default), accepts an explicit URL list instead of enumerating a bucket, skips SigV4 signing, and does not emit chunked SHA-256 payloads.

For when to pick httpbench over [COSBench](/ee-ai-en/benchmark/s3-api/cosbench.md) or [Warp](/ee-ai-en/benchmark/s3-api/warp.md), see [Choosing a Benchmark Tool](/ee-ai-en/benchmark/s3-api.md#choosing-a-benchmark-tool). For reference throughput from a 6 × c5n.18xlarge cluster, see [6 Node httpbench on AWS](/ee-ai-en/benchmark/s3-api.md#id-6-node-httpbench-on-aws).

## Prerequisites

* Go 1.21+ on the client host.
* Alluxio workers reachable from the client on the S3 API port (default 29998).
* Test dataset fully cached. Pre-load with `bin/alluxio job load --path --submit` and verify with `bin/alluxio fs check-cached`.

## Installation

Save the tool source as `httpbench.go`:

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	conc := flag.Int("c", 16, "concurrency (parallel workers)")
	dur := flag.Duration("d", 30*time.Second, "duration")
	flag.Parse()
	urls := flag.Args()
	if len(urls) == 0 {
		fmt.Println("usage: httpbench -c CONC -d DUR URL1 URL2 ...")
		return
	}

	// Size the connection pool to the concurrency so keep-alive
	// connections are reused instead of being reopened per request.
	tr := &http.Transport{
		MaxIdleConns:        *conc * 2,
		MaxIdleConnsPerHost: *conc * 2,
		MaxConnsPerHost:     *conc * 2,
		IdleConnTimeout:     60 * time.Second,
		DisableCompression:  true,
		ForceAttemptHTTP2:   false,
	}
	// No CheckRedirect is set, so the client follows 307 redirects
	// using Go's default policy.
	client := &http.Client{Transport: tr, Timeout: 5 * time.Minute}

	var totalBytes, totalReqs int64
	var wg sync.WaitGroup
	deadline := time.Now().Add(*dur)
	t0 := time.Now()
	for i := 0; i < *conc; i++ {
		wg.Add(1)
		go func(gid int) {
			defer wg.Done()
			// Each goroutine starts at a different offset and walks
			// the URL list round-robin until the deadline.
			j := gid
			for time.Now().Before(deadline) {
				url := urls[j%len(urls)]
				j++
				resp, err := client.Get(url)
				if err != nil {
					continue
				}
				// Drain the body so the connection can be reused.
				n, _ := io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				// Count only successful (2xx) responses.
				if resp.StatusCode >= 200 && resp.StatusCode < 300 {
					atomic.AddInt64(&totalBytes, n)
					atomic.AddInt64(&totalReqs, 1)
				}
			}
		}(i)
	}
	wg.Wait()

	elapsed := time.Since(t0).Seconds()
	fmt.Printf("Reqs: %d Bytes: %.2f GB Time: %.2fs\n",
		totalReqs, float64(totalBytes)/1e9, elapsed)
	fmt.Printf("→ %.2f GB/s (%.1f Gbps)\n",
		float64(totalBytes)/elapsed/1e9, float64(totalBytes)*8/elapsed/1e9)
}
```

Build once:

```shell
go build -o httpbench httpbench.go
```

Build a `file → owner worker` map. All three scenarios below reuse this map; build it once per bucket.

A 1-byte Range GET against any worker returns `206` for locally owned files or `307` (with the owner in `Location`) otherwise:

```shell
for f in $(kubectl -n alx-ns exec -i alluxio-cluster-coordinator-0 -- \
    alluxio fs ls /mybucket | awk '{print $NF}'); do
  name=$(basename "$f")
  resp=$(curl -s --range 0-0 -o /dev/null \
    -w "%{http_code}|%{redirect_url}" \
    "http://any-worker:29998/mybucket/$name")
  code=${resp%%|*}
  redir=${resp#*|}
  if [ "$code" = "206" ]; then
    echo "any-worker|$name"
  else
    owner=$(echo "$redir" | sed 's|http://||' | cut -d: -f1)
    echo "$owner|$name"
  fi
done > file_owners.txt
```

## Usage

### Scenario: Pattern 1 — Single Worker, Local-only Keys

Isolates one worker's S3 API throughput without any cross-worker routing. Use when CPU-profiling a specific worker or measuring a single NIC.

Bench one worker against only its local keys:

```shell
# Extract URLs for worker-1
grep "^worker-1|" file_owners.txt \
  | awk -F'|' '{print "http://worker-1:29998/mybucket/"$2}' \
  > worker-1-urls.txt

# Run the bench
./httpbench -c 32 -d 30s $(cat worker-1-urls.txt)
```

Sample output:

```console
Reqs: 224 Bytes: 390.86 GB Time: 35.03s
→ 11.16 GB/s (89.3 Gbps)
```

The reported time exceeds `-d` because requests already in flight at the deadline run to completion.

### Scenario: Pattern 2 — Single Client, Full Bucket via Redirect

One client drives the whole bucket through a single entry worker. A 307 redirect fires for every key owned by another worker; Go's client follows each one and reuses keep-alive connections to the real owner.

```shell
# Every URL points at the same entry worker; Alluxio redirects what it doesn't own
for name in $(awk -F'|' '{print $2}' file_owners.txt); do
  echo "http://worker-1:29998/mybucket/$name"
done > all-via-worker-1.txt

./httpbench -c 64 -d 30s $(cat all-via-worker-1.txt)
```

The bottleneck is the client NIC, not the Alluxio cluster: throughput is \~11 GB/s for a 100 Gbps NIC, matching Pattern 1. For object-size-dependent behavior, see [307 Redirect Cost: Large vs Small Objects](/ee-ai-en/benchmark/s3-api.md#id-307-redirect-cost-large-vs-small-objects).

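The redirect handling that Pattern 2 relies on can be observed in isolation. The sketch below is an illustration only: it uses two local `httptest` servers as stand-ins for an entry worker and an owner worker, and it involves no Alluxio code. The names `fetchViaEntry`, `entry`, and `owner` and the path `/mybucket/file-0` are hypothetical. It shows that Go's default client follows a 307 and returns the owner's response without extra configuration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchViaEntry starts two local test servers: an "owner" that serves
// payload and an "entry" that answers every request with a 307 pointing
// at the owner. It issues one GET against the entry using Go's default
// redirect policy and reports whether the final request reached the owner.
func fetchViaEntry(payload string) (status int, body string, redirected bool, err error) {
	owner := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, payload)
	}))
	defer owner.Close()

	entry := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, owner.URL+r.URL.Path, http.StatusTemporaryRedirect)
	}))
	defer entry.Close()

	const path = "/mybucket/file-0" // hypothetical object key
	resp, err := http.Get(entry.URL + path)
	if err != nil {
		return 0, "", false, err
	}
	defer resp.Body.Close()

	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", false, err
	}
	// resp.Request is the last request sent, i.e. the one after the redirect.
	redirected = resp.Request.URL.String() == owner.URL+path
	return resp.StatusCode, string(b), redirected, nil
}

func main() {
	status, body, redirected, err := fetchViaEntry("object-bytes")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(status, body, redirected)
}
```

Because `httpbench` calls `client.Get` without setting `CheckRedirect`, it uses the same default policy, which follows up to 10 consecutive redirects.
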
### Scenario: Pattern 3 — N Clients × N Workers Paired Aggregate

Each client hits only its paired worker's local keys. This mirrors a distributed inference fleet loading model shards from Alluxio.

Deploy `httpbench` and the per-worker URL list onto N clients, then start all clients simultaneously:

```shell
# Coordinate start time
START=$(date -d '+60 seconds' +%s)
for i in 1 2 3 4 5 6; do
  ssh client-$i "
    while [ \$(date +%s) -lt $START ]; do sleep 0.1; done
    ./httpbench -c 32 -d 30s \$(cat worker-$i-urls.txt)
  " > client-$i.out 2>&1 &
done
wait

# Aggregate
awk '/GB\/s/ { sum += $2 } END { print sum, "GB/s aggregate" }' client-*.out
```

Expected aggregate on 6 × c5n.18xlarge (100 Gbps NICs, fully cached): **\~68 GB/s**, scaling near-linearly from 11.4 GB/s per client-worker pair. Compare your result against [6 Node httpbench on AWS](/ee-ai-en/benchmark/s3-api.md#id-6-node-httpbench-on-aws).

## Troubleshooting

* **Pattern 2 looks identical to Pattern 1 for large objects**: this is expected for objects of 1 GiB or larger, where the cost of the 307 handshake amortises to near zero. See [307 Redirect Cost: Large vs Small Objects](/ee-ai-en/benchmark/s3-api.md#id-307-redirect-cost-large-vs-small-objects).
* **Throughput far below 95% of the iperf3 ceiling**: most likely the data is not fully cached, or the run is hitting the single-TCP-stream cap on AWS ENA (with `-c 1`, throughput is capped at \~5 Gbps on ENA regardless of NIC size). Raise `-c` to 32 or above, and verify the cache with `bin/alluxio fs check-cached`.

For cross-tool troubleshooting (kernel tuning, health check overhead, tail latency), see [Performance Tuning and Troubleshooting](/ee-ai-en/benchmark/s3-api.md#performance-tuning-and-troubleshooting) on the hub page.

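The concurrency advice and the Pattern 3 aggregate both follow from simple arithmetic, sketched below. This is a back-of-the-envelope model, not a measurement: it takes the \~5 Gbps per-stream cap, the 100 Gbps NIC, and the 11.4 GB/s per-pair figure from the text above, and it assumes streams add up linearly until the NIC is full, which real networks only approximate. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// minStreams returns the smallest number of parallel TCP streams whose
// combined per-stream cap reaches the NIC line rate, assuming the
// streams scale linearly (a simplification).
func minStreams(nicGbps, perStreamGbps float64) int {
	return int(math.Ceil(nicGbps / perStreamGbps))
}

// gbpsToGBps converts gigabits per second to gigabytes per second.
func gbpsToGBps(gbps float64) float64 {
	return gbps / 8
}

func main() {
	// Streams needed to fill a 100 Gbps NIC at ~5 Gbps per stream.
	fmt.Println(minStreams(100, 5)) // prints 20

	// Line rate of a 100 Gbps NIC in GB/s.
	fmt.Printf("%.1f\n", gbpsToGBps(100)) // prints 12.5

	// Six client-worker pairs at 11.4 GB/s each.
	fmt.Printf("%.1f\n", 6*11.4) // prints 68.4
}
```

Under these assumptions at least 20 streams are needed per 100 Gbps NIC, so `-c 32` leaves some headroom, and 11.4 GB/s per pair is about 91% of the 12.5 GB/s line rate.
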
## See Also

* [S3 API Benchmarks](/ee-ai-en/benchmark/s3-api.md): overview, reference baselines, tool selection, cross-tool troubleshooting
* [COSBench Benchmarks](/ee-ai-en/benchmark/s3-api/cosbench.md): for complex multi-stage workloads
* [Warp Benchmarks](/ee-ai-en/benchmark/s3-api/warp.md): quick single-binary alternative, redirect-mode incompatible
* [S3 API Setup and Configuration](/ee-ai-en/data-access/s3-api.md): deployment patterns, redirect behavior, load balancer configuration