# httpbench Benchmarks

`httpbench` is a purpose-built, \~50-line Go tool for measuring Alluxio S3 API read throughput. Use it when you need **per-worker isolation** (e.g., CPU-profiling one worker, or measuring a single NIC) or when your cluster runs in **redirect mode**, where [Warp](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/warp.md) does not work. Compared to warp it: follows HTTP 307 transparently (Go default), accepts an explicit URL list instead of enumerating a bucket, skips SigV4 signing, and does not emit chunked SHA-256 payloads.

For when to pick httpbench over [COSBench](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/cosbench.md) or [Warp](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/warp.md), see [Choosing a Benchmark Tool](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#choosing-a-benchmark-tool). For reference throughput from a 6 × c5n.18xlarge cluster, see [6 Node httpbench on AWS](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#id-6-node-httpbench-on-aws).

## Prerequisites

* Go 1.21+ on the client host.
* Alluxio workers reachable from the client on the S3 API port (default 29998).
* Test dataset fully cached. Pre-load with `bin/alluxio job load --path <path> --submit` and verify with `bin/alluxio fs check-cached <path>`.

## Installation

Save the tool source as `httpbench.go`:

```go
package main

import (
    "flag"
    "fmt"
    "io"
    "net/http"
    "sync"
    "sync/atomic"
    "time"
)

func main() {
    conc := flag.Int("c", 16, "concurrency (parallel workers)")
    dur := flag.Duration("d", 30*time.Second, "duration")
    flag.Parse()
    urls := flag.Args()
    if len(urls) == 0 {
        fmt.Println("usage: httpbench -c CONC -d DUR URL1 URL2 ...")
        return
    }
    tr := &http.Transport{
        MaxIdleConns:        *conc * 2,
        MaxIdleConnsPerHost: *conc * 2,
        MaxConnsPerHost:     *conc * 2,
        IdleConnTimeout:     60 * time.Second,
        DisableCompression:  true,
        ForceAttemptHTTP2:   false,
    }
    client := &http.Client{Transport: tr, Timeout: 5 * time.Minute}

    var totalBytes, totalReqs int64
    var wg sync.WaitGroup
    deadline := time.Now().Add(*dur)
    t0 := time.Now()

    for i := 0; i < *conc; i++ {
        wg.Add(1)
        go func(gid int) {
            defer wg.Done()
            j := gid
            for time.Now().Before(deadline) {
                url := urls[j%len(urls)]
                j++
                resp, err := client.Get(url)
                if err != nil {
                    continue
                }
                n, _ := io.Copy(io.Discard, resp.Body)
                resp.Body.Close()
                if resp.StatusCode >= 200 && resp.StatusCode < 300 {
                    atomic.AddInt64(&totalBytes, n)
                    atomic.AddInt64(&totalReqs, 1)
                }
            }
        }(i)
    }
    wg.Wait()
    elapsed := time.Since(t0).Seconds()
    fmt.Printf("Reqs: %d  Bytes: %.2f GB  Time: %.2fs\n",
        totalReqs, float64(totalBytes)/1e9, elapsed)
    fmt.Printf("→ %.2f GB/s  (%.1f Gbps)\n",
        float64(totalBytes)/elapsed/1e9, float64(totalBytes)*8/elapsed/1e9)
}
```

Build once:

```shell
go build -o httpbench httpbench.go
```

Build a `file → owner worker` map. All three scenarios below reuse this map; build it once per bucket. A 1-byte Range GET against any worker returns `206` for locally-owned files or `307` (with the owner in `Location`) otherwise:

```shell
for f in $(kubectl -n alx-ns exec -i alluxio-cluster-coordinator-0 -- \
             alluxio fs ls /mybucket | awk '{print $NF}'); do
  name=$(basename "$f")
  resp=$(curl -s --range 0-0 -o /dev/null \
              -w "%{http_code}|%{redirect_url}" \
              "http://any-worker:29998/mybucket/$name")
  code=${resp%%|*}
  redir=${resp#*|}
  if [ "$code" = "206" ]; then
    echo "any-worker|$name"
  else
    owner=$(echo "$redir" | sed 's|http://||' | cut -d: -f1)
    echo "$owner|$name"
  fi
done > file_owners.txt
```

## Usage

### Scenario: Pattern 1 — Single Worker, Local-only Keys

Isolates one worker's S3 API throughput without any cross-worker routing. Use when CPU-profiling a specific worker or measuring a single NIC.

Bench one worker against only its local keys:

```shell
# Extract URLs for worker-1
grep "^worker-1|" file_owners.txt \
  | awk -F'|' '{print "http://worker-1:29998/mybucket/"$2}' \
  > worker-1-urls.txt

# Run the bench
./httpbench -c 32 -d 30s $(cat worker-1-urls.txt)
```

Sample output:

```console
Reqs: 224  Bytes: 390.86 GB  Time: 35.03s
→ 11.16 GB/s  (89.3 Gbps)
```

### Scenario: Pattern 2 — Single Client, Full Bucket via Redirect

One client drives the whole bucket through a single entry worker. 307 redirects fire for keys owned by other workers; Go's client follows each and reuses keep-alive to the real owner.

```shell
# Every URL points at the same entry worker — Alluxio redirects what it doesn't own
for name in $(awk -F'|' '{print $2}' file_owners.txt); do
  echo "http://worker-1:29998/mybucket/$name"
done > all-via-worker-1.txt

./httpbench -c 64 -d 30s $(cat all-via-worker-1.txt)
```

Bottlenecked at the client NIC, not the Alluxio cluster — throughput is \~11 GB/s for a 100 Gbps NIC, matching Pattern 1. For object-size-dependent behavior, see [307 Redirect Cost: Large vs Small Objects](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#id-307-redirect-cost-large-vs-small-objects).

### Scenario: Pattern 3 — N Clients × N Workers Paired Aggregate

Each client hits only its paired worker's local keys. This mirrors a distributed inference fleet loading model shards from Alluxio.

Deploy `httpbench` + the per-worker URL list onto N clients, then kick off all clients simultaneously:

```shell
# Coordinate start time
START=$(date -d '+60 seconds' +%s)

for i in 1 2 3 4 5 6; do
  ssh client-$i "
    while [ \$(date +%s) -lt $START ]; do sleep 0.1; done
    ./httpbench -c 32 -d 30s \$(cat worker-$i-urls.txt)
  " > client-$i.out 2>&1 &
done
wait

# Aggregate
awk '/GB\/s/ { sum += $2 } END { print sum, "GB/s aggregate" }' client-*.out
```

Expected aggregate on 6 × c5n.18xlarge (100 Gbps NICs, fully cached): **\~68 GB/s**, scaling near-linearly from per-pair 11.4 GB/s. Match against [6 Node httpbench on AWS](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#id-6-node-httpbench-on-aws).

## Troubleshooting

* **Pattern 2 looks identical to Pattern 1 for large objects** — this is expected behavior for 1+ GiB objects, where the 307 handshake amortises to near-zero. See [307 Redirect Cost: Large vs Small Objects](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#id-307-redirect-cost-large-vs-small-objects).
* **Throughput far below 95% of iperf3 ceiling** — most likely data is not fully cached, or a single-TCP-stream AWS ENA cap is hit (C=1 is capped at \~5 Gbps on ENA, regardless of NIC size). Bump `-c` to 32 or above, and verify cache with `bin/alluxio fs check-cached`.

For cross-tool troubleshooting (kernel tuning, health check overhead, tail latency), see [Performance Tuning and Troubleshooting](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md#performance-tuning-and-troubleshooting) on the hub page.

## See Also

* [S3 API Benchmarks](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api.md) — overview, reference baselines, tool selection, cross-tool troubleshooting
* [COSBench Benchmarks](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/cosbench.md) — for complex multi-stage workloads
* [Warp Benchmarks](/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/warp.md) — quick single-binary alternative, redirect-mode incompatible
* [S3 API Setup and Configuration](/ee-ai-en/ai-3.8-15.1.x/data-access/s3-api.md) — deployment patterns, redirect behavior, load balancer configuration


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://documentation.alluxio.io/ee-ai-en/ai-3.8-15.1.x/benchmark/s3-api/httpbench.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
