# S3 API

Alluxio exposes an S3-compatible REST API that allows applications built for Amazon S3 to read and write data without code changes. This page covers endpoint configuration, authentication, load balancing, and client compatibility.

> **Need low-latency writes?** If your workload requires millisecond-level `PUT` latency or async persistence, see [S3-API Write Optimization](/ee-ai-en/performance/s3-write-cache.md).

## Prerequisites

Complete [Installing on Kubernetes](/ee-ai-en/start/installing-on-kubernetes.md) through Step 5 (Mount Storage) before proceeding. The S3 API requires:

* A running Alluxio cluster (`CLUSTERPHASE` = `Ready`)
* At least one UFS mount point — mount points become the S3 buckets exposed by the API

## Quick Start

This section gets you to a working S3 endpoint in three steps using the default configuration (proxy mode, Pattern B). Most clients — including AWS CLI, boto3, and PyTorch S3 Connector — work out of the box with this setup.

### Step 1: Enable the S3 API

Add the following to `spec.properties` in `alluxio-cluster.yaml` and apply:

```yaml
spec:
  properties:
    alluxio.worker.s3.api.enabled: "true"
```

```shell
kubectl apply -f alluxio-cluster.yaml
```

The S3 API is exposed on every worker on ports `29998` (HTTP) and `29996` (HTTPS, requires [TLS](/ee-ai-en/administration/security/securing-alluxio-with-tls.md)).

### Step 2: Expose the Endpoint

**With a load balancer (recommended for production)**

Add a worker service to `alluxio-cluster.yaml`:

```yaml
spec:
  worker:
    service:
      type: LoadBalancer
```

```shell
kubectl apply -f alluxio-cluster.yaml
```

Wait for the external IP to be assigned:

```shell
kubectl -n alx-ns get svc alluxio-cluster-worker --watch
```

**✅ Success:** `EXTERNAL-IP` is populated. Use that address as `<ENDPOINT>` below.

**Without a load balancer (eval/single-node)**

Use port-forward to reach a worker directly:

```shell
kubectl -n alx-ns port-forward pod/alluxio-cluster-worker-0 29998:29998 &
```

Use `http://localhost:29998` as `<ENDPOINT>`.

### Step 3: Verify with AWS CLI

Configure path-style requests (required — virtual-hosted style is not supported):

```shell
aws configure set default.s3.addressing_style path
```

With SIMPLE auth (the default), any non-empty credentials are accepted:

```shell
# List buckets (Alluxio mount points)
AWS_ACCESS_KEY_ID=testuser AWS_SECRET_ACCESS_KEY=testpassword \
  aws s3 ls --endpoint-url http://<ENDPOINT>

# Upload a test file
AWS_ACCESS_KEY_ID=testuser AWS_SECRET_ACCESS_KEY=testpassword \
  aws s3 cp test.txt s3://<bucket>/test.txt --endpoint-url http://<ENDPOINT>

# Download it back
AWS_ACCESS_KEY_ID=testuser AWS_SECRET_ACCESS_KEY=testpassword \
  aws s3 cp s3://<bucket>/test.txt downloaded.txt --endpoint-url http://<ENDPOINT>
```

**Quick boto3 example:**

```python
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="placeholder",
    aws_secret_access_key="placeholder",
    region_name="us-east-1",
    endpoint_url="http://<ENDPOINT>",
)

for bucket in s3.list_buckets().get("Buckets", []):
    print(bucket["Name"])
```

> **Want higher throughput?** The default proxy mode traverses the cluster network twice for cross-worker reads, roughly halving throughput compared to redirect mode. If your client follows HTTP 307 redirects to non-AWS endpoints, see [Pattern A](#pattern-a-load-balancer--redirect) for near-linear scaling.

***

## Deployment Patterns

### How It Works

Each Alluxio worker exposes an S3 endpoint on port `29998` and owns a slice of the namespace via [consistent hashing](/ee-ai-en/how-alluxio-works.md#the-consistent-hash-ring). When a request lands on a worker that does not own the requested data, the worker must route it to the data owner. There are two strategies:

{% hint style="info" %}
**Background: what HTTP 307 means here.** HTTP 307 is a redirect status code — the server responds with a new URL instead of data. The client then opens a fresh connection directly to that URL. Critically, no data payload traverses the original worker: the redirect response is a tiny HTTP message, and the full read happens worker-to-client without a proxy intermediary. The only cost is one extra round-trip for the redirect itself.
{% endhint %}

* **Proxy mode** (`alluxio.worker.s3.redirect.enabled=false`, the default): the receiving worker fetches data from the owning worker and streams it to the client. Every client works. Cost: cross-worker reads traverse the network twice (\~50% throughput vs. redirect mode).
* **Redirect mode** (`alluxio.worker.s3.redirect.enabled=true`): the receiving worker issues an HTTP 307 pointing the client directly at the data owner — zero proxy overhead, linear throughput scaling. **Constraint**: the AWS SDK does not follow 307 redirects to non-AWS endpoints. Clients built on it (AWS CLI, boto3, PyTorch S3 Connector) cannot use this mode.
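The client-side difference between the two modes can be sketched in a few lines. A redirect-capable client treats a 307 as "retry the same request at this URL," while an SDK that ignores the `Location` header fails or misbehaves. This is an illustrative sketch of the client's decision, not Alluxio or AWS SDK code:

```python
def next_request_url(status: int, headers: dict, original_url: str) -> str:
    """Schematic of what a redirect-capable S3 client does with a response.

    On HTTP 307 the server sends no data, only a Location header pointing
    at the worker that owns the data; the client re-issues the same request
    there. Any other status means the data came from the original URL.
    """
    if status == 307:
        # Follow the redirect: the next request goes straight to the data owner.
        return headers["Location"]
    return original_url

# A proxy-mode worker answers 200 directly; a redirect-mode worker that does
# not own the requested data answers 307 with the owner's address.
print(next_request_url(307, {"Location": "http://worker-2:29998/bucket/key"},
                       "http://worker-1:29998/bucket/key"))
```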

### Choosing a Pattern

```
Does your client follow HTTP 307 redirects to non-AWS endpoints?
├── Yes (COSBench, minio-py)
│   └── Pattern A: Load Balancer + Redirect  (alluxio.worker.s3.redirect.enabled=true)
│       Maximum throughput — client connects directly to data-owning worker
└── No  (AWS SDK / CLI, boto3, PyTorch S3 Connector, Warp)
    └── Pattern B: Load Balancer + Proxy Mode  (alluxio.worker.s3.redirect.enabled=false, default)
        Simple setup — ~50% throughput vs Pattern A due to cross-worker proxy
```

### Pattern A: Load Balancer + Redirect

Best for: COSBench, minio-py, and any client that correctly follows HTTP 307 redirects to non-AWS endpoints.

```
┌─────────────────────────────────┐
│         S3 Clients              │
│      (minio-py, COSBench)       │
└──────────┬──────────────────────┘
           │  S3 API (HTTP)
           ▼
┌─────────────────────────────────┐
│   Load Balancer (NLB/Nginx)     │
└──────────┬──────────────────────┘
           │  Distributed across workers
           ▼
┌─────────────────────────────────────────────────┐
│  Alluxio Workers (port 29998)                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Worker 1 │  │ Worker 2 │  │ Worker N │ ...   │
│  │ NVMe SSD │  │ NVMe SSD │  │ NVMe SSD │       │
│  └──────────┘  └──────────┘  └──────────┘       │
└──────────┬──────────────────────────────────────┘
           │  Cache miss → fetch from UFS
           ▼
┌─────────────────────────────────────┐
│   Underlying Storage (S3/HDFS/...)  │
└─────────────────────────────────────┘
```

How it works: the load balancer distributes incoming requests across workers. With `alluxio.worker.s3.redirect.enabled=true`, the receiving worker checks ownership via consistent hashing and issues a 307 only when needed — if the request already lands on the data owner, no redirect occurs. The client connects directly to the data-owning worker — **zero-copy, no proxy overhead**. Read throughput scales linearly with the number of workers. For throughput baselines, see [Benchmarking S3 API Performance](/ee-ai-en/benchmark/s3-api.md).

{% tabs %}
{% tab title="Kubernetes (Operator)" %}

```yaml
apiVersion: k8s-operator.alluxio.com/v1
kind: AlluxioCluster
spec:
  properties:
    alluxio.worker.s3.api.enabled: "true"
    alluxio.worker.s3.redirect.enabled: "true"
  worker:
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: nlb
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
```

```shell
kubectl apply -f alluxio-cluster.yaml
```

Wait for the NLB to be provisioned (typically 1–2 minutes on AWS):

```shell
kubectl -n alx-ns get svc alluxio-cluster-worker --watch
```

**✅ Success:** `EXTERNAL-IP` is populated. Use that hostname as the S3 endpoint.
{% endtab %}

{% tab title="Docker / Bare-Metal" %}
Use Nginx, HAProxy, or DNS round-robin to distribute requests across worker nodes on port `29998`.
{% endtab %}
{% endtabs %}

### Pattern B: Load Balancer + Proxy Mode

Best for: AWS SDK (AWS CLI), boto3, PyTorch S3 Connector (AWS CRT), and MinIO Warp — clients that do not follow HTTP 307 redirects to non-AWS endpoints.

Use the same load balancer setup as Pattern A, but leave `alluxio.worker.s3.redirect.enabled` at its default (`false`). All clients work without code changes: the load balancer distributes requests across workers, and cross-worker reads are proxied internally.

**Trade-off**: data traverses the cluster network twice for cross-worker reads (data owner → proxy worker → client), roughly halving aggregate throughput compared to Pattern A.

Point your S3 endpoint to the load balancer address:

```
http://<LOAD_BALANCER_ADDRESS>:29998
```

{% hint style="info" %}
**Single-worker deployments**: 307 redirects only fire when the receiving worker is not the data owner. With a single worker, redirects never occur and all clients work regardless of `redirect.enabled`.
{% endhint %}

***

## Authentication

Alluxio's S3 API supports two authentication methods:

**SIMPLE (Default)** — Alluxio parses the [AWS Signature V4](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html) `Authorization` header to extract the username, but **does not validate the signature**.

* **Access Key**: The Alluxio username to perform operations as. If omitted, operations run as the user that launched the worker process.
* **Secret Key**: Any non-empty value. Required by the client to generate the signature, but ignored by Alluxio.
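To make the SIMPLE flow concrete: a SigV4 `Authorization` header carries the access key as the first segment of its `Credential` scope, and that segment is what becomes the Alluxio username. The sketch below extracts it the way any SigV4-aware server could; it is illustrative only, not Alluxio's actual parser:

```python
def username_from_sigv4(authorization: str) -> str:
    """Extract the access key (used as the Alluxio username) from a SigV4
    Authorization header. The signature itself is never validated."""
    # Header shape:
    # AWS4-HMAC-SHA256 Credential=<key>/<date>/<region>/s3/aws4_request, SignedHeaders=..., Signature=...
    for part in authorization.split(","):
        part = part.strip()
        if part.startswith("AWS4-HMAC-SHA256 Credential="):
            credential = part.split("Credential=", 1)[1]
            return credential.split("/", 1)[0]
    raise ValueError("no SigV4 Credential found in Authorization header")

header = ("AWS4-HMAC-SHA256 Credential=testuser/20240101/us-east-1/s3/aws4_request, "
          "SignedHeaders=host;x-amz-date, Signature=abc123")
print(username_from_sigv4(header))  # testuser
```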

**OIDC** — For centralized identity management using OpenID Connect tokens, refer to the [Authentication](/ee-ai-en/administration/security/enabling-authentication.md) guide.

***

## Performance

### Key Alluxio Parameters

The following parameters control S3 API performance. Apply the recommended values for high-throughput workloads.

* `alluxio.worker.s3.redirect.enabled` — Default: `false`; Set to `true` for Pattern A (redirect mode). When `false`, all cross-worker reads are proxied through the receiving worker, which adds an extra network hop and roughly halves throughput. Enable only with clients that correctly follow HTTP 307 redirects to non-AWS endpoints (see [Deployment Patterns](#deployment-patterns)).
* `alluxio.worker.s3.connection.keep.alive.enabled` — Default: `false`; Recommended: `true`. Reuses TCP connections across requests to reduce handshake overhead and improve throughput under high concurrency.
* `alluxio.worker.s3.connection.idle.max.time` — Default: `0sec`. Idle timeout for keep-alive connections; `0` means no timeout.
* `alluxio.worker.s3.access.logging.enabled` — Default: `false`; Recommended: `true` for production. When `false`, only failed requests are logged; when `true`, every request is logged. Useful for auditing and debugging; disable during benchmarks to avoid I/O overhead.
* `alluxio.worker.s3.only.https.access` — Default: `false`. When enabled, rejects non-HTTPS requests.
* `alluxio.worker.s3.redirect.health.check.enabled` — Default: `true`; Recommended: `false` for benchmarks. Setting this to `false` skips the per-request RPC health check performed before issuing a redirect, reducing overhead and improving throughput; keep it enabled in production for safety.
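Putting the recommendations above together, a high-throughput configuration might look like the following `spec.properties` fragment (adjust to your workload; the redirect line applies only to Pattern A deployments with redirect-capable clients):

```yaml
spec:
  properties:
    alluxio.worker.s3.api.enabled: "true"
    alluxio.worker.s3.connection.keep.alive.enabled: "true"
    alluxio.worker.s3.access.logging.enabled: "true"
    # Pattern A only — clients must follow HTTP 307 redirects:
    # alluxio.worker.s3.redirect.enabled: "true"
```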

### Linux Kernel Parameters

For intensive S3 benchmarks with high TCP connection rates, tuning kernel parameters can improve connection reuse and lower latency.

```shell
sysctl -w net.ipv4.tcp_tw_reuse=1
sysctl -w net.ipv4.tcp_fin_timeout=30

# Only on older kernels (e.g., CentOS 7 / kernel 3.10):
sysctl -w net.ipv4.tcp_tw_recycle=1
```

| Parameter                  | Effect                                                                                                                                                                                                                   |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `net.ipv4.tcp_tw_reuse`    | Reuses sockets in `TIME_WAIT` state, preventing port exhaustion during high-frequency requests.                                                                                                                          |
| `net.ipv4.tcp_tw_recycle`  | Accelerates `TIME_WAIT` cleanup. **Removed in Linux 4.12+** — this parameter only exists on older kernels (e.g., CentOS 7 with kernel 3.10). On newer kernels, this sysctl key does not exist and the command will fail. |
| `net.ipv4.tcp_fin_timeout` | Reduces idle connection close time, freeing resources faster. Default: 60s → recommended: 30s.                                                                                                                           |

> **Warning**: Modifying kernel parameters can impact system stability. Ensure you understand these settings before applying them, especially in production environments.
>
> **NAT Compatibility**: On older kernels where `tcp_tw_recycle` is available, enabling it may cause connection failures for clients behind a NAT device due to timestamp validation issues. Do not use in NAT environments.

## Advanced Features

### MultiPartUpload (MPU)

For large object uploads that pass through to the underlying storage, these parameters control multipart upload behavior:

| Parameter                                             | Default | Recommended               | Description                                                                  |
| ----------------------------------------------------- | ------- | ------------------------- | ---------------------------------------------------------------------------- |
| `alluxio.underfs.object.store.multipart.upload.async` | `false` | `true`                    | Enables async multipart uploads to UFS, improving write throughput.          |
| `alluxio.underfs.s3.upload.threads.max`               | `20`    | `256` for high-throughput | Maximum concurrent upload threads per worker for S3 UFS writes.              |
| `alluxio.underfs.s3.multipart.upload.buffer.number`   | `64`    | `256` for high-throughput | Number of multipart upload buffers. Increase alongside `upload.threads.max`. |

By default, multipart uploads are **pass-through** — each part is uploaded directly to the underlying storage, and the object is committed there upon `CompleteMultipartUpload`. Alluxio does not buffer the parts locally.
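The pass-through sequence is the standard three-call S3 MPU flow: `CreateMultipartUpload`, one `UploadPart` per part, then `CompleteMultipartUpload`. A minimal sketch using a boto3-style client (bucket, key, and data are placeholders; any boto3 `client("s3")` pointed at the Alluxio endpoint exposes these methods):

```python
def multipart_upload(s3, bucket: str, key: str, parts_data: list) -> list:
    """Upload a sequence of byte chunks as one object via the S3 MPU flow.

    Each UploadPart passes straight through to the underlying storage; the
    object is committed there only on CompleteMultipartUpload. Returns the
    part list sent in the completion request.
    """
    upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = upload["UploadId"]
    parts = []
    for number, body in enumerate(parts_data, start=1):  # part numbers start at 1
        result = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                                PartNumber=number, Body=body)
        parts.append({"PartNumber": number, "ETag": result["ETag"]})
    s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id,
                                 MultipartUpload={"Parts": parts})
    return parts
```

Under AWS S3 semantics, every part except the last must be at least 5 MiB; keep part sizes above that threshold when splitting real data.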

If you need to write parts to Alluxio's cache layer first and asynchronously persist them to the underlying storage, see [S3-API Write Optimization](/ee-ai-en/performance/s3-write-cache.md).

### Tagging and Metadata

* **Enable Tagging**: Requires extended attribute support for your UFS:

  ```properties
  alluxio.underfs.xattr.change.enabled=true
  ```
* **Tag Limits**: User-defined tags are limited to 10 per object/bucket per [S3 tag restrictions](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html). Disable with `alluxio.worker.s3.tagging.restrictions.enabled=false`.
* **Metadata Size**: User-defined metadata is limited to 2KB per [S3 metadata restrictions](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html). Adjust with `alluxio.worker.s3.header.metadata.max.size`.
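As a sketch of the tag limit in client code, the helper below builds the `Tagging` payload that boto3's `put_object_tagging` expects and rejects oversized tag sets up front. The limit constant mirrors the default restriction above; the commented boto3 call assumes an `s3` client configured against your Alluxio endpoint:

```python
MAX_TAGS = 10  # default S3-compatible limit enforced by Alluxio

def build_tagging(tags: dict) -> dict:
    """Convert {key: value} tags into the TagSet structure used by
    boto3's put_object_tagging, enforcing the per-object tag limit."""
    if len(tags) > MAX_TAGS:
        raise ValueError("at most %d tags per object, got %d" % (MAX_TAGS, len(tags)))
    return {"TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]}

tagging = build_tagging({"team": "ml", "stage": "raw"})
# s3.put_object_tagging(Bucket="<bucket>", Key="<object-key>", Tagging=tagging)
print(tagging)
```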

## Limitations

Before adopting Alluxio's S3 API, be aware of these constraints:

* **Path-style only** — Virtual-hosted style requests (`http://<bucket>.<endpoint>/...`) are not supported. All clients must use path-style (`http://<endpoint>/<bucket>/<key>`).
* **Buckets**: Only top-level directories in the Alluxio namespace are treated as S3 buckets. The root directory (`/`) is not a bucket. To preserve existing S3 URIs, use the bucket name as the mount path:

  ```
  alluxio mount add --path /<bucket-name> --ufs-uri s3://<bucket-name>/
  ```
* **No Versioning or Locking**: If multiple clients write to the same object simultaneously, the last write wins.
* **Unsupported Characters**: Object keys must not contain `?`, `\`, `./`, or `../`; `//` may cause undefined behavior. For **multipart uploads (MPU)**, object keys are additionally restricted to letters, digits, spaces, and the characters `_ . : / = + - @` due to implementation constraints. Characters outside this set — such as `&`, `#`, `%`, `!`, `^`, `*`, `|`, `<`, `>`, `(`, `)`, `[`, `]`, `{`, `}`, `"`, `'`, and `~` — are not supported in MPU object keys.
* **Folder Objects**: Subdirectories are returned as 0-byte folder objects in `ListObjects(V2)` responses, matching AWS S3 console behavior.
* **No Atomicity**: `If-Match` and `If-None-Match` headers are not supported in GetObject and PutObject.
* **Signatures are not validated** — Alluxio extracts the username from the `Authorization` header but does not verify the cryptographic signature. The secret key can be any non-empty string.
* **AWS SDK clients must use proxy mode** — AWS SDK (AWS CLI), boto3, and PyTorch S3 Connector (AWS CRT) do not follow HTTP 307 redirects to non-AWS endpoints. Use [Pattern B: Load Balancer + Proxy Mode](#pattern-b-load-balancer-proxy-mode) (`alluxio.worker.s3.redirect.enabled=false`, the default).
* **No virtual-hosted DNS** — There is no DNS wildcard resolution for `<bucket>.endpoint`; this is a corollary of path-style only.
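The character rules above can be checked client-side before an upload fails server-side. A sketch of a validator for MPU object keys, with the allowed set taken verbatim from the restriction listed earlier (this is a convenience check, not an official Alluxio utility):

```python
import string

# Letters, digits, spaces, and _ . : / = + - @ per the MPU key restriction.
MPU_ALLOWED = set(string.ascii_letters + string.digits + " _.:/=+-@")

def is_valid_mpu_key(key: str) -> bool:
    """True if an object key is safe for multipart uploads: only characters
    from the allowed set, and none of the generally forbidden sequences
    (?, backslash, //, ./, ../)."""
    if any(seq in key for seq in ("?", "\\", "//", "./", "../")):
        return False
    return all(c in MPU_ALLOWED for c in key)

print(is_valid_mpu_key("datasets/train_part-001.bin"))  # True
print(is_valid_mpu_key("logs/2024&temp.txt"))           # False: '&' not allowed
```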

## Supported S3 Actions

The following table lists supported S3 API actions. For detailed usage, see the [official S3 API documentation](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations.html).

| S3 API Action                                                                                               | Supported Headers                                                                                           | Supported Query Parameters                                                              |
| ----------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| [AbortMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_AbortMultipartUpload.html)       | N/A                                                                                                         | N/A                                                                                     |
| [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) | N/A                                                                                                         | N/A                                                                                     |
| [CopyObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html)                           | `Content-Type`, `x-amz-copy-source`, `x-amz-metadata-directive`, `x-amz-tagging-directive`, `x-amz-tagging` | N/A                                                                                     |
| [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html)     | N/A                                                                                                         | N/A                                                                                     |
| [DeleteBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteBucketTagging.html)         | N/A                                                                                                         | N/A                                                                                     |
| [DeleteObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObject.html)                       | N/A                                                                                                         | N/A                                                                                     |
| [DeleteObjects](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html)                     | N/A                                                                                                         | N/A                                                                                     |
| [DeleteObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjectTagging.html)         | N/A                                                                                                         | N/A                                                                                     |
| [GetBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketTagging.html)               | N/A                                                                                                         | N/A                                                                                     |
| [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html)                             | `Range`                                                                                                     | N/A                                                                                     |
| [GetObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTagging.html)               | N/A                                                                                                         | N/A                                                                                     |
| [HeadBucket](https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadBucket.html)                           | N/A                                                                                                         | N/A                                                                                     |
| [HeadObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html)                           | N/A                                                                                                         | N/A                                                                                     |
| [ListBuckets](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListBuckets.html)                         | N/A                                                                                                         | N/A                                                                                     |
| [ListMultipartUploads](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListMultipartUploads.html)       | N/A                                                                                                         | N/A                                                                                     |
| [ListObjects](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html)                         | N/A                                                                                                         | `delimiter`, `encoding-type`, `marker`, `max-keys`, `prefix`                            |
| [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html)                     | N/A                                                                                                         | `continuation-token`, `delimiter`, `encoding-type`, `max-keys`, `prefix`, `start-after` |
| [ListParts](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListParts.html)                             | N/A                                                                                                         | N/A                                                                                     |
| [PutBucketTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketTagging.html)               | N/A                                                                                                         | N/A                                                                                     |
| [PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html)                             | `Content-Length`, `Content-MD5`, `Content-Type`, `x-amz-tagging`                                            | N/A                                                                                     |
| [PutObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html)               | N/A                                                                                                         | N/A                                                                                     |
| [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html)                           | `Content-Length`, `Content-MD5`                                                                             | N/A                                                                                     |
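Since `GetObject` supports the `Range` header, partial reads work the same way they do against AWS S3. A small helper for building the header value, followed by the boto3 call it would feed (the `s3` client, bucket, and key are placeholders for your own setup):

```python
def byte_range(start: int, end: int) -> str:
    """Build an HTTP byte-range header value; both bounds are inclusive."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    return "bytes=%d-%d" % (start, end)

# First 1 MiB of an object:
header = byte_range(0, 1024 * 1024 - 1)
# response = s3.get_object(Bucket="<bucket>", Key="<object-key>", Range=header)
# chunk = response["Body"].read()
print(header)  # bytes=0-1048575
```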

## Usage Examples

### minio-py — Pattern A (redirect-capable)

[minio-py](https://minio-py.min.io/) is a Python SDK that correctly follows HTTP 307 redirects to non-AWS endpoints. Use with a load balancer endpoint.

```python
from minio import Minio

client = Minio(
    "<LOAD_BALANCER_ADDRESS>",   # host:port, no http:// prefix
    access_key="testuser",       # Alluxio username (or any value)
    secret_key="testpassword",   # Ignored by Alluxio
    secure=False,                # Set True if using HTTPS
)

# List buckets (Alluxio mount points)
for bucket in client.list_buckets():
    print(bucket.name)

# Download an object
response = client.get_object("<bucket>", "<object-key>")
data = response.read()
response.close()
response.release_conn()

# Upload an object of unknown length (part_size is required when length=-1)
with open("file.txt", "rb") as data:
    client.put_object("<bucket>", "<object-key>", data=data, length=-1, part_size=10*1024*1024)
```

Install with: `pip install minio`

### boto3 — Pattern B (proxy mode)

boto3 does not follow HTTP 307 redirects to non-AWS endpoints. Use Pattern B (proxy mode) with a load balancer endpoint.

```python
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="placeholder",      # Alluxio username (or any value)
    aws_secret_access_key="placeholder",   # Ignored by Alluxio
    region_name="us-east-1",
    endpoint_url="http://<LOAD_BALANCER_ADDRESS>:29998",
)

# List buckets (Alluxio mount points)
response = s3.list_buckets()
for bucket in response.get("Buckets", []):
    print(f" - {bucket['Name']}")
```

Install with: `pip install boto3`

### PyTorch S3 Connector — Pattern B (proxy mode)

PyTorch S3 Connector is built on AWS CRT and does not follow HTTP 307 redirects to non-AWS endpoints. Use Pattern B (proxy mode) with a load balancer endpoint.

```python
from s3torchconnector import S3IterableDataset, S3ClientConfig

s3_client_config = S3ClientConfig(force_path_style=True)

dataset = S3IterableDataset.from_prefix(
    "s3://s3-mount",                                    # Alluxio mount point
    region="us-east-1",
    endpoint="http://<LOAD_BALANCER_ADDRESS>:29998",
    s3client_config=s3_client_config,
)

for item in dataset:
    content = item.read()
    print(f"{item.key}: {len(content)} bytes")
```

Install with:

```shell
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install s3torchconnector
```

## Troubleshooting

### AWS CLI, boto3, or PyTorch return errors or wrong data

**Symptom**: Operations fail with connection errors, or return unexpected results.

**Cause**: These clients are built on the AWS SDK (or AWS CRT), which does not follow HTTP 307 redirects to non-AWS endpoints. They require [Pattern B: Load Balancer + Proxy Mode](#pattern-b-load-balancer-proxy-mode) — see [Deployment Patterns](#deployment-patterns) for setup instructions.

### `NoSuchBucket` error even though the bucket exists in S3

**Symptom**: `aws s3 ls s3://<bucket>` or a `GetObject` returns `NoSuchBucket`.

**Cause**: In Alluxio's S3 API, buckets are **Alluxio mount point names**, not UFS bucket names. The bucket name in S3 requests must match the mount path in Alluxio (e.g., `/s3` → bucket name `s3`), not the underlying S3 bucket name.

**Diagnosis**: List the actual bucket names Alluxio exposes:

```shell
AWS_ACCESS_KEY_ID=testuser AWS_SECRET_ACCESS_KEY=testpassword \
  aws s3 ls --endpoint-url http://<ENDPOINT>
```

Then use the name shown in the output — not the UFS bucket name — in subsequent requests.

### Connection refused on port 29998

**Symptom**: `aws s3 ls` returns `Could not connect to the endpoint URL`.

**Cause**: The S3 API is not enabled on the workers, or the worker service is not reachable.

**Fix**:

1. Confirm S3 API is enabled:

   ```shell
   kubectl -n alx-ns exec -i alluxio-cluster-worker-0 -- alluxio conf get alluxio.worker.s3.api.enabled
   ```

   Expected: `true`. If not, add `alluxio.worker.s3.api.enabled: "true"` to `spec.properties` in `alluxio-cluster.yaml` and apply.
2. If running on Kubernetes without a load balancer, use port-forward:

   ```shell
   kubectl -n alx-ns port-forward pod/alluxio-cluster-worker-0 29998:29998 &
   ```

### `InvalidAccessKeyId` or signature errors

**Cause**: Alluxio's SIMPLE auth mode does not validate the signature, but the `Authorization` header must still be present and well-formed. Some clients omit it when no credentials are configured.

**Fix**: Pass any non-empty access key and secret key. With AWS CLI:

```shell
AWS_ACCESS_KEY_ID=testuser AWS_SECRET_ACCESS_KEY=testpassword \
  aws s3 ls --endpoint-url http://<ENDPOINT>
```

## See Also

* [S3-API Write Optimization](/ee-ai-en/performance/s3-write-cache.md) — low-latency writes with async persistence (requires FoundationDB)
* [Benchmarking S3 API Performance](/ee-ai-en/benchmark/s3-api.md) — reference baselines, tool selection (COSBench / Warp / httpbench), and Linux kernel tuning
* [S3 UFS Integration](/ee-ai-en/ufs/s3.md) — multipart upload tuning, high concurrency settings, and S3 region configuration

