# Client APIs

Alluxio exposes three client-facing APIs. All three access the same cached data — the choice depends on your application's language, framework, and operational constraints.

## Choose Your API

|               | [POSIX API](/ee-ai-en/ai-3.8-15.1.x/data-access/fuse-based-posix-api.md) | [S3 API](/ee-ai-en/ai-3.8-15.1.x/data-access/s3-api.md)       | [Python FSSpec](/ee-ai-en/ai-3.8-15.1.x/data-access/fsspec.md) |
| ------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------- | -------------------------------------------------------------- |
| **Use when**  | App uses standard filesystem calls; zero code changes required           | App already uses boto3 / AWS SDK / any S3-compatible client   | Pure Python: Pandas, PyArrow, Ray, Hugging Face Datasets       |
| **Languages** | Any                                                                      | Any (HTTP)                                                    | Python only                                                    |
| **Setup**     | FUSE daemon on each client node (CSI or DaemonSet on K8s)                | Enable S3 endpoint on workers; point SDK at `worker-ip:29998` | `pip install alluxiofs`                                        |
| **Status**    | GA                                                                       | GA                                                            | Experimental                                                   |

### Quick decision guide

* **ML training (PyTorch, TensorFlow) on K8s** → POSIX API via CSI, or S3 API with PyTorch S3 Connector
* **Existing S3 workload** (boto3, Spark, etc.) → S3 API — swap the endpoint URL, no other changes
* **Pandas / PyArrow / Ray** → FSSpec (`alluxiofs`) — no mount or daemon required
* **Performance-critical, high-throughput** → S3 API with HTTP 307 redirect enabled

## API Overview

### POSIX API

Mounts Alluxio as a local directory. Any tool that uses standard filesystem calls (`open`, `read`, `write`, `ls`, `cp`) works without modification — no code changes, no SDK dependencies.

A FUSE daemon runs on each client node and translates filesystem calls into Alluxio RPCs, transparently to the application. The tradeoff: each operation passes through the FUSE process (an extra hop compared with direct SDK access), and the daemon must be deployed on every client node.

```shell
# After FUSE is mounted — no code changes needed
ls /mnt/alluxio/fuse/s3/
cat /mnt/alluxio/fuse/s3/dataset/train.csv
```
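
The same holds for application code: once the mount exists, ordinary file I/O is enough. The following is a minimal sketch, assuming the FUSE mount point `/mnt/alluxio/fuse` from the shell example; `count_rows` is a hypothetical helper written for illustration, not part of any Alluxio SDK.

```python
import csv
from pathlib import Path

# Assumption: Alluxio FUSE is mounted here, as in the shell example above.
FUSE_ROOT = Path("/mnt/alluxio/fuse")


def count_rows(csv_path) -> int:
    """Count data rows in a CSV file, excluding the header row."""
    # Plain open(): nothing here is Alluxio-specific. When csv_path lives
    # under the FUSE mount, the daemon serves these reads.
    with open(csv_path, newline="") as f:
        total = sum(1 for _ in csv.reader(f))
    return max(total - 1, 0)
```

For example, `count_rows(FUSE_ROOT / "s3" / "dataset" / "train.csv")` would read the file shown in the shell example, provided it exists in your deployment.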

→ [POSIX API Guide](/ee-ai-en/ai-3.8-15.1.x/data-access/fuse-based-posix-api.md)

### S3 API

Exposes an S3-compatible HTTP endpoint directly on each Alluxio worker process (port 29998). Applications using any AWS S3 SDK connect to Alluxio by changing the endpoint URL — no other code changes needed.

Because the endpoint is part of the worker process, there is no separate daemon or extra component to deploy. See the [S3 API Guide](/ee-ai-en/ai-3.8-15.1.x/data-access/s3-api.md) for deployment patterns and throughput tuning.

```python
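import boto3

# Replace <WORKER_IP> with the address of an Alluxio worker; 29998 is the S3 endpoint port.
client = boto3.client(
    "s3",
    endpoint_url="http://<WORKER_IP>:29998",
    aws_access_key_id="alluxio",
    aws_secret_access_key="alluxio",
)

# List the objects in the bucket named "s3".
client.list_objects_v2(Bucket="s3")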
```
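
Reading an object works the same way as against any S3 endpoint. The sketch below wraps the standard boto3 `get_object` call; `make_alluxio_s3_client` and `read_object` are hypothetical helper names, and the credentials are the placeholder values from the example above.

```python
def make_alluxio_s3_client(worker_host: str, port: int = 29998):
    """Create a boto3 S3 client that targets an Alluxio worker's S3 endpoint."""
    import boto3  # imported lazily so read_object() can be used with any client

    return boto3.client(
        "s3",
        endpoint_url=f"http://{worker_host}:{port}",
        aws_access_key_id="alluxio",       # placeholder, as in the example above
        aws_secret_access_key="alluxio",   # placeholder, as in the example above
    )


def read_object(client, bucket: str, key: str) -> bytes:
    """Fetch one object and return its full contents as bytes."""
    response = client.get_object(Bucket=bucket, Key=key)
    # "Body" is a streaming object; read() pulls the whole payload into memory.
    return response["Body"].read()
```

A call such as `read_object(make_alluxio_s3_client("<WORKER_IP>"), "s3", "dataset/train.csv")` would return the file's bytes, assuming that object exists. For large objects, streaming the body in chunks is usually preferable to a single `read()`.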

→ [S3 API Guide](/ee-ai-en/ai-3.8-15.1.x/data-access/s3-api.md)

### Python FSSpec API

`alluxiofs` implements the [fsspec](https://filesystem-spec.readthedocs.io/) interface, making Alluxio a drop-in replacement for `s3fs` in Pandas, PyArrow, Ray Dataset, and Hugging Face Datasets.

> **Experimental**: `alluxiofs` is not yet GA. Use UFS-native paths (e.g. `s3://bucket/path`), not Alluxio virtual mount paths.

```python
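import fsspec
import pandas as pd
from alluxiofs import AlluxioFileSystem

# Register the "alluxiofs" protocol with fsspec; clobber=True replaces any earlier registration.
fsspec.register_implementation("alluxiofs", AlluxioFileSystem, clobber=True)

# Replace <WORKER_IP> with the address of an Alluxio worker.
fs = fsspec.filesystem("alluxiofs", worker_hosts="<WORKER_IP>", target_protocol="s3")

# Pass the UFS-native path (see the note above), not an Alluxio mount path.
df = pd.read_parquet("s3://my-bucket/dataset/train.parquet", filesystem=fs)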
```

→ [FSSpec API Guide](/ee-ai-en/ai-3.8-15.1.x/data-access/fsspec.md)

## Path Mapping

All three APIs resolve paths through the same **mount table**, in which each mount point binds an Alluxio virtual path to a UFS URI. POSIX and the S3 API address data by its Alluxio path, while FSSpec addresses it by its UFS URI. For example, if `s3://my-bucket/data/` is mounted at `/data`, then:

* POSIX: `cat /mnt/alluxio/fuse/data/file.csv`
* S3 API: `client.get_object(Bucket="data", Key="file.csv")`
* FSSpec: `fs.open("s3://my-bucket/data/file.csv")`

All three resolve to the same cached data. Mount points are always at the top level of the Alluxio namespace and cannot be nested. See [Underlying Storage](/ee-ai-en/ai-3.8-15.1.x/ufs.md) for how to add and manage mounts.
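
The correspondence between the three forms can be sketched in a few lines of Python. This is an illustration only, not how Alluxio resolves paths internally: `MOUNT_TABLE`, `FUSE_ROOT`, `ClientPaths`, and `client_paths` are hypothetical names, and the rule that the S3 bucket equals the top-level mount name is inferred from the examples on this page.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical mount table for illustration: Alluxio mount point -> UFS URI.
MOUNT_TABLE = {"/data": "s3://my-bucket/data/"}

# Assumption: the FUSE mount location used in the examples on this page.
FUSE_ROOT = "/mnt/alluxio/fuse"


@dataclass(frozen=True)
class ClientPaths:
    posix: str      # path under the FUSE mount
    s3_bucket: str  # bucket name for the S3 API
    s3_key: str     # object key for the S3 API
    ufs_uri: str    # UFS-native URI, as used with FSSpec


def client_paths(alluxio_path, mount_table=MOUNT_TABLE, fuse_root=FUSE_ROOT):
    """Derive the per-API form of one Alluxio file path."""
    parts = PurePosixPath(alluxio_path).parts  # e.g. ("/", "data", "file.csv")
    if len(parts) < 3 or parts[0] != "/":
        raise ValueError(f"expected /<mount>/<file path>, got {alluxio_path!r}")

    # Mount points sit at the top level, so the first component names the mount.
    mount_name = parts[1]
    mount_point = "/" + mount_name
    if mount_point not in mount_table:
        raise KeyError(f"no mount point {mount_point!r} in the mount table")

    key = "/".join(parts[2:])
    return ClientPaths(
        posix=f"{fuse_root}/{mount_name}/{key}",
        s3_bucket=mount_name,
        s3_key=key,
        ufs_uri=mount_table[mount_point].rstrip("/") + "/" + key,
    )
```

Calling `client_paths("/data/file.csv")` yields the POSIX path, bucket and key, and UFS URI listed in the bullets above.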

### Virtual Path Mapping

For cases where the mount table alone is not enough — namespace isolation, zero-downtime UFS migration, or multi-version path aliases — Alluxio supports Virtual Path Mapping. It is a server-side path rewrite layer that applies transparently across POSIX, S3 API, and CLI before path resolution. Your application code is unaffected.

Example: redirect `/model/latest/` to `/model/v3/` without changing any client configuration.
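
Conceptually, such a redirect is a prefix rewrite applied to the path before it is resolved. The sketch below models only that idea: `REWRITE_RULES` and `rewrite_path` are hypothetical, the rule table is not Alluxio's configuration format, and the longest-prefix-first ordering is an assumption made for the illustration.

```python
# Hypothetical rewrite rules (source prefix -> target prefix).
# This is NOT Alluxio's configuration format; see the guide for the real one.
REWRITE_RULES = {"/model/latest/": "/model/v3/"}


def rewrite_path(path: str, rules=REWRITE_RULES) -> str:
    """Rewrite a path's prefix according to the first matching rule."""
    # Assumption: try longer prefixes first so more specific rules win.
    for prefix in sorted(rules, key=len, reverse=True):
        if path.startswith(prefix):
            return rules[prefix] + path[len(prefix):]
    # Paths that match no rule pass through unchanged.
    return path
```

With this rule, a client that requests `/model/latest/weights.bin` is served `/model/v3/weights.bin`, and pointing `latest` at a newer version means changing one rule rather than any client.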

→ [Virtual Path Mapping Guide](/ee-ai-en/ai-3.8-15.1.x/data-access/client-virtual-path-mapping.md)

