# Client APIs

Alluxio exposes three client-facing APIs. All three access the same cached data — the choice depends on your application's language, framework, and operational constraints.

## Choose Your API

|               | [POSIX API](https://documentation.alluxio.io/ee-ai-en/data-access/fuse-based-posix-api) | [S3 API](https://documentation.alluxio.io/ee-ai-en/data-access/s3-api) | [Python FSSpec](https://documentation.alluxio.io/ee-ai-en/data-access/fsspec) |
| ------------- | --------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| **Use when**  | App uses standard filesystem calls; zero code changes required                          | App already uses boto3 / AWS SDK / any S3-compatible client            | Pure Python: Pandas, PyArrow, Ray, Hugging Face Datasets                      |
| **Languages** | Any                                                                                     | Any (HTTP)                                                             | Python only                                                                   |
| **Setup**     | FUSE daemon on each client node (CSI or DaemonSet on K8s)                               | Enable S3 endpoint on workers; point SDK at `worker-ip:29998`          | `pip install alluxiofs`                                                       |
| **Status**    | GA                                                                                      | GA                                                                     | Experimental                                                                  |

### Quick Decision Guide

* **ML training (PyTorch, TensorFlow) on K8s** → POSIX API via CSI, or S3 API with PyTorch S3 Connector
* **Existing S3 workload** (boto3, Spark, etc.) → S3 API — swap the endpoint URL, no other changes
* **Pandas / PyArrow / Ray** → FSSpec (`alluxiofs`) — no mount or daemon required
* **Performance-critical, high-throughput** → S3 API with HTTP 307 redirect enabled

## API Overview

### POSIX API

Mounts Alluxio as a local directory. Any tool that uses standard filesystem calls (`open`, `read`, `write`) or commands built on them (`ls`, `cp`) works without modification, with no code changes and no SDK dependencies.

A FUSE daemon runs on each client node and translates filesystem calls into Alluxio RPCs, transparently to the application. The tradeoff: each operation passes through the FUSE process (an extra hop compared with direct SDK access), and the daemon must be deployed on every client node.

```shell
# After FUSE is mounted — no code changes needed
ls /mnt/alluxio/fuse/s3/
cat /mnt/alluxio/fuse/s3/dataset/train.csv
```
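
Because the mount behaves like a local directory, application code can use ordinary file I/O as well. The following is a minimal Python sketch, assuming the FUSE mount root `/mnt/alluxio/fuse` from the shell example above; `count_lines` is a hypothetical helper for illustration, not part of any Alluxio API.

```python
from pathlib import Path

# Assumed mount root, taken from the shell example; adjust to your deployment.
FUSE_ROOT = Path("/mnt/alluxio/fuse")


def count_lines(path: Path) -> int:
    """Count the lines in a file using only standard file I/O."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    dataset = FUSE_ROOT / "s3" / "dataset" / "train.csv"
    if dataset.exists():
        print(f"{dataset}: {count_lines(dataset)} lines")
    else:
        print(f"{dataset} not found; is the FUSE mount available?")
```

Nothing in the helper refers to Alluxio, so the same code runs unchanged against a local disk or the FUSE mount.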

→ [POSIX API Guide](https://documentation.alluxio.io/ee-ai-en/data-access/fuse-based-posix-api)

### S3 API

Exposes an S3-compatible HTTP endpoint directly on each Alluxio worker process (port 29998). Applications using any AWS S3 SDK connect to Alluxio by changing the endpoint URL — no other code changes needed.

Because the endpoint is part of the worker, there is no separate daemon or extra component to deploy. See the guide for deployment patterns and throughput tuning.

```python
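import boto3

# Point the client at an Alluxio worker instead of AWS
client = boto3.client(
    "s3",
    endpoint_url="http://<WORKER_IP>:29998",
    aws_access_key_id="alluxio",
    aws_secret_access_key="alluxio",
)

# List the objects in the bucket named "s3"
response = client.list_objects_v2(Bucket="s3")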
```

→ [S3 API Guide](https://documentation.alluxio.io/ee-ai-en/data-access/s3-api)

### Python FSSpec API

`alluxiofs` implements the [fsspec](https://filesystem-spec.readthedocs.io/) interface, making Alluxio a drop-in replacement for `s3fs` in Pandas, PyArrow, Ray Dataset, and Hugging Face Datasets.

> **Experimental**: `alluxiofs` is not yet GA. Use UFS-native paths (e.g. `s3://bucket/path`), not Alluxio virtual mount paths.

```python
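import fsspec
import pandas as pd
from alluxiofs import AlluxioFileSystem

# Register the "alluxiofs" protocol with fsspec
fsspec.register_implementation("alluxiofs", AlluxioFileSystem, clobber=True)

# Connect through an Alluxio worker; target_protocol names the UFS scheme
fs = fsspec.filesystem("alluxiofs", worker_hosts="<WORKER_IP>", target_protocol="s3")

# Read using the UFS-native path, not an Alluxio virtual path
df = pd.read_parquet("s3://my-bucket/dataset/train.parquet", filesystem=fs)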
```

→ [FSSpec API Guide](https://documentation.alluxio.io/ee-ai-en/data-access/fsspec)

## Path Mapping

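All three APIs reach the same data, but they address it differently. Alluxio resolves paths against the **mount table**: each mount point binds an Alluxio virtual path to a UFS URI. POSIX and the S3 API address files by the virtual path, while FSSpec uses the UFS URI directly. For example, if `s3://my-bucket/data/` is mounted at `/data`, then: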

* POSIX: `cat /mnt/alluxio/fuse/data/file.csv`
* S3 API: `client.get_object(Bucket="data", Key="file.csv")`
* FSSpec: `fs.open("s3://my-bucket/data/file.csv")`

All three resolve to the same cached data. Mount points are always at the top level of the Alluxio namespace and cannot be nested. See [Underlying Storage](https://documentation.alluxio.io/ee-ai-en/ufs) for how to add and manage mounts.
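
The correspondence between the three forms can be sketched in plain Python. This is an illustration of the mapping shown in the example above, not Alluxio's implementation: `MOUNT_TABLE`, `ResolvedPath`, and `resolve` are hypothetical names, and the FUSE root `/mnt/alluxio/fuse` is assumed from the examples on this page.

```python
from dataclasses import dataclass

# Assumed FUSE mount root, taken from the examples on this page.
FUSE_ROOT = "/mnt/alluxio/fuse"

# Hypothetical in-memory mount table: Alluxio mount point -> UFS URI.
MOUNT_TABLE = {"/data": "s3://my-bucket/data/"}


@dataclass(frozen=True)
class ResolvedPath:
    posix_path: str  # path under the FUSE mount
    s3_bucket: str   # bucket name seen through the S3 API
    s3_key: str      # object key seen through the S3 API
    ufs_uri: str     # UFS-native URI, as used with FSSpec


def resolve(alluxio_path: str, mounts: dict[str, str] = MOUNT_TABLE) -> ResolvedPath:
    """Map an Alluxio virtual path to its POSIX, S3 API, and UFS forms."""
    # Mount points are top-level, so the first path component names the mount.
    parts = alluxio_path.strip("/").split("/", 1)
    mount_name = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    mount_point = "/" + mount_name
    if mount_point not in mounts:
        raise KeyError(f"no mount point for {alluxio_path!r}")
    ufs_root = mounts[mount_point].rstrip("/")
    return ResolvedPath(
        posix_path=f"{FUSE_ROOT}{mount_point}/{rest}".rstrip("/"),
        s3_bucket=mount_name,
        s3_key=rest,
        ufs_uri=f"{ufs_root}/{rest}",
    )


if __name__ == "__main__":
    print(resolve("/data/file.csv"))
```

Calling `resolve("/data/file.csv")` yields the same POSIX path, bucket and key, and UFS URI as the three bullets above.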

### Virtual Path Mapping

For cases where the mount table alone is not enough — namespace isolation, zero-downtime UFS migration, or multi-version path aliases — Alluxio supports Virtual Path Mapping. It is a server-side path rewrite layer that applies transparently across POSIX, S3 API, and CLI before path resolution. Your application code is unaffected.

Example: redirect `/model/latest/` to `/model/v3/` without changing any client configuration.
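
Conceptually, such a rewrite behaves like a prefix substitution applied before the mount table is consulted. The sketch below illustrates that idea only. `REWRITE_RULES` and `rewrite` are hypothetical names, the rule format is not Alluxio's configuration syntax, and longest-prefix matching is an assumption made for the example.

```python
# Hypothetical rewrite rules: virtual prefix -> target prefix.
# This is not Alluxio's configuration format; see the guide for that.
REWRITE_RULES = {"/model/latest/": "/model/v3/"}


def rewrite(path: str, rules: dict[str, str] = REWRITE_RULES) -> str:
    """Apply the longest matching prefix rule; return the path unchanged if none match."""
    # Assumption: longer prefixes take priority over shorter ones.
    for prefix in sorted(rules, key=len, reverse=True):
        if path.startswith(prefix):
            return rules[prefix] + path[len(prefix):]
    return path


if __name__ == "__main__":
    print(rewrite("/model/latest/weights.bin"))
    print(rewrite("/data/file.csv"))
```

Paths that match no rule pass through untouched, which is why clients that never use the aliased prefix see no change.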

→ [Virtual Path Mapping Guide](https://documentation.alluxio.io/ee-ai-en/data-access/client-virtual-path-mapping)
