POSIX API

Alluxio's POSIX API allows you to mount the Alluxio namespace as a standard filesystem on most Unix-like operating systems. This feature, commonly known as "Alluxio FUSE," lets you use standard command-line tools (ls, cat, mkdir) and existing applications to interact with data in Alluxio without any code changes.

Unlike storage-specific filesystem wrappers such as S3FS, Alluxio FUSE acts as a generic caching and data orchestration layer for the many storage systems Alluxio supports, which makes it well suited to accelerating I/O in workloads such as AI/ML model training and serving.

When to Use FUSE

The FUSE interface is particularly powerful for traditional applications and modern AI/ML workloads. Common use cases include:

  • AI/ML Model Training: When training models with frameworks like PyTorch or TensorFlow, you can read datasets directly from the mounted FUSE path. This simplifies data access and leverages Alluxio's caching to dramatically speed up training jobs. See Model Loading for performance tuning.

  • Model Serving: For inference servers that need to load models quickly, FUSE provides low-latency access to models stored in Alluxio.

  • Legacy Applications: Applications that expect a standard filesystem can be pointed to the FUSE mount to read and write data from Alluxio without modification.

  • Interactive Data Exploration: Data scientists and engineers can use shell commands (ls, cat, head) to explore and interact with data in Alluxio just like a local filesystem.

Prerequisites

Each Quick Start subsection below lists additional deployment-specific prerequisites.

Quick Start

Three deployment methods are supported. Use the first that applies to your environment:

Method-specific prerequisites:

Method 1: Kubernetes with CSI

The Container Storage Interface (CSI) is the standard, recommended way to use Alluxio FUSE in Kubernetes. The Alluxio Operator automatically provisions a PersistentVolumeClaim (PVC) named alluxio-cluster-fuse when the cluster is installed.

To use it, mount this PVC into your application pods. The operator will handle the creation and binding of the underlying PersistentVolume (PV).

Example Pod Configuration:

Save the following configuration to a file named fuse-pod.yaml:

Create the pod:

Verify the pod is running and the FUSE mount is accessible:

Expected: STATUS = Running, READY = 1/1.

Key details:

  • Shared FUSE Process: Multiple pods on the same Kubernetes node can use the same PVC and will share a single Alluxio FUSE process for efficiency.

  • mountPropagation: HostToContainer: This setting is critical. It ensures that if the FUSE process crashes, the mount point can be automatically recovered and re-propagated to your container.

Once mounted, you can interact with the /data directory as if it were the root of your Alluxio namespace.
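The key details above can be sketched as a pod manifest built in Python and emitted as JSON, which kubectl accepts in place of YAML. This is a minimal illustration, not the reference configuration: the PVC name, the /data mount path, and mountPropagation come from the text, while the pod name, the busybox image, the sleep command, and the volume name alluxio-fuse are placeholder assumptions.

```python
import json


def build_fuse_pod(pod_name="fuse-test", image="busybox",
                   claim_name="alluxio-cluster-fuse", mount_path="/data"):
    """Return a Pod manifest (as a dict) that mounts the Alluxio FUSE PVC.

    pod_name, image, and the volume name are hypothetical placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": pod_name,
                "image": image,
                # Keep the container alive so the mount can be inspected.
                "command": ["sleep", "3600"],
                "volumeMounts": [{
                    "name": "alluxio-fuse",
                    "mountPath": mount_path,
                    # Lets a recovered FUSE mount propagate into the container.
                    "mountPropagation": "HostToContainer",
                }],
            }],
            "volumes": [{
                "name": "alluxio-fuse",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


if __name__ == "__main__":
    # Write the output to a file and pass it to kubectl with -f.
    print(json.dumps(build_fuse_pod(), indent=2))
```

Several pods generated this way on the same node would reference the same claim, which is the case in which they share one FUSE process.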

Method 2: Kubernetes with DaemonSet

If your Kubernetes version or environment does not support CSI, you can deploy FUSE using a DaemonSet. This approach runs a FUSE pod on each node (or a subset of nodes you select).

  1. Configure the DaemonSet: Before deploying your Alluxio cluster, modify your alluxio-cluster.yaml to use the daemonSet type and specify a host path for the mount.

    This will deploy FUSE pods on all nodes with the label alluxio.com/selected-for-fuse: true. Label the nodes first:

  2. Mount in Your Application Pod: In your application pod, mount the hostPath where the FUSE DaemonSet exposes the filesystem.

    Similar to the CSI method, mountPropagation is essential for auto-recovery.

  3. Verify: After deploying, confirm the DaemonSet pods are running and the mount is accessible:

    Expected: FUSE pods are Running on each labeled node.

Method 3: Docker / Bare-Metal

On hosts without Kubernetes, the Alluxio FUSE client runs as a standalone Docker container in host-network mode. The reference setup is one FUSE container per client host, pointing at the Alluxio cluster brought up during Docker Installation.

Create the host mount point (ownership must match the alluxio UID inside the image, which is 1000), then launch the container:

  • --device /dev/fuse + --cap-add SYS_ADMIN grant the capabilities the container needs to mount FUSE. --security-opt apparmor=unconfined is required on distributions whose default AppArmor profile blocks FUSE from containers.

  • -v /mnt/alluxio:/mnt/alluxio:rshared uses recursive shared mount propagation so the FUSE mount is visible from the host.

  • Fill in <JAVA_OPTS> with the etcd endpoint for cluster discovery and JVM heap / direct-memory sizing — see Customizing Resource Limits.

✅ Verify the mount:

The command lists any registered UFS mounts. If it returns "Transport endpoint is not connected", the container has exited; check sudo docker logs alluxio-fuse.
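Programs see the "Transport endpoint is not connected" condition as an OSError with errno ENOTCONN. The following is a minimal sketch of a probe that a monitoring script could run against the mount point. The function name and the three status strings are illustrative assumptions, and a result of "ok" only shows that the directory is listable, not that it is backed by FUSE.

```python
import errno
import os


def probe_mount(path):
    """Classify a mount point as 'ok', 'disconnected', or 'missing'.

    'disconnected' corresponds to "Transport endpoint is not connected".
    """
    try:
        os.listdir(path)
    except FileNotFoundError:
        # The mount point directory itself does not exist.
        return "missing"
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            # The FUSE daemon behind this mount has gone away.
            return "disconnected"
        raise  # Anything else (e.g. permission errors) is left to the caller.
    return "ok"


if __name__ == "__main__":
    # /mnt/alluxio/fuse is the Docker / Bare-Metal path used in this guide.
    print(probe_mount("/mnt/alluxio/fuse"))
```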

Verifying End-to-End Access

Regardless of method, confirm a round-trip read and write through the FUSE mount, then verify the same file appears when listed directly via Alluxio. The examples below assume the FUSE mount is at /data (Kubernetes) or /mnt/alluxio/fuse (Docker / Bare-Metal).

✅ Success: The cat returns hello, world!, and alluxio fs ls shows the same file size, confirming FUSE writes flow through to Alluxio.
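The FUSE side of this round trip can also be scripted. The sketch below writes a small file under the mount, reads it back, and returns the size to compare against the alluxio fs ls output. It assumes the mount path is passed in by the caller, and the file name hello.txt is a hypothetical choice. It does not call Alluxio itself.

```python
import os


def fuse_round_trip(mount_dir, name="hello.txt", payload=b"hello, world!\n"):
    """Write payload under mount_dir, read it back, and return (path, size)."""
    path = os.path.join(mount_dir, name)

    # Sequential write of a new file, which needs no extra configuration.
    with open(path, "wb") as f:
        f.write(payload)

    with open(path, "rb") as f:
        data = f.read()

    if data != payload:
        raise RuntimeError(f"read back {data!r}, expected {payload!r}")

    # The size to compare with the listing on the Alluxio side.
    return path, os.path.getsize(path)


if __name__ == "__main__":
    # Use /data on Kubernetes or /mnt/alluxio/fuse on Docker / Bare-Metal.
    print(fuse_round_trip("/data"))
```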

POSIX Compatibility

While most standard filesystem operations are supported, Alluxio FUSE does not provide full POSIX compatibility. Below is a summary of supported and unsupported operations.

File Operations

Supported:

  • Create and delete files

  • Rename files

  • Sequential, random and concurrent reads

  • Sequential, append, random and concurrent writes

  • Truncate or overwrite files

  • Symbolic links (ln -s)

  • Get file status (stat)

Unsupported:

  • Hard links (ln)

  • File locking (flock)

  • Changing ownership (chown) or permissions (chmod)

  • Changing access/modification times (utimens)

  • Extended attributes (chattr, sticky bit, xattr)

  • Atomic concurrent writes to the same file

Note: Some features, such as append and random writes and symbolic links, are supported but disabled by default. See Advanced Configuration for instructions on how to enable them.

Directory Operations

Supported:

  • Create and delete directories

  • Rename directories

  • List directory contents (ls)

  • Get directory status (stat)

Unsupported:

  • No major unsupported operations.

Other Limitations

  • Special Files: Device files, pipes, and FIFOs are not supported.

  • Path Names: Avoid using special characters (?, \) or patterns (./, ../) in file or directory names.

  • Capacity Reporting: df, statvfs, and similar calls do not reflect the UFS backing-store capacity. Treat the mount as effectively unbounded for sizing purposes.

  • Metadata Freshness: File and directory metadata is cached in the kernel for attr_timeout / entry_timeout seconds (default 60). Files modified directly on the UFS while a FUSE client is running may appear stale for up to this window. Lower the timeouts if you need sub-minute consistency against external writers — see Customizing FUSE Mount Options.
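The Path Names guidance above can be applied as a pre-flight check before writing through the mount. This is a minimal sketch under one interpretation of that guidance: it flags any path component that contains ? or \, and treats the ./ and ../ patterns as the components . and .. respectively. The function and constant names are hypothetical.

```python
UNSAFE_CHARS = ("?", "\\")
UNSAFE_COMPONENTS = (".", "..")


def unsafe_components(relative_path):
    """Return the components of relative_path that the guidance says to avoid."""
    bad = []
    for part in relative_path.split("/"):
        if not part:
            continue  # Ignore empty pieces from leading or doubled slashes.
        if part in UNSAFE_COMPONENTS or any(ch in part for ch in UNSAFE_CHARS):
            bad.append(part)
    return bad


if __name__ == "__main__":
    print(unsafe_components("datasets/train/part-0001.parquet"))  # []
    print(unsafe_components("./logs/what?.txt"))  # ['.', 'what?.txt']
```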

Advanced Configuration

Enabling Append and Random Writes

To enable append and random write operations, set the following property in your Alluxio configuration (alluxio-site.properties or via the Helm chart values):

This allows applications to modify existing files, which is useful for workloads like logging or databases, but may have performance implications.

Enabling Symbolic Links

Symbolic links (symlinks) are disabled by default. To enable them, set the following property in your Alluxio configuration (alluxio-site.properties or via the Helm chart values):

Enabling Parallel getattr Operations

By default, the FUSE kernel module serializes lookup and readdir operations within the same directory. To improve performance for workloads that issue highly concurrent metadata operations (such as getattr on many files within a single directory), you can enable parallel directory operations.

To enable this feature, set the following property in your Alluxio configuration (alluxio-site.properties):

Note: This feature is currently recommended for read-only workloads.

Isolating Data Access

By default, the FUSE mount provides access to the entire Alluxio namespace. For multi-tenant environments, you may want to restrict a user's access to a specific subdirectory.

Use ephemeral CSI volumes with the mountPath volume attribute to scope access to a subdirectory. This approach avoids auto-recovery issues associated with subPath and does not require creating separate StorageClass or PVC resources.

Multiple subdirectories can share the same FUSE instance by defining additional volumes with different mountPath values.

Using subPath (Deprecated)

Deprecated: Mounts that use subPath break the auto-recovery mechanism and are difficult to recover after the FUSE process restarts. Use ephemeral CSI volumes instead.

You can mount a specific subdirectory within the Alluxio namespace into your pod using the subPath field.

Accessing FUSE from Another Namespace

If your application runs in a different namespace from the Alluxio cluster, you must create a corresponding PVC in your application's namespace.

  1. Create the PVC in your namespace: The storageClassName must point to the FUSE StorageClass created by the operator in the Alluxio namespace (e.g., alx-ns-alluxio-cluster-fuse). Save the following to a file named csi-pvc.yaml:

  2. Apply the PVC:

    Verify:

    Expected: PVC exists.

Customizing FUSE Mount Options

You can tune FUSE performance by providing mount options in the AlluxioCluster YAML. These options are passed directly to the underlying FUSE driver. For a full list, see the FUSE documentation.

Example Configuration:

Commonly Tuned Options:

| Mount option | FUSE kernel default | Alluxio Operator default | Description |
| --- | --- | --- | --- |
| kernel_cache | disabled | enabled | Allows the kernel to cache file data, which can significantly improve read performance. Only use this if the underlying files are not modified externally (i.e., outside of Alluxio). |
| auto_cache | disabled | disabled | Similar to kernel_cache, but the cache is invalidated if the file's modification time or size changes. Prefer this over kernel_cache on bare-metal deployments with mutable data. |
| attr_timeout=N | 1.0 | 60 | Seconds for which file and directory attributes (permissions, size) are cached by the kernel. Increasing this reduces metadata overhead on repeated stat calls. |
| entry_timeout=N | 1.0 | 60 | Seconds for which filename lookups are cached. Increasing this speeds up path resolutions for workloads with many repeated file opens. |
| max_background=N | 12 | 128 | Maximum number of outstanding background requests the FUSE kernel driver is allowed to queue. Increase for workloads with high I/O concurrency. |
| max_idle_threads=N | 10 | 128 | Maximum number of idle FUSE daemon threads. Increasing this prevents overhead from frequent thread creation/destruction under heavy concurrent load. |
| ro | disabled | disabled | Mount the FUSE filesystem as read-only. Useful for serving datasets that should never be modified through the mount. |

For read-heavy AI/ML workloads, see File Reading Optimization for additional Alluxio-level tuning beyond FUSE mount options.

On bare-metal client hosts driving hundreds of concurrent threads, raise max_idle_threads and max_background from the 128 default to 256.

Customizing Resource Limits

You can adjust the CPU and memory resources allocated to the FUSE pods and their JVMs.

Memory limit formula:

For the config above (-Xmx22g, -XX:MaxDirectMemorySize=10g), the minimum limit is 22 + 10 + 2 = 34 GiB; the example rounds this up to 36 GiB.

If -XX:MaxDirectMemorySize is omitted, the JVM defaults it to the same value as -Xmx, so the container limit typically needs to be 2.5× -Xmx or more.
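The arithmetic above can be sketched as a small helper. It assumes the minimum limit is heap plus direct memory plus a fixed 2 GiB term, as inferred from the worked example (22 + 10 + 2 = 34); treat that overhead value as an assumption rather than a documented constant. The function name is hypothetical.

```python
OVERHEAD_GIB = 2  # Assumption: fixed term inferred from the 22 + 10 + 2 example.


def min_memory_limit_gib(xmx_gib, max_direct_gib=None, overhead_gib=OVERHEAD_GIB):
    """Return the minimum container memory limit in GiB for the given JVM sizing."""
    if max_direct_gib is None:
        # With the flag omitted, direct memory defaults to the same value as -Xmx.
        max_direct_gib = xmx_gib
    return xmx_gib + max_direct_gib + overhead_gib


if __name__ == "__main__":
    print(min_memory_limit_gib(22, 10))  # 34
    print(min_memory_limit_gib(8, 4))    # 14
```

Any configured limit should be at least the returned value; the examples in this section leave a few GiB of headroom on top.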

Profile reference:

| Profile | -Xmx | -XX:MaxDirectMemorySize | Memory limit | When to use |
| --- | --- | --- | --- | --- |
| Evaluation | 8g | 4g | 16 GiB | Dev/test and small clusters |
| Standard | 22g | 10g | 36 GiB | Production Kubernetes pods (default for most workloads) |
| High throughput | 48g | 64g | 120 GiB | Bare-metal hosts with large NIC bandwidth and hot-read workloads |

Performance

Diagnosing Where You Are Bottlenecked

When throughput is below expectation, the symptom tells you where to look:

| Symptom | Likely bottleneck | What to do |
| --- | --- | --- |
| FUSE container CPU pinned near 100% × cores (top, nstat) | FUSE side is saturated | Add another FUSE client node, or tune mount options (see Customizing FUSE Mount Options) |
| FUSE CPU well below saturation, but latency is high | Worker or UFS not serving fast enough | See Slow read performance for step-by-step diagnosis, including cache-metric checks |
| Neither: low CPU and low latency, but low throughput | Client host (CPU, NIC, kernel) | Run top, nstat, and iperf3 against the worker |

Profiling a FUSE Container

To capture a CPU flamegraph from a live FUSE container, add --cap-add SYS_PTRACE to the docker run in addition to --cap-add SYS_ADMIN. Then attach async-profiler (or any JVM profiler you prefer) to the FUSE JVM inside the container and copy the generated HTML out with docker cp for analysis.

Troubleshooting

FUSE mount shows "Transport endpoint is not connected"

Symptom: Accessing the mount path returns Transport endpoint is not connected.

Cause: The FUSE process crashed or was restarted, and the mount was not recovered.

Solution:

  1. Verify mountPropagation: HostToContainer is set in the application pod spec. Without it, auto-recovery cannot work.

  2. Check if the FUSE pod is running:

  3. If the FUSE pod is running but the mount is stale, delete and recreate the application pod:

FUSE process exits with OOM

Symptom: FUSE repeatedly crashes — on Kubernetes, CrashLoopBackOff / Exit Code 137 / OOMKilled; on Docker, the container stops and docker logs reports OutOfMemoryError or a SIGKILL from the cgroup.

Cause: The container memory limit is too low for the configured JVM heap and direct memory.

Solution: Ensure the memory limit satisfies:

Check FUSE logs before the crash:

Look for OutOfMemoryError to determine whether to increase -Xmx or -XX:MaxDirectMemorySize. See Customizing Resource Limits.

Application pod stuck in ContainerCreating

Symptom: Application pod remains in ContainerCreating status after requesting the FUSE PVC.

Cause: The CSI driver is not installed, or the FUSE PVC does not exist.

Solution:

  1. Check events on the pod:

  2. If the event mentions the PVC is not found, verify the PVC exists:

  3. If the CSI nodeplugin is missing, verify the operator was installed with CSI enabled (the default). Reinstall the operator without alluxio-csi.enabled: false if needed.

Permission denied on FUSE mount

Symptom: ls: cannot access '/data': Permission denied when accessing the mount.

Cause: The FUSE mount does not include the allow_other option. Without it, only the user who mounted the filesystem can access it.

Solution: Add allow_other to the FUSE mount options in alluxio-cluster.yaml:

Then recreate the Alluxio cluster for the change to take effect.

For fine-grained access control, see Enabling Authorization for FUSE.

Slow read performance

Symptom: Reading files through FUSE is significantly slower than expected.

Diagnosis:

  1. Check if data is cached in Alluxio:

    If the file shows 0% cached, the first read will be slow as it fetches from the underlying storage.

  2. Check FUSE mount options — ensure kernel_cache or auto_cache and increased attr_timeout/entry_timeout values are set. See Customizing FUSE Mount Options.

  3. For AI/ML training workloads, preload data before starting training:

  4. If the symptom is specifically tail latency (P99) rather than average throughput, also investigate worker-side JVM GC pauses and UFS fallback reads under load — check worker logs for UFS read entries and tune GC if pauses are confirmed.

For comprehensive read performance tuning, see File Reading Optimization. For benchmarking, see Benchmarking POSIX Performance.
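To put a number on "slower than expected", a quick sequential-read timing through the mount can help. The sketch below reads a file twice and reports throughput for each pass. It is a rough indicator only, not a benchmark: a faster second pass suggests the first read was cold, but the second pass may be served from the kernel page cache (for example with kernel_cache enabled) rather than from Alluxio. The function names and the 4 MiB chunk size are illustrative assumptions.

```python
import time


def timed_read(path, chunk_size=4 * 1024 * 1024):
    """Read path sequentially and return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_reads(path):
    """Time two back-to-back reads of the same file and print MiB/s for each."""
    for label in ("first read", "second read"):
        size, seconds = timed_read(path)
        mib = size / (1024 * 1024)
        # Guard against a zero interval on very small files.
        rate = mib / seconds if seconds > 0 else float("inf")
        print(f"{label}: {mib:.1f} MiB in {seconds:.3f} s ({rate:.1f} MiB/s)")
```

Call compare_reads on a representative file under the FUSE mount, and compare the result with the same read issued directly against the underlying storage where that is possible.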

Cleanup

Remove any test pod and custom PVCs created during setup:

The alluxio-cluster-fuse PVC is managed by the Alluxio Operator and will be cleaned up automatically when the cluster is deleted. Do not delete it manually.
