FUSE Write Optimization


This guide shows how to use the Write Cache backend through the FUSE POSIX interface, enabling low-latency writes via standard filesystem calls (write(), open(), close()). Write Cache must already be deployed before following this guide.

How It Relates to S3-API Write Cache

The Write Cache backend (FoundationDB metadata + NVMe data + async UFS persistence) is shared between the two access interfaces:

| | S3-API Write Cache | FUSE Write Optimization (this guide) |
|---|---|---|
| Interface | S3-compatible API (PUT, GET) | POSIX filesystem calls (write, read) |
| Client | AWS CLI, boto3, s3fs, any S3 client | Any POSIX application, ML frameworks, shell tools |
| Write policies | WRITE_THROUGH, WRITE_BACK, TRANSIENT | Same |
| FoundationDB | Required | Required (same FDB cluster) |
| Additional limitations | None beyond S3 semantics | See POSIX Compatibility in Write-Cache Mode |

Before You Start

When FUSE Write Cache is active, a larger fraction of NVMe capacity is consumed by unpersisted write data. Increase the pinned-space ratio (alluxio.worker.page.store.pinned.file.capacity.limit.ratio) from the default 0.3 to 0.5 in your alluxio-cluster.yaml:

Apply the change:

Size workers' NVMe accordingly: at ratio 0.5 and a total cache of 1 TiB, up to 512 GiB can be occupied by unpersisted write data at any time. If incoming write throughput stays above persistence throughput, the pinned space fills and Alluxio returns out-of-space errors. See Cache Space Management.
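The sizing arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope helper, not an Alluxio tool: the function names are hypothetical, and the fill-time estimate assumes constant ingest and persistence throughput, which real workloads rarely have.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

def pinned_capacity_bytes(total_cache_bytes: int, pinned_ratio: float) -> int:
    """Upper bound on unpersisted write data for a given cache size and ratio."""
    return int(total_cache_bytes * pinned_ratio)

def seconds_until_full(pinned_bytes: int, ingest_bps: float, persist_bps: float,
                       used_bytes: int = 0):
    """Time until pinned space is exhausted, or None if persistence keeps up.

    Assumes constant throughput on both sides (a simplification).
    """
    net_bps = ingest_bps - persist_bps
    if net_bps <= 0:
        return None  # persistence drains at least as fast as writes arrive
    return (pinned_bytes - used_bytes) / net_bps

limit = pinned_capacity_bytes(1 * TIB, 0.5)
print(limit // GIB)                                   # 512
print(seconds_until_full(limit, 2 * GIB, 1 * GIB))    # 512.0
```

With 1 TiB of cache at ratio 0.5, a sustained 1 GiB/s gap between ingest and persistence would exhaust the pinned space in under nine minutes, which is why the ratio and the NVMe size should be chosen together.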

Deploy a FUSE Client Pod

The operator creates a PVC named <CLUSTER_NAME>-fuse during cluster installation. Mount it with mountPropagation: HostToContainer for auto-recovery if the FUSE process restarts.

Expected: STATUS = Running, READY = 1/1.

For full FUSE deployment options (DaemonSet, Docker / Bare-Metal), see POSIX API.

Configure Write-Back Paths

Write policies are configured at the path level, the same as in S3-API Write Optimization. The paths refer to the Alluxio namespace (e.g., /s3/checkpoints), not to the path as seen under the FUSE mount (e.g., /data/s3/checkpoints).
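When scripting policy changes, it can help to derive the namespace path from the path an application sees. The sketch below assumes the FUSE mount point is /data, inferred from the example paths above; `to_alluxio_path` is a hypothetical helper, not part of any Alluxio client.

```python
from pathlib import PurePosixPath

def to_alluxio_path(fuse_path: str, mount_point: str = "/data") -> str:
    """Translate a path under the FUSE mount into an Alluxio namespace path.

    Raises ValueError if fuse_path is not located under mount_point.
    """
    relative = PurePosixPath(fuse_path).relative_to(mount_point)
    return str(PurePosixPath("/") / relative)

print(to_alluxio_path("/data/s3/checkpoints"))  # /s3/checkpoints
```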

Non-interactive configuration (for scripting):

Expected: Update successful!

Verify a specific path resolves to the expected policy:

Expected: output contains "policyMode": "WRITE_BACK".

Verify Write-Back via FUSE

Write a file through FUSE and confirm it is eventually persisted to UFS:
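As an illustration of the write step, the following Python sketch creates a file with plain POSIX calls (open, write, close). The target path in the usage comment is an assumption based on the example mount path used earlier in this guide; substitute a path under your own mount that is covered by a WRITE_BACK policy.

```python
import os

def write_file(path: str, payload: bytes) -> int:
    """Create a new file and write payload to it using low-level POSIX calls."""
    # O_EXCL makes the create fail if the file already exists, which matches
    # the write-once behaviour of a write-cache mount.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(payload)
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, view[written:])
    finally:
        os.close(fd)
    return written

# Example (path is an assumption, adjust to your mount):
# write_file("/data/s3/checkpoints/probe.bin", b"hello write cache")
```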

Wait for async persistence (up to alluxio.write.cache.async.file.check.period, default 10min):

Expected: all files report PERSISTED within 2 minutes.
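In automated tests, waiting for persistence is usually a polling loop with a deadline. The helper below is a generic sketch: `check` stands for whatever status probe you use (for example, a hypothetical function that inspects the persistence state of a file), and nothing here calls Alluxio directly.

```python
import time
from typing import Callable

def wait_until(check: Callable[[], bool], timeout_s: float,
               interval_s: float = 5.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll check() until it returns True or timeout_s elapses.

    Returns True on success and False on timeout. check() is always called
    at least once, even with a zero timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)

# Example with a hypothetical probe:
# ok = wait_until(lambda: is_persisted("/s3/checkpoints/probe.bin"), timeout_s=600)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real delays.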


POSIX Compatibility in Write-Cache Mode


rename() returns EIO

All rename() operations fail with EIO while any write-cache policy is active, regardless of whether the file has been persisted to UFS. This affects:

  • Shell mv and rename

  • Python os.rename(), pathlib.Path.rename()

  • Write-then-rename patterns (write to .tmp, rename into place)

Workaround — silly rename interception (opt-in): When applications perform rm on open files, Linux internally issues a rename() to .fuse_hidden*. Enable the interceptor to handle this transparently:

With this option enabled, Alluxio intercepts .fuse_hidden* renames and handles open-file deletion without triggering an S3 CopyObject + DeleteObject. Default is false.
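For applications you control, one possible way to cope with the EIO from rename() is to fall back to copy-then-delete. The sketch below is an assumption about what may be acceptable for your workload, not a documented Alluxio pattern: the fallback is not atomic, it reads the source file in full, and it still fails if the destination already exists on a write-once mount.

```python
import errno
import os
import shutil

def move_with_fallback(src: str, dst: str) -> str:
    """Try rename(); on EIO, copy src to dst and remove src instead.

    Returns "renamed" or "copied" so callers can log which path was taken.
    Any error other than EIO is re-raised unchanged.
    """
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise
    # Non-atomic fallback: readers may briefly see both files.
    shutil.copyfile(src, dst)
    os.remove(src)
    return "copied"
```

Where possible, writing directly to the final file name avoids the rename altogether and is the simpler option.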

Files are write-once after close

Once a file is closed, it cannot be re-opened for writing, appending, or truncating:

| Operation | errno |
|---|---|
| open(path, O_CREAT \| O_EXCL) when the file already exists | EEXIST |
| open(path, O_WRONLY) or open(path, O_RDWR) when the file already exists | EACCES |

Impact: Applications that update files in place (databases, log rotation, config rewriters) will not work through a write-cache FUSE mount. The write-cache FUSE mount is best suited for write-once workloads: model checkpoints, training datasets, ETL stage outputs.
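A common way to fit a write-once mount is to give every version its own file name instead of overwriting a single "latest" file. The following sketch shows that pattern for checkpoints; the `ckpt-<step>.bin` naming scheme and both function names are illustrative assumptions.

```python
import os

def save_checkpoint(directory: str, step: int, payload: bytes) -> str:
    """Write one checkpoint to a new, uniquely named file and return its path."""
    path = os.path.join(directory, f"ckpt-{step:08d}.bin")
    # Mode "xb" is an exclusive create (O_CREAT | O_EXCL): it raises
    # FileExistsError instead of reopening an existing file for writing.
    with open(path, "xb") as f:
        f.write(payload)
    return path

def latest_checkpoint(directory: str):
    """Return the path of the highest-numbered checkpoint, or None if empty."""
    names = sorted(
        n for n in os.listdir(directory)
        if n.startswith("ckpt-") and n.endswith(".bin")
    )
    # Zero-padded step numbers make lexical order match numeric order.
    return os.path.join(directory, names[-1]) if names else None
```

Readers locate the newest version by listing the directory, so no file is ever reopened for writing after it has been closed.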

link() returns EOPNOTSUPP. Tools that rely on hard links (rsync --hard-links, some package managers) will not work through the mount.

Cache page reclaim on delete (15.1.3+ behavior)

When a file is deleted via FUSE rm or rm -rf, cached pages are reclaimed on all workers that hold copies of the file — not only the hash-ring owner. In builds prior to 15.1.3, only the owner worker reclaimed pages; other workers retained orphaned pages until the next eviction cycle.


Monitoring Async Persistence

Two CLI commands (available 15.1.3+) let you inspect in-flight persist operations without waiting for alluxio fs ls:

Use async-persist stat when alluxio fs ls shows a file stuck in NOT_PERSISTED to determine whether the issue is in the queue or the upload itself.

Key Configuration

| Property | Default | Description |
|---|---|---|
| alluxio.write.cache.enabled | false | Enables Write Cache (shared with S3 API). |
| alluxio.worker.page.store.pinned.file.capacity.limit.ratio | 0.3 | Max fraction of NVMe capacity for unpersisted write data. Raise to 0.5 for write-heavy FUSE workloads. |
| alluxio.write.cache.async.file.check.period | 10min | Scan interval for orphan detection. Shorter values increase FDB load. |
| alluxio.write.cache.async.check.orphan.timeout | 1h | Uncommitted writes older than this are treated as abandoned and cleaned up. |
| alluxio.fuse.silly.rename.interceptor.enabled | false | CLIENT-scoped. Intercepts .fuse_hidden* rename/unlink for transparent rm of open files. |
| alluxio.worker.mark.writing.files.duration | 10min | If a file is open for write but receives no new data for this duration, the worker treats it as a dangling write eligible for cleanup. Timer resets on every write. |

Troubleshooting

Directory deletion returns DEADLINE_EXCEEDED

Running alluxio fs rm -R or rm -rf on a WRITE_BACK path may fail with:


Recovery steps:

  1. Verify UFS state directly:

  2. If files are gone from S3, the deletion succeeded at the data layer. Re-running alluxio fs rm -R will confirm by returning Path does not exist.

  3. Pagestore disk space may not shrink immediately — orphaned pages are reclaimed on the next eviction cycle.


Files stuck in NOT_PERSISTED

If files remain NOT_PERSISTED beyond alluxio.write.cache.async.file.check.period:

  1. Check async-persist queue:

  2. Check specific file status:

  3. Check worker logs for upload errors:

  4. If UFS is unreachable, retries enter exponential backoff (up to alluxio.worker.write.cache.async.persist.retry.max.interval, default 1h). Verify UFS connectivity from the worker pod.
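To get a feel for how long a file can sit between retries, the capped exponential backoff can be modelled as below. Only the 1h cap comes from the default stated above; the 60-second initial delay and the doubling factor are assumptions chosen for illustration, and the actual worker schedule may differ.

```python
def backoff_schedule(initial_s: float, max_s: float, attempts: int,
                     factor: float = 2.0) -> list:
    """Return the delay before each retry, growing by factor and capped at max_s."""
    delays = []
    delay = float(initial_s)
    cap = float(max_s)
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays

# Assumed 60 s initial delay, capped at the documented default of 1h (3600 s).
schedule = backoff_schedule(initial_s=60, max_s=3600, attempts=8)
print(schedule)
print(sum(schedule) / 3600, "hours until the 8th retry has been scheduled")
```

Under these assumptions the delay reaches the cap by the seventh retry, so a file that failed during a long UFS outage may wait up to an hour after connectivity returns before it is retried.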


rename() returns EIO unexpectedly

This is expected behaviour whenever any write-cache policy is active (see "rename() returns EIO" under POSIX Compatibility in Write-Cache Mode). If your application relies on rename:

  • Switch the affected path to NO_CACHE policy to bypass the write cache entirely for that path.

  • Enable alluxio.fuse.silly.rename.interceptor.enabled: "true" if the rename is triggered by rm of an open file.


FUSE pod OOM or mount not connected

These are not write-cache-specific. See FUSE Troubleshooting.

