FUSE Write Optimization
This feature has been experimental since AI-3.8.15.1.4.
This guide shows how to use the Write Cache backend through the FUSE POSIX interface, enabling low-latency writes via standard filesystem calls (write(), open(), close()). Write Cache must already be deployed before following this guide.
How It Relates to S3-API Write Cache
The Write Cache backend (FoundationDB metadata + NVMe data + async UFS persistence) is shared between the two access interfaces:
| | S3-API Write Cache | FUSE Write Cache |
| --- | --- | --- |
| Interface | S3-compatible API (PUT, GET) | POSIX filesystem calls (write, read) |
| Client | AWS CLI, boto3, s3fs, any S3 client | Any POSIX application, ML frameworks, shell tools |
| Write policies | WRITE_THROUGH, WRITE_BACK, TRANSIENT | Same |
| FoundationDB | Required | Required (same FDB cluster) |
Before You Start
Recommended Cluster Configuration
When FUSE Write Cache is active, a larger fraction of NVMe capacity is consumed by unpersisted write data. Increase the pinned-space ratio from the default 0.3 to 0.5 in your alluxio-cluster.yaml:
Apply the change:
Size the workers' NVMe accordingly: at ratio 0.5 and a total cache of 1 TiB, up to 512 GiB (half the cache) can be occupied by unpersisted write data at any time. If incoming write throughput exceeds persistence throughput, the pinned space fills and Alluxio returns out-of-space errors. See Cache Space Management.
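The sizing rule can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the ingest and persist rates are assumptions, not Alluxio APIs or measured figures. At a ratio of 0.5 and 1 TiB of cache, the exact ceiling works out to 512 GiB.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def pinned_capacity_bytes(total_cache_bytes: int, pinned_ratio: float) -> int:
    """Upper bound on NVMe bytes that unpersisted write data may occupy."""
    if not 0.0 <= pinned_ratio <= 1.0:
        raise ValueError("pinned_ratio must be between 0 and 1")
    return int(total_cache_bytes * pinned_ratio)


def seconds_until_full(pinned_bytes: int, ingest_bps: float, persist_bps: float):
    """Time until the pinned space fills, or None if persistence keeps up."""
    backlog_growth = ingest_bps - persist_bps
    if backlog_growth <= 0:
        return None
    return pinned_bytes / backlog_growth


limit = pinned_capacity_bytes(1 * TIB, 0.5)
print(limit // GIB)  # 512
# Hypothetical rates: 2 GiB/s of incoming writes, 1 GiB/s persisted to UFS.
print(seconds_until_full(limit, 2 * GIB, 1 * GIB))  # 512.0
```

Under these assumed rates the pinned space would fill in under nine minutes, which is why sustained ingest above the persistence rate needs either more NVMe or a faster UFS path.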
Deploy a FUSE Client Pod
The operator creates a PVC named <CLUSTER_NAME>-fuse during cluster installation. Mount it with mountPropagation: HostToContainer for auto-recovery if the FUSE process restarts.
Expected: STATUS = Running, READY = 1/1.
For full FUSE deployment options (DaemonSet, Docker / Bare-Metal), see POSIX API.
Configure Write-Back Paths
Write policies are configured at the path level — the same as in S3-API Write Optimization. The paths refer to the Alluxio namespace (e.g., /s3/checkpoints), not the FUSE mount path (/data/s3/checkpoints).
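Because policies are keyed by Alluxio namespace path, scripts that start from a FUSE path need to strip the mount point first. The helper below is a minimal sketch of that translation; to_alluxio_path is a hypothetical name, and the /data mount point is inferred from the example paths above rather than being a fixed default.

```python
import posixpath


def to_alluxio_path(fuse_path: str, mount_point: str = "/data") -> str:
    """Translate a path under the FUSE mount into an Alluxio namespace path."""
    fuse_path = posixpath.normpath(fuse_path)
    mount_point = posixpath.normpath(mount_point)
    if fuse_path == mount_point:
        return "/"
    # Compare against "mount/" so that /database is not mistaken for /data.
    prefix = mount_point.rstrip("/") + "/"
    if not fuse_path.startswith(prefix):
        raise ValueError(f"{fuse_path} is not under the mount point {mount_point}")
    return "/" + fuse_path[len(prefix):]


print(to_alluxio_path("/data/s3/checkpoints"))  # /s3/checkpoints
```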
Non-interactive configuration (for scripting):
Expected: Update successful!
Verify a specific path resolves to the expected policy:
Expected: output contains "policyMode": "WRITE_BACK".
Verify Write-Back via FUSE
Write a file through FUSE and confirm it is eventually persisted to UFS:
Wait for async persistence (up to alluxio.write.cache.async.file.check.period, default 10min):
Expected: All files PERSISTED. within 2 minutes.
POSIX Compatibility in Write-Cache Mode
The FUSE Write Cache is built on the same backend as the S3-API Write Cache. As a result, the FUSE mount has additional restrictions that go beyond both standard POSIX and standard FUSE semantics. These restrictions apply to all write-cache policies (WRITE_THROUGH, WRITE_BACK, TRANSIENT). Review them before migrating workloads to a write-cache FUSE mount.
rename() returns EIO
All rename() operations fail with EIO while any write-cache policy is active, regardless of whether the file has been persisted to UFS. This affects:
Shell: mv and rename
Python: os.rename(), pathlib.Path.rename()
Write-then-rename patterns (write to .tmp, rename into place)
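Applications that use write-then-rename can detect the EIO and fall back to writing the final path directly. The following is a minimal sketch under stated assumptions: publish is a hypothetical helper, the rename parameter exists only so the failure can be simulated, and the fallback assumes that deleting the closed temporary file is permitted on the mount.

```python
import errno
import os


def publish(path: str, data: bytes, rename=os.rename) -> str:
    """Write data to path, preferring write-then-rename when rename works.

    Returns "renamed" or "direct" to indicate which strategy was used.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    try:
        rename(tmp_path, path)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise
    # rename() was rejected: discard the temp file and create the final
    # path exactly once ("x" mode maps to O_CREAT | O_EXCL).
    os.unlink(tmp_path)
    with open(path, "xb") as f:
        f.write(data)
    return "direct"
```

The direct write gives up the atomic-publish guarantee of rename, so readers may observe a partially written file. An application that knows it is running on a write-cache mount can skip the temporary file entirely and avoid writing the data twice.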
Workaround — silly rename interception (opt-in): When applications perform rm on open files, Linux internally issues a rename() to .fuse_hidden*. Enable the interceptor to handle this transparently:
With this option enabled, Alluxio intercepts .fuse_hidden* renames and handles open-file deletion without triggering an S3 CopyObject + DeleteObject. Default is false.
Files are write-once after close
Once a file is closed, it cannot be re-opened for writing, appending, or truncating:
| Operation | Result |
| --- | --- |
| open(path, O_CREAT \| O_EXCL) when the file already exists | EEXIST |
| open(path, O_WRONLY) or open(path, O_RDWR) when the file already exists | EACCES |
Impact: Applications that update files in place (databases, log rotation, config rewriters) will not work through a write-cache FUSE mount. The write-cache FUSE mount is best suited for write-once workloads: model checkpoints, training datasets, ETL stage outputs.
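One way to adapt an update-in-place workload is to write each revision as a new immutable file. The sketch below assumes a hypothetical naming scheme (stem-0000, stem-0001, and so on); write_once and write_versioned are illustrative helpers, not part of any Alluxio client library.

```python
import os


def write_once(path: str, data: bytes) -> bool:
    """Create path exclusively; return False if it already exists."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        fd = os.open(path, flags, 0o644)
    except FileExistsError:  # errno EEXIST
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True


def write_versioned(directory: str, stem: str, data: bytes,
                    max_versions: int = 1000) -> str:
    """Write data to the first unused name stem-0000, stem-0001, ..."""
    for version in range(max_versions):
        candidate = os.path.join(directory, f"{stem}-{version:04d}")
        if write_once(candidate, data):
            return candidate
    raise RuntimeError("no free version slot")
```

Readers then pick the highest-numbered file, and old versions can be removed with an ordinary delete once they are no longer needed.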
Hard links are not supported
link() returns EOPNOTSUPP. Tools that rely on hard links (rsync --hard-links, some package managers) will not work through the mount.
Cache page reclaim on delete (15.1.3+ behavior)
When a file is deleted via FUSE rm or rm -rf, cached pages are reclaimed on all workers that hold copies of the file — not only the hash-ring owner. In builds prior to 15.1.3, only the owner worker reclaimed pages; other workers retained orphaned pages until the next eviction cycle.
Monitoring Async Persistence
Two CLI commands (available 15.1.3+) let you inspect in-flight persist operations without waiting for alluxio fs ls:
Use async-persist stat when alluxio fs ls shows a file stuck in NOT_PERSISTED to determine whether the issue is in the queue or the upload itself.
Key Configuration
| Property | Default | Description |
| --- | --- | --- |
| alluxio.write.cache.enabled | false | Enables Write Cache (shared with S3 API). |
| alluxio.worker.page.store.pinned.file.capacity.limit.ratio | 0.3 | Max fraction of NVMe capacity for unpersisted write data. Raise to 0.5 for write-heavy FUSE workloads. |
| alluxio.write.cache.async.file.check.period | 10min | Scan interval for orphan detection. Shorter values increase FDB load. |
| alluxio.write.cache.async.check.orphan.timeout | 1h | Uncommitted writes older than this are treated as abandoned and cleaned up. |
| alluxio.fuse.silly.rename.interceptor.enabled | false | CLIENT-scoped. Intercepts .fuse_hidden* rename/unlink for transparent rm of open files. |
| alluxio.worker.mark.writing.files.duration | 10min | If a file is open for write but receives no new data for this duration, the worker treats it as a dangling write eligible for cleanup. Timer resets on every write. |
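The reset-on-write behaviour of alluxio.worker.mark.writing.files.duration can be pictured with a toy model. This is not Alluxio's implementation: the class name is hypothetical, and the sketch assumes the countdown starts when the file is opened.

```python
class WriteIdleTimer:
    """Toy model of an idle timeout that restarts on every write."""

    def __init__(self, idle_limit_s: float, opened_at: float):
        self.idle_limit_s = idle_limit_s
        self.last_activity = opened_at

    def on_write(self, now: float) -> None:
        self.last_activity = now  # every write restarts the countdown

    def is_dangling(self, now: float) -> bool:
        return now - self.last_activity >= self.idle_limit_s


timer = WriteIdleTimer(idle_limit_s=600, opened_at=0)  # 600 s = 10min default
timer.on_write(now=500)
print(timer.is_dangling(now=900))   # False: only 400 s since the last write
print(timer.is_dangling(now=1100))  # True: 600 s with no new data
```

A writer that pauses for longer than the configured duration between writes, for example while computing the next checkpoint shard, would be at risk of cleanup in this model, so the value should exceed the longest expected pause.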
Troubleshooting
Directory deletion returns DEADLINE_EXCEEDED
Running alluxio fs rm -R or rm -rf on a WRITE_BACK path may fail with:
Despite the error, the underlying files may have already been deleted from UFS before the timeout. Do not assume the data is still present.
Recovery steps:
Verify UFS state directly:
If files are gone from S3, the deletion succeeded at the data layer. Re-running alluxio fs rm -R will confirm by returning Path does not exist.
Pagestore disk space may not shrink immediately; orphaned pages are reclaimed on the next eviction cycle.
Files stuck in NOT_PERSISTED
If files remain NOT_PERSISTED beyond alluxio.write.cache.async.file.check.period:
Check async-persist queue:
Check specific file status:
Check worker logs for upload errors:
If UFS is unreachable, retries enter exponential backoff (up to alluxio.worker.write.cache.async.persist.retry.max.interval, default 1h). Verify UFS connectivity from the worker pod.
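To see why a file can stay NOT_PERSISTED well after UFS recovers, it helps to look at how a capped exponential backoff grows. The sketch below is generic: the 60-second initial interval and the doubling factor are assumptions for illustration, and only the 1h cap comes from the documented default.

```python
def backoff_schedule(initial_s: float, max_interval_s: float,
                     attempts: int, factor: float = 2.0) -> list:
    """Wait before each retry: grows by factor, never exceeds the cap."""
    cap = float(max_interval_s)
    delay = float(initial_s)
    delays = []
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


# Hypothetical 60 s initial interval with doubling, capped at 1h (3600 s).
print(backoff_schedule(60, 3600, 8))
# [60.0, 120.0, 240.0, 480.0, 960.0, 1920.0, 3600.0, 3600.0]
```

Once the cap is reached, the next attempt may be up to the maximum interval away, so under these assumptions a file can remain unpersisted for as long as an hour after connectivity returns.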
rename() returns EIO unexpectedly
This is expected behaviour when any write-cache policy is active (see rename() returns EIO). If your application relies on rename:
Switch the affected path to the NO_CACHE policy to bypass the write cache entirely for that path.
Enable alluxio.fuse.silly.rename.interceptor.enabled: "true" if the rename is triggered by rm of an open file.
FUSE pod OOM or mount not connected
These are not write-cache-specific. See FUSE Troubleshooting.
See Also
S3-API Write Optimization — write cache via S3 API; deploy this first
POSIX API — FUSE deployment details, mount options, read-cache mode
S3 API Benchmarks — S3-side write throughput baselines
Benchmarking POSIX Performance — FUSE-side throughput baselines