Non-disruptive FUSE Migration

Background

In many modern machine learning (ML) workloads, data is accessed through a FUSE-based virtual file system. These workloads often run in containerized environments like Kubernetes, where compute pods rely on the availability of mounted FUSE paths for data ingestion, training checkpoints, and real-time model evaluation.

However, a typical restart or upgrade of the FUSE daemon causes the file system mount to be temporarily torn down and rebuilt. This disrupts open file descriptors, invalidates directory handles, and can lead to I/O errors or failures in long-running ML jobs. In the worst case, it forces full job restarts or leads to corrupted intermediate results.

To mitigate these issues, we introduce non-disruptive FUSE migration: a mechanism that pauses the FUSE daemon, snapshots its internal state, and resumes it in a new process (e.g., a new container or pod) without breaking in-flight user operations. System updates, bug fixes, and resource migrations can then be performed seamlessly, which is critical for production-grade ML systems that demand high availability and fault tolerance.

Goals & Limitations

Goals

Read operations (read) are preserved during FUSE upgrades. These requests hang briefly during the migration (typically for no more than tens of seconds) and resume automatically once the new daemon is active. This keeps disruption to read-only workloads minimal.
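Because reads pause rather than fail, a read-only client needs no special error handling, but it can be useful to record when a pause happened. The following is a minimal sketch, not part of Alluxio: `timed_read` and `slow_threshold_s` are hypothetical names, and the file object is assumed to have been opened on the FUSE mount before the migration started.

```python
import time


def timed_read(f, length, slow_threshold_s=1.0, clock=time.monotonic):
    """Read `length` bytes from an already-open file and time the call.

    Returns (data, elapsed_seconds, was_slow). A read that waited for a
    migration to finish simply shows up as a long elapsed time.
    `slow_threshold_s` is an illustrative value, not an Alluxio setting.
    """
    start = clock()
    data = f.read(length)  # blocks until the daemon answers
    elapsed = clock() - start
    return data, elapsed, elapsed >= slow_threshold_s
```

A training loop could log `was_slow` reads to correlate throughput dips with FUSE upgrades.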

Limitations

Slow read requests will be forcefully aborted if they cannot finish within the grace period (5 seconds by default).
Write and list operations (write, mv, unlink, readdir) will fail during the migration window. These operations depend on in-memory state that cannot be safely transferred and will return errors instead of hanging. Applications should implement retry logic if necessary. This is a tradeoff to prioritize safety and migration speed.
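The retry logic mentioned above can be sketched as follows. This is an illustration under stated assumptions, not an Alluxio API: `retry_on_io_error` and `write_checkpoint` are hypothetical helpers, the attempt count and delays are arbitrary, and because this page does not say which error code a failed operation returns, the sketch retries on any `OSError`.

```python
import os
import time


def retry_on_io_error(operation, attempts=5, initial_delay=1.0,
                      backoff=2.0, sleep=time.sleep):
    """Call operation(); on OSError wait and retry with growing delays.

    The last failure is re-raised so callers still see persistent errors.
    All defaults are illustrative and should be sized to the migration window.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff


def write_checkpoint(path, payload):
    """Write bytes to a temporary file, then rename it into place.

    The whole sequence is safe to repeat, which makes it suitable for retrying.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(payload)
    os.replace(tmp_path, path)
```

A caller would wrap the write, for example `retry_on_io_error(lambda: write_checkpoint(path, data))`. Only operations that are safe to repeat should be retried this way.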

How It Works

When this feature is enabled, a "takeover" process occurs during a FUSE restart or upgrade, instead of simply killing the previous FUSE pod and starting a new one.

Pause Old Process: The old FUSE daemon stops processing new requests.
State Transfer: The new FUSE daemon starts and transfers the necessary state from the old daemon.
Seamless Switch: Once the state transfer is complete, the operator kills the old FUSE daemon, and the new daemon begins serving all requests.
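The three steps above can be pictured with a toy model. This is only a conceptual sketch and does not reflect Alluxio's actual implementation: `ToyDaemon`, `takeover`, and the choice of open file handles as the transferred state are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDaemon:
    """Stand-in for a FUSE daemon; holds a table of open handles."""
    name: str
    serving: bool = False
    open_handles: dict = field(default_factory=dict)  # handle id -> path

    def handle_read(self, handle_id):
        if not self.serving:
            raise RuntimeError(f"{self.name} is not serving requests")
        return f"{self.name} read {self.open_handles[handle_id]}"


def takeover(old, new):
    """Hand the mount from `old` to `new` in the three documented steps."""
    # Step 1: Pause Old Process. The old daemon stops taking new requests.
    old.serving = False
    # Step 2: State Transfer. The new daemon copies the state it needs
    # (assumed here to be the open-handle table).
    new.open_handles = dict(old.open_handles)
    # Step 3: Seamless Switch. The old daemon is discarded and the new
    # daemon begins serving all requests.
    old.open_handles.clear()
    new.serving = True
    return new
```

In this model, a handle opened against the old daemon remains valid on the new one, which is the property that lets in-flight reads resume instead of failing.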

The entire transition is designed to be smooth. For read-only workloads, the application should see no errors, only a brief pause while in-flight reads wait for the new daemon.

How to Enable

This feature can be enabled via the Alluxio Operator.

Enable the Feature: Simply set the following property to true.
```
alluxio.fuse.non.disruptive.migration.enabled=true
```
Detailed instructions can be found in the official documentation.
Configure Grace Period: When a non-disruptive migration is triggered, the old FUSE daemon gives ongoing requests a grace period to finish. The default value is 5 seconds. To modify this value, change the following property:
```
alluxio.fuse.migration.ongoing.request.grace_period=5s
```
