Optimizing Reads

Alluxio provides several advanced features to accelerate file reading performance. Whether you are working with sequential access patterns, very large files, or specialized AI/ML workloads, you can tune Alluxio to meet your needs. This guide covers the key mechanisms for optimizing read operations.

Tune Client-Side Prefetching for Sequential Reads

For sequential file reads, the Alluxio client automatically prefetches data that is likely to be read next. This data is cached in the client's local memory, allowing subsequent read requests to be served directly from the client without needing a network request to a worker.

The prefetch window is self-adjusting: it increases during continuous sequential reads and decreases for non-continuous or random reads. For completely random access patterns, the prefetch window will eventually shrink to zero.
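To build intuition for this self-adjusting behavior, here is a minimal Python sketch of an adaptive prefetch window. This is a hypothetical simplification for illustration only, not Alluxio's actual implementation; the class name and growth/shrink policy are invented:

```python
# Hypothetical sketch of an adaptive prefetch window (NOT Alluxio's real code):
# the window widens on continuous sequential reads and shrinks otherwise,
# eventually reaching zero under fully random access.
class AdaptivePrefetchWindow:
    def __init__(self, part_length=4 * 1024 * 1024, max_parts=8):
        self.part_length = part_length  # size of each prefetch unit
        self.max_parts = max_parts      # cap on units per open file
        self.parts = 0                  # current window size, in units
        self.next_expected = 0          # offset a sequential read would hit next

    def on_read(self, offset, length):
        if offset == self.next_expected:
            # Continuous sequential read: widen the window, up to the cap.
            self.parts = min(self.parts + 1, self.max_parts)
        else:
            # Non-continuous or random read: shrink toward zero.
            self.parts = max(self.parts - 1, 0)
        self.next_expected = offset + length
        return self.parts * self.part_length  # bytes to prefetch ahead
```

With the recommended settings (4MB units, 8 units max), such a window would grow toward 32MB ahead of a steady sequential reader and collapse under random access.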

Standard Prefetching Configuration

Client-side prefetching is enabled by default. You can tune its behavior using the following properties in conf/alluxio-site.properties:

  • alluxio.user.position.reader.streaming.async.prefetch.thread
    Recommended value: 64
    The overall concurrency for the async prefetch thread pool.

  • alluxio.user.position.reader.streaming.async.prefetch.part.length
    Recommended value: 4MB
    The size of each prefetch unit.

  • alluxio.user.position.reader.streaming.async.prefetch.max.part.number
    Recommended value: 8
    The maximum number of units a single opened file can have. For example, with a 4MB unit size and a maximum of 8 units, Alluxio prefetches up to 32MB of data ahead for a single file.

  • alluxio.user.position.reader.streaming.async.prefetch.file.length.threshold
    Recommended value: 4MB
    If a file's size is below this threshold, Alluxio immediately maximizes the prefetch window instead of starting small. This is useful for improving the read performance of small files.
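For example, the recommended values above can be set together in conf/alluxio-site.properties:

```properties
# Client-side prefetch tuning (recommended values)
alluxio.user.position.reader.streaming.async.prefetch.thread=64
alluxio.user.position.reader.streaming.async.prefetch.part.length=4MB
alluxio.user.position.reader.streaming.async.prefetch.max.part.number=8
alluxio.user.position.reader.streaming.async.prefetch.file.length.threshold=4MB
```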

Using the Slow Prefetch Pool for Mixed Workloads

Some workloads, like cold reads, may benefit from different prefetching parameters (e.g., higher concurrency) than others. Alluxio provides a secondary, "slow" prefetch pool that can be configured independently for these scenarios.

To enable and configure the slow prefetch pool, set the following properties:

  • alluxio.user.position.reader.streaming.async.prefetch.use.slow.thread.pool
    Recommended value: true
    Set to true to enable the slow prefetch pool.

  • alluxio.user.position.reader.streaming.async.prefetch.use.slow.thread.pool.for.cold.read
    Recommended value: true
    If true, the slow pool is used for cold reads; otherwise, it is only used for cache filter reads.

  • alluxio.user.position.reader.streaming.slow.async.prefetch.thread
    Recommended value: 256
    The overall async prefetch concurrency for the slow pool.

  • alluxio.user.position.reader.streaming.slow.async.prefetch.part.length
    Recommended value: 1MB
    The size of the prefetch unit used by the slow pool.

  • alluxio.user.position.reader.streaming.slow.async.prefetch.max.part.number
    Recommended value: 64
    The maximum number of units a single opened file can have in the slow pool.
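Putting the recommended values together, a slow-pool configuration in conf/alluxio-site.properties looks like:

```properties
# Enable the secondary "slow" prefetch pool for mixed workloads
alluxio.user.position.reader.streaming.async.prefetch.use.slow.thread.pool=true
alluxio.user.position.reader.streaming.async.prefetch.use.slow.thread.pool.for.cold.read=true
alluxio.user.position.reader.streaming.slow.async.prefetch.thread=256
alluxio.user.position.reader.streaming.slow.async.prefetch.part.length=1MB
alluxio.user.position.reader.streaming.slow.async.prefetch.max.part.number=64
```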

Accelerate Large File Reads

Reading very large files presents unique challenges, such as initial cold read latency and single-worker bottlenecks. Alluxio offers two features to address these issues.

Preloading Entire Files for Faster Cold Reads

For workloads that read an entire large file from beginning to end, you can enable full-file preloading. When a client begins reading a large file for the first time, this feature triggers Alluxio workers to concurrently load the entire file from the UFS into the cache. This can make cold read performance nearly as fast as a fully cached hot read.

Note: This feature causes read amplification if your application only reads a small portion of the file, as the entire file is loaded into Alluxio regardless.

To enable this feature, set the following properties:

  • alluxio.user.position.reader.preload.data.enabled
    Recommended value: true
    Set to true to enable large file preloading.

  • alluxio.user.position.reader.preload.data.file.size.threshold.min
    Recommended value: 1GB
    The minimum file size that triggers the async preload.

  • alluxio.user.position.reader.preload.data.file.size.threshold.max
    Recommended value: 200GB
    The maximum file size that triggers the async preload. This prevents extremely large files from filling the entire cache and causing excessive evictions.

  • alluxio.worker.preload.data.thread.pool.size
    Recommended value: 64
    The number of concurrent jobs on each worker loading parts of the file from the UFS in parallel. For example, with a 4MB page size and 64 threads, a worker can load up to 256MB per iteration.
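As a combined example, the recommended preload settings in conf/alluxio-site.properties are:

```properties
# Preload entire large files on first read (files between 1GB and 200GB)
alluxio.user.position.reader.preload.data.enabled=true
alluxio.user.position.reader.preload.data.file.size.threshold.min=1GB
alluxio.user.position.reader.preload.data.file.size.threshold.max=200GB
alluxio.worker.preload.data.thread.pool.size=64
```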

Segmenting Large Files Across Multiple Workers

By default, an entire file is cached on a single Alluxio worker. For very large files, this can create a bottleneck if multiple clients request the same file. File segmentation solves this by breaking a large file into smaller, independent segments, each of which can be cached on a different worker. When a client reads the file, it can fetch different segments from multiple workers in parallel, dramatically increasing read throughput.

How It Works

A file is treated as an ordered list of segments. Each segment is identified by a unique Segment ID, which is a combination of the original file's ID and the segment's index within the file.

Segment ID := (fileId, segmentIndex)

When a client needs to read part of a file, Alluxio uses this Segment ID—not the file ID—as the key for its worker selection algorithm. This ensures that different segments are mapped to different workers.

When a client reads a region that spans multiple segments, the request is broken down, and each segment is read from its corresponding worker. This parallelizes the I/O and distributes the load across the cluster.
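The routing described above can be sketched in Python. This is an illustrative sketch only: the hash function and helper names are hypothetical, and Alluxio's actual worker selection algorithm is internal to the client. The sketch shows the two key ideas, keying worker selection by (fileId, segmentIndex) and splitting a byte range into per-segment reads:

```python
import hashlib

def pick_worker(file_id: str, segment_index: int, workers: list) -> str:
    # Hypothetical: hash the Segment ID (fileId, segmentIndex) so that
    # different segments of the same file map to different workers.
    key = f"{file_id}:{segment_index}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return workers[digest % len(workers)]

def read_plan(file_id: str, offset: int, length: int,
              segment_size: int, workers: list):
    # Break a read spanning multiple segments into per-segment requests,
    # each routed to the worker that caches that segment.
    plan = []
    end = offset + length
    while offset < end:
        idx = offset // segment_size
        seg_end = min((idx + 1) * segment_size, end)
        plan.append((idx, pick_worker(file_id, idx, workers),
                     offset, seg_end - offset))
        offset = seg_end
    return plan  # list of (segmentIndex, worker, offset, length)
```

Each entry in the resulting plan can then be fetched from its worker in parallel, which is what distributes the I/O load across the cluster.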

Limitations

  • Files created and written directly by clients into Alluxio cannot be segmented.

  • The segment size is a cluster-wide setting and cannot be configured on a per-file basis.

Enabling File Segmentation

To enable this feature, set the following properties on all Alluxio nodes (masters, workers, and clients):

  • alluxio.dora.file.segment.read.enabled
    Recommended value: true
    Set to true to enable file segmentation.

  • alluxio.dora.file.segment.size
    Recommended value: depends on use case
    The size of the segments. Defaults to 1 GiB.

Choosing the right segment size is a trade-off. If segments are too small, clients may switch between workers too frequently, underutilizing network bandwidth. If they are too large, the risk of uneven cache distribution across workers increases. A common range is from several gigabytes to tens of gigabytes.
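For example, in conf/alluxio-site.properties on all masters, workers, and clients (the 4GB segment size here is an illustrative choice within the common range, not a default):

```properties
alluxio.dora.file.segment.read.enabled=true
# Segment size is a trade-off; 4GB is one reasonable choice, not a default
alluxio.dora.file.segment.size=4GB
```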

Optimize for AI Model File Loading

Alluxio is highly effective at accelerating AI model file loading, a common bottleneck in production machine learning systems. In a typical workflow, trained models are stored in a central UFS, and online inference services need to load them quickly to serve predictions. These model files are often large, and conventional filesystems can struggle with the high-frequency, concurrent read requests from many service replicas, leading to traffic spikes and slow startup times.

By using Alluxio as a caching layer—often accessed via Alluxio FUSE to present models as a local filesystem—you can dramatically improve model loading speed and reduce load on the UFS.

While standard client prefetching is often sufficient, you can enable enhanced prefetching logic designed for the high-concurrency reads common in model serving. When multiple services read the same model file through a single Alluxio FUSE instance, this feature can provide up to a 3x performance improvement.

To enable this optimization, set the following properties:

```properties
alluxio.user.position.reader.streaming.async.prefetch.by.file.enabled=true
alluxio.user.position.reader.streaming.async.prefetch.shared.cache.enabled=true
```
