Cache Eviction

Data leaves the Alluxio cache in three ways:

  1. Automatic eviction: workers evict data when the cache fills up, in the order set by the configured policy (LRU by default)

  2. TTL expiry: a background scan removes data whose lifetime has elapsed, regardless of access or priority

  3. Manual eviction: job free explicitly purges a path on demand
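
To make the TTL behaviour concrete, here is a minimal Python sketch of one pass of an expiry scan. It is an illustrative model only, not Alluxio's implementation: the Entry type, the ttl_scan function, and the assumption that lifetime is measured from the moment the data was cached are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    created_at: float   # assumed start of the lifetime, in seconds
    ttl: float          # allowed lifetime, in seconds
    last_access: float  # recorded, but deliberately ignored by TTL expiry


def ttl_scan(cache: dict[str, Entry], now: float) -> list[str]:
    """Remove every path whose lifetime has elapsed and return those paths.

    Only age matters here: a recently read entry still expires.
    """
    removed = [p for p, e in cache.items() if now - e.created_at >= e.ttl]
    for path in removed:
        del cache[path]
    return removed
```

In this model an entry that was read a moment ago is still removed once its lifetime is over, which is what "regardless of access or priority" means in the list above.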

Automatic Eviction

When a worker needs space for new data, it runs an evictor to select which cached pages to remove. Three policies are available:

Policy           Evicts
LRU (default)    Data not accessed for the longest time
LFU              Data accessed the fewest times overall
FIFO             Data written earliest
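
The difference between the three policies can be sketched in Python as a choice of sort key. This is a simplified model under assumed names (PageStats, POLICY_KEYS, pick_victim), not the evictor code that workers actually run.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    written_at: int    # logical time the page entered the cache
    last_access: int   # logical time of the most recent read
    access_count: int  # total reads since the page was cached


# Each policy is a different sort key; the page with the smallest key goes first.
POLICY_KEYS = {
    "LRU": lambda s: s.last_access,
    "LFU": lambda s: s.access_count,
    "FIFO": lambda s: s.written_at,
}


def pick_victim(pages: dict[str, PageStats], policy: str = "LRU") -> str:
    """Return the id of the page the given policy would evict next."""
    key = POLICY_KEYS[policy]
    return min(pages, key=lambda page_id: key(pages[page_id]))


pages = {
    "a": PageStats(written_at=1, last_access=9, access_count=2),
    "b": PageStats(written_at=2, last_access=5, access_count=7),
    "c": PageStats(written_at=3, last_access=8, access_count=1),
}
for name in POLICY_KEYS:
    print(name, "evicts", pick_victim(pages, name))
```

On this sample the policies disagree: LRU picks b (least recently read), LFU picks c (fewest reads), and FIFO picks a (written first).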

To change the eviction policy, set the following property in alluxio-site.properties on all workers:

# Use LFU instead of the default LRU
alluxio.worker.page.store.evictor.type=LFU

Asynchronous Eviction

By default, eviction runs synchronously during writes, which can add latency. Asynchronous eviction runs in the background to keep headroom available before the cache fills up:

alluxio.worker.page.store.async.eviction.enabled=true
# start evicting when cache usage exceeds this threshold (default: 0.9)
alluxio.worker.page.store.async.eviction.high.watermark=0.85
# stop evicting when cache usage drops below this threshold (default: 0.8)
alluxio.worker.page.store.async.eviction.low.watermark=0.75
# how often to check cache usage (default: 1min)
alluxio.worker.page.store.async.eviction.check.interval=30s
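
How the two watermarks interact can be sketched in Python. The function below is a hypothetical model of a single background check, assuming that eviction brings usage down to the low watermark; it is not taken from the Alluxio source.

```python
def plan_async_eviction(used: int, capacity: int,
                        high: float = 0.9, low: float = 0.8) -> int:
    """Return how many bytes one background check would try to evict.

    Nothing is evicted until usage exceeds the high watermark. Once it does,
    eviction aims to bring usage back down to the low watermark.
    """
    if not 0 < low < high <= 1:
        raise ValueError("expected 0 < low < high <= 1")
    if used <= capacity * high:
        return 0
    return used - int(capacity * low)
```

With the values from the configuration above (0.85 and 0.75) and a capacity of 1000 units, a check at 800 units used does nothing, while a check at 900 units plans to evict 150. The gap between the two thresholds keeps the evictor from restarting on every small write.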

TTL-based eviction and Cache Priority also affect what gets evicted and when. See Cache Policies for details.

Manual Eviction: The free Job

Use job free to explicitly purge cached data for a path — without touching the underlying UFS data. Common scenarios:

  • Model version update: free the old version before (or after) loading the new one

  • Post-job cleanup: release space after a batch job completes

  • Force re-cache: free then reload to pick up UFS changes for an immutable-policy path

Submit and Monitor

Example progress output:

Stop a Running Free Job

Stopping a free job leaves any data that has not yet been freed in the cache. To resume, submit the job again with --submit.

Version Update Pattern

To replace a pinned dataset with a newer version:

For a complete list of job free flags, see the job free CLI reference.

You can also trigger and manage free jobs via the REST API.

Stale Cache Cleaning

Cluster topology changes can leave data cached on workers that no longer "own" that data according to the consistent hash ring. This stale data consumes space but is never served to clients.

This happens when:

  • Workers are added or removed (ownership redistributes)

  • A file's replication factor is reduced

  • A worker goes offline temporarily and its data migrates, then it rejoins

Trigger Stale Cleaning

This submits an async job to each worker. Workers scan local storage, verify ownership against the current hash ring, and delete any data they no longer own. Monitor progress via the alluxio_cleared_stale_cached_data Prometheus metric or worker logs.
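
The ownership check each worker performs can be sketched in Python with a toy consistent hash ring. Everything here is an assumption made for illustration: the HashRing class, the use of SHA-256 with virtual nodes, and a single owner per file (replication is ignored). Alluxio's actual ring and hashing scheme may differ.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """A toy consistent hash ring with virtual nodes (not Alluxio's own)."""

    def __init__(self, workers: list[str], vnodes: int = 64) -> None:
        self._points = sorted(
            (_hash(f"{w}#{i}"), w) for w in workers for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def owner(self, file_id: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, _hash(file_id)) % len(self._points)
        return self._points[idx][1]


def stale_files(worker: str, cached: set[str], ring: HashRing) -> set[str]:
    """Files cached on `worker` that the current ring assigns to another worker."""
    return {f for f in cached if ring.owner(f) != worker}
```

For example, if a worker cached a set of files while it was the only member of the ring, rebuilding the ring after a second worker joins and calling stale_files on that set returns the files now owned by the new worker. These are the entries a cleaning pass would delete.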

For more details, see the REST API reference.
