Cache Management

Alluxio sits between your compute and persistent storage as a caching layer. Understanding how to manage that cache — what goes in, what stays, and what gets removed — is the key to consistent, high-performance workloads.

What do you want to do?

| Goal | Tool |
| --- | --- |
| Pre-warm a dataset before a job runs | job load |
| Prevent a critical model from being evicted | Cache Priority (pinning) |
| Automatically expire stale or temporary data | TTL |
| Control how fresh cached data must be | Cache Filter (maxAge) |
| Limit cache usage per team or directory | Quota |
| Remove a dataset after a job or version update | job free |

How the Cache Lifecycle Works

1. Data Enters the Cache

Two paths:

  • Passive (read-through): On a cache miss, the worker fetches the data from the UFS (the under file system, i.e. your persistent storage) and caches it automatically. No configuration is needed, but the first read is slow.

  • Active preloading: Use job load to push data into cache before any read happens. Eliminates cold-start latency for scheduled jobs and model serving.
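The difference between the two paths can be illustrated with a toy model. This is a minimal sketch, not Alluxio's implementation: the `ReadThroughCache` class, its `preload` method, and the dictionary standing in for the UFS are all hypothetical names introduced for illustration.

```python
class ReadThroughCache:
    """Toy cache in front of a slow backing store (a stand-in for the UFS)."""

    def __init__(self, ufs):
        self.ufs = ufs        # hypothetical UFS: dict mapping path -> bytes
        self.cache = {}       # cached copies, keyed by path
        self.ufs_reads = 0    # counts slow fetches from the backing store

    def _fetch(self, path):
        self.ufs_reads += 1
        return self.ufs[path]

    def read(self, path):
        """Passive path: a miss fetches from the UFS and caches the result."""
        if path in self.cache:
            return self.cache[path], "hit"
        data = self._fetch(path)
        self.cache[path] = data
        return data, "miss"

    def preload(self, prefix):
        """Active path: cache everything under a prefix before any read."""
        for path in self.ufs:
            if path.startswith(prefix) and path not in self.cache:
                self.cache[path] = self._fetch(path)


ufs = {"s3://models/llama3/a.bin": b"A", "s3://models/llama3/b.bin": b"B"}

# Passive: the first read pays the UFS latency, the second is served from cache.
passive = ReadThroughCache(ufs)
first = passive.read("s3://models/llama3/a.bin")[1]
second = passive.read("s3://models/llama3/a.bin")[1]

# Active: preloading moves the UFS cost ahead of the first read.
warm = ReadThroughCache(ufs)
warm.preload("s3://models/llama3/")
warm_first = warm.read("s3://models/llama3/a.bin")[1]

print(first, second, warm_first)  # miss hit hit
```

In both cases the UFS is read once per file; preloading only changes when that cost is paid.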

Cache Loading →

2. Policies Control What Stays and How Long

Once data is cached, four mechanisms govern its behavior:

  • Cache Priority (Pinning): marks data as HIGH priority so that LRU eviction always removes lower-priority data first

  • TTL: sets a hard expiry, so data is evicted once its lifetime elapses, even if it is still being accessed

  • Cache Filter: controls admission and freshness by treating data as immutable, skipping the cache, or revalidating after a max age

  • Quota: caps how much cache space a directory tree can consume
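To make the three Cache Filter behaviors concrete, the sketch below models the decision a cache could take on each read. It is an illustration under stated assumptions, not Alluxio's code: `FilterRule`, `decide`, and the returned action names are hypothetical, and the mode strings simply reuse the terms immutable, skipCache, and maxAge from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterRule:
    mode: str                          # "immutable", "skipCache", or "maxAge"
    max_age_s: Optional[float] = None  # only used when mode == "maxAge"


def decide(rule: FilterRule, cached_at: Optional[float], now: float) -> str:
    """Return the action for one read: bypass, fetch, serve, or revalidate.

    cached_at is the time the data was cached, or None if it is not cached.
    """
    if rule.mode == "skipCache":
        return "bypass"        # never admitted, always read from the UFS
    if cached_at is None:
        return "fetch"         # cache miss: read through and cache
    if rule.mode == "immutable":
        return "serve"         # cached copy is assumed to never go stale
    if rule.mode == "maxAge":
        age = now - cached_at
        return "serve" if age <= rule.max_age_s else "revalidate"
    raise ValueError(f"unknown filter mode: {rule.mode}")


one_day = FilterRule("maxAge", max_age_s=86_400)
actions = [
    decide(one_day, cached_at=None, now=0),                   # not cached yet
    decide(one_day, cached_at=0, now=3_600),                  # 1 hour old
    decide(one_day, cached_at=0, now=90_000),                 # older than 1 day
    decide(FilterRule("immutable"), cached_at=0, now=10**9),
    decide(FilterRule("skipCache"), cached_at=None, now=0),
]
print(actions)
```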

Cache Policies →

3. Data Leaves the Cache

Three ways data exits:

  • Automatic eviction: LRU (default), LFU, or FIFO removes data when workers fill up

  • TTL expiry: background scan removes data whose lifetime has elapsed

  • Manual removal: job free explicitly purges a path — use this when invalidating a model version or freeing space on demand
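The three exit paths can be sketched together in a toy worker cache. This is a simplified model rather than Alluxio's eviction code: `ToyWorkerCache` and its methods are hypothetical, capacity is counted in entries instead of bytes, and the NORMAL priority level is an assumption added next to the HIGH and LOW levels mentioned on this page.

```python
from collections import OrderedDict

PRIORITY_RANK = {"LOW": 0, "NORMAL": 1, "HIGH": 2}  # NORMAL is an assumed default


class ToyWorkerCache:
    def __init__(self, capacity):
        self.capacity = capacity
        # path -> (priority, expires_at or None); ordered from least to most recently used
        self.entries = OrderedDict()

    def put(self, path, now, priority="NORMAL", ttl_s=None):
        """Cache a path, then evict until the cache fits; returns evicted paths."""
        expires_at = None if ttl_s is None else now + ttl_s
        self.entries[path] = (priority, expires_at)
        self.entries.move_to_end(path)
        evicted = []
        while len(self.entries) > self.capacity:
            evicted.append(self._evict_one())
        return evicted

    def _evict_one(self):
        # Lowest priority goes first; min() keeps the first match on ties,
        # which is the least recently used entry at that priority.
        victim = min(self.entries, key=lambda p: PRIORITY_RANK[self.entries[p][0]])
        del self.entries[victim]
        return victim

    def expire(self, now):
        """TTL expiry: drop entries whose lifetime has elapsed."""
        dead = [p for p, (_, exp) in self.entries.items() if exp is not None and exp <= now]
        for p in dead:
            del self.entries[p]
        return dead

    def free(self, prefix):
        """Manual removal: purge everything under a path prefix."""
        gone = [p for p in self.entries if p.startswith(prefix)]
        for p in gone:
            del self.entries[p]
        return gone


cache = ToyWorkerCache(capacity=3)
cache.put("models/llama3", now=0, priority="HIGH")
cache.put("tmp/ckpt", now=0, priority="LOW", ttl_s=3600)
cache.put("etl/day1", now=0)
evicted_low = cache.put("etl/day2", now=10)             # LOW entry is evicted first
evicted_lru = cache.put("pii/logs", now=20, ttl_s=100)  # then the oldest NORMAL entry
expired = cache.expire(now=200)                         # pii/logs outlived its TTL
freed = cache.free("models/")                           # explicit purge
remaining = list(cache.entries)
```

Note that the HIGH-priority entry survives both evictions even though it is the least recently used, and only leaves through the explicit `free` call.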

Cache Eviction →

Strategy Guide

LLM / ML Model Serving

Examples: fine-tuned models, base model weights, embedding indexes.

Goal: Zero cold-start, critical models never evicted.

  1. Preload before serving: job load --path s3://models/llama3/ --submit --verify

  2. Pin against LRU: alluxio priority add --path s3://models/llama3/ --priority high

  3. On model update: job free --path s3://models/llama3-v1/ --submit, then repeat steps 1 and 2 for v2
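A model update can be scripted by assembling these commands in order. The sketch below only builds and prints the command lines; it does not run them. It uses the subcommands and flags exactly as they appear in the steps above, and assumes that `job load` and `job free` are invoked through the same `alluxio` executable as `priority add`. The `rollout_commands` helper and the `llama3-v2` path are hypothetical.

```python
import shlex


def rollout_commands(old_path, new_path, cli=("alluxio",)):
    """Build the free, load, and pin commands for a model version swap.

    cli is the executable prefix; "alluxio" is an assumption and may need
    to be a full path in a real deployment.
    """
    cli = list(cli)
    return [
        # Purge the old version from the cache.
        cli + ["job", "free", "--path", old_path, "--submit"],
        # Preload the new version before serving.
        cli + ["job", "load", "--path", new_path, "--submit", "--verify"],
        # Pin the new version so lower-priority data is evicted first.
        cli + ["priority", "add", "--path", new_path, "--priority", "high"],
    ]


commands = rollout_commands("s3://models/llama3-v1/", "s3://models/llama3-v2/")
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the commands, each list could be passed to `subprocess.run(cmd, check=True)` so that a failed step stops the rollout before the next one starts.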

Periodically Updated Data

Examples: daily ETL output, retraining datasets.

Goal: Fresh data with good hit rate.

  • Use a maxAge filter (e.g. 1d) so Alluxio revalidates cached data on the first access after that age is exceeded

  • Run job load --skip-if-exists after each upstream update to pre-warm the new version

Temporary or Checkpoint Data

Examples: training checkpoints, temp query results.

Goal: Avoid filling cache with short-lived data.

  • Apply a skipCache filter to bypass caching entirely, or

  • Set a short TTL (e.g. 1h) together with LOW priority so eviction removes this data first

Compliance / Sensitive Data

Examples: PII logs, GDPR-scoped directories.

Goal: Hard limit on data lifetime in cache.

  • Set a TTL that matches the compliance retention window (e.g. 90d)

  • Run job free immediately after processing completes
