Alluxio
AI-3.6 (stable)

Cache Eviction

Cache Eviction Overview

As the storage space used by Alluxio is limited, workers evict old data through several strategies to ensure that there is enough storage space to cache new data.

There are two different ways Alluxio will evict its cache:

  • Evict on Writing

  • Background Asynchronous Evicting

Evict on Writing

Evict on writing synchronously checks for and evicts cached data while pages are being written to Alluxio. Eviction is triggered when Alluxio is about to write a page that would cause the total cache size to exceed the storage capacity.
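
The behavior described above can be sketched as a toy page store. This is only an illustration, not Alluxio's implementation: the class PageStore and its methods are hypothetical names, sizes are plain integers, and LRU order is assumed because it is the default evictor.

```python
from collections import OrderedDict


class PageStore:
    """Toy page cache that evicts synchronously on write (hypothetical, LRU order)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.pages = OrderedDict()  # page_id -> size, least recently used first

    def get(self, page_id):
        """Mark a cached page as recently used; return True on a hit."""
        if page_id not in self.pages:
            return False
        self.pages.move_to_end(page_id)
        return True

    def put(self, page_id, size):
        """Write a page, evicting older pages first if it would not fit."""
        if size > self.capacity_bytes:
            raise ValueError("page is larger than the whole cache")
        if page_id in self.pages:
            # Overwriting a page: release its old size before re-adding it.
            self.used_bytes -= self.pages.pop(page_id)
        evicted = []
        # Synchronous eviction: the writer waits here until there is room.
        while self.used_bytes + size > self.capacity_bytes:
            victim, victim_size = self.pages.popitem(last=False)
            self.used_bytes -= victim_size
            evicted.append(victim)
        self.pages[page_id] = size
        self.used_bytes += size
        return evicted
```

The eviction loop runs inside put, so the cost of evicting is paid by the write that triggers it.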

Cache Evictors

Alluxio provides the following five eviction algorithms to evict cached data:

  • LRU (default): Evicts the least recently used pages first.

  • FIFO: Evicts pages in the order in which they were cached, oldest first.

  • LFU: Evicts the least frequently used pages first. Pages are grouped into buckets by the logarithm of their count, and pages within a bucket are kept in LRU order.

  • Nondeterministic LRU: An LRU variant that evicts a page chosen uniformly at random from the LRU tail.

  • Two Choice Random: Two Choice Random client-side cache eviction policy. It selects two random page IDs and evicts the less recently used of the two.
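
As a rough illustration of the Two Choice Random policy in the list above, the following sketch picks a victim from a map of last-access times. The function name, the dictionary layout, and the choice to sample with replacement are assumptions made for this example and are not taken from Alluxio's code.

```python
import random


def two_choice_random_victim(last_access, rng=random):
    """Pick two random page IDs and return the less recently used one.

    last_access maps page ID -> last access time (larger means more recent).
    Returns None when the cache is empty.
    """
    if not last_access:
        return None
    page_ids = list(last_access)
    # Two independent picks (with replacement, an assumption of this sketch).
    first = rng.choice(page_ids)
    second = rng.choice(page_ids)
    return first if last_access[first] <= last_access[second] else second
```

Compared with strict LRU, this needs no globally ordered structure: only two timestamps are compared per eviction, at the cost of sometimes evicting a page that is not the oldest.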

The following configuration in alluxio-site.properties sets the LRU eviction policy for the worker cache:

alluxio.worker.page.store.evictor.type=LRU

Available options are LRU, LFU, FIFO and RANDOM. To use Nondeterministic LRU, keep the evictor type set to LRU and enable the nondeterministic flag:

alluxio.worker.page.store.evictor.type=LRU
alluxio.worker.page.store.evictor.nondeterministic.enabled=true

Background Asynchronous Evicting

Alluxio supports setting different constraints on the cache space that will trigger an eviction:

  1. Setting a limit on the total size that the pages can occupy, i.e. the capacity of the page store;

  2. Setting a limit on the total number of pages in the page store.

On every page put operation, Alluxio checks whether any of these constraints is violated. If one is, a synchronous eviction takes place to make room for the incoming page. Because this eviction runs on the write path, it can significantly degrade write performance. Background asynchronous eviction evicts cached data ahead of time so that write operations rarely need to evict synchronously.

Eviction based on Capacity

To enable the background asynchronous evicting feature, add the following configurations to alluxio-site.properties:

alluxio.worker.page.store.async.eviction.enabled=true
alluxio.worker.page.store.async.eviction.check.interval=1min
alluxio.worker.page.store.async.eviction.high.watermark=0.9
alluxio.worker.page.store.async.eviction.low.watermark=0.8

With this configuration, Alluxio creates a background thread that checks, at the interval set by alluxio.worker.page.store.async.eviction.check.interval, whether page cache usage has reached the high watermark. When it has, the thread evicts cached pages until usage falls to the low watermark.
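
A single pass of such a background check might look like the sketch below. It assumes the watermarks are fractions of the page store capacity and that pages are evicted oldest first; the function name and the list-of-tuples layout are hypothetical and chosen only for illustration.

```python
def evict_to_low_watermark(pages, capacity_bytes, high=0.9, low=0.8):
    """One background check: evict oldest-first once usage reaches `high`.

    pages is a list of (page_id, size) tuples ordered oldest first; it is
    modified in place. Returns the IDs of the evicted pages.
    """
    if not 0 < low < high <= 1:
        raise ValueError("expected 0 < low < high <= 1")
    used = sum(size for _, size in pages)
    if used < high * capacity_bytes:
        return []  # below the high watermark: nothing to do
    evicted = []
    # Keep evicting until usage is at or below the low watermark.
    while pages and used > low * capacity_bytes:
        page_id, size = pages.pop(0)
        used -= size
        evicted.append(page_id)
    return evicted
```

The gap between the two watermarks is what keeps the thread from evicting on every check: once usage has been brought down to the low watermark, the cache has to fill up to the high watermark again before the next eviction starts.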

Eviction based on Limit on Number of Pages

Alluxio asynchronously evicts pages that exceed the limit, using a background thread that periodically scans for excess pages. When the total number of pages exceeds highWatermark * maxPageNumberLimit, eviction is triggered and continues until the total drops below lowWatermark * maxPageNumberLimit.

To enable async eviction by page number limit, set alluxio.worker.page.store.max.page.number.limit.enabled to true. Specify the maximum number of pages with alluxio.worker.page.store.max.page.number:

alluxio.worker.page.store.async.eviction.enabled=true
alluxio.worker.page.store.async.eviction.check.interval=1min
alluxio.worker.page.store.async.eviction.high.watermark=0.9
alluxio.worker.page.store.async.eviction.low.watermark=0.8

alluxio.worker.page.store.max.page.number.limit.enabled=true
alluxio.worker.page.store.max.page.number=100000
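
To make the arithmetic concrete, the sketch below computes the trigger and target page counts for the values in this example. The helper names are hypothetical, and the rounding and exact boundary handling (strictly above the trigger, down to the target) are assumptions rather than documented behavior.

```python
def page_count_thresholds(max_pages, high=0.9, low=0.8):
    """Return (trigger, target) page counts derived from the watermarks."""
    return round(high * max_pages), round(low * max_pages)


def pages_to_evict(current_pages, max_pages, high=0.9, low=0.8):
    """Pages one background pass would remove to get back to the target count."""
    trigger, target = page_count_thresholds(max_pages, high, low)
    if current_pages <= trigger:
        return 0  # at or below the high-watermark count: no eviction
    return current_pages - target
```

With a limit of 100000 pages and watermarks of 0.9 and 0.8, eviction starts once the store holds more than 90000 pages and brings it back to about 80000.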

REST API for Updating Configurations Dynamically

Alluxio provides the following REST APIs for users to set and get async eviction configurations dynamically:

  • Enable async eviction and update the related parameters

curl --location --request POST 'localhost:28080/v1/cache?cmd=enableCacheAsyncEviction&chacheEvictionCheckInterval=30&highWaterMark=0.8&lowWaterMark=0.5'

  • Disable async eviction

curl --location --request POST 'localhost:28080/v1/cache?cmd=disableCacheAsyncEviction'

  • Get the current async eviction parameters

curl --location 'localhost:28080/v1/cache?cmd=getPageCacheAsyncEvictionManagerInfo'
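
For scripting, the same three requests can be built with Python's standard library. This is a minimal sketch: the helper names are hypothetical, the http scheme is assumed (curl defaults to it when none is given), the query parameter names are copied verbatim from the curl commands above, and the unit of the check interval value is not stated here, so it is passed through unchanged.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Host, port and path as used in the curl examples; adjust for your worker.
BASE_URL = "http://localhost:28080/v1/cache"


def enable_url(check_interval, high, low):
    """URL that enables async eviction and updates its parameters (POST)."""
    params = {
        "cmd": "enableCacheAsyncEviction",
        "chacheEvictionCheckInterval": check_interval,  # spelled as in the curl example
        "highWaterMark": high,
        "lowWaterMark": low,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def disable_url():
    """URL that disables async eviction (POST)."""
    return f"{BASE_URL}?{urlencode({'cmd': 'disableCacheAsyncEviction'})}"


def info_url():
    """URL that returns the current async eviction parameters (GET)."""
    return f"{BASE_URL}?{urlencode({'cmd': 'getPageCacheAsyncEvictionManagerInfo'})}"


def call(url, method="GET", timeout=10):
    """Send the request and return the raw response body as text."""
    with urlopen(Request(url, method=method), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, call(enable_url(30, 0.8, 0.5), method="POST") sends the same request as the first curl command, and call(info_url()) reads the parameters back.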
