
Cache Evicting

Cache Evicting Overview

Because the storage space available to Alluxio is limited, the Cache Evicting feature evicts older data, using one of several strategies, to ensure that there is enough space to cache new data.

There are two different ways Alluxio will evict its cache:

  • Evict on Writing

  • Background Asynchronous Evicting

Evict on Writing

Evict on writing synchronously checks for and evicts cached data while Alluxio is writing a page. Eviction is triggered when Alluxio is about to write a page that would cause the total cache size to exceed the storage capacity.
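The behavior described above can be illustrated with a minimal sketch. It is not Alluxio's implementation: the PageCache class, its byte-based accounting, and the choice of LRU order for picking victims are assumptions made for illustration only.

```python
from collections import OrderedDict


class PageCache:
    """Toy page cache that evicts synchronously on write (hypothetical class)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps page ID -> page bytes; the first entry is the least recently used.
        self.pages: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, page_id: str):
        data = self.pages.get(page_id)
        if data is not None:
            self.pages.move_to_end(page_id)  # mark as most recently used
        return data

    def put(self, page_id: str, data: bytes) -> list:
        """Write a page, evicting first if the write would exceed capacity.

        Returns the IDs of the pages evicted to make room.
        """
        if len(data) > self.capacity_bytes:
            raise ValueError("page is larger than the whole cache")
        # Overwriting an existing page releases its old bytes first.
        old = self.pages.pop(page_id, None)
        if old is not None:
            self.used_bytes -= len(old)
        evicted = []
        # Synchronous eviction: this loop runs on the write path itself.
        while self.used_bytes + len(data) > self.capacity_bytes:
            victim_id, victim = self.pages.popitem(last=False)
            self.used_bytes -= len(victim)
            evicted.append(victim_id)
        self.pages[page_id] = data
        self.used_bytes += len(data)
        return evicted


cache = PageCache(capacity_bytes=8)
cache.put("a", b"1234")
cache.put("b", b"5678")
cache.get("a")                     # "a" becomes the most recently used page
evicted = cache.put("c", b"abcd")  # the cache is full, so the write must evict
print(evicted)
```

Because the eviction loop sits inside put, the writer pays the eviction cost on every write that overflows the cache.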

Cache Evictors

Alluxio provides the following five evictors to evict cached data:

  • LRUCacheEvictor (default): LRU cache eviction policy

  • FIFOCacheEvictor: FIFO cache eviction policy

  • LFUCacheEvictor: LFU cache eviction policy. Pages are sorted in bucket order based on logarithmic count. Pages inside the bucket are sorted in LRU order.

  • NondeterministicLRUCacheEvictor: non-deterministic LRU cache eviction policy. Evicts an element chosen uniformly at random from the LRU tail.

  • TwoChoiceRandomEvictor: Two Choice Random client-side cache eviction policy. It selects two random page IDs and evicts the one least-recently used.
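As an illustration of the last policy in the list, here is a minimal sketch of two-choice random eviction. The function name, the last_access map, and sampling with replacement are assumptions for this example; Alluxio's TwoChoiceRandomEvictor may differ in its details.

```python
import random


def two_choice_random_victim(last_access: dict, rng: random.Random) -> str:
    """Pick two random page IDs and return the less recently used one.

    last_access maps page ID -> last access time (larger means more recent).
    Sampling is done with replacement, which is an assumption of this sketch.
    """
    if not last_access:
        raise ValueError("cache is empty")
    page_ids = list(last_access)
    first = rng.choice(page_ids)
    second = rng.choice(page_ids)
    # Evict whichever of the two candidates was accessed longer ago.
    return first if last_access[first] <= last_access[second] else second


last_access = {"p1": 10, "p2": 50, "p3": 30, "p4": 40}
rng = random.Random(0)
victim = two_choice_random_victim(last_access, rng)
print(victim)
```

Compared with strict LRU, this approach needs no ordered structure over all pages: it only compares two candidates, at the cost of sometimes evicting a page that is not the globally oldest.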

The worker cache and client cache have separate properties to define their respective evictor. For example, the following configuration in alluxio-site.properties sets LRUCacheEvictor for both worker and client-side caches.

alluxio.worker.page.store.evictor.class=alluxio.client.file.cache.evictor.LRUCacheEvictor
alluxio.user.client.cache.evictor.class=alluxio.client.file.cache.evictor.LRUCacheEvictor

Background Asynchronous Evicting

Alluxio supports setting different constraints on the cache space that will trigger an eviction:

  1. Setting a limit on the total size that the pages can occupy, i.e. the capacity of the page store;

  2. Setting a limit on the total number of pages in the page store.

On every page put operation, Alluxio checks if any of the constraints is violated. If a constraint is violated, a synchronous eviction takes place to make room for the incoming page. However, synchronous eviction on writing will degrade performance dramatically. Background asynchronous eviction aims at evicting cached data asynchronously beforehand to avoid evicting cached data during a write operation.

Eviction based on Capacity

To enable the background asynchronous evicting feature, add the following configurations to alluxio-site.properties:

alluxio.user.client.cache.async.eviction.enabled=true
alluxio.user.client.cache.async.eviction.check.interval=1min
alluxio.user.client.cache.async.eviction.high.water.mark=0.9
alluxio.user.client.cache.async.eviction.low.water.mark=0.8

With the above configuration, Alluxio creates a background thread that checks whether the page cache usage has reached the high water mark threshold. When it has, the thread evicts cached pages until usage falls to the low water mark threshold. The thread repeats this check periodically, at the interval defined by the check interval property.
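A single pass of such a background check might look like the sketch below. The function name, the page-size map, and the use of LRU order to pick victims are assumptions for illustration; in a real deployment the pass would be run by a timer at the configured check interval.

```python
from collections import OrderedDict


def async_evict_once(pages: "OrderedDict[str, int]", capacity: int,
                     high_water_mark: float = 0.9,
                     low_water_mark: float = 0.8) -> list:
    """Run one pass of a water-mark check (hypothetical helper).

    pages maps page ID -> size in bytes, ordered from least to most
    recently used. Returns the IDs of the evicted pages, or an empty
    list if usage is still below the high water mark.
    """
    used = sum(pages.values())
    if used < high_water_mark * capacity:
        return []  # nothing to do until the high water mark is reached
    evicted = []
    # Evict oldest pages first until usage is at or below the low water mark.
    while pages and used > low_water_mark * capacity:
        page_id, size = pages.popitem(last=False)
        used -= size
        evicted.append(page_id)
    return evicted


# 31 pages of 30 bytes each: 930 of 1000 bytes used, above the 0.9 mark.
pages = OrderedDict((f"p{i}", 30) for i in range(31))
first_pass = async_evict_once(pages, capacity=1000)
second_pass = async_evict_once(pages, capacity=1000)
print(first_pass, second_pass, sum(pages.values()))
```

The gap between the two water marks keeps the thread from evicting on every check: after one pass, usage must climb back up to the high water mark before the next eviction starts.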

Eviction based on Limit on Number of Pages

Alluxio asynchronously evicts pages that exceed the limit, using a background thread that periodically scans for excess pages. When the total page number exceeds highWatermark * maxPageNumberLimit, the thread triggers an eviction that continues until the total page number drops below lowWatermark * maxPageNumberLimit.

To enable this async eviction by page number limit, set alluxio.worker.page.store.max.page.number.limit.enabled to true. The limit on the maximum number of pages can be specified by alluxio.worker.page.store.max.page.number:

alluxio.user.client.cache.async.eviction.enabled=true
alluxio.user.client.cache.async.eviction.check.interval=1min
alluxio.user.client.cache.async.eviction.low.water.mark=0.6
alluxio.user.client.cache.async.eviction.high.water.mark=0.8

alluxio.worker.page.store.max.page.number.limit.enabled=true
alluxio.worker.page.store.max.page.number=100000
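With the values above, eviction would start once the page count exceeds 0.8 * 100000 = 80000 and continue until it falls to about 0.6 * 100000 = 60000. The sketch below works through that arithmetic; the function name is hypothetical, and treating the low threshold itself as the eviction target is a simplifying assumption.

```python
def pages_to_evict(current_pages: int, max_pages: int,
                   high_water_mark: float, low_water_mark: float) -> int:
    """Return how many pages one eviction pass would remove (hypothetical helper)."""
    if not 0 < low_water_mark < high_water_mark <= 1:
        raise ValueError("expected 0 < low water mark < high water mark <= 1")
    if current_pages <= high_water_mark * max_pages:
        return 0  # still at or below the trigger threshold
    # Assumption: evict down to the low threshold, rounded to a whole page.
    target = round(low_water_mark * max_pages)
    return current_pages - target


to_evict = pages_to_evict(85000, max_pages=100000,
                          high_water_mark=0.8, low_water_mark=0.6)
print(to_evict)
```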

REST API for Updating Configurations Dynamically

Alluxio provides the following REST APIs for users to set and get async eviction configurations dynamically:

  • Enable async eviction and update the related parameters

curl --location --request POST 'localhost:28080/v1/cache?cmd=enableCacheAsyncEviction&chacheEvictionCheckInterval=30&highWaterMark=0.8&lowWaterMark=0.5'

  • Disable async eviction

curl --location --request POST 'localhost:28080/v1/cache?cmd=disableCacheAsyncEviction'

  • Get the current async eviction parameters

curl --location 'localhost:28080/v1/cache?cmd=getPageCacheAsyncEvictionManagerInfo'
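The same requests can be prepared from a script. The sketch below uses only the Python standard library; the helper name cache_command_request is hypothetical, the http scheme is assumed (curl's default when none is given), and the parameter names are copied exactly as they appear in the curl commands above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def cache_command_request(cmd: str, host: str = "localhost:28080",
                          method: str = "POST", **params) -> Request:
    """Build a request for the /v1/cache endpoint (hypothetical helper)."""
    query = urlencode({"cmd": cmd, **params})
    return Request(f"http://{host}/v1/cache?{query}", method=method)


enable = cache_command_request(
    "enableCacheAsyncEviction",
    chacheEvictionCheckInterval=30,  # spelled as in the curl example
    highWaterMark=0.8,
    lowWaterMark=0.5,
)
disable = cache_command_request("disableCacheAsyncEviction")
info = cache_command_request("getPageCacheAsyncEvictionManagerInfo", method="GET")

for request in (enable, disable, info):
    print(request.get_method(), request.full_url)
```

Nothing is sent by this sketch. To issue a request against a running worker, pass it to urllib.request.urlopen.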
