Directory-Based Cluster Quota
Last updated
Last updated
Alluxio allows admins to define a quota on a directory, which can limit the total amount of cache space used by the files in that directory in Alluxio, across all workers in the cluster.
ETCD only stores quota definitions. Current quota usages are not stored and updated in ETCD.
Workers stay updated on the latest quota definitions by querying ETCD. Based on the quota definitions, workers track the current cache usage under directory with quota definitions.
The coordinator is responsible for periodically polling the latest quota usages from the workers, and forming a cluster-wide usage view by aggregating the usage from each worker. For a certain quota definition, if the total usage in all workers exceeds the defined limit, the coordinator will detect that. Then the coordinator will send cache-control commands to all workers to handle the violated quotas. More details about cache-control commands can be found here.
Please make sure the coordinator is running when enabling directory-based cluster quota feature. If the coordinator is not running, commands that add/remove/update/list quotas will fail. The workers will keep operating, but the quota definitions will not be enforced. So some quotas may be overused without limit.
To enable the directory-based cluster quota feature, set the following properties in conf/alluxio-site.properties
:
You can add a quota definition on a directory in Alluxio by specifying the directory path and the quota size. Note the path is an Alluxio path without a scheme, instead of a UFS path (like s3://bucket
).
When you add a quota definition, please make sure the directory is resolvable in Alluxio. In other words, the target directory must be under an existing Alluxio mount point and it must exist in the UFS. In the example below, there are two existing mount points in Alluxio namespace, and you may only add quota definitions on directories under /s3
or /local
.
You can set quota definition on any directory under existing mount points. For example:
Alluxio also supports nested quota definitions. For example, you can add children quota definitions under an existing parent quota definition, to further split a quota into smaller ones.
The sum of children quota definitions must not exceed the parent quota definition. We also highly recommend leaving enough buffer in the parent quota definition, which is not allocated to any child quota definitions.
You can add a parent quota definition on top of existing children quota definitions.
You can also add a child quota definition at a disjoint level. In other words, not every directory level in a quota definition needs to have a quota definition.
There is no limit in how many levels of directory a quota definition can have. There is also no limit in how many children a parent quota definition can have. However, we do not recommend defining more than 100 quota rules or more than 3 levels of nested quota rules. That will both introduce performance overhead to the cluster and operational overhead to the cluster administrator.
You can remove a quota definition on a directory in Alluxio by specifying the directory path.
You can remove a parent quota definition without removing its children first.
Updating a quota definition takes the same arguments as adding a quota definition.
However, note that if you are updating the quota size to a smaller value than the current usage, the update will fail. In this case, we recommend freeing up some space before reducing the quota size.
In that case, you may use the --force
option to force the update. However, that is not the recommended approach. If you force-update the quota capacity to be less than the current usage, the coordinator will realize the quota is violated. It will then trigger eviction on all workers to evict some cache under that quota. It may also, depending on the configuration, ask the workers to either stop caching or reject I/O requests if they try to create new cache under that directory. See more details about the configured action in When Quota Limit Is Exceeded
When you update a quota definition, the sum of all children quota definitions may never exceed the parent quota definition. If the updated quota definition violates this rule, the update will fail.
You can list all existing quota definitions and their usages in Alluxio by running the following command. Note that if you have just added a new quota definition, workers may need some time to scan their existing cache and calculate the current usage under this directory. In the meantime, the usage will be marked as Calculating
. A child quota usage will be reported under its parent quota.
Alluxio also provides a command to continuously poll the latest quota usages from the coordinator:
When the current quota usage exceeds the defined limit, the coordinator will command all workers in the cluster to evict some cache under that quota, so the aggregated usage will fall below the limit again. For example, if there is one existing quota definition of 10GB capacity on Alluxio path /s3/
with 2 workers in the cluster.
Worker A currently holds 12GB cache under /s3/
.
Worker B currently holds 4GB cache under /s3/
.
The coordinator will aggregate the total usage from both workers, and observe that the current total is 16GB, which exceeds the limit of 10GB. The coordinator will send a command to both worker A and B, asking them to each evict (16 - 10) / 16 = 0.375
of their current cache under /s3/
. If the eviction is successful, after a short time, the cache usage of the two workers will become:
Worker A now holds 7.5GB cache under /s3/
.
Worker B now holds 2.5GB cache under /s3/
.
As can be observed, when the quota limit is exceeded, the coordinator will ask each worker to evict the same percentage of cache. This is the only eviction strategy supported.
On top of eviction, alluxio.quota.limit.exceeded.action
controls the behavior when a quota is violated. There are 3 supported modes:
NO_CACHE
(default): When a quota is violated, all workers will avoid adding new cache under that path. If a read request is a cache-miss, the worker will serve it by reading the UFS but not caching. This mode stops new cache from being added to the cluster, while allowing requests to complete without errors.
REJECT
: When a quota is violated, all workers will reject new cache requests under that path. If a request wants to create new cache under that path, the worker returns an exception. The behavior of Alluxio cluster cache becomes more similar to a disk, where you cannot write into a full disk.
NOOP
: When a quota is violated, still allow new cache to be created under that path. Under this mode, new cache will keep being added to the workers, whereas the coordinator keeps commanding workers to evict cache to restore the quota limit. Note that if the incoming rate of cache is faster than eviction, the usage may never be restored to below the limit set by the quota definition.
When the cache usage under a quota definition is above the limit, cache eviction will be triggered to restore the quota limit. The behavior of eviction is more complicated when there are nested quota definitions.
Within one group of parent and child quota definitions, the quota usages can be over the limit in cases below:
Case 1: Only child quotas are overused One or more child quotas are overused, but the usage of parent quota is still below the limit. In this case, eviction will be triggered only on the overused child quotas. As a consequence, some cache in the corresponding child quota directories will be evicted.
Case 2: Only the parent quota is overused No child quotas are overused, but the usage of parent quota is above the limit. In this case, eviction will be triggered on the parent quota directory. The eviction will consider all cache under the parent directory according to the eviction policy, including those in the child quota directories. As a consequence, quota usage in some child directories may decrease, although their quota capacity are not exceeded.
Case 3: The parent quota is overused because of overuse in child quotas One or more child quotas are overused. That causes the parent quota usage to be over the capacity. In this case, eviction will be triggered only on the overused child quotas. When usages in those child quotas are restored to below the capacity, the parent quota usage will also go back to below the limit.
Case 4: Both the parent and child quotas are overused One or more child quotas are overused, but the overuse in the parent quota is more than that. In this case, the evaluation of which files to evict starts from the innermost level of quota definitions. Eviction for parent quota definitions are not considered until the quotas for their children are met. For example, the parent quota usage is 100GB over the capacity, and one child quota is 50GB over the capacity. Then 50GB will be first evicted from this child quota, and another 50GB will be evicted from the parent quota.
alluxio.quota.worker.heartbeat.interval.ms
controls the frequency of coordinator aggregating the current quota usage from workers in the cluster. Because the coordinator polls the quota usages from workers periodically, the quota usage it observes (also reflected in the bin/alluxio quota list
command) will have a delay.
Other than the bin/alluxio quota list
command, Alluxio also exposes the current quota usages in Prometheus metrics on the Coordinator process. The metrics are labeled with the Alluxio path of the quota definition. The cluster admin can track these Prometheus metrics in the Grafana dashboard and monitor the current quota usage without running the command.
The Load command will load a directory or a large list of files specified in an index file into the cache. It is possible for a load operation to exceed a quota and cause undesired cache eviction. To avoid this scenario, a check can be enabled to reject the job if the load operation would exceed a quota.
This check will calculate and compare 3 pieces of information:
The total size of the files in the UFS
For those files, how much of them are already cached in Alluxio workers
The current availability of the quota definition
For example, if the index file specifies 100 files of total 100GB size, with 50GB of them are already cached in the workers and the quota currently only has 10GB availability left, Alluxio should reject this load job. The admin should manually free up at least 40GB from that quota before trying to submit the load job again.
By default, the quota check step in a Load job is disabled and none of the following constraints will be enforced. To enable the quota check step, set the following property:
Adding the --skip-quota-check
will override the quota check enabled by the configuration property.
The quota check must be enabled for any of the following constraints to be enforced. If it is disabled by configuration or command line flag, no checks will occur.
The quota check will reject a load job if it involves files from multiple quota definitions. This applies to both loading a directory or loading a list of files specified in an index file. This constraint will always be enforced if quota check is enabled.
The quota check can be configured to reject a load job if it tries to load a path that is not under a quota definition. This constraint can be used to ensure all load jobs are under the quota control. It can optionally be turned off from the set of quota checks by setting:
Since the quota is only checked at the start of a load job, it is possible to exceed the quota if running concurrent load jobs under the same quota. To avoid this situation, the quota check can be configured to only allow one load job at a time to run within a path set with a quota. To opt into enforcing this restriction, set the property: