List Of Configuration Properties

All Alluxio configuration settings fall into one of six categories: Common (shared by Master and Worker), Master specific, Worker specific, User specific, Cluster specific (used for running Alluxio with cluster managers like Mesos and YARN), and Security specific (shared by Master, Worker, and User).

Common Configuration

The common configuration contains constants shared by different components.

Each entry below lists the property name, its default value (if any), and a description.

alluxio.conf.dynamic.update.enabled

false

Whether to support dynamic property updates.

alluxio.cross.cluster.master.bind.host

0.0.0.0

The host that the Alluxio cross cluster master will bind to.

alluxio.cross.cluster.master.hostname

${alluxio.master.hostname}

The hostname of the Cross Cluster master.

alluxio.cross.cluster.master.rpc.port

20009

The port for Alluxio cross cluster master's RPC service.

alluxio.cross.cluster.master.web.bind.host

0.0.0.0

The host that the cross cluster master web server binds to.

alluxio.cross.cluster.master.web.hostname

${alluxio.cross.cluster.master.hostname}

The hostname of the cross cluster master web server.

alluxio.cross.cluster.master.web.port

20010

The port the cross cluster master web server uses.

alluxio.debug

false

Set to true to enable debug mode which has additional logging and info in the Web UI.

alluxio.exit.collect.info

true

If true, the process will dump metrics and jstack into the log folder. This only applies to Alluxio master and worker processes.

alluxio.fuse.auth.policy.class

alluxio.fuse.auth.LaunchUserGroupAuthPolicy

The fuse auth policy class. Valid options include: `alluxio.fuse.auth.LaunchUserGroupAuthPolicy` using the user launching the AlluxioFuse application to do authentication, `alluxio.fuse.auth.SystemUserGroupAuthPolicy` using the end-user running the fuse command to do authentication which matches POSIX standard but sacrifices performance, `alluxio.fuse.auth.CustomAuthPolicy` using the custom user group to do authentication.

alluxio.fuse.auth.policy.custom.group

The fuse group name for custom auth policy. Only valid if the alluxio.fuse.auth.policy.class is alluxio.fuse.auth.CustomAuthPolicy

alluxio.fuse.auth.policy.custom.user

The fuse user name for custom auth policy. Only valid if the alluxio.fuse.auth.policy.class is alluxio.fuse.auth.CustomAuthPolicy
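
For illustration, a minimal alluxio-site.properties snippet selecting the custom auth policy might look like the following (the user and group names are placeholders, not defaults):

alluxio.fuse.auth.policy.class=alluxio.fuse.auth.CustomAuthPolicy
alluxio.fuse.auth.policy.custom.user=fuse-user
alluxio.fuse.auth.policy.custom.group=fuse-group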

alluxio.fuse.cached.paths.max

500

Maximum number of FUSE-to-Alluxio path mappings to cache for FUSE conversion.

alluxio.fuse.debug.enabled

false

Run FUSE in debug mode, and have the fuse process log every FS request.

alluxio.fuse.fs.name

alluxio-fuse

The FUSE file system name.

alluxio.fuse.jnifuse.enabled

true

Use JNI-Fuse library for better performance. If disabled, JNR-Fuse will be used.

alluxio.fuse.jnifuse.libfuse.version

2

The version of libfuse used by libjnifuse. Libfuse2 and Libfuse3 are supported.

alluxio.fuse.logging.threshold

10s

Log a FUSE API call when it takes more time than the threshold.

alluxio.fuse.mount.alluxio.path

/

The Alluxio path to mount to the given Fuse mount point configured by alluxio.fuse.mount.point in the worker when alluxio.worker.fuse.enabled is enabled or in the standalone Fuse process.

alluxio.fuse.mount.options

attr_timeout=600,entry_timeout=600

The platform specific Fuse mount options to mount the given Fuse mount point. If multiple mount options are provided, separate them with comma.

alluxio.fuse.mount.point

/mnt/alluxio-fuse

The absolute local filesystem path that the worker (if alluxio.worker.fuse.enabled is enabled) or standalone Fuse will mount the Alluxio path to.
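
As an example, the three mount-related properties above can be combined in alluxio-site.properties; the values shown here are simply the documented defaults:

alluxio.fuse.mount.point=/mnt/alluxio-fuse
alluxio.fuse.mount.alluxio.path=/
alluxio.fuse.mount.options=attr_timeout=600,entry_timeout=600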

alluxio.fuse.shared.caching.reader.enabled

false

(Experimental) Use a shared gRPC data reader for better performance on multi-process file reading through Alluxio JNI-Fuse. Block data will be cached on the client side, so more memory is required for the Fuse process.

alluxio.fuse.special.command.enabled

false

If enabled, users can issue special FUSE commands by using 'ls -l /path/to/fuse_mount/.alluxiocli.<command_name>.<subcommand_name>'. For example, when Alluxio is mounted at local path /mnt/alluxio-fuse, 'ls -l /mnt/alluxio-fuse/.alluxiocli.metadatacache.dropAll' will drop all the user metadata cache. 'ls -l /mnt/alluxio-fuse/.alluxiocli.metadatacache.size' will get the metadata cache size; the size value will be shown in the output's filesize field. 'ls -l /mnt/alluxio-fuse/path/to/be/cleaned/.alluxiocli.metadatacache.drop' will drop the metadata cache of path '/mnt/alluxio-fuse/path/to/be/cleaned/'.

alluxio.fuse.stat.cache.refresh.interval

5min

The fuse filesystem statistics (e.g. Alluxio capacity information) will be refreshed after being cached for this time period. If the refresh time is too long, operations on the FUSE may fail because of stale filesystem statistics. If it is too short, continuously fetching filesystem statistics creates a large number of master RPC calls and lowers the overall performance of the Fuse application. A value smaller than or equal to zero means no statistics cache on the Fuse side.

alluxio.fuse.umount.timeout

0s

The timeout to wait for all in progress file read and write to finish before unmounting the Fuse filesystem when SIGTERM signal is received. A value smaller than or equal to zero means no umount wait time.

alluxio.fuse.user.group.translation.enabled

false

Whether to translate Alluxio users and groups into Unix users and groups when exposing Alluxio files through the FUSE API. When this property is set to false, the user and group for all FUSE files will match the user who started the alluxio-fuse process. Note that this applies to JNR-FUSE only.

alluxio.fuse.web.bind.host

0.0.0.0

The hostname Alluxio FUSE web UI binds to.

alluxio.fuse.web.enabled

false

Whether to enable FUSE web server.

alluxio.fuse.web.hostname

The hostname of Alluxio FUSE web UI.

alluxio.fuse.web.port

49999

The port Alluxio FUSE web UI runs on.

alluxio.grpc.reflection.enabled

false

If true, gRPC reflection will be enabled on Alluxio gRPC servers, including masters, workers, job masters and job workers. This makes it easier for gRPC tools such as grpcurl or grpcui to send gRPC requests to the servers without knowing the protobufs. This is a debug option.

alluxio.hadoop.kerberos.keytab.login.autorenewal

Whether the Kerberos authentication keytab login is automatically renewed.

alluxio.hadoop.security.authentication

HDFS authentication method.

alluxio.hadoop.security.krb5.conf

The krb5 configuration file for Kerberos.

alluxio.home

/opt/alluxio

Alluxio installation directory.

alluxio.job.batch.size

20

The number of tasks to be included in a job request.

alluxio.job.load.failure.count.threshold

100

When the number of failed files is greater than this value, the job status changes to failure.

alluxio.job.load.failure.ratio.threshold

0.05

When the file failure ratio is greater than this value, the job status changes to failure.

alluxio.job.master.bind.host

0.0.0.0

The host that the Alluxio job master will bind to.

alluxio.job.master.client.threads

1024

The number of threads the Alluxio master uses to make requests to the job master.

alluxio.job.master.embedded.journal.addresses

A comma-separated list of journal addresses for all job masters in the cluster. The format is 'hostname1:port1,hostname2:port2,...'. Defaults to the journal addresses set for the Alluxio masters (alluxio.master.embedded.journal.addresses), but with the job master embedded journal port.

alluxio.job.master.embedded.journal.port

20003

The port job masters use for embedded journal communications.

alluxio.job.master.finished.job.purge.count

-1

The maximum number of jobs to purge at any single time when the job master reaches its maximum capacity. It is recommended to set this value when setting the capacity of the job master to a large (> 10M) value. Default is -1, denoting an unlimited value.

alluxio.job.master.finished.job.retention.time

60sec

The length of time the Alluxio Job Master should save information about completed jobs before they are discarded.

alluxio.job.master.hostname

${alluxio.master.hostname}

The hostname of the Alluxio job master.

alluxio.job.master.job.capacity

100000

The total possible number of available job statuses in the job master. This value includes running and finished jobs which have completed within alluxio.job.master.finished.job.retention.time.

alluxio.job.master.job.trace.retention.time

1d

The length of time the client can trace the submitted job.

alluxio.job.master.lost.master.interval

10sec

The time interval the job master waits between checks for lost job masters.

alluxio.job.master.lost.worker.interval

1sec

The time interval the job master waits between checks for lost workers.

alluxio.job.master.master.heartbeat.interval

1sec

The amount of time that a standby Alluxio Job Master should wait in between heartbeats to the primary Job Master.

alluxio.job.master.master.timeout

60sec

The time period after which the primary Job Master will mark a standby as lost without a subsequent heartbeat.

alluxio.job.master.network.flowcontrol.window

2MB

The HTTP2 flow control window used by Alluxio job-master gRPC connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.job.master.network.keepalive.time

2h

The amount of time for Alluxio job-master gRPC server to wait for a response before pinging the client to see if it is still alive.

alluxio.job.master.network.keepalive.timeout

30sec

The maximum time for Alluxio job-master gRPC server to wait for a keepalive response before closing the connection.

alluxio.job.master.network.max.inbound.message.size

100MB

The maximum size of a message that can be sent to the Alluxio master

alluxio.job.master.network.permit.keepalive.time

30sec

Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.

alluxio.job.master.rpc.addresses

A list of comma-separated host:port RPC addresses where the client should look for job masters when using multiple job masters without Zookeeper. This property is not used when Zookeeper is enabled, since Zookeeper already stores the job master addresses. If property is not defined, clients will look for job masters using [alluxio.master.rpc.addresses]:alluxio.job.master.rpc.port first, then for [alluxio.job.master.embedded.journal.addresses]:alluxio.job.master.rpc.port.
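
For example, with three job masters this property might be set as follows in alluxio-site.properties (hostnames are illustrative; 20001 is the default alluxio.job.master.rpc.port):

alluxio.job.master.rpc.addresses=master1:20001,master2:20001,master3:20001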

alluxio.job.master.rpc.port

20001

The port for Alluxio job master's RPC service.

alluxio.job.master.web.bind.host

0.0.0.0

The host that the job master web server binds to.

alluxio.job.master.web.hostname

${alluxio.job.master.hostname}

The hostname of the job master web server.

alluxio.job.master.web.port

20002

The port the job master web server uses.

alluxio.job.master.worker.heartbeat.interval

1sec

The amount of time that the Alluxio job worker should wait in between heartbeats to the Job Master.

alluxio.job.master.worker.timeout

60sec

The time period after which the job master will mark a worker as lost without a subsequent heartbeat.

alluxio.job.request.batch.size

1

The batch size client uses to make requests to the job master.

alluxio.job.retention.time

1d

The length of time Alluxio should save information about completed jobs before they are discarded.

alluxio.job.worker.bind.host

0.0.0.0

The host that the Alluxio job worker will bind to.

alluxio.job.worker.data.port

30002

The port the Alluxio Job worker uses to send data.

alluxio.job.worker.hostname

${alluxio.worker.hostname}

The hostname of the Alluxio job worker.

alluxio.job.worker.rpc.port

30001

The port for Alluxio job worker's RPC service.

alluxio.job.worker.secure.rpc.port

30004

N/A

alluxio.job.worker.threadpool.size

10

Number of threads in the thread pool for job worker. This may be adjusted to a lower value to alleviate resource saturation on the job worker nodes (CPU + IO).

alluxio.job.worker.throttling

false

Whether the job worker should throttle itself based on whether the resources are saturated.

alluxio.job.worker.web.bind.host

0.0.0.0

The host the job worker web server binds to.

alluxio.job.worker.web.port

30003

The port the Alluxio job worker web server uses.

alluxio.jvm.monitor.info.threshold

1sec

When the JVM pauses for anything longer than this, log an INFO message.

alluxio.jvm.monitor.sleep.interval

1sec

The time for the JVM monitor thread to sleep.

alluxio.jvm.monitor.warn.threshold

10sec

When the JVM pauses for anything longer than this, log a WARN message.

alluxio.leak.detector.exit.on.leak

false

If set to true, the JVM will exit as soon as a leak is detected. Use only in testing environments.

alluxio.leak.detector.level

DISABLED

Set this to one of {DISABLED, SIMPLE, ADVANCED, PARANOID} to track resource leaks in the Alluxio codebase. DISABLED does not track any leaks. SIMPLE only samples resources, and doesn't track recent accesses, having a low overhead. ADVANCED is like simple, but tracks recent object accesses and has higher overhead. PARANOID tracks all objects and has the highest overhead. It is recommended to only use this value during testing.

alluxio.lib.dir

${alluxio.home}/lib

N/A

alluxio.license.file

${alluxio.home}/license.json

N/A

alluxio.locality.compare.node.ip

false

Whether to try to resolve the node IP address for locality checking.

alluxio.logserver.hostname

The hostname of Alluxio logserver. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.hostname=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.

alluxio.logserver.logs.dir

${alluxio.work.dir}/logs

Default location for remote log files. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.logs.dir=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.

alluxio.logserver.port

45600

Default port of logserver to receive logs from alluxio servers. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.logserver.port=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.
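
For instance, to point Alluxio servers at a remote logserver, these properties can be appended to $ALLUXIO_JAVA_OPTS (e.g. in conf/alluxio-env.sh); the hostname below is illustrative and 45600 is the default port:

ALLUXIO_JAVA_OPTS="${ALLUXIO_JAVA_OPTS} -Dalluxio.logserver.hostname=logserver-host -Dalluxio.logserver.port=45600"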

alluxio.logserver.threads.max

2048

The maximum number of threads used by logserver to service logging requests.

alluxio.logserver.threads.min

512

The minimum number of threads used by logserver to service logging requests.

alluxio.metrics.conf.file

${alluxio.conf.dir}/metrics.properties

The file path of the metrics system configuration file. By default it is `metrics.properties` in the `conf` directory.

alluxio.metrics.executor.task.warn.frequency

5sec

When instrumenting an executor with InstrumentedExecutorService, if the number of active tasks (queued or running) is greater than the alluxio.metrics.executor.task.warn.size value, a warning log will be printed at the given interval.

alluxio.metrics.executor.task.warn.size

1000

When instrumenting an executor with InstrumentedExecutorService, if the number of active tasks (queued or running) is greater than this value, a warning log will be printed at the interval given by alluxio.metrics.executor.task.warn.frequency

alluxio.native.library.path

${alluxio.home}/lib/native

N/A

alluxio.network.connection.auth.timeout

30sec

Maximum time to wait for a connection (gRPC channel) to attempt to receive an authentication response.

alluxio.network.connection.health.check.timeout

5sec

Allowed duration for checking health of client connections (gRPC channels) before being assigned to a client. If a connection does not become active within configured time, it will be shut down and a new connection will be created for the client

alluxio.network.connection.server.shutdown.timeout

60sec

Maximum time to wait for gRPC server to stop on shutdown

alluxio.network.connection.shutdown.graceful.timeout

45sec

Maximum time to wait for connections (gRPC channels) to stop on shutdown

alluxio.network.connection.shutdown.timeout

15sec

Maximum time to wait for connections (gRPC channels) to stop after graceful shutdown attempt.

alluxio.network.host.resolution.timeout

5sec

During startup of the Master and Worker processes Alluxio needs to ensure that they are listening on externally resolvable and reachable host names. To do this, Alluxio will automatically attempt to select an appropriate host name if one was not explicitly specified. This represents the maximum amount of time spent waiting to determine if a candidate host name is resolvable over the network.

alluxio.network.ip.address.used

false

If true, when alluxio.<service_name>.hostname and alluxio.<service_name>.bind.host of a service are not specified, the IP address is used as the connect host of the service.

alluxio.network.journal.tls.enabled

true

If network TLS is enabled, this decides whether TLS is also enabled for the journal.

alluxio.network.tls.client.no.endpoint.identification

false

Whether the client skips TLS endpoint identification.

alluxio.network.tls.keystore.alias

When the keystore contains multiple keys/aliases, this parameter must be set with the alias name of the key to use for the server. If the keystore only contains a single key, this parameter is ignored.

alluxio.network.tls.keystore.key.password

The password for the key in the keystore.

alluxio.network.tls.keystore.password

The password for the keystore.

alluxio.network.tls.keystore.path

The path to the keystore (in JKS format) for the server side of a TLS connection.

alluxio.network.tls.self.signed.enabled

false

If true, enables TLS with a self-signed certificate.

alluxio.network.tls.server.protocols

Comma-separated list of protocol names to enable on the server. If unset, the default internal list is used, which typically includes all the supported protocols. Example: TLSv1.1,TLSv1.2

alluxio.network.tls.truststore.alias

When the truststore contains multiple aliases, this parameter must be set with the alias name of the trust certificate. If the truststore only contains a single alias, this parameter is ignored.

alluxio.network.tls.truststore.password

The password for the truststore.

alluxio.network.tls.truststore.path

The path to the truststore (in JKS format) for the client side of a TLS connection.

alluxio.network.tls.use.system.trusstore

false

Whether to use the system truststore. If true, there is no need to specify a truststore path.

alluxio.policy.action.commit.executor.keepalive

1min

Maximum wait time for idle non-core threads before being terminated for committing actions in policy engine.

alluxio.policy.action.commit.executor.threads

8 * {number of CPUs}

Number of threads for committing actions in policy engine.

alluxio.policy.action.execution.executor.keepalive

1min

Maximum wait time for idle threads for executing actions in policy engine to be terminated.

alluxio.policy.action.execution.executor.threads

16 * {number of CPUs}

Number of threads for executing actions in policy engine.

alluxio.policy.action.scheduler.heartbeat.interval

10s

The interval between two policy engine action scheduler heartbeats.

alluxio.policy.action.scheduler.running.actions.max

8192

Max number of running actions allowed by policy engine action scheduler.

alluxio.policy.action.scheduler.scheduled.actions.max

100000

Max number of actions allowed to be scheduled by policy engine action scheduler.

alluxio.policy.action.scheduler.threads

8

Number of threads in executor pool for policy engine action scheduler.

alluxio.policy.enabled

true

Whether policy engine is enabled for Alluxio.

alluxio.policy.executor.shutdown.timeout

5sec

Maximum time to wait for policy engine executors to shutdown.

alluxio.policy.history.file.length.max

1000000

Maximum number of records in policy history file

alluxio.policy.incremental.incomplete.files.max

65536

The maximum number of incomplete file journal entries tracked by policy engine.

alluxio.policy.scan.file.metadata.sync.interval

7d

The interval for syncing UFS metadata before evaluating policies on a path. -1 means no sync will occur. 0 means Alluxio will always sync the metadata with UFS during policy scan. If you specify a time interval, Alluxio will (best effort) not re-sync a path within that time interval. Syncing the metadata for a path must interact with the UFS, so it is an expensive operation. It is recommended to set alluxio.master.metastore to ROCKS when setting policies on directories with a lot of files.

alluxio.policy.scan.file.metadata.sync.queue.size.max

10000000

Maximum number of tasks in queue allowed in policy engine for sync'ing metadata.

alluxio.policy.scan.file.metadata.sync.threads

4

Number of threads used by policy engine for sync'ing metadata.

alluxio.policy.scan.initial.delay

5m

The initial delay for the policy engine scan after the primary master starts.

alluxio.policy.scan.interval

24h

The interval between two policy engine scan events.

alluxio.proxy.audit.logging.enabled

false

Set to true to enable proxy audit.

alluxio.proxy.bind.host

0.0.0.0

The hostname that the Alluxio proxy binds to.

alluxio.proxy.hostname

The hostname of Alluxio proxy.

alluxio.proxy.master.heartbeat.interval

10sec

Proxy instances maintain a heartbeat with the primary master. This key specifies the heartbeat interval.

alluxio.proxy.rpc.port

39998

The port for Alluxio proxy's RPC service.

alluxio.proxy.s3.api.nocache.ufs.read.through.enabled

false

(Experimental) If enabled, reading files with a read type of NO_CACHE will be directly read from UFS.

alluxio.proxy.s3.api.noprefix.enabled

false

(Experimental) If enabled, removes the /api/v1/s3 prefix and supports accessing the proxy via /bucket/object paths.

alluxio.proxy.s3.bucket.naming.restrictions.enabled

false

Toggles whether or not the Alluxio S3 API will enforce AWS S3 bucket naming restrictions. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html.

alluxio.proxy.s3.bucketpathcache.timeout

0min

Expire bucket path statistics in cache for this time period. Set 0min to disable the cache. If enabling the cache, be careful that Alluxio S3 API will behave differently from AWS S3 API if bucket path cache entries become stale.

alluxio.proxy.s3.complete.multipart.upload.keepalive.enabled

false

Whether or not to enable sending whitespace characters as a keepalive message during CompleteMultipartUpload. Enabling this will cause any errors to be silently ignored. However, the errors will appear in the Proxy logs.

alluxio.proxy.s3.complete.multipart.upload.keepalive.time.interval

30sec

The complete multipart upload maximum keepalive time. The keepalive whitespace characters will be sent after 1 second, exponentially increasing in duration up to the configured value.

alluxio.proxy.s3.complete.multipart.upload.min.part.size

5MB

The minimum required file size of parts for multipart uploads. Parts which are smaller than this limit aside from the final part will result in an EntityTooSmall error code. Set to 0 to disable size requirements.

alluxio.proxy.s3.complete.multipart.upload.pool.size

20

The complete multipart upload thread pool size.

alluxio.proxy.s3.deletetype

ALLUXIO_AND_UFS

Delete type when deleting buckets and objects through S3 API. Valid options are `ALLUXIO_AND_UFS` (delete both in Alluxio and UFS), `ALLUXIO_ONLY` (delete only the buckets or objects in Alluxio namespace).

alluxio.proxy.s3.global.read.rate.limit.mb

0

Limit the maximum read speed for all connections. Set value less than or equal to 0 to disable rate limits.

alluxio.proxy.s3.header.metadata.max.size

2KB

The maximum size to allow for user-defined metadata in S3 PUT request headers. Set to 0 to disable size limits.

alluxio.proxy.s3.multipart.upload.cleaner.enabled

false

Enable automatic cleanup of long-running multipart uploads.

alluxio.proxy.s3.multipart.upload.cleaner.pool.size

1

The abort multipart upload cleaner pool size.

alluxio.proxy.s3.multipart.upload.cleaner.retry.count

3

The retry count when aborting a multipart upload fails.

alluxio.proxy.s3.multipart.upload.cleaner.retry.delay

10sec

The retry delay time when aborting a multipart upload fails.

alluxio.proxy.s3.multipart.upload.cleaner.timeout

10min

The timeout for aborting proxy s3 multipart upload automatically.

alluxio.proxy.s3.multipart.upload.stream.through

true

The complete multipart upload write type.

alluxio.proxy.s3.multipart.upload.write.through

false

The complete multipart upload write type.

alluxio.proxy.s3.single.connection.read.rate.limit.mb

0

Limit the maximum read speed for each connection. Set value less than or equal to 0 to disable rate limits.

alluxio.proxy.s3.tagging.restrictions.enabled

true

Toggles whether or not the Alluxio S3 API will enforce AWS S3 tagging restrictions (10 tags, 128 character keys, 256 character values) See https://docs.aws.amazon.com/AmazonS3/latest/userguide/tagging-managing.html.

alluxio.proxy.s3.throttle.max.wait.time.ms

60000

The maximum waiting time when the request is throttled.

alluxio.proxy.s3.use.position.read.range.size

0

When the requested range length is less than this value, the S3 proxy will use 'positionRead' to read data from the worker. Setting a value less than or equal to 0 indicates disabling this feature. In the current implementation, each request for a position read uses a byte array of the same size as the range to temporarily store data, which consumes additional memory. Therefore, in practical use, we limit this value to 4MB. This means that if a value exceeding 4MB is configured, it will be modified to 4MB.

alluxio.proxy.s3.v2.async.context.timeout.ms

30000

Timeout (in milliseconds) for the async context. Setting it to zero or less indicates no timeout.

alluxio.proxy.s3.v2.async.heavy.pool.core.thread.number

8

Core thread number for async heavy thread pool.

alluxio.proxy.s3.v2.async.heavy.pool.maximum.thread.number

64

Maximum thread number for async heavy thread pool.

alluxio.proxy.s3.v2.async.heavy.pool.queue.size

65536

Queue size for async heavy thread pool.

alluxio.proxy.s3.v2.async.light.pool.core.thread.number

8

Core thread number for async light thread pool.

alluxio.proxy.s3.v2.async.light.pool.maximum.thread.number

64

Maximum thread number for async light thread pool.

alluxio.proxy.s3.v2.async.light.pool.queue.size

65536

Queue size for async light thread pool.

alluxio.proxy.s3.v2.async.processing.enabled

false

(Experimental) If enabled, handle S3 request in async mode when v2 version of Alluxio s3 proxy service is enabled.

alluxio.proxy.s3.v2.version.enabled

false

(Experimental) V2, an optimized version of Alluxio s3 proxy service.

alluxio.proxy.s3.writetype

CACHE_THROUGH

Write type when creating buckets and objects through S3 API. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `THROUGH` (no cache, write to UnderFS synchronously).
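
As a sketch, an alluxio-site.properties snippet combining the S3 API write and delete types documented above (both values shown are the defaults):

alluxio.proxy.s3.writetype=CACHE_THROUGH
alluxio.proxy.s3.deletetype=ALLUXIO_AND_UFS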

alluxio.proxy.secure.rpc.bind.host

0.0.0.0

N/A

alluxio.proxy.secure.rpc.hostname

N/A

alluxio.proxy.secure.rpc.port

39997

N/A

alluxio.proxy.stream.cache.timeout

1hour

The timeout for the input and output streams cache eviction in the proxy.

alluxio.proxy.web.bind.host

0.0.0.0

The hostname that the Alluxio proxy's web server binds to.

alluxio.proxy.web.hostname

The hostname of the Alluxio proxy's web UI.

alluxio.proxy.web.port

39999

The port Alluxio proxy's web UI runs on.

alluxio.s3.rest.authentication.assume.role.from.token

false

Whether to treat the custom field in the token as the requested authentication role. The value of this field is used to specify the Alluxio ACL username to perform an operation with.

alluxio.s3.rest.authentication.assume.role.token.field

sub

Custom field in the token used as the requested authentication role. The value of this field is used to specify the Alluxio ACL username to perform an operation with.

alluxio.s3.rest.authentication.enabled

false

Whether to enable checking of the S3 REST request header.

alluxio.s3.rest.authentication.jwksaddr

The HTTPS JWKS endpoint to retrieve and cache keys for OIDC tokens. When Alluxio enables authentication through OIDC tokens, this parameter must be set to the JWKS endpoint address provided by the IdP. It will be used to verify the ID token from the IdP.

alluxio.s3.rest.authentication.oidc.enabled

false

Whether to enable authentication through OIDC Token in S3 Rest API.

alluxio.s3.rest.authentication.token.default.duration

3600

Default lifetime of a temporary session token for S3 Rest API.

alluxio.s3.rest.authenticator.classname

alluxio.proxy.s3.auth.PassAllAuthenticator

The class with this name is instantiated as the S3 authenticator.

alluxio.secondary.master.metastore.dir

${alluxio.work.dir}/secondary-metastore

The secondary master metastore work directory. Only some metastores need disk.

alluxio.site.conf.dir

${alluxio.conf.dir}/,${user.home}/.alluxio/,/etc/alluxio/

Comma-separated search path for alluxio-site.properties. Note: overwriting this property will only work when it is passed as a JVM system property (e.g., appending "-Dalluxio.site.conf.dir=<NEW_VALUE>" to $ALLUXIO_JAVA_OPTS). Setting it in alluxio-site.properties will not work.

alluxio.site.conf.rocks.block.file

Path of a file containing RocksDB block store configuration. A template configuration can be found at ${alluxio.conf.dir}/rocks-block.ini.template. See https://github.com/facebook/rocksdb/blob/main/examples/rocksdb_option_file_example.ini for more information on RocksDB configuration files. If unset, a default configuration will be used.

alluxio.site.conf.rocks.inode.file

Path of a file containing RocksDB inode store configuration. A template configuration can be found at ${alluxio.conf.dir}/rocks-inode.ini.template. See https://github.com/facebook/rocksdb/blob/main/examples/rocksdb_option_file_example.ini for more information on RocksDB configuration files. If unset, a default configuration will be used.

alluxio.standalone.fuse.jvm.monitor.enabled

false

Whether to start the JVM monitor thread on the standalone FUSE process. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.

alluxio.standby.master.metrics.sink.enabled

false

Whether a standby master runs the metric sink

alluxio.standby.master.web.enabled

false

Whether a standby master runs a web server

alluxio.table.catalog.path

/catalog

The Alluxio file path for the table catalog metadata.

alluxio.table.catalog.udb.sync.initial.delay

30m

This is the initial delay period, after becoming the primary master, before the periodic UDB sync starts. If this is too short, it may interfere with other tasks when becoming the primary. If this is too long, it may result in more staleness of the table metadata. This parameter is only used if UDB sync is enabled by setting alluxio.table.catalog.udb.sync.interval appropriately.

alluxio.table.catalog.udb.sync.interval

4h

The catalog service can periodically sync its databases with the UDB. This interval specifies how often the sync should be automatically performed. If this is too short, it may overload the interaction with the UDB. If this is too long, there will be a larger window of stale table metadata. Set this to -1 to disable the automatic syncing.

alluxio.table.catalog.udb.sync.timeout

1h

The timeout period for a db sync to finish in the catalog. If a sync takes longer than this timeout, the sync will be terminated.

alluxio.table.enabled

true

(Experimental) Enables the table service.

alluxio.table.journal.partitions.chunk.size

500

The maximum table partitions number in a single journal entry.

alluxio.table.load.default.replication

1

The default replication number of files under the SDS table after load option.

alluxio.table.transform.manager.job.history.retention.time

300sec

The length of time the Alluxio Table Master should keep information about finished transformation jobs before they are discarded.

alluxio.table.transform.manager.job.monitor.interval

10s

The job monitor is a heartbeat thread in the transform manager; this is the time interval at which the job monitor heartbeat runs to check the status of the transformation jobs and update table and partition locations after transformation.

alluxio.table.udb.hive.clientpool.MAX

256

The maximum capacity of the hive client pool per hive metastore

alluxio.table.udb.hive.clientpool.min

16

The minimum capacity of the hive client pool per hive metastore

alluxio.test.deprecated.key

N/A

alluxio.tmp.dirs

/tmp

The path(s) to store Alluxio temporary files, use commas as delimiters. If multiple paths are specified, one will be selected at random per temporary file. Currently, only files to be uploaded to object stores are stored in these paths.

alluxio.underfs.allow.set.owner.failure

false

Whether to allow setting owner in UFS to fail. When set to true, it is possible that file or directory owners diverge between Alluxio and UFS.

alluxio.underfs.cleanup.enabled

false

Whether or not to clean up under file storage periodically. Some UFS operations may not be completed and cleaned up successfully in normal ways and leave some intermediate data that needs periodic cleanup. If enabled, all the mount points will be cleaned up when a leader master starts or the cleanup interval is reached. This should be used sparingly.

alluxio.underfs.cleanup.interval

1day

The interval for periodically cleaning all the mounted under file storages.

alluxio.underfs.eventual.consistency.retry.base.sleep

50ms

To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the base time for the exponential backoff.

alluxio.underfs.eventual.consistency.retry.max.num

0

To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the maximum number of retries. This property defaults to 0 as modern object store UFSs provide strong consistency.

alluxio.underfs.eventual.consistency.retry.max.sleep

30sec

To handle eventually consistent storage semantics for certain under storages, Alluxio will perform retries when under storage metadata doesn't match Alluxio's expectations. These retries use exponential backoff. This property determines the maximum wait time in the backoff.

alluxio.underfs.gcs.default.mode

0700

Mode (in octal notation) for GCS objects if mode cannot be discovered.

alluxio.underfs.gcs.directory.suffix

/

Directories are represented in GCS as zero-byte objects named with the specified suffix.

alluxio.underfs.gcs.owner.id.to.username.mapping

Optionally, specify a preset gcs owner id to Alluxio username static mapping in the format "id1=user1;id2=user2". The Google Cloud Storage IDs can be found at the console address https://console.cloud.google.com/storage/settings . Please use the "Owners" one. This property key is only valid when alluxio.underfs.gcs.version=1

alluxio.underfs.gcs.retry.delay.multiplier

2

Delay multiplier while retrying requests on the ufs

alluxio.underfs.gcs.retry.initial.delay

1000

Initial delay before attempting the retry on the ufs

alluxio.underfs.gcs.retry.jitter

true

Enable delay jitter while retrying requests on the ufs

alluxio.underfs.gcs.retry.max

60

Maximum Number of retries on the ufs

alluxio.underfs.gcs.retry.max.delay

1min

Maximum delay before attempting the retry on the ufs

alluxio.underfs.gcs.retry.total.duration

5min

Maximum retry duration on the ufs

alluxio.underfs.gcs.version

2

Specify the version of the GCS module to use. GCS version "1" builds on top of the jets3t package, which requires fs.gcs.accessKeyId and fs.gcs.secretAccessKey. GCS version "2" builds on top of the Google Cloud API, which requires fs.gcs.credential.path.

alluxio.underfs.hdfs.configuration

${alluxio.conf.dir}/core-site.xml:${alluxio.conf.dir}/hdfs-site.xml

Location of the HDFS configuration file to overwrite the default HDFS client configuration. Note that these files must be available on every node.

alluxio.underfs.hdfs.impl

org.apache.hadoop.hdfs.DistributedFileSystem

The implementation class of the HDFS as the under storage system.

alluxio.underfs.hdfs.prefixes

hdfs://,glusterfs:///

Optionally, specify which prefixes should run through the HDFS implementation of UnderFileSystem. The delimiter is any whitespace and/or ','.

alluxio.underfs.hdfs.remote

true

Boolean indicating whether or not the under storage worker nodes are remote with respect to Alluxio worker nodes. If set to true, Alluxio will not attempt to discover locality information from the under storage because locality is impossible. This will improve performance. The default value is true.

alluxio.underfs.io.threads

Use 3*{CPU core count} for UFS IO.

Number of threads used for UFS IO operation

alluxio.underfs.listing.length

1000

The maximum number of directory entries to list in a single query to under file system. If the total number of entries is greater than the specified length, multiple queries will be issued.

alluxio.underfs.local.skip.broken.symlinks

false

When set to true, any time the local underfs lists a broken symlink, it will treat the entry as if it didn't exist at all.

alluxio.underfs.logging.threshold

10s

Log a UFS API call when it takes more time than the threshold.

alluxio.underfs.object.store.breadcrumbs.enabled

true

Set this to false to prevent Alluxio from creating zero byte objects during read or list operations on object store UFS. Leaving this on enables more efficient listing of prefixes.

alluxio.underfs.object.store.mount.shared.publicly

false

Whether or not to share object storage under storage system mounted point with all Alluxio users. Note that this configuration has no effect on HDFS nor local UFS.

alluxio.underfs.object.store.multi.range.chunk.size

${alluxio.user.block.size.bytes.default}

Default chunk size for ranged reads from multi-range object input streams.

alluxio.underfs.object.store.service.threads

20

The number of threads in executor pool for parallel object store UFS operations, such as directory renames and deletes.

alluxio.underfs.object.store.skip.parent.directory.creation

true

Do not create parent directories for new files. Object stores generally use prefixes, so parent directories are not required when creating new files. Skipping parent directory creation is recommended for better performance. Set this to false if the object store requires prefix creation for new files.

alluxio.underfs.object.store.streaming.upload.part.timeout

Timeout for uploading part when using streaming uploads.

alluxio.underfs.obs.intermediate.upload.clean.age

3day

Streaming uploads may not have been completed/aborted correctly and need periodical ufs cleanup. If ufs cleanup is enabled, intermediate multipart uploads in all non-readonly OBS mount points older than this age will be cleaned. This may impact other ongoing upload operations, so a large clean age is encouraged.

alluxio.underfs.obs.streaming.upload.enabled

false

(Experimental) If true, using streaming upload to write to OBS.

alluxio.underfs.obs.streaming.upload.partition.size

64MB

Maximum allowable size of a single buffer file when using OBS streaming upload. When the buffer file reaches the partition size, it will be uploaded and the upcoming data will be written to other buffer files. If the partition size is too small, OBS upload speed might be affected.

alluxio.underfs.obs.streaming.upload.threads

20

The number of threads to use for streaming upload data to OBS.

alluxio.underfs.oss.default.mode

0700

Mode (in octal notation) for OSS objects if mode cannot be discovered.

alluxio.underfs.oss.ecs.ram.role

The RAM role of current owner of ECS.

alluxio.underfs.oss.intermediate.upload.clean.age

3day

Streaming uploads may not have been completed/aborted correctly and need periodical ufs cleanup. If ufs cleanup is enabled, intermediate multipart uploads in all non-readonly OSS mount points older than this age will be cleaned. This may impact other ongoing upload operations, so a large clean age is encouraged.

alluxio.underfs.oss.owner.id.to.username.mapping

Optionally, specify a preset oss canonical id to Alluxio username static mapping, in the format "id1=user1;id2=user2".

alluxio.underfs.oss.retry.max

3

The maximum number of OSS error retry.

alluxio.underfs.oss.streaming.upload.enabled

false

(Experimental) If true, using streaming upload to write to OSS.

alluxio.underfs.oss.streaming.upload.partition.size

64MB

Maximum allowable size of a single buffer file when using OSS streaming upload. When the buffer file reaches the partition size, it will be uploaded and the upcoming data will be written to other buffer files. If the partition size is too small, OSS upload speed might be affected.

alluxio.underfs.oss.streaming.upload.threads

20

The number of threads to use for streaming upload data to OSS.

alluxio.underfs.oss.sts.ecs.metadata.service.endpoint

http://100.100.100.200/latest/meta-data/ram/security-credentials/

The ECS metadata service endpoint for Aliyun STS

alluxio.underfs.oss.sts.enabled

false

Whether to enable OSS STS (Security Token Service).

alluxio.underfs.oss.sts.token.refresh.interval.ms

30m

Time before an OSS Security Token is considered expired and will be automatically renewed

alluxio.underfs.persistence.async.temp.dir

.alluxio_ufs_persistence

The temporary directory used for async persistence in the ufs

alluxio.underfs.read.chunk.size

8KB

Read buffer size when reading data from UFS (currently only for distributedLoad).

alluxio.underfs.s3.admin.threads.max

20

The maximum number of threads to use for metadata operations when communicating with S3. These operations may be fairly concurrent and frequent but should not take much time to process.

alluxio.underfs.s3.assumerole.credential.process

If set, call the given process to get temporary credentials to assume a role. For more details, please refer to https://docs.aws.amazon.com/sdkref/latest/guide/setting-global-credential_process.html

alluxio.underfs.s3.assumerole.enabled

false

When using AssumeRole, the s3a.accessKeyId and s3a.secretKey are used to switch to a role (specified by alluxio.underfs.s3.assumerole.rolearn) for a temporary session. Temporary keys are then acquired from AWS STS for the session. The keys will be automatically refreshed at the end of the session, even if the session is manually revoked from the AWS console. No action is required at the end of alluxio.underfs.s3.assumerole.session.duration.second. For more details about the AssumeRole API, please refer to https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html
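
A minimal sketch of enabling AssumeRole in alluxio-site.properties; the role ARN and credential placeholders below are illustrative, not defaults:

alluxio.underfs.s3.assumerole.enabled=true
alluxio.underfs.s3.assumerole.rolearn=arn:aws:iam::123456789012:role/example-alluxio-role
alluxio.underfs.s3.assumerole.session.duration.second=3600
s3a.accessKeyId=<ACCESS_KEY_ID>
s3a.secretKey=<SECRET_KEY>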

alluxio.underfs.s3.assumerole.https.enabled

true

Whether or not to use HTTPS protocol when communicating with AWS STS.

alluxio.underfs.s3.assumerole.proxy.host

Optionally, specify a proxy host for communicating with AWS STS. Note that the proxy port also needs to be set to enable the proxy setting in Alluxio configuration; otherwise the proxy setting will be taken from the system environment.

alluxio.underfs.s3.assumerole.proxy.https.enabled

true

Whether or not to use HTTPS protocol as the proxy protocol when communicating with AWS STS.

alluxio.underfs.s3.assumerole.proxy.port

Optionally, specify a proxy port for communicating with AWS STS. Note that the proxy host also needs to be set to enable the proxy setting in Alluxio configuration; otherwise the proxy setting will be taken from the system environment.

alluxio.underfs.s3.assumerole.refresh.max.retry.times

10

The maximum number of retries for refreshing the assume role session token.

alluxio.underfs.s3.assumerole.refresh.sleep.base.ms

100ms

The base sleep time for refreshing the assume role session token.

alluxio.underfs.s3.assumerole.refresh.sleep.max.ms

1sec

The maximum sleep time for refreshing the assume role session token.

alluxio.underfs.s3.assumerole.rolearn

The Amazon Resource Name (ARN) of the role to assume. This is a required property if AssumeRole is enabled.

alluxio.underfs.s3.assumerole.s3client.cache.duration.threshold

30min

If the AssumeRole S3Client's lifecycle is greater than the threshold, the S3Client can be cached for the following session. The cache time would be (the AssumeRole session cache threshold - threshold).

alluxio.underfs.s3.assumerole.s3client.min.usable.lifetime

15min

The AmazonS3 client with AssumeRole Session Token is safe to use only if its remaining lifetime is greater than the threshold, otherwise some time-consuming operations like upload/download file may fail because of token expiration. The default value is 15min.

alluxio.underfs.s3.assumerole.session.cache.duration.threshold

45min

If the AssumeRole Session Token duration is greater than this threshold, the AssumeRole session token can be cached and used for next request. The cache time would be (duration - threshold).

alluxio.underfs.s3.assumerole.session.cache.size

1000

The AssumeRole session and AmazonS3 client cache size in the Master.

alluxio.underfs.s3.assumerole.session.duration.second

3600

The duration, in seconds, of the role session. The value can range from 900 seconds up to the maximum session duration setting for the role. The default value is 3600.

alluxio.underfs.s3.assumerole.session.prefix

alluxio-assume-role

A prefix to use for the AssumeRole session. The session name will be <prefix> + <a random string>.

alluxio.underfs.s3.assumerole.session.scope

USER

This is the AssumeRole session's scope. By default the AssumeRole session is created per user and path (USER). It can be switched to per user (USER).

alluxio.underfs.s3.assumerole.throttling.max.retry.time

30sec

When the aws assume role application is throttled, this is the max wait time

alluxio.underfs.s3.connection.ttl

-1

The expiration time of S3 connections in ms. -1 means the connection will never expire.

alluxio.underfs.s3.default.mode

0700

Mode (in octal notation) for S3 objects if mode cannot be discovered.

alluxio.underfs.s3.directory.suffix

/

Directories are represented in S3 as zero-byte objects named with the specified suffix.

alluxio.underfs.s3.disable.dns.buckets

false

Optionally, specify to make all S3 requests path style.

alluxio.underfs.s3.endpoint

Optionally, to reduce data latency or visit resources which are separated in different AWS regions, specify a regional endpoint to make aws requests. An endpoint is a URL that is the entry point for a web service. For example, s3.cn-north-1.amazonaws.com.cn is an entry point for the Amazon S3 service in beijing region.

alluxio.underfs.s3.endpoint.region

Optionally, set the S3 endpoint region. If not provided, it is inferred from the endpoint URI or set to null.
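
For example, using the Beijing regional endpoint mentioned above (the matching region value is an assumption based on that endpoint):

alluxio.underfs.s3.endpoint=s3.cn-north-1.amazonaws.com.cn
alluxio.underfs.s3.endpoint.region=cn-north-1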

alluxio.underfs.s3.inherit.acl

true

Set this property to false to disable inheriting bucket ACLs on objects. Note that the translation from bucket ACLs to Alluxio user permissions is best effort, as some S3-like storage services do not implement ACLs fully compatible with S3.

alluxio.underfs.s3.intermediate.upload.clean.age

3day

Streaming uploads may not have been completed/aborted correctly and need periodical ufs cleanup. If ufs cleanup is enabled, intermediate multipart uploads in all non-readonly S3 mount points older than this age will be cleaned. This may impact other ongoing upload operations, so a large clean age is encouraged.

alluxio.underfs.s3.list.objects.v1

false

Whether to use version 1 of GET Bucket (List Objects) API.

alluxio.underfs.s3.max.error.retry

The maximum number of retry attempts for failed retryable requests. Setting this property will override the AWS SDK default.

alluxio.underfs.s3.owner.id.to.username.mapping

Optionally, specify a preset s3 canonical id to Alluxio username static mapping, in the format "id1=user1;id2=user2". The AWS S3 canonical ID can be found at the console address https://console.aws.amazon.com/iam/home?#security_credential . Please expand the "Account Identifiers" tab and refer to "Canonical User ID". Unspecified owner ids will map to a default empty username.

alluxio.underfs.s3.proxy.host

Optionally, specify a proxy host for communicating with S3.

alluxio.underfs.s3.proxy.port

Optionally, specify a proxy port for communicating with S3.

alluxio.underfs.s3.region

Optionally, set the S3 bucket region. If not provided, global bucket access will be enabled, incurring extra requests.

alluxio.underfs.s3.request.timeout

1min

The timeout for a single request to S3. Infinity if set to 0. Setting this property to a non-zero value can improve performance by avoiding the long tail of requests to S3. For very slow connections to S3, consider increasing this value or setting it to 0.

alluxio.underfs.s3.secure.http.enabled

false

Whether or not to use HTTPS protocol when communicating with S3.

alluxio.underfs.s3.server.side.encryption.enabled

false

Whether or not to encrypt data stored in S3.

alluxio.underfs.s3.signer.algorithm

The signature algorithm which should be used to sign requests to the s3 service. This is optional, and if not set, the client will automatically determine it. For interacting with an S3 endpoint which only supports v2 signatures, set this to "S3SignerType".

alluxio.underfs.s3.socket.timeout

50sec

Length of the socket timeout when communicating with S3.

alluxio.underfs.s3.streaming.upload.enabled

false

(Experimental) If true, using streaming upload to write to S3.

alluxio.underfs.s3.streaming.upload.partition.size

64MB

Maximum allowable size of a single buffer file when using S3A streaming upload. When the buffer file reaches the partition size, it will be uploaded and the upcoming data will be written to other buffer files. If the partition size is too small, S3A upload speed might be affected.
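
A short sketch of enabling S3 streaming upload with the documented default partition size:

alluxio.underfs.s3.streaming.upload.enabled=true
alluxio.underfs.s3.streaming.upload.partition.size=64MB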

alluxio.underfs.s3.threads.max

40

The maximum number of threads to use for communicating with S3 and the maximum number of concurrent connections to S3. Includes both threads for data upload and metadata operations. This number should be at least as large as the max admin threads plus max upload threads.

alluxio.underfs.s3.upload.threads.max

20

For an Alluxio worker, this is the maximum number of threads to use for uploading data to S3 for multipart uploads. These operations can be fairly expensive, so multiple threads are encouraged. However, this also splits the bandwidth between threads, meaning the overall latency for completing an upload will be higher for more threads. For the Alluxio master, this is the maximum number of threads used for the rename (copy) operation. It is recommended that this value be greater than or equal to alluxio.underfs.object.store.service.threads.

alluxio.underfs.security.authorization.plugin.name

Name of the authorization plugin for the under filesystem.

alluxio.underfs.security.authorization.plugin.paths

Classpaths for the under filesystem authorization plugin, separated by colons.

alluxio.underfs.strict.version.match.enabled

false

When enabled, Alluxio finds the UFS connector by strict version matching. Otherwise only version prefix is compared.

alluxio.web.cors.allow.credential

false

Whether to allow requests to include credentials.

alluxio.web.cors.allow.headers

*

Which headers are allowed for CORS. Use * to allow any header.

alluxio.web.cors.allow.methods

*

Which methods are allowed for CORS. Use * to allow any method.

alluxio.web.cors.allow.origins

*

Which origins are allowed for CORS. Use * to allow any origin.

alluxio.web.cors.enabled

false

Set to true to enable Cross-Origin Resource Sharing for RESTful API endpoints.

alluxio.web.cors.exposed.headers

*

Which headers are allowed to be set in the response when accessing a cross-origin resource. Use * to allow any header.

alluxio.web.cors.max.age

-1

Maximum number of seconds the results can be cached. -1 means no cache.
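
Putting the CORS properties above together, a permissive setup might look like this in alluxio-site.properties (values mirror the defaults, with CORS turned on):

alluxio.web.cors.enabled=true
alluxio.web.cors.allow.origins=*
alluxio.web.cors.allow.methods=*
alluxio.web.cors.allow.headers=*
alluxio.web.cors.max.age=-1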

alluxio.web.file.info.enabled

true

Whether detailed file information is enabled for the web UI.

alluxio.web.login.enabled

false

Whether login and authentication are enabled for the web UI.

alluxio.web.login.password

admin

Password to log in to the web UI.

alluxio.web.login.session.timeout

8h

If a session is inactive for a certain time period, then the session is automatically invalidated. This property specifies the time period. Valid values are formatted like 1min, 1h, 1d, representing 1 minute, 1 hour, and 1 day respectively.

alluxio.web.login.sessions

1000

The maximum number of active sessions.

alluxio.web.login.username

admin

Username to log in to the web UI.

alluxio.web.manager.enabled

false

Whether Manager is enabled for the web UI.

alluxio.web.refresh.interval

15s

The amount of time to wait before refreshing the Web UI if it is set to auto refresh.

alluxio.web.ssl.enabled

false

Whether SSL is enabled for the web UI.

alluxio.web.ssl.key.alias

https

The alias of the key to be used. If there is only one key in the KeyStore, this can be an empty string. If there are multiple keys with different aliases in the KeyStore, use this to specify the key to be used, otherwise, a key will be chosen by javax.net.ssl.X509KeyManager.

alluxio.web.ssl.key.password

${alluxio.web.ssl.keystore.password}

Password for getting the key from the KeyStore. When creating a key in the KeyStore, a password can be set for the key. This password is needed when getting the key from the keystore. If the key password is not set, alluxio.web.ssl.keystore.password will be used as the key password.

alluxio.web.ssl.keystore.password

changeit

Password for the KeyStore at alluxio.web.ssl.keystore.path.

alluxio.web.ssl.keystore.path

${alluxio.conf.dir}/web_keystore

Path to the KeyStore containing the SSL key pairs and certificates. If the KeyStore does not exist, a default KeyStore with self signed key and certificate will be generated and used. The generation needs configurations for alluxio.web.ssl.keystore.password, alluxio.web.ssl.key.password, and alluxio.web.ssl.key.alias.
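
A minimal sketch for enabling SSL on the web UI with the properties above (the keystore path, password, and alias shown are the documented defaults):

alluxio.web.ssl.enabled=true
alluxio.web.ssl.keystore.path=${alluxio.conf.dir}/web_keystore
alluxio.web.ssl.keystore.password=changeit
alluxio.web.ssl.key.alias=https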

alluxio.web.threaddump.log.enabled

false

Whether thread information is also printed to the log when the thread dump api is accessed

alluxio.web.threads

1

How many threads to use to serve the Alluxio web UI.

alluxio.web.ui.enabled

true

Whether the master/worker will have Web UI enabled. If set to false, the master/worker will not have Web UI page, but the RESTful endpoints and metrics will still be available.

alluxio.work.dir

${alluxio.home}

The directory to use for Alluxio's working directory. By default, the journal, logs, and under file storage data (if using local filesystem) are written here.

alluxio.zookeeper.address

Address of ZooKeeper.

alluxio.zookeeper.auth.enabled

true

If true, enable client-side Zookeeper authentication.

alluxio.zookeeper.connection.timeout

15s

Connection timeout for Alluxio (job) masters to select the leading (job) master when connecting to Zookeeper

alluxio.zookeeper.election.path

/alluxio/election

Election directory in ZooKeeper.

alluxio.zookeeper.enabled

false

If true, setup master fault tolerant mode using ZooKeeper.
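
For example, enabling ZooKeeper-based master fault tolerance might look like the following in alluxio-site.properties (the ZooKeeper address is illustrative):

alluxio.zookeeper.enabled=true
alluxio.zookeeper.address=zk-host:2181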

alluxio.zookeeper.job.election.path

/alluxio/job_election

N/A

alluxio.zookeeper.job.leader.path

/alluxio/job_leader

N/A

alluxio.zookeeper.leader.connection.error.policy

SESSION

Connection error policy defines how errors on zookeeper connections are treated in leader election. STANDARD policy treats every connection event as failure. SESSION policy relies on zookeeper sessions for judging failures, helping the leader retain its status as long as its session is protected.

alluxio.zookeeper.leader.inquiry.retry

10

The number of retries to inquire leader from ZooKeeper.

alluxio.zookeeper.leader.path

/alluxio/leader

Leader directory in ZooKeeper.

alluxio.zookeeper.session.timeout

60s

Session timeout to use when connecting to Zookeeper

fs.azure.account.oauth2.client.endpoint

The oauth endpoint for ABFS.

fs.azure.account.oauth2.client.id

The client id for ABFS.

fs.azure.account.oauth2.client.secret

The client secret for ABFS.

fs.azure.account.oauth2.msi.endpoint

MSI endpoint

fs.azure.account.oauth2.msi.tenant

MSI Tenant ID

fs.gcs.accessKeyId

The access key of GCS bucket. This property key is only valid when alluxio.underfs.gcs.version=1

fs.gcs.credential.path

The json file path of Google application credentials. This property key is only valid when alluxio.underfs.gcs.version=2

fs.gcs.secretAccessKey

The secret key of GCS bucket. This property key is only valid when alluxio.underfs.gcs.version=1
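
As a sketch, the two GCS versions above correspond to these credential setups in alluxio-site.properties (the key placeholders and JSON path are illustrative):

# GCS version 2 (Google Cloud API)
alluxio.underfs.gcs.version=2
fs.gcs.credential.path=/path/to/application_credentials.json

# GCS version 1 (jets3t)
# alluxio.underfs.gcs.version=1
# fs.gcs.accessKeyId=<GCS_ACCESS_KEY_ID>
# fs.gcs.secretAccessKey=<GCS_SECRET_ACCESS_KEY>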

fs.obs.accessKey

The access key of OBS bucket.

fs.obs.bucketType

obs

The type of bucket (obs/pfs).

fs.obs.endpoint

obs.myhwclouds.com

The endpoint of OBS bucket.

fs.obs.secretKey

The secret key of OBS bucket.

s3a.accessKeyId

The access key of S3 bucket.

s3a.secretKey

The secret key of S3 bucket.
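
For example, the UFS credential keys above are set as plain key-value pairs in alluxio-site.properties; the values below are placeholders.

```properties
# Credentials for an S3 under storage (placeholders)
s3a.accessKeyId=MY_ACCESS_KEY
s3a.secretKey=MY_SECRET_KEY
# The OBS and GCS (version 1) keys follow the same pattern
fs.obs.accessKey=MY_OBS_ACCESS_KEY
fs.obs.secretKey=MY_OBS_SECRET_KEY
```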

Master Configuration

The master configuration specifies information regarding the master node, such as the address and the port number.

Property Name
Default
Description

alluxio.master.audit.logging.enabled

false

Set to true to enable file system master audit.

alluxio.master.audit.logging.queue.capacity

10000

Capacity of the queue used by audit logging.

alluxio.master.backup.abandon.timeout

1min

Duration after which the leader will abandon the backup if it has not received a heartbeat from the backup worker.

alluxio.master.backup.connect.interval.max

30sec

Maximum delay between each connection attempt to backup-leader.

alluxio.master.backup.connect.interval.min

1sec

Minimum delay between each connection attempt to backup-leader.

alluxio.master.backup.delegation.enabled

true

Whether to delegate journals to standby masters in HA cluster.

alluxio.master.backup.directory

/alluxio_backups

Default directory for writing master metadata backups. This path is an absolute path of the root UFS. For example, if the root ufs directory is hdfs://host:port/alluxio/data, the default backup directory will be hdfs://host:port/alluxio_backups.

alluxio.master.backup.entry.buffer.count

10000

How many journal entries to buffer during a back-up.

alluxio.master.backup.heartbeat.interval

2sec

Interval at which stand-by master that is taking the backup will update the leading master with current backup status.

alluxio.master.backup.state.lock.exclusive.duration

0ms

Alluxio master will allow only exclusive locking of the state-lock for this duration. This duration starts after masters are started for the first time. User RPCs will fail to acquire the state-lock during this phase and a backup is guaranteed to take the state-lock in the meantime.

alluxio.master.backup.state.lock.forced.duration

15min

Exclusive locking of the state-lock will time out after this duration is spent in the forced phase.

alluxio.master.backup.state.lock.interrupt.cycle.enabled

false

This controls whether RPCs that are waiting/holding state-lock in shared-mode will be interrupted while state-lock is taken exclusively.

alluxio.master.backup.state.lock.interrupt.cycle.interval

30sec

The interval at which the RPCs that are waiting/holding state-lock in shared-mode will be interrupted while state-lock is taken exclusively.

alluxio.master.backup.suspend.timeout

3min

Timeout for when suspend request is not followed by a backup request.

alluxio.master.backup.transport.timeout

30sec

Communication timeout for messaging between masters for coordinating backup.

alluxio.master.bind.host

0.0.0.0

The hostname that Alluxio master binds to.

alluxio.master.block.scan.invalid.batch.max.size

10000000

The maximum batch size of invalid blocks when the master scans for invalid blocks; a negative number means no limit.

alluxio.master.container.id.reservation.size

1000

The number of container ids to 'reserve' before having to journal container id state. This allows the master to return container ids within the reservation without having to write to the journal.

alluxio.master.cross.cluster.enabled

false

True to enable cross cluster synchronization.

alluxio.master.cross.cluster.external.rpc.addresses

A list of comma-separated host:port RPC addresses where external clusters can connect to this cluster's master RPC service for cross-cluster synchronization. If not set, external clusters will use the addresses set in alluxio.master.rpc.addresses to connect to this cluster.

alluxio.master.cross.cluster.id

A unique id for this cluster

alluxio.master.cross.cluster.invalidation.queue.size

10000

Maximum number of invalidation messages to buffer on the publisher before dropping the connection.

alluxio.master.cross.cluster.invalidation.queue.wait

1s

Maximum time to wait for the invalidation queue to be ready at the publisher before dropping the connection.

alluxio.master.cross.cluster.rpc.addresses

A list of comma-separated host:port RPC addresses where the client should look for the cross cluster name service.

alluxio.master.daily.backup.enabled

false

Whether or not to enable daily primary master metadata backup.

alluxio.master.daily.backup.files.retained

3

The maximum number of backup files to keep in the backup directory.

alluxio.master.daily.backup.state.lock.grace.mode

TIMEOUT

Grace mode helps taking the state-lock exclusively for backup with minimum disruption to existing RPCs. This low-impact locking phase is called the grace-cycle. Two modes are supported: TIMEOUT/FORCED. TIMEOUT: exclusive locking will time out if it cannot acquire the lock within the grace-cycle. FORCED: the state-lock will be taken forcefully if the grace-cycle fails to acquire it. The forced phase might trigger interrupting of existing RPCs if that is enabled.

alluxio.master.daily.backup.state.lock.sleep.duration

5m

The duration that controls how long the lock waiter sleeps within a single grace-cycle.

alluxio.master.daily.backup.state.lock.timeout

1h

The max duration for a grace-cycle.

alluxio.master.daily.backup.state.lock.try.duration

2m

The duration that controls how long the state-lock is tried within a single grace-cycle.

alluxio.master.daily.backup.time

05:00

Default UTC time for writing daily master metadata backups. The accepted time format is hour:minute which is based on a 24-hour clock (E.g., 05:30, 06:00, and 22:04). Backing up metadata requires a pause in master metadata changes, so please set this value to an off-peak time to avoid interfering with other users of the system.
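
Taken together, the daily backup properties might be set as follows; the backup time is just an illustrative off-peak value and the other values restate defaults.

```properties
# Enable daily metadata backups from the primary master
alluxio.master.daily.backup.enabled=true
# Off-peak UTC time in hour:minute format (example value)
alluxio.master.daily.backup.time=02:30
# Keep the three most recent backup files (default)
alluxio.master.daily.backup.files.retained=3
# Backups are written to this path in the root UFS (default)
alluxio.master.backup.directory=/alluxio_backups
```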

alluxio.master.embedded.journal.addresses

A comma-separated list of journal addresses for all masters in the cluster. The format is 'hostname1:port1,hostname2:port2,...'. When left unset, Alluxio uses ${alluxio.master.hostname}:${alluxio.master.embedded.journal.port} by default
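
For example, a three-master cluster using the embedded journal could list its journal addresses explicitly; the hostnames below are placeholders and 19200 is the default from alluxio.master.embedded.journal.port.

```properties
# Placeholder master hostnames with the default embedded journal port
alluxio.master.embedded.journal.addresses=master1:19200,master2:19200,master3:19200
```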

alluxio.master.embedded.journal.catchup.retry.wait

1s

Time for embedded journal leader to wait before retrying a catch up. This is added to avoid excessive retries when server is not ready.

alluxio.master.embedded.journal.election.timeout.max

20s

The max election timeout for the embedded journal. When a random period between ${alluxio.master.embedded.journal.election.timeout.min} and ${alluxio.master.embedded.journal.election.timeout.max} elapses without a master receiving any messages, the master will attempt to become the primary. The election timeout is waited initially when the cluster is forming, so larger values for the election timeout will cause longer start-up time. Smaller values might introduce instability to leadership.

alluxio.master.embedded.journal.election.timeout.min

10s

The min election timeout for the embedded journal.

alluxio.master.embedded.journal.entry.size.max

10MB

The maximum single journal entry size allowed to be flushed. This value should be smaller than 30MB. Set to a larger value to allow larger journal entries when using the Alluxio Catalog service.

alluxio.master.embedded.journal.flush.size.max

160MB

The maximum size in bytes of journal entries allowed in concurrent journal flushing (journal IO to standby masters and IO to local disks).

alluxio.master.embedded.journal.port

19200

The port to use for embedded journal communication with other masters.

alluxio.master.embedded.journal.raft.client.request.interval

100ms

Base interval for retrying Raft client calls. The retry policy is ExponentialBackoffRetry

alluxio.master.embedded.journal.raft.client.request.timeout

60sec

Time after which calls made through the Raft client timeout.

alluxio.master.embedded.journal.ratis.config

Prefix for Apache Ratis internal configuration options. For example, setting alluxio.master.embedded.journal.ratis.config.raft.server.rpc.request.timeout will set ratis.config.raft.server.rpc.request.timeout on the Ratis service in the Alluxio master.

alluxio.master.embedded.journal.retry.cache.expiry.time

60s

The time for embedded journal server retry cache to expire. Setting a bigger value allows embedded journal server to cache the responses for a longer time in case of journal writer retries, but will take up more memory in master.

alluxio.master.embedded.journal.snapshot.replication.chunk.size

4MB

The stream chunk size used by masters to replicate snapshots.

alluxio.master.embedded.journal.snapshot.replication.compression.level

1

The zip compression level of sending a snapshot from one master to another. Only applicable when alluxio.master.embedded.journal.snapshot.replication.compression.type is not NO_COMPRESSION. The zip format defines ten levels of compression, ranging from 0 (no compression, but very fast) to 9 (best compression, but slow). Or -1 for the system default compression level.

alluxio.master.embedded.journal.snapshot.replication.compression.type

NO_COMPRESSION

The type of compression to use when transferring a snapshot from one master to another. Options are NO_COMPRESSION, GZIP, TAR_GZIP

alluxio.master.embedded.journal.transport.max.inbound.message.size

100MB

The maximum size of a message that can be sent to the embedded journal server node.

alluxio.master.embedded.journal.transport.request.timeout.ms

5sec

The duration after which embedded journal masters will timeout messages sent between each other. Lower values might cause leadership instability when the network is slow.

alluxio.master.embedded.journal.unsafe.flush.enabled

false

If true, embedded journal entries will be committed without waiting for the entry to be flushed to disk. This may improve performance of write operations on the Alluxio master if the journal is written to a slow or contested disk. WARNING: enabling this property may result in metadata loss if half or more of the master nodes fail. See Ratis property raft.server.log.unsafe-flush.enabled at https://github.com/apache/ratis/blob/master/ratis-docs/src/site/markdown/configuraions.md.

alluxio.master.embedded.journal.write.timeout

30sec

Maximum time to wait for a write/flush on embedded journal.

alluxio.master.failover.collect.info

true

If true, the primary master will persist metrics and jstack into the log folder when it transitions to standby.

alluxio.master.file.access.time.journal.flush.interval

1h

The minimum interval between asynchronous flushes of file access time update journal entries. Setting it to a non-positive value will make the journal update synchronous. Asynchronous updates reduce the performance impact of tracking access time but can lose some access time updates when the master stops unexpectedly.

alluxio.master.file.access.time.update.precision

1d

The file last access time is precise up to this value. Setting it to a non-positive value will update the last access time on every file access operation. A longer precision will help reduce the performance impact of tracking access time by reducing the amount of metadata writes that occur while reading the same group of files repetitively.

alluxio.master.file.access.time.updater.enabled

true

If enabled, file access time updater will update the file last access time when an inode is accessed. This property can be turned off to improve performance and reduce the number of journal entries if your application does not rely on the file access time metadata.

alluxio.master.file.access.time.updater.shutdown.timeout

1sec

Maximum time to wait for access updater to stop on shutdown.

alluxio.master.file.async.persist.handler

alluxio.master.file.async.DefaultAsyncPersistHandler

The handler for processing the async persistence requests.

alluxio.master.filesystem.liststatus.result.message.length

10000

Count of items on each list-status response message.

alluxio.master.filesystem.merge.inode.journals

false

If enabled, the file system master inode related journals will be merged and submitted BEFORE the inode path lock is released. For performance reasons, this does not apply to metadata sync, where journals are still flushed asynchronously.

alluxio.master.filesystem.operation.retry.cache.enabled

true

If enabled, each filesystem operation will be tracked on all masters, in order to avoid re-execution of client retries.

alluxio.master.filesystem.operation.retry.cache.size

100000

Size of fs operation retry cache.

alluxio.master.format.file.prefix

_format_

The file prefix of the file generated in the journal directory when the journal is formatted. The master will search for a file with this prefix when determining if the journal is formatted.

alluxio.master.grpc.server.shutdown.timeout

60sec

Maximum time to wait for gRPC server to stop on shutdown

alluxio.master.heartbeat.timeout

10min

Timeout between leader master and standby master indicating a lost master.

alluxio.master.hostname

The hostname of Alluxio master.

alluxio.master.journal.backup.when.corrupted

true

Takes a backup automatically when encountering journal corruption

alluxio.master.journal.catchup.protect.enabled

true

(Experimental) Make sure the journal catchup finishes before joining the quorum in fault tolerant mode when starting the master process and before the current master becomes the leader. This is added to prevent frequent leadership transitions during the heavy journal catchup stage. Catchup is only implemented in the UFS journal with Zookeeper.

alluxio.master.journal.checkpoint.period.entries

2000000

The number of journal entries to write before creating a new journal checkpoint.

alluxio.master.journal.exit.on.demotion

false

(Experimental) When this flag is set to true, the master process may start as the primary or standby in a quorum, but if at any point in time after becoming a primary it is demoted to standby, the process will shut down. This leaves the responsibility of restarting the master to re-join the quorum (e.g. in case of a journal failure on a particular node) to an external entity such as kubernetes or systemd.

alluxio.master.journal.flush.batch.time

100ms

Time to wait for batching journal writes.

alluxio.master.journal.flush.timeout

5min

The amount of time to keep retrying journal writes before giving up and shutting down the master.

alluxio.master.journal.folder

${alluxio.work.dir}/journal

The path to store master journal logs. When using the UFS journal this could be a URI like hdfs://namenode:port/alluxio/journal. When using the embedded journal this must be a local path

alluxio.master.journal.gc.period

2min

Frequency with which to scan for and delete stale journal checkpoints.

alluxio.master.journal.gc.threshold

5min

Minimum age for garbage collecting checkpoints.

alluxio.master.journal.init.from.backup

A uri for a backup to initialize the journal from. When the master becomes primary, if it sees that its journal is freshly formatted, it will restore its state from the backup. When running multiple masters, this property must be configured on all masters since it isn't known during startup which master will become the first primary.

alluxio.master.journal.local.log.compaction

true

Whether to employ a quorum level log compaction policy or a local (individual) log compaction policy.

alluxio.master.journal.log.size.bytes.max

10MB

If a log file is bigger than this value, it will rotate to next file.

alluxio.master.journal.request.data.timeout

20000

Time to wait for follower to respond to request to send a new snapshot

alluxio.master.journal.request.info.timeout

10000

Time to wait for follower to respond to request to get information about its latest snapshot

alluxio.master.journal.retry.interval

1sec

The amount of time to sleep between retrying journal flushes

alluxio.master.journal.space.monitor.interval

10min

How often to check and update information on space utilization of the journal disk. This is currently only compatible with linux-based systems and when alluxio.master.journal.type is configured to EMBEDDED

alluxio.master.journal.space.monitor.percent.free.threshold

10

When the percent of free space on any disk which backs the journal falls below this percentage, begin logging warning messages to let administrators know the journal disk(s) may be running low on space.

alluxio.master.journal.tailer.shutdown.quiet.wait.time

5sec

Before the standby master shuts down its tailer thread, there should be no update to the leader master's journal in this specified time period.

alluxio.master.journal.tailer.sleep.time

1sec

Time for the standby master to sleep when it cannot find anything new in the leader master's journal.

alluxio.master.journal.temporary.file.gc.threshold

30min

Minimum age for garbage collecting temporary checkpoint files.

alluxio.master.journal.type

EMBEDDED

The type of journal to use. Valid options are UFS (store journal in UFS), EMBEDDED (use a journal embedded in the masters), and NOOP (do not use a journal)
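
As an example, the journal type and journal folder are often set together; the HDFS URI below is the illustrative form given for alluxio.master.journal.folder rather than a real address.

```properties
# Store the journal in a UFS (EMBEDDED is the default and requires a local path instead)
alluxio.master.journal.type=UFS
alluxio.master.journal.folder=hdfs://namenode:port/alluxio/journal
```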

alluxio.master.journal.ufs.option

The configuration to use for the journal operations.

alluxio.master.jvm.monitor.enabled

true

Whether to start the JVM monitor thread on the master. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.

alluxio.master.keytab.file

Kerberos keytab file for Alluxio master.

alluxio.master.lock.pool.concurrency.level

100

Maximum concurrency level for the lock pool

alluxio.master.lock.pool.high.watermark

1000000

High watermark of lock pool size. When the size grows over the high watermark, a background thread starts evicting unused locks from the pool.

alluxio.master.lock.pool.initsize

1000

Initial size of the lock pool for master inodes.

alluxio.master.lock.pool.low.watermark

500000

Low watermark of lock pool size. When the size grows over the high watermark, a background thread will try to evict unused locks until the size reaches the low watermark.

alluxio.master.log.config.report.heartbeat.interval

1h

The interval for periodically logging the configuration check report.

alluxio.master.lost.proxy.deletion.timeout

30min

If an Alluxio Proxy has been lost for more than this timeout, the master will totally forget this proxy.

alluxio.master.lost.worker.deletion.timeout

30min

If a worker has no heartbeat with the master for more than this timeout, the master will totally forget this worker.

alluxio.master.lost.worker.detection.interval

10sec

The interval between Alluxio master detections to find lost workers based on updates from Alluxio workers.

alluxio.master.lost.worker.file.detection.interval

5min

The interval between Alluxio master detections to find lost files based on updates from Alluxio workers.

alluxio.master.merge.journal.context.num.entries.logging.threshold

10000

The logging threshold of number of journal entries which are held in a merge journal context. This log may help debug memory exhaustion issues.

alluxio.master.metadata.concurrent.sync.dedup

false

If set to true, a metadata sync request will be skipped and doesn't trigger a UFS sync when there have already been other requests syncing the same path. The outstanding metadata sync request will wait until these syncs are done and return SyncStatus.NOT_NEED.

alluxio.master.metadata.sync.concurrency.level

6

The maximum number of concurrent sync tasks running for a given sync operation

alluxio.master.metadata.sync.executor.pool.size

The total number of threads which can concurrently execute metadata sync operations.

The number of threads used to execute all metadata sync operations.

alluxio.master.metadata.sync.ignore.ttl

false

Whether files created from metadata sync will ignore the TTL from the command/path conf and have no TTL.

alluxio.master.metadata.sync.instrument.executor

false

If true, the metadata sync thread pool executors will be instrumented with additional metrics.

alluxio.master.metadata.sync.lock.pool.concurrency.level

20

Maximum concurrency level for the metadata sync lock pool

alluxio.master.metadata.sync.lock.pool.high.watermark

50000

High watermark of metadata sync lock pool size. When the size grows over the high watermark, a background thread starts evicting unused locks from the pool.

alluxio.master.metadata.sync.lock.pool.initsize

1000

Initial size of the lock pool for master metadata sync.

alluxio.master.metadata.sync.lock.pool.low.watermark

20000

Low watermark of metadata sync lock pool size. When the size grows over the high watermark, a background thread will try to evict unused locks until the size reaches the low watermark.

alluxio.master.metadata.sync.recursive.load.parent.dirs

true

This switch controls what happens on metadata sync. If this is true (the default), when Alluxio does not know path /a/b/c and a metadata sync loads path /a/b/c/file, Alluxio will recursively sync with the UFS on the unknown parent paths /a/, /a/b/, and /a/b/c/ (just syncing on the paths, without really listing them). If this is false, on a metadata sync on /a/b/c/file, Alluxio creates inodes /a/, /a/b/, /a/b/c/ in the Alluxio namespace without checking with the UFS. That means those inodes may not have the same permissions as the UFS. This switch exists specially for object storage systems, because recursively checking parent directories can be very expensive, and the parent path permissions are usually not as important (authorization is controlled by keys).

alluxio.master.metadata.sync.traversal.order

BFS

The traversal order of pending paths in the Inode SyncStream. DFS consumes less memory, while BFS is fairer to all concurrent sync tasks. For more description see the comments of MetadataSyncTraversalOrder.

alluxio.master.metadata.sync.ufs.concurrent.get.status

true

Allows metadata sync operations on single items (i.e. getStatus) to run concurrently with metadata sync operations on directories (i.e. listings) on intersecting paths.

alluxio.master.metadata.sync.ufs.concurrent.listing

true

Allows non-recursive metadata sync operations on directories to run concurrently with recursive metadata sync operations on intersecting paths.

alluxio.master.metadata.sync.ufs.concurrent.loads

100

The number of concurrently running UFS listing operations during metadata sync. This includes loads that have completed, but have not yet been processed.

alluxio.master.metadata.sync.ufs.prefetch.pool.size

The number of threads which can concurrently fetch metadata from UFSes during metadata sync operations.

The number of threads used to fetch UFS objects for all metadata sync operations.

alluxio.master.metadata.sync.ufs.prefetch.status

true

Whether or not to prefetch ufs status of children during metadata sync. Prefetching will facilitate the metadata sync process but will consume more memory to hold prefetched results.

alluxio.master.metadata.sync.ufs.prefetch.timeout

100ms

The timeout for a metadata fetch operation from the UFSes. Adjust this timeout according to the expected UFS worst-case response time.

alluxio.master.metadata.sync.ufs.rate.limit

The maximum number of operations per second to execute on an individual UFS during metadata sync operations. If 0 or unset then no rate limit is enforced.

alluxio.master.metastore

ROCKS

The type of metastore to use, either HEAP or ROCKS. The heap metastore keeps all metadata on-heap, while the rocks metastore stores some metadata on heap and some metadata on disk. The rocks metastore has the advantage of being able to support a large namespace (1 billion plus files) without needing a massive heap size. The metadata storage includes inode and block metadata. Users can override the type of metastore using alluxio.master.metastore.inode and alluxio.master.metastore.block. For example if alluxio.master.metastore=ROCKS but alluxio.master.metastore.inode=HEAP, then inodes are stored with HEAP and blocks are stored with ROCKS.
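
The per-component override described above looks like this as properties; it simply restates the example values from the description.

```properties
# Default metastore type for both inode and block metadata
alluxio.master.metastore=ROCKS
# Override: keep inodes on-heap while block metadata stays in RocksDB
alluxio.master.metastore.inode=HEAP
```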

alluxio.master.metastore.block

ROCKS

The type of block metastore to use, either HEAP or ROCKS. By default this uses alluxio.master.metastore.

alluxio.master.metastore.dir

${alluxio.work.dir}/metastore

The metastore work directory. Only some metastores need disk.

alluxio.master.metastore.dir.block

${alluxio.master.metastore.dir}

If the metastore is ROCKS, this property controls where the RocksDB stores block metadata. This property defaults to alluxio.master.metastore.dir. And it can be used to change block metadata storage path to a different disk to improve RocksDB performance.

alluxio.master.metastore.dir.inode

${alluxio.master.metastore.dir}

If the metastore is ROCKS, this property controls where the RocksDB stores inode metadata. This property defaults to alluxio.master.metastore.dir. And it can be used to change inode metadata storage path to a different disk to improve RocksDB performance.
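
With the RocksDB metastore, the two directory overrides above let inode and block metadata live on separate disks; the paths below are placeholders for locally attached SSDs.

```properties
# Placeholder paths on two different disks to spread RocksDB I/O
alluxio.master.metastore.dir.inode=/ssd1/alluxio/metastore/inode
alluxio.master.metastore.dir.block=/ssd2/alluxio/metastore/block
```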

alluxio.master.metastore.inode

ROCKS

The type of inode metastore to use, either HEAP or ROCKS. By default this uses alluxio.master.metastore.

alluxio.master.metastore.inode.cache.evict.batch.size

1000

The batch size for evicting entries from the inode cache.

alluxio.master.metastore.inode.cache.high.water.mark.ratio

0.85

The high water mark for the inode cache, as a ratio from high water mark to total cache size. If this is 0.85 and the max size is 10 million, the high water mark value is 8.5 million. When the cache reaches the high water mark, the eviction process will evict down to the low water mark.

alluxio.master.metastore.inode.cache.low.water.mark.ratio

0.8

The low water mark for the inode cache, as a ratio from low water mark to total cache size. If this is 0.8 and the max size is 10 million, the low water mark value is 8 million. When the cache reaches the high water mark, the eviction process will evict down to the low water mark.

alluxio.master.metastore.inode.cache.max.size

{Max memory of master JVM} / 2 / 2 KB per inode

The number of inodes to cache on-heap. The default value is chosen based on half the amount of maximum available memory of master JVM at runtime, and the estimation that each inode takes up approximately 2 KB of memory. This only applies to off-heap metastores, e.g. ROCKS. Set this to 0 to disable the on-heap inode cache

alluxio.master.metastore.inode.enumerator.buffer.count

10000

The number of entries to buffer during read-ahead enumeration.

alluxio.master.metastore.inode.inherit.owner.and.group

true

Whether to inherit the owner/group from the parent when creating a new inode path if empty

alluxio.master.metastore.inode.iteration.crawler.count

Use {CPU core count} for enumeration.

The number of threads used during inode tree enumeration.

alluxio.master.metastore.iterator.readahead.size

64MB

The read-ahead size (in bytes) for metastore iterators.

alluxio.master.metastore.metrics.refresh.interval

5s

Interval with which the master refreshes and reports metastore metrics

alluxio.master.metastore.rocks.block.location.block.index

The block index type to be used in the RocksDB block location table. If unset, the RocksDB default will be used. See https://rocksdb.org/blog/2018/08/23/data-block-hash-index.html

alluxio.master.metastore.rocks.block.location.bloom.filter

false

Whether or not to use a bloom filter in the Block location table in RocksDB. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter

alluxio.master.metastore.rocks.block.location.cache.size

The capacity in bytes of the RocksDB block location table LRU cache. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Block-Cache

alluxio.master.metastore.rocks.block.location.index

The index type to be used in the RocksDB block location table. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Index-Block-Format

alluxio.master.metastore.rocks.block.meta.block.index

The block index type to be used in the RocksDB block metadata table. If unset, the RocksDB default will be used. See https://rocksdb.org/blog/2018/08/23/data-block-hash-index.html

alluxio.master.metastore.rocks.block.meta.bloom.filter

false

Whether or not to use a bloom filter in the Block meta table in RocksDB. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter

alluxio.master.metastore.rocks.block.meta.cache.size

The capacity in bytes of the RocksDB block metadata table LRU cache. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Block-Cache

alluxio.master.metastore.rocks.block.meta.index

The index type to be used in the RocksDB block metadata table. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Index-Block-Format

alluxio.master.metastore.rocks.checkpoint.compression.type

LZ4_COMPRESSION

The compression algorithm that RocksDB uses internally. One of {NO_COMPRESSION SNAPPY_COMPRESSION ZLIB_COMPRESSION BZLIB2_COMPRESSION LZ4_COMPRESSION LZ4HC_COMPRESSION XPRESS_COMPRESSION ZSTD_COMPRESSION DISABLE_COMPRESSION_OPTION}

alluxio.master.metastore.rocks.edge.block.index

The block index type to be used in the RocksDB inode edge table. If unset, the RocksDB default will be used. See https://rocksdb.org/blog/2018/08/23/data-block-hash-index.html

alluxio.master.metastore.rocks.edge.bloom.filter

false

Whether or not to use a bloom filter in the Inode edge table in RocksDB. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter

alluxio.master.metastore.rocks.edge.cache.size

The capacity in bytes of the RocksDB Inode edge table LRU cache. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Block-Cache

alluxio.master.metastore.rocks.edge.index

The index type to be used in the RocksDB Inode edge table. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Index-Block-Format

alluxio.master.metastore.rocks.inode.block.index

The block index type to be used in the RocksDB inode table. If unset, the RocksDB default will be used. See https://rocksdb.org/blog/2018/08/23/data-block-hash-index.html

alluxio.master.metastore.rocks.inode.bloom.filter

false

Whether or not to use a bloom filter in the Inode table in RocksDB. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter

alluxio.master.metastore.rocks.inode.cache.size

The capacity in bytes of the RocksDB Inode table LRU cache. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Block-Cache

alluxio.master.metastore.rocks.inode.index

The index type to be used in the RocksDB Inode table. If unset, the RocksDB default will be used. See https://github.com/facebook/rocksdb/wiki/Index-Block-Format

alluxio.master.metastore.rocks.parallel.backup

false

Whether to checkpoint rocksdb in parallel using the number of threads set by alluxio.master.metastore.rocks.parallel.backup.threads.

alluxio.master.metastore.rocks.parallel.backup.threads

The default number of threads used when backing up rocksdb in parallel.

The number of threads used when backing up rocksdb in parallel.

alluxio.master.metrics.file.size.distribution.buckets

1KB,1MB,10MB,100MB,1GB,10GB

Master metrics file size buckets

alluxio.master.metrics.heap.enabled

false

Enable master heap estimate metrics

alluxio.master.metrics.service.threads

5

The number of threads in the metrics master executor pool for processing metrics submitted by workers or clients in parallel and updating cluster metrics.

alluxio.master.metrics.time.series.interval

5min

Interval for which the master records metrics information. This affects the granularity of the metrics graphed in the UI.

alluxio.master.mount.table.root.alluxio

/

Alluxio root mount point.

alluxio.master.mount.table.root.cross.cluster

false

Whether Alluxio root mount point uses cross cluster sync.

alluxio.master.mount.table.root.option

Configuration for the UFS of Alluxio root mount point.

alluxio.master.mount.table.root.readonly

false

Whether Alluxio root mount point is readonly.

alluxio.master.mount.table.root.shared

true

Whether Alluxio root mount point is shared.

alluxio.master.mount.table.root.ufs

${alluxio.work.dir}/underFSStorage

The storage address of the UFS at the Alluxio root mount point.
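
The root mount properties are commonly used to point the Alluxio root at a remote UFS. The sketch below assumes UFS property keys (such as s3a.accessKeyId from the Common section) can be appended to the alluxio.master.mount.table.root.option prefix; the bucket name and credentials are placeholders.

```properties
# Placeholder S3 bucket as the root under storage
alluxio.master.mount.table.root.ufs=s3://my-bucket/alluxio-root
# Assumed pattern: UFS property keys nested under the root option prefix
alluxio.master.mount.table.root.option.s3a.accessKeyId=MY_ACCESS_KEY
alluxio.master.mount.table.root.option.s3a.secretKey=MY_SECRET_KEY
```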

alluxio.master.network.flowcontrol.window

2MB

The HTTP2 flow control window used by Alluxio master gRPC connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.master.network.keepalive.time

2h

The amount of time for Alluxio master gRPC server to wait for a response before pinging the client to see if it is still alive.

alluxio.master.network.keepalive.timeout

30sec

The maximum time for Alluxio master gRPC server to wait for a keepalive response before closing the connection.

alluxio.master.network.max.inbound.message.size

100MB

The maximum size of a message that can be sent to the Alluxio master

alluxio.master.network.netty.channel

EPOLL

Netty channel type: NIO or EPOLL. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.master.network.permit.keepalive.time

30sec

Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.

alluxio.master.periodic.block.integrity.check.interval

1hr

The period for the block integrity check, disabled if <= 0.

alluxio.master.periodic.block.integrity.check.repair

true

Whether the system should delete orphaned blocks found during the periodic integrity check.

alluxio.master.persistence.blacklist

Patterns to blacklist persist, comma separated, string match, no regex. This affects any async persist call (including ASYNC_THROUGH writes and CLI persist) but does not affect CACHE_THROUGH writes. Users may want to specify temporary files in the blacklist to avoid unnecessary I/O and errors. Some examples are `.staging` and `.tmp`.
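
For instance, the temporary-file suffixes mentioned in the description would be configured like this:

```properties
# Skip async persistence for paths containing these strings (string match, no regex)
alluxio.master.persistence.blacklist=.staging,.tmp
```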

alluxio.master.principal

Kerberos principal for Alluxio master.

alluxio.master.proxy.check.heartbeat.timeout

1min

The master will periodically check the last heartbeat time from all Proxy instances. This key specifies the frequency of the check.

alluxio.master.proxy.timeout

5m

An Alluxio Proxy instance will maintain heartbeat to the primary Alluxio Master. No heartbeat more than this timeout indicates a lost Proxy.

alluxio.master.recursive.operation.journal.force.flush.max.entries

100

The threshold of the number of completed single operations in a recursive file system operation (e.g. delete file, set file attributes) that triggers a forced journal flush. Increasing the threshold decreases the possibility of seeing partial state of a recursive operation on a standby master but increases memory consumption, as Alluxio holds more journal entries in memory. This config is only available when alluxio.master.filesystem.merge.inode.journals is enabled.

alluxio.master.replication.check.interval

1min

How often the master runs background process to check replication level for files

alluxio.master.rpc.addresses

A list of comma-separated host:port RPC addresses where the client should look for masters when using multiple masters without Zookeeper. This property is not used when Zookeeper is enabled, since Zookeeper already stores the master addresses.
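
As an example, clients of a ZooKeeper-free multi-master deployment list every master's RPC address; the hostnames are placeholders and 19998 is the default from alluxio.master.rpc.port.

```properties
# Placeholder masters using the default master RPC port
alluxio.master.rpc.addresses=master1:19998,master2:19998,master3:19998
```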

alluxio.master.rpc.executor.core.pool.size

500

The number of threads to keep in thread pool of master RPC ExecutorService.

alluxio.master.rpc.executor.fjp.async

true

This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. If true, it establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than the default locally stack-based mode in applications in which worker threads only process event-style asynchronous tasks.

alluxio.master.rpc.executor.fjp.min.runnable

1

This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. It controls the minimum allowed number of core threads not blocked. A value of 1 ensures liveness. A larger value might improve throughput but might also increase overhead.

alluxio.master.rpc.executor.fjp.parallelism

2 * {CPU core count}

This property is effective when alluxio.master.rpc.executor.type is set to ForkJoinPool. It controls the parallelism level (internal queue count) of master RPC ExecutorService.

alluxio.master.rpc.executor.keepalive

60sec

The keep alive time of a thread in master RPC ExecutorService last used before this thread is terminated (and replaced if necessary).

alluxio.master.rpc.executor.max.pool.size

500

The maximum number of threads allowed for master RPC ExecutorService. When the maximum is reached, attempts to replace blocked threads fail.

alluxio.master.rpc.executor.tpe.allow.core.threads.timeout

true

This property is effective when alluxio.master.rpc.executor.type is set to ThreadPoolExecutor. It controls whether core threads can timeout and terminate when there is no work.

alluxio.master.rpc.executor.tpe.queue.type

LINKED_BLOCKING_QUEUE

This property is effective when alluxio.master.rpc.executor.type is set to TPE. It specifies the internal task queue that's used by RPC ExecutorService. Supported values are: LINKED_BLOCKING_QUEUE, LINKED_BLOCKING_QUEUE_WITH_CAP, ARRAY_BLOCKING_QUEUE and SYNCHRONOUS_BLOCKING_QUEUE

alluxio.master.rpc.executor.type

TPE

Type of ExecutorService for Alluxio master gRPC server. Supported values are TPE (for ThreadPoolExecutor) and FJP (for ForkJoinPool).

alluxio.master.rpc.port

19998

The port for Alluxio master's RPC service.

alluxio.master.secure.rpc.executor.core.pool.size

30

The number of threads to keep in thread pool of master secure RPC ExecutorService.

alluxio.master.secure.rpc.executor.fjp.async

true

This property is effective when alluxio.master.secure.rpc.executor.type is set to ForkJoinPool. If true, it establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than the default locally stack-based mode in applications in which master secure threads only process event-style asynchronous tasks.

alluxio.master.secure.rpc.executor.fjp.min.runnable

1

This property is effective when alluxio.master.secure.rpc.executor.type is set to ForkJoinPool. It controls the minimum allowed number of core threads not blocked. A value of 1 ensures liveness. A larger value might improve throughput but might also increase overhead.

alluxio.master.secure.rpc.executor.fjp.parallelism

2 * {CPU core count}

This property is effective when alluxio.master.secure.rpc.executor.type is set to ForkJoinPool. It controls the parallelism level (internal queue count) of master secure RPC ExecutorService.

alluxio.master.secure.rpc.executor.keepalive

60sec

The keep alive time of a thread in master secure RPC ExecutorService last used before this thread is terminated (and replaced if necessary).

alluxio.master.secure.rpc.executor.max.pool.size

100

The maximum number of threads allowed for master secure RPC ExecutorService. When the maximum is reached, attempts to replace blocked threads fail.

alluxio.master.secure.rpc.executor.tpe.allow.core.threads.timeout

true

This property is effective when alluxio.master.secure.rpc.executor.type is set to ThreadPoolExecutor. It controls whether core threads can timeout and terminate when there is no work.

alluxio.master.secure.rpc.executor.tpe.queue.type

LINKED_BLOCKING_QUEUE_WITH_CAP

This property is effective when alluxio.master.secure.rpc.executor.type is set to TPE. It specifies the internal task queue that's used by RPC ExecutorService. Supported values are: LINKED_BLOCKING_QUEUE, LINKED_BLOCKING_QUEUE_WITH_CAP, ARRAY_BLOCKING_QUEUE and SYNCHRONOUS_BLOCKING_QUEUE

alluxio.master.secure.rpc.executor.type

TPE

Type of ExecutorService for Alluxio master secure gRPC server. Supported values are TPE (for ThreadPoolExecutor) and FJP (for ForkJoinPool).

alluxio.master.secure.rpc.port

19996

The port for Alluxio master's secure RPC service.

alluxio.master.shell.backup.state.lock.grace.mode

FORCED

Grace mode helps taking the state-lock exclusively for backup with minimum disruption to existing RPCs. This low-impact locking phase is called the grace-cycle. Two modes are supported: TIMEOUT/FORCED. TIMEOUT: exclusive locking will time out if it cannot acquire the lock within the grace-cycle. FORCED: the state-lock will be taken forcefully if the grace-cycle fails to acquire it. The forced phase might trigger interrupting of existing RPCs if that is enabled.

alluxio.master.shell.backup.state.lock.sleep.duration

0s

The duration that controls how long the lock waiter sleeps within a single grace-cycle.

alluxio.master.shell.backup.state.lock.timeout

0s

The max duration for a grace-cycle.

alluxio.master.shell.backup.state.lock.try.duration

0s

The duration that controls how long the state-lock is tried within a single grace-cycle.

alluxio.master.shimfs.auto.mount.enabled

false

If enabled, Alluxio will attempt to mount UFS for foreign URIs.

alluxio.master.shimfs.auto.mount.readonly

true

If true, UFSes are auto-mounted as read-only.

alluxio.master.shimfs.auto.mount.root

/auto-mount

Alluxio root path for auto-mounted UFSes. This directory should already exist in Alluxio.

alluxio.master.shimfs.auto.mount.shared

false

If true, UFSes are auto-mounted as shared.

alluxio.master.standby.heartbeat.interval

2min

The heartbeat interval between Alluxio primary master and standby masters.

alluxio.master.startup.block.integrity.check.enabled

false

Whether the system should be checked on startup for orphaned blocks (blocks having no corresponding files but still taking system resource due to various system failures). Orphaned blocks will be deleted during master startup if this property is true. This property is available since 1.7.1

alluxio.master.state.lock.error.threshold

20

Used to trace and debug state lock issues. When a thread recursively acquires the state lock more than this threshold, log an error for further debugging.

alluxio.master.throttle.active.cpu.load.ratio

0.5

N/A

alluxio.master.throttle.active.heap.gc.time

1sec

N/A

alluxio.master.throttle.active.heap.used.ratio

0.5

N/A

alluxio.master.throttle.active.rpc.queue.size

50000

N/A

alluxio.master.throttle.background.enabled

false

Whether to throttle the background job

alluxio.master.throttle.enabled

true

The throttle service can monitor and throttle the master in case it is overloaded.

alluxio.master.throttle.filesystem.op.per.sec

2000

The maximum number of filesystem operations that can be made per second if throttling is triggered.

alluxio.master.throttle.filesystem.rpc.queue.size.limit

1000

N/A

alluxio.master.throttle.foreground.enabled

false

Whether to throttle the foreground job

alluxio.master.throttle.heartbeat.interval

3sec

The heartbeat interval for throttling monitor check

alluxio.master.throttle.observed.pit.number

3

The number of indicator PITs used to evaluate the system status.

alluxio.master.throttle.overloaded.cpu.load.ratio

0.95

N/A

alluxio.master.throttle.overloaded.heap.gc.time

10sec

N/A

alluxio.master.throttle.overloaded.heap.used.ratio

0.9

N/A

alluxio.master.throttle.overloaded.rpc.queue.size

150000

N/A

alluxio.master.throttle.stressed.cpu.load.ratio

0.8

N/A

alluxio.master.throttle.stressed.heap.gc.time

5sec

N/A

alluxio.master.throttle.stressed.heap.used.ratio

0.8

N/A

alluxio.master.throttle.stressed.rpc.queue.size

100000

N/A

alluxio.master.tieredstore.global.level0.alias

MEM

The name of the highest storage tier in the entire system.

alluxio.master.tieredstore.global.level1.alias

SSD

The name of the second highest storage tier in the entire system.

alluxio.master.tieredstore.global.level2.alias

HDD

The name of the third highest storage tier in the entire system.

alluxio.master.tieredstore.global.levels

3

The total number of storage tiers in the system.

alluxio.master.tieredstore.global.mediumtype

MEM,SSD,HDD

The list of medium types we support in the system.
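
Put together, the global tiered storage properties above describe the default three-tier layout; the snippet simply restates those defaults.

```properties
# Default global tier layout
alluxio.master.tieredstore.global.levels=3
alluxio.master.tieredstore.global.level0.alias=MEM
alluxio.master.tieredstore.global.level1.alias=SSD
alluxio.master.tieredstore.global.level2.alias=HDD
alluxio.master.tieredstore.global.mediumtype=MEM,SSD,HDD
```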

alluxio.master.ttl.checker.interval

1hour

How often to periodically check and delete/free the files with expired ttl value.

alluxio.master.ufs.active.sync.event.rate.interval

60sec

The time interval we use to estimate incoming event rate

alluxio.master.ufs.active.sync.interval

30sec

Time interval to periodically actively sync UFS

alluxio.master.ufs.active.sync.max.activities

10

Max number of changes in a directory to be considered for active syncing

alluxio.master.ufs.active.sync.max.age

10

The maximum number of intervals we will wait to find a quiet period before we have to sync the directories

alluxio.master.ufs.active.sync.poll.batch.size

1024

The number of event batches that should be submitted together to a single thread for processing.

alluxio.master.ufs.active.sync.poll.timeout

10sec

Max time to wait before timing out a polling operation

alluxio.master.ufs.active.sync.retry.timeout

10sec

The max total duration to retry failed active sync operations. A large duration is useful to handle transient failures such as an unresponsive under storage but can lock the inode tree being synced longer.

alluxio.master.ufs.active.sync.thread.pool.size

The number of threads used by the active sync provider to process active sync events. A higher number allows the master to use more CPU to process events from an event stream in parallel. If this value is too low, Alluxio may fall behind processing events. Defaults to # of processors / 2.

Max number of threads used to perform active sync

alluxio.master.ufs.block.location.cache.capacity

1000000

The capacity of the UFS block locations cache. This cache caches UFS block locations for files that are persisted but not in Alluxio space, so that listing status of these files do not need to repeatedly ask UFS for their block locations. If this is set to 0, the cache will be disabled.

alluxio.master.ufs.journal.max.catchup.time

10min

The maximum time to wait for ufs journal catching up before listening to Zookeeper state change. This is added to prevent frequently leadership transition during heavy journal replay stage.

alluxio.master.ufs.path.cache.capacity

100000

The capacity of the UFS sync path cache. This cache is used to approximate the `ONCE` metadata load behavior (see `alluxio.user.file.metadata.load.type`). Larger caches will consume more memory, but will better approximate the `ONCE` behavior.

alluxio.master.ufs.path.cache.threads

64

The maximum size of the thread pool for asynchronously processing paths for the UFS path cache. Greater number of threads will decrease the amount of staleness in the async cache, but may impact performance. If this is set to 0, the cache will be disabled, and `alluxio.user.file.metadata.load.type=ONCE` will behave like `ALWAYS`.

alluxio.master.unsafe.direct.persist.object.enabled

true

When set to false, writing files using ASYNC_THROUGH or the persist CLI with object stores as the UFS will first create temporary objects suffixed by ".alluxio.TIMESTAMP.tmp" in the object store before being committed to the final UFS path. When set to true, files will be put to the destination path directly in the object store without staging with a temp suffix. Enabling this optimization by directly persisting files can significantly improve the efficiency of writing to object stores by making fewer data copies, since renames in object stores can be slow, but it leaves a short vulnerability window for undefined behavior if a file written using ASYNC_THROUGH is renamed or removed before the async persist operation completes while the same file path is reused for other new files in Alluxio.

alluxio.master.update.check.enabled

true

Whether to check for update availability.

alluxio.master.update.check.interval

7day

The interval to check for update availability.

alluxio.master.web.bind.host

0.0.0.0

The hostname Alluxio master web UI binds to.

alluxio.master.web.hostname

The hostname of Alluxio Master web UI.

alluxio.master.web.in.alluxio.data.page.count

1000

The number of URIs showing in the In-Alluxio Data Web UI page.

alluxio.master.web.port

19999

The port Alluxio web UI runs on.

alluxio.master.whitelist

/

A comma-separated list of prefixes of the paths which are cacheable, separated by semi-colons. Alluxio will try to cache the cacheable file when it is read for the first time.

alluxio.master.worker.connect.wait.time

5sec

Alluxio master will wait a period of time after start up for all workers to register, before it starts accepting client requests. This property determines the wait time.

alluxio.master.worker.info.cache.refresh.time

10sec

The worker information list will be refreshed after being cached for this time period. If the refresh time is too long, operations on the job servers or clients may fail because of stale worker info. If it is too short, continuously updating worker information may cause lock contention in the block master.

alluxio.master.worker.register.lease.count

25

The number of workers that can register at the same time. Others will wait and retry until they are granted a RegisterLease. If you observe pressure on the master when many workers start up and register, tune down this parameter.

alluxio.master.worker.register.lease.enabled

true

Whether workers request for leases before they register. The RegisterLease is used by the master to control the concurrency of workers that are actively registering.

alluxio.master.worker.register.lease.respect.jvm.space

true

Whether the master checks the availability on the JVM before granting a lease to a worker. If the master determines the JVM does not have enough space to accept a new worker, the RegisterLease will not be granted.

alluxio.master.worker.register.lease.ttl

1min

The TTL for a RegisterLease granted to the worker. Leases that exceed the TTL will be recycled and granted to other workers.

alluxio.master.worker.register.stream.response.timeout

10min

When the worker registers with the master using streaming, the worker will be sending messages to the master during the stream. During an active stream, if the master has not heard from the worker for more than this timeout, the worker will be considered hanging and the stream will be closed.

alluxio.master.worker.timeout

5min

Timeout between master and worker indicating a lost worker.

Worker Configuration

The worker configuration specifies information regarding the worker nodes, such as the address and the port number.

Property Name
Default
Description

alluxio.worker.allocator.class

alluxio.worker.block.allocator.MaxFreeAllocator

The strategy that a worker uses to allocate space among storage directories in certain storage layer. Valid options include: `alluxio.worker.block.allocator.MaxFreeAllocator`, `alluxio.worker.block.allocator.GreedyAllocator`, `alluxio.worker.block.allocator.RoundRobinAllocator`.

alluxio.worker.bind.host

0.0.0.0

The hostname Alluxio's worker node binds to.

alluxio.worker.block.annotator.class

alluxio.worker.block.annotator.LRUAnnotator

The strategy that a worker uses to annotate blocks in order to have an ordered view of them during internal management tasks such as eviction and promotion/demotion. Valid options include: `alluxio.worker.block.annotator.LRFUAnnotator`, `alluxio.worker.block.annotator.LRUAnnotator`.

alluxio.worker.block.annotator.lrfu.attenuation.factor

2.0

An attenuation factor in [2, INF) to control the behavior of the LRFU annotator.

alluxio.worker.block.annotator.lrfu.step.factor

0.25

A factor in [0, 1] to control the behavior of LRFU: smaller value makes LRFU more similar to LFU; and larger value makes LRFU closer to LRU.

alluxio.worker.block.heartbeat.interval

1sec

The interval between block workers' heartbeats to update block status, storage health and other workers' information to Alluxio Master.

alluxio.worker.block.heartbeat.report.size.threshold

1000000

When alluxio.worker.register.to.all.masters=true, because a worker will send block reports to all masters, we use a threshold to limit the unsent block report size in worker's memory. If the worker block heartbeat is larger than the threshold, we discard the heartbeat message and force the worker to register with that master with a full report.

alluxio.worker.block.heartbeat.timeout

${alluxio.worker.master.connect.retry.timeout}

The timeout value of block workers' heartbeats. If the worker can't connect to master before this interval expires, the worker will exit.

alluxio.worker.block.master.client.pool.size

11

The block master client pool size on the Alluxio workers.

alluxio.worker.container.hostname

The container hostname if worker is running in a container.

alluxio.worker.data.encrypted.block.chunk.size

131072

The chunk size for chunk encryption

alluxio.worker.data.encrypted.file.meta.prefix

BLOCKMETA.

The metadata file prefix for encrypted block

alluxio.worker.data.encryption.method

ENCRYPTED_BY_CHUNK_JCE

The method to encrypt data

alluxio.worker.data.encryption.openssl.library.name

The crypto library so name

alluxio.worker.data.encryption.openssl.library.path

The crypto library path

alluxio.worker.data.encryption.zone.sync.interval

60sec

The interval for the worker to sync the encryption zone information from the leading Alluxio Master.

alluxio.worker.data.folder

/alluxioworker/

A relative path within each storage directory used as the data folder for Alluxio worker to put data for tiered store.

alluxio.worker.data.folder.encrypted

ENCRYPTED

The folder name to store encrypted blocks.

alluxio.worker.data.folder.permissions

rwxrwxrwx

The permission set for the worker data folder. If short circuit is used this folder should be accessible by all users (rwxrwxrwx).

alluxio.worker.data.folder.tmp

.tmp_blocks

A relative path in alluxio.worker.data.folder used to store the temporary data for uncommitted files.

alluxio.worker.data.server.domain.socket.address

The path to the domain socket. Short-circuit reads make use of a UNIX domain socket when this is set (non-empty). This is a special path in the file system that allows the client and the AlluxioWorker to communicate. You will need to set a path to this socket. The AlluxioWorker needs to be able to create the path. If alluxio.worker.data.server.domain.socket.as.uuid is set, the path should be the home directory for the domain socket. The full path for the domain socket will be {path}/{uuid}.

alluxio.worker.data.server.domain.socket.as.uuid

false

If true, the property alluxio.worker.data.server.domain.socket.address is the path to the home directory for the domain socket and a unique identifier is used as the domain socket name. If false, the property is the absolute path to the UNIX domain socket.
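
For example, the two domain-socket properties above are typically set together on workers to enable short-circuit reads; the directory below is a placeholder that the worker must be able to create.

```properties
# Placeholder home directory for the domain socket; the socket name becomes a unique identifier
alluxio.worker.data.server.domain.socket.address=/opt/alluxio/domain
alluxio.worker.data.server.domain.socket.as.uuid=true
```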

alluxio.worker.data.tmp.subdir.max

1024

The maximum number of sub-directories allowed to be created in ${alluxio.worker.data.tmp.folder}.

alluxio.worker.data.ufs.read.rate.limit.mb

0

The maximum bandwidth for ufs reading per second. If 0 or unset then no rate limit is enforced.

alluxio.worker.evictor.class

The strategy that a worker uses to evict block files when a storage layer runs out of space. Valid options include `alluxio.worker.block.evictor.LRFUEvictor`, `alluxio.worker.block.evictor.GreedyEvictor`, `alluxio.worker.block.evictor.LRUEvictor`, `alluxio.worker.block.evictor.PartialLRUEvictor`.

alluxio.worker.free.space.timeout

10sec

The duration for which a worker will wait for eviction to make space available for a client write request.

alluxio.worker.fuse.enabled

false

If true, launch worker embedded Fuse application.

alluxio.worker.hostname

The hostname of Alluxio worker.

alluxio.worker.jvm.monitor.enabled

true

Whether to start the JVM monitor thread on the worker. This will start a thread to detect JVM-wide pauses induced by GC or other reasons.

alluxio.worker.keytab.file

Kerberos keytab file for Alluxio worker.

alluxio.worker.management.backoff.strategy

ANY

Defines the backoff scope respected by background tasks. Supported values are ANY / DIRECTORY. ANY: management tasks will back off from the worker when there is any user I/O. This mode ensures low management task overhead in order to favor immediate user I/O performance; however, making progress on management tasks will require quiet periods on the worker. DIRECTORY: management tasks will back off from directories with ongoing user I/O. This mode gives a better chance of making progress on management tasks; however, immediate user I/O throughput might be reduced due to increased management task activity.

alluxio.worker.management.block.transfer.concurrency.limit

Use {CPU core count}/2 threads for block transfer.

Puts a limit to how many block transfers are executed concurrently during management.

alluxio.worker.management.load.detection.cool.down.time

10sec

Management tasks will not run for this long after load detected. Any user I/O will still register as a load for this period of time after it is finished. Short durations might cause interference between user I/O and background tier management tasks. Long durations might cause starvation for background tasks.

alluxio.worker.management.task.thread.count

Use {CPU core count} threads for all management tasks.

The number of threads for the management task executor.

alluxio.worker.management.tier.align.enabled

true

Whether to align tiers based on access pattern.

alluxio.worker.management.tier.align.range

100

Maximum number of blocks to consider from one tier for a single alignment task.

alluxio.worker.management.tier.align.reserved.bytes

1GB

The amount of space that is reserved from each storage directory for internal management tasks.

alluxio.worker.management.tier.promote.enabled

true

Whether to promote blocks to higher tiers.

alluxio.worker.management.tier.promote.quota.percent

90

Max percentage of each tier that could be used for promotions. Promotions to a tier will be stopped once its used space goes over this value. (0 means never promote, and 100 means always promote.)

alluxio.worker.management.tier.promote.range

100

Maximum number of blocks to consider from one tier for a single promote task.

alluxio.worker.management.tier.swap.restore.enabled

true

Whether to run management swap-restore task when tier alignment cannot make progress.

alluxio.worker.master.connect.retry.timeout

1hour

Retry period before workers give up on connecting to master and exit.

alluxio.worker.master.periodical.rpc.timeout

5min

Timeout for periodical RPC between workers and the leading master. This property is added to prevent workers from hanging in periodical RPCs with previous leading master during flaky network situations. If the timeout is too short, periodical RPCs may not have enough time to get response from the leading master during heavy cluster load and high network latency.

alluxio.worker.network.async.cache.manager.queue.max

512

The maximum number of outstanding async caching requests to cache blocks in each data server

alluxio.worker.network.async.cache.manager.threads.max

2 * {CPU core count}

The maximum number of threads used to cache blocks asynchronously in the data server.

alluxio.worker.network.block.reader.threads.max

2048

The maximum number of threads used to read blocks in the data server.

alluxio.worker.network.block.writer.threads.max

1024

The maximum number of threads used to write blocks in the data server.

alluxio.worker.network.flowcontrol.window

2MB

The HTTP2 flow control window used by worker gRPC connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.worker.network.keepalive.time

30sec

The amount of time for data server (for block reads and block writes) to wait for a response before pinging the client to see if it is still alive.

alluxio.worker.network.keepalive.timeout

30sec

The maximum time for a data server (for block reads and block writes) to wait for a keepalive response before closing the connection.

alluxio.worker.network.max.inbound.message.size

4MB

The max inbound message size used by worker gRPC connections.

alluxio.worker.network.netty.boss.threads

1

How many threads to use for accepting new requests.

alluxio.worker.network.netty.channel

EPOLL

Netty channel type: NIO or EPOLL. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.worker.network.netty.shutdown.quiet.period

2sec

The quiet period. When the netty server is shutting down, it will ensure that no RPCs occur during the quiet period. If an RPC occurs, then the quiet period will restart before shutting down the netty server.

alluxio.worker.network.netty.watermark.high

32KB

Determines how many bytes can be in the write queue before switching to non-writable.

alluxio.worker.network.netty.watermark.low

8KB

Once the high watermark limit is reached, the queue must be flushed down to the low watermark before switching back to writable.

alluxio.worker.network.netty.worker.threads

2 * {CPU core count}

Number of threads to use for processing requests in worker

alluxio.worker.network.permit.keepalive.time

30s

Specify the most aggressive keep-alive time clients are permitted to configure. The server will try to detect clients exceeding this rate and when detected will forcefully close the connection.

alluxio.worker.network.reader.buffer.pooled

true

Whether to use a pooled direct buffer or an unpooled wrapped buffer when creating a buffer for a remote read.

alluxio.worker.network.reader.buffer.size

4MB

When a client reads from a remote worker, the maximum amount of data not received by client allowed before the worker pauses sending more data. If this value is lower than read chunk size, read performance may be impacted as worker waits more often for buffer to free up. Higher value will increase the memory consumed by each read request.

alluxio.worker.network.reader.max.chunk.size.bytes

2MB

When a client reads from a remote worker, the maximum chunk size.

alluxio.worker.network.shutdown.timeout

15sec

Maximum amount of time to wait until the worker gRPC server is shutdown (regardless of the quiet period).

alluxio.worker.network.writer.buffer.size.messages

8

When a client writes to a remote worker, the maximum number of data messages to buffer by the server for each request.

alluxio.worker.network.zerocopy.enabled

true

Whether zero copy is enabled on worker when processing data streams.

alluxio.worker.principal

Kerberos principal for Alluxio worker.

alluxio.worker.ramdisk.size

2/3 of total system memory, or 1GB if system memory size cannot be determined

The allocated memory for each worker node's ramdisk(s). It is recommended to set this value explicitly.

alluxio.worker.register.lease.enabled

${alluxio.master.worker.register.lease.enabled}

Whether the worker requests a lease from the master before registering. This should be consistent with alluxio.master.worker.register.lease.enabled.

alluxio.worker.register.lease.retry.max.duration

${alluxio.worker.master.connect.retry.timeout}

The total time on retrying to get a register lease, before giving up.

alluxio.worker.register.lease.retry.sleep.max

10sec

The maximum time to sleep before retrying to get a register lease.

alluxio.worker.register.lease.retry.sleep.min

1sec

The minimum time to sleep before retrying to get a register lease.

alluxio.worker.register.stream.batch.size

1000000

When the worker registers with the master using a stream, this defines how many blocks' metadata should be sent to the master in each batch.

alluxio.worker.register.stream.complete.timeout

5min

When the worker registers with the master using a stream, after all messages have been sent to the master, the worker will wait for the registration to complete on the master side. If the master is unable to finish the registration and return success to the worker within this timeout, the worker will consider the registration failed.

alluxio.worker.register.stream.deadline

15min

When the worker registers with the master using a stream, this defines the total deadline for the full stream to finish.

alluxio.worker.register.stream.enabled

true

When the worker registers with the master, whether the request should be broken into a stream of smaller batches. This is useful when the worker's storage is large and we expect a large number of blocks.

alluxio.worker.register.stream.response.timeout

${alluxio.master.worker.register.stream.response.timeout}

When the worker registers with the master using a stream, the worker will be sending messages to the master during the streaming. During an active stream, if the master has not responded to the worker for more than this timeout, the worker will consider the master to be hanging and close the stream.

alluxio.worker.register.to.all.masters

false

If enabled, workers will register themselves to all masters, instead of primary master only. This can be used to save the master failover time because the new primary immediately knows all existing workers and blocks. Can only be enabled when alluxio.standby.master.grpc.enabled is turned on.

alluxio.worker.remote.io.slow.threshold

10s

The time threshold for when a worker remote IO (read or write) of a single buffer is considered slow. When slow IO occurs, it is logged by a sampling logger.

alluxio.worker.reviewer.class

alluxio.worker.block.reviewer.ProbabilisticBufferReviewer

(Experimental) The API is subject to change in the future. The strategy that a worker uses to review space allocation in the Allocator. Each time a block allocation decision is made by the Allocator, the Reviewer reviews the decision and rejects it if the allocation does not meet certain criteria of the Reviewer. The Reviewer prevents the worker from making a bad block allocation decision. Valid options include: `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`.

alluxio.worker.reviewer.probabilistic.hardlimit.bytes

64MB

This is used by the `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`. When the free space in a storage dir falls below this hard limit, the ProbabilisticBufferReviewer will stop accepting new blocks into it. This is because we may load more data into existing blocks in the directory and their sizes may expand.

alluxio.worker.reviewer.probabilistic.softlimit.bytes

256MB

This is used by the `alluxio.worker.block.reviewer.ProbabilisticBufferReviewer`. We attempt to leave a buffer in each storage directory. When the free space in a certain storage directory on the worker falls below this soft limit, the chance that the Reviewer accepts new blocks into this directory goes down. This chance keeps falling linearly until it reaches 0, when the available space reaches the hard limit.

alluxio.worker.rpc.executor.core.pool.size

100

The number of threads to keep in thread pool of worker RPC ExecutorService.

alluxio.worker.rpc.executor.fjp.async

true

This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. If true, it establishes local first-in-first-out scheduling mode for forked tasks that are never joined. This mode may be more appropriate than the default locally stack-based mode in applications in which worker threads only process event-style asynchronous tasks.

alluxio.worker.rpc.executor.fjp.min.runnable

1

This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. It controls the minimum allowed number of core threads not blocked. A value of 1 ensures liveness. A larger value might improve throughput but might also increase overhead.

alluxio.worker.rpc.executor.fjp.parallelism

2 * {CPU core count}

This property is effective when alluxio.worker.rpc.executor.type is set to ForkJoinPool. It controls the parallelism level (internal queue count) of the worker RPC ExecutorService.

alluxio.worker.rpc.executor.keepalive

60sec

The keep-alive time of a thread in the worker RPC ExecutorService, i.e., how long the thread may remain idle after it was last used before it is terminated (and replaced if necessary).

alluxio.worker.rpc.executor.max.pool.size

1000

The maximum number of threads allowed for worker RPC ExecutorService. When the maximum is reached, attempts to replace blocked threads fail.

alluxio.worker.rpc.executor.tpe.allow.core.threads.timeout

true

This property is effective when alluxio.worker.rpc.executor.type is set to ThreadPoolExecutor. It controls whether core threads can timeout and terminate when there is no work.

alluxio.worker.rpc.executor.tpe.queue.type

LINKED_BLOCKING_QUEUE_WITH_CAP

This property is effective when alluxio.worker.rpc.executor.type is set to TPE. It specifies the internal task queue that's used by RPC ExecutorService. Supported values are: LINKED_BLOCKING_QUEUE, LINKED_BLOCKING_QUEUE_WITH_CAP, ARRAY_BLOCKING_QUEUE and SYNCHRONOUS_BLOCKING_QUEUE

alluxio.worker.rpc.executor.type

TPE

Type of ExecutorService for Alluxio worker gRPC server. Supported values are TPE (for ThreadPoolExecutor) and FJP (for ForkJoinPool).
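
For example, switching the worker RPC executor from the default TPE to a ForkJoinPool and tuning it might look like the sketch below; the parallelism value is an illustrative assumption, not a recommendation.

```properties
# Hypothetical tuning: use a ForkJoinPool for the worker gRPC server.
alluxio.worker.rpc.executor.type=FJP
alluxio.worker.rpc.executor.fjp.parallelism=64
alluxio.worker.rpc.executor.fjp.async=true
```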

alluxio.worker.rpc.port

29999

The port for Alluxio worker's RPC service.

alluxio.worker.secure.rpc.bind.host

0.0.0.0

N/A

alluxio.worker.secure.rpc.hostname

N/A

alluxio.worker.secure.rpc.port

29997

N/A

alluxio.worker.session.timeout

1min

Timeout between worker and client connection indicating a lost session connection.

alluxio.worker.startup.timeout

10min

Maximum time to wait for worker startup.

alluxio.worker.storage.checker.enabled

true

Whether periodic storage health checker is enabled on Alluxio workers.

alluxio.worker.tieredstore.block.lock.readers

1000

The max number of concurrent readers for a block lock.

alluxio.worker.tieredstore.block.locks

1000

Total number of block locks for an Alluxio block worker. Larger value leads to finer locking granularity, but uses more space.

alluxio.worker.tieredstore.free.ahead.bytes

0

Amount to free ahead when worker storage is full. Higher values will help decrease CPU utilization under peak storage. Lower values will increase storage utilization.

alluxio.worker.tieredstore.level0.alias

MEM

The alias of the top storage tier on this worker. It must match one of the global storage tiers from the master configuration. We disable placing an alias lower in the global hierarchy before an alias with a higher position on the worker hierarchy. So by default, SSD cannot come before MEM on any worker.

alluxio.worker.tieredstore.level0.dirs.mediumtype

${alluxio.worker.tieredstore.level0.alias}

A comma-separated list of media types (e.g., "MEM,MEM,SSD") for each storage directory on the top storage tier specified by alluxio.worker.tieredstore.level0.dirs.path.

alluxio.worker.tieredstore.level0.dirs.path

/mnt/ramdisk on Linux, /Volumes/ramdisk on OSX

A comma-separated list of paths (e.g., /mnt/ramdisk1,/mnt/ramdisk2,/mnt/ssd/alluxio/cache1) of storage directories for the top storage tier. Note that for MacOS, the root directory should be `/Volumes/` and not `/mnt/`.

alluxio.worker.tieredstore.level0.dirs.quota

${alluxio.worker.ramdisk.size}

A comma-separated list of capacities (e.g., "500MB,500MB,5GB") for each storage directory on the top storage tier specified by alluxio.worker.tieredstore.level0.dirs.path. For any "MEM"-type media (i.e., the ramdisks), this value should be set equivalent to the value specified by alluxio.worker.ramdisk.size.

alluxio.worker.tieredstore.level0.watermark.high.ratio

0.95

The high watermark of the space in the top storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.level0.watermark.low.ratio

0.7

The low watermark of the space in the top storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.level1.alias

The alias of the second storage tier on this worker.

alluxio.worker.tieredstore.level1.dirs.mediumtype

${alluxio.worker.tieredstore.level1.alias}

A list of media types (e.g., "SSD,SSD,HDD") for each storage directory on the second storage tier specified by alluxio.worker.tieredstore.level1.dirs.path.

alluxio.worker.tieredstore.level1.dirs.path

A comma-separated list of paths (e.g., /mnt/ssd/alluxio/cache2,/mnt/ssd/alluxio/cache3,/mnt/hdd/alluxio/cache1) of storage directories for the second storage tier.

alluxio.worker.tieredstore.level1.dirs.quota

A comma-separated list of capacities (e.g., "5GB,5GB,50GB") for each storage directory on the second storage tier specified by alluxio.worker.tieredstore.level1.dirs.path.

alluxio.worker.tieredstore.level1.watermark.high.ratio

0.95

The high watermark of the space in the second storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.level1.watermark.low.ratio

0.7

The low watermark of the space in the second storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.level2.alias

The alias of the third storage tier on this worker.

alluxio.worker.tieredstore.level2.dirs.mediumtype

${alluxio.worker.tieredstore.level2.alias}

A list of media types (e.g., "SSD,HDD,HDD") for each storage directory on the third storage tier specified by alluxio.worker.tieredstore.level2.dirs.path.

alluxio.worker.tieredstore.level2.dirs.path

A comma-separated list of paths (e.g., /mnt/ssd/alluxio/cache4,/mnt/hdd/alluxio/cache2,/mnt/hdd/alluxio/cache3) of storage directories for the third storage tier.

alluxio.worker.tieredstore.level2.dirs.quota

A comma-separated list of capacities (e.g., "5GB,50GB,50GB") for each storage directory on the third storage tier specified by alluxio.worker.tieredstore.level2.dirs.path.

alluxio.worker.tieredstore.level2.watermark.high.ratio

0.95

The high watermark of the space in the third storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.level2.watermark.low.ratio

0.7

The low watermark of the space in the third storage tier (a value between 0 and 1).

alluxio.worker.tieredstore.levels

1

The number of storage tiers on the worker.
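
Putting the tiered-store properties above together, a minimal two-tier worker layout might look like the sketch below; the aliases come from this reference, while the paths and quotas are illustrative assumptions rather than shipped defaults.

```properties
# Hypothetical two-tier setup: MEM on top, SSD below.
alluxio.worker.tieredstore.levels=2
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=16GB
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/mnt/ssd/alluxio/cache1,/mnt/ssd/alluxio/cache2
alluxio.worker.tieredstore.level1.dirs.quota=100GB,100GB
```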

alluxio.worker.ufs.instream.cache.enabled

true

Enable caching for seekable under storage input streams, so that subsequent seek operations on the same file will reuse the cached input stream. This will improve positioned read performance, as the open operations of some under file systems can be expensive. The cached input stream can become stale when the UFS file is modified without notifying Alluxio.

alluxio.worker.ufs.instream.cache.expiration.time

5min

Cached UFS instream expiration time.

alluxio.worker.ufs.instream.cache.max.size

5000

The max entries in the UFS instream cache.

alluxio.worker.web.bind.host

0.0.0.0

The hostname Alluxio worker's web server binds to.

alluxio.worker.web.hostname

The hostname Alluxio worker's web UI binds to.

alluxio.worker.web.port

30000

The port Alluxio worker's web UI runs on.

alluxio.worker.whitelist

/

A comma-separated list of prefixes of the paths which are cacheable. Alluxio will try to cache a cacheable file when it is read for the first time.

## User Configuration

The user configuration specifies values regarding file system access.

Property Name
Default
Description

alluxio.user.app.id

The custom id to use for labeling this client's info, such as metrics. If unset, a random long will be used. This value is displayed in the client logs on initialization. Note that using the same app id will cause client info to be aggregated, so different applications must set their own ids or leave this value unset to use a randomly generated id.

alluxio.user.block.avoid.eviction.policy.reserved.size.bytes

0MB

The portion of space reserved in a worker when using the LocalFirstAvoidEvictionPolicy class as block location policy.

alluxio.user.block.master.client.pool.gc.interval

120sec

The interval at which block master client GC checks occur.

alluxio.user.block.master.client.pool.gc.threshold

120sec

A block master client is closed if it has been idle for more than this threshold.

alluxio.user.block.master.client.pool.size.max

500

The maximum number of block master clients cached in the block master client pool.

alluxio.user.block.master.client.pool.size.min

0

The minimum number of block master clients cached in the block master client pool. For long running processes, this should be set to zero.

alluxio.user.block.read.metrics.enabled

false

Whether detailed block read metrics will be recorded and sent to the metrics sinks.

alluxio.user.block.read.retry.max.duration

5min

This duration controls for how long Alluxio clients should try reading a single block. If a particular block can't be read within this duration, then the I/O will time out.

alluxio.user.block.read.retry.sleep.base

250ms

N/A

alluxio.user.block.read.retry.sleep.max

2sec

N/A

alluxio.user.block.size.bytes.default

64MB

Default block size for Alluxio files.

alluxio.user.block.worker.client.pool.gc.threshold

300sec

A block worker client is closed if it has been idle for more than this threshold.

alluxio.user.block.worker.client.pool.max

1024

The maximum number of block worker clients cached in the block worker client pool.

alluxio.user.block.write.location.policy.class

alluxio.client.block.policy.LocalFirstPolicy

The default location policy for choosing workers for writing a file's blocks.

alluxio.user.client.cache.async.restore.enabled

true

If this is enabled, the cache state is restored asynchronously.

alluxio.user.client.cache.async.write.enabled

false

If this is enabled, data is cached asynchronously.

alluxio.user.client.cache.async.write.threads

16

Number of threads to asynchronously cache data.

alluxio.user.client.cache.dirs

/tmp/alluxio_cache

A list of the directories where client-side cache is stored.

alluxio.user.client.cache.enabled

false

If this is enabled, data will be cached on Alluxio client.

alluxio.user.client.cache.eviction.retries

10

Max number of eviction retries.

alluxio.user.client.cache.evictor.class

alluxio.client.file.cache.evictor.LRUCacheEvictor

The strategy that client uses to evict local cached pages when running out of space. Currently valid options include `alluxio.client.file.cache.evictor.LRUCacheEvictor`,`alluxio.client.file.cache.evictor.LFUCacheEvictor`.

alluxio.user.client.cache.evictor.lfu.logbase

2.0

The log base for client cache LFU evictor bucket index.

alluxio.user.client.cache.evictor.nondeterministic.enabled

false

If this is enabled, the evictor picks uniformly from the worst k elements. Currently only LRU is supported.

alluxio.user.client.cache.filter.class

alluxio.client.file.cache.filter.DefaultCacheFilter

The default cache filter caches everything

alluxio.user.client.cache.filter.config-file

${alluxio.conf.dir}/cache_filter.properties

The alluxio cache filter config file

alluxio.user.client.cache.instream_buffer_size

0B

Size of the reading buffer for tiny read.

alluxio.user.client.cache.local.store.file.buckets

1000

The number of file buckets for the local page store of the client-side cache. It is recommended to set this to a high value if the number of unique files is expected to be high (# files / file buckets <= 100,000).

alluxio.user.client.cache.page.size

1MB

Size of each page in client-side cache.

alluxio.user.client.cache.quota.enabled

false

Whether to support cache quota.

alluxio.user.client.cache.shadow.bloomfilter.num

4

The number of bloom filters used for tracking. Each tracks a segment of the window.

alluxio.user.client.cache.shadow.cuckoo.clock.bits

6

The number of bits of each item's clock field.

alluxio.user.client.cache.shadow.cuckoo.scope.bits

8

The number of bits of each item's scope field.

alluxio.user.client.cache.shadow.cuckoo.size.bits

20

The number of bits of each item's size field.

alluxio.user.client.cache.shadow.cuckoo.size.encoder.enabled

false

The flag to enable the size encoder for cuckoo filter.

alluxio.user.client.cache.shadow.cuckoo.size.prefix.bits

8

The prefix bits length of the size field.

alluxio.user.client.cache.shadow.cuckoo.size.suffix.bits

12

The suffix bits length of the size field.

alluxio.user.client.cache.shadow.enabled

false

If this is enabled, a shadow cache will be created to track the working set of a past time window and measure the hit ratio assuming the working set fits in the cache.

alluxio.user.client.cache.shadow.memory.overhead

125MB

The total memory overhead for bloom filters used for tracking

alluxio.user.client.cache.shadow.type

CLOCK_CUCKOO_FILTER

The type of shadow cache to be used. Valid options are `MULTIPLE_BLOOM_FILTER` (which uses a chain of bloom filters), `CLOCK_CUCKOO_FILTER` (which uses cuckoo filter with extended field).

alluxio.user.client.cache.shadow.window

24h

The past time window for the shadow cache to track the working set; it is in units of seconds.

alluxio.user.client.cache.size

512MB

A list of maximum cache sizes, one for each cache directory.

alluxio.user.client.cache.store.overhead

A fraction value representing the storage overhead of writing to disk. For example, with 1GB of allocated cache space and 10% storage overhead, we expect to store no more than 1024MB / (1 + 10%) of user data.

alluxio.user.client.cache.store.type

LOCAL

The type of page store to use for client-side cache. Can be either `LOCAL` or `ROCKS`. The `LOCAL` page store stores all pages in a directory, the `ROCKS` page store utilizes rocksDB to persist the data.
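
As a sketch of how the client-side cache properties above fit together, the snippet below enables a local page store; the directory and sizes are assumptions chosen for illustration, not defaults.

```properties
# Hypothetical client-side (local) cache configuration.
alluxio.user.client.cache.enabled=true
alluxio.user.client.cache.store.type=LOCAL
alluxio.user.client.cache.dirs=/mnt/nvme/alluxio_cache
alluxio.user.client.cache.size=100GB
alluxio.user.client.cache.page.size=1MB
```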

alluxio.user.client.cache.timeout.duration

-1

The timeout duration for local cache I/O operations (reading/writing/deleting). When this property is a positive value, local cache operations that time out will fail and fall back to the external file system, transparently to applications; when this property is a negative value, this feature is disabled.

alluxio.user.client.cache.timeout.threads

32

The number of threads to handle cache I/O operation timeout, when alluxio.user.client.cache.timeout.duration is positive.

alluxio.user.client.cache.ttl.check.interval.seconds

3600

TTL check interval time in seconds.

alluxio.user.client.cache.ttl.enabled

false

Whether to support cache TTL (time to live).

alluxio.user.client.cache.ttl.threshold.seconds

10800

TTL threshold time in seconds.

alluxio.user.client.report.version.enabled

false

Whether the client reports version information to the server.

alluxio.user.conf.cluster.default.enabled

true

When this property is true, an Alluxio client will load the default values of cluster-wide configuration and path-specific configuration set by Alluxio master.

alluxio.user.conf.sync.interval

1min

The time period of client master heartbeat to update the configuration if necessary from meta master.

alluxio.user.date.format.pattern

MM-dd-yyyy HH:mm:ss:SSS

Display formatted date in cli command and web UI by given date format pattern.

alluxio.user.file.buffer.bytes

8MB

The size of the file buffer to use for file system reads/writes.

alluxio.user.file.copyfromlocal.block.location.policy.class

alluxio.client.block.policy.RoundRobinPolicy

The default location policy for choosing workers for writing a file's blocks using copyFromLocal command.

alluxio.user.file.create.ttl

-1

Time to live for files created by a user, no ttl by default.

alluxio.user.file.create.ttl.action

FREE

The action to perform on a file when its TTL expires. Options: FREE (default), DELETE_ALLUXIO, or DELETE.

alluxio.user.file.delete.unchecked

false

Whether to check if the UFS contents are in sync with Alluxio before attempting to delete persisted directories recursively.

alluxio.user.file.direct.access

A regular expression defining Alluxio paths that are not read- or write-cached and always fetch the latest listing from the UFS.

alluxio.user.file.include.operation.id

true

Whether to send a unique operation id with designated filesystem operations.

alluxio.user.file.master.client.pool.gc.interval

120sec

The interval at which file system master client GC checks occur.

alluxio.user.file.master.client.pool.gc.threshold

120sec

A fs master client is closed if it has been idle for more than this threshold.

alluxio.user.file.master.client.pool.size.max

500

The maximum number of fs master clients cached in the fs master client pool.

alluxio.user.file.master.client.pool.size.min

0

The minimum number of fs master clients cached in the fs master client pool. For long running processes, this should be set to zero.

alluxio.user.file.metadata.load.type

ONCE

The behavior of loading metadata from UFS. When information about a path is requested and the path does not exist in Alluxio, metadata can be loaded from the UFS. Valid options are `ALWAYS`, `NEVER`, and `ONCE`. `ALWAYS` will always access UFS to see if the path exists in the UFS. `NEVER` will never consult the UFS. `ONCE` will access the UFS the "first" time (according to a cache), but not after that. This parameter is ignored if a metadata sync is performed, via the parameter "alluxio.user.file.metadata.sync.interval"

alluxio.user.file.metadata.sync.interval

-1

The interval for syncing UFS metadata before invoking an operation on a path. -1 means no sync will occur. 0 means Alluxio will always sync the metadata of the path before an operation. If you specify a time interval, Alluxio will (best effort) not re-sync a path within that time interval. Syncing the metadata for a path must interact with the UFS, so it is an expensive operation. If a sync is performed for an operation, the configuration of "alluxio.user.file.metadata.load.type" will be ignored.

alluxio.user.file.metadata.sync.limit

1000000

A user operation may trigger a metadata sync on the Alluxio master, and this property sets a limit on that metadata sync operation if there is one. This property is used to prevent users from accidentally triggering very large metadata sync operations on the master, which may lead to too much overhead on the Alluxio master. If the metadata sync workload triggered by an operation exceeds this limit, the metadata sync will be aborted and the user (client) will receive an error. When a metadata sync is aborted in this way, the partial results will be thrown away. As a user, if you see this error, please reduce the granularity of the metadata sync by loading/listing only the specific files or sub-directories; in other words, try to avoid Alluxio ls or loadMetadata commands with the -R option. A user can override this limit manually by setting this client-side property for an operation, or by specifying -n in ls/loadMetadata commands, but that is not recommended. Please use it sparingly! Note that this limit is not precise in execution. Because of the concurrency and prefetching logic in metadata sync, metadata for more paths may have been fetched from the UFS by the time the metadata sync is stopped. The difference depends on concurrency, the speed of the UFS, the directory structure, etc. This limit defaults to 1 million paths, and -1 means no limit.

alluxio.user.file.passive.cache.enabled

true

Whether to cache files to local Alluxio workers when the files are read from remote workers (not UFS).

alluxio.user.file.persist.on.rename

false

Whether or not to asynchronously persist any files which have been renamed. This is helpful when working with compute frameworks which use rename to commit results.

alluxio.user.file.persistence.initial.wait.time

0

Time to wait before starting the persistence job. When the value is set to -1, the file will be persisted by rename operation or persist CLI but will not be automatically persisted in other cases. This value should be smaller than the value of alluxio.master.persistence.max.total.wait.time

alluxio.user.file.readtype.default

CACHE

Default read type when creating Alluxio files. Valid options are `CACHE_PROMOTE` (move data to highest tier if already in Alluxio storage, write data into highest tier of local Alluxio if data needs to be read from under storage), `CACHE` (write data into highest tier of local Alluxio if data needs to be read from under storage), `NO_CACHE` (no data interaction with Alluxio, if the read is from Alluxio data migration or eviction will not occur).

alluxio.user.file.replication.max

-1

The target max replication level of a file in Alluxio space. Setting this property to a negative value means no upper limit.

alluxio.user.file.replication.min

0

The target min replication level of a file in Alluxio space.

alluxio.user.file.reserved.bytes

${alluxio.user.block.size.bytes.default}

The size to reserve on workers for file system writes. Using a smaller value will improve concurrency for writes smaller than the block size.

alluxio.user.file.sequential.pread.threshold

2MB

An upper bound on the client buffer size for positioned read to hint at the sequential nature of reads. For reads with a buffer size greater than this threshold, the read op is treated to be sequential and the worker may handle the read differently. For instance, cold reads from the HDFS ufs may use a different HDFS client API.

alluxio.user.file.target.media

Preferred media type while storing file's blocks.

alluxio.user.file.waitcompleted.poll

1sec

The time interval to poll a file for its completion status when using waitCompleted.

alluxio.user.file.write.init.max.duration

2min

Controls how long to retry initialization of a file write, when Alluxio workers are required but not ready.

alluxio.user.file.write.init.sleep.max

5sec

N/A

alluxio.user.file.write.init.sleep.min

1sec

N/A

alluxio.user.file.write.tier.default

0

The default tier for choosing where to write a block. Valid option is any integer. Non-negative values identify tiers starting from top going down (0 identifies the first tier, 1 identifies the second tier, and so on). If the provided value is greater than the number of tiers, it identifies the last tier. Negative values identify tiers starting from the bottom going up (-1 identifies the last tier, -2 identifies the second to last tier, and so on). If the absolute value of the provided value is greater than the number of tiers, it identifies the first tier.

alluxio.user.file.writetype.default

MUST_CACHE

Default write type when creating Alluxio files. Valid options are `MUST_CACHE` (write will only go to Alluxio and must be stored in Alluxio), `CACHE_THROUGH` (try to cache, write to UnderFS synchronously), `THROUGH` (no cache, write to UnderFS synchronously)
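
For example, a job that must persist its output to the UFS while still caching reads could set these defaults as in the sketch below; whether this suits a given workload is a judgment call, not a recommendation from this reference.

```properties
# Hypothetical defaults for a workload that persists writes synchronously to the UFS.
alluxio.user.file.readtype.default=CACHE
alluxio.user.file.writetype.default=CACHE_THROUGH
```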

alluxio.user.hdfs.client.exclude.mount.info.on.list.status

false

If enabled, the mount info will be excluded from the response when an HDFS client calls Alluxio to list status on a directory.

alluxio.user.heartbeat.interval

1sec

The interval between Alluxio workers' heartbeats.

alluxio.user.hostname

The hostname to use for an Alluxio client.

alluxio.user.local.reader.chunk.size.bytes

8MB

When a client reads from a local worker, the maximum data chunk size.

alluxio.user.local.writer.chunk.size.bytes

64KB

When a client writes to a local worker, the maximum data chunk size.

alluxio.user.logging.threshold

10s

Logging a client RPC when it takes more time than the threshold.

alluxio.user.master.polling.timeout

30sec

The maximum time for a rpc client to wait for master to respond.

alluxio.user.metadata.cache.enabled

false

If this is enabled, metadata of paths will be cached. The cached metadata will be evicted when it expires after alluxio.user.metadata.cache.expiration.time or the cache size is over the limit of alluxio.user.metadata.cache.max.size.

alluxio.user.metadata.cache.expiration.time

10min

Metadata will expire and be evicted after being cached for this time period. Only valid if alluxio.user.metadata.cache.enabled is set to true.

alluxio.user.metadata.cache.max.size

100000

Maximum number of paths with cached metadata. Only valid if alluxio.user.metadata.cache.enabled is set to true.
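
A minimal sketch combining the three metadata-cache properties above; the expiration and size values simply restate the defaults listed in this reference and can be tuned per workload.

```properties
# Hypothetical client-side metadata cache settings.
alluxio.user.metadata.cache.enabled=true
alluxio.user.metadata.cache.expiration.time=10min
alluxio.user.metadata.cache.max.size=100000
```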

alluxio.user.metrics.collection.enabled

false

Enable collecting client-side metrics and heartbeating them to the master.

alluxio.user.metrics.heartbeat.interval

10sec

The time period of client master heartbeat to send the client-side metrics.

alluxio.user.network.data.timeout

The maximum time for an Alluxio client to wait for a data response (e.g. block reads and block writes) from Alluxio worker.

alluxio.user.network.flowcontrol.window

The HTTP2 flow control window used by user gRPC connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.user.network.keepalive.time

The amount of time for a gRPC client (for block reads and block writes) to wait for a response before pinging the server to see if it is still alive.

alluxio.user.network.keepalive.timeout

The maximum time for a gRPC client (for block reads and block writes) to wait for a keepalive response before closing the connection.

alluxio.user.network.max.inbound.message.size

The max inbound message size used by user gRPC connections.

alluxio.user.network.netty.channel

Type of netty channels. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.user.network.netty.worker.threads

How many threads to use for remote block worker client to read from remote block workers.

alluxio.user.network.reader.buffer.size.messages

When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.

alluxio.user.network.reader.chunk.size.bytes

When a client reads from a remote worker, the maximum chunk size.

alluxio.user.network.rpc.flowcontrol.window

2MB

The HTTP2 flow control window used by user rpc connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.user.network.rpc.keepalive.time

30sec

The amount of time for a rpc client to wait for a response before pinging the server to see if it is still alive.

alluxio.user.network.rpc.keepalive.timeout

30sec

The maximum time for a rpc client to wait for a keepalive response before closing the connection.

alluxio.user.network.rpc.max.connections

1

The maximum number of physical connections to be used per target host.

alluxio.user.network.rpc.max.inbound.message.size

100MB

The max inbound message size used by user rpc connections.

alluxio.user.network.rpc.netty.channel

EPOLL

Type of netty channels used by rpc connections. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.user.network.rpc.netty.worker.threads

0

How many threads to use for rpc client to read from remote workers.

alluxio.user.network.secret.flowcontrol.window

2MB

The HTTP2 flow control window used by secret-key connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.user.network.secret.keepalive.time

9223372036854775807

The amount of time for a secret-key client to wait for a response before pinging the server to see if it is still alive.

alluxio.user.network.secret.keepalive.timeout

30sec

The maximum time for a secret-key client to wait for a keepalive response before closing the connection.

alluxio.user.network.secret.max.connections

1

The maximum number of physical connections to be used per target host.

alluxio.user.network.secret.max.inbound.message.size

100MB

The max inbound message size used by secret-key connections.

alluxio.user.network.secret.netty.channel

EPOLL

Type of netty channels used by secret-key connections. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.user.network.secret.netty.worker.threads

0

How many threads to use for secret-key client to read from remote workers.

alluxio.user.network.streaming.flowcontrol.window

2MB

The HTTP2 flow control window used by user streaming connections. Larger value will allow more data to be buffered but will use more memory.

alluxio.user.network.streaming.keepalive.time

9223372036854775807

The amount of time for a streaming client to wait for a response before pinging the server to see if it is still alive.

alluxio.user.network.streaming.keepalive.timeout

30sec

The maximum time for a streaming client to wait for a keepalive response before closing the connection.

alluxio.user.network.streaming.max.connections

64

The maximum number of physical connections to be used per target host.

alluxio.user.network.streaming.max.inbound.message.size

100MB

The max inbound message size used by user streaming connections.

alluxio.user.network.streaming.netty.channel

EPOLL

Type of netty channels used by streaming connections. If EPOLL is not available, this will automatically fall back to NIO.

alluxio.user.network.streaming.netty.worker.threads

0

How many threads to use for streaming client to read from remote workers.

alluxio.user.network.writer.buffer.size.messages

When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.

alluxio.user.network.writer.chunk.size.bytes

When a client writes to a remote worker, the maximum chunk size.

alluxio.user.network.writer.close.timeout

The timeout to close a writer client.

alluxio.user.network.writer.flush.timeout

The timeout to wait for flush to finish in a data writer.

alluxio.user.network.zerocopy.enabled

Whether zero copy is enabled on client when processing data streams.

alluxio.user.rpc.retry.base.sleep

50ms

Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the base time in the exponential backoff.

alluxio.user.rpc.retry.max.duration

2min

Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum duration to retry for before giving up. Note that, this value is set to 5s for fs and fsadmin CLIs.

alluxio.user.rpc.retry.max.sleep

3sec

Alluxio client RPCs automatically retry for transient errors with an exponential backoff. This property determines the maximum wait time in the backoff.
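
As an illustration of tuning the exponential backoff described above, the values below are assumptions for a high-latency environment, not recommended settings.

```properties
# Hypothetical retry tuning: slower base backoff, longer overall retry window.
alluxio.user.rpc.retry.base.sleep=100ms
alluxio.user.rpc.retry.max.sleep=5sec
alluxio.user.rpc.retry.max.duration=5min
```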

alluxio.user.rpc.shuffle.masters.enabled

false

Shuffle the client-side configured master rpc addresses.

alluxio.user.shimfs.allow.list

A comma-separated list of paths to translate with transparent URI. When this is set, it overrides alluxio.user.shimfs.bypass.prefix.list.

alluxio.user.shimfs.bypass.prefix.list

A comma-separated list of prefix paths to by-pass. User classpath should contain a native hadoop FileSystem implementation for target scheme. For example: "alluxio.user.shimfs.bypass.prefix.list=s3://bucket1/foo, s3://bucket1/bar"

alluxio.user.shimfs.shim.hdfs.stream.enabled

false

Whether to enable shim for HDFS input streams. Some compute applications rely on the existence of some methods of HDFS's concrete implementation class, which are not available on the generic parent class. The shim fills in the gap by providing stub implementations for those methods.

alluxio.user.short.circuit.enabled

true

If set to true, short-circuit read/write is enabled, which allows clients to read/write data without going through Alluxio workers when the data is local.

alluxio.user.short.circuit.preferred

false

When both short circuit and domain socket are enabled, prefer to use short circuit.

alluxio.user.streaming.data.read.timeout

3m

The maximum time for an Alluxio client to wait for a data response for read requests from Alluxio worker. Keep in mind that some streaming operations may take an unexpectedly long time, such as UFS io. In order to handle occasional slow operations, it is recommended for this parameter to be set to a large value, to avoid spurious timeouts.

alluxio.user.streaming.data.write.timeout

3m

The maximum time for an Alluxio client to wait for when writing 1 chunk for block writes to an Alluxio worker. This value can be tuned to offset instability from the UFS.

alluxio.user.streaming.reader.buffer.size.messages

16

When a client reads from a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.

alluxio.user.streaming.reader.chunk.size.bytes

1MB

When a client reads from a remote worker, the maximum chunk size.

alluxio.user.streaming.reader.close.timeout

5s

The timeout to close a grpc streaming reader client. If too long, it may add delays to closing clients. If too short, the client will complete the close() before the server confirms the close()

alluxio.user.streaming.writer.buffer.size.messages

16

When a client writes to a remote worker, the maximum number of messages to buffer by the client. A message can be either a command response, a data chunk, or a gRPC stream event such as complete or error.

alluxio.user.streaming.writer.chunk.size.bytes

1MB

When a client writes to a remote worker, the maximum chunk size.

alluxio.user.streaming.writer.close.timeout

30min

The timeout to close a writer client.

alluxio.user.streaming.writer.flush.timeout

30min

The timeout to wait for flush to finish in a data writer.

alluxio.user.streaming.zerocopy.enabled

true

Whether zero copy is enabled on client when processing data streams.

alluxio.user.ufs.block.location.all.fallback.enabled

true

Whether to return all workers as block locations if UFS block locations are not co-located with any Alluxio workers or are empty.

alluxio.user.ufs.block.read.concurrency.max

2147483647

The maximum concurrent readers for one UFS block on one Block Worker.

alluxio.user.ufs.block.read.location.policy

alluxio.client.block.policy.LocalFirstPolicy

When an Alluxio client reads a file from the UFS, it delegates the read to an Alluxio worker. The client uses this policy to choose which worker to read through. Built-in choices: [<a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/CapacityBasedDeterministicHashPolicy.html">alluxio.client.block.policy.CapacityBasedDeterministicHashPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/CapacityBaseRandomPolicy.html">alluxio.client.block.policy.CapacityBaseRandomPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/DeterministicHashPolicy.html">alluxio.client.block.policy.DeterministicHashPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/LocalFirstAvoidEvictionPolicy.html">alluxio.client.block.policy.LocalFirstAvoidEvictionPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/LocalFirstPolicy.html">alluxio.client.block.policy.LocalFirstPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/MostAvailableFirstPolicy.html">alluxio.client.block.policy.MostAvailableFirstPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/RoundRobinPolicy.html">alluxio.client.block.policy.RoundRobinPolicy</a>, <a href="https://docs.alluxio.io/os/javadoc/edge/alluxio/client/block/policy/SpecificHostPolicy.html">alluxio.client.block.policy.SpecificHostPolicy</a>].

alluxio.user.ufs.block.read.location.policy.cache.expiration.time

10min

Deprecated - When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.CapacityBaseRandomPolicy, this specifies cache expire time of block location.

alluxio.user.ufs.block.read.location.policy.cache.size

10000

Deprecated - When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.CapacityBaseRandomPolicy, this specifies cache size of block location.

alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards

1

When alluxio.user.ufs.block.read.location.policy is set to alluxio.client.block.policy.DeterministicHashPolicy or alluxio.client.block.policy.CapacityBasedDeterministicHashPolicy, this specifies the number of hash shards.
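
For example, to spread cold reads of the same UFS block across several workers, the policy and shard count could be set as in this sketch; the shard count of 4 is an assumption for illustration.

```properties
# Hypothetical: hash each UFS block to 4 candidate workers for cold reads.
alluxio.user.ufs.block.read.location.policy=alluxio.client.block.policy.DeterministicHashPolicy
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards=4
```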

alluxio.user.worker.list.refresh.interval

2min

The interval used to refresh the live worker list on the client

## Resource Manager Configuration

When running Alluxio with resource managers like Mesos and YARN, Alluxio has additional configuration options.

Property Name
Default
Description

alluxio.integration.master.resource.cpu

1

The number of CPUs to run an Alluxio master for YARN framework.

alluxio.integration.master.resource.mem

1024MB

The amount of memory to run an Alluxio master for YARN framework.

alluxio.integration.worker.resource.cpu

1

The number of CPUs to run an Alluxio worker for YARN framework.

alluxio.integration.worker.resource.mem

1024MB

The amount of memory to run an Alluxio worker for YARN framework.

alluxio.integration.yarn.workers.per.host.max

1

The number of workers to run on an Alluxio host for YARN framework.

## Security Configuration

The security configuration specifies information regarding the security features, such as authentication and file permission. Settings for authentication take effect for master, worker, and user. Settings for file permission only take effect for master. See Security for more information about security features.

Property Name
Default
Description

alluxio.security.authentication.custom.provider.class

The class to provide customized authentication implementation, when alluxio.security.authentication.type is set to CUSTOM. It must implement the interface 'alluxio.security.authentication.AuthenticationProvider'.

alluxio.security.authentication.delegation.token.key.lifetime.ms

1d

Lifetime of a delegation token secret key.

alluxio.security.authentication.delegation.token.lifetime.ms

7d

Maximum lifetime of a delegation token.

alluxio.security.authentication.delegation.token.renew.interval.ms

1d

Time before which a delegation token must be renewed.

alluxio.security.authentication.delegation.token.server.uri.match

The server-side service name used to match the URI from the client; * matches any.

alluxio.security.authentication.delegation.token.use.ip.service.name

true

Whether to use master IP address as the service name of delegation tokens. If set to false, the hostname of master will be used instead. This is set to true by default to avoid service name mismatch due to hostname differences. It should be set to true if the master IP can change over time.

alluxio.security.authentication.type

SIMPLE

The authentication mode. Currently three modes are supported: NOSASL, SIMPLE, CUSTOM. The default value SIMPLE indicates that a simple authentication is enabled. Server trusts whoever the client claims to be.
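
A minimal sketch of enabling CUSTOM authentication; the provider class name com.example.MyAuthenticationProvider is a hypothetical placeholder that must implement alluxio.security.authentication.AuthenticationProvider.

```properties
# Hypothetical custom authentication setup.
alluxio.security.authentication.type=CUSTOM
alluxio.security.authentication.custom.provider.class=com.example.MyAuthenticationProvider
```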

alluxio.security.authorization.capability.enabled

false

N/A

alluxio.security.authorization.capability.lifetime.ms

3600000

N/A

alluxio.security.authorization.capability.threadpool.size

1

The size of the thread pool used to distribute keys.

alluxio.security.authorization.default.deny

false

If the permission is not explicitly determined by the external authorization service, deny the access.

alluxio.security.authorization.permission.enabled

true

Whether to enable access control based on file permission.

alluxio.security.authorization.permission.supergroup

supergroup

The super group of Alluxio file system. All users in this group have super permission.

alluxio.security.authorization.permission.umask

022

The umask used when creating files and directories. The initial creation permission is 777, and the difference between directory and file is 111. So for the default umask value 022, a created directory has permission 755 and a file has permission 644.
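
For instance, applying the same arithmetic with a stricter umask (an illustrative value, not the default):

```properties
# Hypothetical stricter umask: directories get 777 & ~027 = 750,
# files get 666 & ~027 = 640 (files never receive execute bits).
alluxio.security.authorization.permission.umask=027
```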

alluxio.security.authorization.plugin.name

Plugin for master authorization.

alluxio.security.authorization.plugin.opa.address

localhost

OPA daemon address

alluxio.security.authorization.plugin.opa.cache.acl.file.level

true

Whether to check ACLs at the file level; otherwise they are checked at the directory level.

alluxio.security.authorization.plugin.opa.cache.capacity

10000

The cache capacity, in number of records.

alluxio.security.authorization.plugin.opa.cache.duration

60s

The time in seconds to cache one OPA decision.

alluxio.security.authorization.plugin.opa.policy.path

/

The policy path and name

alluxio.security.authorization.plugin.opa.port

8181

OPA daemon port

alluxio.security.authorization.plugin.opa.retry.times.on.network.error

3

The number of retries on a network error.

alluxio.security.authorization.plugin.opa.ssl.ca.path

/

If SSL is enabled, ca file path is expected.

alluxio.security.authorization.plugin.opa.ssl.cert.path

/

If SSL is enabled, cert file path is expected.

alluxio.security.authorization.plugin.opa.ssl.enabled

false

Whether SSL connection is enabled in OPA daemon

alluxio.security.authorization.plugin.opa.ssl.key.path

/

If SSL is enabled, key file path is expected.

alluxio.security.authorization.plugin.paths

Classpath for master authorization plugin, separated by colons.

alluxio.security.authorization.plugins.enabled

false

Enable plugins for authorization.

alluxio.security.authorization.token.encryption.key.lifetime.ms

86400000

This decides the lifetime of an encryption key that's been shared between masters and workers for encrypting and decrypting security tokens. It is expected for workers to be provided by masters with new encryption keys before this duration finishes.

alluxio.security.group.mapping.cache.timeout

1min

Time for cached group mapping to expire.

alluxio.security.group.mapping.class

alluxio.security.group.provider.ShellBasedUnixGroupsMapping

The class to provide the user-to-groups mapping service. The master can get the various group memberships of a given user. It must implement the interface 'alluxio.security.group.GroupMappingService'. The default implementation executes the 'groups' shell command to fetch the group memberships of a given user.

alluxio.security.group.mapping.ldap.attr.group.name

cn

N/A

alluxio.security.group.mapping.ldap.attr.member

member

N/A

alluxio.security.group.mapping.ldap.base

N/A

alluxio.security.group.mapping.ldap.bind.password

N/A

alluxio.security.group.mapping.ldap.bind.password.file

N/A

alluxio.security.group.mapping.ldap.bind.user

N/A

alluxio.security.group.mapping.ldap.search.filter.group

(objectClass=group)

N/A

alluxio.security.group.mapping.ldap.search.filter.user

(&(objectClass=user)(sAMAccountName={0}))

N/A

alluxio.security.group.mapping.ldap.search.timeout

10000

N/A

alluxio.security.group.mapping.ldap.ssl

false

N/A

alluxio.security.group.mapping.ldap.ssl.keystore

N/A

alluxio.security.group.mapping.ldap.ssl.keystore.password

N/A

alluxio.security.group.mapping.ldap.ssl.keystore.password.file

N/A

alluxio.security.group.mapping.ldap.url

N/A

alluxio.security.kerberos.auth.to.local

DEFAULT

N/A

alluxio.security.kerberos.client.keytab.file

N/A

alluxio.security.kerberos.client.principal

N/A

alluxio.security.kerberos.client.use.ticket.cache

true

Whether the client forces to disable the kerberos ticket cache

alluxio.security.kerberos.min.seconds.before.relogin

1min

N/A

alluxio.security.kerberos.server.keytab.file

N/A

alluxio.security.kerberos.server.principal

N/A

alluxio.security.kerberos.unified.instance.name

N/A

alluxio.security.login.impersonation.username

_HDFS_USER_

When alluxio.security.authentication.type is set to SIMPLE or CUSTOM, user application uses this property to indicate the IMPERSONATED user requesting Alluxio service. If it is not set explicitly, or set to _NONE_, impersonation will not be used. A special value of '_HDFS_USER_' can be specified to impersonate the hadoop client user.

alluxio.security.login.username

When alluxio.security.authentication.type is set to SIMPLE or CUSTOM, user application uses this property to indicate the user requesting Alluxio service. If it is not set explicitly, the OS login user will be used.
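
A hedged sketch of setting a client identity with HDFS-user impersonation under SIMPLE authentication; the username data-pipeline is a placeholder assumption.

```properties
# Hypothetical client identity that impersonates the Hadoop client user.
alluxio.security.authentication.type=SIMPLE
alluxio.security.login.username=data-pipeline
alluxio.security.login.impersonation.username=_HDFS_USER_
```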

alluxio.security.privileges.enabled

false

N/A

alluxio.security.secret.store.hashicorp.vault.address

Sets the address (URL) of the Vault server instance to which API calls should be sent.

alluxio.security.secret.store.hashicorp.vault.auth.incremental.time

12h

The incremental leasing time of auth credential, -1 means use default incremental lease time in vault server.

alluxio.security.secret.store.hashicorp.vault.auth.renew.threshold

30m

The renew threshold of the auth credential; the Vault secret provider will renew the auth credential automatically if the TTL of the secret is shorter than this threshold.

alluxio.security.secret.store.hashicorp.vault.authentication

TOKEN

The authentication method of the secret store. Currently supported auth method: TOKEN.

alluxio.security.secret.store.hashicorp.vault.ca.cert

Sets the CA cert.

alluxio.security.secret.store.hashicorp.vault.cache.enabled

false

Enable the cache of vault secrets.

alluxio.security.secret.store.hashicorp.vault.client.cert

Sets the client cert.

alluxio.security.secret.store.hashicorp.vault.client.key

Sets the client key. This key file has to be in PKCS#8 format.

alluxio.security.secret.store.hashicorp.vault.client.retry

5

The number of times the Vault client will retry when a failure occurs.

alluxio.security.secret.store.hashicorp.vault.client.retry.interval

1s

The number of ms that the vault client will wait in between retries.

alluxio.security.secret.store.hashicorp.vault.enabled

false

Enable the Vault secret provider for Secret Store.

alluxio.security.secret.store.hashicorp.vault.kv.prefix

secret/alluxio

The HashiCorp KV store path prefix. If the same Vault is shared between multiple Alluxio clusters, each cluster must use a different prefix.

alluxio.security.secret.store.hashicorp.vault.open.timeout

5s

The time to wait before giving up on establishing an HTTP(S) connection to the Vault server.

alluxio.security.secret.store.hashicorp.vault.read.timeout

30s

After an HTTP(S) connection has already been established, this is the time to wait for all data to finish downloading.

alluxio.security.secret.store.hashicorp.vault.secret.incremental.time

12h

The incremental leasing time of vault secret.

alluxio.security.secret.store.hashicorp.vault.secret.renew.threshold

30m

The renew threshold of a Vault secret; the Vault secret provider will renew secrets automatically if the TTL of a secret is shorter than this threshold.

alluxio.security.secret.store.hashicorp.vault.tls.verify.cert

true

Whether to verify the certificate of the vault server against CA.

alluxio.security.secret.store.hashicorp.vault.token

The token used to authenticate with Vault.
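
Putting the Vault properties above together, a sketch of a token-authenticated secret store might look like the following; the address and token are placeholders, not real values.

```properties
# Hypothetical Vault-backed secret store configuration.
alluxio.security.secret.store.hashicorp.vault.enabled=true
alluxio.security.secret.store.hashicorp.vault.address=https://vault.example.com:8200
alluxio.security.secret.store.hashicorp.vault.authentication=TOKEN
alluxio.security.secret.store.hashicorp.vault.token=<vault-token>
```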

alluxio.security.secret.store.renew.interval

-1

Periodical auto-renew check time for secrets and auth credential, -1 means periodical auto-renew is not enabled.

alluxio.security.stale.channel.purge.interval

3day

Interval for which client channels that have been inactive will be regarded as unauthenticated. Such channels will reauthenticate with their target master upon being used for new RPCs.

alluxio.security.tier.keystore.hashicorp.vault.enabled

false

This flag indicates whether tiered storage encryption uses HashiCorp Vault as the key store.

alluxio.security.tier.keystore.hashicorp.vault.version

2

The hashicorp vault key store version.

alluxio.security.tier.storage.encryption.cipher.bit.length

256

The key length in bits.

alluxio.security.tier.storage.encryption.cipher.type

AES/GCM/NoPadding

The cipher type; the default one is AES.

alluxio.security.tier.storage.encryption.cryptsoft.kmip.connect.pool.size

5

The KMIP connection pool size.

alluxio.security.tier.storage.encryption.cryptsoft.kmip.connector.config.file

${alluxio.conf.dir}/kmip.properties

The configuration file for the Cryptsoft KMIP connector.

alluxio.security.tier.storage.encryption.enabled

false

This flag indicates whether the data stored in Worker is encrypted.

alluxio.security.tier.storage.encryption.keystore.type

KEY_VALUE_STORE_SYSTEM

The keystore type for tier storage encryption.

alluxio.security.tier.storage.encryption.kmip.connector.vendor

Cryptsoft

The vendor of kmip connector

alluxio.security.tier.storage.encryption.kmip.custom.attribute

Custom attribute for locating the KMIP keys created for one cluster. It should be set to a unique value per cluster; Alluxio will locate the keys and check whether each key is used, and keys that are not used will be removed.

alluxio.security.tier.storage.encryption.kmip.key.sync.interval

24h

The interval for key sync between the Encryption Master and the KMS.

alluxio.security.tier.storage.encryption.zone.max.key.number

10

The maximum number of keys per encryption zone.

alluxio.security.underfs.hdfs.impersonation.enabled

true

N/A

alluxio.security.underfs.hdfs.kerberos.client.keytab.file

N/A

alluxio.security.underfs.hdfs.kerberos.client.principal

N/A

alluxio.security.underfs.hdfs.kerberos.login.per.instance

false

Whether to login per HDFS underfs instance

alluxio.security.underfs.mount.temporary.credential.enabled

false

This flag indicates whether the current mount uses the temporary session model. If it does, its workers will depend on the temporary session token to communicate with the backend server.

alluxio.security.underfs.temporary.credential.enabled

false

This flag indicates whether the temporary session token is enabled. If it is enabled, the master will distribute encryption keys for the temporary session token.

alluxio.security.user.information.executor.min.thread.number

1

The number of threads used to renew the UserGroupInformation.

alluxio.security.user.information.executor.schedule.interval

1min

The scheduling interval for UserGroupInformation renewal.
