Configuration Settings
An Alluxio cluster can be configured by setting the values of Alluxio configuration properties within ${ALLUXIO_HOME}/conf/alluxio-site.properties.
The two major components to configure are:
Alluxio servers, consisting of masters and workers
Alluxio clients, which are typically a part of compute applications.
Configure Applications
Customizing how an application interacts with Alluxio is specific to each application. The following are recommendations for some common applications.
Alluxio properties mostly fall into three categories:
properties prefixed with alluxio.user affect Alluxio client operations (e.g. compute applications)
properties prefixed with alluxio.master affect the Alluxio master processes
properties prefixed with alluxio.worker affect the Alluxio worker processes
Alluxio Shell Commands
Alluxio shell users can put JVM system properties -Dproperty=value after the fs command and before the subcommand to specify Alluxio user properties from the command line. For example, the following Alluxio shell command sets the write type to CACHE_THROUGH when copying files to Alluxio:
$ ./bin/alluxio fs -Dalluxio.user.file.writetype.default=CACHE_THROUGH \
copyFromLocal README.md /README.md
Note that, as a part of an Alluxio deployment, the Alluxio shell also takes the configuration in ${ALLUXIO_HOME}/conf/alluxio-site.properties when it is run from the Alluxio installation at ${ALLUXIO_HOME}.
Spark
To customize Alluxio client-side properties in Spark applications, Spark users can pass Alluxio properties as JVM system properties. See examples for configuring the Spark service or for individual Spark jobs.
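As a rough sketch of passing an Alluxio property as a JVM system property to a single Spark job, the snippet below assembles a spark-submit invocation using Spark's spark.driver.extraJavaOptions and spark.executor.extraJavaOptions settings. The application class com.example.MyApp and the JAR my-app.jar are hypothetical placeholders, and the command is only printed for review rather than executed.

```shell
#!/bin/sh
# Alluxio client property expressed as a JVM system property.
ALLUXIO_OPTS="-Dalluxio.user.file.writetype.default=CACHE_THROUGH"

# Assemble the command as a single string.
# com.example.MyApp and my-app.jar are hypothetical placeholders.
SPARK_CMD="spark-submit \
--conf spark.driver.extraJavaOptions=${ALLUXIO_OPTS} \
--conf spark.executor.extraJavaOptions=${ALLUXIO_OPTS} \
--class com.example.MyApp my-app.jar"

# Print the command for review; this does not launch a Spark job.
echo "${SPARK_CMD}"
```

Setting the option for both the driver and the executors matters because either side may open Alluxio clients.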
Hadoop MapReduce
See examples to configure Alluxio properties for the MapReduce service or for individual MapReduce jobs.
Hive
Hive can be configured to use customized Alluxio client-side properties for the entire service. See examples.
Presto
Presto can be configured to use customized Alluxio client-side properties for the entire service. See examples.
CDH (Enterprise-only)
CDH can be configured to use customized Alluxio client-side properties for the entire service. See examples.
HDP (Enterprise-only)
HDP can be configured to use customized Alluxio client-side properties for the entire service. See examples.
Configure an Alluxio Cluster
alluxio-site.properties Files (Recommended)
Alluxio admins can create and edit the properties file alluxio-site.properties to configure Alluxio masters or workers. If this file does not exist, it can be created from the template file under ${ALLUXIO_HOME}/conf:
$ cp conf/alluxio-site.properties.template conf/alluxio-site.properties
Make sure that this file is distributed to ${ALLUXIO_HOME}/conf on every Alluxio master and worker before starting the cluster. Any update to the server configuration requires a restart of the process.
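For illustration, a minimal properties file might look like the sketch below. It writes the file into a temporary directory so that it does not touch a real installation. The property names alluxio.master.hostname and alluxio.master.mount.table.root.ufs are assumed here to be the property-file counterparts of the ALLUXIO_MASTER_HOSTNAME and ALLUXIO_MASTER_MOUNT_TABLE_ROOT_UFS environment variables; check them against the property list for your Alluxio version.

```shell
#!/bin/sh
# Use a temporary directory as a stand-in for ${ALLUXIO_HOME}/conf.
CONF_DIR="$(mktemp -d)"
SITE_FILE="${CONF_DIR}/alluxio-site.properties"

# Write a minimal set of key=value properties (names are assumptions, see above).
cat > "${SITE_FILE}" <<'EOF'
# Hostname of the Alluxio master
alluxio.master.hostname=localhost
# Address of the root under storage system
alluxio.master.mount.table.root.ufs=hdfs://localhost:9000
EOF

# Show the resulting file.
cat "${SITE_FILE}"
```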
Environment Variables
Alluxio supports defining a few frequently used configuration settings through environment variables, including:
ALLUXIO_CONF_DIR
The path to the Alluxio configuration directory.
ALLUXIO_LOGS_DIR
The path to the directory that stores Alluxio server logs.
ALLUXIO_USER_LOGS_DIR
The path to the directory that stores Alluxio user logs.
ALLUXIO_MASTER_HOSTNAME
The hostname of the Alluxio master. Defaults to localhost.
ALLUXIO_MASTER_MOUNT_TABLE_ROOT_UFS
The under storage system address. Defaults to ${ALLUXIO_HOME}/underFSStorage, which is a local file system.
ALLUXIO_RAM_FOLDER
The directory where a worker stores its in-memory data. Defaults to /mnt/ramdisk.
ALLUXIO_JAVA_OPTS
Java VM options for the Alluxio master, worker, and shell commands. Note that, by default, ALLUXIO_JAVA_OPTS is included in ALLUXIO_MASTER_JAVA_OPTS, ALLUXIO_WORKER_JAVA_OPTS, and ALLUXIO_USER_JAVA_OPTS.
ALLUXIO_MASTER_JAVA_OPTS
Additional Java VM options for Alluxio master configuration.
ALLUXIO_WORKER_JAVA_OPTS
Additional Java VM options for Alluxio worker configuration.
ALLUXIO_USER_JAVA_OPTS
Additional Java VM options for Alluxio shell command configuration.
ALLUXIO_CLASSPATH
Additional classpath entries for Alluxio processes. Empty by default.
ALLUXIO_LOGSERVER_HOSTNAME
Hostname of the log server. Empty by default.
ALLUXIO_LOGSERVER_PORT
The port number of the log server. Defaults to 45600.
ALLUXIO_LOGSERVER_LOGS_DIR
The path to the local directory where the Alluxio log server stores logs received from the Alluxio servers.
For example, to set up the following:
an Alluxio master at localhost
the root mount point as an HDFS cluster with a namenode also running at localhost
Java remote debugging enabled at port 7001
run the following commands before starting the master process:
$ export ALLUXIO_MASTER_HOSTNAME="localhost"
$ export ALLUXIO_MASTER_MOUNT_TABLE_ROOT_UFS="hdfs://localhost:9000"
$ export ALLUXIO_MASTER_JAVA_OPTS=\
"$ALLUXIO_JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=7001"
Users can either set these variables through the shell or in conf/alluxio-env.sh. If this file does not exist yet, create one by copying the template:
$ cp conf/alluxio-env.sh.template conf/alluxio-env.sh
Cluster Defaults
Since version 1.8, each Alluxio client or worker can initialize its configuration with the cluster-wide configuration values retrieved from Alluxio masters.
When different client applications (Alluxio Shell CLI, Spark jobs, MapReduce jobs) or Alluxio workers connect to an Alluxio master, they will initialize their own Alluxio configuration properties with the default values supplied by the masters based on the master-side ${ALLUXIO_HOME}/conf/alluxio-site.properties files. As a result, cluster admins can set default client-side settings (e.g., alluxio.user.*), or network transport settings (e.g., alluxio.security.authentication.type) in ${ALLUXIO_HOME}/conf/alluxio-site.properties on all the masters, which will be distributed and become cluster-wide default values when clients and workers connect.
For example, the property alluxio.user.file.writetype.default defaults to MUST_CACHE, which only writes to Alluxio space.
In an Alluxio cluster where data persistence is preferred and all jobs need to write to both the UFS and Alluxio, the administrator can add alluxio.user.file.writetype.default=CACHE_THROUGH to each master's alluxio-site.properties file. After restarting the cluster, all jobs will automatically set alluxio.user.file.writetype.default to CACHE_THROUGH.
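The edit to each master's properties file could be scripted along the lines of the sketch below. The set_property helper is hypothetical (it is not part of Alluxio) and assumes plain key=value lines without spaces around the equals sign. It removes any existing assignment of the key before appending the new one, so running it repeatedly leaves a single entry. The demo operates on a temporary file rather than a real installation.

```shell
#!/bin/sh
# Hypothetical helper: set KEY to VALUE in a properties FILE,
# replacing an existing "KEY=..." line if one is present.
set_property() {
  file="$1"; key="$2"; value="$3"
  touch "${file}"
  # Keep every line whose text before the first "=" is not the key.
  awk -F= -v k="${key}" '$1 != k' "${file}" > "${file}.tmp"
  # Append the new assignment and swap the file into place.
  printf '%s=%s\n' "${key}" "${value}" >> "${file}.tmp"
  mv "${file}.tmp" "${file}"
}

# Demo on a temporary stand-in for a master's alluxio-site.properties.
SITE_FILE="$(mktemp -d)/alluxio-site.properties"
printf 'alluxio.user.file.writetype.default=MUST_CACHE\n' > "${SITE_FILE}"
set_property "${SITE_FILE}" alluxio.user.file.writetype.default CACHE_THROUGH
cat "${SITE_FILE}"
```

The same call would need to be repeated on every master, followed by the cluster restart described above.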
Clients can still ignore or override the cluster-wide default values by setting the same properties using the approaches described in Configure Applications.
Note that, before version 1.8, the ${ALLUXIO_HOME}/conf/alluxio-site.properties file was only loaded by Alluxio server processes and was ignored by applications interacting with the Alluxio service through the Alluxio client, unless ${ALLUXIO_HOME}/conf was on the application's classpath.
Path Defaults
Since version 2.0, Alluxio administrators can set default client-side configurations for specific Alluxio filesystem paths. Filesystem client operations have options which are derived from client-side configuration properties. Only client-side configuration properties can be set as path defaults.
For example, the createFile operation has an option to specify the write type. By default, the write type is the value of the configuration key alluxio.user.file.writetype.default. The administrator can set the default value of alluxio.user.file.writetype.default to MUST_CACHE for all paths with prefix /tmp by running:
$ bin/alluxio fsadmin pathConf add --property alluxio.user.file.writetype.default=MUST_CACHE /tmp
After executing this command, any create operations on paths with prefix /tmp will use the MUST_CACHE write type by default unless the application configuration overrides the cluster defaults.
Path defaults are automatically propagated to long-running clients when they are updated. If the administrator updates path defaults using
$ bin/alluxio fsadmin pathConf add --property alluxio.user.file.writetype.default=THROUGH /tmp
afterwards, all write operations that occur on a path with the prefix /tmp will use the THROUGH write type by default.
See fsadmin pathConf for how to show, add, update, and remove path defaults.
Configuration Sources
Alluxio properties can be configured from multiple sources. A property's final value is determined by the following priority list, from highest priority to lowest:
Property files: When an Alluxio cluster starts, each server process, including master and worker, searches for alluxio-site.properties within the following directories in the given order, stopping when a match is found: ${CLASSPATH}, ${HOME}/.alluxio/, /etc/alluxio/, and ${ALLUXIO_HOME}/conf.
Cluster default values: An Alluxio client may initialize its configuration based on the cluster-wide default configuration served by the masters.
If no user-specified configuration is found for a property, Alluxio will fall back to its default property value.
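The first-match behavior of the property-file search in the list above can be illustrated with the sketch below. It is not Alluxio's implementation: find_site_properties is a hypothetical helper, the classpath lookup is left out, and temporary directories stand in for ${HOME}/.alluxio/, /etc/alluxio/, and ${ALLUXIO_HOME}/conf.

```shell
#!/bin/sh
# Hypothetical helper: print the first alluxio-site.properties found
# in the given directories, checked in order, then stop searching.
find_site_properties() {
  for dir in "$@"; do
    if [ -f "${dir}/alluxio-site.properties" ]; then
      printf '%s\n' "${dir}/alluxio-site.properties"
      return 0
    fi
  done
  return 1  # no property file in any candidate directory
}

# Demo: temporary stand-ins for the three directories.
ROOT="$(mktemp -d)"
mkdir -p "${ROOT}/home/.alluxio" "${ROOT}/etc/alluxio" "${ROOT}/alluxio/conf"
# Only the second and third locations contain a property file.
touch "${ROOT}/etc/alluxio/alluxio-site.properties"
touch "${ROOT}/alluxio/conf/alluxio-site.properties"

FOUND="$(find_site_properties \
  "${ROOT}/home/.alluxio" "${ROOT}/etc/alluxio" "${ROOT}/alluxio/conf")"
echo "${FOUND}"
```

In the demo, the file under the etc/alluxio stand-in is reported, and the one under alluxio/conf is never consulted because the search stops at the first match.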
To check the value of a specific configuration property and the source of its value, users can run the following command:
$ ./bin/alluxio getConf alluxio.worker.rpc.port
29998
$ ./bin/alluxio getConf --source alluxio.worker.rpc.port
DEFAULT
To list all of the configuration properties with sources:
$ ./bin/alluxio getConf --source
alluxio.conf.dir=/Users/bob/alluxio/conf (SYSTEM_PROPERTY)
alluxio.debug=false (DEFAULT)
...
Users can also specify the --master option to list all of the cluster-wide configuration properties served by the masters. Note that with the --master option, getConf queries the master, which requires the master process to be running. Without the --master option, this command only checks the local configuration.
$ ./bin/alluxio getConf --master --source
alluxio.conf.dir=/Users/bob/alluxio/conf (SYSTEM_PROPERTY)
alluxio.debug=false (DEFAULT)
...
Java 11 Configuration
Alluxio now supports Java 11. To run Alluxio on Java 11, set the JAVA_HOME environment variable to point to a Java 11 installation directory. If you only want to use Java 11 for Alluxio, you can set JAVA_HOME in the alluxio-env.sh file. Setting JAVA_HOME in alluxio-env.sh does not affect the Java version used by other applications running in the same environment.
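A fragment for conf/alluxio-env.sh might look like the following sketch. The installation path is a hypothetical example and depends on where Java 11 is installed on the host.

```shell
# Fragment for conf/alluxio-env.sh.
# Hypothetical path; replace it with the actual Java 11 installation directory.
JAVA_HOME="/usr/lib/jvm/java-11-openjdk"
export JAVA_HOME
```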
Server Configuration Checker
The server-side configuration checker helps discover configuration errors and warnings. Suspected configuration errors are reported through the web UI, the fsadmin doctor CLI, and master logs.
The web UI shows the result of the server configuration check.

Users can also run the fsadmin doctor command to get the same results.
$ ./bin/alluxio fsadmin doctor configuration
Configuration warnings can also be found in the master logs.
