User CLI

Alluxio's command line interface provides user access to various operations, such as:

  • Start or stop processes

  • Filesystem operations

  • Administrative commands

Invoke the executable to view the possible subcommands:

$ ./bin/alluxio
Usage:
  bin/alluxio [command]

Available Commands:
  conf        Get, set, and validate configuration settings, primarily those defined in conf/alluxio-site.properties
  exec        Run the main method of an Alluxio class, or end-to-end tests on an Alluxio cluster.
  fs          Operations to interface with the Alluxio filesystem
  help        Help about any command
  info        Retrieve and/or display info about the running Alluxio cluster
  job         Command line tool for interacting with the job service.
  journal     Journal related operations
  license     Check and manage license status
  mount       Operations to manage mount points
  process     Start/stop cluster processes or remove workers

Flags:
      --debug-log               True to enable debug logging
      --skip-license-warnings   Skip the automatic warnings before expiration

Use "bin/alluxio [command] --help" for more information about a command.

To set JVM system properties as part of the command, set the -D flag in the form of -Dproperty=value.

To attach the debugging Java options specified by $ALLUXIO_USER_ATTACH_OPTS, set the --attach-debug flag.

Note that when the Alluxio shell is run from an Alluxio installation at ${ALLUXIO_HOME}, it also picks up the configuration in ${ALLUXIO_HOME}/conf/alluxio-site.properties.

conf

Get, set, and validate configuration settings, primarily those defined in conf/alluxio-site.properties

conf get

Usage: bin/alluxio conf get [key] [flags]

The get command prints the configured value for the given key. If the key is invalid, it returns a nonzero exit code. If the key is valid but isn't set, an empty string is printed. If no key is specified, the full configuration is printed.

Note: This command does not require the Alluxio cluster to be running.

Flags:

  • --master: Show configuration properties used by the master (Default: false)

  • --source: Show source of the configuration property instead of the value (Default: false)

  • --unit: Unit of the value to return, converted to correspond to the given unit. E.g., with "--unit KB", a configuration value of "4096B" will return 4. Possible options include B, KB, MB, GB, TB, PB, MS, S, M, H, D (Default: "")

Examples:
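A few typical invocations, as a sketch. The property keys alluxio.master.hostname and alluxio.user.block.size.bytes.default are assumed here for illustration and may not exist in every release. The snippet defaults to a dry run that only prints each command.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Print the full configuration of the current node
$ALLUXIO conf get
# Print the value of a single key (key name is an assumption)
$ALLUXIO conf get alluxio.master.hostname
# Print where that value comes from instead of the value itself
$ALLUXIO conf get alluxio.master.hostname --source
# Print a size-valued key converted to kilobytes (key name is an assumption)
$ALLUXIO conf get alluxio.user.block.size.bytes.default --unit KB
```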

exec

Run the main method of an Alluxio class, or end-to-end tests on an Alluxio cluster.

exec edgeTest

Usage: bin/alluxio exec edgeTest [flags]

Test whether the edge runs successfully.

Flags:

  • --no-cluster: Only interact with the edge and the UFS; do not access the Dora cluster. (Default: false)

  • --path: (Required) The path can be:

  1. An Alluxio path like 'alluxio:///data', only when the '--no-cluster' flag is not set.

  2. A path without a scheme, like '/' or '/s3'. This is syntactic sugar for Alluxio paths.

  3. A UFS path with a scheme, like 's3://bucket/data'. The path must exist in one of the mount points in the Alluxio namespace.

exec ufsIOTest

Usage: bin/alluxio exec ufsIOTest [flags]

A benchmarking tool for the I/O between Alluxio and UFS. This test will measure the I/O throughput between Alluxio workers and the specified UFS path. Each worker will create concurrent clients to first generate test files of the specified size then read those files. The write/read I/O throughput will be measured in the process.

Flags:

  • --io-size: specifies the amount of data each thread writes/reads. (Default: "")

  • --java-opt: The Java options to add to the command line for the task. This can be repeated. The options must be quoted and prefixed with a space. For example: --java-opt " -Xmx4g" --java-opt " -Xms2g". (Default: [])

  • --path: (Required) specifies the path to write/read temporary data in.

  • --threads: specifies the number of threads to concurrently use on each worker. (Default: 4)

Examples:
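A possible invocation, sketched with a placeholder UFS path (s3://bucket/io-test) and an assumed size format (512m). Since --io-size applies per thread, 2 threads at 512 MB each would mean about 1 GB written and then read back per worker. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# 2 threads per worker, each writing and then reading 512 MB under the given path
$ALLUXIO exec ufsIOTest --path s3://bucket/io-test --io-size 512m --threads 2
# Same test with a larger heap; note the quoted option with a leading space
$ALLUXIO exec ufsIOTest --path s3://bucket/io-test --io-size 512m --threads 2 --java-opt " -Xmx4g"
```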

exec ufsTest

Usage: bin/alluxio exec ufsTest [flags]

Test the integration between Alluxio and the given UFS to validate UFS semantics

Flags:

  • --path: (Required) the full UFS path to run tests against.

  • --test: Test name; this option can be passed multiple times to indicate multiple tests (Default: [])

fs

Operations to interface with the Alluxio filesystem.

For commands that take Alluxio URIs as an argument, such as ls or mkdir, the argument should be either:

  • A complete Alluxio URI, such as alluxio://<masterHostname>:<masterPort>/<path>

  • A path without its scheme header, such as /path, in order to use the default hostname and port set in alluxio-site.properties

Note: All fs commands require the Alluxio cluster to be running.

Most of the commands which require path components allow wildcard arguments for ease of use. For example, the command bin/alluxio fs rm '/data/2014*' deletes anything in the data directory with a prefix of 2014.

Some shells will attempt to glob the input paths, causing strange errors. As a workaround, you can disable globbing (depending on the shell type; for example, set -f) or escape the wildcards. For example, the command bin/alluxio fs cat /\\* uses the escape backslash character twice. This is because the shell script will eventually call a Java program, which should receive the final escaped parameters cat /\\*.
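The effect of shell globbing can be seen without Alluxio at all. The sketch below uses a hypothetical helper, show_args, that prints each argument exactly as a program would receive it, and a temporary directory with two local files for the shell to match against.

```shell
# Print each argument exactly as the invoked program would receive it.
show_args() { printf '[%s]\n' "$@"; }

workdir="$(mktemp -d)" && cd "$workdir"
touch 2014-01.csv 2014-02.csv   # local files the shell can match against

show_args 2014*     # unquoted: the shell expands it to the two local file names
show_args '2014*'   # quoted: the literal pattern 2014* is passed through
set -f              # disable globbing for this shell session
show_args 2014*     # now passed through literally as well
set +f              # re-enable globbing
```

Quoting is the least invasive option because it affects a single argument, whereas set -f changes the behavior of every later command in the session until it is reverted.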

fs cat

Usage: bin/alluxio fs cat [path]

The cat command prints the contents of a file in Alluxio to the shell.

Examples:
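A minimal sketch with a placeholder path; it defaults to a dry run that only prints the command.

```shell
# Dry run by default (prints the command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Print the contents of a file stored in Alluxio (path is illustrative)
$ALLUXIO fs cat /output/part-00000
```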

fs check-cached

Usage: bin/alluxio fs check-cached --path|--index-file <path> [--limit <limit-size>] [flags]

Checks if files under a path have been cached in Alluxio.

Flags:

  • --index-file: The index file on the local filesystem that contains a list of file paths to check. Each line should contain a UFS file path. (Default: "")

  • --limit: Limit number of files to check (Default: 1000)

  • --path: The path to check caching status. The path can be either an Alluxio path or a UFS path. (Default: "")

  • --recursive: Whether to check files recursively in the given path (Default: false)

fs checksum

Usage: bin/alluxio fs checksum [path]

The checksum command outputs the MD5 value of a file in Alluxio. This can be used to verify the contents of a file stored in Alluxio.

Examples:
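One way to use this for verification, sketched with placeholder paths: copy a local file into Alluxio, then print its checksum and compare it with an MD5 computed locally (for example with md5sum). The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Copy a local file into Alluxio (paths are illustrative)
$ALLUXIO fs cp file:///tmp/LICENSE /LICENSE
# Print the MD5 of the Alluxio copy; compare it with the MD5 of /tmp/LICENSE
$ALLUXIO fs checksum /LICENSE
```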

fs compact

Usage: bin/alluxio fs compact [path] [flags]

The compact command compacts the position write file.

Flags:

  • --commitTimeout: Default is 1d. Uncommitted files (files with an mtime greater than the manifest's mtime) are retained during compaction. You can configure this value to delete uncommitted files. For example, with a value of 8h, files last modified more than 8 hours ago are deleted, while files modified within the last 8 hours are retained. (Default: "")

  • --deleteOrphanFrameFiles: Default is false. Whether to delete orphan frame files. (Default: false)

  • --deleteOrphanFrameListFiles: Default is false. Whether to delete orphan frame list files. (Default: false)

  • --deleteSnapshotsReferencedFiles: Default is false. Whether to delete the files referenced by the snapshots. If deleted, the files cannot be rolled back to a previous version. (Default: false)

  • --exportPath: Default is null. If the path is specified, during compaction, the frames will be merged into a complete file and output to the given path. (Default: "")

  • --fileLifecycle: Default is 0. Unreferenced files will be deleted during compaction. You can configure this value to retain some recently created files. For example, with a value of 8h, files last modified (mtime) more than 8 hours ago are deleted, while files modified within the last 8 hours are retained. (Default: "")

  • --maxRetainedSnapshots: Default is 4. The maximum number of snapshots to be saved. When the number of snapshots exceeds this limit, the oldest snapshot will be deleted. (Default: "")

  • --parallelism: Default is 1. The concurrency of compaction tasks. (Default: "")

  • --recursive: Default is false. Whether to compact the given path recursively. (Default: false)

Examples:
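A hedged sketch that combines a few of the flags above. The path /data/events is a placeholder, and the values (2 snapshots, an 8h lifecycle) are chosen only for illustration. The snippet defaults to a dry run that only prints the command.

```shell
# Dry run by default (prints the command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Compact everything under /data/events, keep at most 2 snapshots,
# and retain unreferenced files modified within the last 8 hours
$ALLUXIO fs compact /data/events --recursive --maxRetainedSnapshots 2 --fileLifecycle 8h
```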

fs cp

Usage: bin/alluxio fs cp [srcPath] [dstPath] [flags]

Copies a file or directory in the Alluxio filesystem or between local and Alluxio filesystems. The file:// scheme indicates a local filesystem path and the alluxio:// scheme or no scheme indicates an Alluxio filesystem path.

Flags:

  • --buffer-size: Read buffer size when copying to or from local, with defaults of 64MB and 8MB respectively (Default: "")

  • --forced,-f: Overwrite the destination path if it exists (Default: false)

  • --preserve,-p: Preserve file permission attributes when copying files; all ownership, permissions, and ACLs will be preserved (Default: false)

  • --recursive,-R: True to copy the directory subtree to the destination directory (Default: false)

  • --thread: Number of threads used to copy files in parallel, defaults to 2 * CPU cores (Default: 0)

Examples:
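The scheme rules above can be sketched as follows; all paths are placeholders, and the snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Upload: file:// marks the local side, no scheme means an Alluxio path
$ALLUXIO fs cp file:///tmp/report.csv /data/report.csv
# Download, overwriting the local destination if it already exists
$ALLUXIO fs cp -f /data/report.csv file:///tmp/report-copy.csv
# Copy a directory subtree within Alluxio, preserving ownership and permissions
$ALLUXIO fs cp -R -p /data /backup/data
```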

fs head

Usage: bin/alluxio fs head [path] [flags]

The head command prints the first 1KB of data of a file to the shell. Specifying the -c flag sets the number of bytes to print.

Flags:

  • --bytes,-c: Byte size to print (Default: "")

Examples:
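A short sketch with a placeholder path. It assumes the byte size can be given as a plain number; the snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Print the first 1KB of the file
$ALLUXIO fs head /output/part-00000
# Print the first 2048 bytes instead (plain-number format is an assumption)
$ALLUXIO fs head -c 2048 /output/part-00000
```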

fs location

Usage: bin/alluxio fs location [path]

Displays the list of hosts storing the specified file.

fs ls

Usage: bin/alluxio fs ls [path] [flags]

The ls command lists all the immediate children in a directory and displays their file info. Using ls on a file will only display the information for that specific file.

Flags:

  • --cache-filter,-c: Show the cacheability rules for metadata and data of the file based on the cache filter configuration; if cache filter is not enabled, show Disabled. Note that the resolved cacheability is based on the cache filter observed by the command line. (Default: false)

  • --help: help for this command (Default: false)

  • --human-readable,-h: Print sizes in human readable format (Default: false)

  • --list-dir-as-file,-d: List directories as files (Default: false)

  • --recursive,-R: List subdirectories recursively (Default: false)

Examples:
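Some common combinations of the flags above, sketched with the placeholder path /data. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# List the immediate children of /data
$ALLUXIO fs ls /data
# List the whole subtree with human-readable sizes
$ALLUXIO fs ls -R -h /data
# Show the entry for /data itself rather than its children
$ALLUXIO fs ls -d /data
```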

fs mkdir

Usage: bin/alluxio fs mkdir [path1 path2 ...]

The mkdir command creates a new directory in the Alluxio filesystem. It is recursive and will create any parent directories that do not exist. Note that the created directory will not be created in the under storage system until a file in the directory is persisted to the underlying storage. Using mkdir on an invalid or existing path will fail.

Examples:
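A minimal sketch with placeholder paths; it defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Create a nested directory; missing parents such as /data/2024 are created too
$ALLUXIO fs mkdir /data/2024/01
# Create several directories in one call
$ALLUXIO fs mkdir /staging /archive
```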

fs mv

Usage: bin/alluxio fs mv [srcPath] [dstPath]

The mv command moves a file or directory to another path in Alluxio. The destination path must not exist or be a directory. If it is a directory, the file or directory will be placed as a child of the directory. The command is purely a metadata operation and does not affect the data blocks of the file.

Examples:
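Both destination cases described above, sketched with placeholder paths. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Destination does not exist: /data/2014 is renamed to /data/archives/2014
$ALLUXIO fs mv /data/2014 /data/archives/2014
# Destination is an existing directory: the file becomes a child of /data/archives
$ALLUXIO fs mv /data/report.csv /data/archives
```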

fs rm

Usage: bin/alluxio fs rm [path] [flags]

The rm command removes a file from Alluxio space and the under storage system. The file will be unavailable immediately after this command returns, but the actual data may be deleted a while later.

Flags:

  • --alluxio-only: True to only remove data and metadata from Alluxio cache (Default: false)

  • --recursive,-R: True to recursively remove files within the specified directory subtree (Default: false)

  • --skip-ufs-check,-U: True to skip checking if corresponding UFS contents are in sync (Default: false)

Examples:
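A sketch of the three most common variants, with placeholder paths. The snippet defaults to a dry run that only prints the commands, which is a sensible habit for destructive operations.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Remove everything in /data whose name starts with 2014 (quoted so the shell does not glob)
$ALLUXIO fs rm '/data/2014*'
# Remove a directory and everything beneath it
$ALLUXIO fs rm -R /data/tmp
# Drop only the Alluxio copy, leaving the under storage untouched
$ALLUXIO fs rm --alluxio-only /data/cached.bin
```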

fs stat

Usage: bin/alluxio fs stat [path] [flags]

The stat command dumps the FileInfo representation of a file or a directory to the shell.

Flags:

  • --format,-f: Display info in the given format: "%N": name of the file; "%z": size of file in bytes; "%u": owner; "%g": group name of owner; "%i": file id of the file; "%y": modification time in UTC in 'yyyy-MM-dd HH:mm:ss' format; "%Y": modification time as Unix timestamp in milliseconds; "%b": number of blocks allocated for file (Default: "")

Examples:
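A brief sketch with a placeholder path, showing the default dump and a custom format built from the specifiers listed above. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Dump the full FileInfo of a file
$ALLUXIO fs stat /data/report.csv
# Print only the name and the size in bytes
$ALLUXIO fs stat -f '%N %z' /data/report.csv
```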

fs tail

Usage: bin/alluxio fs tail [path] [flags]

The tail command prints the last 1KB of data of a file to the shell. Specifying the --bytes flag sets the number of bytes to print.

Flags:

  • --bytes: Byte size to print (Default: "")

Examples:
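A short sketch with a placeholder path. It assumes the byte size can be given as a plain number; the snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Print the last 1KB of the file
$ALLUXIO fs tail /output/part-00000
# Print the last 2048 bytes instead (plain-number format is an assumption)
$ALLUXIO fs tail --bytes 2048 /output/part-00000
```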

fs test

Usage: bin/alluxio fs test [path] [flags]

Test a property of a path, returning 0 if the property is true, or 1 otherwise

Flags:

  • --dir,-d: Test if path is a directory (Default: false)

  • --exists,-e: Test if path exists (Default: false)

  • --file,-f: Test if path is a file (Default: false)

  • --not-empty,-s: Test if path is not empty (Default: false)

  • --zero,-z: Test if path is zero length (Default: false)
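Because the result is reported through the exit code, fs test fits naturally into shell conditionals. The sketch below defines a hypothetical helper, ensure_dir, that creates a directory only when it is missing, since mkdir fails on an existing path. In the default dry run the echo stand-in always exits 0, so the first branch is taken.

```shell
# Dry run by default; export ALLUXIO=./bin/alluxio to run against a live cluster.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Create a directory only when it does not already exist.
ensure_dir() {
  if $ALLUXIO fs test -d "$1" >/dev/null; then
    echo "already a directory: $1"
  else
    $ALLUXIO fs mkdir "$1"
  fi
}

ensure_dir /data/staging
```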

fs touch

Usage: bin/alluxio fs touch [path]

Create a 0 byte file at the specified path, which will also be created in the under file system

info

Retrieve and/or display info about the running Alluxio cluster

info cluster

Usage: bin/alluxio info cluster

Print the bound cluster information

info nodes

Usage: bin/alluxio info nodes

Show all registered workers' status

info production

Usage: bin/alluxio info production

Print the production ID

info version

Usage: bin/alluxio info version

Print Alluxio version.

job

Command line tool for interacting with the job service.

job copy

Usage: bin/alluxio job copy [flags]

The copy operator copies a file or directory in the Alluxio file system distributed across workers using the scheduler. If copy is run on a directory, files in the directory will be recursively copied.

Flags:

  • --batch-size: [submit] # of batch size to copy at every worker (Default: 0)

  • --check-content: [submit] Whether to check content hash after copying files (Default: false)

  • --dst: [all] Destination path of copy operation. The path can be either an Alluxio path or a UFS path. (Default: "")

  • --format: [progress] Format of output, either TEXT or JSON (Default: "")

  • --index-file: [all] Copy index file, containing on each line a pair of source and destination paths separated by -> (Default: "")

  • --progress: View progress of submitted job (Default: false)

  • --src: [all] Source path of copy operation. The path can be either an Alluxio path or a UFS path. (Default: "")

  • --stop: Stop running job (Default: false)

  • --submit: Submit job (Default: false)

  • --verbose: [progress] Verbose output (Default: false)

Examples:
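The flags marked [submit], [progress], and [all] suggest a submit, monitor, stop lifecycle. A sketch of that flow, with placeholder bucket names; it defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"
SRC=s3://bucket-a/data   # placeholder source
DST=s3://bucket-b/data   # placeholder destination

# Submit the copy and verify content hashes after copying
$ALLUXIO job copy --src "$SRC" --dst "$DST" --submit --check-content
# Check on it later, with verbose output
$ALLUXIO job copy --src "$SRC" --dst "$DST" --progress --verbose
# Stop it if needed
$ALLUXIO job copy --src "$SRC" --dst "$DST" --stop
```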

job free

Usage: bin/alluxio job free [flags]

The free command triggers a scheduler job to free a directory and release cached pages in all workers.

Flags:

  • --batch-size: [submit] The speed of free files per second per worker. If the value is 0 or empty, it will be set to the value of 'alluxio.job.batch.size' (Default: 0)

  • --force: [submit] Trigger free even if some workers are offline (Default: false)

  • --format: [progress] Format of output, either TEXT or JSON (Default: "")

  • --path: (Required) [all] Source path of free operation. The path can be either an Alluxio path or a UFS path.

  • --progress: View progress of submitted job (Default: false)

  • --recursive: [submit] Recursively free all files (Default: true)

  • --stop: Stop running job (Default: false)

  • --submit: Submit job (Default: false)

  • --verbose: [progress] Verbose output (Default: false)

Examples:
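A sketch of submitting and then tracking a free job, with a placeholder path. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Release cached pages for everything under the path
$ALLUXIO job free --path s3://bucket/data --submit
# Report progress as JSON
$ALLUXIO job free --path s3://bucket/data --progress --format JSON
# Stop the job
$ALLUXIO job free --path s3://bucket/data --stop
```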

job list

Usage: bin/alluxio job list [flags]

Lists jobs, filtered by job type and job state.

Flags:

  • --job-state: job state. RUNNING|VERIFYING|STOPPED|SUCCEEDED|FAILED|ALL (Default: "")

  • --job-type: job type. LOAD|FREE|COPY|MOVE|ALL (Default: "")

Examples:
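Two filter combinations built from the values listed above, as a sketch. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Load jobs that are currently running
$ALLUXIO job list --job-type LOAD --job-state RUNNING
# Failed jobs of any type
$ALLUXIO job list --job-type ALL --job-state FAILED
```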

job load

Usage: bin/alluxio job load [flags]

The load command moves data from the under storage system into Alluxio storage. For example, load can be used to prefetch data for analytics jobs. If load is run on a directory, files in the directory will be recursively loaded.

Flags:

  • --bandwidth: [submit] Single worker read bandwidth limit (Default: "")

  • --batch-size: [submit] # of batch size to load at every worker. If the value is 0 or empty, it will be set to the value of alluxio.job.batch.size (Default: 0)

  • --file-filter-regx: [submit] Skip files that match the regex pattern (Default: "")

  • --force: [submit] Trigger metadata sync even if some workers are offline (Default: false)

  • --format: [progress] Format of output, either TEXT or JSON (Default: "")

  • --index-file: [all] Source path of the index file for load operation (Default: "")

  • --metadata-only: [submit] Only load file metadata (Default: false)

  • --partial-listing: [submit] Use partial directory listing, initializing load before reading the entire directory but cannot report on certain progress details (Default: false)

  • --path: [all] Source path of load operation. The path can be either an Alluxio path or a UFS path. (Default: "")

  • --progress: View progress of submitted job (Default: false)

  • --replicas: [submit] # of replicas to load (Default: 1)

  • --skip-if-exists: [submit] Skip existing fully cached files (Default: false)

  • --skip-quota-check: [submit] Skip quota check and force submit load job (Default: false)

  • --stop: Stop running job (Default: false)

  • --submit: Submit job (Default: false)

  • --verbose: [progress] Verbose output (Default: false)

  • --verify: [submit] Run verification when load finishes and load new files if any (Default: false)

Examples:
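A sketch of prefetching a dataset and then tracking the job, with a placeholder UFS path. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"
DATA=s3://bucket/data   # placeholder path

# Load the dataset, skipping files that are already fully cached
$ALLUXIO job load --path "$DATA" --submit --skip-if-exists
# Load only the file metadata
$ALLUXIO job load --path "$DATA" --submit --metadata-only
# Check progress with verbose output
$ALLUXIO job load --path "$DATA" --progress --verbose
# Stop the job
$ALLUXIO job load --path "$DATA" --stop
```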

journal

Journal related operations

journal checkpoint

Usage: bin/alluxio journal checkpoint

The checkpoint command creates a checkpoint of the leading Alluxio master's journal. This command is mainly used for debugging and to prevent master journal logs from growing unbounded. Checkpointing requires a pause in master metadata changes, so use this command sparingly to avoid interfering with other users of the system.

journal format

Usage: bin/alluxio journal format

The format command formats the local Alluxio master's journal.

Warning: Formatting should only be called while the cluster is not running.

journal read

Usage: bin/alluxio journal read [flags]

The read command parses the current journal and outputs a human readable version to the local folder. This command may take a while depending on the size of the journal.

Note: This command requires that the Alluxio cluster is NOT running.

Flags:

  • --end: end log sequence number (exclusive) (Default: -1)

  • --input-dir: input directory on-disk to read the journal content from (Default: "")

  • --master: name of the master class (Default: "")

  • --output-dir: output directory to write journal content to (Default: "")

  • --start: start log sequence number (inclusive) (Default: 0)

Examples:
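A hedged sketch that dumps a bounded range of journal entries. The output directory /tmp/journal-dump and the sequence range are placeholders. The snippet defaults to a dry run that only prints the command; run it for real only while the cluster is stopped.

```shell
# Dry run by default (prints the command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Write entries with sequence numbers 0 (inclusive) to 100 (exclusive) to a local folder
$ALLUXIO journal read --start 0 --end 100 --output-dir /tmp/journal-dump
```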

license

Check and manage license status

license check-expiration

Usage: bin/alluxio license check-expiration

Validates the license and prints warnings if any constraint is about to be exceeded

license show

Usage: bin/alluxio license show [flags]

Show the details of the license based on the prod jar and configuration

Flags:

  • --output,-o: Output format, could be json/yaml (Default: "")

license status

Usage: bin/alluxio license status [flags]

List the current status of the cluster that license may use

Flags:

  • --raw: Output raw JSON data instead of human-readable format for bytes, datetime, and duration. (Default: false)

license update

Usage: bin/alluxio license update [flags]

Use the license from the current site properties file to update

Flags:

  • --process: The process you want to update with the new license (Default: "")

mount

The mount command manages the mapping from an under storage path to an Alluxio path, where files and folders created in Alluxio space under the path will be backed by a corresponding file or folder in the under storage path.

mount add

Usage: bin/alluxio mount add [flags]

The add command can be used to make data in another storage system available in Alluxio.

Note that mounts created with the --readonly flag are useful for preventing accidental write operations. If multiple Alluxio satellite clusters mount a remote storage cluster that serves as the central source of truth, the --readonly option can help prevent write operations on a satellite cluster from wiping out the remote storage.

To connect to the UFS for a mount point, Alluxio looks for the corresponding connector under ${ALLUXIO_HOME}/lib/ and uses the first one that supports the path. The connector jars look like lib/alluxio-underfs-hdfs-2.7.1.jar. Whether a connector supports a path is decided by its UnderFileSystemFactory implementation. When there are multiple connectors for the same UFS, such as lib/alluxio-underfs-hdfs-2.7.1.jar, lib/alluxio-underfs-hdfs-2.7.1-patch1.jar, and lib/alluxio-underfs-hdfs-2.7.1-patch2.jar, the option "alluxio.underfs.strict.version.match.enabled" can be used to make sure the correct one is picked up. For example, if HDFS is running 2.7.1-patch1, you can use "alluxio.underfs.version" and "alluxio.underfs.strict.version.match.enabled=true" to ensure "lib/alluxio-underfs-hdfs-2.7.1-patch1.jar" is used to connect to the target HDFS at hdfs://ns1/.

Flags:

  • --master: Set true to create mount point tracked by Alluxio master (Default: false)

  • --option: Configuration options, in the form of <key>=<value>, associated with the mount point, such as credentials (Default: [])

  • --path: (Required) Alluxio path to mount onto

  • --shared: Sets the permission bits of the mount point to be accessible for all Alluxio users (Default: false)

  • --ufs-uri: (Required) UFS URI to mount

Examples:
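A sketch of a plain mount and of the HDFS version-pinning case described above. The Alluxio paths /s3 and /hdfs and the bucket name are placeholders; repeating --option once per key and the exact value given to alluxio.underfs.version are assumptions. The snippet defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Make an S3 bucket prefix available under /s3
$ALLUXIO mount add --path /s3 --ufs-uri s3://bucket/data
# Mount HDFS and pin the connector version (repeated --option is an assumption)
$ALLUXIO mount add --path /hdfs --ufs-uri hdfs://ns1/ \
  --option alluxio.underfs.version=2.7.1-patch1 \
  --option alluxio.underfs.strict.version.match.enabled=true
```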

mount list

Usage: bin/alluxio mount list [flags]

List all known mount points set on the Alluxio filesystem

Flags:

  • --master: Set true to list mount points under the master based registration (Default: false)

Examples:
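A minimal sketch; it defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# List all known mount points
$ALLUXIO mount list
# List mount points under the master-based registration
$ALLUXIO mount list --master
```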

mount remove

Usage: bin/alluxio mount remove [flags]

Removes the mount point at the specified path

Flags:

  • --master: Set true to delete mount point under the master based registration (Default: false)

  • --path: (Required) Alluxio path to unmount

Examples:
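A minimal sketch with the placeholder mount point /s3; it defaults to a dry run that only prints the commands.

```shell
# Dry run by default (prints each command); export ALLUXIO=./bin/alluxio to run for real.
ALLUXIO="${ALLUXIO:-echo ./bin/alluxio}"

# Unmount /s3
$ALLUXIO mount remove --path /s3
# Confirm it no longer appears among the mount points
$ALLUXIO mount list
```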

process

Start/stop cluster processes or remove workers

process remove-worker

Usage: bin/alluxio process remove-worker [flags]

Remove given worker from the cluster, so that clients and other workers will not consider the removed worker for services. The worker must have been stopped before it can be safely removed from the cluster.

Flags:

  • --name,-n: (Required) Worker id

process start

Usage: bin/alluxio process start [flags]

Starts a single process locally or a group of similar processes across the cluster. For starting a group, it is assumed the local host has passwordless SSH access to other nodes in the cluster. The command will parse the hostnames to run on by reading the conf/coordinator and conf/workers files, depending on the process type.

Flags:

  • --async,-a: Asynchronously start processes without monitoring for start completion (Default: false)

  • --console-log,-c: Log output to stdout in addition to log file (Default: false)

  • --direct,-d: (For use in docker) Directly run the start command, skipping all other steps and avoid using nohup to launch (Default: false)

  • --skip-kill-prev,-N: Avoid killing previous running processes when starting (Default: false)

process stop

Usage: bin/alluxio process stop [flags]

Stops a single process locally or a group of similar processes across the cluster. For stopping a group, it is assumed the local host has passwordless SSH access to other nodes in the cluster. The command will parse the hostnames to run on by reading the conf/coordinator and conf/workers files, depending on the process type.

Flags:

  • --soft,-s: Soft kill only, don't forcibly kill the process (Default: false)
