Building Alluxio From Source

This guide describes how to clone the Alluxio repository, compile the source code, and run tests in your environment.

Required Software

  • Java 8 installed on your system

  • Maven 3.3.9 or later

  • Git

Alternatively, we have published a Docker image, alluxio/alluxio-maven, with Java, Maven, and Git pre-installed to help build the Alluxio source code.
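
As a quick sanity check, you can confirm that the required tools are available on your PATH (the exact version output will vary by environment):

$ java -version    # expect a Java 8 (1.8) runtime
$ mvn -v           # expect Maven 3.3.9 or later
$ git --version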

Checkout Source Code

Check out the Alluxio master branch from GitHub:

$ git clone https://github.com/Alluxio/alluxio.git
$ cd alluxio
$ export ALLUXIO_HOME=$(pwd)

By default, cloning the repository checks out the master branch. If you are looking to build a particular version of the code, you may check out that version using a git tag.

$ git tag
$ git checkout <TAG_NAME>
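
For example, assuming v2.7.3 appears in the output of git tag, you could build that release with:

$ git checkout v2.7.3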

(Optional) Set Up the Build Environment Using Docker

This section guides you through setting up a pre-configured compilation environment based on our published Docker image. You can skip this section and build the Alluxio source code directly if the JDK and Maven are already installed locally.

Start a container named alluxio-build based on this image and enter the container to proceed:

$ docker run -itd \
  --network=host \
  -v ${ALLUXIO_HOME}:/alluxio  \
  -v ${HOME}/.m2:/root/.m2 \
  --name alluxio-build \
  alluxio/alluxio-maven bash

$ docker exec -it -w /alluxio alluxio-build bash

Note that:

  • Container path /alluxio is mapped to host path ${ALLUXIO_HOME}, so the binaries built inside the container will still be accessible on the host afterwards.

  • Container path /root/.m2 is mapped to host path ${HOME}/.m2 to leverage your local Maven cache. This is optional.

When done using the container, destroy it by running:

$ docker rm -f alluxio-build

Build

Build the source code using Maven:

$ mvn clean install -DskipTests

To speed up the compilation, you can run the following command, which skips various checks:

$ mvn -T 2C clean install \
    -DskipTests \
    -Dmaven.javadoc.skip=true \
    -Dfindbugs.skip=true \
    -Dcheckstyle.skip=true \
    -Dlicense.skip=true

The Maven build system fetches its dependencies, compiles the source code, runs unit tests (unless skipped with -DskipTests), and packages the system. If this is the first time you are building the project, it can take a while to download all the dependencies. Subsequent builds, however, will be much faster.
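
If you are iterating on a single module, you can also rebuild just that module together with the modules it depends on; a minimal sketch, where core/common is only an example path (substitute the module you changed):

$ mvn clean install -pl core/common -am -DskipTests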

Test

Once Alluxio is built, you can validate and start it with:

# Alluxio uses ./underFSStorage for under file system storage by default
$ mkdir ./underFSStorage
$ ./bin/alluxio format
$ ./bin/alluxio-start.sh local SudoMount
$ ./bin/alluxio runTests

You should see the result Passed the test!

To verify that Alluxio is running, you can visit http://localhost:19999 or check the logs in the alluxio/logs directory. The worker.log and master.log files will typically be the most useful; it may take a few seconds for the web server to start. You can also run a simple program to test that data can be read from and written to Alluxio's UFS.
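
A minimal sketch using the Alluxio shell, assuming the local cluster started above is still running (the file names here are arbitrary examples):

# copy a local file into Alluxio, then read it back
$ ./bin/alluxio fs copyFromLocal LICENSE /BuildSmokeTest
$ ./bin/alluxio fs cat /BuildSmokeTest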

You can stop the local Alluxio system by using:

$ ./bin/alluxio-stop.sh local

Build Options

Compute Framework Support

Since Alluxio 1.7, the Alluxio client jar, built and located at {{site.ALLUXIO_CLIENT_JAR_PATH}}, works with different compute frameworks (e.g., Spark, Flink, and Presto) by default.
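
For example, a Spark job can pick up the client jar through its classpath configuration; a minimal sketch, where the application class and jar are placeholders for your own:

$ spark-submit \
    --conf spark.driver.extraClassPath={{site.ALLUXIO_CLIENT_JAR_PATH}} \
    --conf spark.executor.extraClassPath={{site.ALLUXIO_CLIENT_JAR_PATH}} \
    --class <YOUR_APP_CLASS> <YOUR_APP_JAR>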

Build Different HDFS Under Storage

By default, Alluxio is built with HDFS under storage support for Hadoop 3.3. To build against a different version, run the following command, specifying <UFS_HADOOP_PROFILE> and the corresponding ufs.hadoop.version.

$ mvn install -pl underfs/hdfs/ \
   -P<UFS_HADOOP_PROFILE> -Dufs.hadoop.version=<HADOOP_VERSION> -DskipTests

Here <HADOOP_VERSION> can be set for different distributions. Available Hadoop profiles include ufs-hadoop-1, ufs-hadoop-2, and ufs-hadoop-3, covering the major Hadoop versions 1.x, 2.x, and 3.x.

Hadoop versions >= 3.0.0 are best for compatibility with newer releases of Alluxio.

For example,

$ mvn clean install -pl underfs/hdfs/ \
  -Dmaven.javadoc.skip=true -DskipTests -Dlicense.skip=true \
  -Dcheckstyle.skip=true -Dfindbugs.skip=true \
  -Pufs-hadoop-3 -Dufs.hadoop.version=3.3.4

If you find a jar named alluxio-underfs-hdfs-<HADOOP_VERSION>-{{site.ALLUXIO_VERSION_STRING}}.jar under ${ALLUXIO_HOME}/lib, the compilation succeeded.
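
You can verify this from the shell; the exact file name depends on the Hadoop and Alluxio versions you built with:

$ ls ${ALLUXIO_HOME}/lib/alluxio-underfs-hdfs-*.jar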

Check out the flags for different HDFS distributions below.

Apache

All main builds are from Apache, so all Apache releases can be used directly:

-Pufs-hadoop-1 -Dufs.hadoop.version=1.0.4
-Pufs-hadoop-1 -Dufs.hadoop.version=1.2.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.2.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.3.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.4.1
-Pufs-hadoop-2 -Dufs.hadoop.version=2.5.2
-Pufs-hadoop-2 -Dufs.hadoop.version=2.6.5
-Pufs-hadoop-2 -Dufs.hadoop.version=2.7.3
-Pufs-hadoop-2 -Dufs.hadoop.version=2.8.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.9.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.10.0
-Pufs-hadoop-3 -Dufs.hadoop.version=3.0.0
-Pufs-hadoop-3 -Dufs.hadoop.version=3.3.4
Cloudera

To build against Cloudera's releases, use a version of the form $apacheRelease-cdh$cdhRelease:

-Pufs-hadoop-2 -Dufs.hadoop.version=2.3.0-cdh5.1.0
-Pufs-hadoop-2 -Dufs.hadoop.version=2.0.0-cdh4.7.0
Hortonworks

To build against a Hortonworks release, use a version of the form $apacheRelease.$hortonworksRelease:

-Pufs-hadoop-2 -Dufs.hadoop.version=2.1.0.2.0.5.0-67
-Pufs-hadoop-2 -Dufs.hadoop.version=2.2.0.2.1.0.0-92
-Pufs-hadoop-2 -Dufs.hadoop.version=2.4.0.2.1.3.0-563

To enable active sync, be sure to build using the hdfsActiveSync property. See the Active Sync for HDFS documentation for more information on using active sync.

Troubleshooting

Exception java.lang.OutOfMemoryError: Java heap space

If you are seeing java.lang.OutOfMemoryError: Java heap space, set the following variable to increase the maximum heap size for Maven:

$ export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"

An error occurred while running protolock

If you see the following error message from Maven:

"An error occurred while running protolock: Cannot run program "/alluxio/core/transport/target/protolock-bin/protolock" (in directory "/alluxio/core/transport/target/classes"): error=2, No such file or directory"

make sure the Maven flag -Dskip.protoc is NOT included when building the source code.

NullPointerException occurred while executing org.codehaus.mojo:buildnumber-maven-plugin:1.4:create

If you see an error message from Maven like the following: "Failed to execute goal org.codehaus.mojo:buildnumber-maven-plugin:1.4:create-metadata (default) on project alluxio-core-common: Execution default of goal org.codehaus.mojo:buildnumber-maven-plugin:1.4:create-metadata failed.: NullPointerException"

The build number is based on the revision number retrieved from SCM, so the plugin tries to derive the build number from the git hash; if that lookup fails, it throws a NullPointerException. To avoid the exception, set the Alluxio version with the Maven parameter -Dmaven.buildNumber.revisionOnScmFailure.

For example, if the Alluxio version is 2.7.3, pass -Dmaven.buildNumber.revisionOnScmFailure=2.7.3.
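
A minimal sketch of a full build command carrying this parameter (reusing the example version above):

$ mvn clean install -DskipTests -Dmaven.buildNumber.revisionOnScmFailure=2.7.3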

See https://www.mojohaus.org/buildnumber-maven-plugin/create-mojo.html#revisionOnScmFailure for more information.
