Azure Data Lake Storage Gen2
This guide describes how to configure Alluxio with Azure Data Lake Storage Gen2 as the under storage system.
The Alluxio binaries must be on your machine. You can either compile Alluxio from source, or download the precompiled binaries.
In preparation for using Azure Data Lake storage with Alluxio, create a new Data Lake storage in your Azure account or use an existing Data Lake storage. You should also note the directory you want to use, either by creating a new directory or by using an existing one. You also need a credential for the storage account, such as a Shared Key. For the purposes of this guide, the Azure storage account name is called <AZURE_ACCOUNT>, the directory in that storage account is called <AZURE_DIRECTORY>, and the name of the container is called <AZURE_CONTAINER>.
To use Azure Data Lake Storage as the UFS of the Alluxio root mount point with a Shared Key, configure Alluxio to use it as the under storage system by modifying conf/alluxio-site.properties. If the file does not exist, create it from the template conf/alluxio-site.properties.template.
Specify the underfs address by modifying conf/alluxio-site.properties to include:
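As a sketch, assuming an Alluxio 2.x deployment in which the root mount is set through `alluxio.master.mount.table.root.ufs` (the property name differs in other Alluxio versions, so check the property list for the version you run), the address uses the `abfs://` scheme with the placeholders defined above:

```properties
# Assumption: Alluxio 2.x root-mount property; verify the name for your version.
alluxio.master.mount.table.root.ufs=abfs://<AZURE_CONTAINER>@<AZURE_ACCOUNT>.dfs.core.windows.net/<AZURE_DIRECTORY>/
```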
Specify the Shared Key by adding the following property in conf/alluxio-site.properties:
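A minimal sketch, assuming the Hadoop ABFS-style key name; `<SHARED_KEY>` is a placeholder introduced here for the access key of the storage account:

```properties
# <SHARED_KEY> is a placeholder for the storage account access key (an assumption of this sketch).
fs.azure.account.key.<AZURE_ACCOUNT>.dfs.core.windows.net=<SHARED_KEY>
```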
An Azure Data Lake store location can also be mounted at a nested directory in the Alluxio namespace to provide unified access to multiple under storage systems. Alluxio's command line interface can be used for this purpose.
After these changes, Alluxio should be configured to work with Azure Data Lake storage as its under storage system, and you can run Alluxio locally with it.
To use Azure Data Lake Storage as the UFS of the Alluxio root mount point with OAuth 2.0 client credentials, modify conf/alluxio-site.properties as in the Shared Key setup: create the file from the template if it does not exist, then specify the same underfs address.
Specify the OAuth 2.0 Client Credentials by adding the following properties in conf/alluxio-site.properties (note that the URL endpoint must be the V1 token endpoint):
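A hedged sketch, assuming the Hadoop ABFS-style OAuth property names; `<OAUTH_ENDPOINT>`, `<CLIENT_ID>`, and `<CLIENT_SECRET>` are placeholders introduced here for the values of your Azure application registration:

```properties
# Assumption: a V1 token endpoint typically has the form
# https://login.microsoftonline.com/<TENANT_ID>/oauth2/token
fs.azure.account.oauth2.client.endpoint=<OAUTH_ENDPOINT>
fs.azure.account.oauth2.client.id=<CLIENT_ID>
fs.azure.account.oauth2.client.secret=<CLIENT_SECRET>
```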
After these changes, Alluxio should be configured to work with Azure Data Lake storage as its under storage system, and you can run Alluxio locally with it.
To use Azure Data Lake Storage as the UFS of the Alluxio root mount point with Azure Managed Identities, modify conf/alluxio-site.properties as in the Shared Key setup: create the file from the template if it does not exist, then specify the same underfs address.
Specify the Azure Managed Identities by adding the following properties in conf/alluxio-site.properties:
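A hedged sketch, assuming the Hadoop ABFS-style managed identity property names; `<MSI_ENDPOINT>`, `<CLIENT_ID>`, and `<MSI_TENANT>` are placeholders introduced here, to be filled in from your managed identity:

```properties
# Placeholders are assumptions of this sketch; take the values from your managed identity.
fs.azure.account.oauth2.msi.endpoint=<MSI_ENDPOINT>
fs.azure.account.oauth2.client.id=<CLIENT_ID>
fs.azure.account.oauth2.msi.tenant=<MSI_TENANT>
```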
After these changes, Alluxio should be configured to work with Azure Data Lake storage as its under storage system, and you can run Alluxio locally with it.
Start up Alluxio locally to see that everything works.
Run a simple example program:
Visit your directory <AZURE_DIRECTORY> to verify that the files and directories created by Alluxio exist. For this test, you should see files named like:
To stop Alluxio, you can run:
Starting Alluxio locally should launch an Alluxio master and an Alluxio worker. You can see the master UI at http://localhost:19999.