Deploy Alluxio Edge for Trino
This document describes how to deploy Alluxio Edge for Trino with AWS S3 as the under file system (UFS). The key steps are to place the Alluxio jar files in the Trino plugin directories and update the configuration files, then proceed with the normal deployment of Trino.
Prerequisites
Your AWS S3 credentials
Preparation of storage for Alluxio Edge: After identifying the storage mounted for the Alluxio Edge cache, note the following:
The size of the local storage to be provisioned for Alluxio Edge (alluxio.user.client.cache.size)
The path where it is mounted (alluxio.user.client.cache.dirs)
This guide assumes that Trino is already installed in your environment and set up as described in the Trino deployment documentation. Note the installation directory of Trino. This document refers to it as ${TRINO_HOME}, which you will need to replace with your actual path.
A running ETCD cluster; the set of endpoint URLs is a required configuration setting.
Request a trial version of Alluxio Edge. Contact your Alluxio account representative at sales@alluxio.com and follow their instructions to download the installation tar file into the directory you prepared.
The tar file follows the naming convention alluxio-enterprise-edge-*.tar.gz. For example, if the tarball is named alluxio-enterprise-edge-1.1-6.0.0.tar.gz, the Alluxio version is edge-1.1-6.0.0.
Package Alluxio Edge with Trino
Remove any old Alluxio client jar files from the Trino directories
Running a command like this will usually work:
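As a minimal sketch of such a cleanup, the function below searches a Trino installation for leftover Alluxio client jars, prints each match, and deletes it. The function name and the `alluxio*client*.jar` file pattern are assumptions for illustration, not names taken from the Alluxio distribution, so review the printed list against the jars actually present in your environment.

```shell
# Hypothetical helper: remove old Alluxio client jars under a Trino install.
# The file name pattern is an assumption; adjust it to match your jars.
remove_old_alluxio_jars() {
  local trino_home="$1"
  # -print lists every match so the deletions are visible in the output.
  find "${trino_home}" -type f -name 'alluxio*client*.jar' -print -exec rm -f {} +
}

# Example: remove_old_alluxio_jars "${TRINO_HOME}"
```

Running the `find` part without `-exec rm -f {} +` first gives a dry run of what would be removed.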
Extract Alluxio Edge jars and Place into the Trino Directories
Three Alluxio Edge Java JAR files must be installed on each Trino node.
Extract the Alluxio Edge S3 under store filesystem integration JAR file using this command:
Extract the client and prod jar files from the tarball using this command:
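The exact jar paths inside the tarball are not reproduced here, so the following is only a sketch under stated assumptions: it lists the archive members, keeps jar files whose names contain `underfs-s3a`, `client`, or `prod` (assumed keywords, not confirmed file names), and copies them into a flat staging directory. The function name `extract_alluxio_jars` is hypothetical.

```shell
# Hypothetical helper: pull the needed jars out of the Alluxio Edge tarball.
# The keywords in the grep pattern are assumptions about the jar file names.
extract_alluxio_jars() {
  local tarball="$1" dest="$2"
  local staging
  staging="$(mktemp -d)"
  mkdir -p "${dest}"
  # List archive members, keep matching jar files, extract them one by one.
  tar -tzf "${tarball}" \
    | grep -E '(underfs-s3a|client|prod)[^/]*\.jar$' \
    | while IFS= read -r member; do
        tar -xzf "${tarball}" -C "${staging}" "${member}"
        cp "${staging}/${member}" "${dest}/"
      done
  rm -rf "${staging}"
}

# Example: extract_alluxio_jars alluxio-enterprise-edge-1.1-6.0.0.tar.gz ./alluxio-jars
```

Listing the archive with `tar -tzf` before extracting is a cheap way to confirm which jars your tarball actually contains.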
Depending on your Trino version, the jars need to be copied to different destinations.
For Trino versions 434 or later, copy to ${TRINO_HOME}/plugin/<pluginName>/hdfs
For Trino versions older than 434, copy to ${TRINO_HOME}/plugin/<pluginName>
Note that copying the jars to both destinations has no negative impact.
The following example shows the copy commands for version 434+ into the plugin directories for hive, hudi, delta lake, and iceberg.
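A loop over the plugin directories might look like the sketch below. It assumes the extracted jars sit together in one directory and that the plugin directories are named `hive`, `hudi`, `delta-lake`, and `iceberg`. Both are assumptions, so check the names under ${TRINO_HOME}/plugin first. The function name is hypothetical.

```shell
# Hypothetical helper: copy Alluxio Edge jars into each plugin's hdfs
# directory (the Trino 434+ layout). Plugin directory names are assumptions.
install_alluxio_jars() {
  local jar_dir="$1" trino_home="$2"
  local plugin target
  for plugin in hive hudi delta-lake iceberg; do
    # Skip plugins that are not installed in this Trino distribution.
    [ -d "${trino_home}/plugin/${plugin}" ] || continue
    target="${trino_home}/plugin/${plugin}/hdfs"
    mkdir -p "${target}"
    cp "${jar_dir}"/*.jar "${target}/"
  done
}

# Example: install_alluxio_jars ./alluxio-jars "${TRINO_HOME}"
```

For Trino versions older than 434, the same loop applies with `target` set to the plugin directory itself.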
Along with the Alluxio client and prod jars in the plugin directories, an additional third party jar is needed. Download the commons-lang3-3.14.0 jar and place it in the same directories.
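One way to fetch the jar is from Maven Central, sketched below. Using Maven Central as the download source is an assumption (any trusted mirror of the Apache Commons Lang 3.14.0 release works), and the function name is hypothetical.

```shell
# Assumed download source: Maven Central's standard repository layout.
COMMONS_LANG3_VERSION="3.14.0"
COMMONS_LANG3_JAR="commons-lang3-${COMMONS_LANG3_VERSION}.jar"
COMMONS_LANG3_URL="https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/${COMMONS_LANG3_VERSION}/${COMMONS_LANG3_JAR}"

# Hypothetical helper: download the jar into one target directory.
fetch_commons_lang3() {
  local target_dir="$1"
  # -f makes curl fail on HTTP errors instead of saving an error page.
  curl -fsSL -o "${target_dir}/${COMMONS_LANG3_JAR}" "${COMMONS_LANG3_URL}"
}

# Example: fetch_commons_lang3 "${TRINO_HOME}/plugin/hive/hdfs"
```

Calling the function once per plugin directory keeps the jar next to the Alluxio client and prod jars.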
Download the Prometheus jar
Prometheus is the supported tool for observing the metrics that Alluxio Edge emits. Download the Java agent jar and save it in the ${TRINO_HOME}/lib/ directory.
Update Configurations
Update Alluxio Configurations in Trino JVM Config
Configure the Alluxio properties in Trino's jvm.config file, which is usually located at ${TRINO_HOME}/etc/. The following settings need to be added:
alluxio.conf.dir is the directory containing Alluxio-specific configuration files
The add-opens line allows compatibility with Java 17
alluxio.metrics.conf.file is the path to the configuration file for metrics
The javaagent line starts a metrics agent from which metrics can be pulled. In this example, it runs on port 9696 and the jmx_export_config.yaml file defines its configuration.
Note that any additional Alluxio properties can be specified in the -D<key>=<value> format.
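The sketch below appends three of the settings described above to jvm.config. It makes several assumptions: the Alluxio configuration files live in ${TRINO_HOME}/etc/alluxio, the agent jar file name is passed in by the caller because its version is not stated here, and the add-opens line is left out because its exact argument is not given in this document and must be added separately. The function name is hypothetical. Because jvm.config is not processed by a shell, the paths are expanded when the file is written.

```shell
# Hypothetical helper: append Alluxio-related JVM options to Trino's jvm.config.
# The add-opens option is NOT written here; its exact value is not stated
# in this guide and has to be added by hand.
append_alluxio_jvm_options() {
  local trino_home="$1"
  local agent_jar="$2"   # file name of the Prometheus Java agent jar in lib/
  cat >> "${trino_home}/etc/jvm.config" <<EOF
-Dalluxio.conf.dir=${trino_home}/etc/alluxio
-Dalluxio.metrics.conf.file=${trino_home}/etc/alluxio/metrics.properties
-javaagent:${trino_home}/lib/${agent_jar}=9696:${trino_home}/etc/alluxio/jmx_export_config.yaml
EOF
}

# Example: append_alluxio_jvm_options "${TRINO_HOME}" "<agent-jar-file-name>"
```

The `-javaagent:<jar>=<port>:<config>` form is the argument layout the Prometheus JMX exporter agent expects.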
Create metrics.properties and jmx_export_config.yaml for Prometheus integration with metrics
Create jmx_export_config.yaml in ${TRINO_HOME}/etc/alluxio/ with the following sample content.
Also create metrics.properties in ${TRINO_HOME}/etc/alluxio/ with the following sample content.
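As an illustrative starting point only, the function below writes both files. The YAML keys are generic Prometheus JMX exporter options with a catch-all rule, and the `sink.jmx.class` entry assumes Alluxio's JMX metrics sink class. Neither is taken from the Alluxio Edge sample content, so compare them with the samples shipped with your release. The function name is hypothetical.

```shell
# Hypothetical helper: write minimal metrics configuration files.
# File contents are assumptions, not the official Alluxio Edge samples.
write_alluxio_metrics_config() {
  local conf_dir="$1"
  mkdir -p "${conf_dir}"
  # Generic JMX exporter settings; the ".*" rule exports every MBean attribute.
  cat > "${conf_dir}/jmx_export_config.yaml" <<'EOF'
---
startDelaySeconds: 0
lowercaseOutputName: false
lowercaseOutputLabelNames: false
rules:
  - pattern: ".*"
EOF
  # Assumed sink: expose Alluxio metrics over JMX so the agent can read them.
  cat > "${conf_dir}/metrics.properties" <<'EOF'
sink.jmx.class=alluxio.metrics.sink.JmxSink
EOF
}

# Example: write_alluxio_metrics_config "${TRINO_HOME}/etc/alluxio"
```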
Create alluxio-site.properties for Alluxio Edge configuration
Create alluxio-site.properties in ${TRINO_HOME}/etc/alluxio/. The following configuration should be set:
Please refer to configuration settings for details.
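The sketch below writes only the two cache properties named in the prerequisites. The S3 credential and ETCD endpoint properties are deliberately not shown because their names are not given in this document. Add them according to the configuration settings reference. The function name and the example values are hypothetical.

```shell
# Hypothetical helper: write the cache-related part of alluxio-site.properties.
# S3 credentials and ETCD endpoints must be added separately.
write_alluxio_site_properties() {
  local conf_dir="$1" cache_dir="$2" cache_size="$3"
  mkdir -p "${conf_dir}"
  cat > "${conf_dir}/alluxio-site.properties" <<EOF
# Path where the local cache storage is mounted
alluxio.user.client.cache.dirs=${cache_dir}
# Size of the local storage provisioned for the cache
alluxio.user.client.cache.size=${cache_size}
EOF
}

# Example (values are placeholders):
# write_alluxio_site_properties "${TRINO_HOME}/etc/alluxio" /mnt/alluxio-cache 100GB
```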
Create alluxio-core-site.xml for overriding the S3 filesystem
Create alluxio-core-site.xml in ${TRINO_HOME}/etc/alluxio/ to include the fs.s3a.impl property to ensure that Trino uses the S3AFileSystem when a Hive table LOCATION is set to the s3:// or s3a:// scheme.
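The file uses the standard Hadoop configuration XML layout. In the sketch below the implementation class is passed in by the caller, because the class name to use is not stated in this document. Take it from the Alluxio Edge documentation for your release. The function name is hypothetical.

```shell
# Hypothetical helper: write alluxio-core-site.xml with the fs.s3a.impl override.
# impl_class must be supplied by the caller; it is not given in this guide.
write_alluxio_core_site() {
  local conf_dir="$1" impl_class="$2"
  mkdir -p "${conf_dir}"
  cat > "${conf_dir}/alluxio-core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.s3a.impl</name>
    <value>${impl_class}</value>
  </property>
</configuration>
EOF
}

# Example: write_alluxio_core_site "${TRINO_HOME}/etc/alluxio" "<filesystem-class>"
```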
Update Catalogs
Configure the Trino catalog (such as the Hive or Delta Lake catalog) to include the alluxio-core-site.xml file in its resources. You can likely find the catalog files in ${TRINO_HOME}/etc/catalog/. Note that the hive.config.resources property may already be set with existing XML files, e.g. /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml. In this case, alluxio-core-site.xml should be appended to the end of the list.
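The append step can be sketched as a small function that edits one catalog properties file. If hive.config.resources already exists, the XML path is added to the end of its list. Otherwise the property is created. The function name is hypothetical, and the sketch assumes the file ends with a newline and that the existing property value is not empty.

```shell
# Hypothetical helper: add alluxio-core-site.xml to hive.config.resources.
add_alluxio_resource() {
  local catalog_file="$1" xml_path="$2"
  if grep -q '^hive\.config\.resources=' "${catalog_file}"; then
    # Nothing to do if the path is already listed.
    if grep '^hive\.config\.resources=' "${catalog_file}" | grep -qF "${xml_path}"; then
      return 0
    fi
    # Append to the end of the existing comma-separated list (keeps a .bak copy).
    sed -i.bak "s|^hive\.config\.resources=\(.*\)$|hive.config.resources=\1,${xml_path}|" "${catalog_file}"
  else
    printf 'hive.config.resources=%s\n' "${xml_path}" >> "${catalog_file}"
  fi
}

# Example:
# add_alluxio_resource "${TRINO_HOME}/etc/catalog/hive.properties" \
#   "${TRINO_HOME}/etc/alluxio/alluxio-core-site.xml"
```

Running the function twice leaves the file unchanged the second time, which makes it safe to use in repeatable provisioning scripts.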
Some optional configurations
Redeploy Trino
If Trino was previously running, it must be restarted to pick up the configuration changes applied.
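For a tarball-based installation, the restart can be done with Trino's launcher script, as in the sketch below. This assumes Trino is managed through ${TRINO_HOME}/bin/launcher. Deployments managed by a service manager or container orchestrator would restart through that tool instead. The function name is hypothetical.

```shell
# Hypothetical helper: restart a tarball-installed Trino node via its launcher.
restart_trino() {
  local trino_home="$1"
  "${trino_home}/bin/launcher" restart
}

# Example: restart_trino "${TRINO_HOME}"
```

Each Trino node that received the new jars and configuration needs the restart.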