Trino on K8s

Trino is an open source distributed SQL query engine for running interactive analytic queries against data at large scale. This guide describes how to run Trino queries with Alluxio as a distributed caching layer, against any data storage system that Alluxio supports, such as AWS S3 and HDFS. Alluxio allows Trino to access data regardless of the data source and transparently caches frequently accessed data (e.g., commonly used tables) in Alluxio distributed storage.

Prerequisites

This guide assumes that the Alluxio cluster is deployed on Kubernetes.

docker is also required to build the custom Trino image.

Prepare image

To integrate with Trino, Alluxio jars and configuration files must be added to the Trino image. Trino containers must be launched with this modified image in order to connect to the Alluxio cluster.

Among the files listed for download in the Alluxio installation instructions, locate the tarball named alluxio-enterprise-DA-3.2-8.0.0-release.tar.gz. Extract the following Alluxio jars from it:

  • client/alluxio-DA-3.2-8.0.0-client.jar

  • client/ufs/alluxio-underfs-s3a-shaded-DA-3.2-8.0.0.jar if using an S3 bucket as a UFS

  • client/ufs/alluxio-underfs-hadoop-3.3-shaded-DA-3.2-8.0.0.jar if using HDFS as a UFS

Prepare an empty directory as the working directory to build an image from. Within this directory, create the directory files/alluxio/ and copy the aforementioned jar files into it.
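
For example, assuming the release tarball was downloaded into the current directory and an S3 bucket is used as the UFS, the jars can be located and copied with commands along the following lines (the scratch directory path is arbitrary):

$ mkdir -p /tmp/alluxio-release files/alluxio
$ tar -xzf alluxio-enterprise-DA-3.2-8.0.0-release.tar.gz -C /tmp/alluxio-release
# copy the client jar and the UFS jar for your storage into files/alluxio/
$ find /tmp/alluxio-release -name 'alluxio-DA-3.2-8.0.0-client.jar' -exec cp {} files/alluxio/ \;
$ find /tmp/alluxio-release -name 'alluxio-underfs-s3a-shaded-DA-3.2-8.0.0.jar' -exec cp {} files/alluxio/ \;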

Download the commons-lang3 jar into files/ so that it can also be included in the image.

wget https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/3.14.0/commons-lang3-3.14.0.jar -O files/commons-lang3-3.14.0.jar

Create a Dockerfile with the operations to modify the base Trino image. The following example defines arguments for:

  • TRINO_VERSION=449 as the Trino version

  • UFS_JAR=files/alluxio/alluxio-underfs-s3a-shaded-DA-3.2-8.0.0.jar as the path to the UFS jar copied into files/alluxio/

  • CLIENT_JAR=files/alluxio/alluxio-DA-3.2-8.0.0-client.jar as the path to the Alluxio client jar copied into files/alluxio/

  • COMMONS_LANG_JAR=files/commons-lang3-3.14.0.jar as the path to the commons-lang3 jar downloaded to files/

ARG  TRINO_VERSION=449
ARG  IMAGE=trinodb/trino:${TRINO_VERSION}
FROM $IMAGE

ARG  UFS_JAR=files/alluxio/alluxio-underfs-s3a-shaded-DA-3.2-8.0.0.jar
ARG  CLIENT_JAR=files/alluxio/alluxio-DA-3.2-8.0.0-client.jar
ARG  COMMONS_LANG_JAR=files/commons-lang3-3.14.0.jar

# remove existing alluxio jars
RUN  rm /usr/lib/trino/plugin/hive/alluxio*.jar \
     && rm /usr/lib/trino/plugin/delta-lake/alluxio*.jar \
     && rm /usr/lib/trino/plugin/iceberg/alluxio*.jar

# copy jars into image
ENV  ALLUXIO_HOME=/usr/lib/trino/alluxio
RUN  mkdir -p $ALLUXIO_HOME/lib && mkdir -p $ALLUXIO_HOME/conf
COPY --chown=trino:trino $UFS_JAR          /usr/lib/trino/alluxio/lib/
COPY --chown=trino:trino $CLIENT_JAR       /usr/lib/trino/alluxio/lib/
COPY --chown=trino:trino $COMMONS_LANG_JAR /usr/lib/trino/alluxio/lib/

# symlink UFS jar to lib/ and client+commons-lang3 jar to each plugin directory
RUN  ln -s $ALLUXIO_HOME/lib/alluxio-underfs-*.jar /usr/lib/trino/lib/
RUN  ln -s $ALLUXIO_HOME/lib/alluxio-*-client.jar    /usr/lib/trino/plugin/hive/hdfs/ \
     && ln -s $ALLUXIO_HOME/lib/alluxio-*-client.jar /usr/lib/trino/plugin/delta-lake/hdfs/ \
     && ln -s $ALLUXIO_HOME/lib/alluxio-*-client.jar /usr/lib/trino/plugin/iceberg/hdfs/
RUN  ln -s $ALLUXIO_HOME/lib/commons-lang3-3.14.0.jar     /usr/lib/trino/plugin/hive/hdfs/ \
     && ln -s $ALLUXIO_HOME/lib/commons-lang3-3.14.0.jar  /usr/lib/trino/plugin/delta-lake/hdfs/ \
     && ln -s $ALLUXIO_HOME/lib/commons-lang3-3.14.0.jar  /usr/lib/trino/plugin/iceberg/hdfs/

USER trino:trino

This copies the necessary jars into the image. A symlink to the UFS jar is created in the Trino lib/ directory, and symlinks to the Alluxio client and commons-lang3 jars are created in each of the relevant plugin directories.

In the above example, the jars are linked into the plugin directories for Hive, Delta Lake, and Iceberg, but only a subset may be needed depending on which connector(s) are used.

Note that for Trino versions earlier than 434, the symlink destination for the CLIENT_JAR and COMMONS_LANG_JAR should be /usr/lib/trino/plugin/<PLUGIN_NAME>/ instead of /usr/lib/trino/plugin/<PLUGIN_NAME>/hdfs/.

Build the image by running the following command, replacing <PRIVATE_REGISTRY> with the URL of your private container registry and <TRINO_VERSION> with the corresponding Trino version:

$ docker build --platform linux/amd64 -t <PRIVATE_REGISTRY>/alluxio/trino:<TRINO_VERSION> .
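
Optionally, before pushing, verify that the jars and symlinks landed in the expected locations inside the image; for example, assuming the Hive connector is used:

$ docker run --rm --entrypoint ls <PRIVATE_REGISTRY>/alluxio/trino:<TRINO_VERSION> /usr/lib/trino/alluxio/lib /usr/lib/trino/plugin/hive/hdfs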

Push the image by running:

$ docker push <PRIVATE_REGISTRY>/alluxio/trino:<TRINO_VERSION>

Deploy Hive Metastore

To complete the Trino end-to-end example, a Hive Metastore will be launched. If there is already one available, this step can be skipped. The Hive Metastore URI will be needed to configure the Trino catalogs.

Use helm to create a stand-alone Hive metastore.

Add the helm repository by running:

$ helm repo add heva-helm-charts https://hevaweb.github.io/heva-helm-charts/

Install the Hive Metastore by running:

$ helm install hive-metastore heva-helm-charts/hive-metastore

Check the status of the Hive pods:

$ kubectl get pods

The hive-metastore-db-init-schema-* pods are known to end in an Error status; this does not impact the rest of the workflow.

When finished with the example, the Hive Metastore can be removed by running:

$ helm uninstall hive-metastore

Configure AWS credentials for S3

If using S3, provide AWS credentials by editing the configmap:

$ kubectl edit configmap hive-metastore

In the editor, search for the hive-site.xml properties fs.s3a.access.key and fs.s3a.secret.key and populate them with AWS credentials that can access your S3 bucket.
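
To confirm the values were saved, the configmap can be inspected; a quick check could look like:

$ kubectl get configmap hive-metastore -o yaml | grep -A 2 'fs.s3a'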

Restart the Hive Metastore pod by deleting it:

$ kubectl delete pod hive-metastore-0

The pod will be recreated automatically with the updated configuration.

Configure Trino

Create a trino-alluxio.yaml configuration file in the installation directory to define the Alluxio-specific configuration. This file is broken down below into several parts that can be joined into a single YAML file.

image and server

image:
  registry: <PRIVATE_REGISTRY>
  repository: <PRIVATE_REPOSITORY>
  tag: <TRINO_VERSION>
server:
  workers: 2
  log:
    trino:
      level: INFO
  config:
    query:
      maxMemory: "4GB"

Specify the location of the custom Trino image that was previously built and pushed. Customize the Trino server specifications as needed.

additionalCatalogs

additionalCatalogs:
  memory: |
    connector.name=memory
    memory.max-data-per-node=128MB

In addition to the memory catalog shown above, add the catalog configuration for each connector that will be used under additionalCatalogs. If the Hive Metastore was launched as described in the previous section, the value of hive.metastore.uri will be thrift://hive-metastore.default:9083.

Hive
  # connectors using alluxio
  hive: |
    connector.name=hive
    # use the uri of your hive-metastore deployment
    hive.metastore.uri=thrift://<HIVE_METASTORE_URI>
    hive.config.resources=/etc/trino/alluxio-core-site.xml
    hive.non-managed-table-writes-enabled=true
    hive.s3-file-system-type=HADOOP_DEFAULT

If using Trino version 458 or later, also add fs.hadoop.enabled=true to this catalog configuration.

Delta Lake
  # connectors using alluxio
  delta_lake: |
    connector.name=delta_lake
    hive.metastore.uri=thrift://<HIVE_METASTORE_URI>
    hive.config.resources=/etc/trino/alluxio-core-site.xml
    hive.s3-file-system-type=HADOOP_DEFAULT
    delta.enable-non-concurrent-writes=true
    delta.vacuum.min-retention=1h
    delta.register-table-procedure.enabled=true
    delta.extended-statistics.collect-on-write=false

If using Trino version 458 or later, also add fs.hadoop.enabled=true to this catalog configuration.

Iceberg
  # connectors using alluxio
  iceberg: |
    connector.name=iceberg
    hive.metastore.uri=thrift://<HIVE_METASTORE_URI>
    hive.config.resources=/etc/trino/alluxio-core-site.xml

If using Trino version 458 or later, also add fs.hadoop.enabled=true to this catalog configuration.

coordinator and worker

The coordinator and worker configurations both require similar modifications to their additionalJVMConfig and additionalConfigFiles sections.

For the coordinator:

coordinator:
  jvm:
    maxHeapSize: "8G"
    gcMethod:
      type: "UseG1GC"
      g1:
        heapRegionSize: "32M"
  config:
    memory:
      heapHeadroomPerNode: ""
    query:
      maxMemoryPerNode: "1GB"
  additionalJVMConfig:
  - "-Dalluxio.home=/usr/lib/trino/alluxio"
  - "-Dalluxio.conf.dir=/etc/trino/"
  - "-XX:+UnlockDiagnosticVMOptions"
  - "-XX:G1NumCollectionsKeepPinned=10000000"
  additionalConfigFiles:
    alluxio-core-site.xml: |
      <configuration>
        <property>
          <name>fs.s3a.impl</name>
          <value>alluxio.hadoop.FileSystem</value>
        </property>
        <property>
          <name>fs.hdfs.impl</name>
          <value>alluxio.hadoop.FileSystem</value>
        </property>
      </configuration>
    alluxio-site.properties: |
      alluxio.etcd.endpoints: http://<ETCD_POD_NAME>:2379
      alluxio.cluster.name: default-alluxio
      alluxio.k8s.env.deployment: true
      alluxio.mount.table.source: ETCD
      alluxio.worker.membership.manager.type: ETCD
    metrics.properties: |
      # Enable the Alluxio Jmx sink
      sink.jmx.class=alluxio.metrics.sink.JmxSink

Note the two additional JVM options that define the alluxio.home and alluxio.conf.dir properties.

Under additionalConfigFiles, three Alluxio configuration files are defined.

  • alluxio-core-site.xml: This XML file defines which Hadoop FileSystem implementation is used for specific file URI schemes. Setting the value to alluxio.hadoop.FileSystem ensures that file operations for those schemes are routed through the Alluxio filesystem client. Depending on the UFS, set the corresponding configuration:

    • For S3, set fs.s3a.impl to alluxio.hadoop.FileSystem

    • For HDFS, set fs.hdfs.impl to alluxio.hadoop.FileSystem

  • alluxio-site.properties: The properties file defines the Alluxio specific configuration properties for the Alluxio client. The next section will describe how to populate this file.

  • metrics.properties: Adding this single line enables Alluxio-specific metrics to be scraped from Trino.

For the worker, include the same two additional JVM options and the same three Alluxio configuration files under additionalConfigFiles.

worker:
  jvm:
    maxHeapSize: "8G"
    gcMethod:
      type: "UseG1GC"
      g1:
        heapRegionSize: "32M"
  config:
    memory:
      heapHeadroomPerNode: ""
    query:
      maxMemoryPerNode: "1GB"
  additionalJVMConfig:
  - "-Dalluxio.home=/usr/lib/trino/alluxio"
  - "-Dalluxio.conf.dir=/etc/trino/"
  - "-XX:+UnlockDiagnosticVMOptions"
  - "-XX:G1NumCollectionsKeepPinned=10000000"
  additionalConfigFiles:
    alluxio-core-site.xml: |
      <configuration>
        <property>
          <name>fs.s3a.impl</name>
          <value>alluxio.hadoop.FileSystem</value>
        </property>
        <property>
          <name>fs.hdfs.impl</name>
          <value>alluxio.hadoop.FileSystem</value>
        </property>
      </configuration>
    alluxio-site.properties: |
      alluxio.etcd.endpoints: http://<ETCD_POD_NAME>:2379
      alluxio.cluster.name: default-alluxio
      alluxio.k8s.env.deployment: true
      alluxio.mount.table.source: ETCD
      alluxio.worker.membership.manager.type: ETCD
    metrics.properties: |
      sink.jmx.class=alluxio.metrics.sink.JmxSink

alluxio-site.properties

For the Trino cluster to properly communicate with the Alluxio cluster, certain properties must be aligned between the Alluxio client and Alluxio server.

To show alluxio-site.properties from the Alluxio cluster's configmap, run:

$ kubectl get configmap alluxio-alluxio-conf -o yaml

where alluxio-alluxio-conf is the name of the configmap for the Alluxio cluster.
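
To pull out just the properties of interest, the output can also be filtered, for example:

$ kubectl get configmap alluxio-alluxio-conf -o yaml | grep -E 'alluxio\.(etcd\.endpoints|cluster\.name|k8s\.env\.deployment|mount\.table\.source|worker\.membership\.manager\.type)'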

From this output, copy the values of the following properties into the coordinator's and worker's alluxio-site.properties:

  • alluxio.etcd.endpoints: The ETCD hosts of the Alluxio cluster. Assuming the hostnames are accessible from the Trino cluster, the same value can be used for the Trino cluster's alluxio-site.properties.

  • alluxio.cluster.name: This must be set to exactly the same value for the Trino cluster.

  • alluxio.k8s.env.deployment=true

  • alluxio.mount.table.source=ETCD

  • alluxio.worker.membership.manager.type=ETCD

If following the Install Alluxio on Kubernetes instructions, the values should be:

    alluxio-site.properties: |
      alluxio.etcd.endpoints: http://alluxio-etcd.default:2379
      alluxio.cluster.name: default-alluxio
      alluxio.k8s.env.deployment: true
      alluxio.mount.table.source: ETCD
      alluxio.worker.membership.manager.type: ETCD

Launch Trino

Add the Trino helm repository by running:

$ helm repo add trino https://trinodb.github.io/charts

To deploy Trino, run:

$ helm install -f trino-alluxio.yaml trino-cluster trino/trino

Check the status of the Trino pods and identify the pod name of the Trino coordinator:

$ kubectl get pods

When finished with the example, the Trino cluster can be removed by running:

$ helm uninstall trino-cluster

Executing Queries

To enter the Trino coordinator pod, run:

$ kubectl exec -it $POD_NAME -- /bin/bash

where $POD_NAME is the full pod name of the Trino coordinator, starting with trino-cluster-trino-coordinator-
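
For example, assuming the trino-cluster release name used above, the pod name can be captured with:

$ POD_NAME=$(kubectl get pods -o name | grep trino-coordinator | sed 's|^pod/||')
$ echo $POD_NAME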

Inside the Trino pod, run:

$ trino

To create a simple schema in your UFS mount, run:

trino> CREATE SCHEMA hive.example WITH (location = '<UFS_URI>');

where <UFS_URI> is a path within one of the defined Alluxio mounts. Using an S3 bucket as an example, this could be s3a://myBucket/myPrefix/.
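
As a quick end-to-end check, a table can be created in the new schema and queried back; the table and column names below are arbitrary examples:

trino> CREATE TABLE hive.example.greetings (message varchar);
trino> INSERT INTO hive.example.greetings VALUES ('hello from alluxio');
trino> SELECT * FROM hive.example.greetings;

Because fs.s3a.impl (or fs.hdfs.impl) is mapped to alluxio.hadoop.FileSystem in alluxio-core-site.xml, these reads and writes are routed through Alluxio, which caches frequently accessed data.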
