Deploy
Alluxio can be run on Kubernetes. This guide demonstrates how to run Alluxio on Kubernetes using the specification included in the Alluxio Docker image or using Helm.
A Kubernetes cluster (version 1.11+) with Beta feature gate APIs enabled
The Helm chart, from which the Kubernetes resource specifications are built, supports Kubernetes version 1.11+.
Beta feature gates are enabled by default for Kubernetes cluster installations
Cluster access to an Alluxio Docker image. If using a private Docker registry, refer to the Kubernetes private image registry documentation.
Ensure the cluster's network configuration allows for connectivity between applications (Alluxio clients) and the Alluxio Pods on the defined ports.
This tutorial walks through a basic Alluxio setup on Kubernetes. Alluxio supports two methods of installation on Kubernetes: using helm charts or using kubectl. When available, helm is the preferred way to install Alluxio. If helm is not available or if additional deployment customization is desired, kubectl can be used directly with native Kubernetes resource specifications.
Note: From Alluxio 2.3 on, Alluxio only supports helm 3. See how to migrate from helm 2 to 3.
If using persistent volumes, the status of the volume(s) should change to Bound, and the status of the volume claims should also be Bound. You can validate the status of your PersistentVolumes and PersistentVolumeClaims using the following kubectl commands:
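For example (the Alluxio volume and claim names depend on your Helm release and namespace):

```console
$ kubectl get pv
$ kubectl get pvc
```

Both commands should report a STATUS of Bound for the Alluxio journal and worker volumes once the claims are satisfied.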
Once ready, access the Alluxio CLI from the master Pod and run basic I/O tests.
From the master Pod, execute the following:
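A minimal sketch, assuming the default master Pod name alluxio-master-0 and the default installation path /opt/alluxio inside the container (adjust both for your release):

```console
$ kubectl exec -ti alluxio-master-0 -- /bin/bash
# inside the master container
$ cd /opt/alluxio
$ ./bin/alluxio runTests
```

The runTests command writes and reads a set of files through Alluxio and reports any failures, making it a quick sanity check for a new deployment.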
The Alluxio UI can be accessed from outside the Kubernetes cluster using port forwarding.
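A sketch of the port-forward command, substituting a free local port for <local-port> and the master index for $i (the Pod name assumes the chart's default naming):

```console
$ kubectl port-forward alluxio-master-$i <local-port>:19999
```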
The command above allocates a port on the local node <local-port> and forwards traffic on <local-port> to port 19999 of pod alluxio-master-$i. The pod alluxio-master-$i does NOT have to be on the node where you run this command.
Note: i=0 for the first master Pod. When running multiple masters, forward a port for each master. Only the primary master serves the Web UI.
For example, suppose you are on a node with hostname master-node-1 and you would like to serve the Alluxio master web UI for alluxio-master-0 on master-node-1:8080. Here's the command you can run:
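A sketch of the command, assuming port 8080 is free on master-node-1 (--address 0.0.0.0 makes the forwarded port reachable from other hosts, not just localhost):

```console
$ kubectl port-forward --address 0.0.0.0 pods/alluxio-master-0 8080:19999
```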
This forwards local port 8080 on master-node-1 to port 19999 on the Pod alluxio-master-0. The Pod alluxio-master-0 does NOT need to be running on master-node-1.
You will see messages like the following when there are incoming connections.
You can terminate the process to stop the port forwarding with either Ctrl + C or kill.
Short-circuit access enables clients to perform read and write operations directly against the worker, bypassing the network interface. For performance-critical applications, it is recommended to enable short-circuit operations against Alluxio because they can increase a client's read and write throughput when the client is co-located with an Alluxio worker.
This feature is enabled by default (see the next section to disable it), but it requires extra configuration to work properly in Kubernetes environments.
There are two modes for using short-circuit.
In this mode, the Alluxio client and local Alluxio worker recognize each other if the client hostname matches the worker hostname. This is called Hostname Introspection. In this mode, the Alluxio client and local Alluxio worker share the tiered storage of the Alluxio worker.
This is the default policy used for short-circuit in Kubernetes.
If the client or worker container is using virtual networking, their hostnames may not match. In such a scenario, set the following property to use filesystem inspection to enable short-circuit operations and make sure the client container mounts the directory specified as the domain socket path. Short-circuit writes are then enabled if the worker UUID is located on the client filesystem.
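The property in question is shown below as it would appear under the Helm chart's properties block; the domain socket path /opt/domain is illustrative and should match the volume mounted into both the worker and client containers:

```yaml
properties:
  alluxio.worker.data.server.domain.socket.address: /opt/domain
  alluxio.worker.data.server.domain.socket.as.uuid: true
```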
Domain Socket Path. The domain socket is a volume which should be mounted on:
All Alluxio workers
All application containers which intend to read/write through Alluxio
This domain socket volume can be either a PersistentVolumeClaim or a hostPath Volume.
Use PersistentVolumeClaim. By default, this domain socket volume is a PersistentVolumeClaim. You need to provision a PersistentVolume to bind to this PersistentVolumeClaim, and this PersistentVolume should be of type local or hostPath.
Use hostPath Volume. You can also directly define the workers to use a hostPath Volume for the domain socket.
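A minimal sketch of how these two options might look in the Helm chart's values.yaml (the shortCircuit key names are assumptions based on the chart layout and may differ in your chart version):

```yaml
shortCircuit:
  enabled: true
  policy: uuid              # use filesystem inspection rather than hostname matching
  size: 1Mi
  # Option 1: back the domain socket with a PersistentVolumeClaim
  volumeType: persistentVolumeClaim
  pvcName: alluxio-worker-domain-socket
  # Option 2: back the domain socket with a hostPath volume instead
  # volumeType: hostPath
  # hostPath: /tmp/alluxio-domain
```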
To verify short-circuit reads and writes, monitor the metrics displayed under:
the Metrics tab of the web UI as Domain Socket Alluxio Read and Domain Socket Alluxio Write
How to disable short-circuit operations depends on how you deploy Alluxio.
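For example, if you deployed with Helm, a hedged sketch of the toggle is shown below (the key name is assumed from the chart layout); for kubectl-based deployments the equivalent is to set the corresponding Alluxio client property in the ConfigMap:

```yaml
# values.yaml (Helm)
shortCircuit:
  enabled: false

# or, as an Alluxio property for kubectl-based deployments:
# alluxio.user.short.circuit.enabled=false
```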
Note: As mentioned, disabling short-circuit access for Alluxio workers will result in worse I/O throughput.
Verify log server
You can go into the log server pod and verify the logs exist.
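A minimal sketch of how to do this (the pod name is a placeholder, and the log directory path is an assumption that may differ in your deployment):

```console
# find the log server pod name
$ kubectl get pods | grep logserver
# list the collected logs inside the pod (path is an assumption)
$ kubectl exec -ti <logserver-pod-name> -- ls -l /opt/alluxio/logs
```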
One way to use the POSIX API is to deploy the Alluxio FUSE daemon, creating pods running Alluxio Fuse processes at deployment time. The Fuse processes are long-running.
To access data in Alluxio inside application containers, simply mount Alluxio with a hostPath mount at location /mnt/alluxio-fuse.
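An illustrative application pod spec fragment (the container name and image are placeholders; /mnt/alluxio-fuse is the default Fuse mount point on the host):

```yaml
containers:
  - name: my-app                 # hypothetical application container
    image: busybox
    volumeMounts:
      - name: alluxio-fuse-mount
        mountPath: /mnt/alluxio-fuse
volumes:
  - name: alluxio-fuse-mount
    hostPath:
      path: /mnt/alluxio-fuse    # the Fuse mount created by the Alluxio FUSE daemon
      type: Directory
```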
Other than using the Alluxio FUSE daemon, you can also use CSI to mount the Alluxio FileSystem into application containers. Unlike the Fuse daemon, which is a long-running process, the Fuse pod launched by CSI has the same life cycle as the application pods that mount Alluxio as a volume. The Fuse pod is automatically launched when an application pod mounts Alluxio inside itself, and automatically terminated when such application pods are terminated.
Step 1: Customize configurations
alluxioPath: The path in Alluxio which will be mounted
mountInPod: Set to true to launch the Fuse process in an alluxio-fuse pod. Otherwise, it runs in the same container as the nodeserver
mountPath: The path that Alluxio will be mounted to in the application container
mountOptions: The Alluxio Fuse mount options
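Put together, a hedged sketch of the csi block in values.yaml (only the documented keys are shown; other keys and defaults depend on your chart version):

```yaml
csi:
  alluxioPath: /            # Alluxio path to expose through the volume
  mountInPod: true          # run the Fuse process in a dedicated alluxio-fuse pod
  mountPath: /mnt/alluxio   # mount point inside the application container
  mountOptions: allow_other
```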
Step 2: Deploy CSI services. You can use Helm to start the Alluxio CSI components with the Alluxio cluster, use kubectl to create the resources manually, or mix the two approaches.
Step 3: Provisioning
Step 4: Deploy applications
Now you can put the PVC name in your application pod spec to use the Alluxio FileSystem.
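For illustration, a pod spec fragment referencing a hypothetical PVC named alluxio-pvc (use the claim name you created during provisioning):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: alluxio-csi-app          # hypothetical
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: alluxio-volume
          mountPath: /data       # Alluxio appears here inside the container
  volumes:
    - name: alluxio-volume
      persistentVolumeClaim:
        claimName: alluxio-pvc   # hypothetical PVC created in Step 3
```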
One can use either helm or kubectl to set up Alluxio proxy servers inside a Kubernetes cluster.
In use cases where you wish to install Alluxio masters and workers separately with the Helm chart, use the following respective toggles:
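A sketch of the toggles, assuming the chart exposes per-component enabled flags (verify the key names against your chart version):

```yaml
# install only the masters
master:
  enabled: true
worker:
  enabled: false

# or install only the workers
# master:
#   enabled: false
# worker:
#   enabled: true
```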
The following options are provided in our Helm chart as additional parameters for experienced Kubernetes users.
You should have helm 3.X installed. You can install helm following the installation instructions.
To mount an under storage at the root of the Alluxio namespace, specify all required properties as key-value pairs under properties.
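For example, to use an S3 bucket as the root under storage (the bucket, path and credentials are placeholders; other under stores use their own properties):

```yaml
properties:
  alluxio.master.mount.table.root.ufs: "s3a://<bucket>/<folder>"
  aws.accessKeyId: "<accessKey>"
  aws.secretKey: "<secretKey>"
```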
The following configures the Alluxio master journal with a persistent volume claim mounted locally to the master Pod at location /journal.
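A hedged sketch of the corresponding journal block in values.yaml (key names are assumptions based on the chart layout; adjust the size and storage class for your cluster):

```yaml
journal:
  type: "UFS"
  ufsType: "local"
  folder: "/journal"                    # mounted in the master Pod at this path
  size: 1Gi
  volumeType: persistentVolumeClaim     # back the journal with a PVC
  storageClass: "standard"
  accessModes:
    - ReadWriteOnce
```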
The following configures the Alluxio master journal with an emptyDir volume mounted locally to the master Pod at location /journal.
Note: An emptyDir volume has the same lifetime as the Pod. It is NOT persistent storage. The Alluxio journal will be LOST when the Pod is restarted or rescheduled. Please only use this for experimental use cases. Check the Kubernetes emptyDir documentation for more details.
Note: An emptyDir volume has the same lifetime as the Pod. It is NOT persistent storage. The Alluxio metadata will be LOST when the Pod is restarted or rescheduled. Please only use this for experimental use cases. Check the Kubernetes emptyDir documentation for more details.
Alluxio manages local storage, including memory, on the worker Pods. Tiered storage can be configured using the following reference configurations.
There are 3 supported volume types: hostPath, emptyDir and persistentVolumeClaim.
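For instance, a single memory tier backed by an emptyDir volume might look like the following in values.yaml (key names are assumptions based on the chart layout; the quota and watermarks should match your worker memory budget):

```yaml
tieredstore:
  levels:
    - level: 0
      alias: MEM
      mediumtype: MEM
      path: /dev/shm        # ramdisk path inside the worker Pod
      type: emptyDir
      quota: 1G
      high: 0.95            # high watermark that triggers eviction
      low: 0.7              # low watermark that eviction targets
```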
Note: If a hostPath file or directory is created at runtime, it can only be used by the root user. hostPath volumes do not have resource limits. You can either run Alluxio containers as root or make sure the local paths exist and are accessible to the user alluxio with UID and GID 1000. You can find more details in the Kubernetes hostPath documentation.
You can also use PVCs for each tier and provision PersistentVolumes. For worker tiered storage, please use either hostPath or local volumes so that the worker will read and write locally to achieve the best performance.
Note: There is one PVC per tier. When the PVC is bound to a PV of type hostPath or local, each worker Pod will resolve to the local path on the Node. Please also note that a local volume requires nodeAffinity, and Pods using this volume can only run on the Nodes specified in the nodeAffinity rule of this volume. You can find more details in the Kubernetes local volume documentation.
Additional configuration is required for the Alluxio Worker pod to be ready for use. See the section on enabling worker short-circuit access.
singleMaster means the templates generate 1 Alluxio master process, while multiMaster means 3. embedded and ufs are the 2 journal types that Alluxio supports.
For customized templated YAMLs, see the instructions on how to use helm-generate.sh. Otherwise you may manually write or modify YAML files as you see fit.
Step 1: Add hostAliases for your HDFS connection. Kubernetes Pods don't recognize network hostnames that are not managed by Kubernetes (i.e. not a Kubernetes Service) unless specified via hostAliases.
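An illustrative hostAliases entry for the master and worker Pod specs (the IP address and hostname are placeholders for your HDFS namenodes):

```yaml
spec:
  hostAliases:
    - ip: "10.1.0.100"            # hypothetical namenode address
      hostnames:
        - "hdfs-namenode-1"       # hypothetical hostname not resolvable inside Kubernetes
```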
Additional configuration is required for the Alluxio Worker pod to be ready for use. See the section on enabling worker short-circuit access.
Each released Alluxio version will have the corresponding docker image released on Docker Hub.
Check whether the Alluxio master journal has to be formatted for the target version. If no format is needed, you can skip the rest of this section and move on to restarting all Alluxio master and worker Pods.
You can follow the journal formatting instructions to format the Alluxio journals.
If you are running Alluxio workers with tiered storage, and you have Persistent Volumes configured for Alluxio, the storage should be cleaned up too. You should delete and recreate the Persistent Volumes.
You can do more comprehensive verification following the verification steps described earlier in this guide.
If you have unbound PersistentVolumeClaims, please ensure you have provisioned matching PersistentVolumes. See "(Optional) Provision a Persistent Volume" above.
For more information about K8s port-forward, see the Kubernetes documentation.
or, the metrics JSON output as cluster.BytesReadDomain and cluster.BytesWrittenDomain
or, the fsadmin metrics CLI as Short-circuit Read (Domain Socket) and Alluxio Write (Domain Socket)
Alluxio supports a centralized log server that collects logs for all Alluxio processes. You can find the specific details in the logging documentation. This can be enabled on K8s too, so that all Alluxio pods will send logs to this log server.
Once Alluxio is deployed on Kubernetes, there are multiple ways in which a client application can connect to it. For applications using the POSIX API, application containers can simply mount the Alluxio FileSystem.
mountOptions: The Fuse mount options. Defaults to allow_other. See the Fuse mount options documentation for more details.
Then follow the steps to install Alluxio with helm.
The Alluxio POSIX API documentation provides more details about how to configure the Alluxio POSIX API.
In order to use CSI, you need a Kubernetes cluster with version at least 1.17, with RBAC enabled in the API Server.
You can either use the default CSI configurations provided in values.yaml under the csi section, or you can customize them to make them suitable for your workload. Here are some common properties that you can customize:
Modify or add any configuration properties inside values.yaml, then use helm-generate.sh (see above for usage) to generate the related templates. All CSI related templates will be under ${ALLUXIO_HOME}/integration/kubernetes/csi.
We provide templates for both Kubernetes dynamic provisioning and static provisioning. Please choose the suitable provisioning method according to your use case. You can refer to the Kubernetes documentation on dynamic and static provisioning for more details.
For more information on how to configure a pod to use a persistent volume for storage in Kubernetes, please refer to the Kubernetes documentation.
Kubernetes will assign the namespace's default ServiceAccount to new pods in a namespace. You may specify that Alluxio pods use any existing ServiceAccount in your cluster through the following:
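A sketch of the value (the key name serviceAccount is an assumption based on the chart layout; the account name is a placeholder for one that already exists in your namespace):

```yaml
serviceAccount: alluxio-service-account   # hypothetical existing ServiceAccount
```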
Kubernetes provides many options to control the scheduling of pods onto nodes in the cluster. The most direct of these is a node selector.
However, Kubernetes will avoid scheduling pods on any tainted nodes. To allow certain pods to schedule on such nodes, Kubernetes allows you to specify tolerations for those taints. See the Kubernetes documentation on taints and tolerations for more details.
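A hedged sketch combining the two, assuming your chart version exposes nodeSelector and tolerations values (the label and taint shown are placeholders you would define on your nodes):

```yaml
nodeSelector:
  alluxio-role: worker            # hypothetical node label
tolerations:
  - key: "dedicated"              # hypothetical taint applied to the nodes
    operator: "Equal"
    value: "alluxio"
    effect: "NoSchedule"
```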
If you wish to add or override hostname resolution in the pods, Kubernetes exposes the containers' /etc/hosts file via hostAliases. This can be particularly useful for providing hostname addresses for services not managed by Kubernetes, like HDFS.
Kubernetes will use the 'RollingUpdate' deployment strategy to progressively upgrade Pods when changes are detected.
Kubernetes supports pulling container images from private registries. After creating the registry credentials Secret in Kubernetes, you pass the secret to your Pods via imagePullSecrets.
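For example, in the Helm values this might look like the following (the key name imagePullSecrets is an assumption based on the chart layout; the Secret name is a placeholder):

```yaml
imagePullSecrets:
  - my-registry-secret   # hypothetical Secret created with kubectl create secret docker-registry
```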
From Alluxio v2.1 on, Alluxio Docker containers will run as the non-root user alluxio with UID 1000 and GID 1000 by default. Kubernetes hostPath volumes are only writable by root, so you need to update the permissions accordingly.
This is most likely caused by the Kubernetes resource limits being configured with limits.memory set too low.
If you used the Helm chart, the default resource limits are:
Even if you did not configure any values with Helm, you may still have resource limits in place due to a LimitRange applied to your namespace.
Isolating Alluxio worker Pods from other Pods in your Kubernetes cluster can be accomplished with the help of node selectors and taints and tolerations.
Keep in mind that the Alluxio worker Pod definition uses a DaemonSet, so there will be worker Pods assigned to all eligible nodes.
Next, verify the Alluxio workers' configured ramdisk sizes (if any). See the worker tiered storage section for additional details.
As stated in the Kubernetes documentation, if no size is specified then memory-backed emptyDir volumes will have capacity allocated equal to half the available memory on the host node. This capacity is reflected inside of your containers (for example when running df). However, if the combined size of your ramdisk and container memory usage exceeds the pod's limits.memory, then Kubernetes will OOMKill that pod. This is a very likely overlooked source of memory consumption in Alluxio worker Pods.
See the Kubernetes documentation for more details.
Aside: There is currently a known issue in Alluxio where Alluxio's interpretation of byte sizes differs from Kubernetes' (Kubernetes distinguishes between binary "-bibyte" units such as Gi and decimal units such as G). This is unlikely to cause OOMKilled errors unless you are operating on very tight memory margins.
It is a known issue that in some early versions of Java 8, the JVM running in a container will determine its heap size (if not specified with -Xmx and -Xms) based on the memory of the physical host instead of the container. In that case, the JVM may attempt to use more memory than the container resource limit and get killed. You can find more detailed explanations of this behavior online.
Since Java 8u131, some JVM flags can be turned on in order to correctly read the memory limits from cgroups. You can refer to the values.yaml from our Helm chart template and uncomment the options below. These options will be added to the JVM options of all Alluxio containers, including the masters and workers. You can find more detailed explanations of these flags in the JVM documentation.
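The flags in question are the standard container-awareness options for Java 8u131+; a sketch of how they might appear under a jvmOptions-style key in values.yaml (the exact key name depends on your chart version):

```yaml
jvmOptions:
  - "-XX:+UnlockExperimentalVMOptions"      # required to enable the experimental flag below on Java 8
  - "-XX:+UseCGroupMemoryLimitForHeap"      # size the default heap from the cgroup memory limit
```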