This documentation focuses on how to configure the different metrics sinks of Alluxio deployed on Kubernetes and how to retrieve metrics from them.
The metrics are exposed through the web ports of different components.
The web ports of Alluxio masters and workers are opened by default.
The web port of Alluxio standalone Fuse is not opened by default. It can be opened by setting alluxio.fuse.web.enabled to true.
Get Metrics Snapshot
You can send an HTTP request to an Alluxio process to get a snapshot of the metrics in JSON format.
# Get the metrics in JSON format from one specific component
$ kubectl exec <COMPONENT_HOSTNAME> -c <CONTAINER_NAME> -- curl 127.0.0.1:<COMPONENT_WEB_PORT>/metrics/json/

# For example, get the metrics of the leading master with default web port 19999
$ kubectl exec <alluxio-master-x> -c alluxio-master -- curl 127.0.0.1:19999/metrics/json/

# Get the metrics of a worker with default web port 30000
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-worker -- curl 127.0.0.1:30000/metrics/json/

# Get the metrics of the leading job master with default web port 20002
$ kubectl exec <alluxio-master-x> -c alluxio-job-master -- curl 127.0.0.1:20002/metrics/json/

# Get the metrics of a job worker with default web port 30003
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-job-worker -- curl 127.0.0.1:30003/metrics/json/

# Get the metrics of a fuse process with default web port 49999
$ kubectl exec <alluxio-fuse-xxxxx> -- curl 127.0.0.1:49999/metrics/json/
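The per-component commands above differ only in pod, container, and port, so they can be wrapped in a small helper. The following is a minimal sketch and not part of Alluxio: the function names `alluxio_web_port` and `alluxio_metrics_json` are hypothetical, the ports are the defaults listed above, and the sketch assumes each process serves its web port on the loopback address inside its container. The `KUBECTL` variable is likewise an assumption that lets you substitute another binary.

```shell
# Hypothetical helpers; ports are the documented defaults and may differ in your cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Print the default web port for a component name.
alluxio_web_port() {
  case "$1" in
    master)     echo 19999 ;;
    worker)     echo 30000 ;;
    job-master) echo 20002 ;;
    job-worker) echo 30003 ;;
    fuse)       echo 49999 ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Fetch the JSON metrics snapshot from inside a pod.
# Usage: alluxio_metrics_json <pod> <component>
alluxio_metrics_json() {
  local pod="$1" component="$2" port
  port="$(alluxio_web_port "$component")" || return 1
  if [ "$component" = "fuse" ]; then
    # The fuse example passes no -c flag, so none is passed here either.
    "$KUBECTL" exec "$pod" -- curl -s "127.0.0.1:${port}/metrics/json/"
  else
    # Container names follow the alluxio-<component> pattern used in the examples.
    "$KUBECTL" exec "$pod" -c "alluxio-${component}" -- curl -s "127.0.0.1:${port}/metrics/json/"
  fi
}
```

For instance, `alluxio_metrics_json alluxio-master-0 master` would run the leading-master command with the default port filled in.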
Master Web UI Metrics
Besides the raw metrics exposed through the metrics servlet or a custom metrics configuration, users can track key cluster performance metrics in a more human-readable form in the web interface of the Alluxio leading master.
To access the web UI of the leading master, you need to forward a port on a physical machine to the master web port. For example, if your leading master is running in pod alluxio-master-0 and you want to serve the Alluxio master web UI at node master-node-1 on port 8080, you can run the following command on your control plane:
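The forwarding step described above can be sketched with `kubectl port-forward`. This is a hedged example rather than the exact command from the original guide: the wrapper name `forward_master_ui` is hypothetical, `--address 0.0.0.0` is an assumption that makes the forwarded port reachable from other hosts, and 19999 is the default master web port.

```shell
KUBECTL="${KUBECTL:-kubectl}"

# Forward a port on this machine to the master web port of a pod.
# Usage: forward_master_ui [pod] [local_port] [master_web_port]
forward_master_ui() {
  local pod="${1:-alluxio-master-0}"
  local local_port="${2:-8080}"
  local web_port="${3:-19999}"
  # --address 0.0.0.0 listens on all interfaces (assumption; drop it to bind localhost only).
  "$KUBECTL" port-forward --address 0.0.0.0 "pod/${pod}" "${local_port}:${web_port}"
}
```

Running `forward_master_ui` on master-node-1 would then serve the web UI on port 8080 of that node until the command is interrupted.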
Prometheus is a monitoring tool that can track changes in Alluxio metrics over time.
Alluxio Helm Chart Configuration
PrometheusMetricsServlet needs to be enabled in Alluxio for Prometheus to scrape metrics. Set the following properties in the Helm chart values.yaml to enable the Prometheus metrics sink:
Note that, as with the HTTP JSON sink, the Fuse web port must be opened to access its metrics by setting alluxio.fuse.web.enabled to true.
Prometheus Client Configuration
For a Prometheus client to get the metrics from Alluxio, configure the prometheus.yml of the client. For example, to read the master metrics:
scrape_configs:
  - job_name: 'alluxio master'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio  # Only look at pods in namespace named `alluxio`
    relabel_configs:
      # Only check the pods with role `alluxio-master`
      - source_labels: [__meta_kubernetes_pod_label_role]
        action: keep
        regex: alluxio-master
      # Only check the pods whose annotation `prometheus.io/scrape` is true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the value of `prometheus.io/path` in podAnnotation for endpoint
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use the value of `prometheus.io/masterWebPort` in podAnnotation for port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_masterWebPort]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node
      - source_labels: [__meta_kubernetes_pod_label_release]
        action: replace
        target_label: cluster_name
To read other components' metrics, use the respective pod role label and web port label.
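Prometheus exposes each pod annotation as a meta label whose name is the annotation key with unsupported characters replaced by underscores, which is how `prometheus.io/masterWebPort` becomes `__meta_kubernetes_pod_annotation_prometheus_io_masterWebPort`. The sketch below prints the role regex and the port source label to substitute for a given component. The function names are hypothetical, and the mapping is taken from the example configurations in this section.

```shell
# Convert a pod annotation key into the corresponding Prometheus meta label name.
annotation_to_label() {
  printf '__meta_kubernetes_pod_annotation_%s\n' \
    "$(printf '%s' "$1" | sed 's/[^A-Za-z0-9_]/_/g')"
}

# Print "<role regex> <port source label>" for a component.
# Job master and job worker run in the master and worker pods respectively,
# so they share the pod role and differ only in the port annotation.
scrape_labels_for() {
  case "$1" in
    master)     echo "alluxio-master $(annotation_to_label prometheus.io/masterWebPort)" ;;
    worker)     echo "alluxio-worker $(annotation_to_label prometheus.io/workerWebPort)" ;;
    job-master) echo "alluxio-master $(annotation_to_label prometheus.io/jobMasterWebPort)" ;;
    job-worker) echo "alluxio-worker $(annotation_to_label prometheus.io/jobWorkerWebPort)" ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}
```

The first word of the output goes into the `regex` of the role `keep` rule, and the second replaces the port label in the `__address__` rewrite rule.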
Worker metrics
An example configuration for reading worker metrics:
scrape_configs:
  - job_name: 'alluxio worker'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio  # Only look at pods in namespace named `alluxio`
    relabel_configs:
      # Only look at the pods with role `alluxio-worker`
      - source_labels: [__meta_kubernetes_pod_label_role]
        action: keep
        regex: alluxio-worker
      # Only check the pods whose annotation `prometheus.io/scrape` is true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the value of `prometheus.io/path` in podAnnotation for endpoint
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use the value of `prometheus.io/workerWebPort` in podAnnotation for port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_workerWebPort]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node
      - source_labels: [__meta_kubernetes_pod_label_release]
        action: replace
        target_label: cluster_name
Job master metrics
An example configuration for reading job master metrics:
scrape_configs:
  - job_name: 'alluxio job master'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio  # Only look at pods in namespace named `alluxio`
    relabel_configs:
      # Only look at the pods with role `alluxio-master`
      - source_labels: [__meta_kubernetes_pod_label_role]
        action: keep
        regex: alluxio-master
      # Only check the pods whose annotation `prometheus.io/scrape` is true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the value of `prometheus.io/path` in podAnnotation for endpoint
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use the value of `prometheus.io/jobMasterWebPort` in podAnnotation for port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_jobMasterWebPort]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node
      - source_labels: [__meta_kubernetes_pod_label_release]
        action: replace
        target_label: cluster_name
Job worker metrics
An example configuration for reading job worker metrics:
scrape_configs:
  - job_name: 'alluxio job worker'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio  # Only look at pods in namespace named `alluxio`
    relabel_configs:
      # Only look at the pods with role `alluxio-worker`
      - source_labels: [__meta_kubernetes_pod_label_role]
        action: keep
        regex: alluxio-worker
      # Only check the pods whose annotation `prometheus.io/scrape` is true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the value of `prometheus.io/path` in podAnnotation for endpoint
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use the value of `prometheus.io/jobWorkerWebPort` in podAnnotation for port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_jobWorkerWebPort]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node
      - source_labels: [__meta_kubernetes_pod_label_release]
        action: replace
        target_label: cluster_name
Get Metrics Snapshot
You can send an HTTP request to the Prometheus endpoint of an Alluxio process to get a snapshot of the metrics in Prometheus format.
# Get the metrics in Prometheus format from one specific component
$ kubectl exec <COMPONENT_HOSTNAME> -c <CONTAINER_NAME> -- curl 127.0.0.1:<COMPONENT_WEB_PORT>/metrics/prometheus/

# For example, get the metrics of the leading master with default web port 19999
$ kubectl exec <alluxio-master-x> -c alluxio-master -- curl 127.0.0.1:19999/metrics/prometheus/

# Get the metrics of a worker with default web port 30000
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-worker -- curl 127.0.0.1:30000/metrics/prometheus/

# Get the metrics of the leading job master with default web port 20002
$ kubectl exec <alluxio-master-x> -c alluxio-job-master -- curl 127.0.0.1:20002/metrics/prometheus/

# Get the metrics of a job worker with default web port 30003
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-job-worker -- curl 127.0.0.1:30003/metrics/prometheus/

# Get the metrics of a fuse process with default web port 49999
$ kubectl exec <alluxio-fuse-xxxxx> -- curl 127.0.0.1:49999/metrics/prometheus/
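The Prometheus endpoint returns plain text with one sample per line in the form `name{labels} value`, so a single metric can be pulled out of a snapshot with standard tools. The following is a minimal sketch under that assumption, and it further assumes no trailing timestamp on the sample lines. The function name `prom_metric_value` and the metric name in the usage comment are hypothetical, not actual Alluxio metric names.

```shell
# Read Prometheus text-format metrics on stdin and print the value of every
# sample whose metric name exactly matches the first argument.
# Assumes lines look like `name{labels} value` with no trailing timestamp.
prom_metric_value() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP and TYPE comment lines
    NF >= 2 {
      metric = $1
      sub(/[{].*/, "", metric)    # strip the label set, if any
      if (metric == name) print $NF
    }'
}

# Usage (hypothetical metric name):
#   kubectl exec <alluxio-master-x> -c alluxio-master -- \
#     curl -s 127.0.0.1:19999/metrics/prometheus/ | prom_metric_value example_metric
```

Matching on the exact name avoids picking up other metrics that merely share a prefix, which a plain `grep` on the name would do.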