# Metrics

To deploy Alluxio on Kubernetes, refer to [Deploy Alluxio on Kubernetes](https://documentation.alluxio.io/ee-da-en/da-2.10/kubernetes/running-alluxio-on-kubernetes).

This documentation focus on how to configure and get metrics of different metrics sinks from Alluxio deployed on Kubernetes.

### HTTP JSON Sink

#### Configuration

The metrics are exposed through the web ports of different components.

* The web ports of Alluxio masters and workers are opened by default.
* Alluxio standalone Fuse web port is not opened by default. It can be opened by setting `alluxio.fuse.web.enabled` to true.

#### Get Metrics Snapshot

You can send an HTTP request to an Alluxio process to get a snapshot of the metrics in JSON format.

```console
# Get the metrics in JSON format from one specific component
$ kubectl exec <COMPONENT_HOSTNAME> -c <CONTAINER_NAME> -- curl 127.0.0.1:<COMPONENT_WEB_PORT>/metrics/json/

# For example, get the metrics of the leading master with default web port 19999
$ kubectl exec <alluxio-master-x> -c alluxio-master -- curl 127.0.0.1:19999/metrics/json/
# Get the metrics of a worker with default web port 30000
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-worker -- curl 127.0.0.1:30000/metrics/json/
# Get the metrics of the leading job master with default web port 20002
$ kubectl exec <alluxio-master-x> -c alluxio-job-master -- curl 127.0.0.1:20002/metrics/json/
# Get metrics of a job worker with default web port 30003
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-job-worker -- curl 127.0.0.1:30003/metrics/json/
# Get metrics of a fuse process with default web port 49999
$ kubectl exec <alluxio-fuse-xxxxx> -- curl 127.0.0.1:49999/metrics/json/
```

#### Visualization

**Master Web UI Metrics**

Besides the raw metrics shown via metrics servlet or custom metrics configuration, users can track key cluster performance metrics in a more human-readable way in the web interface of Alluxio leading master.

<figure><img src="https://3828182785-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FB862N5IiUrGNYlXpJO9z%2Fuploads%2Fgit-blob-d6369e598f8b4b4854684972fbc6b3f71093a68c%2Fscreenshot_generalMetrics.png?alt=media" alt=""><figcaption></figcaption></figure>

To access the web UI of the leading master, you need to forward a port on a physical machine to the master web port. For example, if your leading master is running in pod `alluxio-master-0` and you want to serve the Alluxio master web UI at node `master-node-1` on port `8080`, you can run the following command on your control plane:

```console
$ kubectl port-forward <CONTROL_PLANE_HOSTNAME> pods/alluxio-master-0 8080:19999
```

The metrics can be then accessed at `http://<leading_master_node_hostname>:<forwared_port>/metrics`

Refer to [Access the Web UI](https://documentation.alluxio.io/ee-da-en/da-2.10/running-alluxio-on-kubernetes#access-the-web-ui) for more detailed instructions.

### Prometheus Sink

[Prometheus](https://prometheus.io/) is a monitoring tool that can help to monitor Alluxio metrics changes.

#### Alluxio Helm Chart Configuration

`PrometheusMetricsServlet` needs to be enabled for Prometheus in Alluxio. Set the following properties in helm chart `value.yaml` to enable Prometheus metrics sink:

```yaml
metrics:
  enabled: true
  PrometheusMetricsServlet: true
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/masterWebPort: "<MASTER_WEB_PORT>"
    prometheus.io/jobMasterWebPort: "<JOB_MASTER_WEB_PORT>"
    prometheus.io/workerWebPort: "<WORKER_WEB_PORT>"
    prometheus.io/jobWorkerWebPort: "<JOB_WORKER_WEB_PORT>"
    prometheus.io/fuseWebPort: "<FUSE_WEB_PORT>"
    prometheus.io/path: "/metrics/prometheus/"
```

Note that similar to HTTP JSON Sink, fuse web port needs to be opened for accessing metrics by setting `alluxio.fuse.web.enabled` to true.

#### Prometheus Client Configuration

For a Prometheus client to get the metrics from Alluxio, configure the `prometheus.yml` of the client. For example, to read the master metrics:

```yaml
scrape_configs:
  - job_name: 'alluxio master'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - alluxio # Only look at pods in namespace named `alluxio`
    relabel_configs:
      # Only check the pods with role `alluxio-master`
      - source_labels: [__meta_kubernetes_pod_label_role]
        action: keep
        regex: alluxio-master
      # Only check the pods with annotation prometheus.io/scrape is true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Use the value of prometheus.io/path in podAnnotation for endpoint
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Use the value of prometheus.io/masterWebPort in podAnnotation for port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_masterWebPort]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: pod_name
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node
      - source_labels: [__meta_kubernetes_pod_label_release]
        action: replace
        target_label: cluster_name
```

To read other components' metrics, use the respective pod role label and web port label.

<details>

<summary>Worker metrics</summary>

An example configuration reading worker metrics \`\`\`yaml scrape\_configs: - job\_name: 'alluxio worker' kubernetes\_sd\_configs: - role: pod namespaces: names: - alluxio # Only look at pods in namespace named \`alluxio\` relabel\_configs: # Only look at the pods with role \`alluxio-worker\` - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_role] action: keep regex: alluxio-worker # Only check the pods with annotation prometheus.io/scrape is true - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_scrape] action: keep regex: true # Use the value of prometheus.io/path in podAnnotation for endpoint - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_path] action: replace target\_label: \_\_metrics\_path\_\_ regex: (.+) # Use the value of prometheus.io/workerWebPort in podAnnotation for port - source\_labels: \[\_\_address\_\_, \_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_workerWebPort] action: replace regex: (\[^:]+)(?::\d+)?;(\d+) replacement: $1:$2 target\_label: \_\_address\_\_ - action: labelmap regex: \_\_meta\_kubernetes\_pod\_label\_(.+) - source\_labels: \[\_\_meta\_kubernetes\_namespace] action: replace target\_label: namespace - source\_labels: \[\_\_meta\_kubernetes\_pod\_name] action: replace target\_label: pod\_name - source\_labels: \[\_\_meta\_kubernetes\_pod\_node\_name] action: replace target\_label: node - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_release] action: replace target\_label: cluster\_name \`\`\`

</details>

<details>

<summary>Job master metrics</summary>

An example configuration reading job master metrics \`\`\`yaml scrape\_configs: - job\_name: 'alluxio job worker' kubernetes\_sd\_configs: - role: pod namespaces: names: - alluxio # Only look at pods in namespace named \`alluxio\` relabel\_configs: # Only look at the pods with role \`alluxio-worker\` - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_role] action: keep regex: alluxio-master # Only check the pods with annotation prometheus.io/scrape is true - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_scrape] action: keep regex: true # Use the value of prometheus.io/path in podAnnotation for endpoint - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_path] action: replace target\_label: \_\_metrics\_path\_\_ regex: (.+) # Use the value of prometheus.io/jobWorkerWebPort in podAnnotation for port - source\_labels: \[\_\_address\_\_, \_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_jobMasterWebPort] action: replace regex: (\[^:]+)(?::\d+)?;(\d+) replacement: $1:$2 target\_label: \_\_address\_\_ - action: labelmap regex: \_\_meta\_kubernetes\_pod\_label\_(.+) - source\_labels: \[\_\_meta\_kubernetes\_namespace] action: replace target\_label: namespace - source\_labels: \[\_\_meta\_kubernetes\_pod\_name] action: replace target\_label: pod\_name - source\_labels: \[\_\_meta\_kubernetes\_pod\_node\_name] action: replace target\_label: node - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_release] action: replace target\_label: cluster\_name \`\`\`

</details>

<details>

<summary>Job worker metrics</summary>

An example configuration reading job worker metrics \`\`\`yaml scrape\_configs: - job\_name: 'alluxio job worker' kubernetes\_sd\_configs: - role: pod namespaces: names: - alluxio # Only look at pods in namespace named \`alluxio\` relabel\_configs: # Only look at the pods with role \`alluxio-worker\` - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_role] action: keep regex: alluxio-worker # Only check the pods with annotation prometheus.io/scrape is true - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_scrape] action: keep regex: true # Use the value of prometheus.io/path in podAnnotation for endpoint - source\_labels: \[\_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_path] action: replace target\_label: \_\_metrics\_path\_\_ regex: (.+) # Use the value of prometheus.io/workerWebPort in podAnnotation for port - source\_labels: \[\_\_address\_\_, \_\_meta\_kubernetes\_pod\_annotation\_prometheus\_io\_jobWorkerWebPort] action: replace regex: (\[^:]+)(?::\d+)?;(\d+) replacement: $1:$2 target\_label: \_\_address\_\_ - action: labelmap regex: \_\_meta\_kubernetes\_pod\_label\_(.+) - source\_labels: \[\_\_meta\_kubernetes\_namespace] action: replace target\_label: namespace - source\_labels: \[\_\_meta\_kubernetes\_pod\_name] action: replace target\_label: pod\_name - source\_labels: \[\_\_meta\_kubernetes\_pod\_node\_name] action: replace target\_label: node - source\_labels: \[\_\_meta\_kubernetes\_pod\_label\_release] action: replace target\_label: cluster\_name \`\`\`

</details>

#### Get Metrics Snapshot

You can send an HTTP request to the Prometheus endpoint of an Alluxio process to get a snapshot of the metrics in Prometheus format.

```console
$ kubectl exec <COMPONENT_HOSTNAME> -c <CONTAINER_NAME> -- curl 127.0.0.1:<COMPONEMT_WEB_PORT>/metrics/prometheus/

# For example, get the metrics of the leading master with default web port 19999
$ kubectl exec <alluxio-master-x> -c alluxio-master -- curl 127.0.0.1:19999/metrics/prometheus/
# Get the metrics of a worker with default web port 30000
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-worker -- curl 127.0.0.1:30000/metrics/prometheus/
# Get the metrics of the leading job master with default web port 20002
$ kubectl exec <alluxio-master-x> -c alluxio-job-master -- curl 127.0.0.1:20002/metrics/prometheus/
# Get metrics of a job worker with default web port 30003
$ kubectl exec <alluxio-worker-xxxxx> -c alluxio-job-worker -- curl 127.0.0.1:30003/metrics/prometheus/
# Get metrics of a fuse process with default web port 49999
$ kubectl exec <alluxio-fuse-xxxxx> -- curl 127.0.0.1:49999/metrics/prometheus/
```
