Metrics

Alluxio Metrics

Cache Storage

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_cached_storage | type | gauge | worker | amount of the cached data |
| alluxio_cached_capacity | type | gauge | worker | configured maximum cache storage |
| alluxio_data_cached_pages | - | gauge | worker | number of pages being cached in the page store |
| alluxio_data_cached_files | - | gauge | worker | number of cached files, including fully and partially cached files |
| alluxio_cached_storage_by_priority | priority | gauge | worker | amount of the cached data, by priority |
| alluxio_eviction_by_ttl | policy_path | counter | worker | Total number of bytes evicted from Alluxio workers by TTL policy. |
| alluxio_quota_size_used | dir | gauge | coordinator | Bytes used in the given quota scope. |
| alluxio_quota_size_capacity | dir | gauge | coordinator | Capacity of the given quota scope as defined in the quota rules. |
| alluxio_eviction_by_quota | dir | counter | worker | Total number of bytes evicted from Alluxio workers by quota rules. |
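
As an illustration of how the two storage gauges above relate, cache utilization can be derived by dividing alluxio_cached_storage by alluxio_cached_capacity. The sketch below assumes both gauges are reported in the same unit (the table does not state the unit) and that the two values come from the same worker and the same `type` label; the function name is hypothetical.

```python
def cache_utilization(cached_storage: float, cached_capacity: float) -> float:
    """Return the fraction of the configured cache capacity that is in use.

    cached_storage:  a sample of alluxio_cached_storage
    cached_capacity: a sample of alluxio_cached_capacity (assumed same unit)
    """
    if cached_capacity <= 0:
        # No capacity configured or no sample yet: report 0 rather than divide by zero.
        return 0.0
    return cached_storage / cached_capacity
```

A value that stays close to 1.0 while alluxio_eviction_by_ttl or alluxio_eviction_by_quota keeps rising may suggest the cache is running at its configured limit.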

Cache Access

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_data_access | method | histogram | worker | aggregate of all data access requests |
| alluxio_data_throughput | dir, method, destination | counter | worker | counter of data throughput of all data access |
| alluxio_meta_operation | op | counter | worker | counter of RPC calls of the meta operations |
| alluxio_meta_operation_latency_ms | op, state | histogram | worker | latency of RPC calls of the meta operations |
| alluxio_meta_operation_errors | op | counter | worker | counter of errors during handling of RPC calls of the meta operations |
| alluxio_ufs_error | ufs_type, error_code | counter | worker | counter of UFS errors |
| alluxio_ufs_latency_ms | method, ufs_type | histogram | worker | histogram of UFS call latency |
| alluxio_ufs_client_latency_ms | method, ufs_type, state | histogram | worker | histogram of UFS client API call latency |
| alluxio_ufs_client_call_processing | method, ufs_type | gauge | worker | gauge of the UFS client calls that are being processed |
| alluxio_ufs_data_access | dir, method | counter | worker | amount of the UFS access |
| alluxio_ufs_fallback | method | counter | worker | amount of the UFS fallback |
| alluxio_cached_data_read | dir, is_pinned | counter | worker | amount of data that, when read, was present in and served from the Alluxio cache |
| alluxio_missed_data_read | dir, is_pinned | counter | worker | amount of data that, when read, was absent from the Alluxio cache |
| alluxio_cache_hit_calls | - | counter | worker | number of cache hits in page store |
| alluxio_cache_miss_calls | - | counter | worker | number of cache misses in page store |
| alluxio_external_data_read | dir | counter | | amount of data read when the cache missed on the client |
| alluxio_cleared_stale_cached_data | - | counter | worker | amount of cleared stale cached data |
| alluxio_cached_evicted_data | - | counter | worker | amount of the evicted data |
| alluxio_cached_async_evicted_data | - | counter | worker | amount of the async evicted data |
| alluxio_trigger_async_evicted_total | - | counter | worker | counter of times asynchronous eviction is triggered |
| alluxio_page_store_operation_errors | op, cause | counter | worker | counter of failures in page store operations |
| alluxio_page_store_dir_operation_errors | dir | counter | worker | counter of failures in specific page store directory |
| alluxio_page_store_dir_operations | dir | counter | worker | operation counter in specific page store directory |
| alluxio_page_store_io_latency_microseconds | dir, op, success | histogram | worker | latency of IO operations in page store |
| alluxio_metadata_cache_hit_calls | type | counter | worker | counter of metadata retrieval calls that resulted in a cache hit |
| alluxio_external_file_metadata_request_calls | - | counter | worker | counter of file metadata retrieval calls that are fetched from UFS, usually as a result of a cache miss in the file metadata cache |
| alluxio_metadata_cache_miss_calls | type | counter | worker | counter of metadata retrieval calls that resulted in a cache miss. Note that this is different from alluxio_external_file_metadata_request_calls in that a cache miss for a file does not always result in a request to an external data source. |
| alluxio_passive_cache_async_loaded_files | result | counter | worker | number of async loaded files when passive cache is enabled |
| alluxio_page_store_device_total_capacity | dir | gauge | worker | the total capacity of the physical storage device where the page store directory resides |
| alluxio_page_store_device_available_capacity | dir | gauge | worker | the available capacity of the physical storage device where the page store directory resides |
| alluxio_metastore_storage_size | dir | gauge | worker | Total logical size of files in the metastore RocksDB directory. |
| alluxio_metastore_disk_capacity | dir | gauge | worker | The capacity of the disk where the metastore RocksDB is located. |
| alluxio_netty_data_ingress | - | counter | worker | number of ingress bytes from clients to worker, excluding TLS |
| alluxio_netty_data_egress | - | counter | worker | number of egress bytes from worker to clients, excluding TLS |
| alluxio_worker_thread_pool_rejections | dir | counter | worker | counter of rejections in worker thread pool |
| alluxio_rpc_executor_current_queue_length | executor_name | gauge | worker | number of RPC requests currently being processed and pending processing |
| alluxio_rpc_executor_active_threads | executor_name | gauge | worker | number of threads that are actively executing RPCs |
| alluxio_rpc_executor_current_threads | executor_name | gauge | worker | number of threads for executing RPCs, both occupied and idle |
| alluxio_rpc_executor_max_threads | executor_name | gauge | worker | maximum number of threads for executing RPCs |
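
One common use of alluxio_cached_data_read and alluxio_missed_data_read is a cache hit ratio over a time window. Because both are counters, the ratio should be computed from their increase between two samples rather than from the raw values. The following is a minimal sketch under two assumptions: both counters use the same unit, and a counter that goes backwards means the worker restarted and began counting from zero. The function names are hypothetical.

```python
from typing import Optional


def counter_delta(earlier: float, later: float) -> float:
    """Increase of a counter between two samples.

    If the later sample is smaller, assume the process restarted and the
    counter began again from zero, so the later value is the increase.
    """
    return later - earlier if later >= earlier else later


def cache_hit_ratio(cached_t0: float, cached_t1: float,
                    missed_t0: float, missed_t1: float) -> Optional[float]:
    """Hit ratio over the window [t0, t1], or None if nothing was read."""
    hit = counter_delta(cached_t0, cached_t1)    # alluxio_cached_data_read
    miss = counter_delta(missed_t0, missed_t1)   # alluxio_missed_data_read
    total = hit + miss
    return hit / total if total > 0 else None
```

The same pattern could be applied to alluxio_cache_hit_calls and alluxio_cache_miss_calls to obtain a ratio by call count instead of by amount of data.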

S3 API

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_s3_api_throughput | method | histogram | worker | histogram of S3 API throughput |
| alluxio_s3_api_call_latency_ms | method, state | histogram | worker | latency of S3 API calls |
| alluxio_s3_api_call_processing | method | gauge | worker | number of S3 API calls that are being processed |
| alluxio_s3_authn_latency_ms | result, reason | histogram | worker | Latency of S3 authentication. |
| alluxio_s3_authz_latency_ms | result, method, reason | histogram | worker | Latency of S3 authorization. |
| alluxio_sts_api_call_processing | method | gauge | worker | Number of STS API calls that are being processed. |
| alluxio_sts_requests_total | result, reason | counter | worker | Total number of STS requests. |
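
Several metrics on this page, such as alluxio_s3_api_call_latency_ms, are histograms. If they are exposed as Prometheus-style cumulative buckets (an assumption; this page does not describe the exposition format), a latency percentile can be estimated by linear interpolation inside the bucket that contains the target rank. The sketch below shows that estimate; the function name and the bucket boundaries in the usage example are hypothetical.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: (upper_bound, cumulative_count) pairs; the last upper bound
    is expected to be +infinity, as in Prometheus-style histograms.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Cannot interpolate inside an unbounded or empty bucket.
                return prev_bound
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical buckets: 5 calls <= 10 ms, 9 calls <= 50 ms, 10 calls <= 100 ms.
example = [(10.0, 5), (50.0, 9), (100.0, 10), (float("inf"), 10)]
p75 = estimate_quantile(0.75, example)
```

The estimate is only as precise as the bucket boundaries allow, so it is best read as an approximation.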

FUSE

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_fuse_concurrency | method | gauge | fuse | real-time concurrency of each FUSE method |
| alluxio_fuse_call_latency_ms | method, state | histogram | fuse | latency of FUSE operations |
| alluxio_fuse_result | method, state | counter | fuse | counter of FUSE operation results |
| alluxio_fuse_path_cache_hits_bytes | - | counter | fuse | counter of FUSE path cache hits |
| alluxio_fuse_path_cache_misses | - | counter | fuse | counter of FUSE path cache misses |
| alluxio_fuse_buffer_size | method, sequential | histogram | fuse | Buffer size of read/write calls, labeled as sequential or random. |
| alluxio_fuse_block_size | method | histogram | fuse | Block size during random reads/writes. 'Block size' can be understood as the 'bs' parameter specified during fio testing. |
| alluxio_fuse_open_files | - | gauge | fuse | The number of files currently open through FUSE. |

Client SDK

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_grpc_client_call_latency_ms | method, instance, state | histogram | worker, coordinator, fuse | latency of gRPC calls from the client |
| alluxio_grpc_client_concurrency | method, instance | gauge | worker, coordinator, fuse | concurrency of gRPC calls from the client |
| alluxio_grpc_client_errors | method, status_code, instance | counter | worker, coordinator, fuse | total number of gRPC errors from the client |
| alluxio_grpc_client_successes | method, instance | counter | worker, coordinator, fuse | total number of successful gRPC calls from the client |
| alluxio_netty_operations | op | counter | worker, coordinator, fuse | number of Netty operations (e.g. read and write requests) |
| alluxio_netty_operation_errors | op, reason, instance | counter | worker, coordinator, fuse | total number of Netty operation errors from the client |
| alluxio_read_from_workers | instance | counter | worker, coordinator, fuse | total number of bytes the client read from workers |
| alluxio_async_prefetch_cache_bytes | instance | counter | worker, coordinator, fuse | total number of bytes the client asynchronously prefetched into its local cache |
| alluxio_async_prefetch_hit_cache_bytes | instance | counter | worker, coordinator, fuse | total number of bytes the client served from the async prefetch cache |
| alluxio_async_prefetch_random_read_requests | instance | counter | worker, coordinator, fuse | total number of client random reads recorded by async prefetch |
| alluxio_multi_replica_read_from_workers | cluster_name, local_cluster, hot_read | counter | worker | number of bytes read by a client from Alluxio workers when reading multi-replica files |
| alluxio_rpc_retry_on_different_workers | op, retry_count | counter | worker | counter of client retries on different workers when multi-replica is enabled. |
| alluxio_rpc_position_reader_read_calls | component | counter | worker | counter of successful client position reader reads. |
| alluxio_rpc_position_reader_data_read | component | counter | worker | counter of bytes read by client position reader. |
| alluxio_rpc_position_reader_read_failed_total | component, final_attempt | counter | worker | counter of failed client position reader reads. |
| alluxio_client_netty_read_time_to_receive_first_packet_ms | - | histogram | fuse | latency between when the client sends a read request to the worker, and when the worker sends the first packet of the response back to the client. |

Job Service

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_completed_job | type, state | counter | coordinator | counter of completed jobs. |
| alluxio_job_process_file | type, state | counter | coordinator | counter of files processed by jobs. |
| alluxio_job_process_file_size | type, state | counter | coordinator | cumulative size of the files that are processed by job service. |
| alluxio_active_job_count | type | gauge | coordinator | number of jobs in the scheduler. The value of type is running or waiting. |
| alluxio_distributed_load_job_dispatched_size | - | counter | coordinator | counter of the bytes dispatched in distributed load |
| alluxio_distributed_load_job_failure | reason, final_attempt, worker | counter | coordinator | counter of the distributed load failures |
| alluxio_distributed_load_job_loaded_bytes | - | counter | coordinator | counter of the bytes loaded in distributed load |
| alluxio_distributed_load_job_processed | - | counter | coordinator | counter of the non-empty file copies loaded in distributed load |
| alluxio_distributed_load_job_scanned | - | counter | coordinator | counter of the inodes scanned in distributed load |
| alluxio_distributed_load_job_skipped | - | counter | coordinator | counter of the inodes skipped in distributed load |
| alluxio_distributed_load_data_loaded | - | counter | worker | counter of the bytes loaded by a worker in distributed load |
| alluxio_distributed_load_data_loaded_from_ufs | - | counter | worker | counter of the bytes loaded by a worker from UFS in distributed load |
| alluxio_worker_job_task_count | - | gauge | coordinator | Number of tasks currently executed by each worker. |

Write Cache

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_write_buffer_write_status | status | counter | worker | the status of write buffer writes |
| alluxio_write_buffer_worker_failure | worker | counter | worker | the failure count of writes to workers |
| alluxio_write_buffer_worker_bytes_written | worker | counter | worker | the bytes written to workers |
| alluxio_write_buffer_unique_bytes_written | - | counter | worker | the unique bytes written by the client |
| alluxio_write_buffer_foundationdb_call_latency_ms | method, state | histogram | worker | the latency of FoundationDB calls |
| alluxio_write_buffer_persist_tasks | status | counter | worker | the number of persist tasks |
| alluxio_write_buffer_transition_worker | worker | counter | worker | the number of worker transitions |
| alluxio_write_buffer_async_persist_throughput | - | counter | worker | the throughput of async persist |
| alluxio_write_buffer_async_file_checker_abnormal_files | - | counter | worker | the number of abnormal files found by the async file checker |
| alluxio_dual_buffer_file_system_requests | operation, buffer_type | counter | worker | counter of requests to dual buffer file system. |

Authorization

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_auth_permission_check_total | - | counter | worker, fuse | The total number of authorization permission checks. |
| alluxio_auth_permission_check_cache_misses | - | counter | worker, fuse | The total number of misses in the authorization permission check cache. |

Cluster & Process

| Metric | Labels | Type | Component | Description |
| --- | --- | --- | --- | --- |
| alluxio_version | version | gauge | worker, coordinator, fuse | Alluxio component version information |
| alluxio_license_expiration_date | - | gauge | coordinator | the license expiration date in epoch time format |
| alluxio_cumulative_unavailable_workers | worker_addr | counter | worker, coordinator | cumulative number of times a client encountered an unavailable worker. |
| alluxio_unavailable_worker_probe_attempts | worker_addr, attempt | counter | worker, coordinator | number of liveness probing attempts for unavailable workers |
| alluxio_worker_membership_refresh_count | - | counter | worker | total number of worker membership refreshes |
| alluxio_dynamic_resource_pool_current_resources | pool_name | gauge | worker | current number of resources in the dynamic resource pool |
| alluxio_dynamic_resource_pool_capacity | pool_name | gauge | worker | capacity of the dynamic resource pool |
| alluxio_dynamic_resource_pool_acquisition_timeouts | pool_name | counter | worker | total number of acquisition timeouts in the dynamic resource pool |
| alluxio_dynamic_resource_pool_create_new_resource_latency_ms | pool_name | histogram | worker | latency of creating a new resource in the dynamic resource pool |
| alluxio_etcd_call_errors | type | counter | coordinator, worker, fuse | total number of etcd call errors |
| alluxio_etcd_client_calls | type | counter | coordinator, worker, fuse | total number of etcd client calls |
| alluxio_etcd_client_call_latency_ms | type | histogram | coordinator, worker, fuse | latency of etcd client calls |
| alluxio_netty_direct_memory_usage | - | gauge | worker | direct memory usage of Netty |
| alluxio_rocksdb_memory_usage | - | gauge | worker | memory usage of RocksDB |
| process_start_time_seconds | - | gauge | coordinator, worker, fuse | start time of the process since unix epoch in seconds |
| process_cpu_seconds_total | - | counter | coordinator, worker, fuse | total user and system CPU time spent in seconds |
| jvm_threads_current | - | gauge | coordinator, worker, fuse | current thread count of a JVM |
| jvm_memory_used_bytes | area=heap/nonheap | gauge | coordinator, worker, fuse | used bytes of a given JVM memory area |
| jvm_memory_max_bytes | area=heap/nonheap | gauge | coordinator, worker, fuse | max (bytes) of a given JVM memory area |
| jvm_gc_collection_seconds | gc="G1 Young Generation"/"G1 Old Generation"/... | summary | coordinator, worker, fuse | time spent in a given JVM garbage collector in seconds |
| jvm_buffer_pool_used_bytes | pool=direct/mapped | gauge | coordinator, worker, fuse | used bytes of a given JVM buffer pool |
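
Since process_start_time_seconds is a Unix timestamp in seconds, process uptime follows by subtracting it from the current time. The sketch below illustrates this; the function name is hypothetical, and the clock of the machine doing the calculation is assumed to be roughly in sync with the Alluxio host.

```python
import time
from typing import Optional


def uptime_seconds(process_start_time_seconds: float,
                   now: Optional[float] = None) -> float:
    """Seconds elapsed since the process started.

    now: current Unix time in seconds; defaults to the local clock.
    """
    current = time.time() if now is None else now
    # Clamp at zero in case of clock skew between hosts.
    return max(0.0, current - process_start_time_seconds)
```

A sudden drop in uptime for a worker, coordinator, or FUSE process indicates a restart, which also explains counters on that process starting again from zero.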
