Metrics version 2
MinIO publishes cluster and node metrics using the Prometheus Data Model. You can use any scraping tool to pull metrics data from MinIO for further analysis and alerting.
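As a rough illustration of what a scraper receives, the sketch below parses Prometheus text-format output into (name, labels, value) tuples. It is a simplified parser written for this page, not part of MinIO or Prometheus, and the metric names in the usage note are placeholders rather than real MinIO metric names.

```python
import re

# name, optional {labels}, value, optional integer timestamp
SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+\d+)?$'
)
# key="value" pairs; the value may contain backslash-escaped characters
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

For example, passing the two lines `example_total{server="node1:9000"} 42` and `example_gauge 1.5` yields one tuple with a `server` label and one tuple with an empty label dictionary. A production scraper would normally rely on Prometheus itself or an established client library instead of a hand-written parser.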
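Version 2 endpoints
Metrics version 2 organizes metrics into three categories: cluster, bucket, and resource.
Each v2 endpoint returns all metrics for its category. For example, scraping the following endpoint returns all cluster metrics: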
http://HOSTNAME:PORT/minio/v2/metrics/cluster
Requesting only the base endpoint /minio/v2/metrics/ also returns cluster metrics.
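The endpoint layout can be sketched as a small helper that builds the URL for a category and fetches the raw text. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the `bearer_token` parameter is an assumption for deployments that require authentication on the metrics endpoint; whether a token is needed depends on how the deployment is configured.

```python
from urllib.request import Request, urlopen

# The three v2 categories, each served under /minio/v2/metrics/<category>.
V2_CATEGORIES = ("cluster", "bucket", "resource")


def v2_metrics_url(host, port, category="cluster", scheme="http"):
    """Build the v2 metrics URL for one category (hypothetical helper)."""
    if category not in V2_CATEGORIES:
        raise ValueError(f"unknown v2 metrics category: {category!r}")
    return f"{scheme}://{host}:{port}/minio/v2/metrics/{category}"


def fetch_v2_metrics(url, bearer_token=None, timeout=10.0):
    """Fetch the raw Prometheus text from a metrics endpoint."""
    headers = {}
    if bearer_token:
        # Assumption: the deployment expects a bearer token on this endpoint.
        headers["Authorization"] = f"Bearer {bearer_token}"
    with urlopen(Request(url, headers=headers), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `v2_metrics_url("minio.example.net", 9000, "bucket")` returns `http://minio.example.net:9000/minio/v2/metrics/bucket`, where `minio.example.net` is a placeholder hostname.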
For more flexible scraping and a broader set of metrics, use metrics version 3. Existing deployments can continue to use version 2 metrics and Grafana dashboards.
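MinIO Grafana dashboards
MinIO provides two Grafana dashboards for visualizing v2 metrics. For complete instructions on configuring a Prometheus-compatible data source for Grafana, see the Prometheus documentation on Grafana support.
Available version 2 metrics
The following sections describe the version 2 endpoints and metrics.
You can scrape cluster-level metrics using the following URL endpoint: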
http://HOSTNAME:PORT/minio/v2/metrics/cluster
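Replace HOSTNAME:PORT with the FQDN and port of the MinIO deployment.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.
Changed in version RELEASE.2023-07-21T21-12-44Z: Bucket metrics moved to a separate endpoint.
Changed in version RELEASE.2025-03-12T17-29-24Z: For performance reasons, v2 metrics support at most 100 buckets. To collect metrics for more buckets, use v3 metrics instead.
Changed in version RELEASE.2023-08-31T15-31-16Z: You can scrape bucket-level metrics using the following URL endpoint: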
http://HOSTNAME:PORT/minio/v2/metrics/bucket
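Replace HOSTNAME:PORT with the FQDN and port of the MinIO deployment.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.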
New in version RELEASE.2023-10-07T15-07-38Z.
You can scrape resource metrics using the following URL endpoint:
http://HOSTNAME:PORT/minio/v2/metrics/resource
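Replace HOSTNAME:PORT with the FQDN and port of the MinIO deployment.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.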
Cluster Metrics
MinIO collects the following metrics at the cluster level. Metrics may include one or more labels, such as the server that calculated that metric.
These metrics can be obtained from any MinIO server once per collection by using the following URL:
https://HOSTNAME:PORT/minio/v2/metrics/cluster
Replace HOSTNAME:PORT with the hostname of your MinIO deployment.
For deployments behind a load balancer, use the load balancer hostname instead of a single node hostname.
Audit Metrics
Name |
Description |
|---|---|
|
Total number of messages that failed to send since start. |
|
Number of unsent messages in queue for target. |
|
Total number of messages sent since start. |
Cluster Capacity Metrics
Name |
Description |
|---|---|
|
Total free capacity online in the cluster. |
|
Total capacity online in the cluster. |
|
Total free usable capacity online in the cluster. |
|
Total usable capacity online in the cluster. |
|
Distribution of object sizes across a cluster |
|
Distribution of object versions across a cluster |
|
Total number of objects in a cluster |
|
Total cluster usage in bytes |
|
Total number of versions (includes delete marker) in a cluster |
|
Total number of delete markers in a cluster |
|
Total number of buckets in the cluster |
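The usable capacity figures in this table combine naturally into a utilization ratio. The sketch below assumes you have already scraped the total usable capacity and the free usable capacity as byte counts; the function and parameter names are hypothetical, not MinIO metric names.

```python
def usable_utilization(total_usable_bytes, free_usable_bytes):
    """Fraction of usable capacity currently consumed (0.0 to 1.0)."""
    if total_usable_bytes <= 0:
        raise ValueError("total usable capacity must be positive")
    if not 0 <= free_usable_bytes <= total_usable_bytes:
        raise ValueError("free capacity must be between 0 and the total")
    used = total_usable_bytes - free_usable_bytes
    return used / total_usable_bytes
```

With 1000 usable bytes in total and 250 free, the function returns 0.75. An alert rule could compare this ratio against a threshold chosen for the deployment.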
Cluster Drive Metrics
Name |
Description |
|---|---|
|
Total drives offline in this cluster. |
|
Total drives online in this cluster. |
|
Total drives in this cluster. |
Cluster ILM Metrics
Name |
Description |
|---|---|
|
Total bytes transitioned to a tier. |
|
Total number of objects transitioned to a tier. |
|
Total number of versions transitioned to a tier. |
Cluster KMS Metrics
Name |
Description |
|---|---|
|
Reports whether the KMS is online (1) or offline (0). |
|
Number of KMS requests that failed due to an error (HTTP 4xx status code). |
|
Number of KMS requests that failed due to an internal failure (HTTP 5xx status code). |
|
Number of KMS requests that succeeded. |
|
The time the KMS has been up and running in seconds. |
Cluster Health Metrics
Name |
Description |
|---|---|
|
Total number of MinIO nodes offline. |
|
Total number of MinIO nodes online. |
|
Maximum write quorum across all pools and sets |
|
Get current cluster health status |
|
Count of healing drives in the erasure set |
|
Count of online drives in the erasure set |
|
Get read quorum of the erasure set |
|
Get write quorum of the erasure set |
|
Get current health status of the erasure set |
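The per-set online drive count and write quorum can be compared to estimate how close an erasure set is to losing write availability. This is a minimal sketch that assumes a set accepts writes while its online drive count is at least its write quorum; the function names are hypothetical and the inputs are values you have already scraped.

```python
def write_headroom(online_drives, write_quorum):
    """Drives the set can still lose before dropping below write quorum.

    A negative result means the set is already below write quorum.
    """
    return online_drives - write_quorum


def can_accept_writes(online_drives, write_quorum):
    """True while the set still meets its write quorum (assumed rule)."""
    return write_headroom(online_drives, write_quorum) >= 0
```

For a set reporting 16 online drives and a write quorum of 9, the headroom is 7 drives.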
Cluster Replication Metrics
Metrics marked as Site Replication Only populate only on deployments with Site Replication configurations.
For deployments with bucket or batch replication configurations, these metrics populate under the Bucket Metrics endpoint instead.
Name |
Description |
|---|---|
|
(Site Replication Only) Total number of bytes that failed at least once to replicate in the last full hour. |
|
(Site Replication Only) Total number of objects which failed replication in the last full hour. |
|
Total number of bytes that failed at least once to replicate in the last full minute. |
|
Total number of objects which failed replication in the last full minute. |
|
(Site Replication Only) Total number of bytes that failed at least once to replicate since server start. |
|
(Site Replication Only) Total number of objects which failed replication since server start. |
|
(Site Replication Only) Total number of bytes replicated to this cluster from another source cluster. |
|
(Site Replication Only) Total number of objects received by this cluster from another source cluster. |
|
(Site Replication Only) Total number of bytes replicated to the target cluster. |
|
(Site Replication Only) Total number of objects replicated to the target cluster. |
|
(Site Replication Only) Total number of replication credential errors since server start |
|
(Site Replication Only) Number of GET requests proxied to replication target |
|
(Site Replication Only) Number of HEAD requests proxied to replication target |
|
(Site Replication Only) Number of DELETE tagging requests proxied to replication target |
|
(Site Replication Only) Number of GET tagging requests proxied to replication target |
|
(Site Replication Only) Number of PUT tagging requests proxied to replication target |
|
(Site Replication Only) Number of failures in GET requests proxied to replication target |
|
(Site Replication Only) Number of failures in HEAD requests proxied to replication target |
|
(Site Replication Only) Number of failures proxying DELETE tagging requests to replication target |
|
(Site Replication Only) Number of failures proxying GET tagging requests to replication target |
|
(Site Replication Only) Number of failures proxying PUT tagging requests to replication target |
Node Replication Metrics
Metrics marked as Site Replication Only populate only on deployments with Site Replication configurations.
For deployments with bucket or batch replication configurations, these metrics populate under the Bucket Metrics endpoint instead.
Name |
Description |
|---|---|
|
Total number of active replication workers |
|
Average number of active replication workers |
|
Maximum number of active replication workers seen since server start |
|
Reports whether the replication link is online (1) or offline (0). |
|
Total duration of replication link being offline in seconds since last offline event |
|
Total downtime of replication link in seconds since server start |
|
Average replication link latency in milliseconds |
|
Maximum replication link latency in milliseconds seen since server start |
|
Current replication link latency in milliseconds |
|
Current replication transfer rate in bytes/sec |
|
Average replication transfer rate in bytes/sec |
|
Maximum replication transfer rate in bytes/sec seen since server start |
|
Total number of objects queued for replication in the last full minute |
|
Total number of bytes queued for replication in the last full minute |
|
Average number of objects queued for replication since server start |
|
Average number of bytes queued for replication since server start |
|
Maximum number of bytes queued for replication seen since server start |
|
Maximum number of objects queued for replication seen since server start |
|
Total number of objects seen in replication backlog in the last 5 minutes |
Healing Metrics
Name |
Description |
|---|---|
|
Objects for which healing failed in current self healing run. |
|
Objects healed in current self healing run. |
|
Objects scanned in current self healing run. |
|
Time elapsed (in nanoseconds) since last self healing activity. |
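Because the elapsed-time metric is reported in nanoseconds, it usually needs converting before it is readable or comparable with a threshold. The helper below is a small sketch with a hypothetical name; it truncates to whole microseconds, which is the finest resolution `datetime.timedelta` supports.

```python
from datetime import timedelta


def ns_to_timedelta(nanoseconds):
    """Convert an elapsed time in nanoseconds to a timedelta."""
    # timedelta has microsecond resolution, so drop the sub-microsecond part.
    return timedelta(microseconds=int(nanoseconds) // 1000)
```

A scraped value of 90000000000 converts to a timedelta of 90 seconds.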
Inter Node Metrics
Name |
Description |
|---|---|
|
Average time of internode TCP dial calls. |
|
Total number of internode TCP dial timeouts and errors. |
|
Total number of failed internode calls. |
|
Total number of bytes received from other peer nodes. |
|
Total number of bytes sent to the other peer nodes. |
Bucket Notification Metrics
Name |
Description |
|---|---|
|
Number of concurrent async Send calls active to all targets (deprecated, please use |
|
Events that failed to be sent to the targets (deprecated, please use |
|
Total number of events sent to the targets (deprecated, please use |
|
Events that were skipped instead of being sent to the targets because the in-memory queue was full |
|
Number of concurrent async Send calls active to the target |
|
Number of events currently staged in the queue_dir configured for the target. |
|
Total number of events sent (or) queued to the target |
S3 API Request Metrics
Name |
Description |
|---|---|
|
Total number of S3 requests with (4xx) errors. |
|
Total number of S3 requests with (5xx) errors. |
|
Total number of S3 requests canceled by the client. |
|
Total number of S3 requests with (4xx and 5xx) errors. |
|
Volatile number of total incoming S3 requests. |
|
Total number of S3 requests currently in flight. |
|
Total number of S3 requests rejected for auth failure. |
|
Total number of S3 requests rejected for invalid header. |
|
Total number of invalid S3 requests. |
|
Total number of S3 requests rejected for invalid timestamp. |
|
Total number of S3 requests. |
|
Number of S3 requests in the waiting queue. |
|
Distribution of the time to first byte across API calls. |
|
Total number of S3 bytes received. |
|
Total number of S3 bytes sent. |
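The request totals in this table accumulate over time, so an error ratio is usually computed from the difference between two scrapes rather than from the raw values. The sketch below assumes the totals behave like Prometheus counters that start again from zero when a server restarts; the function names are hypothetical.

```python
def counter_increase(previous, current):
    """Increase of a counter between two scrapes.

    If the current value is lower than the previous one, assume the
    counter was reset and count only what accumulated since the reset.
    """
    return current if current < previous else current - previous


def error_ratio(prev_errors, curr_errors, prev_total, curr_total):
    """Share of requests that ended in an error between two scrapes."""
    total = counter_increase(prev_total, curr_total)
    if total == 0:
        return 0.0  # no requests in the interval
    return counter_increase(prev_errors, curr_errors) / total
```

If errors grew from 10 to 15 while total requests grew from 1000 to 1100, the ratio for that interval is 0.05. In Prometheus itself, the built-in rate and increase functions handle counter resets in a similar way.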
Software Metrics
Name |
Description |
|---|---|
|
Git commit hash for the MinIO release. |
|
MinIO Release tag for the server. |
Drive Metrics
Name |
Description |
|---|---|
|
Total storage available on a drive. |
|
Total free inodes. |
|
Average last minute latency in µs for drive API storage operations. |
|
Total drives offline in this node. |
|
Total drives online in this node. |
|
Total drives in this node. |
|
Total storage on a drive. |
|
Total storage used on a drive. |
|
Total number of drive timeout errors since server start |
|
Total number of drive I/O errors since server start |
|
Total number of drive I/O errors, timeouts since server start |
|
Total number I/O operations waiting on drive |
Identity and Access Management (IAM) Metrics
Name |
Description |
|---|---|
|
Last successful IAM data sync duration in milliseconds. |
|
Time (in milliseconds) since last successful IAM data sync. |
|
Number of failed IAM data syncs since server start. |
|
Number of successful IAM data syncs since server start. |
Information Lifecycle Management (ILM) Metrics
Name |
Description |
|---|---|
|
Number of pending ILM expiry tasks in the queue. |
|
Number of active ILM transition tasks. |
|
Number of pending ILM transition tasks in the queue. |
|
Number of missed immediate ILM transition tasks. |
|
Total number of object versions checked for ILM actions since server start. |
|
Total action outcome of lifecycle checks since server start for deleting object |
|
Total action outcome of lifecycle checks since server start for deleting a version |
|
Total action outcome of lifecycle checks since server start for transition of an object |
|
Total action outcome of lifecycle checks since server start for transition of a particular object version |
|
Total action outcome of lifecycle checks since server start for deletion of temporarily restored object |
|
Total action outcome of lifecycle checks since server start for deletion of a temporarily restored version |
|
Total action outcome of lifecycle checks since server start for deletion of all versions |
Tier Metrics
Name |
Description |
|---|---|
|
Distribution of time to last byte for objects downloaded from warm tier |
|
Number of requests to download object from warm tier that were successful |
|
Number of requests to download object from warm tier that failed |
System Metrics
Name |
Description |
|---|---|
|
Limit on total number of open file descriptors for the MinIO Server process. |
|
Total number of open file descriptors by the MinIO Server process. |
|
Total number of goroutines running. |
|
Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar. |
|
Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes. |
|
Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar. |
|
Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes. |
|
Total user and system CPU time spent in seconds by the process. |
|
Resident memory size in bytes. |
|
Virtual memory size in bytes. |
|
Start time for MinIO process per node, time in seconds since Unix epoch. |
|
Uptime for MinIO process per node in seconds. |
Scanner Metrics
Name |
Description |
|---|---|
|
Total number of bucket scans finished since server start. |
|
Total number of bucket scans started since server start. |
|
Total number of directories scanned since server start. |
|
Total number of unique objects scanned since server start. |
|
Total number of object versions scanned since server start. |
|
Total read SysCalls to the kernel. /proc/[pid]/io syscr. |
|
Total write SysCalls to the kernel. /proc/[pid]/io syscw. |
|
Time elapsed (in nanoseconds) since last scan activity. |
Changed in version RELEASE.2025-03-12T17-29-24Z: For performance reasons, v2 metrics support at most 100 buckets. To collect metrics for more buckets, use v3 metrics instead.
Bucket Metrics
MinIO collects the following metrics at the bucket level.
Each metric includes the bucket label to identify the corresponding bucket.
Metrics may include one or more additional labels, such as the server that calculated that metric.
These metrics can be obtained from any MinIO server once per collection by using the following URL:
https://HOSTNAME:PORT/minio/v2/metrics/bucket
Replace HOSTNAME:PORT with the hostname of your MinIO deployment.
For deployments behind a load balancer, use the load balancer hostname instead of a single node hostname.
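Since every bucket-level sample carries a bucket label, scraped samples can be grouped per bucket before further processing. The sketch below assumes the samples are already parsed into (name, labels, value) tuples; the function name and the sample metric names in the usage note are hypothetical.

```python
from collections import defaultdict


def group_by_bucket(samples):
    """Group (name, labels, value) samples by their 'bucket' label.

    Samples without a bucket label are skipped. Each bucket maps to a
    list, because one metric can appear several times for the same
    bucket with different additional labels (for example, per server).
    """
    grouped = defaultdict(list)
    for name, labels, value in samples:
        bucket = labels.get("bucket")
        if bucket is None:
            continue
        grouped[bucket].append((name, labels, value))
    return dict(grouped)
```

Grouping two samples labeled `bucket="photos"` and one labeled `bucket="logs"` gives a dictionary with two keys. Counting the keys is also a quick way to see how many buckets a scrape actually covered.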
Distribution Metrics
Name |
Description |
|---|---|
|
Distribution of object sizes in the bucket, includes label for the bucket name. |
|
Distribution of object sizes in a bucket, by number of versions |
Replication Metrics
These metrics only populate on deployments with Bucket Replication or Batch Replication configurations. For deployments with Site Replication configured, select metrics populate under the Cluster Metrics endpoint.
Name |
Description |
|---|---|
|
Total number of bytes that failed at least once to replicate in the last full minute. |
|
Total number of objects which failed replication in the last full minute. |
|
Total number of bytes that failed at least once to replicate in the last full hour. |
|
Total number of objects which failed replication in the last full hour. |
|
Total number of bytes that failed at least once to replicate since server start. |
|
Total number of objects which failed replication since server start. |
|
Replication latency in milliseconds. |
|
Total number of bytes replicated to this bucket from another source bucket. |
|
Total number of objects received by this bucket from another source bucket. |
|
Total number of bytes replicated to the target bucket. |
|
Total number of objects replicated to the target bucket. |
|
Total number of replication credential errors since server start |
|
Number of GET requests proxied to replication target |
|
Number of HEAD requests proxied to replication target |
|
Number of DELETE tagging requests proxied to replication target |
|
Number of GET tagging requests proxied to replication target |
|
Number of PUT tagging requests proxied to replication target |
|
Number of failures in GET requests proxied to replication target |
|
Number of failures in HEAD requests proxied to replication target |
|
Number of failures in DELETE tagging proxy requests to replication target |
|
Number of failures in GET tagging proxy requests to replication target |
|
Number of failures in PUT tagging proxy requests to replication target |
Traffic Metrics
Name |
Description |
|---|---|
|
Total number of S3 bytes received for this bucket. |
|
Total number of S3 bytes sent for this bucket. |
Usage Metrics
Name |
Description |
|---|---|
|
Total number of objects. |
|
Total number of versions (includes delete marker) |
|
Total number of delete markers. |
|
Total bucket size in bytes. |
|
Total bucket quota size in bytes. |
Requests Metrics
Name |
Description |
|---|---|
|
Total number of S3 requests with (4xx) errors on a bucket. |
|
Total number of S3 requests with (5xx) errors on a bucket. |
|
Total number of S3 requests currently in flight on a bucket. |
|
Total number of S3 requests on a bucket. |
|
Total number of S3 requests canceled by the client. |
|
Distribution of time to first byte across API calls per bucket. |
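A distribution metric such as time to first byte can be summarized as an approximate quantile. The sketch below assumes the distribution is exposed as cumulative buckets, each identified by an upper bound, in the style of Prometheus histograms; check the scraped output before relying on that assumption. The function is a simplified, hypothetical re-implementation of linear interpolation within a bucket, not MinIO or Prometheus code.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    The largest bucket is expected to have an upper bound of +Inf and to
    hold the total observation count.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("at least one bucket is required")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound):
                # No finite upper bound: report the last finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket containing the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

With cumulative counts of 50 at 0.1 s, 90 at 0.5 s, and 100 at +Inf, the estimated median is 0.1 s and the estimated 95th percentile is reported as 0.5 s, the highest finite bound.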
Resource Metrics
MinIO collects the following resource metrics at the node level.
Each metric includes the server label to identify the corresponding node.
Metrics may include one or more additional labels, such as the drive path, interface name, etc.
These metrics can be obtained from any MinIO server once per collection by using the following URL:
https://HOSTNAME:PORT/minio/v2/metrics/resource
Replace HOSTNAME:PORT with the hostname of your MinIO deployment.
For deployments behind a load balancer, use the load balancer hostname instead of a single node hostname.
Drive Resource Metrics
Name |
Description |
|---|---|
|
Total bytes on a drive. |
|
Used bytes on a drive. |
|
Total inodes on a drive. |
|
Total inodes used on a drive. |
|
Reads per second on a drive. |
|
Kilobytes read per second on a drive. |
|
Average time for read requests to be served on a drive. |
|
Writes per second on a drive. |
|
Kilobytes written per second on a drive. |
|
Average time for write requests to be served on a drive. |
|
Percentage of time the disk was busy since uptime. |
Network Interface Metrics
Name |
Description |
|---|---|
|
Bytes received on the interface in 60s. |
|
Bytes received on the interface in 60s (avg) since uptime. |
|
Bytes received on the interface in 60s (max) since uptime. |
|
Receive errors in 60s. |
|
Receive errors in 60s (avg). |
|
Receive errors in 60s (max). |
|
Bytes transmitted in 60s. |
|
Bytes transmitted in 60s (avg). |
|
Bytes transmitted in 60s (max). |
|
Transmit errors in 60s. |
|
Transmit errors in 60s (avg). |
|
Transmit errors in 60s (max). |
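The interface metrics report byte counts per 60-second window, so a per-second rate takes one division. The sketch below assumes each value is the number of bytes accumulated over the window; the function names are hypothetical.

```python
def window_to_rate(bytes_in_window, window_seconds=60.0):
    """Average bytes per second over the reporting window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return bytes_in_window / window_seconds


def bytes_per_sec_to_mbps(bytes_per_second):
    """Convert bytes per second to megabits per second (10**6 bits)."""
    return bytes_per_second * 8 / 1_000_000
```

A window value of 750000000 bytes corresponds to 12500000 bytes per second, or 100 Mbit/s, which can then be compared with the nominal speed of the interface.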
CPU Metrics
Name |
Description |
|---|---|
|
CPU user time. |
|
CPU user time (avg). |
|
CPU user time (max). |
|
CPU system time. |
|
CPU system time (avg). |
|
CPU system time (max). |
|
CPU idle time. |
|
CPU idle time (avg). |
|
CPU idle time (max). |
|
CPU ioWait time. |
|
CPU ioWait time (avg). |
|
CPU ioWait time (max). |
|
CPU nice time. |
|
CPU nice time (avg). |
|
CPU nice time (max). |
|
CPU steal time. |
|
CPU steal time (avg). |
|
CPU steal time (max). |
|
CPU load average 1min. |
|
CPU load average 1min (avg). |
|
CPU load average 1min (max). |
|
CPU load average 1min (percentage). |
|
CPU load average 1min (percentage) (avg). |
|
CPU load average 1min (percentage) (max). |
|
CPU load average 5min. |
|
CPU load average 5min (avg). |
|
CPU load average 5min (max). |
|
CPU load average 5min (percentage). |
|
CPU load average 5min (percentage) (avg). |
|
CPU load average 5min (percentage) (max). |
|
CPU load average 15min. |
|
CPU load average 15min (avg). |
|
CPU load average 15min (max). |
|
CPU load average 15min (percentage). |
|
CPU load average 15min (percentage) (avg). |
|
CPU load average 15min (percentage) (max). |
Memory Metrics
Name |
Description |
|---|---|
|
Available memory on the node. |
|
Available memory on the node (avg). |
|
Available memory on the node (max). |
|
Buffers memory on the node. |
|
Buffers memory on the node (avg). |
|
Buffers memory on the node (max). |
|
Cache memory on the node. |
|
Cache memory on the node (avg). |
|
Cache memory on the node (max). |
|
Free memory on the node. |
|
Free memory on the node (avg). |
|
Free memory on the node (max). |
|
Shared memory on the node. |
|
Shared memory on the node (avg). |
|
Shared memory on the node (max). |
|
Total memory on the node. |
|
Total memory on the node (avg). |
|
Total memory on the node (max). |
|
Used memory on the node. |
|
Used memory on the node (avg). |
|
Used memory on the node (max). |
|
Used memory percentage on the node. |
|
Used memory percentage on the node (avg). |
|
Used memory percentage on the node (max). |