Defines the metrics system for pmem devices.
§Metrics format
The metrics are flushed in JSON when requested by vmm::logger::metrics::METRICS.write().
§JSON example with metrics:
{
"pmem_drv0": {
"activate_fails": "SharedIncMetric",
"cfg_fails": "SharedIncMetric",
"no_avail_buffer": "SharedIncMetric",
"event_fails": "SharedIncMetric",
"execute_fails": "SharedIncMetric",
...
},
"pmem_drv1": {
"activate_fails": "SharedIncMetric",
"cfg_fails": "SharedIncMetric",
"no_avail_buffer": "SharedIncMetric",
"event_fails": "SharedIncMetric",
"execute_fails": "SharedIncMetric",
...
},
...
"pmem_drive_id": {
"activate_fails": "SharedIncMetric",
"cfg_fails": "SharedIncMetric",
"no_avail_buffer": "SharedIncMetric",
"event_fails": "SharedIncMetric",
"execute_fails": "SharedIncMetric",
...
},
"pmem": {
"activate_fails": "SharedIncMetric",
"cfg_fails": "SharedIncMetric",
"no_avail_buffer": "SharedIncMetric",
"event_fails": "SharedIncMetric",
"execute_fails": "SharedIncMetric",
...
}
}
Each pmem field in the example above is a serializable PmemDeviceMetrics structure
that collects metrics such as activate_fails, cfg_fails, etc. for a pmem device.
pmem_drv0 represents the metrics for the endpoint “/pmem/drv0”,
pmem_drv1 represents the metrics for the endpoint “/pmem/drv1”, and
pmem_drive_id represents the metrics for the endpoint “/pmem/{drive_id}”.
pmem is the aggregate of all the per-device metrics.
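The relationship between the per-device entries and the aggregate can be sketched as follows. This is a minimal, dependency-free illustration and not the module's actual code: the Counters struct, its add method, and the flushed_view function are hypothetical names, only two of the counters are modelled, and plain u64 fields stand in for the real metric types.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the per-device metrics structure.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Counters {
    pub activate_fails: u64,
    pub cfg_fails: u64,
}

impl Counters {
    /// Add another device's counters into this one, field by field.
    fn add(&mut self, other: &Counters) {
        self.activate_fails += other.activate_fails;
        self.cfg_fails += other.cfg_fails;
    }
}

/// Build the view that would be flushed: one `pmem_{drive_id}` entry per
/// device, plus a `pmem` entry holding the sum over all devices.
pub fn flushed_view(per_device: &BTreeMap<String, Counters>) -> BTreeMap<String, Counters> {
    let mut out = BTreeMap::new();
    let mut total = Counters::default();
    for (drive_id, counters) in per_device {
        total.add(counters);
        out.insert(format!("pmem_{}", drive_id), *counters);
    }
    out.insert("pmem".to_string(), total);
    out
}
```

With two devices registered as drv0 and drv1, the resulting map has the keys pmem, pmem_drv0 and pmem_drv1, matching the shape of the JSON example.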
§Limitations
pmem devices currently do not have vmm::logger::metrics::StoreMetrics, so the aggregate
doesn’t consider them.
§Design
The main design goals of this system are:
- To improve pmem device metrics by logging them at per-device granularity.
- Continue to provide aggregate pmem metrics to maintain backward compatibility.
- Move PmemDeviceMetrics out of the logger and decouple it.
- Rely on serde to provide the actual serialization for writing the metrics.
- Since all metrics start at 0, we implement the Default trait via derive for all of them, to avoid having to initialize everything by hand.
- Devices could be created in any order, i.e. the first device created could be either drv0 or drv1. If we used a vector for PmemDeviceMetrics and called the first device pmem0, then pmem0 could sometimes point to drv0 and sometimes to drv1, which doesn’t help with analysing the metrics. So, use a Map instead of a Vec to make clear which drive the metrics actually belong to.
The system implements 1 type of metrics:
- Shared Incremental Metrics (SharedIncMetrics) - dedicated for the metrics which need a counter (e.g. the number of times an API request failed). These metrics are reset upon flush.
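As a rough illustration of a counter that can be incremented through a shared reference and is reset when flushed, consider the sketch below. It is an assumption-laden simplification: IncCounter and its inc and flush methods are hypothetical names, and the real SharedIncMetric type may track and report its value differently.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a shared incremental metric.
/// Deriving `Default` makes the counter start at 0 without manual setup.
#[derive(Debug, Default)]
pub struct IncCounter(AtomicU64);

impl IncCounter {
    /// Increment the counter by one; only needs `&self`, so it can be shared.
    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// Return the value accumulated since the last flush and reset it to 0.
    pub fn flush(&self) -> u64 {
        self.0.swap(0, Ordering::Relaxed)
    }
}
```

Calling inc twice and then flush yields 2, and an immediate second flush yields 0, which is the "reset upon flush" behaviour described above.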
We add PmemDeviceMetrics entries from pmem::metrics::METRICS into the Pmem device, instead of having each Pmem device own separate PmemDeviceMetrics entries, because the Pmem device is not accessible from signal handlers to flush metrics, whereas pmem::metrics::METRICS is.
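One way to picture this arrangement is a global, lock-protected map that hands out shared handles to devices. The sketch below is illustrative only and uses the standard library alone: REGISTRY, DeviceCounters, alloc and read_activate_fails are hypothetical names and do not reflect the actual signatures in pmem::metrics.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical per-device metrics; `Default` zero-initializes every counter.
#[derive(Debug, Default)]
pub struct DeviceCounters {
    pub activate_fails: AtomicU64,
}

/// Hypothetical global registry keyed by drive id. Because it is a static,
/// code that cannot reach the device object can still read it.
static REGISTRY: RwLock<BTreeMap<String, Arc<DeviceCounters>>> =
    RwLock::new(BTreeMap::new());

/// Return the shared metrics handle for `drive_id`, creating it on first use.
/// A device would keep the returned `Arc` and update counters through it.
pub fn alloc(drive_id: &str) -> Arc<DeviceCounters> {
    let mut map = REGISTRY.write().unwrap();
    let entry = map.entry(drive_id.to_string()).or_default();
    Arc::clone(entry)
}

/// Read one counter through the registry, without access to the device.
pub fn read_activate_fails(drive_id: &str) -> Option<u64> {
    let map = REGISTRY.read().unwrap();
    map.get(drive_id)
        .map(|m| m.activate_fails.load(Ordering::Relaxed))
}
```

Keying the map by drive id also means the entry for drv0 is the same regardless of the order in which devices are created.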
Structs§
- PmemMetrics - Pmem Device associated metrics.
- PmemMetricsPerDevice - Map of pmem drive id to metrics; this should be protected by a lock before accessing.
Functions§
- flush_metrics - This function facilitates aggregation and serialization of per pmem device metrics.