Defines the metrics system for vhost-user devices.
§Metrics format
The metrics are flushed in JSON when requested by vmm::logger::metrics::METRICS.write().
§JSON example with metrics:
{
  "vhost_user_{mod}_id0": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "init_time_us": "SharedStoreMetric",
    "activate_time_us": "SharedStoreMetric",
    "config_change_time_us": "SharedStoreMetric"
  },
  "vhost_user_{mod}_id1": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "init_time_us": "SharedStoreMetric",
    "activate_time_us": "SharedStoreMetric",
    "config_change_time_us": "SharedStoreMetric"
  },
  ...
  "vhost_user_{mod}_idN": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "init_time_us": "SharedStoreMetric",
    "activate_time_us": "SharedStoreMetric",
    "config_change_time_us": "SharedStoreMetric"
  }
}
Each vhost_user field in the example above is a serializable VhostUserDeviceMetrics
structure collecting metrics such as activate_fails, cfg_fails, init_time_us,
activate_time_us and config_change_time_us for the vhost_user device.
For a vhost-user block device with endpoint “/drives/drv0”, the metrics are emitted under the key
vhost_user_block_drv0.
For a vhost-user block device with endpoint “/drives/drvN”, the metrics are emitted under the key
vhost_user_block_drvN.
Aggregate metrics for vhost_user are not emitted, as they can be easily obtained with
typical observability tools.
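The naming rule above can be sketched as follows. This is only an illustration: `metrics_key` and `id_from_endpoint` are hypothetical helpers that are not part of this module, and taking the last path component of the endpoint as the id is an assumption drawn from the drv0/drvN examples.

```rust
/// Hypothetical helper: builds the key under which a device's metrics
/// are emitted, following the `vhost_user_{mod}_{id}` pattern.
fn metrics_key(module: &str, id: &str) -> String {
    format!("vhost_user_{}_{}", module, id)
}

/// Hypothetical helper: uses the last path component of an endpoint
/// such as "/drives/drv0" as the device id. Returns `None` when the
/// endpoint ends with '/' or is empty.
fn id_from_endpoint(endpoint: &str) -> Option<&str> {
    endpoint.rsplit('/').next().filter(|id| !id.is_empty())
}
```

With these helpers, the endpoint “/drives/drv0” of a block device maps to the key vhost_user_block_drv0.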
§Design
The main design goals of this system are:
- To improve vhost_user device metrics by logging them at per-device granularity.
- vhost_user is a new device with no metrics emitted before, so backward compatibility does not come into the picture as it did for the block/net devices. And since metrics can be easily aggregated using typical observability tools, we chose not to provide aggregate vhost_user metrics.
- Rely on serde to provide the actual serialization for writing the metrics.
- Since all metrics start at 0, we implement the Default trait via derive for all of them, to avoid having to initialize everything by hand.
- Follow the design of Block and Net device metrics and use a map of vhost_user device name and corresponding metrics.
- Metrics are flushed with key vhost_user_{module_specific_name} and each module sets an appropriate module_specific_name in the format {mod}_{id}, e.g. the vhost-user block device in this commit sets this as format!("{}_{}", "block_", config.drive_id.clone()); This way vhost_user_metrics stay generic while the specific vhost_user devices can have their unique metrics.
The system implements two types of metrics:
- Shared Incremental Metrics (SharedIncMetrics) - dedicated to metrics which need a counter (e.g. the number of times activating a device failed). These metrics are reset upon flush.
- Shared Store Metrics (SharedStoreMetrics) - targeted at keeping a persistent value; they are not intended to act as counters (e.g. for measuring the process start-up time).
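The difference between the two kinds of metrics can be illustrated with a minimal standard-library sketch. `IncCounter` and `StoredValue` are hypothetical stand-ins, not the real SharedIncMetric and SharedStoreMetric types, whose implementation is not shown on this page. The sketch assumes that "reset upon flush" means a flush reports only the increments made since the previous flush.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter-style metric: each flush reports the number of
/// increments since the previous flush and resets the counter to 0.
#[derive(Debug, Default)]
struct IncCounter(AtomicU64);

impl IncCounter {
    fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    /// Atomically reads the current count and resets it.
    fn flush(&self) -> u64 {
        self.0.swap(0, Ordering::Relaxed)
    }
}

/// Hypothetical store-style metric: keeps the last stored value, and
/// flushing does not reset it.
#[derive(Debug, Default)]
struct StoredValue(AtomicU64);

impl StoredValue {
    fn store(&self, value: u64) {
        self.0.store(value, Ordering::Relaxed);
    }

    /// Reads the stored value without modifying it.
    fn flush(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}
```

Both types derive Default, so a fresh instance starts at 0 without any manual initialization.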
We add VhostUserDeviceMetrics entries from vhost_user_metrics::METRICS into the vhost_user device, instead of giving each vhost_user device its own separate VhostUserDeviceMetrics entry, because the vhost_user device is not accessible from signal handlers to flush metrics, whereas vhost_user_metrics::METRICS is.
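One way to picture this arrangement is a global, lock-protected map that owns the entries and hands a shared reference to each device. The sketch below uses only the standard library and assumes a reduced field set. `DeviceMetrics`, `registry`, `alloc` and `flush_all` are hypothetical names, and the JSON is written by hand here, whereas the real module relies on serde.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, OnceLock, RwLock};

/// Hypothetical, reduced per-device metrics. Default gives all-zero values.
#[derive(Debug, Default)]
struct DeviceMetrics {
    activate_fails: AtomicU64, // counter, reset on flush
    init_time_us: AtomicU64,   // stored value, kept across flushes
}

type Registry = RwLock<BTreeMap<String, Arc<DeviceMetrics>>>;

/// Global map of device name to metrics, reachable without a handle
/// on any device.
fn registry() -> &'static Registry {
    static REGISTRY: OnceLock<Registry> = OnceLock::new();
    REGISTRY.get_or_init(|| RwLock::new(BTreeMap::new()))
}

/// Creates (or looks up) the entry for `name` and returns a shared
/// reference that the device keeps and updates.
fn alloc(name: &str) -> Arc<DeviceMetrics> {
    let mut map = registry().write().unwrap();
    map.entry(name.to_string()).or_default().clone()
}

/// Serializes every entry under the key `vhost_user_{name}`.
fn flush_all() -> String {
    let map = registry().read().unwrap();
    let entries: Vec<String> = map
        .iter()
        .map(|(name, m)| {
            format!(
                "\"vhost_user_{}\":{{\"activate_fails\":{},\"init_time_us\":{}}}",
                name,
                m.activate_fails.swap(0, Ordering::Relaxed),
                m.init_time_us.load(Ordering::Relaxed)
            )
        })
        .collect();
    format!("{{{}}}", entries.join(","))
}
```

Because the device only holds an Arc to an entry owned by the global map, the flush path never needs to reach the device itself.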
Structs§
- VhostUserDeviceMetrics - vhost_user device associated metrics.
- VhostUserMetricsPerDevice - Map of vhost_user drive id and metrics. This should be protected by a lock before accessing.
Functions§
- flush_metrics - This function facilitates serialization of vhost_user device metrics.