Module metrics


Defines the metrics system for Network devices.

§Metrics format

The metrics are flushed in JSON when requested by vmm::logger::metrics::METRICS.write().

§JSON example with metrics:

{
 "net_eth0": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "mac_address_updates": "SharedIncMetric",
    "no_rx_avail_buffer": "SharedIncMetric",
    "no_tx_avail_buffer": "SharedIncMetric",
    ...
 },
 "net_eth1": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "mac_address_updates": "SharedIncMetric",
    "no_rx_avail_buffer": "SharedIncMetric",
    "no_tx_avail_buffer": "SharedIncMetric",
    ...
 },
 ...
 "net_iface_id": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "mac_address_updates": "SharedIncMetric",
    "no_rx_avail_buffer": "SharedIncMetric",
    "no_tx_avail_buffer": "SharedIncMetric",
    ...
 },
 "net": {
    "activate_fails": "SharedIncMetric",
    "cfg_fails": "SharedIncMetric",
    "mac_address_updates": "SharedIncMetric",
    "no_rx_avail_buffer": "SharedIncMetric",
    "no_tx_avail_buffer": "SharedIncMetric",
    ...
 }
}

Each net field in the example above is a serializable NetDeviceMetrics structure collecting metrics such as activate_fails, cfg_fails, etc. for a network device. net_eth0 represents the metrics for the endpoint “/network-interfaces/eth0”, net_eth1 represents the metrics for the endpoint “/network-interfaces/eth1”, and net_iface_id represents the metrics for the network device at the endpoint “/network-interfaces/{iface_id}”. net is the aggregate of all the per-device metrics.
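
The relationship between the per-device entries and the aggregate net entry can be illustrated with a minimal sketch. It uses plain u64 counters and only two of the fields from the example. SketchCounts and with_aggregate are hypothetical names chosen for illustration; they are not part of this module, and the real system serializes through serde rather than building a map by hand.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for per-device counters.
/// Only two of the fields from the JSON example are modelled.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct SketchCounts {
    pub activate_fails: u64,
    pub cfg_fails: u64,
}

/// Builds the set of entries that would be written out: one
/// "net_{iface_id}" entry per device plus a "net" entry holding
/// the field-wise sum over all devices.
pub fn with_aggregate(
    per_device: &BTreeMap<String, SketchCounts>,
) -> BTreeMap<String, SketchCounts> {
    let mut out = BTreeMap::new();
    let mut total = SketchCounts::default();
    for (iface_id, counts) in per_device {
        // Accumulate each field into the aggregate.
        total.activate_fails += counts.activate_fails;
        total.cfg_fails += counts.cfg_fails;
        // Per-device entries are named after the interface id.
        out.insert(format!("net_{}", iface_id), *counts);
    }
    out.insert("net".to_string(), total);
    out
}
```

With devices eth0 and eth1 in the input map, the result holds three entries: net_eth0, net_eth1, and net.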

§Limitations

Network devices currently do not have vmm::logger::metrics::StoreMetrics, so the aggregate doesn’t consider them.

§Design

The main design goals of this system are:

  • To improve network device metrics by logging them at per device granularity.

  • Continue to provide aggregate net metrics to maintain backward compatibility.

  • Move NetDeviceMetrics out of logger and decouple it.

  • Use lockless operations, preferably ones that don’t require anything other than simple reads/writes being atomic.

  • Rely on serde to provide the actual serialization for writing the metrics.

  • Since all metrics start at 0, we implement the Default trait via derive for all of them, to avoid having to initialize everything by hand.

  • Devices can be created in any order, i.e. the first device created could be either eth0 or eth1. If we used a vector for NetDeviceMetrics and called the first device net0, then net0 could sometimes point to eth0 and sometimes to eth1, which doesn’t help with analysing the metrics. So, use a Map instead of a Vec to make clear which interface the metrics actually belong to.

  • We use “net_$iface_id” for the metrics name instead of “net_$tap_name” to be consistent with the net endpoint “/network-interfaces/{iface_id}”.
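
The Map-over-Vec and naming goals above can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: SketchDeviceMetrics, SketchRegistry, and their methods are hypothetical names, the counters are bare AtomicU64 values, and an RwLock guards the map itself while the counters are updated without taking any lock.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical per-device counters. Default is derived, so every
/// counter starts at 0 without manual initialization.
#[derive(Debug, Default)]
pub struct SketchDeviceMetrics {
    pub cfg_fails: AtomicU64,
}

impl SketchDeviceMetrics {
    /// Lock-free increment of the counter.
    pub fn record_cfg_fail(&self) {
        self.cfg_fails.fetch_add(1, Ordering::Relaxed);
    }

    pub fn cfg_fail_count(&self) -> u64 {
        self.cfg_fails.load(Ordering::Relaxed)
    }
}

/// Hypothetical registry keyed by interface id rather than by
/// creation order, so an entry always refers to the same interface.
#[derive(Debug, Default)]
pub struct SketchRegistry {
    per_device: RwLock<BTreeMap<String, Arc<SketchDeviceMetrics>>>,
}

impl SketchRegistry {
    /// Returns the metrics for `iface_id`, creating them on first use.
    pub fn alloc(&self, iface_id: &str) -> Arc<SketchDeviceMetrics> {
        let mut map = self.per_device.write().unwrap();
        map.entry(iface_id.to_string()).or_default().clone()
    }

    /// Metric names as they would appear in the output.
    pub fn metric_names(&self) -> Vec<String> {
        let map = self.per_device.read().unwrap();
        map.keys().map(|id| format!("net_{}", id)).collect()
    }
}
```

Registering eth1 before eth0 still yields the names net_eth0 and net_eth1, each bound to its own interface, which is the property a creation-ordered Vec would not give.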

The system implements one type of metric:

  • Shared Incremental Metrics (SharedIncMetrics) - dedicated to metrics which need a counter (e.g. the number of times an API request failed). These metrics are reset upon flush.
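
One way to picture "reset upon flush" is a counter that reports only the increments recorded since the previous flush. The sketch below is a hypothetical illustration (SketchIncMetric and its methods are not this crate's API). It assumes a single flusher at a time, since two concurrent flushes could report the same increments twice.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical incremental counter that can be shared between
/// threads. `total` only ever grows; `flushed` remembers the value
/// of `total` at the last flush.
#[derive(Debug, Default)]
pub struct SketchIncMetric {
    total: AtomicU64,
    flushed: AtomicU64,
}

impl SketchIncMetric {
    /// Adds 1 to the counter.
    pub fn inc(&self) {
        self.add(1);
    }

    /// Adds `n` to the counter without taking a lock.
    pub fn add(&self, n: u64) {
        self.total.fetch_add(n, Ordering::Relaxed);
    }

    /// Total number of increments since creation.
    pub fn count(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }

    /// Returns the increments recorded since the previous flush and
    /// marks them as reported. Assumes one flusher at a time.
    pub fn flush(&self) -> u64 {
        let snapshot = self.total.load(Ordering::Relaxed);
        let delta = snapshot.wrapping_sub(self.flushed.load(Ordering::Relaxed));
        self.flushed.store(snapshot, Ordering::Relaxed);
        delta
    }
}
```

Writers never block in this design: an increment that lands between the snapshot and the store is simply reported by the next flush.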

We use net::metrics::METRICS instead of adding a NetDeviceMetrics entry to Net so that the metrics can be flushed even from signal handlers.

Structs§

NetDeviceMetrics
Network-related metrics.
NetMetricsPerDevice
Map of network interface ID to metrics. This should be protected by a lock before accessing.

Functions§

flush_metrics
This function facilitates aggregation and serialization of per net device metrics.