Stack Probe: Metrics Aggregation using Stat Collectors
Stack probes provide information, either to VMX-Analysis or via the Core Data Feed, in two main ways:
As individual messages
As summarised statistics.
Message publishing is covered in the following section on Message Collectors.
Summarised statistics are useful, for example, for network information where downstream systems are not sized to cope with the full message rate of the data from the network. Summarised statistics are also useful as an overview of performance over longer time periods. Careful use of the stored percentiles makes these statistics even more valuable.
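The data reduction that summarisation buys can be sketched as follows. This is illustrative Python, not Beeks code: the record field names and the nearest-rank percentile method are assumptions for the sketch.

```python
# Illustrative sketch only: collapse one window of per-packet wire
# latencies into a single fixed-size statistics record, so downstream
# systems see one record per interval instead of every packet.
import random

def summarise(latencies_ns, percentiles=(50, 95, 99)):
    """Summarise raw latencies into count/min/max plus chosen percentiles."""
    ordered = sorted(latencies_ns)
    record = {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }
    for p in percentiles:
        # Nearest-rank percentile over the sorted window.
        idx = max(0, round(p / 100 * len(ordered)) - 1)
        record[f"p{p}"] = ordered[idx]
    return record

# 100,000 packets in the window become one summary record.
window = [random.randint(1_000, 50_000) for _ in range(100_000)]
print(summarise(window))
```

Publishing only the summary record keeps the downstream message rate constant regardless of the packet rate on the wire, while the percentiles preserve the shape of the latency distribution.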
Both VMX-Capture and VMX-Analysis share the concept of aggregations as the data structure in which statistics are presented. See the Analytics Concepts Guide or Beeks Analytics Data Guide for a high-level introduction to aggregations.
VMX-Analysis computes stats based on the Agent events it receives, correlates, and associates.
VMX-Capture can generate stats about the quality of the Visibility Points that the Agents reside within - such as network, middleware, and market data stats.
Aggregations of the results of calculations that are performed in the VMX-Capture layer are known as pre-aggregated statistics. This distinguishes them from the aggregators that are defined only in the VMX-Analysis layer.
Where statistics are pre-aggregated by VMX-Capture, this can be performed in two places:
The stat_collector within an individual probe can perform pre-aggregation of the traffic that it has visibility of, and can pass these statistics (including aggregations) to VMX-Analysis.
If multiple stack probes are processing messages at high volume, and the results need to be combined before they are passed to VMX-Analysis, then the P3 pre-aggregation function can be used for this.
This section describes the stat_collector within the stack probe, which is the standard way of performing pre-aggregation. See Advanced VMX-Capture Configuration for an overview of the P3 process, including P3 pre-aggregation.
Once the statistics are calculated, they can be output to another system using one of the following methods:
The statistics can be sent to VMX-Analysis as statistics events. The VMX-Analysis server does not need to calculate the statistics, but it does take responsibility for persisting them and making them available to query.
The statistics can be sent to a customer application via the Core Data Feed. See the CDF-T section of the Core Data Feed Guide for more information. The kafka collector is the stat collector responsible for CDF-T output.
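For illustration only, a kafka collector entry would follow the same type/value/id pattern as the other stat collectors shown below; the module name and id here are placeholders, not documented values:

    {
        "type": "module",
        "value": "vmxkafkaconnector",   // placeholder module name
        "id": "coll_kafka_cdf"          // placeholder id
    }

See the Core Data Feed Guide for the actual module name and its parameters.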
Example Stat Collector Configuration
The stat collector defines which statistics to compute for the packets that match the BPF.
For example:
"stat_collector"
: [
{
"type"
:
"module"
,
"value"
:
"vmxgenericaggregatorinputeventconnector"
,
"id"
:
"mdStats"
// md stats is configured for gap detection, wire latency, micro bursts etc
},
{
"type"
:
"module"
,
"value"
:
"vmxanomalyconnector"
,
"id"
:
"coll_vmxanomalyconnector"
// anomaly connector creates stats on anomalies.
}
]
As with the decoder definitions, where an id is defined there will be a further definition for that particular statistic collector later in the configuration file.
Statistics are provided to VMX-Analysis as aggregations. In the above example, the mdStats part of the configuration references a separate file containing the aggregator definition:
"mdStats"
: {
"parameters"
: {
"blocking"
:
false
,
"pool_size"
:
1000
,
"buffer_size"
:
336
,
"publish_interval_us"
:
10000000
,
"timestamp"
:
"TIMESTAMP"
,
"connector_id"
:
"MD_statsAgent"
,
"node_path_stats_json_filename"
:
"$VMX_HOME/../../server/config/agent/global/preagg/MD_stats.stack.agg.json"
}
},