Analytics Concepts Guide

Aggregators in VMX-Analysis analyse Agent Events in real time, allowing you to perform calculations on specific Attributes and roll up (or slice and dice) the results using other business properties of those Agent Events.
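As a rough illustration of rolling up a calculation by business properties, the following Python sketch averages one attribute across events grouped by one or more properties. The event fields (client, venue, latency_us) and the roll_up function are hypothetical names chosen for the example; they are not part of VMX-Analysis.

```python
from collections import defaultdict

def roll_up(events, attribute, group_by):
    """Return the mean of `attribute` for each combination of `group_by` properties.

    `events` is a list of dicts; the field names are assumptions for this sketch.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for event in events:
        key = tuple(event[prop] for prop in group_by)
        totals[key][0] += event[attribute]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical Agent Events with two business properties and one attribute.
events = [
    {"client": "A", "venue": "X", "latency_us": 10},
    {"client": "A", "venue": "Y", "latency_us": 30},
    {"client": "B", "venue": "X", "latency_us": 20},
]
by_client = roll_up(events, "latency_us", ["client"])
by_client_and_venue = roll_up(events, "latency_us", ["client", "venue"])
```

Grouping by fewer properties gives a coarser roll-up; adding properties slices the same data more finely.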

Statistics can also be passed directly to VMX-Analysis, which can then persist and supplement the statistics with other calculations. For example, VMX-Analysis can add columns derived from the statistics passed to it, or provide ‘roll-up’ summary aggregator levels that are not included in the original statistics.

For a fuller introduction to aggregation and aggregators, see Creating useful statistics with Aggregators.

VMX-Analysis Aggregator Input Filters

An input filter lets you control which interval events are presented to a given Aggregator. For example, you can select specific intervals or intervals from one or more Flows within the monitored environment.
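Conceptually, an input filter behaves like a predicate applied to each event before it reaches the Aggregator. The Python sketch below illustrates the idea only; the "flow" field and the make_flow_filter helper are hypothetical and do not reflect the actual VMX-Analysis configuration syntax.

```python
def make_flow_filter(allowed_flows):
    """Build a predicate that accepts only events from the named Flows."""
    allowed = frozenset(allowed_flows)

    def accept(event):
        # Events with no "flow" field are rejected in this sketch.
        return event.get("flow") in allowed

    return accept

accept = make_flow_filter(["orders", "quotes"])
events = [{"flow": "orders"}, {"flow": "heartbeats"}, {"flow": "quotes"}, {}]
presented = [e for e in events if accept(e)]
```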

VMX-Analysis Aggregator Statistics and Persistence

In addition to Agent Event data, Aggregators can display statistics that have been pre-aggregated by VMX-Capture. Pre-aggregation supports statistics derived from high-volume traffic, e.g. market data.

Historic tick data capture

Any ticking Aggregator cell value can be persisted to a database to provide a historical record of how the value changed over time. For example, you can record the overall moving average latency and flow, or a sub-aggregation by location, subsystem, and so on.

Historic data for a given cell is known as a time series. The ticking cell value is monitored over a short configurable interval, for example 10 seconds, and the open, close, high, low, mean, and time-weighted mean are recorded in the database along with a timestamp and an identifier for the time series.
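To make the six recorded values concrete, here is a minimal Python sketch that summarises the ticks seen in one interval. It assumes that each value is held until the next tick (or the end of the interval) when computing the time-weighted mean; the exact definition used by the product is not stated here, and the names IntervalSummary and summarise_interval are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class IntervalSummary:
    open: float
    close: float
    high: float
    low: float
    mean: float
    time_weighted_mean: float

def summarise_interval(ticks, interval_end):
    """Summarise (timestamp, value) ticks, sorted by timestamp, for one interval.

    Assumption: a value stays in force until the next tick or `interval_end`.
    """
    if not ticks:
        raise ValueError("at least one tick is required")
    values = [value for _, value in ticks]
    next_times = [t for t, _ in ticks[1:]] + [interval_end]
    weighted = sum(v * (t_next - t) for (t, v), t_next in zip(ticks, next_times))
    covered = interval_end - ticks[0][0]
    twm = weighted / covered if covered > 0 else values[-1]
    return IntervalSummary(
        open=values[0],
        close=values[-1],
        high=max(values),
        low=min(values),
        mean=sum(values) / len(values),
        time_weighted_mean=twm,
    )

# Three ticks in a 10-second interval: 10 for 2 s, 20 for 6 s, 5 for 2 s.
summary = summarise_interval([(0, 10.0), (2, 20.0), (8, 5.0)], interval_end=10)
```

In this example the simple mean and the time-weighted mean differ because the value 20 was in force for most of the interval.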

By selecting a given time series, a chart can be generated from a set of value points in the database. To reduce the amount of data stored in the database, a compression scheme may be applied to store older data at a lower resolution.

Historic data charts are consequently available for each day of operation. They can be used in more advanced statistical operations such as determining a 30-day moving average. This data can also be plotted on charts and can be used as benchmarks for relative alerts.
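As an illustration of how daily history could feed a moving average and a relative alert, the Python sketch below computes a trailing mean and compares a current value against it. The window handling, the 20% tolerance, and the function names are assumptions made for the example, not product behaviour.

```python
from collections import deque

def moving_average(daily_values, window=30):
    """Yield the trailing mean over up to `window` most recent daily values."""
    recent = deque(maxlen=window)
    total = 0.0
    for value in daily_values:
        if len(recent) == window:
            total -= recent[0]  # the oldest value is about to be evicted
        recent.append(value)
        total += value
        yield total / len(recent)

def breaches_benchmark(current, benchmark, tolerance=0.2):
    """Return True if `current` deviates from `benchmark` by more than `tolerance`."""
    return abs(current - benchmark) > tolerance * abs(benchmark)

averages = list(moving_average([1, 2, 3, 4], window=2))
```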

Although we refer to this historic data as ‘charts’ for ease of reference, it is more accurate to think of it as a set of time series data that covers a particular period of time for a particular Aggregator cell at a particular interval. Time series data that is further in the past may be compressed to a lower resolution interval (e.g. one data point every minute instead of one data point every 10 seconds).
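One plausible way to reduce resolution is to merge consecutive data points into wider buckets. The Python sketch below merges open, close, high, and low values this way; it is an assumption about how such compression could work, not a description of the scheme the product applies, and the dictionary keys are illustrative.

```python
from itertools import groupby

def downsample(points, bucket_seconds):
    """Merge data points (sorted by "ts") into buckets of `bucket_seconds`."""
    merged = []
    bucket_of = lambda p: p["ts"] - p["ts"] % bucket_seconds
    for bucket_start, group in groupby(points, key=bucket_of):
        group = list(group)
        merged.append({
            "ts": bucket_start,
            "open": group[0]["open"],          # first value in the bucket
            "close": group[-1]["close"],       # last value in the bucket
            "high": max(p["high"] for p in group),
            "low": min(p["low"] for p in group),
        })
    return merged

# Twelve 10-second points become two 1-minute points.
points = [{"ts": i * 10, "open": i, "close": i + 1, "high": i + 2, "low": i - 1}
          for i in range(12)]
per_minute = downsample(points, 60)
```

Merging means correctly would also require the number of samples (or the duration) behind each point, which this sketch leaves out.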

Time series sets

You can record time series for a set of related cells. This is called a Time Series Set (TSS). Managing time series as a set simplifies configuration and offers new possibilities for charting and alerting. The benefits of using a Time Series Set rather than a single time series include:

A Time Series Set can expand automatically to include new time series.
For example, if a TSS includes all discovered clients, then as new clients are added to the monitored system, they are discovered and automatically added to the TSS.
A large number of time series can be configured in a single action, regardless of whether they are currently required.
A Time Series Set is efficient at bulk recording.
Because Time Series Sets have a hierarchical structure, you can drill down to explore details when viewing a TSS on a chart.
Because a Time Series Set represents a group of related time series, you can rank them for display purposes; for example, you can show the top 5 customers by order volume this month.
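Two of these benefits, automatic expansion and ranking, can be pictured with a small Python sketch. The TimeSeriesSet class below is a hypothetical in-memory model written for illustration; it is not the product's API, and ranking by the sum of recorded values is an assumed measure.

```python
from collections import defaultdict

def total_value(points):
    """Default ranking measure for this sketch: sum of recorded values."""
    return sum(value for _, value in points)

class TimeSeriesSet:
    def __init__(self):
        # key -> list of (timestamp, value); unseen keys are created on demand
        self.series = defaultdict(list)

    def record(self, key, timestamp, value):
        # A newly discovered key joins the set without extra configuration.
        self.series[key].append((timestamp, value))

    def top(self, n, measure=total_value):
        """Return the `n` keys with the highest measure, best first."""
        ranked = sorted(self.series, key=lambda k: measure(self.series[k]), reverse=True)
        return ranked[:n]

tss = TimeSeriesSet()
tss.record("client-a", 0, 5)
tss.record("client-b", 0, 9)
tss.record("client-c", 0, 1)   # a new client appears and is added automatically
tss.record("client-a", 10, 7)
top_two = tss.top(2)
```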

Types of Time Series Set

There are two kinds of Time Series Set:

Aggregator region Time Series Set
An Aggregator region is a subset of all the cells in an Aggregator. All cells in a defined region are recorded in bulk. You can define multiple overlapping Aggregator regions to work with charts.
Externally recorded Time Series Set
An externally recorded Time Series Set allows access to time series data recorded outside the Beeks Analytics server, most typically by an external capture device. By defining an external TSS you can use the externally captured data in VMX-Explorer dashboards, potentially combining external series with locally recorded series on a single chart.

Using Time Series Sets on Charts

Charts can help you to group and explore the related items in the set. You can use a region to refer to a group of related items rather than defining individual trend lines. For example, by defining a region from the children of a customer node, you can plot trends for a specific per-client measure and display the top five customers by that measure. As new children are discovered, they will automatically be added to the chart. This makes it possible to build dynamic charts without prior knowledge of future data content.

Charts also allow drill-down into a time series so you can navigate through the tree structure of the data to discover more detail. For example, if your Aggregator breaks down customer data further by connection, you could drill down through a given customer's trend line to show the individual connections that customer uses.
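Drill-down can be thought of as listing the direct children of a node in the Aggregator's tree. The Python sketch below models node paths as tuples; the path layout (customers, then customer, then connection) and the children_of helper are assumptions made for the example.

```python
def children_of(node_paths, parent):
    """Return the sorted direct children of `parent` among the known node paths."""
    depth = len(parent)
    return sorted({
        path[:depth + 1]
        for path in node_paths
        if len(path) > depth and path[:depth] == parent
    })

# Hypothetical node paths: customers / <customer> / <connection>
paths = [
    ("customers", "acme", "conn1"),
    ("customers", "acme", "conn2"),
    ("customers", "globex", "conn1"),
]
customers = children_of(paths, ("customers",))
acme_connections = children_of(paths, ("customers", "acme"))
```

Calling the helper on the customer level corresponds to a region built from the children of a customer node; calling it again on one customer corresponds to drilling down into that customer's connections.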

Long term moving average

Beeks Analytics supports querying of long-term moving average (or derived) columns if they are available on an aggregator source. These appear in the column selection alongside normal columns. They differ slightly from other columns in that they may apply only to a “region” of the aggregator, i.e. they may not cover all available nodepaths from the aggregator.

Distributions

Appropriate Aggregator cell values can also be set up to store distributions.

Beeks Analytics uses High Dynamic Range (HDR) distributions. These differ from distributions that use so-called linear buckets. In a linear distribution, every bucket is the same “width”. This is useful where the shape of the input data is well understood and the buckets can be tuned appropriately. By contrast, HDR distributions record the data in buckets of varying “width”. This makes the distribution easy to tune and able to cope well with varied input data.
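The difference between the two bucketing styles can be sketched in Python. The hdr_bucket_bounds function below is a simplified scheme in the spirit of HDR histograms, in which bucket width grows with the magnitude of the value; it is an illustration under that assumption and not the bucketing used by Beeks Analytics. Both function names and the significant_bits parameter are hypothetical.

```python
def linear_bucket(value, width):
    """Index of the fixed-width bucket that contains `value`."""
    return int(value // width)

def hdr_bucket_bounds(value, significant_bits=3):
    """Return (lower, upper) bounds of the bucket for a non-negative integer.

    Only the top `significant_bits` bits of the value are kept, so the bucket
    width doubles each time the value's magnitude doubles.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    shift = max(0, value.bit_length() - significant_bits)
    lower = (value >> shift) << shift
    return lower, lower + (1 << shift)

small = hdr_bucket_bounds(5)       # narrow bucket for a small value
medium = hdr_bucket_bounds(1000)   # wider bucket for a larger value
```

With this scheme small values keep fine resolution while large values share wider buckets, so a wide range of inputs is covered without choosing a single bucket width in advance.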