Beeks Analytics Performance Guide

In addition to IP and ethernet statistics (standard network information, which will include microburst calculations), we now introduce specific decoding of the financial protocol in question. This allows us to decode the sequence numbers in the market data messages and identify when gaps are occurring.
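To make the idea of sequence-number gap detection concrete, here is a minimal Python sketch. It is not the NYSE XDP decoder's implementation. It assumes that each packet carries the sequence number of its first message together with a message count, so that the next expected sequence number is the sum of the two; the names `Gap` and `detect_gaps` are hypothetical, and per-channel tracking is left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Gap:
    first_missing: int  # first sequence number that was not seen
    last_missing: int   # last sequence number that was not seen


def detect_gaps(packets: Iterable[Tuple[int, int]]) -> List[Gap]:
    """Return the gaps in a stream of (first_seq, msg_count) packets.

    Assumption: the next expected sequence number after a packet is
    first_seq + msg_count.
    """
    gaps: List[Gap] = []
    expected: Optional[int] = None
    for first_seq, msg_count in packets:
        # A packet that starts beyond the expected number reveals a gap.
        if expected is not None and first_seq > expected:
            gaps.append(Gap(expected, first_seq - 1))
        # Late or duplicate packets must not move the expectation backwards.
        next_seq = first_seq + msg_count
        if expected is None or next_seq > expected:
            expected = next_seq
    return gaps


# Illustrative stream: sequence numbers 105 to 107 never arrive.
stream = [(100, 3), (103, 2), (108, 1), (109, 4)]
gaps = detect_gaps(stream)
print(gaps)
```

In a real feed the expectation would be tracked separately for each channel or multicast group, since each has its own sequence space.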

We also start to use mappers to be more specific about the messages that we want to perform gap detection on, and to add supplementary information to the messages which will be useful when we summarise the statistics (e.g. feed name).

The following configuration changes support this:

In addition to the ethernet and ip protocol decoders, this adds the NYSE XDP decoder (which uses C++ decoder technology).
The NYSE XDP decoder implements gap detection as part of its embedded logic.
This configuration introduces the overhead of stacked mappers as part of the transform collectors. The specific configuration and performance impact of these mappers is covered in more depth below.
As above, the SummaryDumper stat collector is used for testing purposes.

Stacking transform collectors has many benefits, including the ability to progressively filter the packet stream so that more intensive operations are performed on a smaller subset of the data.
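The effect of progressive filtering can be illustrated with a small Python sketch. This is a conceptual model only, not how VMX-Capture runs transform collectors: the `run_stack` helper, the stage functions, and the `feed` field are all hypothetical. Each stage either returns the (possibly enriched) packet or returns `None` to drop it, so later stages only see what earlier stages let through.

```python
from typing import Callable, Dict, Iterable, List, Optional

Packet = Dict[str, object]
Stage = Callable[[Packet], Optional[Packet]]


def run_stack(packets: Iterable[Packet], stages: List[Stage]) -> List[Packet]:
    """Pass each packet through the stages in order, stopping at the first drop."""
    forwarded: List[Packet] = []
    for packet in packets:
        current: Optional[Packet] = packet
        for stage in stages:
            current = stage(current)
            if current is None:
                break  # dropped: later stages never see this packet
        if current is not None:
            forwarded.append(current)
    return forwarded


# Count how often each stage is invoked.
calls = {"filter": 0, "enrich": 0}


def port_filter(packet: Packet) -> Optional[Packet]:
    """Cheap first stage: keep only packets for the ports of interest."""
    calls["filter"] += 1
    return packet if packet["ip.dst_port"] in {11151, 11152} else None


def enrich(packet: Packet) -> Optional[Packet]:
    """Stand-in for a more expensive stage; adds a hypothetical field."""
    calls["enrich"] += 1
    return {**packet, "feed": "example_feed"}


packets = [{"ip.dst_port": port} for port in (11151, 9999, 11152, 8888)]
forwarded = run_stack(packets, [port_filter, enrich])
print(calls, len(forwarded))
```

Here the second stage runs only for the two packets that pass the first stage, which is the saving that stacking is intended to provide.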

In addition to the headline numbers presented in Performance Benchmarks, we have benchmarked both mappers separately to give users extra guidance on the performance characteristics of different mapper functionality.

See Beeks Analytics Decoder Information for more information about the many different decoders available within Beeks Analytics.

Transform collectors were covered briefly in the earlier diagram, and are covered in more depth in the Configuration Guide for VMX-Capture.

Market Data - IP Stats + gap detection: Stack Probe Configuration

{
  "probe": {
    "parameters": {
      "name": "MDPort1_nyse_arca_bbo",
      "debug": false,
      "filter": "vlan and vlan and (udp and (dst 224.0.76.146 or dst 224.0.76.23 or dst 224.0.76.148 or dst 224.0.76.21 or dst 224.0.76.149 or dst 224.0.76.20 or dst 224.0.76.147 or dst 224.0.76.22 or dst 224.0.76.151 or dst 224.0.59.153 or dst 224.0.76.19 or dst 224.0.76.18) ) ",
      "protocols": [
        {
          "type": "module",
          "value": "ethernet"
        },
        {
          "type": "module",
          "value": "ip"
        },
        {
          "type": "module",
          "value": "nyse_xdp"
        }
      ],
      "transform_collector": [
        {
            "type": "module",
            "value": "mapper",
            "id": "mdFeedMapper"
        },
        {
            "type": "module",
            "value": "mapper",
            "id": "internalEntitiesMapper"
        }
      ],
      "stat_collector": [
        {
          "type": "module",
          "value": "summary_dumper",
          "id": "summaryDumper"
        }
      ]
    }
  },
  "mdFeedMapper": {
        "parameters": {
            "json_filename": "$VMX_HOME/../server/config/agent/pmux/VP/mdMapper/MDPort1_nyse_arca_bbo.stack.mapper.json"
        }
    },
    "internalEntitiesMapper": {
        "parameters": {
            "json_filename": "$VMX_HOME/../server/config/agent/pmux/VP/entsMapper/MDPort1_nyse_arca_bbo.stack.mapper.json"
        }
    },
  "summaryDumper": {
    "parameters": {
      "dump_file": "/data/debug/probe.stats.json",
      "dump_interval_us": "1000000",
      "flush_every_event": true,
      "output_format": "json_lines"
    }
  }
}
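In the configuration above, each `id` listed under `transform_collector` and `stat_collector` has a matching top-level section holding its parameters. The Python sketch below checks that pairing on a trimmed copy of the configuration. It assumes that this id-to-section pairing is how the parameters are resolved, which is inferred from the sample rather than confirmed; `missing_module_sections` is a hypothetical helper, not part of the product, and the elided `json_filename` values are left as `...`.

```python
import json

# Trimmed copy of the stack probe configuration; file paths are elided.
CONFIG_TEXT = """
{
  "probe": {
    "parameters": {
      "name": "MDPort1_nyse_arca_bbo",
      "transform_collector": [
        {"type": "module", "value": "mapper", "id": "mdFeedMapper"},
        {"type": "module", "value": "mapper", "id": "internalEntitiesMapper"}
      ],
      "stat_collector": [
        {"type": "module", "value": "summary_dumper", "id": "summaryDumper"}
      ]
    }
  },
  "mdFeedMapper": {"parameters": {"json_filename": "..."}},
  "internalEntitiesMapper": {"parameters": {"json_filename": "..."}},
  "summaryDumper": {"parameters": {"dump_file": "/data/debug/probe.stats.json"}}
}
"""


def missing_module_sections(config: dict) -> list:
    """Return module ids that have no matching top-level section."""
    params = config["probe"]["parameters"]
    missing = []
    for section in ("transform_collector", "stat_collector"):
        for module in params.get(section, []):
            module_id = module.get("id")
            if module_id is not None and module_id not in config:
                missing.append(module_id)
    return missing


config = json.loads(CONFIG_TEXT)
print(missing_module_sections(config))
```

A check like this can catch a renamed or mistyped `id` before the probe is started.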

There are two mappers in this configuration: the mdFeedMapper and the internalEntitiesMapper.

Market Data - IP Stats + gap detection: mdFeedMapper

See below for the contents of the mdMapper/MDPort1_nyse_arca_bbo.stack.mapper.json file referenced in the above configuration.

The mdFeedMapper for this test simply selects the destination ports used by the market data protocols that we want to collect statistics for. This type of conditional logic, based on the value of a single datafield, is a common use for mappers.

Note that it would be even more efficient to perform this type of filtering in the Protocols part of the configuration, as that would filter the messages earlier in the processing chain (this could be performed before the nyse_xdp decoder, for example, which would be a more BAM-standard way of implementing filtering logic).

For a more realistic example of mdFeedMapper configuration, see the Market Data Gap Detection Worked Example in the Configuration Guide for VMX-Capture.

{
    "comment": "Only forward packet to next decoder layer if port in mapping list",
    "actions": [
        {
            "map": {
                "key": "ip.dst_port",
                "mapping": {
                    "11151": [
                        {
                            "nextDecoderPacketData": {}
                        }
                    ],
                    "11152": [
                        {
                            "nextDecoderPacketData": {}
                        }
                    ],
                    "11251": [
                        {
                            "nextDecoderPacketData": {}
                        }
                    ],
                    "11252": [
                        {
                            "nextDecoderPacketData": {}
                        }
                    ]
                }
            }
        }
    ]
}
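The behaviour described by the mapper's comment can be modelled in a few lines of Python. This is a sketch of the intended semantics as read from the configuration above (a packet is forwarded only when its `ip.dst_port` value appears in the mapping), not the mapper module's actual code; `is_forwarded` is a hypothetical helper, and comparing the port as a string is an assumption based on the string keys in the JSON.

```python
import json

MAPPER_TEXT = """
{
  "comment": "Only forward packet to next decoder layer if port in mapping list",
  "actions": [
    {
      "map": {
        "key": "ip.dst_port",
        "mapping": {
          "11151": [{"nextDecoderPacketData": {}}],
          "11152": [{"nextDecoderPacketData": {}}],
          "11251": [{"nextDecoderPacketData": {}}],
          "11252": [{"nextDecoderPacketData": {}}]
        }
      }
    }
  ]
}
"""


def is_forwarded(mapper: dict, packet: dict) -> bool:
    """Return True if any matched step forwards the packet onwards."""
    for action in mapper["actions"]:
        rule = action["map"]
        # JSON object keys are strings, so compare the field as a string.
        value = str(packet.get(rule["key"]))
        steps = rule["mapping"].get(value, rule.get("default", []))
        if any("nextDecoderPacketData" in step for step in steps):
            return True
    return False


mapper = json.loads(MAPPER_TEXT)
print(is_forwarded(mapper, {"ip.dst_port": 11151}))
print(is_forwarded(mapper, {"ip.dst_port": 12345}))
```

Because this mapper defines no `default` entry, the sketch treats any port that is not listed as not forwarded.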

Market Data - IP Stats + gap detection: internalEntitiesMapper

See below for the contents of the entsMapper/MDPort1_nyse_arca_bbo.stack.mapper.json file referenced in the above configuration.

An entities mapper is used for all stack probes in the Beeks Analytics for Markets template. Whereas the mdFeedMapper maps the public entities of the external market data feeds being monitored, the internalEntitiesMapper maps the internal IP addresses.

Because market data is published to a multicast address, the IP address of the consumer does not appear in the packets. The standard mapping for all market data probes is therefore a placeholder, which allows, for example, a particular multicast group to be statically mapped to defined consumers. This is what the placeholder mapping looks like:

{
    "datafields": {
        "beeks.DS1": "string"
    },
    "actions": [
        {
            "map": {
                "comment": "Use MC Group to assign all destination internal entities (DS or IntGrp)",
                "key": "ip.dst_host",
                "default": [
                    {
                        "assign": {
                            "beeks.DS1": "unassigned"
                        }
                    }
                ],
                "mapping": {}
            }
        }
    ]
}
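The placeholder's effect can be modelled with a short Python sketch: every packet falls through to the `default` branch and receives `beeks.DS1` set to `unassigned`. The sketch also shows what a populated mapping might look like. That populated entry is purely illustrative: its shape is inferred from the `default` entry above, and the consumer name `example_consumer_group` is invented for the example. `apply_entities_mapper` is a hypothetical helper, not the mapper module's implementation.

```python
import copy

PLACEHOLDER = {
    "datafields": {"beeks.DS1": "string"},
    "actions": [
        {
            "map": {
                "key": "ip.dst_host",
                "default": [{"assign": {"beeks.DS1": "unassigned"}}],
                "mapping": {},
            }
        }
    ],
}


def apply_entities_mapper(mapper: dict, packet: dict) -> dict:
    """Return a copy of the packet with the mapper's assignments applied."""
    result = dict(packet)
    for action in mapper["actions"]:
        rule = action["map"]
        key_value = str(packet.get(rule["key"]))
        # Use the matching entry if there is one, else the default steps.
        steps = rule["mapping"].get(key_value, rule.get("default", []))
        for step in steps:
            result.update(step.get("assign", {}))
    return result


packet = {"ip.dst_host": "224.0.76.146"}
print(apply_entities_mapper(PLACEHOLDER, packet))

# Hypothetical populated mapping: statically tie one multicast group
# to a named consumer. The entry format is an assumption.
populated = copy.deepcopy(PLACEHOLDER)
populated["actions"][0]["map"]["mapping"]["224.0.76.146"] = [
    {"assign": {"beeks.DS1": "example_consumer_group"}}
]
print(apply_entities_mapper(populated, packet))
```

With the placeholder, every packet takes the default branch regardless of its multicast group, while a populated mapping would assign a consumer only to the groups it lists.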

Performance Comparison: mdFeedMapper and internalEntitiesMapper

We have observed that, with our test data for the market data test, the above mdFeedMapper configuration imposes a performance overhead of 25%, while the simpler internalEntitiesMapper imposes an overhead of just 13%.