This document explains the NMSG protocol and provides an introduction to some example tools that can be used to acquire and process NMSG packets. Common use cases for the tools will also be reviewed.

NMSG is an adaptable container format that allows for consistent or variable message types. NMSG container data may be streamed to a file or transmitted as UDP datagrams. NMSG containers can contain multiple NMSG messages or a fragment of a message too large to fit in a single container. The data in an NMSG container can also be compressed. Additional capabilities include sequencing and rate-limiting. The main features of NMSG are:

  • Adaptable: NMSG functionality can be extended to meet the needs of new data formats through its message module interface. As new data feeds are added to SIE, related message modules can be developed for nmsg without recompiling the library or changing its API.
  • Container-Based: NMSG data is serialized inside containers that can contain one payload, many payloads, or a fragment of a payload too large to fit in a single container.
  • Wire Format: NMSG specifies a wire-format optimized for transmitting UDP datagrams over jumbo Ethernet.
  • File Format: NMSG also specifies an on-disk file-format for storage of NMSG data.
  • Container Data: A core principle of NMSG is data compatibility. Some of the data Farsight consumes, transmits, and stores is inadequately represented in its native format (such as frames, packets, datagrams, segments, or other data formats). As such, NMSG was designed to be ignorant about the data it transports. NMSG offloads the details of encoding to external message modules and can also work with opaque containers.
  • Dynamic Message Types: NMSG provides an adaptable message module interface that can be modified at run-time for message types it understands. This ensures the library is generic and offloads the more unique message handling to external modules that can be loaded as-needed.
  • Compression: NMSG supports per-payload compression. This is implemented in nmsg using zlib.
  • Fragmentation: For payloads that are too large to fit in a single container for the underlying transport, NMSG provides a fragmentation capability that is seamless to the user or application programmer.
  • Sequencing: NMSG can optionally be configured to assign sequentially increasing numbers to the containers it emits. This can be used by the receiving end to detect potential container loss.
  • Rate-Limiting: NMSG can optionally be configured to rate-limit its emission of containers to ensure receivers on slower networks are not overwhelmed.
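
The sequencing idea in the list above can be illustrated with a short Python sketch. It assumes the receiver has already extracted the container sequence numbers as plain integers; the function name find_gaps is hypothetical and is not part of libnmsg, whose own sequence handling is not shown here.

```python
def find_gaps(sequence_numbers):
    """Return (first_missing, last_missing) ranges between observed numbers.

    Assumes the numbers arrive in increasing order, as described for
    sequentially numbered containers.
    """
    gaps = []
    previous = None
    for seq in sequence_numbers:
        # A jump of more than one means containers in between never arrived.
        if previous is not None and seq > previous + 1:
            gaps.append((previous + 1, seq - 1))
        previous = seq
    return gaps


observed = [1, 2, 3, 6, 7, 10]
for first, last in find_gaps(observed):
    print("possible container loss: %d through %d" % (first, last))
```

A receiver could run a check like this per source to decide whether potential container loss is worth investigating.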

NMSG is available for the application programmer as a C library called libnmsg. The library offers a complete API for the programmer to build NMSG-capable applications and configure, tune, and/or tweak its many options and features.

The reference implementation of libnmsg is nmsgtool, which is a thin wrapper around the libnmsg C library. nmsgtool provides comprehensive NMSG functionality at the command line interface (CLI) for Unix-like systems.

To ensure development is easy in Python and Perl, modules also exist for each programming language.

Note: Using NMSG requires that Farsight Security has provisioned at least one of the following Access Methods to acquire data from SIE:

  1. SIE Remote Access (SRA) (Also known as Advanced Exchange Access (AXA))
  2. SIE Direct Connect
  3. SIE Blade Server leased from Farsight
  4. SIE Batch

Introduction

nmsgtool is a CLI tool for the libnmsg library and is a wrapper around libnmsg's input/output (I/O) engine. libnmsg controls the transmission, storage, creation, and conversion of NMSG payloads.

nmsgtool's primary purpose is prototyping and debugging access to NMSG container format data. While it provides comprehensive NMSG functionality and can acquire and store data from SIE, it lacks some advanced features of libnmsg.

NMSG Inputs and Outputs

The nmsgtool program is a single tool for acquiring a variety of different inputs, like data streams from the network, capturing data from network interfaces, reading data from files, or even standard input and making NMSG payloads available to one or more outputs. The outputs are files in binary or human-readable (ASCII presentation) form, or binary payloads to network sockets for transport. Without having to create a program for each function, nmsgtool handles all types of data processing which includes serialization, fragmentation, compression, striping or mirroring, rolling file outputs, and executing data processing programs on file outputs. nmsgtool inputs can take the following forms:

  • A file that contains binary NMSG data, which could have been created by nmsgtool
  • A socket that is configured to transport binary NMSG data
  • IP packets in a packet capture library (libpcap) file (also known as “PCAP”)
  • IP packets acquired from a network interface
  • A file containing ASCII presentation data

nmsgtool outputs can take the following forms:

  • Binary NMSG data stored in a file
  • Binary NMSG data sent to a network socket
  • ASCII presentation data in a file, including standard output (stdout)

More than one output of each form can be specified.

Features

nmsgtool is multithreaded.

libnmsg has a feature that enables you to create user-specified Dynamic Shared Objects (DSOs) for inline filtering of network messages, which may be quicker than post-filtering the data. nmsgtool can load these filtering DSOs using the -F or --filter CLI argument. Documentation about this feature is available at:

https://github.com/farsightsec/nmsg/blob/master/nmsg/io.h#L139-L206

https://github.com/farsightsec/nmsg/blob/master/fltmod/nmsg_flt_sample.c

Limitations

nmsgtool does not provide easy access to seqsrc NMSG container data.

Considerations

Performance and potential container loss may depend on, but are not limited to, the following factors. Any of them may cause nmsgtool to throttle and drop containers.

  • Transmitting NMSG data over the Internet
  • Network bandwidth
  • Congestion between the transmitting instance of nmsgtool and program on the system receiving the NMSG data
  • Host system or receiving system
    • Disk
    • Memory
    • CPU
    • Network interface

Examples

The following examples and use-cases demonstrate the functionality of nmsgtool.

Display Data when Physically Connected to SIE

A common nmsgtool use-case for SIE customers is to acquire real-time data from an SIE channel and display the NMSG data on the screen.

The following invocation of nmsgtool reads one (1) NMSG payload [-c 1] from SIE Channel 212, Newly Observed Domains (NOD), [-C ch212] and emits it to stdout as ASCII presentation data [-o -].

$ nmsgtool -C ch212 -c 1 -o -
[72] [2015-02-03 13:35:29.678474903] [2:5 SIE newdomain] [a1ba02cf] [] []
domain: s47rbh.xyz.
time_seen: 2015-02-03 13:33:20
rrname: s47rbh.xyz.
rrclass: IN (1)
rrtype: NS (2)
rdata: ns1.51dns.com.
rdata: ns2.51dns.com.

Note: If no outputs are specified, ASCII presentation to stdout [-o -] is the default behavior of nmsgtool. In examples that follow, [-o -] will be omitted.

nmsgtool uses the nmsgtool.chalias configuration file to determine the proper network interface to collect data from. This configuration file contains the SIE channel number, IP address, and UDP port mappings. When a channel number is specified on the command line, nmsgtool looks it up in the nmsgtool.chalias file and listens on the specified network interface.

The header from an NMSG datagram is the first line displayed for an NMSG message. Breaking this down, the header fields for SIE Channel 212 (Newly Observed Domains) follow:

  • [72]: Message size in bytes
  • [2015-02-03 13:35:29.678474903]: UTC timestamp with nanosecond resolution
  • [2:5 SIE newdomain]: Numeric vendor ID and message type ID (2:5), followed by the vendor name and message type name
  • [a1ba02cf]: Optional source identifier
  • []: Optional operator code
  • []: Optional group code

The message payload contains key-value pairs that conform to a schema specified by the vendor for the message type. In the preceding example, the vendor is SIE and the message type is newdomain. nmsgtool includes dynamically loadable modules that enable it to display the data as you see in the preceding example and also enable NMSG-based programs or scripts to load the key-value pairs into structures. These concepts will be explained in more detail in future NMSG articles.

Acquire Data from SIE and Write to Binary NMSG Files

Another use-case for SIE customers is to acquire real-time data from an SIE channel and save it to the local filesystem in a binary NMSG file for analysis at a later time/date.

The following invocation of nmsgtool acquires 100,000 NMSG payloads [-c 100000] from SIE Channel 208, DNSDB Verified Data (deduplicated, verified, and prior to filtering), [-C ch208] and writes them to a binary NMSG file [-w ch208.nmsg]. The result is a file approximately 15.25 megabytes in size.

$ nmsgtool -C ch208 -c 100000 -w ch208.nmsg
$ stat -c "%n %s" ch208.nmsg
ch208.nmsg 15246581

The binary file can then be read by nmsgtool using [-r ch208.nmsg] and it will display one (1) NMSG payload [-c 1] containing a dnsdedupe message. The message is emitted to stdout in ASCII presentation data.

$ nmsgtool -r ch208.nmsg -c 1
[72] [2015-02-01 00:07:53.596907788] [2:1 SIE dnsdedupe] [a1ba02cf] [] []
type: EXPIRATION
count: 2
time_first: 2015-01-31 07:29:37
time_last: 2015-01-31 07:29:37
bailiwick: 
rrname: 
rrclass: IN (1)
rrtype: A (1)
rrttl: 43200
rdata: <redacted>

NMSG Payload Compression

nmsgtool can optionally compress the payload of each NMSG container using zlib compression (the same algorithm used by gzip), either to a file or for NMSG data transmitted across the network.

Note: Compression is performed on each payload in an NMSG container, not across the entire file.
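
As a rough illustration of the note above, the following Python sketch compresses each payload independently with zlib instead of compressing one combined stream. The payload bytes are made-up placeholders, not real NMSG encodings, and the sketch does not reproduce how libnmsg frames compressed data.

```python
import zlib

# Placeholder payloads (hypothetical content, repeated to give zlib
# something to work with).
payloads = [
    b"rrname: example.com. rrtype: A rdata: 192.0.2.1 " * 8,
    b"rrname: example.net. rrtype: NS rdata: ns1.example.net. " * 8,
]

# Each payload is compressed on its own, so each one can also be
# decompressed on its own without reading the rest of the data.
compressed = [zlib.compress(payload) for payload in payloads]
restored = [zlib.decompress(blob) for blob in compressed]

for original, packed in zip(payloads, compressed):
    print(len(original), "bytes ->", len(packed), "bytes")
print("round trip ok:", restored == payloads)
```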

To demonstrate the on-disk storage benefit that compression provides, we can compress the data acquired in the previous example.

The following invocation reads the binary file [-r ch208.nmsg] from the previous example, compresses each payload using [-z], and then writes the output to a new file [-w ch208z.nmsg]. The result is a file approximately 6.43 megabytes in size, which is a 58% decrease in file size.

$ stat -c "%n %s" ch208.nmsg
ch208.nmsg 15246581

$ nmsgtool -r ch208.nmsg -w ch208z.nmsg -z

$ stat -c "%n %s" ch208z.nmsg
ch208z.nmsg 6428829
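
The size figures quoted above can be checked with a few lines of Python. The byte counts are taken from the stat output in this example; megabytes are computed as decimal units (1,000,000 bytes), which is an assumption that matches the quoted values.

```python
def megabytes(size_in_bytes):
    """Convert a byte count to decimal megabytes, rounded to two places."""
    return round(size_in_bytes / 1000000, 2)


def percent_decrease(before, after):
    """Return the size reduction as a percentage of the original size."""
    return round(100 * (before - after) / before, 1)


original_size = 15246581    # ch208.nmsg
compressed_size = 6428829   # ch208z.nmsg

print(megabytes(original_size), "MB before compression")
print(megabytes(compressed_size), "MB after compression")
print(percent_decrease(original_size, compressed_size), "percent smaller")
```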

Kicker Scripts and Rotating Output Files

Another useful capability that nmsgtool provides is the ability to perform automatic file rotation (rolling) based on a duration of time or payload count. Additionally, the user can specify a kicker script or command to run on output files.

An example shell script follows. The script, count.sh, counts the number of dnsdedupe payloads from a binary NMSG file using grep -c.

#!/bin/sh
echo "$1: " `nmsgtool -r "$1" | grep -c "\[2:1 SIE dnsdedupe\]"`

The following invocation of nmsgtool acquires data from SIE Channel 208, DNSDB Verified Data (deduplicated, verified, and prior to filtering), [-C ch208] and writes compressed payloads [-z] to a binary NMSG file [-w ch208] that is prefixed with ch208. Every two (2) seconds [-t 2] the file is closed, rotated, and the kicker script [-k count.sh] is run on the output file. The output from each count.sh invocation is the filename followed by the number of NMSG payloads that each file contains. The count.sh script is invoked in the following example:

$ nmsgtool -C ch208 -w ch208 -t 2 -z -k count.sh
./ch208.20150202.0110.1422839406.364843292.nmsg:  49404
./ch208.20150202.0110.1422839408.013136741.nmsg:  80446
./ch208.20150202.0110.1422839410.024261700.nmsg:  91067
./ch208.20150202.0110.1422839412.024284315.nmsg:  86070
./ch208.20150202.0110.1422839414.033887391.nmsg:  85490
./ch208.20150202.0110.1422839416.014162500.nmsg:  90793

Note: The -k cmd or --kicker cmd arguments make [-t] and [-c] continuous. In this mode, output file names are suffixed with a timestamp and nmsgtool runs continuously, rotating output files as payload counts are reached or a specified duration of time expires.
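
A kicker script often needs the timestamp embedded in the rotated file name. The Python sketch below assumes the layout seen in the sample output above (prefix, date, hour and minute, epoch seconds, nanoseconds, extension); that layout is inferred from the example rather than from a specification, and parse_rotated_name is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_rotated_name(path):
    """Split a rotated output file name into prefix, UTC time, and nanoseconds."""
    name = path.rsplit("/", 1)[-1]
    # Split from the right so a prefix containing dots is left intact.
    prefix, _date, _hhmm, seconds, nanos, _extension = name.rsplit(".", 5)
    stamp = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return prefix, stamp, int(nanos)


prefix, stamp, nanos = parse_rotated_name(
    "./ch208.20150202.0110.1422839406.364843292.nmsg"
)
print(prefix, stamp.isoformat(), nanos)
```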

Transport NMSG Data Across the Network

nmsgtool can transport NMSG payloads across an IP network to either a unicast or broadcast IPv4 address or a unicast IPv6 address.

For this example, two (2) nmsgtool sessions run on two (2) separate systems. On the receiving system, we run nmsgtool as follows:

System Receiving NMSG Payloads

The following invocation of nmsgtool listens on a network socket for NMSG payloads sent to IPv4 address 10.0.1.52 on UDP port 9430 [-l 10.0.1.52/9430]. When NMSG payloads are observed, they will be displayed as ASCII presentation data to stdout.

$ nmsgtool -l 10.0.1.52/9430

System Sending NMSG Payloads

The following invocation of nmsgtool reads two (2) NMSG payloads [-c 2] from the binary NMSG file created in the previous example [-r ch208...]. The NMSG payloads are then sent to IPv4 unicast address 10.0.1.52 on UDP port 9430 [-s 10.0.1.52/9430]. On the sending system, we run nmsgtool as follows:

$ nmsgtool -r ch208.20150202.0110.1422839406.364843292.nmsg -c 2 -s 10.0.1.52/9430
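
Both [-l] and [-s] in this example take a socket specification written as address/port. A wrapper script that builds nmsgtool command lines might validate such a value first. The Python sketch below is one way to do that; parse_sockspec is a hypothetical helper, and the assumption that the port follows the last slash is based only on the examples shown here.

```python
import ipaddress


def parse_sockspec(spec):
    """Split 'address/port' into an ipaddress object and an integer port."""
    address, _, port = spec.rpartition("/")
    if not address:
        raise ValueError("expected address/port, got %r" % spec)
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise ValueError("port out of range: %d" % port_number)
    # ip_address() raises ValueError for anything that is not IPv4 or IPv6.
    return ipaddress.ip_address(address), port_number


address, port = parse_sockspec("10.0.1.52/9430")
print(address, port, "IPv%d" % address.version)
```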

System Receiving and Displaying NMSG Payloads

On the receiving system, the output from one (1) of the two (2) dnsqr NMSG payloads is displayed to stdout in ASCII presentation format. The other is redacted and cropped for publication.

[293] [2015-02-02 01:08:21.902736000] [1:9 base dnsqr] [e9b019b8] [] []
type: UDP_QUERY_RESPONSE
query_ip: 
response_ip: 
proto: UDP (17)
query_port: 31211
response_port: 53
id: 7644
qname: 
qclass: IN (1)
qtype: AAAA (28)
rcode: NOERROR (0)
delay: 0.182413
udp_checksum: ABSENT
[...]
[352] [2015-02-02 01:08:22.095911000] [1:9 base dnsqr] [e9b019b8] [] []
[...]

Note: The system sending NMSG payloads has the option to tune network performance, including setting the NMSG container maximum transmission unit size (note this is distinct from the IP MTU), buffering, and rate limiting.

Payload Striping vs Mirroring

When multiple outputs are specified, nmsgtool defaults to striping (https://en.wikipedia.org/wiki/Data_striping) payloads across each output. However, nmsgtool can also be configured to mirror (https://en.wikipedia.org/wiki/Disk_mirroring) payloads to each output.
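
The difference between the two modes can be sketched in Python. The round-robin order used for striping here is an assumption made for illustration; this document does not state how nmsgtool chooses the output for each payload.

```python
def stripe(payloads, output_count):
    """Distribute payloads across outputs so each payload goes to exactly one."""
    outputs = [[] for _ in range(output_count)]
    for index, payload in enumerate(payloads):
        # Round-robin choice of output (an illustrative assumption).
        outputs[index % output_count].append(payload)
    return outputs


def mirror(payloads, output_count):
    """Copy every payload to every output."""
    return [list(payloads) for _ in range(output_count)]


payloads = ["p1", "p2", "p3", "p4"]
print("striped: ", stripe(payloads, 2))
print("mirrored:", mirror(payloads, 2))
```

With striping, each output sees only part of the stream, so the total work is shared. With mirroring, every output sees the full stream, which suits cases such as watching data on screen while also forwarding it.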

The following invocation acquires 100 NMSG payloads [-c 100] from SIE Channel 211, Newly Active Domains, and mirrors [--mirror] the data to two (2) outputs:

  • [-o -]: ASCII presentation format is sent to stdout and displayed on the screen
  • [-s 10.0.1.52/9430]: Binary NMSG payloads sent to a network socket, IPv4 destination address 10.0.1.52 on UDP port 9430

$ nmsgtool -C ch211 -c 100 -o - -s 10.0.1.52/9430 --mirror
[94] [2015-02-03 08:50:19.277158975] [2:5 SIE newdomain] [a1ba02cf] [] []
[...]

Acquire Data from a Network Interface or Read a PCAP File with BPF filtering

Perhaps you would like to acquire and create your own stream of NMSG payloads sourced from live network traffic. But more specifically, you only want to observe DNS traffic. When running nmsgtool, you can tell it to acquire IP datagrams directly from a network interface or it can read a PCAP file.

Additionally, an optional user-defined Berkeley Packet Filter (BPF) can be specified to filter packets. When acquiring data from a network interface or reading a PCAP file, nmsgtool requires the user to specify the vendor and message type so it knows how to properly encode each NMSG payload.

The following invocation of nmsgtool acquires data from a network interface [-i eth1] and filters packets for UDP port 53 [-b "udp port 53"]. The NMSG payloads are encoded as base/dnsqr [-V base] and [-T dnsqr] and emitted in ASCII presentation format to stdout.

$ nmsgtool -i eth1 -V base -T dnsqr -b "udp port 53"
[220] [2010-05-09 05:08:54.951124000] [1:9 base dnsqr] [00000000] [] []
[...]

Reading from a PCAP file is syntactically similar: substitute [-p example.pcap] for [-i eth1].

Python module: pynmsg

Farsight is the maintainer of a Python module named pynmsg, a Python 2.7 extension module implemented in Cython for the nmsg C library. See Farsight’s Network Message, Volume 5: The Python Programming API for more information.

Installation

  • To build from source, install Python 2.7 and required dependencies.
$ sudo apt-get install python2.7 python-pip cython 

Install pynmsg from source code.

$ wget "https://github.com/farsightsec/pynmsg/archive/tags/v0.4.0.tar.gz" -O "pynmsg-tags-v0.4.0.tar.gz"
$ tar xzvf "pynmsg-tags-v0.4.0.tar.gz"
$ cd pynmsg-tags-v0.4.0/
$ python setup.py build
$ sudo python setup.py install
$ cd ..

Install pywdns from source code.

$ wget "https://github.com/farsightsec/pywdns/archive/tags/v0.10.0.tar.gz" -O "pywdns-tags-v0.10.0.tar.gz"
$ tar xzvf "pywdns-tags-v0.10.0.tar.gz"
$ cd pywdns-tags-v0.10.0/
$ python setup.py build
$ sudo python setup.py install
$ cd ..

Perl module: Net::Nmsg

Introduction

Net::Nmsg is a Perl binding to libnmsg, the reference implementation of the NMSG binary structured message interchange format. See https://metacpan.org/release/Net-Nmsg for additional information.

Installation

For Debian/Ubuntu, Farsight maintains a package called libnet-nmsg-perl.

$ apt-get install libnet-nmsg-perl

For FreeBSD, Net::Nmsg is available as an official binary package.

$ pkg install p5-Net-Nmsg

For other operating systems, it is possible to install Net::Nmsg using CPAN. Before installing, make sure the libpcap development header files are present, or that libpcap has been installed from source.

$ perl -MCPAN -e shell
cpan> install Bundle::CPAN
cpan> install Net::Nmsg

Some package dependencies may ask questions during installation, such as prompting you to press [enter] when asked for a mathematical expression, or to enter some minimal system information when configuring IO::Socket.

Contacting Support

To request a demonstration of DNSDB or to inquire about a trial API key, please contact the DomainTools sales team.

Appendix A – Installing NMSG Software

Compile and install from source code

Source code tarballs for the software packages below are also available from .

Example installation instructions from source on Ubuntu 16.04

Install wdns, nmsg, and sie-nmsg

  • Install dependencies from Ubuntu repositories
$ sudo apt-get install build-essential pkg-config libpcap0.8-dev libprotobuf-c-dev protobuf-c-compiler libxs-dev libyajl-dev zlib1g-dev
  • Install wdns from source code.
$ wget "https://github.com/farsightsec/wdns/archive/tags/v0.10.0.tar.gz" -O "wdns-tags-v0.10.0.tar.gz"
$ tar xzvf "wdns-tags-v0.10.0.tar.gz"
$ cd wdns-tags-v0.10.0/
$ ./configure
$ make
$ sudo make install
$ cd ..
  • Install nmsg from source code.
$ wget "https://github.com/farsightsec/nmsg/archive/tags/v0.15.1.tar.gz" -O "nmsg-tags-v0.15.1.tar.gz"
$ tar xzvf "nmsg-tags-v0.15.1.tar.gz"
$ cd nmsg-tags-v0.15.1/
$ ./configure
$ make
$ sudo make install
$ cd ..
  • Install sie-nmsg from source code.
$ wget "https://github.com/farsightsec/sie-nmsg/archive/tags/v1.2.1.tar.gz" -O "sie-nmsg-tags-v1.2.1.tar.gz"
$ tar xzvf "sie-nmsg-tags-v1.2.1.tar.gz"
$ cd sie-nmsg-tags-v1.2.1/
$ ./configure
$ make
$ sudo make install
$ cd ..
  • Run ldconfig to update the shared library cache.
$ sudo ldconfig