Farsight Security’s Security Information Exchange (SIE) passively observes and collects unique DNS answers based on analysis of the DNS messages’ associated RRname, RRtype, Bailiwick, and Rdata. After the data is sent to SIE, it passes through a series of processing phases, in what is called a waterfall model, to prepare for insertion into the DNSDB historical database. The SIE “Passive DNS Channels” enable customers to access, acquire, and process these incoming observations in near real-time at various phases of the processing model.

About Security Information Exchange (SIE)

The Security Information Exchange (SIE), from Farsight Security® Inc., is a scalable and adaptable real-time data streaming and information sharing platform. SIE collects and provides access to more than 200,000 observations per second of raw data from its global sensor network. Farsight also applies unique and proprietary methods for improving usability of the data, directly sharing the refined intelligence with SIE customers and DNSDB®, one of the world’s largest passive DNS (pDNS) databases.

The diverse set of data available from SIE includes the following and is relevant and useful for practitioners in various technology roles:

Each unique set of data in SIE is known as a channel. The data acquired from a specific channel can be customized to meet the needs of each customer, enabling you to subscribe to and access only the channels needed to solve your problem. A channel in SIE may be the result of analyzing the data, or a subset of the data, from other channels.

Why Passive DNS (pDNS)?

DNS is a critical component of Internet communication and almost all Internet transactions begin with a DNS query and response.

DNS serves as an early warning and detection solution for phishing, spam, malicious and suspicious behaviors, and other attacks. DNS intelligence is considered the only source of “ground truth” information for the Internet.

Passive DNS (pDNS) begins with raw DNS traffic that is observed and collected by passive DNS sensors and contributed to Farsight’s Security Information Exchange (SIE) by pDNS sensor operators. Once the data is sent to SIE, the data then passes through a series of processing phases:

  1. Deduplication: Channel 207, DNSDB Deduplicated Data
  2. Verification: Channel 208, DNSDB Verified Data
  3. Filtering: Channel 204, Processed DNS Data (which is used by DNSDB)

The end result is DNSDB, the highest-quality and most comprehensive passive DNS database of its kind, with more than 100 billion unique DNS resource records since 2010.

Farsight Security’s mission is to make the Internet a safer place. We provide security solutions that empower customers with meaningful and relevant intelligence. This information provides customers with insights about the network configuration of a threat and the surrounding network on the Internet for improving the value and impact of threat intelligence and research.

The Security Information Exchange (SIE), from Farsight Security Inc., is designed with privacy in mind. The passive DNS (pDNS) sensors do not collect Personally Identifiable Information (PII) from client resolvers (also known as stub resolvers), because they deliberately collect data only between recursive resolvers and authoritative servers.

The data from SIE enables security professionals to accurately identify, map, and protect their networks from cybercriminal activity by providing global visibility. It provides immediate access to a real-time global sensor network without the need to develop or deploy your own data collection infrastructure.

About SIE Passive DNS Channels

Passive DNS (pDNS) begins with raw DNS traffic that is observed and collected by passive DNS sensors and contributed to Farsight’s Security Information Exchange (SIE) by pDNS sensor operators. Once the data is sent to SIE, the data then passes through a series of processing phases, starting with deduplication.

Four (4) SIE channels are available that allow access to DNS intelligence observed at different phases of the “waterfall model” processing. These channels are:

Channel | Name | Description
204 | Processed DNS Data | Passive DNS observations that passed through the waterfall processing phases [1].
206 | DNSDB Rejected Records (Chaff) | Passive DNS observations that were malformed, unsuccessful queries, or otherwise fail the verification process.
207 | DNSDB Deduplicated Data | Passive DNS observations that passed through the deduplication phase and will be sent to the verification phase of the waterfall.
208 | DNSDB Verified Data | Passive DNS observations that passed through the verification phase and will be passed on to the filtering phase.

1: Filtering removes domains related to some wildcards, DNS VPNs, DNS block lists, and auto-generated names from IP addresses.

Data is processed by the Security Information Exchange (SIE) in what is called a waterfall model. The following diagram can help inform and guide you in understanding the data that is available from the various SIE DNS channels. Farsight’s Solution Architects (SAs) are happy to discuss criteria for selecting the appropriate SIE channels with customers.

SIE passive DNS (pDNS) Waterfall Model

For more information on these terms, please reference the ISC Passive DNS Architecture document.

There are articles with more details about Passive DNS (pDNS) listed in the Additional Information section below. For more info on RRname, RRtype and Rdata, see DNS Terminology: RRname (RRset), Rdata, RRtype and Bailiwick.

SIE Channel 207 (DNSDB De-duplicated Data) – Deduplication Phase

Farsight’s passive DNS (pDNS) solution observes and collects unique DNS answers based on analysis of the associated RRname, RRtype, Rdata, and bailiwick. Because the raw DNS data includes many duplicate answers for common DNS questions, which may be observed many times per second, SIE deduplicates the data in the first phase of the waterfall model.

The deduplication phase performs data reduction and exports the unique DNS records with counts for the number of times each unique DNS answer was observed in the data.

However, some types of DNS data are also filtered at this stage, such as DNS messages that have a bad checksum value or data that has been delayed for more than an hour. Some of the data is sent to Channel 206, DNSDB Rejected Records (also known as Chaff), while other records are discarded.

In the following table, the average and max bitrates (in bits per second) indicate the volume of data recently observed on Channel 207 DNSDB De-duplicated Data. Because the volume of data varies, spikes above the maximum value are possible.

Note: Quoted bitrates and payloads are representative of SIE traffic as of June 2021.

Channel Name DNSDB De-duplicated Data
Channel Number 207
Description Passive DNS observations after the deduplication processing phase and immediately prior to the verification phase.
Schema SIE:dnsdedupe
Bit Rate 120Mb/sec
Bit Rate (Max) 150Mb/sec
Payloads 90K/sec
Payloads (Max) 130K/sec
SIE Direct Connect Yes
SIE Direct Connect Data Format NMSG
SIE Remote Access (SRA) No
SRA Data Format N/A
SIE Batch Yes
SIE Batch Data Format NMSG
Advanced Exchange Access Middleware Daemon (AXAMD) No
AXAMD Data Format N/A

SIE Channel 208 (DNSDB Verified Data) – Verification Phase

Rogue, malicious, or misconfigured name servers may respond with misleading resource record information for a domain or domains. The verification phase ensures that only bailiwick-appropriate DNS data is passed on to Channel 208, DNSDB Verified Data. DNS data that fails bailiwick verification is sent to Channel 206, DNSDB Rejected Records (also known as Chaff).

For more information on bailiwicks and how they are used in DNSDB, see What is a Bailiwick.
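
To make the idea of "bailiwick-appropriate" data more concrete, the following minimal Python sketch checks whether an rrname sits at or below a given bailiwick. It is a simplified illustration only: the function name in_bailiwick is hypothetical, and the verification actually performed by SIE involves more than this name comparison.

```python
def in_bailiwick(rrname: str, bailiwick: str) -> bool:
    """Return True if rrname equals the bailiwick or is a name beneath it."""
    # Normalize: lowercase, drop the trailing root dot, split into labels.
    name_labels = [l for l in rrname.lower().rstrip(".").split(".") if l]
    zone_labels = [l for l in bailiwick.lower().rstrip(".").split(".") if l]
    # The root bailiwick "." contains every name.
    if not zone_labels:
        return True
    # Compare whole labels from the right, so "badexample.com." is not
    # mistaken for a name under "example.com.".
    return name_labels[-len(zone_labels):] == zone_labels


print(in_bailiwick("www.example.com.", "example.com."))            # True
print(in_bailiwick("www.example.com.", "bailiwick.example.com."))  # False
```

The second call uses the rrname and bailiwick from the CHAFF example later in this document, where the rrname does not fall under the bailiwick.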

In the following table, the average and max bitrates (in bits per second) indicate the volume of data recently observed on Channel 208 DNSDB Verified Data. Because the volume of data varies, spikes above the maximum value are possible.

Channel Name DNSDB Verified Data
Channel Number 208
Description Passive DNS observations after the verification processing phase and prior to filtering.
Schema SIE:dnsdedupe
Bit Rate 60Mb/sec
Bit Rate (Max) 90Mb/sec
Payloads 45K/sec
Payloads (Max) 65K/sec
SIE Direct Connect Yes
SIE Direct Connect Data Format NMSG
SIE Remote Access (SRA) No
SRA Data Format N/A
SIE Batch Yes
SIE Batch Data Format NMSG
Advanced Exchange Access Middleware Daemon (AXAMD) No
AXAMD Data Format N/A

SIE Channel 204 (Processed DNS Data) – Filtering Phase

The next phase, filtering, is the final phase in the waterfall model processing. In this phase, various categories of DNS data are filtered, which may include the following:

Some DNS records are filtered at this phase and sent to Channel 206, DNSDB Rejected Records (also known as Chaff), while others are discarded, depending on factors beyond the scope of this document.

DNS records that are not filtered are sent to Channel 204, Processed DNS Data, which is the channel also used by DNSDB.

In the following table, the average and max bitrates (in bits per second) indicate the volume of data recently observed on Channel 204 Processed DNS Data. Because the volume of data varies, spikes above the maximum value are possible.

Channel Name Processed DNS Data
Channel Number 204
Description Passive DNS observations after deduplication, verification, and filtering.
Schema SIE:dnsdedupe
Bit Rate 37Mb/sec
Bit Rate (Max) 70Mb/sec
Payloads 27K/sec
Payloads (Max) 50K/sec
SIE Direct Connect Yes
SIE Direct Connect Data Format NMSG
SIE Remote Access (SRA) Yes
SRA Data Format NMSG
SIE Batch Yes
SIE Batch Data Format NMSG
Advanced Exchange Access Middleware Daemon (AXAMD) No
AXAMD Data Format N/A

SIE Channel 206 (DNSDB Rejected Records) – Rejected Records

DNS data that is rejected for various reasons during the passive DNS “waterfall model” processing phases is sent to Channel 206 DNSDB Rejected Records (Chaff). DNS data may be rejected because it is malformed, results from an unsuccessful query, fails bailiwick verification, or otherwise fails the verification process.

Channel Name DNSDB Rejected Records
Channel Number 206
Description Passive DNS observations that were malformed, unsuccessful queries, or otherwise fail the verification process.
Schema SIE:dnsdedupe
Bit Rate 28Mb/sec
Bit Rate (Max) 40Mb/sec
Payloads 20K/sec
Payloads (Max) 25K/sec
SIE Direct Connect Yes
SIE Direct Connect Data Format NMSG
SIE Remote Access (SRA) Yes
SRA Data Format NMSG
Advanced Exchange Access Middleware Daemon (AXAMD) Yes
AXAMD Data Format NMSG
SIE Batch Yes
SIE Batch Data Format NMSG
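
For capacity planning, the average figures quoted in the channel tables above can be turned into an approximate payload size and daily volume. The sketch below uses only the June 2021 averages from this document. It assumes that "Mb" means 10^6 bits and "K" means 10^3 payloads, which the document does not state explicitly, and the function names are illustrative.

```python
# Average bit rate (Mb/sec) and payload rate (K/sec) as quoted in this document.
CHANNELS = {
    207: (120, 90),
    208: (60, 45),
    204: (37, 27),
    206: (28, 20),
}


def bytes_per_payload(mbit_per_sec: float, kpayloads_per_sec: float) -> float:
    """Average payload size in bytes, assuming Mb = 10**6 bits and K = 10**3."""
    return (mbit_per_sec * 1e6 / 8) / (kpayloads_per_sec * 1e3)


def terabytes_per_day(mbit_per_sec: float) -> float:
    """Approximate daily volume in terabytes (10**12 bytes) at the average rate."""
    return mbit_per_sec * 1e6 / 8 * 86400 / 1e12


for channel, (mbps, kpps) in CHANNELS.items():
    print(f"Channel {channel}: ~{bytes_per_payload(mbps, kpps):.0f} bytes/payload, "
          f"~{terabytes_per_day(mbps):.2f} TB/day")
```

Under these assumptions, Channel 207 works out to roughly 167 bytes per payload and about 1.3 TB per day at the average rate, before any compression.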

Using Passive DNS Channel Data

Data acquired from Channel 204, 206, 207, or 208 is returned in NMSG format for all access methods. NMSG is an adaptable container format that allows for consistent or variable message types.

The nmsgtool program reads NMSG data from a variety of inputs, such as data streams from the network, captures from network interfaces, files, or standard input, and makes the NMSG payloads available to one or more outputs. It can acquire data from an SIE passive DNS channel and convert it to ND-JSON (newline-delimited JSON) text format for display or additional processing and analysis. nmsgtool is written by Farsight and released as open source.

See the following pages for instructions on how to install software packages for a specific distribution.

After data from a passive DNS channel has been acquired and saved to a file, you need to decode it to ND-JSON using nmsgtool. The [-r pdns-data.nmsg] option tells nmsgtool to read binary NMSG data from a file, [-c 1] limits the output to a single NMSG payload, and [-J -] writes the record in ND-JSON format to stdout, which is typically the screen.

$ nmsgtool -r pdns-data.nmsg -c 1 -J -
(returned ND-JSON record)

Once the data has been formatted as ND-JSON, a record from a passive DNS channel will look similar to the following. This output can be sent to another tool for additional processing.

{"time":"2020-04-06 21:48:59.039279480","vname":"SIE","mname":"dnsdedupe",
"message":{"type":"EXPIRATION", "count":2,"time_first":"2020-04-06 18:47:22",
"time_last":"2020-04-06 18:47:22","bailiwick":"example.com.",
"rrname":"www.example.com.",
"rrclass":"IN","rrtype":"CNAME","rrttl":3600,"rdata":["dns.example.com."]}}

If you want to display a pretty-printed output of ND-JSON formatted records, we recommend using jq, a lightweight and flexible command-line JSON processor.

The open source software package is available on Debian and can be installed using $ sudo apt-get install jq. The output from nmsgtool in JSON format [-J -] can be piped to jq using the following:

$ nmsgtool -r pdns-data.nmsg -c 1 -J - | jq -r '.'
{
  "time": "2020-04-06 21:48:59.039279480",
  "vname": "SIE",
  "mname": "dnsdedupe",
  "message": {
    "type": "EXPIRATION",
    "count": 2,
    "time_first": "2020-04-06 18:47:22",
    "time_last": "2020-04-06 18:47:22",
    "bailiwick": "example.com.",
    "rrname": "www.example.com.",
    "rrclass": "IN",
    "rrtype": "CNAME",
    "rrttl": 3600,
    "rdata": [
      "dns.example.com."
    ]
  }
}
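
Because each ND-JSON record occupies one line, the output of nmsgtool can also be consumed by a short script. The following Python sketch reads records line by line and yields a one-line summary of each. The function name summarize is hypothetical, and the sample input is the record shown above.

```python
import io
import json


def summarize(stream):
    """Yield a short summary string for each ND-JSON record in stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        msg = record["message"]
        yield (f'{msg["type"]} {msg["rrname"]} {msg["rrtype"]} '
               f'{",".join(msg["rdata"])} count={msg["count"]}')


# Sample input: the record shown above, on a single line as ND-JSON.
sample = io.StringIO(
    '{"time":"2020-04-06 21:48:59.039279480","vname":"SIE","mname":"dnsdedupe",'
    '"message":{"type":"EXPIRATION","count":2,"time_first":"2020-04-06 18:47:22",'
    '"time_last":"2020-04-06 18:47:22","bailiwick":"example.com.",'
    '"rrname":"www.example.com.","rrclass":"IN","rrtype":"CNAME","rrttl":3600,'
    '"rdata":["dns.example.com."]}}\n'
)
for summary in summarize(sample):
    print(summary)
```

To process real data, replace sample with sys.stdin and pipe the output of nmsgtool -r pdns-data.nmsg -J - into the script.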

Data Format for SIE Passive DNS Channels – 204, 206, 207, and 208

The SIE NMSG dnsdedupe schema is a DNS Query and Response resource record (RR) schema that observes and collects data returned from a query. You can find the definition for the schema here:

The data available from these channels contains NMSG SIE:dnsdedupe type messages that include the following fields:

The NMSG header includes the following fields:

KEY VALUE
time Time when hostname was first observed in Channel 204.
vname Vendor Name, SIE.
mname Message type, dnsdedupe.
group [2] Reason DNS message was rejected.
message Embedded JSON record describing the observed DNS Query and Response RR.

2: Field is only present in messages sent to Channel 206 DNSDB Rejected Records (Chaff).

The embedded NMSG message payload is JSON formatted and includes the following fields:

KEY VALUE
type Types are INSERTION, EXPIRATION, or CHAFF.
count Number of times an RRset was observed since the last message was sent to the channel.
time_first [3][4] Indicates first time the RRset was observed by pDNS.
time_last [3][4] Indicates last time the RRset was observed by pDNS.
response_ip [5] IP address of the name server replying to the query.
rrname Domain name or hostname of the query observed by pDNS or extracted from a zone file import.
rrclass RR CLASS is always “Internet (IN)”, which is decimal value “1”.
rrtype RR TYPE describes the type of RR, e.g., A(1), NS(2), CNAME(5).
rrttl Time to live (TTL) of the RR.
rdata Data that describes the RR type (may repeat).
bailiwick [6] The domain under which the RRset answer was given [7].

3: Unix epoch timestamps with second granularity in UTC.

4: Field is not present if the RRset was only observed from a zone file import.

5: Field always exists in Channel 207 and is optional in Channels 204 and 208. If the answer was returned from one (1) name server, it lists bailiwick instead of response_ip.

6: Field always exists in channels 204 and 208. It is not returned in Channel 207.

7: For example, an authoritative generic TLD (gTLD) name server for “com.” may respond with different answers for the same query than the authoritative name servers for “farsightsecurity.com.” would respond with.
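
In the ND-JSON output shown in this document, timestamps appear as formatted strings rather than raw epoch values, and the header time field carries nine fractional digits. The following Python sketch converts such strings to timezone-aware datetime objects. It assumes all values are UTC (footnote 3 states this for time_first and time_last; for the header time field it is an assumption) and truncates nanoseconds to microseconds, because Python's datetime stores no finer precision. The function name parse_sie_time is hypothetical.

```python
from datetime import datetime, timezone


def parse_sie_time(value: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS[.fraction]' as a UTC datetime."""
    whole, _, fraction = value.partition(".")
    parsed = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S")
    # Keep at most six fractional digits (microseconds); pad short fractions.
    micros = int(fraction[:6].ljust(6, "0")) if fraction else 0
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


first = parse_sie_time("2020-04-06 18:47:22")
seen = parse_sie_time("2020-04-06 21:48:59.039279480")
print(int(first.timestamp()))  # Unix epoch seconds for time_first
print(seen - first)            # gap between first observation and the message time
```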

Understanding Passive DNS INSERTION & EXPIRATION Messages

DNS data sent to channel DNSDB Deduplicated Data (207), DNSDB Verified Data (208), or Processed DNS Data (204) will be either INSERTION or EXPIRATION type messages.

The DNSDB Rejected Records (206) channel carries SIE:dnsdedupe records of type CHAFF along with other NMSG message types. Discussion of those message types is beyond the scope of this document.

To understand what INSERTION and EXPIRATION mean, we need to discuss how deduplication is implemented in SIE. During processing, the waterfall model maintains a cache table of observed RRsets as a large ring buffer in memory.

When DNS data is received by SIE, the cache table is checked to see if the RRset already exists. If the RRset exists in the cache table, the cache entry’s count is incremented, the time_last field is updated, and the record is discarded as a duplicate. If the RRset does not exist in the cache table, the record is inserted into the cache, and an “INSERTION” record is sent from the deduplicator to the next phase of processing. This causes the oldest record in the ring buffer to be expired from the cache table, and an “EXPIRATION” record is sent from the deduplicator to document the removal. These records are broadcast to the DNSDB Deduplicated Data (207) channel and passed on to the verification phase of the waterfall model.
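
The behavior described above can be approximated with a small toy model. The Python sketch below is not Farsight's implementation: the class name DedupCache, the capacity parameter, and the use of an OrderedDict in place of a ring buffer are all assumptions made for illustration. It follows the stated rules: a duplicate only updates count and time_last, while a new RRset produces an INSERTION and, once the cache is full, an EXPIRATION for the oldest entry.

```python
from collections import OrderedDict


class DedupCache:
    """Toy model of the deduplication cache described above."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # hypothetical size, must be >= 1
        self.entries = OrderedDict()  # insertion order stands in for the ring buffer

    def observe(self, rrset_key, timestamp):
        """Process one observation and return the list of messages emitted."""
        entry = self.entries.get(rrset_key)
        if entry is not None:
            # Duplicate: update the cached entry and emit nothing.
            entry["count"] += 1
            entry["time_last"] = timestamp
            return []
        expired = None
        if len(self.entries) >= self.capacity:
            # Cache is full: the oldest entry is expired to make room.
            expired = self.entries.popitem(last=False)
        new = {"count": 1, "time_first": timestamp, "time_last": timestamp}
        self.entries[rrset_key] = new
        emitted = [{"type": "INSERTION", "rrset": rrset_key, **new}]
        if expired is not None:
            old_key, old = expired
            emitted.append({"type": "EXPIRATION", "rrset": old_key, **old})
        return emitted


cache = DedupCache(capacity=2)
a = ("www.example.com.", "A", ("192.0.2.1",))
b = ("example.com.", "NS", ("ns.example.com.",))
c = ("mail.example.com.", "A", ("192.0.2.2",))
for t, key in enumerate([a, a, b, a, c], start=100):
    for message in cache.observe(key, t):
        print(t, message["type"], message["rrset"][0], "count =", message["count"])
```

In this run the first RRset is observed three times before a third distinct RRset forces it out of the two-entry cache, so its EXPIRATION message carries a count of 3, while every INSERTION message carries a count of 1.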

If you are primarily interested in when an RRset is first observed, you can focus on “INSERTION” records; if you are interested in how often an RRset is observed, you should monitor “EXPIRATION” records.
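
As a rough illustration of these two uses, the Python sketch below takes first-seen times from INSERTION records and sums observation counts from EXPIRATION records. The function name is hypothetical, the sample records are made up and trimmed to the fields used, and keying on (rrname, rrtype) is a simplification that ignores rdata and bailiwick.

```python
import json
from collections import Counter


def first_seen_and_counts(lines):
    """Return (first_seen, totals) built from ND-JSON dnsdedupe records."""
    first_seen = {}     # (rrname, rrtype) -> time_first of first INSERTION encountered
    totals = Counter()  # (rrname, rrtype) -> sum of counts from EXPIRATION records
    for line in lines:
        msg = json.loads(line)["message"]
        key = (msg["rrname"], msg["rrtype"])
        if msg["type"] == "INSERTION":
            first_seen.setdefault(key, msg["time_first"])
        elif msg["type"] == "EXPIRATION":
            totals[key] += msg["count"]
    return first_seen, totals


# Made-up sample records for illustration, trimmed to the fields used here.
sample = [
    '{"message":{"type":"INSERTION","count":1,"time_first":"2020-04-06 22:38:48",'
    '"rrname":"com.","rrtype":"SOA"}}',
    '{"message":{"type":"EXPIRATION","count":2,"time_first":"2020-04-06 18:47:22",'
    '"rrname":"www.example.com.","rrtype":"CNAME"}}',
    '{"message":{"type":"EXPIRATION","count":1,"time_first":"2020-04-06 19:32:42",'
    '"rrname":"www.example.com.","rrtype":"AAAA"}}',
]
first_seen, totals = first_seen_and_counts(sample)
print(first_seen)
print(dict(totals))
```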

Example Message – INSERTION Record

The time_first and time_last fields for INSERTION records are always the same and the count is always 1. In the following example, the response was received from IP address 10.10.10.10 (which is acting as the authoritative name server for com.) and the message indicates an SOA record was observed in the response. rrttl is the Time to Live (TTL) value for the record that would be used when caching the data, and rdata is the data returned for the query.

{
  "time": "2020-04-06 22:39:55.429865036",
  "vname": "SIE",
  "mname": "dnsdedupe",
  "message": {
    "type": "INSERTION",
    "count": 1,
    "time_first": "2020-04-06 22:38:48",
    "time_last": "2020-04-06 22:38:48",
    "response_ip": "10.10.10.10",
    "bailiwick": "com.",
    "rrname": "com.",
    "rrclass": "IN",
    "rrtype": "SOA",
    "rrttl": 86400,
    "rdata": [
      "dns.example.com. dns2.example.com. 1586212699 1800 900 604800 86400"
    ]
  }
}

Example Message – EXPIRATION Record

The following example message is for a AAAA resource record. The data returned in the rdata field are the IPv6 addresses for the domain in the rrname field, which is www.example.com., with the site acting as its own authoritative name server in the bailiwick. Since count is 1, time_first matches time_last, indicating that only one query was seen before the record expired from the cache table. If the value of count were more than 1, time_last may or may not be the same as time_first.

{
  "time": "2020-04-06 22:39:57.420893762",
  "vname": "SIE",
  "mname": "dnsdedupe",
  "message": {
    "type": "EXPIRATION",
    "count": 1,
    "time_first": "2020-04-06 19:32:42",
    "time_last": "2020-04-06 19:32:42",
    "bailiwick": "www.example.com.",
    "rrname": "www.example.com.",
    "rrclass": "IN",
    "rrtype": "AAAA",
    "rrttl": 120,
    "rdata": [
      "2001:db8::1",
      "2001:db8::2",
      "2001:db8::3",
      "2001:db8::a",
      "2001:db8::b",
      "2001:db8::c"
    ]
  }
}

Example Message from SIE Channel 206 (DNSDB Rejected Records)

An example CHAFF message follows. Messages sent to Channel 206 DNSDB Rejected Records include an additional group field (numerical value), which provides a reason why the record was rejected. Channel 206 may include NMSG messages that are not SIE:dnsdedupe and were rejected due to dns_parse_failure or dns_udp_truncated, but discussion of these messages is out of scope for this document. Otherwise the records observed in this Channel behave the same as INSERTION records.

{
  "time": "2020-04-06 22:46:59.036477589",
  "vname": "SIE",
  "mname": "dnsdedupe",
  "group": 130,
  "message": {
    "type": "CHAFF",
    "count": 1,
    "time_first": "2020-04-06 22:45:39",
    "time_last": "2020-04-06 22:45:39",
    "response_ip": "68.142.254.15",
    "bailiwick": "bailiwick.example.com.",
    "rrname": "www.example.com.",
    "rrclass": "IN",
    "rrtype": "AAAA",
    "rrttl": 60,
    "rdata": [
      "10.10.10.10"
    ]
  }
}
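
Since this document does not define the meaning of individual group values, a reasonable first step when exploring Channel 206 is simply to count how often each value occurs. The Python sketch below does that for ND-JSON input. The function name tally_groups is hypothetical, and the sample lines are trimmed, illustrative versions of records shown in this document.

```python
import json
from collections import Counter


def tally_groups(lines):
    """Count CHAFF records per value of the header-level group field."""
    counts = Counter()
    for line in lines:
        record = json.loads(line)
        # Only dnsdedupe CHAFF records are counted; other messages are skipped.
        if record.get("message", {}).get("type") == "CHAFF":
            counts[record.get("group")] += 1
    return counts


# Illustrative input, trimmed to the fields used here.
sample = [
    '{"vname":"SIE","mname":"dnsdedupe","group":130,'
    '"message":{"type":"CHAFF","count":1,"rrname":"www.example.com."}}',
    '{"vname":"SIE","mname":"dnsdedupe",'
    '"message":{"type":"INSERTION","count":1,"rrname":"com."}}',
]
print(dict(tally_groups(sample)))
```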

DNS Terminology: RRname (RRset), Rdata, RRtype and Bailiwick

There are DNS terms used in this document that may be unfamiliar to the reader. Definitions and links to additional information for these terms follow:

Example DNS Resource Record: RRname, RRclass, RRtype, and Rdata

In the example DNS resource record that follows, RRname (left-side) refers to www.farsightsecurity.com and Rdata (right-side) refers to 66.160.140.81, the IP address. Rdata can also refer to a Fully Qualified Domain Name (FQDN) such as info.farsightsecurity.com. See RRset and Rdata Demystified for additional information.

RRname/RRset (Left-Side) RRclass RRtype Rdata (Right-Side)
www.farsightsecurity.com IN A 66.160.140.81
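
The four columns of the example record map directly onto a small data structure. The Python sketch below splits a line in the simplified "RRname RRclass RRtype Rdata" layout used in the table above. It is not a parser for full zone-file syntax (for example, it expects no TTL column), and the names ResourceRecord and parse_rr are hypothetical.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    rrname: str   # left-side
    rrclass: str
    rrtype: str
    rdata: str    # right-side


def parse_rr(line: str) -> ResourceRecord:
    """Split 'RRname RRclass RRtype Rdata' into its four parts."""
    # maxsplit=3 keeps Rdata intact even when it contains spaces (e.g. SOA).
    rrname, rrclass, rrtype, rdata = line.split(None, 3)
    return ResourceRecord(rrname, rrclass, rrtype, rdata)


rr = parse_rr("www.farsightsecurity.com IN A 66.160.140.81")
print(rr.rrname, "->", rr.rdata)
```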

SIE Access Methods

Data from SIE can be accessed and acquired using the following methods:

For additional information about SIE access methods, please see the SIE Technical Overview document.

Direct Connect

SIE Direct Connect allows a customer to physically connect a server to the Farsight SIE network for maximum data throughput. This can be done in one of two ways:

If a blade server is leased from Farsight, it will be pre-installed with the essential software components needed to acquire, process, compress, buffer, and transfer data from SIE channels to the customer’s data center for additional analysis, enrichment, and storage.

If a customer uses their own server, an order can be submitted for a cross-connect to the SIE switches hosted at select Equinix data centers (Ashburn DC3 and Palo Alto SV8). An FSI account manager can help guide cross-connect provisioning details, hosting, or colocation options.

For additional information about SIE connection methods, please see the SIE Technical Overview document. A Farsight sales representative is happy to share a copy of this document with you. This will help inform and guide you in understanding which connection method will work best for you.

SIE Remote Access (SRA)

SIE Remote Access (SRA) enables a customer to remotely connect to the Security Information Exchange (SIE) from anywhere on the Internet. SRA provides access to SIE channel data on a customer’s local servers, allowing their analysis and processing systems to be located in their own data centers rather than physically co-located at a Farsight data center.

Due to the technical limitations of transporting high bitrate SIE channels across the Internet, the SRA access method is not available for all SIE channels. Please reference the SIE Channel Guide for channels that can be accessed using SRA.

SRA uses the Advanced Exchange Access (AXA) transport protocol which enables SRA sessions to perform the following:

The streaming search and filtering capabilities of AXA enable SRA to access and acquire meaningful and relevant data from SIE while avoiding the costs of transporting enormous volumes of data across the Internet.

Note: For high-volume channels accessed using SRA, it is expected that customers will specify a search or filter for IP addresses and DNS domain names or hostnames of interest. The SRA service will only collect and send data matching the specified criteria across the Internet to the customer.

SIE Batch

SIE Batch provides on-demand access for downloading data from SIE channels using a RESTful API or web-based interface. You select the channel and duration of time you are interested in, and then download the data for analysis. The duration of available data depends on the channel, but is typically the most recent 12-18 hours. SIE Batch allows you to acquire data from an SIE channel using two (2) methods:

Advanced Exchange Access Middleware Daemon (AXAMD)

Farsight also provides a RESTful middleware layer in front of its AXA service. This service is called the AXA Middleware Daemon (AXAMD) and provides a RESTful capability that adds a streaming HTTP interface on top of the AXA toolkit. This enables web-application developers to interface with SIE using SRA. Farsight also published a command line tool and Python extension library called axamd_client. This toolkit is licensed under the Apache 2.0 license.

The Advanced Exchange Access (AXA) toolkit contains tools and a C library to bring Farsight’s real-time data and services directly from the Farsight Security Information Exchange (SIE) to the customer’s network.

Advanced Exchange Access Middleware Daemon (AXAMD) is a suite of tools and library code to bring Farsight’s real-time data and services directly from the Farsight Security Information Exchange (SIE) to the customer’s network.

Due to the technical limitations of transporting high bitrate SIE channels across the Internet, the AXAMD access method is not available for all SIE channels.

Additional Information

About Farsight Security

Farsight Security, Inc. is the world’s leading provider of historical and real-time DNS intelligence solutions. We enable security teams to qualify, enrich, and correlate all sources of threat data and ultimately save time when it is most critical - during an attack or investigation. Our solutions provide enterprise, government, and security industry personnel and platforms with unmatched global visibility, context, and response. Farsight Security is headquartered in San Mateo, California, USA. To learn more about how we can empower your security, threat, and intelligence platforms and security organization with Farsight Security passive DNS (pDNS) and threat intelligence solutions, please visit us at www.farsightsecurity.com or follow us on Twitter at @FarsightSecInc.