
Security Information Exchange (SIE) Spam Email Channels

Channels 24 (Spam-Full) and 25 (Spam-Select) share information about email messages sent to email honeypot systems (also known as “spamtraps”). Channel 24 Spam-Full provides full copies of emails sent to spamtrap email addresses, and channel 25 Spam-Select provides select fields from those same emails. The honeypots are configured to collect and store all email messages for analysis, and they use email addresses that:

  • Have never been used to receive email, or
  • Are no longer in use and should not receive email from legitimate sources

While collecting spam sent to honeypots can be as simple as creating email addresses that have never been used, obtaining a large volume of meaningful spam information requires the creation of many email addresses on many different domains.

Farsight partners with a honeypot operator that collects email messages sent to a vast number of spamtraps. The raw email messages are sent to a Farsight system where they are analyzed to extract interesting and meaningful spam information.

The information is then encoded in a consistent and easy-to-process format, eliminating the need to understand and parse various email formats.

About Security Information Exchange (SIE)

The Security Information Exchange (SIE), from Farsight Security® Inc. (now part of DomainTools), is a scalable and adaptable real-time data streaming and information sharing platform. SIE collects and provides access to more than 200,000 observations per second of raw data from its global sensor network. Farsight also applies unique and proprietary methods for improving usability of the data, directly sharing the refined intelligence with SIE customers and DNSDB®, one of the world’s largest passive DNS (pDNS) databases.

The diverse set of data available from SIE is relevant and useful for practitioners in various technology roles. It includes the following:

  • Raw and processed passive DNS data
  • Darknet/darkspace telescope data
  • Spam sources and URLs
  • Phishing URLs and associated targeted brands
  • Connection attempts from malware-infected systems (as seen by a sinkhole)
  • Network traffic blocked by Intrusion Detection Systems (IDS) and firewall devices

Each unique set of data in SIE is known as a channel. The data acquired from a specific channel can be customized to meet the needs of each customer, enabling you to subscribe to and access only the channels needed to solve your problem. A channel in SIE may be the result of analyzing the data, or a subset of the data, from other channels.

Data Format for 24 Spam-Full

Channel Name: Spam-Full
Description: Full copies of emails sent to spamtrap email addresses
Channel Number: 24
Bit Rate: 16Kb/sec
Bit Rate (Peak): 55Kb/sec
Payloads: 55Kb/sec
Payloads (Peak): 1.5/sec
Available via SIE Batch: Yes
SIE Batch format: Newline-Delimited JSON (ND-JSON)
Available via SIE Remote Access (SRA): Yes

Data Format for 25 Spam-Select

Channel Name: Spam-Select
Description: Select fields from the emails sent to Channel 24 (Spam-Full)
Channel Number: 25
Bit Rate: 16Kb/sec
Bit Rate (Peak): 55Kb/sec
Payloads (Peak): 1.5/sec
Available via SIE Batch: Yes
Available via SIE Remote Access (SRA): Yes

Using Spam-Full and Spam-Select Data

These channels use the email.proto record format.

A sample Spam-Full record looks like this:

{
    "time":"2021-09-28 21:01:49.598700994",
    "vname":"base",
    "mname":"email",
    "source":"f4e78b44",
    "message":
    {
    "type":"spamtrap",
    "headers":"Return-Path: <[email protected]>
    X-Original-To: [email protected]
    Delivered-To: [email protected]
    Received: from hotmail.com (unknown [112.66.246.219]) by mail.ops-netman.net (Postfix) with ESMTP id B7A13221 for <[email protected]>; Tue, 28 Sep 2021 21:01:47  +0000 (UTC)
    From: [email protected]
    Subject: =?GB2312?B?s/a/2rGoudhRUToxNTc5MzEzMjk=?=
    To: [email protected]
    Content-Type: text/plain;charset=\"GB2312\"
    Content-Transfer-Encoding: 8bit
    Date: Wed, 29 Sep 2021 05:01:43 +0800
    X-Priority: 3
    X-Mailer: Microsoft Outlook Express  6.00.2800.1106", "srcip":"112.66.246.219", "helo":"hotmail.com", "from":"[email protected]", "rcpt":["[email protected]"], "bodyurl":[]
    }
}

A sample Spam-Select record looks like this:

{
    "time":"2021-09-28 21:03:16.155494987",
    "vname":"base",
    "mname":"email",
    "source":"f4e78b44",
    "message":
    {
    "type":"spamtrap",
    "headers":"Return-Path: <[email protected]>
    X-Original-To: [email protected]
    Delivered-To: [email protected]
    Received: from etc-meisai.jp (unknown [116.85.19.56]) by mail.ops-netman.net (Postfix)  with ESMTP id 49D2C221 for <[email protected]>; Tue, 28 Sep 2021 21:03:14  +0000 (UTC)
    Message-ID: <[email protected]>
    From: =?utf-8?B?77yl77y077yj5Yip55So54Wn5Lya44K144O844OT44K5?= <[email protected]>
    To: <[email protected]>
    Subject: =?utf-8?B?RVRD44Kr44O844OJ44GM5LiA5pmC5YGc5q2i44GV44KM44G+44GX44Gf?=
    Date: Wed, 29 Sep 2021 05:03:02 +0800
    Mime-Version: 1.0
    Content-Type: multipart/alternative; boundary=\"----=_NextPart_000_0474_01E73594. 16AD4380\"
    X-Priority: 3
    X-MSMail-Priority: Normal
    X-Mailer: Microsoft Outlook Express 6.00.2900.5512
    X-MimeOLE: Produced By Microsoft MimeOLE V10.0.17763.1", "srcip":"116.85.19.56", "helo":"etc-meisai.jp", "from":"=?utf-8?B?77yl77y077yj5Yip55So54Wn5Lya44K144O844OT44K5?= <[email protected]>", "rcpt":["<[email protected]>"],"bodyurl":[]
    }
}
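
As a rough illustration of how such a record might be consumed, the following Python sketch parses one ND-JSON line, reads the fields shown in the samples above (srcip, helo, bodyurl), and decodes the MIME-encoded Subject header. It assumes that each record arrives as a single JSON object per line with header line breaks escaped as \n; the SAMPLE_LINE value and the summarize() helper are illustrative names, not part of any Farsight tool, and the record is abbreviated from the Spam-Full sample.

```python
import json
from email import message_from_string
from email.header import decode_header, make_header

# Abbreviated from the Spam-Full sample above. Assumption: on the wire, the
# "headers" value is one JSON string with line breaks escaped as \n.
SAMPLE_LINE = json.dumps({
    "time": "2021-09-28 21:01:49.598700994",
    "vname": "base",
    "mname": "email",
    "source": "f4e78b44",
    "message": {
        "type": "spamtrap",
        "headers": (
            "Subject: =?GB2312?B?s/a/2rGoudhRUToxNTc5MzEzMjk=?=\n"
            "Content-Type: text/plain;charset=\"GB2312\"\n"
            "Date: Wed, 29 Sep 2021 05:01:43 +0800\n"
        ),
        "srcip": "112.66.246.219",
        "helo": "hotmail.com",
        "rcpt": [],
        "bodyurl": [],
    },
})


def summarize(line: str) -> dict:
    """Return a few fields of interest from one ND-JSON record (hypothetical helper)."""
    record = json.loads(line)
    msg = record["message"]

    # Parse the raw header block with the standard library email parser.
    headers = message_from_string(msg.get("headers", ""))

    # Subjects are often RFC 2047 encoded words; decode them to readable text.
    raw_subject = headers.get("Subject", "")
    subject = str(make_header(decode_header(raw_subject)))

    return {
        "time": record["time"],
        "srcip": msg.get("srcip"),
        "helo": msg.get("helo"),
        "subject": subject,
        "url_count": len(msg.get("bodyurl", [])),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LINE))
```

Parsing the headers with the standard email package, rather than splitting on colons by hand, takes care of folded header lines and encoded words in the same step.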

The bodyurl field

If the spam email contains URLs, those URLs are extracted and presented in the bodyurl field. The bodyurl field can contain zero or more URLs, depending on the complexity of the email.

A sample bodyurl would look like this:

"bodyurl":
[
"https://example.com/928dave1042",
"https://example.com/928dave1028",
"https://example.com/928dave1029",
"https://example.com/10240/10240562/products/0x720@1632036731746111fcf=",
"https://example.com/928dave1042>",
"https://example.com/928dave1028>",
"https://example.com/928dave1029>",
"https://example.com/kb928luisdave828-unsub>",
"https://example.com/10240/10240562/products/0x720@16320349745ebf=",
"https://example.com/10240/10240562/products/0x720@163203=",
"https://example.com/928dave1037",
"https://example.com/10240/10240562/products/0x720@1632036500=",
"https://example.com/928dave1045>",
"https://example.com/928dave1045",
"https://example.com/928dave1026tibo>",
"https://example.com/928dave1037>",
"https://example.com/file=",
"https://example.com/kb928l="
]
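
Several entries in the sample above end in ">" or "=", and some URLs appear more than once. A consumer may want to normalize the list before using it. The sketch below is one possible approach in Python, under the assumption that a trailing ">" or "=" is extraction residue rather than part of the URL. That assumption will not hold for every URL (some legitimately end in "="), so treat this as a starting point. The clean_bodyurls() name is hypothetical.

```python
from urllib.parse import urlsplit


def clean_bodyurls(urls):
    """Strip trailing '>' and '=' characters and drop duplicates, keeping order.

    Assumption: trailing '>' and '=' are residue from URL extraction, as the
    sample bodyurl list suggests. This is a heuristic, not a documented rule.
    """
    seen = set()
    cleaned = []
    for url in urls:
        candidate = url.rstrip(">=")
        parts = urlsplit(candidate)
        # Keep only entries that still look like absolute HTTP(S) URLs.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if candidate not in seen:
            seen.add(candidate)
            cleaned.append(candidate)
    return cleaned


if __name__ == "__main__":
    sample = [
        "https://example.com/928dave1042",
        "https://example.com/928dave1042>",
        "https://example.com/file=",
    ]
    for url in clean_bodyurls(sample):
        print(url)
```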

SIE Access Methods

Data from SIE can be accessed and acquired using the following methods:

  • Direct Connect: Connect a system to the SIE network. This requires 1.) a server installed in a data center where Farsight has a point of presence, and 2.) a network cross-connect ordered between your server and the SIE network. Optionally, customers can lease a blade server from Farsight, and often prefer to
  • SIE Remote Access (SRA): Remotely connect to the SIE network using an encrypted tunnel from your workstation or a server in your local data center
  • SIE Batch: Provides on-demand access for downloading data from SIE channels using a RESTful API or web-based interface. You select the channel and duration of time you are interested in, and then download the data for analysis. The duration of available data is dependent on the channel, but is typically the most recent 12-18 hours

For additional information about SIE access methods, please see the SIE Technical Overview document.

Direct Connect

SIE Direct Connect allows a customer to physically connect a server to the Farsight SIE network for maximum data throughput. This can be done in one of two ways:

  • Blade Server: Pre-configured blade servers co-located in one of Farsight’s data centers that can be leased by customers for direct access to SIE channels
  • Customer Server: Customer (owned, managed, and operated) servers that can be installed in one of Farsight’s data centers and physically connected to the SIE network with a network cross-connect

If a blade server is leased from Farsight, it will be pre-installed with the essential software components needed to acquire, process, compress, buffer, and transfer data from SIE channels to the customer’s data center for additional analysis, enrichment, and storage.

If a customer uses their own server, an order can be submitted for a cross-connect to the SIE switches hosted at select Equinix data centers (Ashburn DC3 and Palo Alto SV8). A Farsight account manager can help with cross-connect provisioning details, hosting, or colocation options.

For additional information about SIE connection methods, please see the SIE Technical Overview document. A Farsight sales representative is happy to share a copy of this document with you. It will help you understand which connection method will work best for you.

SIE Remote Access (SRA)

SIE Remote Access (SRA) enables a customer to remotely connect to the Security Information Exchange (SIE) from anywhere on the Internet. SRA provides access to SIE channel data on the customer’s local servers, allowing their analysis and processing systems to be located in their own data centers rather than physically co-located at a Farsight data center.

Due to the technical limitations of transporting high bitrate SIE channels across the Internet, the SRA access method is not available for all SIE channels. Please reference the SIE Channel Guide for channels that can be accessed using SRA.

SRA uses the Advanced Exchange Access (AXA) transport protocol which enables SRA sessions to perform the following:

  • Select which SIE channel or channels to monitor and acquire data from
  • Define user-specified search or filtering criteria to match IP or DNS traffic
  • Control rate-limits and other AXA parameters

The streaming search and filtering capabilities of AXA enable SRA to access and acquire meaningful and relevant data from SIE while avoiding the costs of transporting enormous volumes of data across the Internet.

Note: For high-volume channels accessed using SRA, customers are expected to specify a search or filter for the IP addresses and DNS domain names or hostnames of interest. The SRA service will only collect and send data matching the specified criteria across the Internet to the customer.
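
To make the idea of IP and DNS matching criteria concrete, the Python sketch below applies similar criteria locally to records that have already been received. It is not AXA or SRA filter syntax, and it does not run on the SIE side; it only illustrates the kind of test involved. The matches_watch() helper, its parameters, and the choice of the srcip and helo fields are assumptions made for this example.

```python
import ipaddress


def matches_watch(record, networks, domains):
    """Return True if a record's srcip falls in any network, or its helo
    name equals or is a subdomain of any watched domain.

    Local illustration only: this is not the AXA filtering mechanism.
    """
    msg = record.get("message", {})

    # IP criterion: compare the connecting address against CIDR networks.
    try:
        ip = ipaddress.ip_address(msg.get("srcip") or "")
    except ValueError:
        ip = None
    if ip is not None and any(ip in net for net in networks):
        return True

    # DNS criterion: exact match or subdomain match on the HELO name.
    helo = (msg.get("helo") or "").lower().rstrip(".")
    return any(helo == d or helo.endswith("." + d) for d in domains)


if __name__ == "__main__":
    nets = [ipaddress.ip_network("112.66.0.0/16")]
    rec = {"message": {"srcip": "112.66.246.219", "helo": "hotmail.com"}}
    print(matches_watch(rec, nets, ["etc-meisai.jp"]))
```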

SIE Batch

SIE Batch provides on-demand access for downloading data from SIE channels using a RESTful API or web-based interface. You select the channel and duration of time you are interested in, and then download the data for analysis. The duration of available data is dependent on the channel, but is typically the most recent 12-18 hours. SIE Batch allows you to acquire data from SIE channels using two methods:

  • API: Allows you to write tools to programmatically download data from SIE channels for analysis
  • Interactively: Web-based interface to the API that enables you to select and download SIE channel data on-demand
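
Once channel data has been downloaded, by either method, it can be processed one line at a time. The Python sketch below assumes the download is a Newline-Delimited JSON file (the SIE Batch format listed for channel 24) and counts the most frequent srcip values. The file name in the usage comment and the top_sources() helper are hypothetical; the sketch does not call the SIE Batch API itself.

```python
import json
from collections import Counter


def top_sources(lines, n=10):
    """Count srcip values across an iterable of ND-JSON lines.

    Works on any iterable of strings, so an open file can be streamed
    without loading the whole download into memory.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        srcip = message.get("srcip") if isinstance(message, dict) else None
        if srcip:
            counts[srcip] += 1
    return counts.most_common(n)


# Usage with a downloaded file (hypothetical file name):
#
#     with open("ch25-spam-select.ndjson", encoding="utf-8") as fh:
#         for srcip, count in top_sources(fh):
#             print(srcip, count)
```

Skipping malformed lines rather than raising keeps a long batch job running if a download was cut short; a stricter consumer could log or count those lines instead.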

Advanced Exchange Access Middleware Daemon (AXAMD)

Farsight also provides a RESTful middleware layer in front of its AXA service. This service, the AXA Middleware Daemon (AXAMD), adds a streaming HTTP interface on top of the AXA toolkit, which enables web-application developers to interface with SIE using SRA. Farsight also published a command line tool and Python extension library called axamd_client. This toolkit is licensed under the Apache 2.0 license.

The Advanced Exchange Access (AXA) toolkit contains tools and a C library to bring Farsight’s real-time data and services directly from the Farsight Security Information Exchange (SIE) to the customer’s network.

Advanced Exchange Access Middleware Daemon (AXAMD) is a suite of tools and library code to bring Farsight’s real-time data and services directly from the Farsight Security Information Exchange (SIE) to the customer’s network.

Due to the technical limitations of transporting high bitrate SIE channels across the Internet, the AXAMD access method is not available for all SIE channels.

Additional Information

Additional information about the creation and use of honeypots/spamtraps is available at: