How to Analyze a DDoS Attack on DNS Infrastructure

Posted on June 23, 2016

On May 16th, 2016, NS1, a DNS and traffic management vendor, was a victim of a widespread DDoS attack that affected a number of prominent websites and services, including Yelp and Alexa. The DDoS attack was broadly targeted at NS1, affecting their DNS infrastructure, the hosting provider of their website and other online-facing assets. While over the last month there have been multiple smaller attacks, in today’s blog post we will take a closer look at the symptoms and characteristics of the May 16th DDoS attack and the mitigation techniques involved.

Yelp Cries for Help: Early Signs

Starting around 9:00 am on Monday, May 16th, Yelp showed signs of an outage, where users trying to access the website encountered the following error message:

Figure 1: Yelp.com service interruption pointing to a DNS resolution error.

We immediately sensed that something bigger was brewing and quickly set up Page Load and DNS Server tests to monitor yelp.com and its domain name, respectively. As expected, HTTP server availability was poor, dipping as low as 10% across 20 globally distributed Cloud Agents, as shown in Figure 2.

Figure 2: HTTP server availability of Yelp.com dipped as low as 10%.

During performance tests, our agents query for ‘fresh’ non-cached DNS records. Over 50% of our agents reported DNS resolution errors, which was consistent with what Yelp’s error page was showing.

Figure 3: DNS resolution errors observed by the majority of our Cloud Agents.

Peeling back the layers, the DNS Server tests showed the signature of a DDoS attack against NS1’s DNS infrastructure. DNS Server tests targeting the A record of a domain are executed against the authoritative name servers for that domain. For example, when we run a DNS Server test for yelp.com’s A record, we are monitoring the availability and network performance of each of the authoritative name servers for yelp.com: dns1.p06.nsone.net, dns2.p06.nsone.net, dns3.p06.nsone.net and dns4.p06.nsone.net. During the outage, the average availability of the authoritative name servers dropped to 22%, and all four servers returned increasing numbers of errors, as seen in Figure 4. Feel free to follow along by using our share link to the DNS Server test.

Figure 4: DNS Server tests to Yelp.com A indicate the increasing errors on the authoritative name servers.
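To make the idea of querying each authoritative name server directly more concrete, here is a minimal Python sketch using only the standard library. It is not how the DNS Server test is implemented; it simply builds a standard A-record query in DNS wire format, sends it over UDP to one server, and treats a NOERROR reply as "available". The function names, the fixed query ID and the two-second timeout are assumptions chosen for illustration.

```python
import socket
import struct


def build_query(name, query_id):
    """Encode a DNS query for the A record of `name` in wire format."""
    # Header: ID, flags (0 = standard query, no recursion desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def rcode(response):
    """Return the RCODE (low four bits of the flags word) of a response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def query_ok(server, name, timeout=2.0):
    """True if `server` answers the A query with NOERROR before the timeout."""
    query = build_query(name, query_id=0x1A2B)  # arbitrary illustrative ID
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(512)
    except OSError:  # covers timeouts and name-resolution failures
        return False
    return (
        len(response) >= 12
        and response[:2] == query[:2]  # reply carries the same ID
        and rcode(response) == 0       # 0 = NOERROR
    )


def availability(results):
    """Percentage of successful probes in a list of booleans."""
    return 100.0 * sum(results) / len(results) if results else 0.0


# Example usage (performs live network queries, so it is left commented out):
# servers = ["dns1.p06.nsone.net", "dns2.p06.nsone.net",
#            "dns3.p06.nsone.net", "dns4.p06.nsone.net"]
# print(availability([query_ok(s, "yelp.com") for s in servers]))
```

Because the query goes straight to the authoritative server rather than through a recursive resolver, a cached answer cannot mask a failure, which is the property the tests above rely on.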

Interestingly, if we take a look at the map view of the Cloud Agents querying these servers in Figure 5, the outage looks focused on Europe and North America. Given that NS1’s DNS infrastructure is anycast-based, and anycast traffic tends to flow to the nearest Autonomous System advertising the IP prefix, this could mean that the attack traffic originated primarily in these regions.

Figure 5: The outage was confined to Europe and North America, suggesting that the attack was initiated primarily in these regions.

NS1 on Fire

At this point, it became obvious that NS1, the parent company that owns the nsone domain, was going through a DDoS attack. This was later confirmed by NS1 as well.

NS1 is an anycast DNS provider. Anycast is an addressing and routing methodology where multiple physical endpoints are logically denoted by a single IP address. While the distributed nature of anycast makes the network more resilient to DDoS attacks, it is not foolproof. Even during the peak of the attack, a handful of our Cloud Agents were able to resolve Yelp.com (as shown in Figure 6), indicating that while the DNS infrastructure was under attack, it was not completely disrupted.

Figure 6: A few of the Cloud Agents were able to resolve Yelp.com to an IP address, highlighting the anycast nature of NS1’s DNS infrastructure.

DDoS attacks are not necessarily directed against a single target. They vary in nature and can be widespread, hitting more than one asset or region. In addition to targeting the DNS infrastructure, the attackers also targeted the hosting provider of ns1.com. A Page Load test to ns1.com indicated that while there were no DNS-related errors, there were many SSL and Receive errors, as shown in Figure 7. In this situation, NS1 was a clear target, with multiple NS1 assets falling victim to the attacks.

Figure 7: Page Load test to ns1.com under duress, indicating that the DDoS attack was targeted towards the hosting provider of NS1.com as well.

Mitigating a DDoS Attack

A DDoS attack creates havoc by interrupting services, and recovering from one requires complex mitigation strategies. Being attacked is only half the problem; cleaning up is a whole other story. Organizations use a combination of different mitigation strategies based on the type and nature of the attack. These can involve on-premises, appliance-based filtering approaches, partnering with ISPs to filter or blackhole traffic, or cloud-based mitigation techniques where enterprises use DNS or BGP to reroute traffic to a third-party mitigation vendor.

NS1 responded quickly to the DDoS attack with a mix of filtering and cloud-based mitigation techniques. Starting at about 10:00 am PST, we saw several BGP route changes affecting the NS1 authoritative name server prefixes. Before the attack, NS1 was peering with Telia, Cogent and Tinet. Reacting to the DDoS attack, NS1 started peering with Zenedge, a SaaS-based DDoS mitigation vendor. Around 70% of our BGP monitors observed path changes and a new peer, as seen in Figure 8.

Figure 8: BGP visualizations show NSONE peering with Zenedge, a DDoS mitigation vendor.
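The kind of comparison behind that observation can be sketched in a few lines of Python. This is an illustrative sketch only, not the logic of the BGP visualization: it assumes each monitor reports one AS path per prefix as a list ending at the origin network, and it uses network names rather than AS numbers. The function names and the input layout are hypothetical.

```python
def upstream_of(path, origin):
    """Return the network immediately before `origin` in an AS path, or None."""
    if origin in path:
        position = path.index(origin)
        if position > 0:
            return path[position - 1]
    return None


def summarize_changes(before, after, origin):
    """Compare per-monitor AS paths taken before and after an event.

    `before` and `after` map a monitor name to its AS path (a list of
    network names ending at `origin`). Returns a tuple of:
      - the set of upstreams seen after the event that no monitor saw before
      - the percentage of monitors whose path changed
    """
    known = {upstream_of(path, origin) for path in before.values()}
    known.discard(None)

    changed = [m for m in after if after[m] != before.get(m)]

    new = {upstream_of(after[m], origin) for m in changed} - known
    new.discard(None)

    share = 100.0 * len(changed) / len(after) if after else 0.0
    return new, share
```

Fed with snapshots taken before and after the mitigation began, a function like this would flag a mitigation vendor as a newly seen upstream and report what share of monitors observed a path change.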

Cloud-based mitigation techniques are becoming a more popular and reliable way to mitigate DDoS attacks. If you are interested in learning more about DDoS mitigation, take a look at another example of a cloud-based DDoS mitigation technique employed by a large financial organization.

Apart from engaging a DDoS mitigation vendor, NS1 also resorted to ACL-based filtering to block the attack. End-to-end network tests targeting the authoritative name servers on TCP port 53 indicated 100% packet loss at the edge nodes, as shown in Figure 9. The 100% packet loss persisted over the next two days, an indication that cleaning up after a DDoS attack is not only complex, but also time-consuming.

Figure 9: NS1 was blocking all TCP port 53 packets resulting in 100% packet loss at the edge.
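A rough way to check for this kind of filtering yourself is to attempt TCP connections to port 53 and count the failures. The Python sketch below does that with the standard library. Note that it measures failed connection attempts rather than per-packet loss, so it only approximates what the network tests above report; the function names and the two-second timeout are assumptions.

```python
import socket


def tcp_port_open(host, port=53, timeout=2.0):
    """True if a TCP connection to host:port completes within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, filtered (timeout) or unresolvable
        return False


def loss_percent(outcomes):
    """Percentage of failed attempts in a non-empty list of booleans."""
    if not outcomes:
        raise ValueError("at least one probe outcome is required")
    failed = sum(1 for ok in outcomes if not ok)
    return 100.0 * failed / len(outcomes)


# Example usage (performs live connection attempts, so it is commented out):
# outcomes = [tcp_port_open("dns1.p06.nsone.net") for _ in range(10)]
# print(loss_percent(outcomes))
```

A result of 100% from several vantage points while UDP queries still succeed would be consistent with an ACL dropping TCP port 53, although a single vantage point cannot tell filtering apart from a local network problem.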

At the end of the day, you can’t completely prevent a DDoS attack. Organizations employ multi-level security systems and deploy complex networks in the hope that they won’t be attacked, but attackers keep finding new ways to breach them. There are, however, ways to minimize the impact of these attacks on your networks. For example, if DNS is a critical part of your infrastructure, then redundancy is key. Distributing your DNS services among two or more DNS providers is recommended. You can also secure your DNS infrastructure by adhering to some common best practices.
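As a quick sanity check on provider redundancy, you can group a domain’s NS host names by provider and count the groups. The sketch below is a simplification under a stated assumption: it treats the last two labels of each host name as the provider, which breaks for multi-part suffixes such as co.uk and for providers that operate under several domains. The function names are hypothetical.

```python
from collections import defaultdict


def group_by_provider(nameservers):
    """Group NS host names by their last two labels (a naive provider key)."""
    groups = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        groups[".".join(labels[-2:])].append(ns)
    return dict(groups)


def has_provider_redundancy(nameservers, minimum=2):
    """True if the name servers span at least `minimum` distinct providers."""
    return len(group_by_provider(nameservers)) >= minimum
```

Applied to the four yelp.com name servers listed earlier, which all sit under nsone.net, this check would report a single provider.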

Visibility into an ongoing DDoS attack is critical. Monitoring your network for potential threats allows you to be immediately notified of critical changes and service interruptions and also gives you insight into whether mitigation strategies are going according to plan. Find out more about using ThousandEyes to monitor and analyze DDoS attacks with our white paper, ThousandEyes for DDoS Attack Analysis, or immediately get started with monitoring BGP and DDoS attacks with a free version of ThousandEyes.
