Border Gateway Protocol (BGP) is a key component of Internet routing, responsible for exchanging information on how Autonomous Systems (ASes) can reach one another. When BGP issues occur, inter-network traffic can suffer anything from packet loss and latency to complete loss of connectivity. This makes BGP an important protocol for network operators to understand and troubleshoot.
Using BGP route visualizations from real events, we’ll illustrate four scenarios where BGP may be a factor to consider while troubleshooting:
- Peering changes
- Route flapping
- Route hijacking
- DDoS mitigation
One common scenario where BGP comes into play is when a network operator changes peering with ISPs. Peering can change for a variety of reasons, including shifts in commercial peering relationships, equipment failures or maintenance. During and after a peering change, it is important to confirm reachability to your service from networks around the world. ThousandEyes presents reachability and route change views, as well as proactive alerts, to help troubleshoot issues that may occur.
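One way to spot a peering change programmatically is to compare the set of upstream ASes (the neighbor just before the origin in each observed AS path) between two snapshots of routes. The sketch below is illustrative; the function names and the sample AS paths are assumptions, not part of any ThousandEyes API:

```python
def upstreams(as_paths, origin):
    """Collect the AS that each path traverses immediately before the origin.

    as_paths is a list of AS paths, each a list of AS numbers ending at the
    origin AS (hypothetical input format for this sketch).
    """
    result = set()
    for path in as_paths:
        if len(path) >= 2 and path[-1] == origin:
            result.add(path[-2])
    return result

def peering_diff(before, after, origin):
    """Report which upstream ASes were withdrawn or added between snapshots."""
    old, new = upstreams(before, origin), upstreams(after, origin)
    return {"withdrawn": old - new, "added": new - old}
```

Running this across snapshots taken before and after a suspected event surfaces exactly the kind of upstream withdrawal described next.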
An example of a peering change in action comes from GitHub and its upstream ISPs. In Figure 1, we see the peering relationships of GitHub (AS 36459) change as routes through Level 3 (AS 3356 and AS 3549) are withdrawn. This is important data when tracking down network performance problems, which can be caused or aggravated by major or frequent peering changes.
Route flapping occurs when routes are advertised and then withdrawn in rapid succession, often as a result of equipment or configuration errors. Flapping frequently causes packet loss and degrades performance for traffic traversing the affected networks. Route flaps are visible in ThousandEyes as repeating spikes in route changes on the timeline.
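Flap detection of this kind can be sketched as a sliding-window count of route state changes per prefix. The threshold, window size, and update-feed format below are illustrative assumptions, not a specific monitoring product's logic:

```python
from collections import defaultdict

# Hypothetical BGP update feed: (timestamp_seconds, prefix, event) tuples,
# where event is "announce" or "withdraw".
FLAP_THRESHOLD = 4    # state changes within the window that we call a flap
WINDOW_SECONDS = 900  # 15-minute observation window (illustrative)

def find_flapping_prefixes(updates):
    """Return prefixes whose routes changed state too often in one window."""
    events = defaultdict(list)
    for ts, prefix, _event in updates:
        events[prefix].append(ts)
    flapping = []
    for prefix, stamps in events.items():
        stamps.sort()
        # Slide a window anchored at each event and count events inside it.
        for start in stamps:
            in_window = [t for t in stamps if start <= t < start + WINDOW_SECONDS]
            if len(in_window) >= FLAP_THRESHOLD:
                flapping.append(prefix)
                break
    return flapping
```

A prefix that is announced and withdrawn twice within fifteen minutes would trip this check, which matches the rough shape of the event described next.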
While monitoring Ancestry.com, a popular genealogy website, we noticed a route flap with its upstream providers. In this case, shown in Figure 2, the route flap with XO Communications lasted for about 15 minutes and disrupted connectivity from networks such as GTT and NTT, while others, such as Level 3 and Cogent, which peered with American Fiber, had no issues. For an in-depth look at this route flapping event, read our post on Monitoring BGP Routes with ThousandEyes.
Route hijacking occurs when a network advertises a prefix that it does not control, either by mistake or in order to maliciously deny service or inspect traffic. Since BGP advertisements are generally trusted among ISPs, errors or improper filtering by one ISP can propagate quickly through routing tables around the Internet. As an AS operator, route hijacking is evident when the origin AS of your prefixes changes or when a more specific prefix is advertised by another party. In some cases the effects may be localized to only a few networks, but in serious cases a hijack can affect reachability from the entire Internet. You can set alerts in ThousandEyes to notify you of route changes and new subprefixes.
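Both signals described above, a changed origin AS and a more-specific announcement, can be checked mechanically against a list of prefixes you operate. The prefix and ASNs below are documentation/private-range placeholders, and the function is a minimal sketch rather than a production validator (real deployments typically use RPKI or IRR data):

```python
import ipaddress

# Prefixes we operate, mapped to their legitimate origin ASes.
# 203.0.113.0/24 is a documentation prefix; 64500 is a private ASN.
EXPECTED = {
    ipaddress.ip_network("203.0.113.0/24"): {64500},
}

def check_route(prefix_str, origin_as):
    """Classify an observed announcement against our expected origins."""
    prefix = ipaddress.ip_network(prefix_str)
    for ours, origins in EXPECTED.items():
        if prefix == ours and origin_as not in origins:
            # Someone else claims to originate our exact prefix.
            return "origin hijack"
        if prefix != ours and prefix.subnet_of(ours):
            # A more-specific prefix wins on longest-prefix match,
            # so it steals traffic even if our route stays up.
            return "more-specific hijack"
    return "ok"
```

The comment on the more-specific case captures why subprefix hijacks are especially damaging: longest-prefix matching means the bogus route is preferred everywhere it propagates.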
In early April 2014, Indosat, a large Indonesian telecom, incorrectly advertised a majority of the global routing table, in effect claiming that its network was the destination for a large portion of the Internet. The CDN Akamai was particularly hard hit, with a substantial portion of its customers' traffic rerouted to the Indosat network for nearly an hour. We can see this hijack play out in a test for one of Akamai's customers, PayPal. Figure 3 shows the hijack in progress, with two origin ASes: the correct one for Akamai (AS 16625) and the incorrect one for Indosat (AS 4761), which for approximately 30 minutes was the destination for 90% of our public BGP vantage points. While this hijack was not intentional, the effects were nonetheless serious.
For companies using cloud-based DDoS mitigation providers, such as Prolexic and Verisign, BGP is a common way to shift traffic to these providers during an attack. Monitoring BGP routes during a DDoS attack is important to confirm that traffic is being routed properly to the mitigation provider's scrubbing centers. In the case of DDoS mitigation, you'd expect to see the origin AS for your prefixes change from your own AS to that of your mitigation provider.
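That expectation can be turned into a simple check over the origin ASes reported by your BGP vantage points. The ASNs below are private-range placeholders for your own network and a hypothetical mitigation provider:

```python
# Illustrative ASNs: 64500 stands in for our own network,
# 64496 for the DDoS mitigation provider's scrubbing network.
OUR_AS = 64500
MITIGATION_AS = 64496

def mitigation_state(observed_origins):
    """Classify per-monitor origin ASes observed during a DDoS event.

    observed_origins is a list of the origin AS each vantage point
    currently sees for our prefix (hypothetical input for this sketch).
    """
    origins = set(observed_origins)
    if origins == {MITIGATION_AS}:
        return "fully diverted"      # all monitors route via the provider
    if MITIGATION_AS in origins:
        return "partially diverted"  # diversion still propagating
    if origins == {OUR_AS}:
        return "not diverted"
    return "unexpected origin"       # worth investigating as a hijack
```

During onboarding of mitigation you would expect a brief "partially diverted" phase while the new advertisements propagate, and the reverse sequence when mitigation is withdrawn.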
We can see this origin AS change in a real example from a global bank that was subject to a DDoS attack. In Figure 4 we can see routes to the bank's AS withdrawn and new routes to the cloud-based DDoS mitigation vendor advertised. The process then happens in reverse at the end of the attack, when mitigation is turned off. Read more about Using BGP to Reroute Traffic During a DDoS Attack.
Monitoring and troubleshooting BGP is a crucial part of managing most large networks. Visibility into BGP route changes and reachability is a powerful tool for operators to correlate events and diagnose root causes. For more information about tracking and correlating BGP changes with ThousandEyes, check out our on-demand webinar, Visualizing and Troubleshooting BGP, or sign up for a free version of ThousandEyes.