Dissecting Network Outages: Irish Sea Cable Cut and Yahoo!

Posted on February 10th, 2015
March 18th, 2015

Last month I presented Dissecting Significant Outages from 2014 to several hundred networking experts at UKNOF30 in London. The talk highlighted ways to find insights from active monitoring in order to diagnose and mitigate network outages and threats. We covered the Craigslist DNS Hijack, Indosat BGP Hijack/Leak, Country Financial BGP Prepending and HSBC America DDoS Mitigation.

Yahoo! Email Outage in Europe

In addition, I presented an event that resonated with everyone in the audience: the slow performance of Yahoo! mail services across Europe in November 2014. Yahoo! and the related Sky and British Telecom email services were severely impacted by a cable cut in the Irish Sea. The issue affected most European countries and was completely resolved only after 11 days! You can follow along with the timeline of this event with this interactive data set.

On November 20th at 06:45 UTC, ThousandEyes observed a significant increase in latency to Yahoo!’s data center in Dublin, its primary European data center at the time. Yahoo!, Sky and British Telecom email servers became very slow or entirely unavailable. During the first 3 hours, availability of Yahoo!’s main site fell below 25%, while response time (time to first byte) exceeded 1 second, double the normal time.

Figure 1: Availability drop to Yahoo.com lasts for over 3 hours.

Latency from locations across Ireland, the UK and Continental Europe went from an average of 35 milliseconds to more than 110 milliseconds.
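
As a rough illustration of how a jump like this can be flagged automatically, the sketch below compares each latency sample against a baseline and reports the ones that exceed it by a chosen factor. The sample values, the `find_latency_spikes` helper and the 2x factor are all assumptions made for illustration; they are not ThousandEyes data or its alerting logic.

```python
from statistics import mean


def find_latency_spikes(samples, baseline_ms, factor=2.0):
    """Return (timestamp, latency_ms) pairs whose latency exceeds factor * baseline_ms."""
    threshold = baseline_ms * factor
    return [(ts, ms) for ts, ms in samples if ms > threshold]


# Hypothetical round-trip samples (UTC time, milliseconds), illustrative only.
samples = [
    ("06:00", 34.0),
    ("06:15", 36.0),
    ("06:30", 35.0),
    ("06:45", 112.0),
    ("07:00", 118.0),
]

# Use the first three samples as the "normal" baseline.
baseline = mean(ms for _, ms in samples[:3])
spikes = find_latency_spikes(samples, baseline)

print(f"baseline: {baseline:.1f} ms")
for ts, ms in spikes:
    print(f"{ts} UTC: {ms:.0f} ms ({ms / baseline:.1f}x baseline)")
```

A fixed multiple of a short baseline is the simplest possible rule; in practice a longer baseline window or a percentile-based threshold would be less sensitive to a single noisy sample.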

Figure 2: Latency to Yahoo! from 13 European locations, with latency spiking from an average of 35ms to more than 110ms.
Figure 3: Latencies exceed 80ms for all European locations.

Under normal conditions traffic from Europe and the UK flowed to Yahoo!’s data center in Dublin (“ir” in the hostname) through an undersea cable from the UK to Ireland and one from Amsterdam to Ireland. Latencies on the UK to Ireland cable averaged 20-25ms.

Figure 4: Typical traffic paths to Dublin, with the Amsterdam-Ireland cable above and the UK-Ireland cable below.

Finding the Culprit

Between 6:00 and 6:15 UTC on November 20th, a Cogent cable repair ship in the Irish Sea accidentally cut a key submarine cable while trying to fix another damaged cable, causing widespread internet outages in Ireland. Latencies on the UK-Ireland cable rose from 25ms to more than 90ms. End-to-end packet loss, however, showed only a small spike, peaking at 4% around the time of the outage.

Figure 5: At 6:15 UTC locations in Europe that traversed the UK-Ireland cable were showing dramatically higher latencies over the undersea segment.
Figure 6: Low levels of packet loss after cable was cut (approximately 6:15) but before rerouting (approximately 6:45).

Within 30 minutes of the cable cut, traffic was already being rerouted to a US data center in Lockport, New York, near Buffalo (hence the “bf” hostname). The round trip across the Atlantic explains the dramatic increase in latencies from Europe (a New York to London round trip is typically 80ms). In the Path Visualization view, the node upstream of the data center is marked in red due to packet loss, probably because of the unexpected volume of traffic from Europe that still needed to be load-balanced. As a result, US users also began experiencing poor performance when accessing their email.
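
To see why a trans-Atlantic detour adds so much latency, it helps to estimate the physical floor on round-trip time. The sketch below computes the great-circle distance between London and New York and the time light needs to cover it twice in optical fiber. The city coordinates and the refractive index of 1.47 are assumptions chosen for illustration, and real cables follow longer paths than the great circle.

```python
import math

EARTH_RADIUS_KM = 6371.0
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47  # assumed typical value for silica fiber


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_fiber_rtt_ms(distance_km):
    """Lower bound on round-trip time over fiber, ignoring routing and equipment delay."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fiber_km_s * 1000


# Approximate city-center coordinates (assumed for illustration).
london = (51.5074, -0.1278)
new_york = (40.7128, -74.0060)

distance = great_circle_km(*london, *new_york)
rtt = min_fiber_rtt_ms(distance)
print(f"great-circle distance: {distance:.0f} km")
print(f"minimum fiber round trip: {rtt:.1f} ms")
```

Under these assumptions the distance comes out at roughly 5,570 km and the floor at roughly 55ms. The gap between that floor and the typical 80ms is consistent with indirect cable routes plus switching and queuing delay along the path.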

Figure 7: Rerouted paths to Buffalo NY, with high latency trans-Atlantic links in red.
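
The “ir” and “bf” hostname hints mentioned above can be turned into a simple check of which data center a test is actually reaching. The sketch below is hypothetical: the two site codes come from this post, but the hostname layout (a dot-separated label made of the code plus optional digits) and the example.net names are assumptions, not Yahoo!’s real naming scheme.

```python
import re

# Site codes mentioned in this post; the label format matched below is an assumption.
SITE_CODES = {
    "ir": "Dublin, Ireland",
    "bf": "Lockport (Buffalo), New York",
}


def site_from_hostname(hostname):
    """Return the site for the first label made of a known code plus optional digits, else None."""
    for label in hostname.lower().split("."):
        match = re.fullmatch(r"([a-z]+)\d*", label)
        if match and match.group(1) in SITE_CODES:
            return SITE_CODES[match.group(1)]
    return None


# Hypothetical hostnames, for illustration only.
for name in ("mail.ir2.example.net", "mail.bf1.example.net", "www.example.net"):
    print(name, "->", site_from_hostname(name))
```

Tracking this value over time for each vantage point would show the moment European tests stopped landing in Dublin and started landing in the US.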

The performance degradation and rerouting of traffic across the Atlantic lasted for more than 2 weeks, prompting many Yahoo! email users to switch to other email providers. Widespread complaints also appeared on Twitter and various tech forums. Since the outage, Yahoo! has begun using additional data centers in Europe to serve email traffic, including one near Geneva.

Check out the slides or the video recording of my whole UKNOF talk to see the other outages presented, and if you haven’t taken advantage of the new ThousandEyes Lite, our recently launched free tier, you can start measuring and monitoring your own infrastructure in minutes here.
