Businesses looking to capitalize on digital transformation efforts are building strategies around cloud environments and cloud services, hoping to improve service connectivity to end users and reduce overhead related to on-premises data centers. For years, IT decision-makers based their cloud computing strategies predominantly on criteria such as web application services offered, pricing tiers and global data center presence, while data on critical areas, such as performance comparisons, was largely unavailable. As a result, accounting for complex multi-cloud infrastructures combined with existing IT infrastructures was a real challenge for IT teams.

Last year, we introduced the industry’s first comparative look at performance metrics of three public cloud service providers: Amazon Web Services (AWS), Google Cloud Platform (GCP) and Microsoft Azure. This year, we’ve added IBM Cloud and Alibaba Cloud into the cloud application mix while also examining North American broadband ISP performance, connectivity to and from China as well as AWS’ Global Accelerator—raising the number of measurements to more than 320 million data points.

The resulting 2019 Cloud Performance Benchmark provides a unique, unbiased, third-party, metric-based perspective on cloud performance and cloud monitoring as it relates to both end-user experience and back-end application architecture. In this blog post, I’ll share our top findings from the study and how performance issues impact you.

Finding 1: Some clouds rely heavily on the public Internet to transport traffic while others do not. Although the tested cloud providers generally demonstrated comparable bi-directional network latency, architectural and connectivity differences do have an impact on traffic between users and certain cloud hosting regions. For example, Azure and GCP use their backbones extensively to carry user-to-hosting-region traffic. AWS and Alibaba, by contrast, rely heavily on the Internet for user traffic transport, while IBM takes a hybrid approach. Exposure to the Internet increases unpredictability in performance, creates risk for cloud strategies and raises operational complexity, so enterprises planning public cloud connectivity should consider their organization’s tolerance for exposure to the unpredictable nature of the Internet.

Figure 1: Cloud connectivity falls into two camps.

Finding 2: Significant cloud performance anomalies exist depending on provider, hosting region and user locations. While the five cloud providers exhibited comparable, robust network performance across North America and Western Europe, performance exceptions surfaced in Asia and Latin America. For example, from Europe to regions in India, GCP exhibits 2.5x-3x the network latency of AWS, Azure and Alibaba Cloud. In another example, network latency from Rio to GCP’s São Paulo hosting region is 6x that of the other three cloud providers due to a suboptimal reverse path. When choosing public cloud regions, enterprises should include user-to-hosting-region performance data in their selection criteria.

Finding 3: All cloud providers, including Alibaba, pay a performance toll when crossing the Great Firewall of China. Sitting in between Chinese citizens and the global Internet is the Great Firewall of China, a sophisticated content filtering machine. Our study reveals that Internet traffic to and from China, irrespective of which cloud hosting region it is destined to or originating from, is subject to high packet loss—a characteristic that was not common across any other political or geographical boundaries.

Figure 2: Performance toll, in the form of packet loss, is seen for all cloud provider traffic crossing the Great Firewall of China.

Finding 4: AWS Global Accelerator doesn’t always outperform the Internet. While the Global Accelerator uses an optimized route through AWS’ densely connected backbone, performance improvements were not uniform across the globe. In many cases, the Global Accelerator beats the Internet connectivity path in performance, but there are also examples of negligible improvement and even cases of worse performance when compared to default AWS connectivity. Enterprises considering the benefits of the Global Accelerator should validate performance gains when developing their strategy to ensure ROI for their unique deployment.
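One simple way to approach that kind of validation is to collect round-trip latency samples over both paths from the same user location and compare them. The Python sketch below is illustrative only: the function name, the sample values and the 5 percent threshold are assumptions made for this example, not figures or methods from the benchmark.

```python
from statistics import median

def compare_paths(default_ms, accelerated_ms, min_gain=0.05):
    """Compare two sets of latency samples (milliseconds) by their medians.

    min_gain is a hypothetical threshold: relative changes smaller than
    this, in either direction, are treated as negligible.
    """
    base = median(default_ms)
    accel = median(accelerated_ms)
    # Positive gain means the accelerated path has lower median latency.
    gain = (base - accel) / base
    if gain >= min_gain:
        verdict = "improved"
    elif gain <= -min_gain:
        verdict = "worse"
    else:
        verdict = "negligible"
    return gain, verdict

# Made-up samples from a single vantage point, for illustration only.
default_samples = [182.0, 190.5, 176.2, 201.3, 185.0]
accelerated_samples = [150.1, 148.7, 155.0, 149.9, 160.2]

gain, verdict = compare_paths(default_samples, accelerated_samples)
print(f"median latency change: {gain:.1%} ({verdict})")
```

Running the same comparison for each user location that matters to the business would show where the accelerated path helps, where it makes little difference and where it is slower.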

Finding 5: US broadband ISP choice makes a difference in cloud performance. Broadband performance is relatively consistent across providers, but performance anomalies do occur, even in the mature US market. One example we observed involves traffic from Verizon-connected sites in Seattle, San Jose and Los Angeles accessing GCP’s us-west2 region in Los Angeles. In this scenario, traffic is routed to enter the Google backbone in New Jersey before being routed back to the hosting region in Los Angeles. This suboptimal routing results in up to 10x the expected network latency. Enterprises considering a hybrid WAN deployment should ensure sound Internet visibility so that they can detect anomalies and work collaboratively with their ISP or cloud provider to resolve them.

Figure 3: Suboptimal peering between Verizon and GCP LA.

Cloud Results May Vary

The major public cloud providers are constantly making optimizations to their networks to (hopefully) improve performance and stability for their end users. In our tests, Amazon Web Services (AWS) performance predictability metrics improved noticeably over the last year, especially in Asia, which showed a 42 percent reduction in latency variability. However, when compared to Azure and GCP, AWS still has lower performance predictability due to its extensive reliance on the Internet rather than leveraging its own backbone for delivery.
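To make a figure like "a 42 percent reduction in latency variability" concrete, here is a small Python sketch of the arithmetic. It assumes variability is measured as the standard deviation of latency samples, which is an assumption made for illustration and may not match the metric used in the benchmark. The function name and sample values are also hypothetical.

```python
from statistics import pstdev

def variability_reduction(previous_ms, current_ms):
    """Return the relative drop in latency standard deviation between two periods.

    A result of 0.42 would correspond to a 42 percent reduction.
    """
    prev = pstdev(previous_ms)
    curr = pstdev(current_ms)
    return (prev - curr) / prev

# Made-up latency samples (milliseconds) for two measurement periods.
last_year = [100.0, 120.0, 80.0, 110.0, 90.0]
this_year = [100.0, 110.0, 90.0, 105.0, 95.0]

reduction = variability_reduction(last_year, this_year)
print(f"variability reduction: {reduction:.0%}")
```

Both sample sets have the same average latency here, which highlights that predictability can improve even when typical latency does not change.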

Google Cloud Platform, on the other hand, continues to favor its own backbone for user-to-hosting-region traffic delivery. While it has excellent response time in most regions, it still has some significant global gaps. Our tests revealed that traffic from Europe and Africa takes 2.5-3 times longer to get to India, going around the rest of the world instead of taking a direct route. Google Cloud also decreased visibility into its internal network, making it harder for its users to understand its network paths and performance.

Microsoft Azure continues its strong network performance based on aggressive use of its own backbone to carry user traffic to cloud hosting regions. Between 2018 and 2019, performance predictability in Sydney improved by an average of 50 percent, whereas in India, there was a 31 percent decrease in performance predictability. Despite a slight decrease year over year, Azure continues to lead in performance predictability in Asia when compared to the other cloud providers.

Added for this year’s study, both Alibaba Cloud and IBM Cloud offer comparable performance to other cloud providers. While Alibaba Cloud relies heavily on the Internet rather than its own private network backbone for the majority of user traffic transport, IBM takes a hybrid approach to traffic delivery by using its own private backbone and the public Internet, depending on which regions user traffic is accessing.

Complex Cloud Architectures Require a Data-Driven Approach

Cloud architectures are complex, and what is right for one enterprise might not be best for another. When it comes to choosing providers, regions and connectivity approaches, cloud architectures are best built through the lens of one’s own business. For enterprises embracing digital transformation and venturing into the public cloud, the metrics and insights presented in the benchmark report can serve as a data-driven guide to best practices and decision-making for operating in the cloud.

Want to know more? Get your free copy of the Cloud Performance Benchmark to read the in-depth, expert analysis on cloud provider performance and connectivity.
