The Evolution of Network Monitoring

Posted on July 1, 2014

In the last 10 years we’ve seen a radical change in the way applications are delivered in enterprise environments. Workstations and desktops have been replaced with mobile devices such as phones and tablets, or with thin clients that connect to centralized VDI servers. On-premises bare-metal servers have been virtualized and consolidated, and applications that used to live in the data center (e.g. CRM) are now in the cloud (e.g. salesforce.com). The enterprise data center that used to be the hub for application delivery is now just another segment in the application delivery chain. The Internet has become the new hub, connecting a myriad of different clients to an ever-growing number of SaaS applications. Network monitoring technologies that worked in the 2000s, most of them clunky boxes, now have a diminished role in this new paradigm. Current environments are much more complex, diverse and distributed, and require a new way to do network monitoring.

10 Years Ago: The Data Center as the Hub

Enterprise network architectures of a decade ago were simple: employees in branch offices or on campus used workstations to access applications hosted in the enterprise data center (Figure 1). Branch offices connected to data centers through point-to-point T1 links, and only a small fraction of traffic travelled upstream through Internet Service Providers. Critical apps were all deployed on-premises. Network monitoring solutions were themselves fully deployed on-premises and usually consisted of hardware appliances that analyzed traffic from a SPAN port of a switch.

Figure 1: Enterprise network 10 years ago

In this environment, most of the network flows only had to travel a small number of hops (e.g. 4) on the journey from the client to the server, which made fault isolation easier for a few reasons:

  • the number of hops traversed by packets was small
  • network traffic stayed within the same administrative domain
  • clients were static, mostly workstations connected to the network through wired Ethernet cables

Identifying performance degradation was straightforward: it could be based, for example, on the number of out-of-order TCP packets between endpoints or on an increase in round-trip time.
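
As a rough illustration of these two signals, here is a minimal Python sketch. The function names, the 1.5x threshold and the sample values are assumptions made for the example; they do not come from any particular monitoring product, and the out-of-order check ignores TCP sequence number wraparound.

```python
from statistics import median


def rtt_degraded(baseline_ms, recent_ms, factor=1.5):
    """Return True when the median recent RTT exceeds the baseline median by `factor`.

    `factor` is an illustrative threshold, not a recommended value.
    """
    if not baseline_ms or not recent_ms:
        raise ValueError("both sample lists must be non-empty")
    return median(recent_ms) > factor * median(baseline_ms)


def out_of_order_ratio(seq_numbers):
    """Fraction of segments that arrive with a sequence number lower than
    the highest one already seen (wraparound is ignored in this sketch)."""
    highest = None
    out_of_order = 0
    for seq in seq_numbers:
        if highest is not None and seq < highest:
            out_of_order += 1
        else:
            highest = seq
    return out_of_order / len(seq_numbers) if seq_numbers else 0.0


if __name__ == "__main__":
    print(rtt_degraded([10, 11, 12], [19, 20, 21]))
    print(out_of_order_ratio([1, 2, 4, 3, 5]))
```

With only a handful of hops inside a single administrative domain, a jump in either number pointed fairly directly at the responsible segment.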

In addition to traffic analysis products, SNMP-based solutions provided a device-centric view of the network that was not necessarily aligned with application performance. For example, if an interface of a switch went down, an alert would be triggered even if no critical traffic was passing through that interface. This type of device-centric alerting could get very noisy and detached from real application performance. In most cases, network engineers had to manually correlate the end-to-end performance inferred from traffic analysis with the per-device data provided by SNMP, often using products from different vendors.
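
The manual correlation described above amounts to asking which device events actually sit on the path of traffic that matters. The Python sketch below shows one way to express that idea; the data layout, the interface names and the `actionable_alerts` function are hypothetical and only meant to make the reasoning concrete.

```python
def actionable_alerts(down_interfaces, critical_flows):
    """Keep only interface-down events that lie on the path of a critical flow.

    down_interfaces: iterable of (device, interface) tuples reported as down.
    critical_flows:  dict mapping a flow name to the list of (device, interface)
                     tuples its traffic traverses (assumed to be known).
    Returns a dict mapping each relevant interface to the flows it affects.
    """
    impacted = {}
    for flow_name, path in critical_flows.items():
        for iface in path:
            impacted.setdefault(iface, []).append(flow_name)
    return {iface: impacted[iface] for iface in down_interfaces if iface in impacted}


if __name__ == "__main__":
    down = [("sw1", "ge-0/0/1"), ("sw2", "ge-0/0/7")]
    flows = {"crm": [("sw1", "ge-0/0/1"), ("rtr1", "xe-0/1/0")]}
    # Only the sw1 interface carries critical traffic, so only it is reported.
    print(actionable_alerts(down, flows))
```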

Today: The Internet as the Hub

Fast forwarding to today, the picture of the enterprise network is radically different (Figure 2). End devices are now mobile and can connect to the enterprise network at the branch using Wi-Fi. Users often telecommute, connecting directly to the Internet from their home office through a local ISP or from public Wi-Fi hotspots. In most cases traffic is still backhauled through the corporate data center over VPN tunnels. T1 point-to-point circuits have been replaced by a multi-hop MPLS WAN that is easier to scale and expand. Applications that were hosted in the data center are now served from the cloud by SaaS providers such as salesforce.com.

Figure 2: Enterprise network today

Network monitoring solutions from 10 years ago are no longer suitable for this new environment of multiple moving pieces, where both endpoints (clients and servers) can be anywhere. Network paths are now much longer and more complex, and include segments completely outside the control of the enterprise (Figure 3). Network fault isolation in this environment requires visibility beyond the four walls of the data center, and can only be achieved by instrumenting the different segments: client devices (e.g. mobile/wireless devices), the corporate data center and the Internet. Active probing has become a key monitoring technique for building a complete hop-by-hop picture of performance between a client and a server. Because the root cause of a problem can be anywhere along the path, troubleshooting no longer depends only on the enterprise team; it is a joint effort that also involves ISPs, third-party PaaS/IaaS providers and the providers of the application (SaaS).

Figure 3: Network paths are now more complex than ever
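
To make hop-by-hop fault isolation concrete, the Python sketch below takes per-hop round-trip times, as an active probe such as a traceroute-style measurement might report them, and returns the first hop where latency jumps. The `Hop` structure, the segment labels, the 50 ms threshold and the sample addresses (taken from documentation ranges) are all assumptions for illustration. Real per-hop measurements are noisier, since routers may deprioritize replies to probes.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    address: str
    segment: str   # who operates this hop, e.g. "enterprise", "isp", "saas" (illustrative labels)
    rtt_ms: float  # round-trip time measured from the client to this hop


def first_degraded_hop(hops, max_increase_ms=50.0):
    """Return the first hop whose RTT exceeds the previous hop's RTT by more
    than `max_increase_ms`, or None if no such jump exists."""
    previous = 0.0
    for hop in hops:
        if hop.rtt_ms - previous > max_increase_ms:
            return hop
        previous = hop.rtt_ms
    return None


if __name__ == "__main__":
    path = [
        Hop("192.0.2.1", "enterprise", 1.0),
        Hop("198.51.100.7", "isp", 12.0),
        Hop("198.51.100.9", "isp", 95.0),
        Hop("203.0.113.5", "saas", 101.0),
    ]
    suspect = first_degraded_hop(path)
    if suspect is not None:
        print(f"latency jump at {suspect.address} (segment: {suspect.segment})")
```

Tagging each hop with the party that operates it is what turns a measurement into an action: it indicates whether the enterprise team, the ISP or the application provider needs to be involved.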

Applications are now much more complex and depend on multiple third-party components. Modern network monitoring solutions need to be application-aware and able to single out the underlying problems that actually impact user experience. This is a major difference from the monitoring solutions of the past, which lacked application context and only provided visibility into the network layer or into individual devices.

A New Way to Do Network Monitoring

In the same way that the focus of application delivery shifted from the data center to the cloud, modern network monitoring solutions need to become cloud-based themselves in order to catch up. Monolithic monitoring solutions that lived within the four walls of the data center fall short in capturing the highly distributed nature and complexity of the modern enterprise. Some level of instrumentation is still required in the branch and/or data center, but mostly in the form of “dummy” collection agents that pre-process the data and export aggregates to be processed elsewhere. This approach brings multiple advantages, including centralized management, better support, faster and more frequent version releases, and easier deployment. Some of these are highlighted in the table below.

| Capability | Legacy On-Premises Solutions | Modern Cloud-Based Solutions |
| Data collection | Mostly traffic captured in DC or SNMP polling of devices | Distributed collection agents covering endpoints, branch, DC and cloud using a mix of active and passive measurement techniques |
| Application metrics | Limited | Application-aware and tight integration between application and network layers |
| Data sharing and collaboration | Limited | Enabled in the cloud |
| User interface | On-premises | Cloud |
| Management (collection agent, user access, configuration) | Per-instance local config | Centralized cloud management |
| Software updates | Local, manual trigger | Frequent auto-updates |
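
The lightweight collection agents mentioned earlier pre-process raw measurements locally and export only aggregates. A minimal Python sketch of that pre-processing step could look like the following; the `summarize` function, the field names and the example hostnames are hypothetical, and a real agent would also need to handle scheduling, transport and authentication.

```python
from statistics import mean


def summarize(samples):
    """Collapse raw probe samples into per-target aggregates.

    samples: iterable of (target, rtt_ms) tuples, where rtt_ms is None
             when the probe received no reply.
    Returns a dict of per-target summaries suitable for export.
    """
    by_target = {}
    for target, rtt in samples:
        by_target.setdefault(target, []).append(rtt)

    summary = {}
    for target, rtts in by_target.items():
        answered = [r for r in rtts if r is not None]
        summary[target] = {
            "probes": len(rtts),
            "loss_pct": 100.0 * (len(rtts) - len(answered)) / len(rtts),
            "avg_rtt_ms": mean(answered) if answered else None,
            "max_rtt_ms": max(answered) if answered else None,
        }
    return summary


if __name__ == "__main__":
    raw = [
        ("crm.example.com", 40.0),
        ("crm.example.com", None),
        ("crm.example.com", 60.0),
        ("mail.example.com", None),
    ]
    for target, stats in summarize(raw).items():
        print(target, stats)
```

Exporting a few numbers per target instead of every raw sample keeps the agent simple and leaves the heavier analysis to the centralized service.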

For most organizations that have already invested in on-premises monitoring tools, the next few years will be transitional, with increasing use of cloud-based monitoring. This shift will focus first on additional visibility into external networks and distributed applications. A second wave of change will come as on-premises systems reach their end of life and are replaced with new, lighter-weight cloud-based solutions. We’re excited to advance the development of cloud-based network monitoring and have a few tricks up our sleeve that we’ll be sharing over the coming months.
