Following on the heels of “the great cloud migration”, DevOps is the next tidal wave in digital transformation. It’s a radical re-thinking of traditional IT that focuses squarely on the rapid development and delivery of software and digital services to drive value. The disruptive success of DevOps pioneers like Netflix and Amazon highlights its potential for delivering exponential business value.
Rolling out DevOps is no simple task and requires investing heavily in projects to transform IT processes, upskill teams, integrate a new ecosystem of tools, and build out a technical framework that forms the software delivery engine. However, because of its central role in value creation, it’s equally critical for IT leaders to invest in ensuring the operational availability of all the services that make up the DevOps toolchain.
To understand why, we’ll take a deeper look at DevOps to see why it presents unique operational challenges and why sophisticated monitoring is critical to optimizing the complex toolchain that makes up the DevOps system.
DevOps and the DevOps Toolchain
DevOps is all about transforming IT organizations into software innovation centers that can compete in today’s digital economy. It replaces inefficient, traditional IT practices and siloed structures with a holistic approach that emphasizes collaboration, integration, automation, and self-service. To do this requires an equal investment in new processes and cultural practices along with new tools and technical practices.
The technical innovation at the heart of DevOps is the software delivery pipeline. Delivery pipelines form the “engine” of DevOps, and their purpose is to integrate and automate all the tools and processes involved in translating a business idea into code and shipping that code to production. This could be a web update, a new backend service, a mobile app feature, a network VNF, a new API endpoint, or any other digital service. The overall goal is to make the software delivery process as reliable, iterative, and repeatable as possible, with constant feedback to drive improvements.
The technical practice is referred to as continuous integration and continuous delivery (CI/CD), where each developer code commit triggers a pipeline or sequence of pipelines. Each pipeline is made up of one or more stages, each stage leveraging specific tools to perform specific tasks that move the software one step closer to being released to production. For any given software component, typical pipeline stages can include building, packaging, scanning, testing, securing, deploying, and monitoring the component. The ecosystem of tools supporting the rapid delivery of software continues to explode, including services for emerging use cases like security, key management, data virtualization, and more.
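The commit-triggered, stage-by-stage flow described above can be sketched as a simple abstraction. This is a minimal illustration in Python; the `Stage` and `Pipeline` classes and the stage names are hypothetical, not the API of any real CI/CD tool.

```python
# Illustrative sketch of a CI/CD pipeline: a commit triggers an ordered
# sequence of stages, and a failing stage halts delivery.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Stage:
    name: str
    task: Callable[[str], bool]  # returns True if the stage succeeds

@dataclass
class Pipeline:
    stages: List[Stage] = field(default_factory=list)

    def run(self, commit_id: str) -> bool:
        # Each developer commit triggers the pipeline; stages run in order
        # and the first failure stops the software from moving forward.
        for stage in self.stages:
            if not stage.task(commit_id):
                print(f"{stage.name} failed for {commit_id}")
                return False
            print(f"{stage.name} passed for {commit_id}")
        return True

# Typical stages for a single component, per the text; the tasks here are
# stand-ins for real build/test/scan/deploy tooling.
pipeline = Pipeline([
    Stage("build", lambda c: True),
    Stage("test", lambda c: True),
    Stage("scan", lambda c: True),
    Stage("deploy", lambda c: True),
])
pipeline.run("abc123")
```

In a real toolchain each `task` would shell out to, or call the API of, a dedicated tool, which is exactly why the number of integrated services grows so quickly.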
Operational Complexity of DevOps Toolchains
The DevOps toolchain may sound simple in concept, but in practice—particularly for enterprise organizations—it’s highly complex from an operational point of view. There are four key factors that contribute to toolchain complexity.
- 1. The large number of pipelines and users. The DevOps toolchain is like a factory operating many different assembly lines. The more digital “products” there are, the more pipelines exist. Each application will have at least one, if not more, CI/CD pipelines facilitating its lifecycle.
- 2. The large number of tools. Every stage of a pipeline has a specific purpose in moving the delivery process forward and an entire ecosystem of tools available to support that purpose. That ecosystem continues to explode, especially with the increased availability of cloud-based DevOps tools. Furthermore, different target platforms and architectures require different tools for performing similar pipeline functions. For example, security scanning Docker containers requires different tools than scanning Java WAR files.
- 3. The high level of interconnection. A key DevOps principle is collaboration and integration. As much as possible, you’ll want to share tools across pipelines. For example, your Build/CI, source repository, and artifact repository tools will typically be widely shared, and your container security scanning tool may be shared across any pipeline that builds containers. Your pipelines will also manage dependencies and flow between tools: a built artifact may need to pass a security scan before being automatically pushed to an artifact repository and consumed elsewhere. Pipelines also have dependencies on one another. For example, a full integration or deploy/release pipeline may need to wait for, or be triggered by, upstream CI pipelines. Figure 1 below gives a partial sense of this complexity, showing the interconnected relationship between your many CI/CD pipelines and your DevOps toolchain.
- 4. The increasingly cloud-based nature of DevOps tools. Using cloud-based tools in your DevOps toolchain has huge advantages, including easier integration via standardized service APIs, easier sharing across projects and teams, flexibility to choose the right tool for the job, and availability of more specialized tools for specific tasks.
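The inter-pipeline dependencies described in point 3 form a directed graph: shared tools and upstream CI pipelines must complete before downstream release pipelines can run. A minimal sketch using Python's standard-library `graphlib`; the pipeline names and dependency map are hypothetical.

```python
# Illustrative sketch: pipeline dependencies modeled as a DAG and resolved
# into a valid execution order with the standard library (Python 3.9+).
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the pipelines/tools in
# its set. A deploy/release pipeline waits on upstream CI pipelines, which
# in turn share a container security scanning step.
deps = {
    "deploy-release": {"ci-frontend", "ci-backend"},
    "ci-frontend": {"container-scan"},
    "ci-backend": {"container-scan"},
    "container-scan": set(),
}

# static_order() yields an order where every pipeline runs only after
# everything it depends on has completed.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Real CI/CD orchestrators resolve this kind of graph for you, but the model makes the operational point concrete: an outage in one shared node, such as the scanner here, blocks every pipeline downstream of it.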
All these factors mean your DevOps engine depends on a large and complex network of interconnected services, many of which are cloud services you don’t own. Furthermore, with a largely cloud-based toolset, the Internet becomes the primary transport for all your services. This introduces even more third-party services that can adversely impact DevOps operations, like DNS services, web gateways, CDNs, DDoS mitigation services, and more. It also makes your DevOps toolchain susceptible to Internet vulnerabilities like security threats, BGP issues, and global outages. This cloud-centric architecture makes it very challenging to understand, let alone identify and resolve, any issues adversely affecting performance or availability.
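Even the first hop of that dependency chain, DNS resolution for each cloud tool's hostname, is a measurable third-party service. A minimal sketch of checking it, assuming nothing beyond the Python standard library; the hostnames are illustrative examples of toolchain endpoints.

```python
# Illustrative sketch: measure DNS resolution time for toolchain hostnames,
# one of the third-party Internet services the text calls out.
import socket
import time
from typing import Optional

def dns_lookup_ms(hostname: str) -> Optional[float]:
    """Return resolution time in milliseconds, or None if resolution fails."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, 443)
    except socket.gaierror:
        return None  # resolution failed: a toolchain outage waiting to happen
    return (time.monotonic() - start) * 1000

# Hypothetical list of hostnames your pipelines depend on.
for host in ["github.com", "registry.npmjs.org"]:
    print(host, dns_lookup_ms(host))
```

A slow or failing lookup here would stall every pipeline stage that pulls from that endpoint, before any application traffic is even sent.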
The Impact of DevOps Toolchain Disruption
In the context of digital transformations, DevOps stands out for the absolutely central role it plays as the value delivery engine for your entire business.
Your DevOps toolchain can easily scale to serve thousands of employees, from developers and QA testers to SREs, network ops, and security/compliance teams, all of whom use and depend on its availability. The DevOps toolchain is also like an “engine”, constantly running in high gear and capable of a pace of operation far greater than your traditional IT processes. Like a Formula 1 race car running at top speed, even the smallest malfunction can be relatively costly when you’re maintaining the rapid delivery of thousands of internal and customer-facing applications.
Service interruptions to key DevOps services like GitHub will cause immediate work stoppage. If developers can’t check in code, or pipelines can’t scan for security vulnerabilities, pull artifacts, or run automated tests, your capacity for delivering value comes to a halt.
The value of DevOps is tied directly to the pace of operation. That value is huge. According to a 2018 report by DevOps Research and Assessment (DORA), organizations with a mature DevOps engine deliver 46x more frequently, recover 2,604x more quickly, and get to market 2,555x faster than those without. By contrast, significant disruptions to your DevOps toolchain can be devastating.
Move Beyond Traditional Monitoring for Cloud-Based DevOps
Traditional monitoring tools like SNMP, packet flow, and APM have for some time been used by IT teams to ensure quality and availability of critical business services. These kinds of tools grew up in the context of enterprise applications and business services running primarily in on-prem data centers and branch offices. But when your business-critical services are third-party tools running in the cloud, you no longer own the application, the infrastructure, or most of the network connectivity. This significantly limits the ability of traditional network and application monitoring to provide effective service visibility.
ThousandEyes’ active monitoring approach bridges this gap by providing deep visibility across entire service paths that include your network as well as all the networks, infrastructure, and software you don’t own. ThousandEyes makes it easy to quickly set up monitoring that targets all the cloud and on-prem services that make up your DevOps toolchain. The result is complete visibility from the application layer (including service APIs and DNS) all the way down to the network and routing level, providing a full path view to each specific service. Monitoring can be performed from multiple vantage points as well, like your business locations, data centers, VPCs, or hundreds of pre-deployed cloud locations, giving you insight into the overall health of your DevOps service mesh.
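At its simplest, active monitoring means periodically probing a service endpoint and recording availability and response time from a given vantage point. The following is a bare-bones sketch of that idea using only the Python standard library; it is not the ThousandEyes product or API, and the target URL is just an example of a toolchain dependency.

```python
# Minimal sketch of an active availability/latency probe for an HTTP service.
import time
import urllib.request

def probe(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL once and report availability plus response time."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except OSError as exc:  # covers URLError, timeouts, connection refusals
        return {"url": url, "available": False, "error": str(exc)}
    latency_ms = (time.monotonic() - start) * 1000
    return {"url": url, "available": 200 <= status < 400, "latency_ms": latency_ms}

# Example: check a source-control service your pipelines depend on.
print(probe("https://github.com"))
```

A production-grade approach layers on what a single HTTP fetch cannot see: per-hop network path data, BGP route visibility, DNS resolution behavior, and probes from many geographic vantage points.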
Mature DevOps Requires Mature Toolchain Visibility
A mature DevOps practice and toolchain is necessary to drive the rapid service delivery needed to innovate and stay competitive in today’s software-driven economy. An equally mature monitoring approach that is cloud-ready and Internet-aware is needed to surface DevOps service availability issues before they throttle your capacity to deliver software and stay competitive.
To learn how ThousandEyes helps ensure the availability of business-critical services that are highly cloud-dependent, check out our Five Cloud Migration Issues You Shouldn’t Ignore eBook, or request a demo today.