Earlier this week, Andrew Lerner, a renowned analyst for data center networking and a VP at Gartner Research, wrote about the divide between networking and DevOps, along with strategies for integration, in his blog post Networking and DevOps.
We wholeheartedly agree with Andrew when he states that we must “Treat networking configuration changes and operational tasks that are required to support a DevOps project as code”, which allows network operations teams to “Treat network changes as products, not projects.” This is in line with our founding principles and illustrates the transformation necessary to solve Day 2 operational issues.
Over the many years I spent as a designer and operator of high-profile large data center networks, I have come to understand one common truth that exists across all DCs – the network is the problem. Most people have heard the adage that “the definition of insanity is doing the same thing over and over again and expecting a different result”, and yet as an industry, when it comes to networking, that’s exactly the path we have taken. Networking lags roughly 15 years behind the rest of the infrastructure in adopting innovations (virtualization, Linux-based systems, and now containerization).
Wanting to do more than just talk about these issues is what led us to found SnapRoute, to deliver a modern architectural solution with a fully containerized microservices Network OS (NOS), which bridges the gap and natively brings networking into the DevOps culture (think DevNetOps). DevOps for networking is something that we are very passionate about here at SnapRoute. We see a world that cannot move forward with networking stuck in its silo, isolated from the rest of the infrastructure. To put it plainly – applications matter! As an industry, we need to start approaching networking as a tool which is leveraged to improve an application’s time-to-service.
The way the NOS has been built over the past 30 years is the root cause of the problem – it is preventing the industry from advancing. As an operator, I saw the greatest level of success when I worked hand-in-hand with DevOps teams – making sure there was a high level of coordination between network and applications. Without even realizing it, we were following a primitive version of DevOps – tackling Day 2 operational problems head-on. Despite the success of this collaboration and adoption of DevOps principles, it would inevitably be hampered by the limitations of the Network Operating Systems that we deployed. It became clear that we needed to fundamentally change the way the NOS was built.
The problems that operators are facing today are not the same as when networking was in its infancy – so why are we still building NOSs as if nothing has changed? Monolithic designs require downtime to perform complicated upgrade procedures and result in slow, static networks. The world Andrew describes in his blog can’t be realized without reimagining the architecture of the NOS – for us at SnapRoute this means a NOS that is built for the Cloud Native world.
Cloud Native tools and principles were developed out of the necessity to solve the challenges of Day 2 operations. Kubernetes was born from the need for an orchestrator to manage the containerized microservices that are the result of hyperscaler operators breaking up monolithic code bases. Kubernetes directly drives automation using a declarative model: the operator declares the desired state of the system, and automated control loops continuously work to bring the current state in line with it. Complex and resource-intensive config-driven application models are no longer necessary, and applications can now be rolled out in minutes and en masse – rather than taking months to rip and replace large monolithic systems.
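To make the declarative idea concrete, here is a minimal sketch of a control loop in Python. It is an illustration only, not Kubernetes or SnapRoute code: the service names (bgp, lldp, stp), the version strings, and the functions diff and reconcile are all hypothetical, and state is modeled as a plain dictionary mapping a service name to a version.

```python
# Hypothetical illustration of a declarative control loop.
# Not Kubernetes or SnapRoute code; all names here are assumptions.

def diff(desired, current):
    """Return the actions needed to move `current` toward `desired`."""
    actions = []
    for name, version in desired.items():
        if name not in current:
            actions.append(("start", name, version))
        elif current[name] != version:
            actions.append(("update", name, version))
    for name in current:
        if name not in desired:
            actions.append(("stop", name, None))
    return actions


def reconcile(desired, current):
    """Apply one pass of the loop and return the resulting state."""
    state = dict(current)
    for action, name, version in diff(desired, current):
        if action == "stop":
            del state[name]
        else:
            state[name] = version
    return state


# The operator only declares what should be running.
desired = {"bgp": "1.2", "lldp": "2.0"}
# What is actually running right now.
current = {"bgp": "1.1", "stp": "0.9"}

current = reconcile(desired, current)
assert current == desired
assert diff(desired, current) == []  # a second pass finds nothing to do
```

The point of the sketch is that nobody scripts the individual steps. The loop works out what to start, update, or stop from the difference between the two states, and running it again once they match is a no-op.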
These same concepts can be applied directly to how the NOS is built to gain similar benefits. Network services can be added when needed or updated when bugs or security vulnerabilities are found. The best part – changes run through an automated CI/CD test pipeline, ensuring network, compute and storage are tested in a coordinated fashion and deployed into production only after validation. This is done reliably, safely and quickly, with a much lower risk of causing application impact, and without holding up new deployments that the business requires.
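As a rough sketch of what "deployed only after validation" can look like, the Python below gates a network change behind a list of automated checks. Everything in it is an assumption made for illustration: the change format, the two checks (has_rollback and version_is_pinned), and the run_pipeline function are hypothetical and do not describe any particular CI/CD product.

```python
# Hypothetical sketch of a validation gate in a CI/CD pipeline.
# The change format and the check names are assumptions for illustration.

def has_rollback(change):
    """A change must name the version to fall back to."""
    return "rollback_to" in change


def version_is_pinned(change):
    """A change must pin an explicit version, not a floating tag."""
    return change.get("version") not in (None, "latest")


def run_pipeline(change, checks):
    """Run every check; mark the change deployable only if all pass."""
    failures = [name for name, check in checks if not check(change)]
    return {"deployed": not failures, "failures": failures}


checks = [
    ("has_rollback", has_rollback),
    ("version_is_pinned", version_is_pinned),
]

good = {"service": "bgp", "version": "1.2", "rollback_to": "1.1"}
bad = {"service": "bgp", "version": "latest"}

print(run_pipeline(good, checks))
print(run_pipeline(bad, checks))
```

In a real pipeline the checks would be integration tests exercising network, compute and storage together, but the shape stays the same: a change that fails any check never reaches production.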
This is not a fantasy. Today at SnapRoute we have built a NOS that tackles these Day 2 operational issues and allows “CI/CD for Networking” to become a reality. By utilizing containerized microservices and natively leveraging the Cloud Native approach, we can break the silos that exist between networking and DevOps. Managing the network using the same methods as DevOps allows automated CI/CD test pipelines to test network, compute and storage in concert. The network is controlled through the same API mechanism and the same operational approaches as compute – lowering the barrier for cross-team collaboration. Because changes are tested at both network and application layers prior to deployment, bugs are resolved and security vulnerabilities are fixed without service interruption. Rolling out new features is no longer an intimidating proposition. This leads to more agile network infrastructure, faster time-to-service for applications, and a more secure infrastructure – all in a manner that results in greater uptime and reliability.