Immutability for cloud infrastructure (and for turning pets into cattle)
Immutability is one of the most important attributes for a software system’s success in the cloud. Immutability means that once your infrastructure is instantiated in the cloud, it is never modified in place; to reach a new target deployment, you replace it with a different version. This enables the industrialization of infrastructure: infrastructure units become commodities that teams can replicate quickly and with confidence. Randy Bias of the OpenStack Foundation popularized a wonderful analogy comparing legacy server-based systems with pets and cloud-based immutable infrastructure with cattle. The essence of the analogy is that with legacy systems you had to tend to units of infrastructure as pets: when they got sick, you nursed them back to health. If infrastructure is like cattle, on the other hand, replacing a unit in the herd is quicker and easier. In this post, we discuss how containers have brought immutability inside the compute units of cloud deployments.
Success of the cloud
The success of the cloud can be attributed to cost reduction, improved availability, and easier maintainability, among other factors. One key attribute of the cloud is elasticity. The cloud enabled teams to go from x to 10x or even 1000x infrastructure with little to no friction, and rapidly growing startups bought into this lock, stock and barrel. The cloud let startups compete with established players because the capital required to get a product in front of an audience was reduced dramatically. Not only was infrastructure cheaper, the number of people required to run it was also greatly reduced. Faced with increasing competition from new entrants enabled by the cloud, many players adapted and became cloud compliant. Many established players, however, struggled to adapt. One of the stumbling blocks was that their software systems were monoliths designed to run on servers, and elasticity had never been a concern before the cloud came around.
Why it was not easy to adopt the cloud
Organizations unable to adapt to the cloud world often found mutability to be the biggest blocker to the flexibility required to be cloud-native. During a previous stint at a large company, we evaluated the effort to move our team’s deployments to an internal cloud as part of a larger company-wide effort to modernize our infrastructure. Our team struggled with this effort because most systems were written as monoliths that assumed they would always run on a server, with tightly coupled components, so moving meant rewriting large portions of the codebase. For example, the code assumed it would always write to a local disk, so it was fine to think in terms of files and local namespaces rather than endpoints and buckets. Adopting the cloud meant a rewrite to eliminate such tight dependencies on underlying physical resources, which is in itself an undertaking with huge risks. Many people on the team judged that risk too high a price for the configurability and flexibility that adopting the cloud would bring.
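To make the "files and local namespaces" versus "endpoints and buckets" distinction concrete, here is a minimal Python sketch. Every name in it (`BlobStore`, `LocalDiskStore`, `InMemoryBucket`, `save_report`) is hypothetical and invented for illustration; it is not code from the systems described above, and the bucket is an in-memory stand-in rather than a real object-store client.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical storage interface: callers see keys, not file paths."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class LocalDiskStore:
    """Server-era backend: each key maps to a file under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryBucket:
    """Stand-in for a bucket behind an endpoint (assumption: a real
    implementation would call an object-store client here)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: BlobStore, report_id: str, body: str) -> str:
    """Application code depends only on the interface, not on where bytes live."""
    key = f"reports/{report_id}.txt"
    store.put(key, body.encode("utf-8"))
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for store in (LocalDiskStore(Path(tmp)), InMemoryBucket()):
            key = save_report(store, "q1", "hello")
            print(type(store).__name__, store.get(key))
```

The point of the sketch is that `save_report` never changes when the backend does. In a monolith that calls the filesystem directly from many places, there is no such seam, which is why the move tends to turn into a rewrite.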
At other stints I noticed that teams which had bought into the cloud pitch were still unable to achieve a complete decoupling of their components. The deployment platform could be immutable, but software still travelled as a unit of related services. For example, folks would ship Amazon Machine Images (AMIs) in AWS containing all the services. A new release meant a new AMI. So while the AMIs were immutable and hence replicable, the units inside them were not. Upgrading any piece of software meant rolling out newer versions of these VM-like artifacts. While this reduced the uncertainty introduced by mutable infrastructure, it still suffered from some of the legacy issues, like long release cycles for the whole product rather than quick releases of its individual components.
Containers have enabled true immutability down to the component level. Every single component of the stack can now be shipped independently, and users can start using it immediately. Once a component is finished, a developer can be sure that all of its dependencies will travel with it. Teams now only have to keep the interface of the unit consistent and no longer have to deal with dependency management, release planning, and so on. Each component and all its dependencies form a single immutable unit no matter where they run. Coupled with best practices like unit and integration testing and CI/CD flows, teams can now easily predict how their software will behave without worrying about collisions with other systems on the machine, installation issues, and the like. Once the container is up, it will behave predictably. Additionally, container orchestration mechanisms like Kubernetes, which provide management, scaling, and configuration management for your container-based components, make it easy to reason about external dependencies as logical abstractions. The implication of this ecosystem is increased portability between cloud providers without the huge costs previously associated with it.
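The idea that a component and its dependencies form one immutable unit can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any container runtime represents images: `ComponentImage` and `release` are hypothetical names, and the component and package versions are made up for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ComponentImage:
    """Hypothetical model of a container image: a code version plus pinned dependencies."""

    name: str
    version: str
    dependencies: tuple[tuple[str, str], ...]  # (package, pinned version) pairs

    @property
    def tag(self) -> str:
        return f"{self.name}:{self.version}"


def release(image: ComponentImage, version: str, **pins: str) -> ComponentImage:
    """Build a new image instead of patching the old one.

    The original image is left untouched, so it remains available for rollback.
    """
    deps = dict(image.dependencies)
    deps.update(pins)
    return replace(image, version=version, dependencies=tuple(sorted(deps.items())))


if __name__ == "__main__":
    v1 = ComponentImage("billing", "1.0.0", (("requests", "2.31.0"),))
    v2 = release(v1, "1.1.0", requests="2.32.3")
    try:
        v1.version = "1.0.1"  # in-place patching is rejected by the frozen dataclass
    except FrozenInstanceError:
        print("images are immutable; ship", v2.tag, "instead")
```

Because a release always yields a new object, two deployments that reference the same tag are guaranteed to carry the same dependency set in this model, which is the property that makes behavior predictable across machines.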
Infrastructure as code is the future
When used together, containers and their orchestration mechanisms give teams a very high degree of confidence in what they are building and deploying. All the code that needs to run is made immutable by containers, and all the mechanisms running the stack are also driven by code. For example, Kubernetes deployments are declared in YAML, and in most cases they run immutable containers. While it is still possible to do things the old way and have “pets”, the ecosystem actively advocates against it and guides adopters towards immutability. If the advocated practices are adopted, a whole stack can be traced to checkpoints in code. What changed and why it changed is easy to parse because all of it is in code. Reverting is simply a matter of going back to the last known good checkpoint, something that previously required significant investment.
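The "checkpoints in code" idea can be illustrated with a small Python sketch. It only loosely mirrors the declarative model described above and is not how Kubernetes is implemented: `DesiredState`, `reconcile`, and `History` are hypothetical names, and the image tags are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesiredState:
    """Hypothetical declarative spec, loosely analogous to a manifest kept in version control."""

    image: str
    replicas: int


def reconcile(desired: DesiredState, running: list[str]) -> list[str]:
    """Return the instance list that matches the spec.

    Instances on the wrong image are replaced, never patched in place.
    """
    kept = [img for img in running if img == desired.image][: desired.replicas]
    return kept + [desired.image] * (desired.replicas - len(kept))


class History:
    """Append-only list of checkpoints, standing in for commits in a repository."""

    def __init__(self, initial: DesiredState) -> None:
        self._checkpoints = [initial]

    @property
    def current(self) -> DesiredState:
        return self._checkpoints[-1]

    def commit(self, state: DesiredState) -> None:
        self._checkpoints.append(state)

    def revert(self) -> DesiredState:
        """Re-commit the previous checkpoint so the revert itself is recorded."""
        if len(self._checkpoints) < 2:
            raise ValueError("no earlier checkpoint to revert to")
        self._checkpoints.append(self._checkpoints[-2])
        return self.current


if __name__ == "__main__":
    history = History(DesiredState("api:1.0", replicas=2))
    running = reconcile(history.current, [])
    history.commit(DesiredState("api:1.1", replicas=3))
    running = reconcile(history.current, running)
    history.revert()  # the new release misbehaves; go back one checkpoint
    running = reconcile(history.current, running)
    print(running)
```

In this sketch a rollback is just another checkpoint followed by the same reconcile step, so there is no separate "repair" procedure to write or test.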
At Cyral we have fully embraced this infrastructure-as-code paradigm. We believe that building our service for a cloud-native world means designing it as “cattle” that can be easily managed by modern teams. It means providing data security as code, because code should be the source of truth for any observer of our service. With Cyral, data security is as simple as a “docker pull” command.
-  http://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/
-  https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html
Image by Caroline Matthews via the OpenIDEO Cybersecurity Visuals Challenge under a Creative Commons Attribution 4.0 International License