Digging into IaC parameters
You probably already know the two key aspects of IaC: (1) imperative vs declarative code and (2) immutable vs mutable infrastructure. Let's dig right into them.
I am aiming to give you a high-level refresher on these two aspects of IaC, with the help of quotes from experts.
You may get a new perspective for explaining to new engineers and leadership why you have architected your IaC files the way you have. After all, many of us still have to defend our choices well after making them. So let’s begin:
Imperative vs declarative configuration
The early days of IaC meant telling the computer exactly what you wanted it to do. In terms of infrastructure provisioning, the IaC provisioning tool followed imperative (also known as procedural) instructions. In this paradigm, the engineer specifies the exact steps.
The system does not deviate from the instructions even if they are inefficient. For example, imperative code may execute from a command-line interface with a Bash script. It would then drive a step-by-step configuration of the infrastructure.
This simplicity makes workflow audits easier, but imperative code doesn’t scale well because the results vary with the state of the infrastructure at the time the code runs. The code is also not designed to be reusable.
If you run an imperative script multiple times, you may begin to notice variability in the infrastructure that is provisioned. This is especially the case if an error occurs somewhere along the execution process of the script. (Sai Vennam, Developer Advocate at IBM, 2019)
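To make that variability concrete, here is a minimal sketch. It assumes a plain Python list as a stand-in for real infrastructure; the function and resource names are hypothetical and not taken from any real tool.

```python
# Hypothetical in-memory stand-in for real infrastructure (an assumption
# for illustration, not a real provisioning API).
def imperative_provision(infra: list) -> None:
    """Follow fixed steps without checking what already exists."""
    infra.append("web-server")  # step 1: create a web server
    infra.append("database")    # step 2: create a database


infra = []
imperative_provision(infra)
first_run = list(infra)

imperative_provision(infra)     # the same script, run a second time
second_run = list(infra)

print(first_run)   # ['web-server', 'database']
print(second_run)  # ['web-server', 'database', 'web-server', 'database']
```

The script does exactly what it was told on both runs, which is why the second run leaves duplicate resources behind.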
Declarative instructions, on the other hand, give consistent results because the code describes the end result. In your code file, you define the state you require, then leave it to the tool to work out how to achieve it. The system then works continuously until it reaches that state.
Declarative instructions are the preferred way to create and change infrastructure configurations in IaC provisioning. However, they may not be adequate for more complex infrastructure needs where the path needs to be as clear as the destination.
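A rough sketch of the declarative idea, assuming infrastructure can be modelled as a dictionary of resource counts. The `reconcile` function and the resource names are hypothetical; real tools are far more involved.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Move `actual` toward `desired` and return the actions that were needed."""
    actions = []
    for name, count in desired.items():
        current = actual.get(name, 0)
        if current < count:
            actions.append(f"create {count - current} x {name}")
        elif current > count:
            actions.append(f"remove {current - count} x {name}")
        actual[name] = count
    # Anything running that the desired state does not mention gets removed.
    for name in [n for n in actual if n not in desired]:
        actions.append(f"remove {actual[name]} x {name}")
        del actual[name]
    return actions


desired = {"web-server": 2, "database": 1}   # the end result we declare
actual = {"web-server": 1, "cache": 1}       # what happens to exist right now

first = reconcile(desired, actual)
second = reconcile(desired, actual)

print(first)   # ['create 1 x web-server', 'create 1 x database', 'remove 1 x cache']
print(second)  # []
```

The second call has nothing to do because the actual state already matches the declared one, so repeated runs give the same result.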
In high CPU workload situations, declarative code has the advantage. According to Martin Kleppmann, author of Designing Data-Intensive Applications, declarative languages are better suited for parallel execution. He wrote that recent gains in CPU performance have come from adding more cores rather than from higher clock speeds.
Imperative code is very hard to parallelize across multiple cores and multiple machines, because it specifies instructions that must be performed in a particular order. Declarative languages have a better chance of getting faster in parallel execution because they specify only the pattern of the results, not the algorithm that is used to determine the results. (Martin Kleppmann)
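As a loose illustration of that point applied to provisioning, the sketch below assumes the declared resources are independent of each other, so a tool is free to create them in any order or all at once. The `provision` function and resource names are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def provision(name: str) -> str:
    """Pretend to create one resource; a real tool would call a cloud API here."""
    return f"{name}: ready"


# Declarative spec: a set of wanted resources with no ordering implied.
desired = {"web-1", "web-2", "web-3", "db-1"}

# Because no order was specified, the work can be spread across workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = set(pool.map(provision, desired))

print(len(results))  # 4
```

An imperative script that lists the same four steps one after another gives the tool no such freedom, because the order is part of the instructions.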
In some situations, using tools that allow for both imperative and declarative instructions may be wise. Most of the earlier tools that were originally imperative-only now support both imperative and declarative modes.
Mutable vs immutable infrastructure
In a mutable approach to infrastructure, you change existing infrastructure in place, moving it from one state to another. In an immutable approach, you never change infrastructure once it is deployed. Instead, you create new infrastructure and retire the old infrastructure.
There is a benefit to this non-changeability, which we will get to in a moment. But first, let’s go through an example, which is an abridged version of Armon Dadgar’s explanation of mutable vs immutable infrastructure.
Say you have a server version 1 running Apache and your team decides to switch to NGINX for version 2. In a mutable environment, you’re essentially upgrading the existing infrastructure. You’d run the IaC config tool (e.g. Ansible) against the Apache server to execute the change to NGINX.
There is a benefit to this:
We already have this existing server. Maybe we have data that we've written locally and that our web server is consuming. When we update in place [with mutable configuration], we don't have to worry about moving the data around to other machines, creating a new machine, all of the infrastructures already exists. All we're gonna do is perform this upgrade. (Armon Dadgar, CTO at HashiCorp)
The issue with this is that things in the real world can go wrong, especially during an upgrade procedure. Mutable infrastructure faces the risk of messy version control, as we’ll now explore by continuing on with the Apache → NGINX example.
Let’s say we run this command to install NGINX: apt-get install nginx
but then the APT repo isn’t responsive, or the network is lagging excessively, or any of a myriad of other things goes wrong. The risk is that NGINX might not get installed properly, or at all. So the provisioning tool has run the version change, but the result is not what we expected.
This creates three inter-related issues:
You haven’t necessarily achieved the goal state of Version 2 having NGINX for the webserver
It’s difficult to test half-successful upgrades because there may be variability in performance
Upgrades done at scale mean varying degrees of change, depending on the real-world issue with the network, repo, etc.
In a large-scale web server configuration change, you may end up with:
most servers having NGINX installed and running 😄
some servers with no NGINX installed 😔 and
a few servers with NGINX installed but no webserver running 😕
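The three outcomes above can be sketched as a small simulation. The fleet size, the server names, and which servers hit a failure are all made up for illustration; nothing here calls a real package manager.

```python
from collections import Counter


def upgrade_in_place(server: dict, repo_ok: bool, service_starts: bool) -> None:
    """Mutable upgrade: modify an existing server one step at a time."""
    if not repo_ok:
        return                          # install step failed: nothing installed
    server["nginx_installed"] = True
    if not service_starts:
        return                          # installed, but the webserver never started
    server["nginx_running"] = True


def state(server: dict) -> str:
    if server["nginx_running"]:
        return "installed and running"
    if server["nginx_installed"]:
        return "installed, not running"
    return "not installed"


# Hypothetical fleet of 10 servers with hand-picked failures.
repo_failures = {"web-3", "web-7"}      # APT repo unreachable from these
start_failures = {"web-5"}              # service fails to start on this one

fleet = {f"web-{i}": {"nginx_installed": False, "nginx_running": False}
         for i in range(10)}
for name, server in fleet.items():
    upgrade_in_place(server, name not in repo_failures, name not in start_failures)

summary = Counter(state(s) for s in fleet.values())
print(summary["installed and running"])   # 7
print(summary["not installed"])           # 2
print(summary["installed, not running"])  # 1
```

Every server ran the same upgrade, yet the fleet ends up in three different states, none of which was recorded as a version anywhere.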
An immutable approach does away with the risk of half-upgraded virtual or physical servers. You would follow this process for the Apache → NGINX example:
1. Keep the Apache server running while you complete steps 2 to 4
2. Spin up a new virtual machine
3. Install and configure the NGINX webserver on it
4. Test that the new server performs as expected
5. Retire the Apache server
This may seem like a lot more work than simply upgrading the existing Apache server to NGINX, but it avoids the risk of inconsistent NGINX installs described above.
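The create-test-retire routine can be sketched roughly as follows. The `Server` class, the `replace_server` function, and the health check are hypothetical; a frozen dataclass is used only to mimic the idea that a server is never modified once built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)          # frozen: fields cannot be changed after creation
class Server:
    name: str
    webserver: str
    healthy: bool


def replace_server(live: Server, build_new, passes_tests) -> Server:
    """Return the server that should take traffic after the rollout attempt."""
    candidate = build_new()      # new machine with NGINX installed and configured
    if passes_tests(candidate):  # test the new server before switching
        return candidate         # the old server can now be retired
    return live                  # tests failed: the old server keeps serving


apache = Server("web-v1", "apache", healthy=True)

good = replace_server(apache,
                      lambda: Server("web-v2", "nginx", healthy=True),
                      lambda s: s.healthy)
bad = replace_server(apache,
                     lambda: Server("web-v2", "nginx", healthy=False),
                     lambda s: s.healthy)

print(good.webserver)  # nginx
print(bad.webserver)   # apache
```

When the new build fails its tests, the Apache server is still untouched and still live, which is the property the mutable upgrade could not guarantee.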
Coburn Watson is head of infrastructure & SRE at Pinterest. His take on the issue might sum up the whole argument for moving away from mutable infrastructure. In the 2018 book, Seeking SRE, he wrote:
“I think we’ve all worked at companies where we upgrade something [mutable], and it turns out it was bad and we spend the night firefighting it because we’re trying to get the site back up because the patch didn’t work. [Now] when we go to put new code into production, we never upgrade anything, we just push a new version alongside the current code base.”
Note that immutable infrastructure works best if the various application components run on separate servers. In particular, store data on a separate server, so that when you run a create-retire routine like the one above, you don’t risk precious user data in the process.
I hope you found this breakdown useful.