I have been setting up computers and configuring web servers for a long time now. I started my computing journey by building computers and setting up operating systems for others. Soon, I started configuring servers first using shared hosting and then dedicated servers. As virtualization became mainstream, I started configuring cloud instances to run websites. At a certain point, I was maintaining several projects (some seasonal), it became harder to remember how exactly I had configured a particular server when I needed to upgrade or set it up again. That is why I have been interested in Infrastructure as Code (IaC) for a long time.
You might say that it is easier to do this by just documenting the server details and all the configuration for each project. Sure, it is great if you can manage to keep the documentation updated as the software evolves and requirements change. Realistically, that doesn’t happen. Instead, if you start with the perspective that you are going to only configure servers with code, never manually, you are forced to code all the changes you want to make.
Infrastructure as Code
So, what does IaC look like? There are several tools out there and each has its own conventions. Broadly speaking, there are two types of code you would write for IaC: declarative or imperative. If you are a programmer, you are already familiar with the imperative style of programming. This is essentially the style of almost all programming languages out there. In these languages, you would write line-by-line instructions to tell the computer exactly what to do and how to do it. Consider this shell script used for creating an instance on DigitalOcean.
#!/usr/bin/env bash
read -p "Are you sure you want to create a new droplet? " -n 1 -r
if [[ $REPLY =~ ^[Yy]$ ]]
then
doctl compute droplet create --image ubuntu-20-04-x64 --size s-1vcpu-1gb --region tor1 ps5-stock-checker --context personal
echo "Waiting to create..."
sleep 5
doctl compute droplet list --context personal
fi
Here, we are running a sequential set of instructions to create a droplet and verify that it got created. We are also confirming this with the user before actually creating the droplet. This is a very simple example but you could expand it to create whatever style of infrastructure you need, albeit not easily.
The declarative style of programming
Most IaC tools support some form of declarative syntax which lets you define what infrastructure you need rather than how to create it. The above example in Terraform, for example, would look like this.
resource "digitalocean_droplet" "web" {
image = "ubuntu-20-04-x64"
name = "ps5-stock-checker"
region = "tor1"
size = "s-1vcpu-1gb"
}
As you can see, this example is easier to read. Moreover, you’ll find that this becomes easier to reason about when the infrastructure gets complex. My personal preference is to use Terraform but whatever you use would have a similar structure. It is the tools job to how exactly implement this infrastructure. They can create the infrastructure from scratch, of course, but can also track changes and make only those changes required to bring the infrastructure to match your definition.
Where is the simple in this?
You might think this is overkill and I can understand that sentiment. After all, I thought the same but I have found it useful for projects both large and small. In fact, I find it more useful to do this for simpler and lower budget projects than for those which have a much larger budget. At least as far as Drupal is concerned, projects with larger budgets use one of the PaaS providers. There are several providers such as Acquia, Pantheon, platform.sh, or others that do a great job at Drupal specific hosting. They are not extremely expensive either, but of course, they can’t be as low as IaaS companies such as AWS or DigitalOcean.
So, it may not be simple but we can get there. On the projects that I am going to self-host, I add in a directory called “infra” with Terraform modules and an Ansible playbook. To make it findable, I have put it up on Github at hussainweb/lamp-ansible-terraform. There’s no documentation, unfortunately, but I hope to put up something soon. Meanwhile, this blog can be an informal introduction to the repository.
My workflow
When I want to start a new project that I know won’t be on one of the PaaS providers, I copy the repository above into my project and start editing the config files I need. Broadly, the repository contains Terraform modules to provision a server (or two) to run Drupal and the actual configuration of the server happens through Ansible. As of right now, there are modules for AWS and Azure to provision the servers. The one for AWS supports setting up instances with security groups configured. You can optionally set up a separate instance for a Database server as well. You can find all the options that the module supports in the variables.tf file.
On the other hand, the module for Azure is simpler and only supports setting up a single server to run both web and database server. You can take a look at its variables.tf file to see what is exposed (TL;DR, just the instance size). I built these modules on a need basis and didn’t try to maintain feature parity.
Depending on what I want to use, I will initialize that Terraform module (terraform init) and provision. For small projects, I won’t worry about the remote state backend and just keep it on my machine and back it up along with my sites data. It’s a long process but it works and I haven’t needed to simplify that yet. At the end of this, I get the IP address(es) of the instance(s).
Sometimes, I need to set up different servers for a staging environment, for example. For this, I just provision another server in a different Terraform workspace. The module itself does not support multiple environments and does not need to.
Configuring the instance
Now that I have the IP address(es), I can set up Ansible. I edit the relevant inventory files (for dev or for production) and set up relevant variables in various yml files. Out of these, I absolutely have to change the app.yml file to set my project’s repository URL. I can optionally also change the PHP version, configure Redis, set up SSH keys (edit this one if you want to use the repo), etc. Once all this is done, I can run ansible-playbook to execute the playbook.
I realize this repo is hardly usable without documentation. So far, it’s just a bunch of scripts I have cobbled together to help me with some of my projects. In time, I do want to improve the documentation and add in more resources. This also intersects with my efforts in another direction to set up remote development instances (not for hosting, but for live development). It’s called Yakht (after yacht as I wanted an ocean related metaphor). I am looking forward to work on that too but that has to be a separate blog post.