Deploying Web stacks DRY-ly with Ansible, Part 1: Infrastructure

I've been working on a new site for the last several months. It runs great locally, but when I started thinking about putting it on a live server, I ran into a series of problems. I was hoping that I could simply upgrade my web server from PHP 5.6 to PHP 7, then deploy my Drupal 8 site in place. Sounds simple, right? Unfortunately, PHP 7 introduced compatibility-breaking changes which caused problems for my old, Drupal 7-powered site. If I wanted to install both PHP 5.6 and PHP 7 on the same server, I would need a complex FastCGI setup that I simply wasn't willing to invest in given my limited project time. That really only left me with one choice.

Build a new server.

Keeping servers DRY

For most DIYers, building a new server isn't that big of a deal. If you're not self-hosting, you'll probably go to your favorite hosting provider and provision a new base image. Unless you're using a Platform-as-a-Service (PaaS), you'll probably get little more than a Linux OS and some basic utilities out of the box. Your first inclination is to log in via SSH and start working your way through installing your web stack of choice. If you only need to do this once in a while, that's not a problem. The time invested isn't a big deal once you have it up and running. The problem with manually configured servers is that they're like a house of cards. They keep standing because you got them to stand that one time, and nothing has caused them to collapse.

Yet.

In programming, there's a popular aphorism called the DRY Principle, or Don't Repeat Yourself. If you write a bit of code to perform a particular function, you should reuse that same code every time you need that function. The problem with servers is that it hasn't traditionally been easy to adhere to DRY. Automation software to rapidly and consistently install software, configure settings, and audit compliance was proprietary and very, very expensive. In recent years, open source tools like Puppet and Chef appeared and removed the price barrier. Instead of a complex (but boardroom-friendly) UI to configure servers, these tools put everything into code. This had the key advantage that you could start treating server configuration as a programming problem. You could commit configuration changes to a git repository, maintaining a running record of what was changed and by whom. Whenever you need to stand up a new server, you need only change a few variables and enter the deploy command.

Waterproofing tools

For my new server, I chose to deploy to Linode using Ansible. Linode is a Linux-centric Infrastructure-as-a-Service provider. I've used Linode for years for my website. While they are more web-hosting centric than, say, AWS, that has the advantage of making the administration interface simpler. Ansible is a server configuration management utility maintained by Red Hat. I started using Ansible a year ago, and I've become a huge fan of it for a variety of reasons:

Ansible is free and Open Source.
Configuration is written in YAML, just like Drupal 8.
Ansible is written in and can be extended with Python.
No background process or agent is required on the systems Ansible manages.
There's no central puppetmaster server; whoever runs Ansible is the controller.
All of Ansible's official modules are included out of the box and do not need to be installed separately.

Since I'm also a big fan of Continuous Integration (CI), I didn't settle for only running Ansible from my laptop. Instead, I wanted to be able to push my code to a remote repository, and then have CI provision the server for me. For that, I used my existing CI setup:

A self-hosted instance of Gitlab as the repository.
Gitlab CI as the task runner to execute builds over SSH.
Ansible is run in local mode on the target server, rather than from the Gitlab server.
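As a rough sketch of what "local mode" can look like (this is an assumption about one way to do it, not necessarily the exact setup used here), a play can set connection: local so that its tasks run on the machine that invokes ansible-playbook instead of going over SSH. The playbook below is hypothetical and exists only to illustrate the keyword:

```yaml
---
# Hypothetical playbook for illustration; not part of the repository described here.
- hosts: all
  # Run tasks directly on the machine that invokes ansible-playbook,
  # instead of opening an SSH connection to the inventory host.
  connection: local
  tasks:
    - name: Confirm Ansible can execute modules on this machine
      ping:
```

The same behavior can be requested for a single run with ansible-playbook's --connection=local option, which avoids hard-coding it in the play.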

While my instance of Gitlab and Gitlab CI is self-hosted, you can also get a free account on Gitlab.com to do the same thing (I may do that in the future to save costs). If you're wondering how I set that up, I touched upon the process in my Drupalcon Baltimore talk, "Avoid Deep Hurting!".

Building a beachhead

To start, I created a new "infrastructure" repository in my Gitlab server and then cloned it locally. While I have Ansible and CI stuff in my website repository, I decided to keep the server provisioning code separate for several reasons. I can assign different user access rights to the repositories in Gitlab, so that only trusted team members can change the server infrastructure. The separation also makes each repository more focused; the infrastructure repository installs and configures software, while the site repository houses the site and the site deployment code. The last, and most important, advantage of separating your infra and site repos is speed. In CI, you don't want builds to take too long. If I make a minor change to my website, I don't want it to try to upgrade Apache at the same time!

The first problem I encountered was deciding whether I wanted to provision the server instance on Linode with my CI. In theory, I could use the Linode API to select the service tier, base software, geographic region, and root password. Ansible does have a Linode module explicitly for this purpose. It was a thrilling idea, but an intimidating one. If I did something wrong or my CI failed, it could start provisioning instance after instance until I killed it. The module documentation wasn't entirely clear on whether the name parameter is used to uniquely identify server instances within your account. I wanted to maintain just a bit more control over this process, so I provisioned the server instance manually.

Once the new server instance was built and running, I logged in manually via SSH and did two things:

Install Ansible.
Add public SSH keys so I could log in without using a password.

That's it. Those were the only things I did manually on the server, but they were enough to get the rest working. All we wanted to do was create a beachhead for our CI to latch on to. From this point forward, we never want to log in to this server directly unless it's to troubleshoot a problem. All of our permanent installations and configurations need to be done through the repository. I created a new runner on my Gitlab server to connect via SSH to the new server instance with sudo authority. Now, I could write all the software installation and configuration tasks in Ansible, push them to the repository, and have CI configure the server.

In my infrastructure repository, I created the following directory structure:

/path/to/my/infra-repo
├── main.yml
├── .gitlab-ci.yml
├── group_vars
│   └── myserver.yml
└── inventories
    └── myserver

The primary playbook, main.yml, is in the root of the repository. Right now there's nothing in it, since we've only just started filling things out. The same is true for the variables file, group_vars/myserver.yml. The inventory file, inventories/myserver, specifies the sole target of this entire repository:

[myserver]
example.com

The only other file in the repository is the Gitlab CI file, .gitlab-ci.yml. It's not terribly long either. All it does is instruct the CI to run the main.yml playbook whenever someone pushes to the master branch:

stages:
  - deploy
job_live_deploy:
  tags:
    - myserver
  stage: deploy
  script:
    - "ansible-playbook -i inventories/myserver main.yml"
  only:
    - master

Leveraging roles

At this point, we could start writing an Ansible playbook with all the commands necessary to install the components of our web stack. That would work, but there's a better, faster way. When someone wants to share Ansible code for others to use, they can package it as a role and then share it on Ansible Galaxy. Roles sound complex, but really they're just playbooks with some default values built in. Most of the roles on Ansible Galaxy are for -- you guessed it -- installing software. To use a role, all we have to do is specify the role's name on Galaxy in our playbook, main.yml:

---
- hosts: all
  roles:
    - geerlingguy.apache
    - geerlingguy.php
    - geerlingguy.mysql
    - geerlingguy.composer
    - geerlingguy.drush

Finding the right roles on Galaxy often takes longer than using them. Since I've used some of Jeff Geerling's Ansible stuff before, I decided to use what he had available. There's no requirement to stick to any one role author here; this is just my preference. The playbook above doesn't do more than specify the target (all) and which roles to apply. So how do we customize a role? Remember how roles are just playbooks with default variable values? Customizing how we use a role is simply a matter of overriding those variables. Earlier we had set up a group variables file, group_vars/myserver.yml. Now we can start to populate that file with our customizations.

---
# Apache
apache_listen_ip: "*"
apache_listen_port: 80
apache_listen_port_ssl: 443
apache_remove_default_vhost: true
apache_create_vhosts: true
apache_vhosts:
  - servername: "example.com"
    documentroot: "/var/www/html"
apache_mods_enabled:
  - rewrite.load
  - ssl.load
  - authz_groupfile.load

# PHP
php_enable_apc: false
php_memory_limit: "320M"
php_max_execution_time: "90"
php_upload_max_filesize: "30M"
php_packages:
  - php
  - php-cli
  - php-common
  - php-gd
  - php-mbstring
  - php-pdo
  - php-xml
  - php-memcached
  - php-opcache
  - php-sqlite3
  - php-phar
  - php-mysql

# MySQL
mysql_key_buffer_size: "256M"
mysql_max_allowed_packet: "64M"
mysql_table_open_cache: "256"

# Composer
composer_path: /usr/local/bin/composer
composer_keep_updated: true
composer_add_to_path: true

# Drush
drush_install_path: /usr/local/share/drush

Typically, you can find the most important customization options by visiting the role's page on Github. Jeff does an excellent job of providing documentation for many of the above roles. Barring that, you can always look at the role's code and find which variable you need to override. The above is a fairly boring LAMP server configuration. Most of the tweaks are to memory allocation and to which additional packages or modules to install.
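Group variables are not the only place to override a role's defaults. As a small sketch (the port value is hypothetical, chosen only for illustration), the same variable can be set next to the role in the playbook, which can be handy when an override applies to a single play:

```yaml
---
- hosts: all
  roles:
    # Expanded form of a role entry, so that variables can be attached to it.
    - role: geerlingguy.apache
      vars:
        # Hypothetical override for illustration; group_vars above uses 80.
        apache_listen_port: 8080
    # Roles without overrides can stay in the short form.
    - geerlingguy.php
```

Variables set this way take precedence over group_vars, so it's worth keeping each override in only one place to avoid surprises.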

If we were to commit and push our work so far, the CI would start up and try to run the playbook on our new target server. Unfortunately, it would fail shortly after starting because we forgot an important step. We need to install the roles. This is done by using the ansible-galaxy command on whatever system is running the playbook. We don't want to do that manually, so instead we add it to the Gitlab CI file, .gitlab-ci.yml:

stages:
  - deploy
job_live_deploy:
  tags:
    - myserver
  stage: deploy
  script:
    - "ansible-galaxy install -fr requirements.yml"
    - "ansible-playbook -i inventories/myserver main.yml"
  only:
    - master

Notice how this command runs before the ansible-playbook command. This ensures that our roles are always installed. To keep our Gitlab CI file short, we put all the roles to install in a separate file in the root of our repo, requirements.yml:

---
- src: geerlingguy.apache
- src: geerlingguy.php
- src: geerlingguy.mysql
- src: geerlingguy.composer
- src: geerlingguy.drush
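One thing to be aware of: when no version is given, ansible-galaxy installs the latest release of each role, and the -f flag forces a reinstall on every build. If a role update ever changes behavior unexpectedly, the requirements file can pin roles to a specific tag instead. A sketch of that, where the version numbers are placeholders (an assumption for illustration) and should be replaced with real tags from each role's release list:

```yaml
---
# The version values below are placeholders, not real release numbers.
- src: geerlingguy.apache
  version: "1.2.3"
- src: geerlingguy.php
  version: "1.2.3"
```

With pinned versions, upgrading a role becomes an explicit, reviewable commit rather than something that happens silently during a build.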

Now we can commit and push!
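Since every push to master now changes a live server, a cheap safeguard may be worth considering. The following is a sketch of one option, not something this setup uses: an extra stage that runs ansible-playbook with its --syntax-check option, which parses the playbook without executing any tasks. The test stage and the job name job_syntax_check are hypothetical:

```yaml
stages:
  - test
  - deploy

# Hypothetical job: parse the playbook without executing any tasks.
job_syntax_check:
  tags:
    - myserver
  stage: test
  script:
    # Roles must be present for the playbook to parse.
    - "ansible-galaxy install -fr requirements.yml"
    - "ansible-playbook -i inventories/myserver main.yml --syntax-check"
  only:
    - master

job_live_deploy:
  tags:
    - myserver
  stage: deploy
  script:
    - "ansible-galaxy install -fr requirements.yml"
    - "ansible-playbook -i inventories/myserver main.yml"
  only:
    - master
```

Because Gitlab CI runs stages in order, a failure in the test stage stops the pipeline before the deploy stage touches the server.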

Summary

When I started building my new server, I didn't expect it to turn out to be such an interesting experiment! Initially, I wanted to use this as an opportunity to explore how to create and configure greenfield server infrastructure in a completely tracked and git-backed fashion. And I think we've succeeded! We now have the basic playbook to set up a LAMP server using just a few bits of configuration, rather than hours of mucking about with the command line. Even better, it's integrated with CI so that we only need to push changes to the repo.

In the next part, we'll expand our playbook to include databases, users, logins, and other configurations that require secure credentials.