Scalable and Understandable Provisioning with Ansible and Vagrant
— Initially published on 15 Oct 2013
Losing or migrating a server and having to rebuild its configuration by hand is anything but a fun job.
You may have heard of the “infrastructure as code” mantra of the DevOps culture and tried popular tools like Chef or Puppet, only to find them incredibly complex even for simple cases.
Not everyone is managing 500 servers in the cloud after all.
This post will walk you through Ansible, a powerful yet simple configuration management tool that I am particularly fond of, and not just because it took me 5 minutes to understand it.
We will also see how it fits well with Vagrant, a tool that makes playing with virtual machines as simple as editing a small configuration file and then forgetting about GUIs, ISO files and all the funk.
Ansible is an automation and orchestration tool written in Python. It works through SSH connections and does not require installing agents on hosts.
By the way, if you don’t know about password-less SSH connections using public keys, it may be a good idea to learn about them.
Configuration specifications are written in YAML documents called playbooks, providing tasks and event handlers. An example task would be to update the configuration files of a database server, and an event handler would restart the database upon completion of the task.
Ansible is push-based
Ansible is commonly used in a push-style architecture:
The control host is typically the machine from which you initiate Ansible runs. In more elaborate scenarios it can be a cron job.
Ansible takes advantage of a hosts inventory file. It contains a list of machine addresses arranged by groups. In the previous example we have 2 groups: database servers and web servers. The inventory file would simply list the IP addresses and/or host names for each one.
Of course there are cases where the inventory is dynamic by nature: cloud computing environments, elastic groups of hosts, etc. Ansible provides support for that, too, but we won’t discuss it here.
Each playbook addresses one or several groups from the inventory. In the previous example we would have a playbook to configure and orchestrate database servers, and another one for web servers. The definition of playbooks can be factored out, too, so you could have a third playbook with the common parts, serving as an include for the database and web server playbooks.
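As a rough illustration of factoring out the common parts, the sketch below assumes two inventory groups named web and db and a hypothetical common.yml file that holds the shared list of tasks. None of these names come from the setup described here. It relies on the import_tasks action of recent Ansible releases; older releases used a plain include for the same purpose.

```yaml
# site.yml: one play per inventory group, both reusing common.yml (hypothetical file).
- name: Configure database servers
  hosts: db                # assumed group name
  tasks:
    - name: Apply the shared base configuration
      ansible.builtin.import_tasks: common.yml
    # database-specific tasks would follow here

- name: Configure web servers
  hosts: web               # assumed group name
  tasks:
    - name: Apply the shared base configuration
      ansible.builtin.import_tasks: common.yml
    # web-specific tasks would follow here
```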
Ansible can be pull-based, too
It is not always desirable to have a push-based infrastructure. There are many valid reasons such as network constraints preventing easy SSH connections or scalability / automation issues.
In such situations Ansible remains your next best friend. I will not cover the details here, but the ansible-pull command can be used as follows:
- each host has Ansible installed,
- the configuration is stored in a Git repository,
- ansible-pull checks out the configuration repository at a given branch or tag (hint: think …),
- ansible-pull executes a specified playbook,
- you automate the process using a cron job, and then all you have to do is push the configuration changes to the Git repository.
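The steps above could be wired together with a playbook like the following sketch, which installs a cron entry that calls ansible-pull. The repository URL, the production branch, the 30-minute schedule and the local.yml playbook name are all placeholders, not values from this post.

```yaml
# Sketch: schedule ansible-pull through cron on every host (all values are placeholders).
- name: Set up pull-based configuration
  hosts: all
  become: true
  tasks:
    - name: Run ansible-pull every 30 minutes
      ansible.builtin.cron:
        name: ansible-pull
        minute: "*/30"
        # -U: Git repository URL, -C: branch or tag to check out,
        # last argument: the playbook to execute from the checkout
        job: "ansible-pull -U https://example.com/config.git -C production local.yml"
```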
Sounds good? Let us now get back to Ansible in push-mode.
Ansible comes with several command-line tools. The first one is simply called…
The purpose of the ansible tool is mainly to execute a command over selected groups of an inventory.
Now is a good time to define an inventory file. If you know how to write Windows-style INI
files then you have basically won:
This would define 2 groups of hosts, including db. Note that a single host can belong to more
than one group, too.
Now for testing purposes let’s just consider the case of a single host with the content of a
We can now run the ansible tool to execute a command (ls -lsa /usr) over the
Not bad, not bad.
The takeaway is simply that ansible can execute commands on many hosts over SSH and report back to
Playbooks and the
While you could probably use ansible as a general-purpose command execution tool over SSH, it does
not scale to deal with automation and configuration management. This is where playbooks enter the
Playbooks serve both as a way to express commands to be executed, and as abstractions over the tasks
to be done. While you can execute commands such as ls -lsa, you can also take advantage of
higher-level actions to, say, require the nginx service to be running.
Ansible has a fairly large set of modules that can be used to construct powerful playbooks. I encourage you to look at the list.
What’s in a playbook
Because a sample is always better than anything else, here is the playbook.yml file that I am
using for a server that I manage internally:
While it is quite easy to understand, there are a few points worth detailing.
The hosts key does what you would expect: it specifies which groups the playbook is aimed at.
Many actions need to be run through sudo; in that case you simply add a
This is the case of upgrading the system through apt (or any other similar package management
Note that running sudo may require typing a password, which is a sure way of blocking Ansible
forever. A simple fix is to run visudo on the target host, and make sure that the user Ansible
will use to log in does not have to type a password:
username ALL=(ALL) NOPASSWD: ALL
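A minimal sketch of a privileged task, assuming a Debian-like target and a recent Ansible release. Playbooks from the time of this post spelled privilege escalation as sudo: yes; current releases use the become keyword instead, which is what the sketch shows. The web group name is an assumption.

```yaml
- name: Keep the system up to date
  hosts: web                  # assumed group name
  tasks:
    - name: Upgrade all packages through apt
      ansible.builtin.apt:
        update_cache: true    # refresh the package index first
        upgrade: dist
      become: true            # run this task through sudo on the target
```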
An action can define a notify attribute to fire an event once it is done. The case of the Nginx
server configuration is a good one.
First, we copy a local file from the relative path files/niginx-default to the host at path
/etc/nginx/sites-available/default. Once this is done, the restart nginx notification is fired:
The notification handler can then restart the Nginx service:
Note the enabled attribute: it ensures that the service is started by the system init scripts.
The details of how to do that are managed by the service module, which knows how to do so on each
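To make the notify and handler mechanics concrete, here is a sketch that uses the paths mentioned above. The group name, the task names and the module spellings are assumptions based on a recent Ansible release, not a copy of the playbook this post describes.

```yaml
- name: Configure Nginx
  hosts: web                          # assumed group name
  become: true
  tasks:
    - name: Install the default site configuration
      ansible.builtin.copy:
        src: files/niginx-default     # relative path on the control host
        dest: /etc/nginx/sites-available/default
      notify: restart nginx           # must match the handler name below

  handlers:
    - name: restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
        enabled: true                 # also start the service at boot
```

The handler only runs when the copy task reports a change, and at most once per play no matter how many tasks notify it.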
A good example for loops is the installation of packages:
This simply repeats the action for many items, and eventually fires a restart nginx notification.
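A sketch of such a loop, assuming a Debian-like host; the package names are only illustrative. Recent Ansible releases spell loops with the loop keyword, while older playbooks used with_items to the same effect.

```yaml
- name: Install packages
  hosts: web                  # assumed group name
  become: true
  tasks:
    - name: Install each package in turn
      ansible.builtin.apt:
        name: "{{ item }}"    # replaced by one list entry per iteration
        state: present
      loop:
        - nginx
        - git
        - ufw
      notify: restart nginx

  handlers:
    - name: restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```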
Touching file contents
There are many ways to edit files instead of copying them from your control host. One of these is
No matter what the rest of the sshd_config file is, this ensures that a line contains the
instruction to disable root logins over SSH.
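One way to express this is the lineinfile module. The sketch below assumes that module and the usual OpenSSH configuration path; the regular expression is chosen for illustration.

```yaml
- name: Harden SSH
  hosts: all
  become: true
  tasks:
    - name: Disable root logins over SSH
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        # Replace an existing (possibly commented-out) directive,
        # or append the line at the end of the file if nothing matches.
        regexp: '^#?\s*PermitRootLogin\b'
        line: PermitRootLogin no
```

The SSH daemon only picks up the change after a reload, which a handler could take care of.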
Not everything is captured by an Ansible module. While you can develop your own actions, you may simply issue shell commands, too.
The following example manipulates the ufw firewall to ensure that only ports 22 and 80 are open
from the hostile Internet:
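A sketch of what such shell-based tasks can look like, assuming ufw is already installed on the target. The exact commands are an illustration and not necessarily the ones from the original playbook.

```yaml
- name: Configure the firewall
  hosts: all
  become: true
  tasks:
    - name: Only let SSH and HTTP through
      # The commands run one after the other in a single shell on the target.
      ansible.builtin.shell: |
        ufw default deny incoming
        ufw allow 22/tcp
        ufw allow 80/tcp
        ufw --force enable
```

The --force flag skips the interactive confirmation that ufw enable would otherwise ask for, which would block an unattended run.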
Using console output
We can register the console output of an action:
This runs the whoami command to know what login is being used on the target host. This makes a
playbook flexible enough to, say, update permissions without having a hard-coded login:
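The following sketch shows the idea with a hypothetical variable called login and an illustrative /var/www directory. The whoami task deliberately runs without privilege escalation, so that it reports the SSH login rather than root.

```yaml
- name: Fix ownership without hard-coding a login
  hosts: web                          # assumed group name
  tasks:
    - name: Ask the target host which login is in use
      ansible.builtin.command: whoami
      register: login                 # hypothetical variable name
      changed_when: false             # a read-only command never changes the host

    - name: Hand the web root over to that user
      ansible.builtin.file:
        path: /var/www                # illustrative directory
        state: directory
        owner: "{{ login.stdout }}"   # console output captured above
        recurse: true
      become: true
```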
Running a playbook
By now you should have a good understanding of what a playbook is like.
Running a playbook is equally simple:
The great thing with Ansible playbooks is that they are mostly idempotent, so you can run them as often as you want.
Indeed, most modules compare the desired state with the actual state of the host, and Ansible won’t perform an action again if nothing has changed between 2 runs.
Most Ansible-provided modules behave this way, but always keep in mind that not
everything can be idempotent. Running shell commands is a good example. We did that in the previous
playbook to update the firewall configuration. If we wanted to avoid redoing it on each playbook
run, we would need to write some kind of ufw action that checks the existing firewall
configuration before changing it.
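Since this post was written, a ufw module has become available in the community.general collection, which is one way to get that idempotent behaviour. The sketch assumes the collection is installed on the control host (for instance with ansible-galaxy collection install community.general).

```yaml
- name: Configure the firewall idempotently
  hosts: all
  become: true
  tasks:
    - name: Allow SSH and HTTP
      community.general.ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop:
        - "22"
        - "80"

    - name: Deny other incoming traffic and enable the firewall
      community.general.ufw:
        state: enabled
        default: deny
        direction: incoming
```

Run twice in a row, the second run should report no change, since the module inspects the existing rules before touching them.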
Testing with Vagrant
There is little chance that you will get a correct configuration out of the box.
Like any good software project, the only solution is to test, test and test.
Getting a server up and running is costly, so your best solution is to try in a virtual machine. Oracle VirtualBox is a well-known open-source solution. The problem is that booting an ISO image and starting the installation from scratch is tedious, boring and time-consuming.
Vagrant to the rescue
Vagrant is the real deal. It is a command-line tool to manage virtual machines from simple configuration files. You can start a virtual machine from a single command, log in through SSH, stop it and trash it at will. Vagrant is your best friend when you want to test the configuration of one or several servers using virtual machines.
Vagrant is most often used in combination with VirtualBox, but it can run other virtualization engines too. Getting Vagrant up and running is easy:
- install Oracle VirtualBox, then
- install Vagrant.
Voilà, you’re done!
Vagrant configuration files are very simple. They use a Ruby DSL to describe the configuration, including:
- how many machines to run,
- what base images to use for each machine (Ubuntu, Fedora, FreeBSD, your own, etc),
- what network configuration shall be used,
- how much CPU / memory you want,
- which local folder shall be synchronized with a folder in the virtual machine, etc.
For the Ansible configuration above, my testing Vagrantfile looks as follows:
This single-machine configuration is quite simple. It boots an Ubuntu-based box. There are many more
community-contributed boxes, and you can create your own. Next, the machine is put on a
private network with IP 192.168.100.10. We also forward port 80 of the machine to port 8080 on
the host machine.
Playing with Vagrant
The configuration file above is simple, and so is running the VM:
- vagrant up starts the machine, possibly downloading and caching the box image,
- vagrant ssh logs you into the VM,
- vagrant halt stops the VM,
- vagrant suspend… suspends the VM,
- vagrant destroy trashes the VM.
This is quite handy: a simple Vagrantfile is all you need, and Vagrant takes care of preparing the
VMs for you.
Ansible and Vagrant integration
Vagrant supports different types of provisioning methods, including shell scripts, Puppet and your new best friend Ansible.
Configuration is as easy as adding the following to your
I suggest having a specific inventory file that matches the IP addresses of your Vagrant configuration.
Once this is done, Vagrant calls Ansible to provision the VM. There are a few extra commands that are useful while working on your Ansible setup with Vagrant:
- vagrant reload, and
- vagrant provision to force calling Ansible without a reboot.
When you are confident with your Ansible configuration, I suggest a vagrant destroy followed by a
vagrant up, just to retry your automated configuration from scratch.
Automatic configuration of machines is quite easy with Ansible. Knowing that you can configure a whole set of machines or just a single one with a reproducible process is priceless.
Ansible is very approachable. While primarily push-based, it can also work in a pull fashion with little friction.
Good programmers test, and Vagrant makes it so easy to play with virtual machines that you have no excuse for not fine-crafting Ansible configuration with it.
We only scratched the surface of what you can do with Ansible, yet the simplicity of the tool should be convincing.
On a final note, Ansible is useful for more than servers. It knows how to deal with package managers such as MacPorts and Homebrew, so you could also use it for managing desktops. We successfully used Ansible as part of our research experiments to provision Raspberry Pi devices from a generic Raspbian image.