Scalable and Understandable Provisioning with Ansible and Vagrant
Introduction
Losing or migrating a server and having to rebuild its configuration by hand is anything but a fun
job.
You may have heard of the “infrastructure as code” mantra of the DevOps culture and
tried popular tools like Chef or Puppet, only to
find them incredibly complex even for simple cases.
Not everyone is managing 500 servers in the cloud after all.
This post will walk you through Ansible, a powerful yet simple configuration management tool that I
am particularly fond of, and not just because it took me 5 minutes to understand it.
We will also see how it fits well with Vagrant, a tool that makes playing with virtual machines as
simple as editing a small configuration file and then forgetting about GUIs, ISO files and all the
funk.
Overview
Ansible is an automation and orchestration tool written in Python. It works through SSH connections
and does not require installing agents on hosts.
By the way, if you don’t know about password-less SSH connections using public keys,
it may be a good idea to read up on them.
Configuration specifications are written in YAML documents called playbooks, providing tasks and
event handlers. An example task would be to update the configuration files of a database server, and
an event handler would restart the database upon completion of the task.
Ansible is push-based
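Ansible is commonly used in a push-style architecture: a control host connects over SSH to groups of
hosts, say database servers and web servers, and pushes the configuration to them.
The control host is typically your own machine, from which you initiate Ansible runs. In more
elaborate scenarios, runs can be triggered by a cronjob.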
Ansible takes advantage of a hosts inventory file. It contains a list of machine addresses
arranged by groups. In the previous example we have 2 groups: database servers and web servers. The
inventory file would simply list the IP addresses and/or host names for each one.
Of course there are cases where the inventory is dynamic by nature: cloud computing environments,
elastic groups of hosts, etc. Ansible provides support for that, too, but we won’t discuss it here.
Each playbook addresses one or several groups from the inventory. In the previous example we would
have a playbook to configure and orchestrate database servers, and another one for web servers. The
definition of playbooks can be factored out, too, so you could have a third playbook with the common
parts, serving as an include for the database and web server playbooks.
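As a rough sketch of that factoring, a web server playbook could pull in the shared parts before its own play. The file names (webservers.yml, common.yml) and the single task are hypothetical, and the import_playbook directive assumes a reasonably recent Ansible release:

```yaml
# webservers.yml (hypothetical file name)

# Run the plays of the shared playbook first (common.yml is assumed to exist)
- import_playbook: common.yml

# Then the web-specific configuration, aimed at the web group of the inventory
- hosts: web
  tasks:
    - name: Make sure Nginx is installed
      become: true
      apt:
        name: nginx
        state: present
```

The database playbook would start with the same import and then target its own group.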
Ansible can be pull-based, too
It is not always desirable to have a push-based infrastructure. There are many valid reasons for
that, such as network constraints that prevent easy SSH connections, or scalability and automation issues.
In such situations Ansible remains your best friend. I will not cover the details here,
but the ansible-pull command can be used as follows:
- each host has Ansible installed,
- the configuration is stored in a Git repository,
- ansible-pull checks out the configuration repository at a given branch or tag (hint: think prod, staging, etc),
- ansible-pull executes a specified playbook,
- you automate the process using a cronjob, and then all you have to do is push the configuration changes to a Git repository.
Sounds good? Let us now get back to Ansible in push-mode.
The ansible tool
Ansible comes with several command-line tools. The first one is simply called… ansible.
The purpose of the ansible tool is mainly to execute a command over selected groups of an
inventory.
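Now is a good time to define an inventory file. If you know how to write Windows-style .ini
files then you have basically won: each group name goes in square brackets, followed by the
addresses of its hosts, one per line.
A file with a [web] section and a [db] section would define 2 groups, web and db, over hosts.
Note that a single host can belong to more than one group, too.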
Now for testing purposes let’s just consider a hosts file that contains a single host, listed in a
group called main.
We can now run the ansible tool to execute a command (ls -lsa /usr) over the main group, along the
lines of ansible main -i hosts -a "ls -lsa /usr".
Not bad, not bad.
The takeaway is simply that ansible can execute commands on many hosts over SSH and report back to
you.
Playbooks and the ansible-playbook tool
While you could probably use ansible as a general-purpose command execution tool over SSH, that
approach does not scale to automation and configuration management. This is where playbooks enter the
fray.
Playbooks serve both as a way to express commands to be executed, and as abstractions over the tasks
to be done. While you can execute commands such as ls -lsa, you can also take advantage of
higher-level actions to, say, require the nginx service to be running.
Ansible has a fairly large set of modules that can
be used to construct powerful playbooks. I encourage you to look at the list.
What’s in a playbook
Because a sample is always better than anything else, here is the playbook.yml file that I am
using for a server that I manage internally:
While it is quite easy to understand, there are a few points worth detailing.
Hosts definition
The hosts key does what you would expect: it specifies which groups the playbook is aimed at.
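To make this concrete, here is a minimal sketch of a playbook aimed at a single group. It assumes an inventory group named main, as in the earlier ansible example, and uses the built-in ping module purely as a placeholder task. It is an illustration, not the playbook discussed in this post:

```yaml
# A playbook is a list of plays; each play targets one or several inventory groups
- hosts: main          # assumed group name from the inventory
  tasks:
    - name: Check that the hosts in the group are reachable
      ping:
```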
sudo-ing
Many actions need to be run through sudo; in that case you simply add a sudo attribute.
This is the case of upgrading the system through apt (or any other similar package management
tool):
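As a sketch of such a privileged task: recent Ansible releases spell the attribute become rather than sudo (older releases accepted sudo: yes), so the example below uses become. The group name main is assumed from the earlier inventory:

```yaml
- hosts: main
  tasks:
    - name: Upgrade the system through apt
      become: true          # run this task with elevated privileges (sudo by default)
      apt:
        update_cache: true  # refresh the package index first
        upgrade: dist       # then perform a dist-upgrade
```

The attribute can also be set once at the play level when most tasks need elevated privileges.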
Note that running sudo may require typing a password, which is a sure way of blocking Ansible
forever. A simple fix is to run visudo on the target host, and make sure that the user Ansible
will use to log in does not have to type a password:
username ALL=(ALL) NOPASSWD: ALL
Action handlers
An action can define a notify attribute to fire an event once it is done. The case of the Nginx
server configuration is a good one.
First, we copy a local file from the relative path files/niginx-default to the host at path
/etc/nginx/sites-available/default. Once this is done, the restart nginx notification is fired:
The notification handler can then restart the Nginx service:
Note the enabled attribute: it ensures that the service is run as part of the system init scripts.
The details of how to do that are managed by the service module, which knows how to do so on each
operating system.
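Put together, the task and its handler could look roughly like the sketch below. The paths and the restart nginx name come from the text above; the group name main and the play-level become setting are assumptions:

```yaml
- hosts: main
  become: true
  tasks:
    - name: Install the default Nginx site configuration
      copy:
        src: files/niginx-default                 # local path, spelled as in the text
        dest: /etc/nginx/sites-available/default
      notify: restart nginx                       # must match the handler name below

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
        enabled: true                             # also start Nginx at boot
```

A handler only runs when the notifying task reports a change, so Nginx is left alone when the file on the host is already up to date.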
Loops
A good example for loops is the installation of packages:
This simply repeats the action for many items, and eventually fires a restart nginx notification.
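A sketch of such a loop follows. The package list is made up for illustration, and the loop keyword assumes a recent Ansible release (older playbooks express the same thing with with_items):

```yaml
- hosts: main
  become: true
  tasks:
    - name: Install packages
      apt:
        name: "{{ item }}"   # replaced by each entry of the list below
        state: present
      loop:                  # hypothetical package list
        - nginx
        - git
      notify: restart nginx

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
```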
Touching file contents
There are many ways to edit files instead of copying them from your control host. One of these is
the lineinfile action:
No matter what the rest of the sshd_config file is, this ensures that a line contains the
instruction to disable root logins over SSH.
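A possible shape for that task is sketched below. The file location /etc/ssh/sshd_config and the regular expression are assumptions about a typical OpenSSH setup:

```yaml
- hosts: main
  become: true
  tasks:
    - name: Disable root logins over SSH
      lineinfile:
        path: /etc/ssh/sshd_config
        regexp: '^#?PermitRootLogin'   # replace an existing (possibly commented) directive
        line: 'PermitRootLogin no'     # or append this line if none matches
```

The SSH daemon still has to be restarted or reloaded for the change to take effect, which is a natural job for a handler.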
Shell actions
Not everything is captured by an Ansible module. While you can develop your own actions, you may
simply issue shell commands, too.
The following example manipulates the ufw firewall to ensure that only ports 22 and 80 are open
from the hostile Internet:
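One way to write this with plain shell commands is sketched here. The exact ufw invocations are my assumption of a reasonable rule set, not necessarily the ones used on the original server:

```yaml
- hosts: main
  become: true
  tasks:
    - name: Only let SSH and HTTP through the firewall
      shell: "{{ item }}"
      loop:
        - ufw default deny incoming
        - ufw allow 22/tcp
        - ufw allow 80/tcp
        - ufw --force enable   # --force skips the interactive confirmation
```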
Using console output
We can register the console output of an action:
This runs the whoami command to know what login is being used on the target host. This makes a
playbook flexible for, say, updating permissions without having a hard-coded login:
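Both steps could be sketched as follows. The variable name whoami_result and the directory /srv/www are hypothetical:

```yaml
- hosts: main
  tasks:
    - name: Find out which login is used on the target host
      command: whoami
      register: whoami_result   # hypothetical variable holding the command result
      changed_when: false       # a read-only command never changes the host

    - name: Give that user ownership of a directory
      become: true
      file:
        path: /srv/www          # hypothetical directory
        state: directory
        owner: "{{ whoami_result.stdout }}"
        recurse: true
```

Note that the whoami task runs without elevated privileges here; otherwise it would report root instead of the login user.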
Running a playbook
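By now you should have a good understanding of what a playbook is like.
Running a playbook is equally simple: pass the inventory file and the playbook to the
ansible-playbook tool, as in ansible-playbook -i hosts playbook.yml.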
The great thing with Ansible playbooks is that they are mostly idempotent, so you can run them as
often as you want.
Indeed, most modules compare the current state of the host with the state that the task asks for,
and Ansible won’t perform an action again if nothing has changed between 2 runs.
Most Ansible-provided modules work this way, but always keep in mind that not
everything can be idempotent. Running shell commands is a good example. We did that in the previous
playbook to update the firewall configuration. If we wanted to avoid redoing it on each playbook
run, we would need to write some kind of ufw action that checks the current firewall configuration
before touching it.
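Short of writing a dedicated action, a lighter workaround is to check the current state first and guard the command with a condition. The sketch below assumes that ufw status prints "Status: active" once the firewall is enabled; the variable name ufw_status is hypothetical:

```yaml
- hosts: main
  become: true
  tasks:
    - name: Read the current firewall status
      command: ufw status
      register: ufw_status
      changed_when: false   # reading the status does not change anything

    - name: Enable the firewall only if it is not active yet
      command: ufw --force enable
      when: "'Status: active' not in ufw_status.stdout"
```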
Testing with Vagrant
There is little chance that you will get a correct configuration out of the box.
As with any good software project, the only solution is to test, test and test.
Getting a server up and running is costly, so your best option is to try things in a virtual machine.
Oracle VirtualBox is a well-known open-source solution. The problem is
that booting an ISO image and starting the installation from scratch is tedious, boring and
time-consuming.
Vagrant to the rescue
Vagrant is the real deal. It is a command-line tool to manage virtual
machines from simple configuration files. You can start a virtual machine from a single command,
log in through SSH, stop it and trash it at will. Vagrant is your best friend when you want to test
a server configuration using virtual machines.
Vagrant is most often used in combination with VirtualBox, but it can run other virtualization
engines too. Getting Vagrant up and running is easy:
- install Oracle VirtualBox, then
- install Vagrant.
Voilà, you’re done!
Vagrant files
Vagrant configuration files are very simple. They use a Ruby DSL to describe the configuration,
including:
- how many machines to run,
- what base images to use for each machine (Ubuntu, Fedora, FreeBSD, your own, etc),
- what network configuration shall be used,
- how much CPU / memory you want,
- which local folder shall be synchronized with a folder in the virtual machine, etc.
For the Ansible configuration above, my testing Vagrantfile looks as follows:
This single-machine configuration is quite simple. It boots an Ubuntu-based box. There are many more
community-contributed boxes, and you can create your own. Next, the machine is put on a
private network with IP 192.168.100.10. We also forward connections from port 80 of the VM to
port 8080 on the host machine.
Playing with Vagrant
The configuration file above is simple, and so is running the VM:
- vagrant up starts the machine, possibly downloading and caching the box image,
- vagrant ssh logs you into the VM,
- vagrant halt stops the VM,
- vagrant suspend … suspends the VM,
- vagrant destroy trashes the VM.
This is quite handy: a simple Vagrantfile is all you need, and Vagrant takes care of preparing the
VMs for you.
Ansible and Vagrant integration
Vagrant supports different types of provisioning methods, including shell scripts, Puppet and
your new best friend Ansible.
Configuration is easy: declare an Ansible provisioner in your Vagrantfile and tell it which
playbook to run.
I suggest having a specific inventory file that matches the IP addresses of your Vagrant
configuration.
Once this is done, Vagrant calls Ansible to provision the VM. There are a few extra commands that
are useful while working on your Ansible setup with Vagrant:
- vagrant reload, and
- vagrant provision to force calling Ansible without a reboot.
When you are confident with your Ansible configuration, I suggest a vagrant destroy followed by a
vagrant up, just to retry your automated configuration from scratch.
Conclusion
Automatic configuration of machines is quite easy with Ansible. Knowing that you can configure a
whole set of machines or just a single one with a reproducible process is priceless.
Ansible is very approachable. While primarily push-based, it can also work in a pull fashion with
little friction.
Good programmers test, and Vagrant makes it so easy to play with virtual machines that you have no
excuse for not fine-crafting Ansible configuration with it.
We only scratched the surface of what you can do with Ansible, yet the simplicity of the tool should
be convincing.
On a final note, Ansible is useful for more than servers. It knows how to deal with package managers
such as MacPorts and Homebrew, so you could also use it for managing desktops. We successfully used
Ansible as part of our research experiments to provision Raspberry Pi devices from a generic Raspbian
image.