Google Compute Engine with Ansible

Recently I was working on using Ansible with Google Compute Engine (GCE). This turned out to be a somewhat frustrating experience, so I thought I’d write down some notes on the issues I ran into.

The tutorial I was using was: https://googlecloudplatform.github.io/compute-video-demo-ansible/

The corresponding GitHub repository is here: https://github.com/GoogleCloudPlatform/compute-video-demo-ansible

I didn’t follow the tutorial exactly. I modified the Ansible configuration by hand, mainly simplifying it to use only one instance. I also typed out the code by hand and made changes as I went along. I believe this led to some of the problems I ran into (lack of understanding and misconceptions on my part).

Install apache-libcloud

Before proceeding, the apache-libcloud library needs to be installed. It isn’t installed when gcloud is installed. I can’t remember what it is used for.

My current Python installation is via Anaconda, with Python 2.7 as the default. I find Anaconda useful in that I avoid installing pip libraries as root, which tends to lead to problems and pollutes the base Python installation. I could use virtualenv, but I prefer not to have to switch into an environment each time. Anaconda has virtualenv support, so I can use that when needed. I just like to have a default installation.

pip install apache-libcloud==0.20.1
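A quick way to confirm that the library landed in the Python you think it did is to ask the interpreter directly. This is only a sketch, written for Python 3; the helper name module_version is my own, and it assumes the package exposes a __version__ attribute under the import name libcloud.

```python
import importlib
import sys


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # The interpreter path matters: pip may have installed into a different Python.
    print("interpreter: " + sys.executable)
    print("libcloud: " + str(module_version("libcloud")))
```

If the second line prints None, the interpreter shown on the first line is not the one pip installed into.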

Create a service account

The first thing that’s needed is a service account, which is basically a bot account and helps avoid having to share credentials. You can use the web console to do this, but to me the gcloud CLI seems preferable because it is easier to automate.

gcloud iam service-accounts create <account-name> --display-name "Account Display Name"

Source: https://cloud.google.com/iam/docs/creating-managing-service-accounts

Service account roles

Next we need to grant the service account a role. I picked the “editor” role. I’m not sure that’s the right one, but it made the most sense looking at the long list of roles. In the command below, compute-trial is my project ID.

gcloud projects add-iam-policy-binding compute-trial \
    --member serviceAccount:<account-name>@<project-name>.iam.gserviceaccount.com --role roles/editor

Source: https://cloud.google.com/iam/docs/granting-roles-to-service-accounts

Service account JSON key

Next we need to generate the service account key to authenticate the service account and allow it to automate various GCE management steps.

The key can be generated from the web console: https://console.cloud.google.com/apis/credentials?project=

But as always, using gcloud is preferable:

gcloud iam service-accounts keys create key.json --iam-account=<account-name>@<project-name>.iam.gserviceaccount.com

Source: https://cloud.google.com/sdk/gcloud/reference/iam/service-accounts/keys/create
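Since the key file is plain JSON, it is easy to sanity-check before pointing Ansible at it. The sketch below is written for Python 3 and assumes the usual layout of a service account key (type, project_id, private_key, client_email); the function names are my own, and it deliberately never prints the private key.

```python
import json

# Fields this sketch expects in a service account key file (an assumption
# based on the usual key layout, not an exhaustive list).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def summarize_key(key):
    """Validate a parsed key dict and return only its non-secret fields."""
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError("key is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("unexpected key type: " + str(key["type"]))
    return {"project_id": key["project_id"], "client_email": key["client_email"]}


def load_key(path):
    """Read a key file from disk and return its non-secret summary."""
    with open(path) as handle:
        return summarize_key(json.load(handle))
```

Calling load_key("key.json") should return the project ID and the service account email, which are the two values I needed later when filling in gce.ini.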

SSH connection

Here I had some misunderstandings about how SSH works.

gcloud allows you to connect to your instance using: gcloud compute ssh <instance-name>. This is nice in that you don’t have to type in the IP address to connect to the server. I believe it can also connect to instances in a private network.

You can read more about this command here: https://cloud.google.com/sdk/gcloud/reference/compute/ssh

Now if you want to connect to an instance using the normal ssh command, you can run gcloud compute config-ssh, which generates a configuration in ~/.ssh/config.

This works great. But then I wondered how Ansible, using the service account, could connect via SSH to the instances it created without this configuration. I haven’t completely figured this out. One part is obvious: the GCE module in Ansible is able to collect the IP addresses of the new instances. The open question is how the user and SSH key work.

In the tutorial I was using, the SSH key was referenced in /group_vars/all as ansible_ssh_private_key_file. I think this is where you put the SSH key generated by gcloud. But when is it generated? Is it pre-generated? Does it get regenerated? I am still unclear on this. On a clean run, I noticed that the SSH connection failed, but after running gcloud compute config-ssh it seemed to work. I don’t know whether that was actually the cause.

Dynamic inventory

Related to the SSH connection problem is the inventory file. I did not realize that Ansible supports dynamically generated inventories; I’ve never had to use one before. This is what gce.py is for: it is a dynamic inventory script for GCE. This makes sense, since we won’t know the IP addresses of the new instances ahead of time.

One thing that needs to be done is to make sure gce.py is executable. This signals to Ansible to run the file and use its output as the inventory.
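To make the idea concrete, here is a toy inventory script that follows the same contract: Ansible runs the executable with --list and reads JSON from standard output. This is not how gce.py is implemented; it is a minimal Python 3 sketch with a hard-coded, hypothetical instance (myinstance1 at a documentation-only IP address), and the render helper is my own name.

```python
#!/usr/bin/env python
import json
import sys

# Hypothetical instance data. A real script such as gce.py would query the
# cloud provider's API here instead of using a hard-coded dictionary.
INSTANCES = {"myinstance1": "203.0.113.10"}


def build_inventory(instances):
    """Build the JSON structure Ansible expects from --list."""
    return {
        "gce_instances": {"hosts": sorted(instances)},
        # _meta.hostvars lets Ansible skip a separate --host call per host.
        "_meta": {
            "hostvars": {
                name: {"ansible_host": address}
                for name, address in instances.items()
            }
        },
    }


def render(argv):
    """Return the JSON text for the given command-line arguments."""
    inventory = build_inventory(INSTANCES)
    if argv == ["--list"]:
        return json.dumps(inventory)
    if len(argv) == 2 and argv[0] == "--host":
        return json.dumps(inventory["_meta"]["hostvars"].get(argv[1], {}))
    return json.dumps({})


if __name__ == "__main__":
    print(render(sys.argv[1:]))
```

Saved under a name of your choosing and marked executable with chmod +x, a script like this can be passed to -i in the same way as gce.py.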

I initially had a lot of problems getting gce.py to work. It kept asking me for authentication and I didn’t know why. It turned out I had filled in the gce.ini file incorrectly: I mistook it for a Python file and put quotes around the values. Once I removed the quotes, the authentication worked, but then I ran into a certificate problem.
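The quoting mistake is easy to reproduce with Python’s own INI parser, which keeps quotes as part of the value instead of stripping them. The sketch below uses Python 3’s configparser (on Python 2.7 the module is called ConfigParser); the [gce] section and gce_project_id key are my assumption of what the file looks like, and my-project is a made-up value.

```python
import configparser


def read_project_id(ini_text):
    """Parse INI text and return the raw gce_project_id value."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("gce", "gce_project_id")


QUOTED = '[gce]\ngce_project_id = "my-project"\n'
UNQUOTED = "[gce]\ngce_project_id = my-project\n"

if __name__ == "__main__":
    # repr() makes the stray quote characters visible in the output.
    print(repr(read_project_id(QUOTED)))
    print(repr(read_project_id(UNQUOTED)))
```

With the quoted version, the value handed to the authentication code includes the literal quote characters, which would explain why the credentials never matched.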

I’d run this test command:

GCE_INI_PATH=./gce.ini ansible all -i gce.py -m setup

And get this error:

RuntimeError: No CA Certificates were found in CA_CERTS_PATH. For information on how to get required certificate files, please visit https://libcloud.readthedocs.org/en/latest/other/ssl-certificate-validation.html

So I installed certifi via pip, but this didn’t work, and I’m still not sure why. It turned out I had to download the cert file manually. I downloaded the certs from http://curl.haxx.se/docs/caextract.html

Then I set SSL_CERT_FILE to the path of the cacert.pem file downloaded from there.

SSL_CERT_FILE=./cacert.pem GCE_INI_PATH=./gce.ini ansible all -i gce.py -m setup

Source: https://groups.google.com/forum/#!topic/ansible-project/WTef9t1TyA0
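Before rerunning Ansible, it can save a round trip to check that the downloaded file is actually a usable CA bundle. This is a small Python 3 sketch using only the standard library; load_ca_bundle is my own helper name, and it says nothing about how libcloud itself looks up certificates.

```python
import os
import ssl


def load_ca_bundle(path):
    """Return an SSLContext that trusts the CA certificates in the file at path.

    Raises OSError if the file does not exist. The ssl module raises its own
    error if the file exists but contains no valid PEM certificates.
    """
    if not os.path.isfile(path):
        raise OSError("CA bundle not found: " + path)
    return ssl.create_default_context(cafile=path)
```

Calling load_ca_bundle("./cacert.pem") from the directory holding the download either returns a context or fails with an error that points at the file, which is easier to read than the RuntimeError above.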

No hosts found error

In gce-instances.yml, those steps are run locally and not remotely, so I needed to add connection: local. You can also set this in the hosts file (as ansible_connection=local).

---
# compute-video-demo-ansible
- name: Create Compute Engine instances
  hosts: local
  connection: local

ImportError: No module named utils.display

I got the tutorial to run, but then a few days later my new laptop arrived, so I decided to manually set up the tutorial again. This time I ran into this error.

It took me a while, but I ended up solving it by setting the correct Python interpreter, since I’m using the Anaconda install.

I added this to my hosts file: ansible_python_interpreter=/Users/<username>/anaconda/bin/python

Modified Ansible hosts file:

[local]
127.0.0.1 ansible_python_interpreter=/Users/<username>/anaconda/bin/python

[gce_instances]
myinstance[1:4]

Via: https://github.com/jlund/streisand/issues/629#issuecomment-296398774