Recently I was working on using Ansible with Google Compute Engine (GCE). This turned out to be a somewhat frustrating experience, so I thought I’d write down some notes on the issues I ran into.
The tutorial I was using was: https://googlecloudplatform.github.io/compute-video-demo-ansible/
The corresponding Github repository is here: https://github.com/GoogleCloudPlatform/compute-video-demo-ansible
I didn’t follow the tutorial exactly. I made manual modifications to the Ansible configuration, mainly simplifying it to use only one instance. I also copied out the code by hand and made changes as I went along. I believe this led to some of the problems I ran into (lack of understanding / misconceptions on my part).
Install apache-libcloud
Before proceeding, the apache-libcloud library needs to be installed. It isn’t installed when gcloud is installed. I can’t remember exactly what it is used for.
My current Python installation is via Anaconda with Python 2.7 as the default. I find Anaconda useful in that I avoid installing pip libraries as root, which tends to lead to problems and pollutes the base Python installation. I could use virtualenv, but I prefer not to have to switch environments each time. Anaconda has virtualenv-style environment support, so I can use that when needed. I just like to have a default installation.
pip install apache-libcloud==0.20.1
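Since several Python installations can coexist on one machine (Anaconda plus the system Python, in my case), it helps to confirm that the interpreter you are actually using can see the library. This is a minimal sketch, assuming Python 3; the helper name is_importable is made up for this note.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.executable shows which Python installation is answering.
    print("interpreter:", sys.executable)
    print("libcloud importable:", is_importable("libcloud"))
```

Running it with each Python on the machine shows which one pip installed apache-libcloud into.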
Create a service account
The first thing that’s needed is a service account, which is basically a bot account and helps avoid having to share credentials. You can use the website control panel to create one, but to me the gcloud CLI seems preferable since it is easier to automate.
gcloud iam service-accounts create <account-name> --display-name "Account Display Name"
Source: https://cloud.google.com/iam/docs/creating-managing-service-accounts
Service account roles
Next we need to grant the service account a role. I picked the “editor” role. Not sure if that’s the right one, but it made the most sense looking at the long list of roles.
gcloud projects add-iam-policy-binding <project-name> \
  --member serviceAccount:<account-name>@<project-name>.iam.gserviceaccount.com --role roles/editor
source: https://cloud.google.com/iam/docs/granting-roles-to-service-accounts
Service account JSON key
Next we need to generate the service account key to authenticate the service account and allow it to automate various GCE management steps.
The key can be generated from the web console: https://console.cloud.google.com/apis/credentials?project=
But as always, using gcloud is preferable:
gcloud iam service-accounts keys create key.json --iam-account=<account-name>@<project-name>.iam.gserviceaccount.com
source: https://cloud.google.com/sdk/gcloud/reference/iam/service-accounts/keys/create
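Before pointing Ansible at the key, a quick sanity check of the file can save some debugging. The sketch below assumes Python 3 and that key.json contains the usual type, project_id, private_key and client_email fields; the function names are my own, not part of gcloud or Ansible.

```python
import json

# Fields assumed to be present in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def key_problems(key_data):
    """Return a list of human-readable problems found in a parsed key file."""
    problems = [
        "missing field: " + name
        for name in REQUIRED_FIELDS
        if not key_data.get(name)
    ]
    if key_data.get("type") and key_data["type"] != "service_account":
        problems.append("unexpected type: " + str(key_data["type"]))
    return problems


def check_key_file(path):
    """Load the JSON key at path and report any problems."""
    with open(path) as handle:
        return key_problems(json.load(handle))
```

Calling check_key_file("key.json") returns an empty list when nothing looks wrong.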
SSH connection
Here I had some misunderstandings about how SSH works on GCE.
gcloud allows you to connect to your instance using: gcloud compute ssh <instance-name>. This is nice in that you don’t have to type in the IP to connect to the server. I believe it can also connect to instances in a private network.
You can read more about this command here: https://cloud.google.com/sdk/gcloud/reference/compute/ssh
Now if you want to connect to an instance using the normal ssh command, you can use gcloud compute config-ssh and this will generate a configuration in .ssh/config.
This works great. But now I wondered how the instances created via Ansible and the service account could be reached via ssh without this configuration. I haven’t completely figured this out. One part is obvious: the GCE module in Ansible is able to collect the IP addresses of the new instances. The remaining question is how the user and ssh key work. In the tutorial I was using, the ssh key was referenced in /group_vars/all as ansible_ssh_private_key_file. Here you would put the ssh key generated by gcloud, I think. But when is this generated? Is it pre-generated? Does it get regenerated? I am still unclear on this. On a clean run, I noticed that it does fail on the ssh connection. After running gcloud compute config-ssh it seemed to work, but I don’t know whether that was really what fixed it.
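One way to narrow down the "when is the key generated" question is to check whether the key pair exists before and after running the gcloud commands. As far as I know, gcloud keeps its key pair at ~/.ssh/google_compute_engine with a .pub file next to it; treat that path as an assumption. A minimal Python 3 sketch, with a made-up helper name:

```python
import os


def gcloud_key_status(ssh_dir=os.path.expanduser("~/.ssh")):
    """Report whether the key pair gcloud is assumed to create exists."""
    # Assumed file name; adjust if your gcloud setup stores keys elsewhere.
    private_key = os.path.join(ssh_dir, "google_compute_engine")
    return {
        "private_key": private_key,
        "private_exists": os.path.isfile(private_key),
        "public_exists": os.path.isfile(private_key + ".pub"),
    }


if __name__ == "__main__":
    for name, value in gcloud_key_status().items():
        print(name, "=", value)
```

If the private key exists, its path is what I would try for ansible_ssh_private_key_file.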
Dynamic inventory
Related to the SSH connection problem is the inventory file. I did not realize that you could use dynamically generated inventory files in Ansible; I had never needed to before. This is what gce.py is for: it is a dynamic inventory script for GCE. This makes sense, since we won’t know the IP addresses of the new instances ahead of time.
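To make the idea concrete, here is a toy dynamic inventory script. It is only a sketch of the general contract (Ansible calls the script with --list or --host <name> and reads JSON from stdout), not how gce.py is actually written. The instance names and addresses are invented, and the ansible_host variable assumes a reasonably recent Ansible.

```python
import json
import sys

# Hypothetical data: a real script such as gce.py would query the GCE API here.
FAKE_INSTANCES = {
    "myinstance1": {"ansible_host": "203.0.113.10"},
    "myinstance2": {"ansible_host": "203.0.113.11"},
}


def build_inventory(instances):
    """Return the group/hostvars structure Ansible reads for --list."""
    return {
        "gce_instances": {"hosts": sorted(instances)},
        "_meta": {"hostvars": instances},
    }


def main(argv):
    """Return the JSON text for the requested inventory query."""
    if len(argv) == 2 and argv[1] == "--list":
        result = build_inventory(FAKE_INSTANCES)
    elif len(argv) == 3 and argv[1] == "--host":
        result = FAKE_INSTANCES.get(argv[2], {})
    else:
        result = {}
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    print(main(sys.argv))
```

Saved as an executable file, a script like this can be passed to ansible with -i in the same way as gce.py.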
One thing that needs to be done is to make sure gce.py is executable. This signals to Ansible to run the file and use its output as the inventory.
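The usual fix is chmod +x gce.py. The same check and fix can be sketched in Python 3; the helper names are my own, and the demo uses a temporary file rather than a real gce.py.

```python
import os
import stat
import tempfile

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def has_execute_bit(path):
    """Return True if any execute permission bit is set on the file."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def make_executable(path):
    """Set the execute bits for user, group and others."""
    os.chmod(path, os.stat(path).st_mode | EXEC_BITS)


if __name__ == "__main__":
    # Stand-in for gce.py so the demo does not touch real files.
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as handle:
        script = handle.name
    print("before:", has_execute_bit(script))
    make_executable(script)
    print("after:", has_execute_bit(script))
    os.remove(script)
```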
I initially had a lot of problems getting gce.py to work. It kept asking me for authentication and I didn’t know why. It turns out I had filled in the gce.ini file incorrectly: I mistook it for a Python file and added quotes around the values. Once I removed the quotes, the authentication worked, but then I ran into a certificate problem.
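The quoting mistake is easy to reproduce. INI parsers generally treat quotes as part of the value, so a quoted project ID or email address no longer matches the real one. The sketch below uses Python 3's configparser (Python 2 calls the module ConfigParser); the section and key names are only illustrative and the project ID is made up.

```python
import configparser

QUOTED = """
[gce]
gce_project_id = "my-project"
"""

UNQUOTED = """
[gce]
gce_project_id = my-project
"""


def read_project_id(ini_text):
    """Parse INI text and return the project id exactly as the parser sees it."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("gce", "gce_project_id")


if __name__ == "__main__":
    print(repr(read_project_id(QUOTED)))    # the quotes are kept: '"my-project"'
    print(repr(read_project_id(UNQUOTED)))  # 'my-project'
```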
I’d run this test command:
GCE_INI_PATH=./gce.ini ansible all -i gce.py -m setup
And get this error:
RuntimeError: No CA Certificates were found in CA_CERTS_PATH. For information on how to get required certificate files, please visit https://libcloud.readthedocs.org/en/latest/other/ssl-certificate-validation.html
So I installed certifi via pip, but this didn’t work. I’m still not sure why. It turned out I had to download the cert file manually. I downloaded the certs from http://curl.haxx.se/docs/caextract.html
Then I set SSL_CERT_FILE to the path of the cacert.pem file downloaded from there.
SSL_CERT_FILE=./cacert.pem GCE_INI_PATH=./gce.ini ansible all -i gce.py -m setup
source: https://groups.google.com/forum/#!topic/ansible-project/WTef9t1TyA0
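Since SSL_CERT_FILE in the command above is a relative path, it is easy to end up pointing at a file that does not exist from the current directory. This small Python 3 sketch checks that the variable is set and that the file looks like a PEM bundle; the function names are made up for this note.

```python
import os

PEM_MARKER = "-----BEGIN CERTIFICATE-----"


def count_certificates(path):
    """Count PEM certificate blocks in the file at path."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().count(PEM_MARKER)


def check_ca_bundle(environ=os.environ):
    """Return a short status message about the SSL_CERT_FILE setting."""
    path = environ.get("SSL_CERT_FILE")
    if not path:
        return "SSL_CERT_FILE is not set"
    if not os.path.isfile(path):
        return "SSL_CERT_FILE points to a missing file: " + path
    count = count_certificates(path)
    if count == 0:
        return "no PEM certificates found in " + path
    return "found %d certificates in %s" % (count, path)


if __name__ == "__main__":
    print(check_ca_bundle())
```

Run it with the same SSL_CERT_FILE=./cacert.pem prefix, from the same directory, as the ansible command.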
No hosts found error
In gce-instances.yml, those steps are run locally and not remotely, so I needed to add connection: local. You can also set this in the hosts file (as far as I know, the inventory variable for it is ansible_connection=local).
---
# compute-video-demo-ansible
- name: Create Compute Engine instances
  hosts: local
  connection: local
ImportError: No module named utils.display
I got the tutorial to run, but then a few days later my new laptop arrived, so I decided to manually set up the tutorial again. This time I ran into this error.
It took me a while but I ended up solving it by setting the correct python interpreter since I’m using the Anaconda install.
I added this to my hosts file: ansible_python_interpreter=/Users/<username>/anaconda/bin/python
Modified ansible hosts file:
[local]
127.0.0.1 ansible_python_interpreter=/Users/<username>/anaconda/bin/python
[gce_instances]
myinstance[1:4]
Via: https://github.com/jlund/streisand/issues/629#issuecomment-296398774
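To find the right path for ansible_python_interpreter, you can ask the Python that has Ansible and libcloud installed where it lives. A minimal Python 3 sketch; local_hosts_entry is a made-up helper, and it simply formats the same line shown in the hosts file above.

```python
import sys


def local_hosts_entry(interpreter=sys.executable, host="127.0.0.1"):
    """Format a hosts-file line that pins the Python interpreter for a host."""
    return "%s ansible_python_interpreter=%s" % (host, interpreter)


if __name__ == "__main__":
    # Run this with the Anaconda python so sys.executable is the path you want.
    print("[local]")
    print(local_hosts_entry())
```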