Custom Ansible filters: Easy solution to difficult problems

I have recently been using Ansible to automate health checks for some of our software-defined network (SDN) infrastructure. One of the devices my code must query for health is a soft router running the SROS operating system. Ansible 2.2 recently introduced support for an sros_command module (Info here) that simplifies my task somewhat, but I’m still left to do screen-scraping of the command output.

Screen scraping is nasty work. Lots of string processing with split(), strip(), and other commands. The resulting code is heavily dependent on the exact format of the command output. If it changes, the code breaks.

I initially implemented the screen-scraping using Jinja2 code in my playbooks. That put some pretty ugly, complex code right in the playbook. I found a better answer: Create a custom filter or two. Now things are *so much cleaner* in the playbooks themselves, the format-dependent code is now separated from the main code, and Python made it so much easier to code.

The best part: Ansible filters are very easy to create. The Ansible docs aren’t very helpful, perhaps because creation is so simple they thought it didn’t need explanation! The best way to figure out how to create your own filters is to look at some existing filters as a pattern to follow. The simplest of these is in Ansible itself, json_query. Here’s a stripped and simplified version of that code for the purpose of illustration. This code implements two trivial filters, my_to_upper and my_to_lower:

from ansible.errors import AnsibleError


def my_to_upper(string):
    ''' Given a string, return an all-uppercase version of the string.
    '''
    if string is None:
        raise AnsibleError('String not found')
    return string.upper()


def my_to_lower(string):
    ''' Given a string, return an all-lowercase version of the string.
    '''
    if string is None:
        raise AnsibleError('String not found')
    return string.lower()

class FilterModule(object):
    ''' Query filter '''

    def filters(self):
        return {
            'my_to_upper': my_to_upper,
            'my_to_lower': my_to_lower,
        }

Developing this code is as simple as creating the FilterModule class, defining a filters method that maps each custom filter name to a function, and then providing that function for each filter. The example is trivial, but you can make the filter functions as complex as your application requires.

Note that I have included AnsibleError in the example for illustration purposes because it is an extremely useful way to get errors all the way to the console. If I were *really* implementing these filters, a missing or empty string wouldn’t be an error. I’d just return an empty string.
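As a minimal sketch of that more lenient behavior, the filter could return an empty string instead of raising. The name my_to_upper_lenient is hypothetical and is not part of the plugin shown above.

```python
def my_to_upper_lenient(string):
    ''' Return an all-uppercase version of the string, or '' if it is missing.
    '''
    # Hypothetical variant: None and '' both mean "nothing to convert",
    # so neither is treated as an error.
    if not string:
        return ''
    return str(string).upper()


class FilterModule(object):
    ''' Expose the lenient filter to Ansible '''

    def filters(self):
        return {
            'my_to_upper_lenient': my_to_upper_lenient,
        }
```

With this variant, a task that pipes an undefined-but-defaulted or empty value through the filter simply prints an empty message rather than failing the play.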

Here are a couple of simple examples of how to call the filters, followed by the resulting output:

- name: Create a mixed-case string
  shell: echo "A Mixed String"
  register: mixed_string
  delegate_to: localhost

- name: Print the UPPERCASE string
  debug: msg="{{ mixed_string.stdout|my_to_upper }}"

- name: Print the LOWERCASE string
  debug: msg="{{ mixed_string.stdout|my_to_lower }}"

<snip...>

TASK [my_task : Create a mixed-case string] *********************************
changed: [host.example.com -> localhost]

TASK [my_task : Print the UPPERCASE string] *********************************
ok: [host.example.com] => {
 "msg": "A MIXED STRING"
}

TASK [my_task : Print the LOWERCASE string] *********************************
ok: [host.example.com] => {
 "msg": "a mixed string"
}

In my case, instead of my_to_upper and my_to_lower, I created *command*_to_json filters that convert the SROS command output into JSON that is easily parsed in the playbook. This keeps my playbooks generic and isolates my filters as the place where the nasty code lives.
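To give a feel for what such a filter might look like, here is a rough sketch. The "Key : Value" layout, the sample text, and the filter name kv_output_to_json are all invented for illustration; they are not real SROS output and not my actual filters.

```python
import json


def kv_output_to_json(output):
    ''' Convert "Key : Value" lines of command output into a JSON string.
    '''
    # Assumption: each interesting line looks like "Key : Value".
    # Lines without a colon (banners, separators) are skipped.
    result = {}
    for line in output.splitlines():
        if ':' not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, value = line.split(':', 1)
        key = key.strip()
        if key:
            result[key] = value.strip()
    return json.dumps(result, sort_keys=True)


class FilterModule(object):
    ''' Expose the hypothetical conversion filter to Ansible '''

    def filters(self):
        return {
            'kv_output_to_json': kv_output_to_json,
        }


# Invented sample text standing in for real command output.
SAMPLE = """\
===============
System Name : router-1
Up Time     : 12 days
===============
"""

print(kv_output_to_json(SAMPLE))
```

All of the split() and strip() calls now live in one place. In a playbook, the string returned by the filter could then be piped through Ansible's from_json filter to get a dictionary.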

Verbose Output from git

Here’s a simple trick that provides more verbose text when using git:

GIT_CURL_VERBOSE=1 git clone https://github.com/repo/project.git

Setting GIT_CURL_VERBOSE=1 in front of the command is the key.

The extra output gave me the detail I needed to debug.

Before:

Cloning into 'project'...
fatal: unable to access 'https://github.com/repo/project.git/': server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

After:

Cloning into 'project'...
* Couldn't find host github.com in the .netrc file; using defaults
* Hostname was NOT found in DNS cache
* Trying 192.30.253.113...
* Connected to github.com (192.30.253.113) port 443 (#0)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
* Closing connection 0
fatal: unable to access 'https://github.com/repo/project.git/': server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

Interesting!
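If the clone is driven from a script rather than typed at a shell, the same variable can be set in the child process environment. This is only a sketch: the helper names verbose_git_env and clone_verbose are made up for this example, and it assumes git is on the PATH.

```python
import os
import subprocess


def verbose_git_env(base_env=None):
    ''' Return a copy of the environment with GIT_CURL_VERBOSE=1 set.
    '''
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env['GIT_CURL_VERBOSE'] = '1'
    return env


def clone_verbose(url, dest):
    ''' Run "git clone" with verbose curl output and return the result.
    '''
    # check=False: the caller inspects returncode and the captured
    # output instead of getting an exception on failure.
    return subprocess.run(
        ['git', 'clone', url, dest],
        env=verbose_git_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
```

Calling clone_verbose('https://github.com/repo/project.git', 'project') returns an object whose returncode, stdout, and stderr can be logged or searched for the certificate error.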

Public key for epel-release-7-8.noarch.rpm is not installed

I was trying to run some ansible playbooks on my CentOS 7 Linux machine. I hit a failure because the version of ansible on the machine (1.9.4.0) was less than the minimum version required by the playbooks (2.1.0.0). yum install was seeing 1.9.4.0 as the latest.

It turns out what I needed was to pull a version of ansible from the EPEL repo rather than the default repo. yum repolist showed that the EPEL repo was already available on the machine, so I followed the instructions I found on the Internet: yum install ansible-2.1.0.0

The package was found and downloaded. But before the installation completed it hit an error:

Public key for epel-release-7-8.noarch.rpm is not installed

There is a very simple fix for this, as documented here and in other places:

rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY*

I don’t know how the system got into this state. It is a lab machine that gets used for many different experiments. I was happy to have found a simple fix.

What if ansible_default_ipv4 is Empty?

A colleague was attempting to use ansible to install Kubernetes, but he hit an error that confused him:

TASK [etcd : Write etcd config file] *******************************************

task path: /root/k8s-20160803-vishpat-contrib-git/contrib/ansible/roles/etcd/tasks/main.yml:23

fatal: [k8s-master.vnslab.net]: FAILED! => {"changed": false, "failed": true,
 "invocation": {"module_args": {"dest": "/etc/etcd/etcd.conf", "src": "etcd.conf.j2"},
 "module_name": "template"}, "msg": "AnsibleUndefinedVariable:
 {{ etcd_peer_url_scheme }}://{{ etcd_machine_address }}:{{ etcd_peer_port }}:
 {{ hostvars[inventory_hostname]['ansible_' + etcd_interface].ipv4.address }}:
 {{ ansible_default_ipv4.interface }}: 'dict object' has no attribute 'interface'"}

I asked him for a copy of the output of the setup module (gather facts) for the host in question:

ansible all -i 'your_host_name,' -m setup

This portion of the output jumped out at me:

<snip>
       },
        "ansible_default_ipv4": {},
        "ansible_default_ipv6": {},
        "ansible_devices": {
</snip>

ansible_default_ipv4 was empty. This was the root cause of the problem. When ansible tries to deploy the etcd template from roles/etcd/templates/etcd.conf.j2 it hits the following lines and attempts to substitute values for the variables:

<snip>
{% for host in groups[etcd_peers_group] -%}
  {{ hostvars[host]['ansible_hostname'] }}={{ etcd_peer_url_scheme }}:
    //{{ hostvars[host]['ansible_' + etcd_interface].ipv4.address }}:
    {{ etcd_peer_port }}
  {%- if not loop.last -%},{%- endif -%}
{%- endfor -%}
</snip>

And the definition of etcd_interface depends on ansible_default_ipv4 being populated. From roles/etcd/defaults/main.yaml: 

<snip>
# Interface on which etcd listens.
# Useful on systems when default interface is not connected to other machines,
# for example as in Vagrant+VirtualBox configuration.
# Note that this variable can't be set in per-host manner with current implementation.
etcd_interface: "{{ ansible_default_ipv4.interface }}"
</snip>

The result: When ansible tries to deploy the etcd.conf.j2 template, it discovers that ansible_default_ipv4.interface doesn’t exist. It throws up its hands.

The fix: Set up a default route on the host in question. Instructions can be found here:

http://linux-ip.net/html/basic-changing.html#basic-changing-default

Once the change to the host was made, ansible_default_ipv4.interface was populated! Problem solved!
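The same condition can be checked ahead of time against saved setup output. The sketch below assumes the facts have already been loaded into a Python dictionary; the function name default_ipv4_interface and both sample dictionaries are invented for illustration.

```python
def default_ipv4_interface(facts):
    ''' Return the default IPv4 interface name from setup facts, or None.
    '''
    # "facts" is assumed to be the dictionary of facts reported by the
    # setup module. An empty or missing ansible_default_ipv4 yields None.
    default_ipv4 = facts.get('ansible_default_ipv4') or {}
    return default_ipv4.get('interface')


# Invented sample facts: one host with a default route, one without.
healthy = {'ansible_default_ipv4': {'interface': 'eth0', 'address': '10.0.0.5'}}
broken = {'ansible_default_ipv4': {}, 'ansible_default_ipv6': {}}

for name, facts in (('healthy', healthy), ('broken', broken)):
    iface = default_ipv4_interface(facts)
    if iface is None:
        print('%s: ansible_default_ipv4.interface is undefined' % name)
    else:
        print('%s: default interface is %s' % (name, iface))
```

A host that reports None here would hit the same 'dict object' has no attribute 'interface' failure when the template is rendered.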

Configuring Go CD for passwordless ssh clone from GitHub

I recently installed Go CD (https://www.go.cd/) to do some CI/CD proof of concept work. Go CD is a continuous delivery server based on pipelines of work to be accomplished. It integrates with several source control management (SCM) systems, including GitHub.

After the install, I was prompted by the Go CD GUI to create my first pipeline. Each pipeline has 3 main parts:

  • Basic Settings
  • Materials
  • Stage/Job

Basic settings are simple: the name of the pipeline and the pipeline group it belongs to. It was in setting up the Materials that I ran into trouble.

The Materials page requires 3 pieces of information:

  • Material Type (GitHub)
  • URL (SSH clone URL)
  • Branch (master by default)

When I clicked “Check Connection”, I received the following error:

--- ERROR ---
STDERR: Permission denied (publickey).
STDERR: fatal: Could not read from remote repository.
STDERR: 
STDERR: Please make sure you have the correct access rights
STDERR: and the repository exists.
---

The problem is that Go CD Server runs as the “go” user on my Go CD master and slaves. I needed to add the “go” user’s public ssh key for each server (master and all slaves) to GitHub as a Deploy Key.

On each server (master and slaves), I did the following:

  • Log in to the server using ssh
  • Change to the “go” user via ‘sudo su go’
  • Create the “go” user’s key pair via ‘ssh-keygen’. I went with the default path and file name (‘/var/go/.ssh/id_rsa’) and gave no passphrase.
  • Copy the contents of ‘/var/go/.ssh/id_rsa.pub’ to GitHub as a new Deploy Key.

On each server (master and slaves), I tested the connection by executing a command-line git clone. In each case, I was prompted to permanently add the GitHub server to my list of known hosts.

Once this was complete, “Check Connection” passed with “OK”.

 

launchd on Mac, part 2

In a previous post I wrote about my positive experience with a tool that automated the creation of launchd plist files for Mac, http://launched.zerowidth.com/. After several weeks of running, I found myself in a position of not knowing if the job was running as scheduled. A bit more searching yielded two interesting things.

First, I found that the plist syntax supports two keywords that allow me to capture the output of my script in files:

 <key>StandardErrorPath</key>
 <string>/Users/myname/backup.stderr.log</string>
 <key>StandardOutPath</key>
 <string>/Users/myname/backup.stdout.log</string>

I then changed the program arguments from this:

 <key>ProgramArguments</key>
 <array>
     <string>sh</string>
     <string>-c</string>
     <string>rsync -av -e ssh ~/Documents user@10.10.10.10:/home/user/backup/</string>
 </array>

to this:

 <key>ProgramArguments</key>
 <array>
     <string>sh</string>
     <string>-c</string>
     <string>date;rsync -av -e ssh ~/Documents user@10.10.10.10:/home/user/backup/</string>
 </array>

Now the stdout log file contains a timestamp for every time the task is run. I can simply open the file and scroll to the end when I feel uncertain about whether the task ran. 🙂

Second, I stumbled upon an excellent primer on launchd and the launchctl command that exercises it: http://nathangrigg.net/2012/07/schedule-jobs-using-launchd/
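For anyone who would rather generate the plist than edit the XML by hand, Python's standard plistlib module can produce the same keys. This is a sketch only: the Label value is a placeholder I made up, and a real job would also need its scheduling keys.

```python
import plistlib

# Job definition using the keys discussed above.
# 'com.example.backup' is a hypothetical label.
job = {
    'Label': 'com.example.backup',
    'ProgramArguments': [
        'sh',
        '-c',
        'date;rsync -av -e ssh ~/Documents user@10.10.10.10:/home/user/backup/',
    ],
    'StandardOutPath': '/Users/myname/backup.stdout.log',
    'StandardErrorPath': '/Users/myname/backup.stderr.log',
}

# dumps() returns the XML plist as bytes.
plist_bytes = plistlib.dumps(job)
print(plist_bytes.decode('utf-8'))
```

The printed XML can be saved as the job's .plist file, and plistlib.loads() reads it back into the same dictionary for a quick sanity check.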

 

Writing a WebApp with Go

The standard packages available in Go for creating a WebApp are net/http and html/template. There is an excellent step-by-step walk-through of creating a simple wiki using Go located here:

https://golang.org/doc/articles/wiki/

The golang folks have done us a great service. The wiki builds from a bare-bones web server to a primitive wiki, showing us how to effectively use redirection, templates, and a number of other concepts.

My familiarity with Go has increased as a result of this tutorial. Bravo!