Programming, Technology

Integrating Icinga2 with InfluxDB and Grafana

Typically when you are monitoring a platform for performance metrics you will inevitably end up considering things like Collectd or Diamond for collecting metrics and Graphite for their receipt, storage and visualisation.  That was state of the art 3 years ago, and times change rapidly in computing.  I’d like to take you on a journey of how we developed our current monitoring, alerting and visualisation platform.

The Problem With Ceph…

Ceph is the new starlet on the block for scale-out, fault-tolerant storage. We’ve been operating a petabyte scale cluster in production for well over 2 years now, and one of the things you will soon learn is that when a journal drive fails it’s a fairly big deal. All drives reliant on that journal disk are fairly quickly removed from the cluster, which results in objects replicating to replace the lost redundancy and being redistributed across the cluster to cater for the altered topology. This process, depending on how much data is in the cluster, can take days to complete and can unfortunately have a significant impact on client performance. Luckily we as an operator handle these situations for you in a way that minimises impact, typically as a result of being woken up at 3am!

The dream however is to predict that an SSD journal drive is going to fail and pro-actively replace it during core working hours, transparent to the client. Initially with one vendor’s devices, we noted that I/O wait times increased quite dramatically before the device failed completely, giving plenty of notice (in the order of days) that the device should be replaced. Obviously this has a knock-on effect on storage performance as writes to the cluster are synchronous to ensure redundancy, so not the best situation.

Eventually we switched to devices from another manufacturer, which last longer and offer better performance. The downside is that they no longer exhibit slow I/O before failing. They just go pop, cue a mad scramble to stop the cluster rebalancing and hastily replace the failed journal.

Can we do anything to predict failure with these devices? The answer is possibly. SMART monitoring of ATA devices allows the system to interrogate the device and pull off a number of metrics and performance counters that may offer clues as to impending failure. The existing monitoring plug-ins available with our operating system were only capable of working with directly attached devices, so monitoring ATA devices behind a SAS expander was impossible, and they only alert when the SMART firmware itself predicts a failure, which I have never seen in the field! This led to my authoring the check-scsi-smart plug-in, which allows the vast majority of devices in our platform to be monitored, every available counter to be exposed individually via performance data, and alerts to be raised per counter based on user-provided warning and critical thresholds.

Data Collection

A while ago I made the bold (most will say sensible) statement that Nagios/Icinga was no longer fit for purpose and needed replacing with a modern monitoring platform. My biggest gripes were the reliance on things like NRPE and NSCA. The former has, quite frankly, a broken and insecure transport. The latter has no transport security at all, so when it comes to throwing possibly sensitive monitoring metrics across the public internet in plain text, these solutions were pretty much untenable.

Luckily the good folks at Icinga had been slavishly working away at a ground-up replacement of the old Nagios-based code. Icinga2 is a breath of fresh air. All communications are secured via X.509 public key cryptography and can be initiated by either end point, so they work across a NAT boundary. Hosts can monitor themselves, thus distributing the load across the platform, and they can also raise notifications about themselves, so you are no longer reliant on a central monitoring server; check results are still propagated towards the root of the tree. Configuration is generated on a top-level master node and propagated to satellite zones and end hosts. The system is flexible so it need not work in this way, but I’ve arrived at this architecture as a best practice.

For me the real genius is how service checks are applied to hosts. Consider the following host definition:

object Host "ceph-osd-0.example.com" {
  import "satellite-host"

  address = "10.10.112.156"
  display_name = "ceph-osd-0.example.com"
  zone = "icinga2.example.com"

  vars.kernel = "Linux"
  vars.role = "ceph_osd"
  vars.architecture = "amd64"
  vars.productname = "X8DTT-H"
  vars.operatingsystem = "Ubuntu"
  vars.lsbdistcodename = "trusty"
  vars.enable_pagerduty = true
  vars.is_virtual = false

  vars.blockdevices["sda"] = {
     path = "/dev/sda"
  }
  vars.blockdevices["sdb"] = {
     path = "/dev/sdb"
  }
  vars.blockdevices["sdc"] = {
     path = "/dev/sdc"
  }
  vars.blockdevices["sdd"] = {
     path = "/dev/sdd"
  }
  vars.blockdevices["sde"] = {
     path = "/dev/sde"
  }
  vars.blockdevices["sdf"] = {
     path = "/dev/sdf"
  }
  vars.blockdevices["sdg"] = {
     path = "/dev/sdg"
  }

  vars.interfaces["eth0"] = {
     address = "10.10.112.156"
     cidr = "10.10.112.0/24"
     mac = "00:30:48:f6:de:fe"
  }
  vars.foreman_interfaces["p1p1"] = {
     address = "10.10.104.107"
     mac = "00:1b:21:76:86:d8"
     netmask = "255.255.255.0"
  }
  vars.interfaces["p1p2"] = {
     address = "10.10.96.129"
     cidr = "10.10.96.0/24"
     mac = "00:1b:21:76:86:d9"
  }

}

 

Importing satellite-host basically inherits a number of parameters from a template that describe how, and how often, to check that the host is alive. The zone parameter describes where this check will be performed from, e.g. the north-bound Icinga2 satellite. The vars data structure is a dictionary of key-value pairs and can be utterly arbitrary. In this example we define everything about the operating system, the architecture and machine type, and whether or not the machine is virtual. Because this is generated by the Puppet orchestration software, we can inspect even more parts of the system, e.g. block devices and network interfaces. The possibilities are endless.

object CheckCommand "smart" {
  import "plugin-check-command"
  command = [ "sudo", PluginDir + "/check_scsi_smart" ]
  arguments = {
     "-d" = "$smart_device$"
  }
}

 

The CheckCommand object defines an executable to perform a service check. Here we define the check as having to run with elevated privileges, along with its absolute path. You can also specify potential arguments: in this case, if the smart_device macro can be expanded (Icinga2 will look in host or service variables for a match), then the option will be generated on the command line along with the parameter. There is also provision to set the option alone, without a parameter, if need be.

apply Service "smart" for (blockdevice => attributes in host.vars.blockdevices) {
  import "generic-service"
  check_command = "smart"
  display_name = "smart " + blockdevice
  vars.smart_device = attributes.path
  zone = host.name
  assign where match("sd*", blockdevice)
}

 

The last piece of the jigsaw is the Service object. Here we are saying that for each blockdevice/attributes pair on each host, if the block device name begins with sd then apply the service check to it. This way you write the service check once and it is applied correctly to every SCSI disk on every host, with no host-specific hacks ever involved. Much like in the host definition, generic-service is a template that defines how often a check should be performed, and the zone which performs the check is the host itself. The check_command selects the check to perform, as defined above, and we set vars.smart_device to the device path of the block device, which is picked up by the macro expansion in the check command as discussed earlier.

Time Series Data Collection

With that all in place we now have a single-pane-of-glass view onto the current state of every SCSI device on every host. However, what we really need is to gather all of these snapshots into a database which allows us to plot the counters over time, derive trends that indicate potential disk failure and then set alerting thresholds accordingly.

We previously had Graphite’s Carbon aggregating statistics gathered via Collectd. However, with several hundred servers sending many tens of metrics a second, it wasn’t up to the task. Even with local SSD-backed storage the I/O queues were constantly full to capacity. We needed a better solution, and one which looked promising was InfluxDB. Although a fledgling product still in flux, it is built to perform many operations in memory, support clustering for horizontal scaling and be schema-less. To illustrate, take a look at the following example from my test environment.

load,domain=angel.net,fqdn=ns.angel.net,hostname=ns,service=load,metric=load15,type=value value=0.05 1460907584

 

The measurement load is in essence a big bucket that all metrics to do with load fit into. Arbitrary pieces of meta data can be associated with a data point; here we attach the domain, fqdn and hostname, which are useful for organising data based on physical location. The metric tag correlates with a performance data metric returned by a monitoring plug-in, and the type references the field within that performance data, in this case the actual value, but it may equally represent alerting thresholds or physical limits. The value records the actual data value, and the final field is the time stamp, here at one-second precision, although InfluxDB defaults to nanoseconds.
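For reference, a data point like this can be pushed over InfluxDB’s HTTP write endpoint. The sketch below does exactly that from Python; the host and database names are borrowed from the examples later in this post, the requests library is an assumption, and second precision is declared explicitly to match the time stamp above.

import requests

# The exact data point shown above, in InfluxDB line protocol.
POINT = ('load,domain=angel.net,fqdn=ns.angel.net,hostname=ns,'
         'service=load,metric=load15,type=value value=0.05 1460907584')

response = requests.post('http://influxdb.angel.net:8086/write',
                         params={'db': 'icinga2', 'precision': 's'},
                         data=POINT)
response.raise_for_status()  # InfluxDB answers 204 No Content on success
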

By arranging data like this you can ask questions such as: give me all metrics of type value from the last hour for hosts in a specific domain, grouping the data by host name. I for one find this a lot more intuitive than the existing methodologies bound up in Graphite. You can also query the meta data, asking questions like: for the load measurement, give me all possible values of hostname, which makes automatically generating dashboard fields a dream.
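To make that concrete, here is a rough sketch of those two questions expressed as InfluxQL and issued against the HTTP query endpoint from Python. Again the host and database names come from the examples in this post and the requests library is assumed; treat it as an illustration rather than a finished tool.

import requests

BASE_URL = 'http://influxdb.angel.net:8086/query'

# All 'value' fields from the last hour for hosts in one domain, grouped by host name.
METRICS_QUERY = ("SELECT \"value\" FROM \"load\" "
                 "WHERE \"type\" = 'value' AND \"domain\" = 'angel.net' "
                 "AND time > now() - 1h GROUP BY \"hostname\"")

# All host names that have ever reported the load measurement.
METADATA_QUERY = 'SHOW TAG VALUES FROM "load" WITH KEY = "hostname"'

for query in (METRICS_QUERY, METADATA_QUERY):
    response = requests.get(BASE_URL, params={'db': 'icinga2', 'q': query})
    response.raise_for_status()
    print(response.json())
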

The missing part of this puzzle is getting performance data from Icinga2 into InfluxDB, along with all the tags which make InfluxDB so powerful. Luckily I was able to spend a few days making this a reality; although at the time of writing it is still in review, it looks set to be a great addition to the ecosystem.

library "perfdata"

object InfluxdbWriter "influxdb" {
  host = "influxdb.angel.net"
  port = 8086
  database = "icinga2"
  ssl_enable = false
  ssl_ca_cert = "/var/lib/puppet/ssl/certs/ca.pem"
  ssl_cert = "/var/lib/puppet/ssl/certs/icinga.angel.net.pem"
  ssl_key = "/var/lib/puppet/ssl/private_keys/icinga.angel.net.pem"
  host_template = {
    measurement = "$host.check_command$"
    tags = {
      fqdn = "$host.name$"
      domain = "$host.vars.domain$"
      hostname = "$host.vars.hostname$"
    }
  }
  service_template = {
    measurement = "$service.check_command$"
    tags = {
      fqdn = "$host.name$"
      domain = "$host.vars.domain$"
      hostname = "$host.vars.hostname$"
      service = "$service.name$"
      fake = "$host.vars.nonexistant$"
    }
  }
}

 

Here’s the current state of play: it allows a connection to any port on any host, specification of the database to write to, and optional full SSL support. The powerful piece is in the host and service templates, which allow the measurement to be set (typically the check_command, e.g. ssh or smart) along with any tags that can be derived from the host or service objects; if a value doesn’t exist, the tag is simply not generated for that data point. Remember how we can associate all manner of meta data with a host? Well, all of that rich data is available here to be used as tags.

Presentation Layer

Putting it all together, we need to visualise this data, and we chose Grafana. Below is a demonstration of where we are today.

The dashboard is templated on the domain, which is extracted from InfluxDB meta data. We can then ask for all hosts within that domain, and finally all mount points on that host in that domain. This makes organising data simple, flexible and extremely powerful. Going back to my example on Ceph journals, I can now select the domain a faulty machine resides in, select the host and the disk that has failed, and then look at individual performance metrics over time to identify predictive failure indicators, which can then be fed back into the monitoring platform as alert thresholds. Luckily, I have so far been unable to test this theory as nothing has gone pop yet.

There you have it: from problem to a modern and powerful solution. I hope this inspires you to have a play with these emerging technologies, come up with innovative ways to monitor and analyse your estates, and predict failures or plan for capacity trends.

Update

Quite soon after this functionality was introduced we experienced an OSD journal failure. Now to put the theory to the test…

 

As the graphic depicts, for the failing drive certain counters start to increase from zero well before the drive actually fails. Importantly, these increase gradually over a period of several weeks before the drive fails for good. Crucially, we now have visibility of potential failures and can replace the drives at times which are less likely to cause customer impact, and at a healthier time of day. Failures can also be correlated with logical block addresses written, which now enables us to predict operating expenditure over the lifetime of the cluster.
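As a rough illustration of how this feeds back into alerting, the sketch below pulls one SMART counter for a single host out of InfluxDB and flags any movement away from zero. The measurement name smart follows from the check command defined earlier, but the metric name and host are hypothetical placeholders, and the JSON handling assumes the standard results/series layout returned by the query endpoint.

import requests

# Hypothetical counter name; substitute whichever perfdata label your drives expose.
QUERY = ("SELECT \"value\" FROM \"smart\" "
         "WHERE \"hostname\" = 'ceph-osd-0' "
         "AND \"metric\" = 'reallocated_sector_count' "
         "AND \"type\" = 'value' AND time > now() - 30d")

response = requests.get('http://influxdb.angel.net:8086/query',
                        params={'db': 'icinga2', 'q': QUERY})
response.raise_for_status()

series = response.json().get('results', [{}])[0].get('series', [])
points = series[0]['values'] if series else []

# Any sustained climb away from zero is the early warning we are after.
if points and points[-1][1] > 0:
    print('Counter rising: schedule a journal replacement during working hours')
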

Updated blog post 23 August 2016

Icinga 2.5 is now in the wild! See my updated blog post on integrating your own monitoring platform with InfluxDB and Grafana.

Programming, Technology

Ceph Monitoring with Telegraf, InfluxDB and Grafana

Ceph Input Plug-in

An improved Ceph input plug-in for Telegraf is at the core of how Data News Blog collects metrics to be graphed and analysed. You can follow the progress here as the code makes its way into the main release.  Eventually you too can enjoy it as much as we do.

Our transition to InfluxDB as our time-series database of choice motivated this work. Some previous posts go some way to showing why we love InfluxDB. The ability to tag measurements with context-specific data is the big win. It helps us to create simplified dashboards with less clutter which dynamically adapt.

The existing metrics were collected with the Ceph collector for Collectd and stored in Graphite.  Like-for-like functionality was not available for Telegraf, so we decided to contribute code that met our needs.  Setting up the Ceph input plug-in for Telegraf is intended to be simple.  For those familiar with Ceph, all you need to do is make a configuration available which can find the cluster.  You will also need a key which provides access to the cluster.

Configuration

The following shows a typical set up.

[[inputs.ceph]]
  interval = '1m'
  ceph_user = "client.admin"
  ceph_config = "/etc/ceph/ceph.conf"
  gather_cluster_stats = true

The interval setting is fairly relaxed. When the system is under heavy load, e.g. during recovery operations, measurement collection can take some time.  Instead of having the collection time out, we make sure that there is enough time for it to complete.  After all, the whole reason we want the measurements is to see what happens during these heavy operations; it is no good if we have no data.  This was also part of the motivation for the work, as the Collectd plug-in fell into exactly this trap.

The ceph_user setting specifies the user to attach to the cluster with.  It allows the collector to find the access key and can also optionally pick up additional settings from the configuration file.  The key for the default user, client.admin, is found automatically by the ceph command when run.  The key location can also be set in the configuration file for the user if necessary.

The ceph_config setting tells the plug-in where to find the settings for your ceph cluster.  Normally this will tell us where we can make contact with it and also how to authorise the user.  Finally the gather_cluster_stats option turns the collection of measurements on.

Measurements

So what does the plug-in measure?  It all comes down to running the ceph command.  People who have used this before should have an idea about what it can do.  For now the plug-in collects the cluster summary, pool use and pool statistics.

The cluster summary (ceph status) measures things like how many disks you have, if they are in the cluster and if they are running.  It also gives a summary of the amount of space used and available, how much data is being read and written and the number of operations being performed.  The final things measured are the states of placement groups so you can see how many objects are in a good state, and how many need to be fixed to bring the cluster back into a healthy state.

Pool use (ceph df) shows you the amount of storage available and used per pool.  It also shows you the number of objects stored in each pool.  These measurements are tagged with the pool name.  This is useful because pools may be located on specific groups of disks, for example hard drives or flash drives, so you can monitor and manage these as logically separate entities.

Pool statistics (ceph osd pool stats), much like the global statistics, show on a per-pool level the number of reads, writes and operations each pool is handling.  Again these are tagged with the pool name and can be used to manage hard drives and solid state drives independently even though they are part of the same cluster.
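For a feel of what the plug-in does under the hood, here is a rough Python sketch of the same collection approach: shell out to the ceph admin command, ask for JSON, and turn the result into tagged measurements. The real plug-in is written in Go inside Telegraf, the measurement names printed here are only illustrative, and the JSON field names vary between Ceph releases, hence the defensive lookups. The user and configuration path match the Telegraf settings above.

import json
import subprocess

def ceph_json(*args):
    """Run a ceph subcommand as client.admin and parse its JSON output."""
    cmd = ['ceph', '--conf', '/etc/ceph/ceph.conf', '--name', 'client.admin',
           '--format', 'json'] + list(args)
    return json.loads(subprocess.check_output(cmd).decode())

status = ceph_json('status')
df = ceph_json('df')

# Cluster summary: placement group counts and overall usage.
pgmap = status.get('pgmap', {})
print('ceph_pgmap', {'num_pgs': pgmap.get('num_pgs'),
                     'bytes_used': pgmap.get('bytes_used')})

# Per-pool usage, tagged with the pool name as described above.
for pool in df.get('pools', []):
    print('ceph_pool_usage', {'pool': pool.get('name')}, pool.get('stats', {}))
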

Show Me The Money

A brief look at what can be collected is all well and good; however, a real-life demonstration speaks a thousand words.

Here is a live demonstration of the plug-in running during an operation performed recently.  This was an operation that moved objects between servers so that we are now able to handle an entire rack failing.  This protects us against a switch failure and allows us to power off a rack to reorganise it.

Global Cluster Statistics

The top pane shows the overall cluster state.  The first graph on the left shows the state of all placement groups.  When the operation begins, groups that were clean become misplaced and must be moved to new locations.  From this we can make predictions about how long the maintenance will take and provide feedback to our customers.  You can also see a distinct change in the angle of the graph as the SSD storage completes.  Substantially quicker, I think you’ll agree!

To the right we can see the number of groups which are degraded, e.g. only have two object copies rather than the full three, and the number of misplaced objects.  The former is interesting in that it shows how many objects are at risk from a component failure, which would reduce the number of copies down to one.

Per-Pool Statistics

The lower pane is templated on the pool name, which is selected at the top of the page.  Here we are displaying (left to right, top to bottom) the number of client operations per second, the storage used and available, the amount of data read and written, and finally the number of objects recovering per second.

Here we can see that although the peak number of client operations is reduced, it hardly goes below the minimum seen before the operation started.  This is good news because it means we can handle the customer workload and recover without too much disruption.  Importantly, we are able to quantify the impact a similar operation is likely to have in the future.

Some other interesting uses would be to watch for operations, reads or writes ‘clipping’, which would mean you have reached the limits of the devices and need to add more.  If the pool is less concerned with performance and more with capacity, such as a cold storage pool, then the utilisation graph can be used to plan for the future and predict when you will need to expand.

Summing Up

We have demonstrated the upcoming improvements to the Ceph input plug-in for Telegraf, shown what can be collected with it and how this can improve your level of service by gleaning insight into the impact of maintenance on performance, and predicting future outcomes.

As always if you like it, please try it out, share your experiences and help us to improve the experience of running a Ceph cluster for the world as a whole.  The InfluxData community is very friendly in my experience so if you want to make improvements to this or other input plug-ins give it a go!

Update 31 August 2016

As of today the patch has hit the master branch so feel free to check out and build the latest Telegraf. Alternatively it will be released in the official 1.1 version.

Programming, Technology

Multiple Class Definitions With Puppet

One issue we’ve discovered with running puppet orchestration is bumping into classes being multiply defined. In our setup all hosts get a generic role which, among other things, contains a definition of the Foreman puppet class to manage the configuration file (agent stanza) on all hosts. The problem comes when you include the puppet master role, which also pulls in the puppet class.

With hindsight the two roles should have been separated out so that all hosts include puppet::agent and the master(s) include puppet::master. But we:

  1. are severely time constrained, being a start-up
  2. wish to leverage updates and improvements automatically from the community

As such we just roll with the provided API and have to deal with the fallout. The first port of call for a newbie is to try to ignore the definition with some conditional code:

class profile::puppet {
  if !defined('profile::puppetmaster') {
    class { '::puppet': }
  }
}

or

class profile::puppet {
  if !defined(Class['profile::puppetmaster']) {
    class { '::puppet': }
  }
}

 

Either of these will land you with the same problem of this puppet definition clashing with the one defined in profile::puppetmaster. It’s a common mistake, but one that can be remedied. Oddly enough, the second example did somehow work in our production environment, but upon playing about with the pattern to understand its inner workings I just could not recreate it! This led to the development of the following. Can’t keep a good academic down, even when in the role of sysadmin!

Allowing Multiple Class Definitions In Multiple Locations

Now here is how my compiler-background head works. The previous examples rely on the entire manifest being parsed before the defined function can be evaluated, at which point you’re already too late. If, however, the conditional could be evaluated at parse time, and it resolves to false, then why bother parsing the code block at all?

class profile::puppet {
  if $::fqdn != $::puppetmaster {
    class { '::puppet': }
  }
}

Here we are comparing facts, which are available before every run, and can be evaluated at parse time ($::puppetmaster is provided by foreman). The code works exactly as you’d expect every time regardless of ordering.

Obviously this may not be the official Puppet methodology and is more than likely dependent on the underlying implementation of the parsing and execution engine. It does, however, provide a quick get-out-of-jail-free option for when the resources are unavailable to do the job properly.

Programming, Technology

Getting started with OpenStack’s Heat

Introduction

OpenStack’s Heat is the project’s infrastructure orchestration component, and can simplify deploying your project on a cloud platform in a way that’s repeatable and easy to understand.  If you’ve come from the Amazon AWS world then it’s analogous to CloudFormation, and indeed it provides compatibility with this service making migration from AWS to OpenStack a little less painful.  However, Heat has its own templating format and that’s what we’ll walk through today.

This post is a quick tutorial on getting started with your first Heat template, and will deploy a pair of webservers together with a loadbalancer as an example.

Heat Templates

Let’s jump right in and take a look at the contents of the template that we’ll use to deploy our infrastructure. These template files are typically formatted in YAML and comprise three main sections:

  • Parameters
  • Resources
  • Outputs

 

heat_template_version: 2014-10-16

description: Demo template to deploy a pair of webservers and a loadbalancer

parameters:
  key_name:
    type: string
    description: Name of SSH keypair to be used for compute instance
  flavor:
    type: string
    default: dc1.1x1.20
    constraints:
      - allowed_values:
        - dc1.1x1.20
        - dc1.1x2.20
    description: Must be a valid Data News Blog Compute Cloud flavour
  image:
    type: string
    default: 6c3047c6-17b1-4aaf-a657-9229bb481e50
    description: Image ID
  networks:
    type: string
    description: Network IDs for which the instances should have an interface attached
    default: f77c6fdb-72ad-402f-9f1b-6bf974c3ff77
  subnet:
    type: string
    description: ID for the subnet in which we want to create our loadbalancer
    default: a8d1edfe-ac8c-49b0-a5c2-c72fa61decd2
  user_data:
    type: string
    default: |
      #cloud-config
      packages:
        - nginx
  name:
    type: string
    description: Name of instances
    default: webserver

resources:
  webserver0:
    type: OS::Nova::Server
    properties:
      key_name: { get_param: key_name }
      flavor: { get_param: flavor }
      image: { get_param: image }
      networks: [{ network: { get_param: networks } }]
      user_data: { get_param: user_data }
      user_data_format: RAW

  webserver1:
    type: OS::Nova::Server
    properties:
      key_name: { get_param: key_name }
      flavor: { get_param: flavor }
      image: { get_param: image }
      networks: [{ network: { get_param: networks } }]
      user_data: { get_param: user_data }
      user_data_format: RAW

  lb_pool:
    type: OS::Neutron::Pool
    properties:
      protocol: HTTP
      subnet_id: { get_param: subnet }
      lb_method: ROUND_ROBIN
      vip:
        protocol_port: 80

  lb_members:
    type: OS::Neutron::LoadBalancer
    properties:
      pool_id: { get_resource: lb_pool }
      members: [ { get_resource: webserver0 }, { get_resource: webserver1 } ]
      protocol_port: 80

outputs:
  vip_ip:
    description: IP of VIP
    value: { get_attr: [ lb_pool, vip, address ] }

 

The first line – heat_template_version: 2014-10-16 – specifies the version of Heat’s templating language we’ll be using, with an expectation that within this template we could be defining resources available up to and including the Juno release.

The first actual section – parameters – lets us pass in various options as we create our Heat ‘stack’. Most of these are self-explanatory but give our template some flexibility should we need to do some customisation. When we provision our Heat stack there are a few options we’ll have to specify, such as the SSH key name we’ll expect to use with our instances, and various values we can override, such as the network we want to attach to, the subnet in which to create our loadbalancer, and so on. Where applicable there are some sensible defaults in there – in this example the IDs for network and for subnet are taken from my own demonstration project.

The next section – resources – is where most of the provisioning magic actually happens. Here we define our two webservers as well as a loadbalancer. Each webserver is of a particular type – OS::Nova::Server – and has various properties passed to it, all of which are retrieved via the get_param intrinsic function. The lb_pool and lb_members resources are similarly created, members in the latter being a list of our webserver resources.

Finally, the outputs section in our example uses another intrinsic function – get_attr – which returns a value from a particular object or resource. In our case this is the IP address of our load-balancer.

Putting it all together

Now that we have our template, we can look at using the heat command-line client to create our stack. Its usage is very straightforward; assuming we’ve saved the above template to a file called heatdemo.yaml, to create our stack all we have to do is the following:

$ heat stack-create Webservers --template-file heatdemo.yaml -P key_name=deadline \
    -P flavor='dc1.1x1.20' -P name=webserver
+--------------------------------------+------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------+--------------------+----------------------+
| 433026fc-b543-4104-902f-d335e1ea189d | Webservers | CREATE_IN_PROGRESS | 2015-04-16T15:26:52Z |
+--------------------------------------+------------+--------------------+----------------------+

The stack-create subcommand of the heat command takes various options, such as the template file we’d like to use. We can also inject parameters on the command line at this point; in my example I’m specifying the SSH key name I wish to use, as well as the size (flavor) of instance and a name for each machine that’s created. We can check on the stack’s progress as it’s created by looking in Horizon or again using the heat command:

$ heat stack-show Webservers | grep -i status
| stack_status | CREATE_COMPLETE |
| stack_status_reason | Stack CREATE completed successfully |

Looks good so far – let’s take a look in Horizon. Under Project -> Orchestration -> Stacks we see our newly-created ‘Webservers’ stack. Clicking on that gives us a visual representation of its topology:

Clicking on ‘Overview’ summarises the various details for us, and in the ‘Outputs’ section we can see the IP of the VIP that was configured as part of the stack’s creation. Let’s test that everything’s working as it should from another host on the same network:

$ nova list | grep -i webserv
| efab9c99-ddc1-4cee-abfb-c3756233418e | Webservers-webserver0-ano27iof4iem | ACTIVE | - | Running | private=192.168.2.34 |
| d3eee79d-7ed4-4b27-8512-16cf201f82f3 | Webservers-webserver1-yiaeqoaxrcq5 | ACTIVE | - | Running | private=192.168.2.33 |
$ neutron lb-vip-list
+--------------------------------------+-------------+--------------+----------+----------------+--------+
| id | name | address | protocol | admin_state_up | status |
+--------------------------------------+-------------+--------------+----------+----------------+--------+
| d943c34b-8299-46ad-88e5-6f7d9d26b769 | lb_pool.vip | 192.168.2.32 | HTTP | True | ACTIVE |
+--------------------------------------+-------------+--------------+----------+----------------+--------+
$ ping -c 1 192.168.2.32
PING 192.168.2.32 (192.168.2.32) 56(84) bytes of data.
64 bytes from 192.168.2.32: icmp_seq=1 ttl=63 time=1.34 ms

--- 192.168.2.32 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.348/1.348/1.348/0.000 ms
$ nc -v -w 1 192.168.2.32 80
Connection to 192.168.2.32 80 port [tcp/http] succeeded!
$ curl -s 192.168.2.32:80 | grep -i welcome
<h1>Welcome to nginx!</h1>

Here we’ve verified that there are two instances launched, that we have a loadbalancer and VIP configured, and then we’ve done a couple of basic connectivity tests to make sure the VIP is up and passing traffic to our webservers. The nginx default landing page that we can see from the output of the curl command means everything looks as it should.

Cleaning up

In order to remove our stack and all of its resources, the heat command really couldn’t be any simpler:

$ heat stack-delete Webservers
+--------------------------------------+------------+--------------------+----------------------+
| id | stack_name | stack_status | creation_time |
+--------------------------------------+------------+--------------------+----------------------+
| 433026fc-b543-4104-902f-d335e1ea189d | Webservers | DELETE_IN_PROGRESS | 2015-04-16T15:26:52Z |
+--------------------------------------+------------+--------------------+----------------------+

After a few seconds, another run of heat stack-list will show that the ‘Webservers’ stack no longer exists, nor do any of its resources:

$ nova list | grep -i webserv
zsh: done nova list |
zsh: exit 1 grep -i webserv
$ neutron lb-vip-list

Summary

This example shows how straightforward it can be to orchestrate infrastructure resources on OpenStack using Heat. This is a very basic and limited example – it’s possible to do much, much more with Heat including defining elastic and auto-scaling pieces of infrastructure, but hopefully this provides you with some insight and inspiration into how such a tool can be useful.

Programming, Technology

Unravelling Logs – Part 2

In my first post in this series, I talked about the ELK stack of Elasticsearch, Logstash and Kibana and how they provide the first steps into automated logfile analysis. In this post, I’m going to deal with the first step in this process: how we get logs into this system in the first place. I’m not going to go into the process of installing logstash and elasticsearch; that’s pretty well covered elsewhere on the internet, so this post assumes you’ve already done that.

Logstash can accept logs in a number of different ways, one of which is plain old syslog format. The first thing to integrate, and probably the easiest, is syslog, which is where the majority of logging happens anyway. We started off doing this by configuring rsyslog in exactly the way we would to centralise logging to another rsyslog server.

In /etc/rsyslog.conf (or in a separate config file in /etc/rsyslog.d/, depending on your distribution of choice) you’d use one of the following:

# Provides UDP forwarding. The IP is the server's IP address
*.* @192.168.1.1:514

# Provides TCP forwarding.
*.* @@192.168.1.1:514

On the logstash server, we then need to define an input to handle syslog:

input {
  syslog {
    port => 514
    type => "native_syslog"
  }
}

The type entry is arbitrary; it just adds a tag to any incoming entries on this input. But because the input itself is defined as a syslog input, logstash will use its built-in filters for syslog in order to structure the data when it’s pushed into Elasticsearch.

The traditional method of doing this would be to send logs over UDP, which has less of a cpu overhead than TCP. But as the saying goes – I’d tell you a joke about UDP but you might not get it … Once we start to treat our log data as something we need to see in real time, as opposed to a historical record that may or may not be useful, then failing to receive it is a critical problem. We found that rsyslog would occasionally stop sending logs over the network when using UDP, and we had no way of automatically detecting if it was broken or not.

Switching to TCP guarantees delivery at the network layer, but had its own set of problems – we hit another situation where if rsyslog had a problem sending logs, it could also hang without logging locally. From an auditability and compliance perspective it’s critical that we continue to log locally, so architecturally we decided we needed to split the log generation from the log shipping so that the two things can’t possibly interfere with each other, and we’re guaranteed local logging under all circumstances.

You can actually install logstash directly on your clients and use that as your log shipper, but a more lightweight alternative is to use logstash-forwarder. This is a shipping agent, written in Go, to push logs from your clients into a central logstash server. It’s designed for minimal resource usage, is secured using certs, and allows you to add arbitrary fields to log entries as it ships them. Since it uses TCP, we also wrote a Nagios NRPE check which uses netstat to confirm that logstash-forwarder is connected to our logstash server, and we get alerted if there are any problems with the connectivity.
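The check itself is nothing exotic; a minimal sketch of the idea is shown below, assuming the forwarder ships to logstash.yourdomain.com on port 55515 as configured further down, and that net-tools netstat is available. The NRPE plug-in we actually run is along these lines rather than this exact script.

#!/usr/bin/env python
import socket
import subprocess
import sys

LOGSTASH_HOST = 'logstash.yourdomain.com'  # matches the forwarder configuration below
LOGSTASH_PORT = 55515

def main():
    try:
        target = socket.gethostbyname(LOGSTASH_HOST)
    except socket.error as err:
        print('UNKNOWN: cannot resolve {0}: {1}'.format(LOGSTASH_HOST, err))
        return 3
    # -t: TCP sockets only, -n: numeric addresses, to match the resolved IP above
    connections = subprocess.check_output(['netstat', '-tn']).decode()
    needle = '{0}:{1}'.format(target, LOGSTASH_PORT)
    for line in connections.splitlines():
        if needle in line and 'ESTABLISHED' in line:
            print('OK: logstash-forwarder connected to {0}'.format(needle))
            return 0
    print('CRITICAL: no established connection to {0}'.format(needle))
    return 2

if __name__ == '__main__':
    sys.exit(main())
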

The logstash-forwarder configuration file is in JSON, and is fairly simple, although you do need to understand a bit about SSL certs in order to configure it. We just set up the server details, certs and define which logs to ship:

{
  "network": {
    "servers": [ "logstash.yourdomain.com:55515" ],
    "ssl certificate": "/var/lib/puppet/ssl/certs/yourhost.yourdomain.com.pem",
    "ssl key": "/var/lib/puppet/ssl/private_keys/yourhost.yourdomain.com.pem",
    "ssl ca": "/var/lib/puppet/ssl/certs/ca.pem",
    "timeout": 15
  },

  "files": [
  {
    "paths": [ "/var/log/syslog" ],
    "fields": {"shipper":"logstash-forwarder","type":"syslog"}
  }
  ]
}

In the syslog section, you can see we’ve added two arbitrary fields to each entry that gets shipped, one which defines the shipper used, and one which tags the entries with a type of syslog. These make it easy for us to identify the type of the log for further processing later in the chain.

On the logstash server side, our configuration is slightly different: the input is defined as a lumberjack input, which is the name of the protocol used for transport, and we define the certs:

input {
  lumberjack {
    port => 55515
    ssl_certificate => "/var/lib/puppet/ssl/certs/logstash.yourdomain.com.pem"
    ssl_key => "/var/lib/puppet/ssl/private_keys/logstash.yourdomain.com.pem"
    type => "lumberjack"
  }
}

As we’re not using the syslog input, we also need to tell logstash how to split up the log data for this data type. We do that by using logstash’s built in filters, in this case a grok and a date filter, in combination with the tags we added when we shipped the log entries. One important thing to note here is that logstash processes the config file in order, so you need to have your filter section after your input section for incoming data to flow through the filter.

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
      add_field => [ "program", "%{syslog_program}" ]
      add_field => [ "timestamp", "%{syslog_timestamp}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

What this filter does is look for messages tagged with the tag we added when we shipped the entry with logstash forwarder, and if it finds that tag, pass the message on through the filter chain. The grok filter basically defines the layout of the message, and the fields which logstash should assign the sections to. We’re also adding a few extra fields which we use internally at Data News Blog, and finally we pass through a date filter, which tells logstash what format the timestamp is in so it can set the internal timestamp correctly. The final step is very important so you have consistent timestamps across all your different log formats.

Once we’ve configured both sides, we just start the logstash-forwarder service and logstash-forwarder will watch the files we’ve defined, and send any new entries on over the network to our logstash server, which now knows how to process them. In the next post in this series I’ll talk about some of the more advanced filtering and munging we can do in logstash, and also introduce Riemann, a fantastic stream processing engine which can be combined with logstash to do much more complex real time analysis.

Programming, Technology

Puppet Custom Types

Usually when creating custom types in puppet you will use templates to define a set of resources to manage. If you find yourself littering your template classes with exec statements then the likelihood is that you should consider creating a native custom type and directly extending the puppet language. This post is dedicated to just that, as the online documentation surrounding this topic is hazy at best, so I shall attempt to describe in layman’s terms what I discovered. Not only is this my first attempt at creating a custom type but also my first dalliance with Ruby, so I’m sure noob-style errors will abound. Take it easy on me guys!

Background

So for this example I’m going to walk you through the development of my type to handle DNS resource records in BIND. Existing solutions out there typically just create zone files then don’t touch the contents again, assuming that the administrator will have made local changes which shouldn’t be randomly deleted. Some improvements are out there which use the concat library and templates to build the zone files completely under the control of puppet, but that doesn’t fit our requirements.

In our cloud, hardware provisioning is conducted with Foreman, which manages detection of new machines via PXE, then provisioning, which includes setting up PXE, DHCP and DNS. DNS in this case is handled via a Foreman proxy running locally on the name server and performing dynamic zone updates with nsupdate. As you can no doubt appreciate, managing zone files exclusively with puppet would destroy any dynamic updates performed by Foreman. What we need is a custom type which creates the resource records that cannot be created by Foreman (think CNAME and MX records), but via the same mechanism, so that nothing is lost in the ether. So without further ado, let’s create a custom type to manage resource records via a canonical provider, in this case nsupdate.

Concepts

The first consideration when designing a custom type is how it is going to be used. I’m just going to dive straight in and show you a couple of examples of my interface.

dns_resource { 'melody.angel.net/A':
  ensure   => present,
  rdata    => '192.168.2.1',
  ttl      => '86400',
  provider => 'nsupdate',
}

dns_resource { '1.2.168.192.in-addr.arpa/PTR':
  nameserver => 'a.ns.angel.net',
  rdata      => 'melody.angel.net',
}

Title
This is an identifier that uniquely identifies the resource on the system. In this case it is an aggregate of the DNS record name and record type. My first attempt ignored this field entirely and used class parameters to identify the resource but, as we will see later on, that is not the correct way of doing it.

Properties
These are things like the TTL and relevant data class parameters. Having identified a resource or set of resources with the title, properties are tangible things about the resource that can be observed and modified. In fact these are the fields that puppet will interrogate to determine whether a modification is necessary, which is an important distinction to make.

Parameters
These are the remaining class parameters which have nothing to do with the resource being created, but which inform puppet how to manage the resource. Parameters such as resource presence and the name server to operate on have no bearing on the resource itself as reported by DNS. These parameters will not be interrogated to determine whether a modification is necessary.

Fairly straightforward, but easy to go down a blind alley if it is not spelled out explicitly.

Module Layout

Before we delve into code, let us first consider the architecture puppet uses to organise custom types. There are two layers we need to consider. At the top level is the actual type definition, which is responsible for defining how the type will manifest itself in your puppet code. Here you define the various properties and parameters which will be exposed, validation routines to sanitise the inputs, munging to translate inputs into canonical forms, default values and, finally, automatic requirements. To expand upon this last point a little: here you can define a set of puppet resources that are prerequisites for your type. Puppet will, if these resources actually exist, add dependency edges to the graph and ensure that the prerequisites are executed before your type. Admittedly I don’t like having to rummage through code to identify whether any implicit behaviour is forthcoming; however, on this one occasion I will let it slide as it does remove a load of messy meta parameters from the puppet modules themselves.

The second layer is the provider, which actually performs the actions to inspect and manage the resource. And here is the flexibility of puppet: you needn’t be limited to a single provider. In this example I’m creating an nsupdate provider, but there is no reason why you cannot have a plain-text zone file provider, or one for tinydns. These are selectable at runtime with the provider class parameter, or are implicitly chosen by way of being the only provider, or based on facts. As an example, the package builtin type checks which distro you are running and, based on that, uses apt or yum etc.

Delving a little deeper into providers, the general functionality is as follows. When puppet executes an instance of your type it first asks the provider if the resource exists; if it doesn’t and is requested to be present, then the provider is asked to create it. Likewise, if it exists and is requested absent, then the provider is asked to delete the resource. The final case is where the resource exists and is requested to be present. Puppet inspects each property of the real resource defined by the type and compares it with the requested values from the catalogue. If they differ, puppet asks the provider to perform the necessary updates. Simple.
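Expressed as pseudocode (Python here purely for illustration; this is not Puppet’s actual implementation), the contract a provider has to satisfy looks something like this, where provider.get and provider.update stand in for the per-property getters and setters (rdata/rdata=, ttl/ttl=) shown later:

def apply_resource(provider, requested):
    # 1. Ask the provider whether the resource exists at all.
    if not provider.exists():
        if requested['ensure'] == 'present':
            provider.create()
        return

    # 2. It exists but should not: remove it.
    if requested['ensure'] == 'absent':
        provider.destroy()
        return

    # 3. It exists and should stay: sync any property that has drifted
    #    from the value requested in the catalogue.
    for prop in ('rdata', 'ttl'):
        if provider.get(prop) != requested[prop]:
            provider.update(prop, requested[prop])
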

Type code

Hopefully those concepts were straightforward and made clear sense. So let’s look at how this all fits together. First let’s look at the type definition, which lives in the path <module>/lib/puppet/type/<type>.rb

# lib/puppet/type/dns_resource.rb
#
# Typical Usage:
#
# dns_resource { 'melody.angel.net/A':
#   rdata => '192.168.2.1',
#   ttl   => '86400',
# }
#
# dns_resource { '1.2.168.192.in-addr.arpa/PTR':
#   nameserver => 'a.ns.angel.net',
#   rdata      => 'melody.angel.net',
# }
#
Puppet::Type.newtype(:dns_resource) do
  @doc = 'Type to manage DNS resource records'

  ensurable

  newparam(:name) do
    desc 'Unique identifier in the form "<name>/<type>"'
    validate do |value|
      unless value =~ /^[a-z0-9\-\.]+\/(A|PTR|CNAME)$/
        raise ArgumentError, 'dns_resource::name invalid'
      end
    end
  end

  newparam(:nameserver) do
    desc 'The DNS nameserver to alter, defaults to 127.0.0.1'
    defaultto '127.0.0.1'
    validate do |value|
      unless value =~ /^[a-z0-9\-\.]+$/
        raise ArgumentError, 'dns_resource::nameserver invalid'
      end
    end
  end

  newproperty(:rdata) do
    desc 'Relevant data e.g. IP address for an A record etc'
  end

  newproperty(:ttl) do
    desc 'The DNS record time to live, defaults to 1 day'
    defaultto '86400'
    validate do |value|
      unless value =~ /^\d+$/
        raise ArgumentError, "dns_resource::ttl invalid"
      end
    end
  end

  # nsupdate provider requires bind to be listening for
  # zone updates
  autorequire(:service) do
    'bind9'
  end

  # nsupdate provider requires nsupdate to be installed
  autorequire(:package) do
    'dnsutils'
  end

end

The first few lines are just boilerplate code: here you define the name of your type as it will appear in puppet code, along with documentation, because everyone loves documentation, right?

The ensurable method adds in support for the ensure parameter, which shouldn’t come as too much of a surprise. It also requires the provider to implement the create, destroy and exists? methods.

The name parameter must be defined. desc allows documentation of your class parameters. Following that is our first encounter with parameter validation, which is basically checking for a hostname followed by a slash and one of our supported resource record types. Probably not the most RFC compliant regular expression but it works for now!

The nameserver parameter introduces default values so you don’t have to specify them in your puppet code, and the final thing I wish to draw attention to is the autorequires, which add implicit dependencies to the graph, as discussed previously, and may reference any puppet resource.

Provider Code

Now for the guts of the operation. Without further ado, here are the contents of <module>/lib/puppet/provider/<type>/<provider>.rb

# lib/puppet/provider/dns_resource/nsupdate

require 'open3'
require 'resolv'

# Make sure all resource classes default to raising an exception
class Resolv::DNS::Resource
  def to_rdata
    raise ArgumentError, 'Resolv::DNS::Resource.to_rdata invoked'
  end
end

# A records need to convert from a binary string to dot decimal
class Resolv::DNS::Resource::IN::A
  def to_rdata
    ary = @address.address.unpack('CCCC')
    ary.map! { |x| x.to_s }
    ary.join('.')
  end
end

# PTR records merely return the fqdn
class Resolv::DNS::Resource::IN::PTR
  def to_rdata
    @name.to_s
  end
end

# CNAME records merely return the fqdn
class Resolv::DNS::Resource::IN::CNAME
  def to_rdata
    @name.to_s
  end
end

Puppet::Type.type(:dns_resource).provide(:nsupdate) do

  private

  # Run a command script through nsupdate
  def nsupdate(cmd)
    Open3.popen3('nsupdate -k /etc/bind/rndc-key') do |i, o, e, t|
      i.write(cmd)
      i.close_write
      raise RuntimeError, e.read unless t.value.success?
    end
  end

  public

  # Create a new DNS resource
  def create
    name, type = resource[:name].split('/')
    nameserver = resource[:nameserver]
    rdata = resource[:rdata]
    ttl = resource[:ttl]
    nsupdate("server #{nameserver}
              update add #{name}. #{ttl} #{type} #{rdata}
              send")
  end

  # Destroy an existing DNS resource
  def destroy
    name, type = resource[:name].split('/')
    nameserver = resource[:nameserver]
    nsupdate("server #{nameserver}
              update delete #{name}. #{type}
              send")
  end

  # Determine whether a DNS resource exists
  def exists?
    name, type = resource[:name].split('/')
    # Work out which type class we are fetching
    typeclass = nil
    case type
    when 'A'
      typeclass = Resolv::DNS::Resource::IN::A
    when 'PTR'
      typeclass = Resolv::DNS::Resource::IN::PTR
    when 'CNAME'
      typeclass = Resolv::DNS::Resource::IN::CNAME
    else
      raise ArgumentError, 'dns_resource::nsupdate.exists? invalid type'
    end
    # Create the resolver, pointing to the nameserver
    r = Resolv::DNS.new(:nameserver => resource[:nameserver])
    # Attempt the lookup via DNS
    begin
      @dnsres = r.getresource(name, typeclass)
    rescue Resolv::ResolvError
      return false
    end
    # The record exists!
    return true
  end

  def rdata
    @dnsres.to_rdata
  end

  def rdata=(val)
    create
  end

  def ttl
    @dnsres.ttl.to_s
  end

  def ttl=(val)
    create
  end

end

I’m using ruby’s builtin resolver library to check for the presence of a resource on the DNS server. The first four classes highlight one of the cool things about ruby: classes aren’t static. What we’re doing here is attaching new methods to the DNS resource types to marshal the relevant data into our canonical form, i.e. a string, and also providing a catch-all in the super class so we notice when support for a new resource type is added. It would have been easy to omit the override and just let things raise exceptions, but I like giving my peers useful debug.

On to the main body of the provider. The nsupdate method unsurprisingly calls the binary of the same name with an arbitrary set of commands. Usually you’d use puppet’s commands method to define external commands, which enables a load of debug detail, but in this situation I needed access to standard input. create, destroy and exists? do basically just that: create, destroy and probe for the existence of the resource identified by the name. The final four calls are accessors for the properties we defined earlier. You have to be careful with types here, as puppet will see 86400 and “86400” as a mismatch and try to update the resource on every execution.

Conclusions

All in all, going from zero to hero in the space of 2 days wasn’t as daunting as I’d expected, despite a new language and a new framework. Hopefully I’ve summarised my experiences in a way which my readers will be able to easily digest. On reflection, the whole exercise of extending puppet has been breathtakingly easy, and I’m hoping it will provide some inspiration to improve our orchestration and provisioning efforts. And hopefully yours too! Until next time.

Programming, Technology

Elastic High-Availability Clustering With Puppet

In this post I’m going to demonstrate one method I discovered to facilitate HA clustering in your enterprise. The specific example I’m presenting here is how to easily roll out a RabbitMQ cluster to be used by the Nova (Compute) component of OpenStack. Some other applications which come to mind are load balancers: for example, you assign a puppetmaster role to a node when provisioning and have it automatically added to Apache’s round-robin scheduler. Thus, if our monitoring software decides the existing cluster is under too much strain, we can increase capacity from bare metal in a matter of minutes.

Exported Variables

Ideally, what we want to provide this functionality is some form of exported variable which, when collected, contains all instances of that variable, i.e. each rabbit host would export its host name and these could be aggregated. Puppet supports neither exporting variables nor exporting resources with the same name. Custom facts weren’t going to cut it either, as they are limited to node scope. Then I tripped upon a neat solution by the good folks at Example42. Their exported variables module quite cleverly exports a variable as a file:

define exported_vars::set (
  $value = '',
) {
  @@file { "${dir}/${::fqdn}-${title}":
    ensure  => present,
    content => $value,
    tag     => 'exported_var',
  }
}

Which is realized on the puppet master by including the following class

class exported_vars {
  file { $dir:
    ensure => directory,
  }
  File <<| tag == 'exported_var' |>>
}

A custom parser function is then able to look at all the files in the directory, matching on FQDN, variable name, or both, and returning an array of values. It also defaults to a specified value if no matches are found. Perfect!
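For illustration, the lookup that function performs boils down to something like the following, sketched in Python. The real implementation is a Ruby parser function shipped with the Example42 module, and the directory path used as a default here is an assumption; the file naming follows the define above.

import glob
import os

def get_exported_var(fqdn_filter, var_name, default,
                     directory='/etc/puppet/exported_vars'):
    """Collect the contents of every '<fqdn>-<var_name>' file in the directory."""
    pattern = os.path.join(directory, '{0}*-{1}'.format(fqdn_filter, var_name))
    values = [open(path).read().strip() for path in sorted(glob.glob(pattern))]
    return values if values else default

# e.g. every RabbitMQ node that has exported its FQDN so far:
# get_exported_var('', 'nova_mq_node', ['localhost'])
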

Elastic RabbitMQ Cluster

Here’s a concrete example of putting this pattern to use

class profile::nova_mq {

  $nova_mq_username = hiera(nova_mq_username)
  $nova_mq_password = hiera(nova_mq_password)
  $nova_mq_port     = hiera(nova_mq_port)
  $nova_mq_vhost    = hiera(nova_mq_vhost)

  $mq_var = 'nova_mq_node'

  exported_vars::set { $mq_var:
    value => "${::fqdn}",
  }

  class { 'nova::rabbitmq':
    userid             => $nova_mq_username,
    password           => $nova_mq_password,
    port               => $nova_mq_port,
    virtual_host       => $nova_mq_vhost,
    cluster_disk_nodes => get_exported_var('', $mq_var, ['localhost']),
  }
  contain 'nova::rabbitmq'

}

A quick run-through of what happens: when first provisioned, the exported variable is stored in PuppetDB and the RabbitMQ server is installed. Here we can see the get_exported_var function being used to gather all instances of nova_mq_node that exist, but as this is the first run on the first node we default to an array containing only the local host. When the puppet agent next runs on the puppet master, the exported file is collected and realised. Finally, the second run on the RabbitMQ node will pick up the exported variable and add it to the list of cluster nodes.

Gotchas

Some notes to be aware of

  • exported_vars doesn’t recursively purge the directory by default, so nodes which are retired leave their old variables lying about; you’d also need to have dead nodes removed from PuppetDB
  • there are no dependencies between the file and directory creation, so it may take a couple of runs to get fully synced
  • with load balanced puppet masters it’s a bit hit or miss as to whether one has collected the exported variables or not when you run your agents. This can be mitigated by provisioning the variable directory on shared storage (think clustered NFS on a redundant GFS file system)

And there you have it: almost effortless elasticity for your core services, provided by Puppet orchestration.

Programming, Technology

Transparent Encryption Of Offsite Backups With Puppet And Git

I’ll be going into some detail as to how our source control setup works at a later date, but I wanted to address a hot topic beforehand – secure storage of configuration data in the cloud.

All of our source code commits are automatically backed up in the cloud. For us this is GitHub, but this should hold for other SaaS platforms such as those offered by Atlassian. As such all of our configuration data goes onto untrusted systems – be it network address ranges or passwords stored in our Hiera configuration files. This also goes for any certificates that need to be part of our puppet environment.

First Steps

Our initial solutions were based on puppet modules, as that was what Google initially hinted at. Hiera_yamlgpg seemed to fit the requirements. This module replaces the default yaml backend provided by puppet and provides transparent decryption of the Hiera data files on the fly by the puppet master upon compilation of a catalog. The plus point of this approach was that the majority of the data file could be left in plain text and only the pertinent fields encrypted with gpg, like so:

echo -n Passw0rd | gpg -a -e -r recipient

I ran into issues when I occasionally forgot to strip the trailing newline, and the Hiera data file soon became a mess of GPG blocks. Obviously decrypting the obfuscated data to check a value was a pain, and performing code review was tedious as there was no way of seeing the actual changes without some legwork.
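
Just checking what a value was actually set to meant feeding each armoured block back through GPG, along the lines of (blob.asc here being whichever block you’ve pasted out of the YAML):

gpg -q -d < blob.asc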

At this point it was decided we should just opt for full file encryption. This led me to wonder whether git supported some form of hook whereby encryption and decryption could be performed while transferring sensitive data on and off site. As it turns out, something better exists. Git supports filters which can be run on individual files as they are checked in and out of working branches.

Transparent Encryption With Git

Files can be tagged with attributes, either individually or with wildcards, in either .git/info/attributes or .gitattributes. The former applies to a single repository; the latter is under version control and propagated to all peers, which seems like the right thing to do

/hieradata/common.yaml      filter=private

The specified file is tagged with a filter called private; the tag name is arbitrary. Now when we check a file out (smudge) or check it back in (clean) with the private filter, we can run arbitrary commands on its contents. The input and output are via stdin and stdout respectively.

git config --global filter.private.smudge decrypt
git config --global filter.private.clean  encrypt

The encrypt and decrypt scripts were initially based on GPG, as the keys were already installed from our prior dabble with hiera_yamlgpg. It worked, but not well. The issue is that GPG doesn’t encrypt deterministically, most likely because it includes data such as time stamps and the identity of whoever encrypted the data. This led to git thinking that the private files were always modified. Not a problem in itself: the files can be ignored in the index and manually committed when a change actually occurs, but that is hardly the transparent work flow we desired. The real deal breaker came when trying to pull from a remote origin, which git refused to do as it would destroy locally modified files. Back to the drawing board then.

Turns out things work perfectly when you remove the non-determinism.

Encryption & Decryption In Python

There are a couple of solutions out there that use OpenSSL, but they required compilation, which made me steer clear. We’re a python shop, and I’m a geek, so I architected a solution using AES-256 from python-crypto, with the ciphertext encoded in Base64.

The important bits follow. First, key and initialisation vector generation

def gen_key():
    """
    Generate a new key
    """
    try:
        keyf = open(KEY_PATH, 'wb')
    except IOError:
        sys.stderr.write('Err: Open {0} for writing\n'.format(KEY_PATH))
        sys.exit(1)
    # Write KEY_SIZE bytes of key material followed by one block's worth of IV
    keyf.write(Random.new().read(KEY_SIZE + AES.block_size))
    keyf.close()
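
The snippets also lean on a handful of module-level names (KEY_PATH, KEY_SIZE, HEADER) and helpers (get_key, get_ivec, round_up) that aren’t shown. A minimal sketch of what they might look like, assuming the key and initialisation vector live back to back in the single file written by gen_key, and with purely illustrative values for KEY_PATH and HEADER:

import base64
import struct
import sys

from Crypto import Random
from Crypto.Cipher import AES

KEY_PATH = '/etc/dc_crypto/keyfile'   # illustrative location
KEY_SIZE = 32                         # 256 bit key for AES-256
HEADER = '$dc_crypto$'                # marker prepended to encrypted output

def round_up(length, boundary):
    """Round length up to the next multiple of boundary"""
    return ((length + boundary - 1) // boundary) * boundary

def get_key():
    """Read the AES key from the front of the key file"""
    with open(KEY_PATH, 'rb') as keyf:
        return keyf.read(KEY_SIZE)

def get_ivec():
    """Read the initialisation vector stored after the key"""
    with open(KEY_PATH, 'rb') as keyf:
        keyf.seek(KEY_SIZE)
        return keyf.read(AES.block_size)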

Encryption

def encipher():
    """
    Encipher data from stdin
    """
    key = get_key()
    ivec = get_ivec()
    data = sys.stdin.read()
    datalen = len(data)

    # Now for the fun bit: we're going to append the data to a 32 bit
    # integer which describes the actual length of the data, as we
    # need to round the cipher input up to the block size; this allows
    # recovery of the exact data length upon deciphering. We also
    # specify big-endian encoding to support cross platform operation
    buflen = round_up(datalen + 4, AES.block_size)
    buf = bytearray(buflen)
    struct.pack_into('>i{0}s'.format(buflen - 4), buf, 0, datalen, data)

    # Encipher the data; str() converts the bytearray to the byte string
    # PyCrypto expects (this is Python 2 code)
    cipher = AES.new(key, AES.MODE_CBC, ivec)
    ciphertext = cipher.encrypt(str(buf))

    # And echo out the result
    sys.stdout.write(HEADER)
    sys.stdout.write(base64.b64encode(ciphertext))

And decryption

def decipher_common(filedesc):
    """
    Decipher data from a file object
    """
    key = get_key()
    ivec = get_ivec()
    ciphertext = base64.b64decode(filedesc.read())
    # Decipher the data
    cipher = AES.new(key, AES.MODE_CBC, ivec)
    buf = cipher.decrypt(ciphertext)

    # Unpack the buffer, first unpacking the big endian data length
    # then unpacking that length of data
    datalen, = struct.unpack_from('>i', buf)
    data, = struct.unpack_from('{0}s'.format(datalen), buf, 4)

    # And echo out the result
    sys.stdout.write(data)

decipher_common takes a file object because, when used in diff mode, git provides you with a file name whose contents may or may not already be decrypted. This is the purpose of the HEADER string: it tells us whether to perform the decryption or just echo out the file contents. You can enable the diff functionality by updating .gitattributes

/hieradata/common.yaml      filter=private diff=private

And your git configuration to act on the tag

git config --global filter.private.smudge 'dc_crypto decipher'
git config --global filter.private.clean  'dc_crypto encipher'
git config --global diff.private.textconv 'dc_crypto diff'
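
The diff entry point itself isn’t shown above; a minimal sketch of how it might wrap decipher_common and the HEADER check (the decipher_file name is purely illustrative) would be

def decipher_file(path):
    """
    Decipher a named file for git diff, or echo it verbatim if it
    doesn't carry our header (i.e. it's already plain text)
    """
    with open(path, 'rb') as filedesc:
        if filedesc.read(len(HEADER)) == HEADER:
            # Header consumed, the rest of the stream is base64 ciphertext
            decipher_common(filedesc)
        else:
            filedesc.seek(0)
            sys.stdout.write(filedesc.read())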

Obviously you need to be pretty careful with your symmetric key and initialisation vector, but I hope I’ve given enough information for you to avoid the same mistakes I did and keep your data secure in the SaaS world.

Programming, Technology

Dynamic Nagios Host Groups With Puppet

So this little problem caused me some headaches. Coming from a C/C++ systems programming background, I found Puppet takes some getting used to, and this assignment was a learning experience and a half!

The setup we wanted was to have Puppet define a set of host groups on our Icinga server, then have every host export a nagios host resource which selects the host groups it is a member of based on which classes are assigned to it by the ENC. The little database experience I have suggested this approach was best, as it avoids the sprawl of host groups and services being redefined on a per-host basis. A lot of other blogs suggest this is the way to go, but they allude to having a node variable which controls membership, which doesn’t provide the dynamism we wanted.

Plan Of Attack

With puppet being declarative, using global or class variables to flag which host groups to include won’t work, as you’re at the mercy of compilation order, which by its very nature is non-deterministic. It gets worse with global variables when using fat storeconfigs, as a definition on one host seems to get propagated to all the others, and thin storeconfigs don’t export variables at all. Placing the exported nagios host definitions in a post-main run stage suffered from the scope being reset.

The next idea was to use the ENC, Foreman in our case, to generate the host groups. The problem here was that our Foreman host groups refer to hardware platforms, whereas our nagios host groups refer to software groups. And defining host-group membership on a per-host basis isn’t going to cut it!

And then there was the eureka moment. Facter 1.7 supports arbitrary facts being generated from files in /etc/facter/facts.d. Facts are available at compilation time, so regardless of ordering they are always available. Better still we can generate them dynamically per host based on the selected profiles and collect them using an ERB template. And here’s how…

Dynamic Nagios Host Groups In Puppet

The first piece of the puzzle is a module to allow easy generation of custom facts, starting with the directory structure

class facter::config {
  File {
    ensure => directory,
    owner  => 'root',
    group  => 'root',
    mode   => '0755',
  }

  file { '/etc/facter': } ->
  file { '/etc/facter/facts.d':
    # Purge anything we don't manage so stale host group facts disappear
    recurse => true,
    purge   => true,
  }
}

Then the reusable fact definition itself

class facter {
  define fact ( $value = true ) {
    file { "/etc/facter/facts.d/${title}.txt":
      ensure  => file,
      owner   => 'root',
      group   => 'root',
      mode    => '0644',
      content => "${title}=${value}",
      require => Class['facter::config'],
    }
  }
}

Next up we define a set of virtual host group facts. The idea is that we can realise them in multiple locations based on profile and not worry about collisions. Think of multiple profiles needing an Apache vhost: rather than forking the community apache sub-module, we can just attach the host group at the profile level.

class icinga::hostgroups {
  include facter
  @facter::fact { 'hg_http': }
  @facter::fact { 'hg_ntp': }
}

Then, to actually create the facts on the host system, something like the following will suffice

class profile::http_server {
  include icinga::hostgroups
  realize Facter::Fact['hg_http']
}
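
When the catalogue is applied the realised fact ends up as a tiny external fact file on the node; with the define above, /etc/facter/facts.d/hg_http.txt simply contains

hg_http=true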

The final piece of the jigsaw is the host definition itself

class icinga::client {
  @@nagios_host { $::hostname:
    ensure     => present,
    alias      => $::fqdn,
    address    => $::ipaddress,
    hostgroups => template('icinga/hostgroups.erb'),
  }
}
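
On the icinga server the exported hosts then just need collecting; a minimal sketch (the target path and service name here are assumptions) might be

class icinga::server {
  # Realise every exported nagios_host from PuppetDB on the monitoring box
  Nagios_host <<| |>> {
    target => '/etc/icinga/conf.d/puppet_hosts.cfg',
    notify => Service['icinga'],
  }
}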

And the ERB template to gather the facts we’ve exported

hg_generic<% -%>
<% scope.to_hash.keys.each do |k| -%>
<% if k =~ /(hg_[\w\d_]+)/ -%>
<%= ',' + $1 -%>
<% end -%>
<% end -%>
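
On a host that has realised both hg_http and hg_ntp, the template renders to something like

hg_generic,hg_http,hg_ntp

which is exactly the comma separated list the nagios_host hostgroups attribute expects.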

And there you have it. I’m by no means an expert at either puppet or ruby so feel free to suggest better ways of achieving the same end result. Note this isn’t production code, just off the top of my head, so there may be some mistakes, but you get the gist. Happy monitoring!