Configuration Management - Part 1
Here’s where it all starts… I believe configuration management to be the most important practice for running an efficient and safe IT infrastructure. Sadly, it is also one that is very rarely done properly, if at all.
In development, a comprehensive test suite is claimed to be the basis for all the other Agile practices – only if every aspect of the code is tested can you be confident enough to go in and make changes. Similarly, a comprehensive configuration management database (CMDB) is the beginning of Agile SysAdmin. It:
- allows you to see the pattern of dependencies in your environment
- gives an instant guide to the cascading impact of an outage
- prevents outages caused by conflicting services and other mistakes
- enables “change with confidence”
- enables systematic coverage (e.g. backups and monitoring should cover enough, but not too much)
- allows easy queries for “what’s out there?” (what is non-standard, what is at an old level, etc.)
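To make the “what’s out there?” query concrete, here is a minimal sketch. It assumes the CMDB can hand back a flat list of installed items; the `InstalledItem` record, the host names and the `STANDARD` minimum versions are all made up for illustration, not taken from any particular CMDB product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstalledItem:
    host: str
    package: str
    version: tuple[int, ...]  # e.g. (2, 4, 57), so versions compare numerically


# Hypothetical CMDB contents, for illustration only.
INVENTORY = [
    InstalledItem("web01", "httpd", (2, 4, 57)),
    InstalledItem("web02", "httpd", (2, 2, 34)),
    InstalledItem("db01", "postgresql", (15, 3)),
    InstalledItem("db01", "legacy-agent", (0, 9)),
]

# Hypothetical agreed minimum version per standard package.
STANDARD = {"httpd": (2, 4, 0), "postgresql": (14, 0)}


def below_standard(inventory, standard):
    """Items whose version is older than the agreed minimum."""
    return [
        item
        for item in inventory
        if item.package in standard and item.version < standard[item.package]
    ]


def non_standard(inventory, standard):
    """Items that are not on the list of standard packages at all."""
    return [item for item in inventory if item.package not in standard]
```

With the sample data, `below_standard(INVENTORY, STANDARD)` picks out the old `httpd` on `web02`, and `non_standard(INVENTORY, STANDARD)` picks out `legacy-agent` on `db01`.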
For the developers out there – sometimes “configuration management” is used as a synonym for software source code control, as in “SCM”. Great, but not what I’m talking about here. This article is about tracking your installed base of machines and software.
So what’s the magic ingredient? Obviously, like any tool your CMDB must be usable, and not get in the way of someone trying to do a job – even in an “emergency fix” situation.
But like the test suite in test-driven development (TDD), it must be 100% comprehensive. And you don’t get that by asking folks, “Please remember to update the CMDB after pushing out any change”. You get it by constructing your environment so that a change cannot happen without being registered in the CMDB first.
How? If you’re managing a webserver, look at the parameters that make that webserver unique – the host it runs on, the port, the source directory of its web pages. Put those in the CMDB, and then run a cron job every night that regenerates the config files from it. Now, if someone changes the webserver locally, the change gets wiped the next night and the effort was wasted. After that happens once, no one forgets to update the CMDB.
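The nightly generation step could look something like the sketch below. It assumes the CMDB exposes each webserver as a simple record with `host`, `port` and `docroot` fields (names invented here), and it uses Apache-style directives purely as an example of a text config format.

```python
from pathlib import Path
from string import Template

# Assumed template; the directives are Apache-style, chosen only as an example.
VHOST_TEMPLATE = Template(
    "# Generated from the CMDB; local edits are overwritten nightly.\n"
    "Listen $port\n"
    "ServerName $host\n"
    "DocumentRoot $docroot\n"
)

# Hypothetical CMDB records: the parameters that make each webserver unique.
CMDB_WEBSERVERS = [
    {"host": "web01.example.com", "port": 8080, "docroot": "/srv/www/site-a"},
    {"host": "web02.example.com", "port": 8081, "docroot": "/srv/www/site-b"},
]


def render_config(record):
    """Render one webserver's config text from its CMDB record."""
    # substitute() raises KeyError if the record lacks a field, so an
    # incomplete CMDB entry fails loudly instead of producing a bad file.
    return VHOST_TEMPLATE.substitute(record)


def write_configs(records, out_dir):
    """Overwrite every generated config file; returns the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for record in records:
        path = out_dir / f"{record['host']}.conf"
        path.write_text(render_config(record))  # clobbers any local edit
        written.append(path)
    return written
```

A nightly cron entry would then simply call `write_configs` with the current CMDB records and the directory the webserver reads its config from. The unconditional overwrite is the point: the CMDB stays the only place where a change survives.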
As a side benefit, a central change + push should become easier than going to a particular host, looking around in the files for a setting, firing up an editor, etc. And consider if you have 50 servers…
ITIL does discuss the CMDB (or the CMS, the Configuration Management System, in v3) at length. Sadly, the message is rather diluted because it mixes in a bunch of rather unconnected factors – a special cupboard for your software CDs, and another one for hardware spares. That’s just odd. Also, I think ITIL still fails to make the link between Problem/Issue handling and a reliable CMDB; it just doesn’t make the config of your servers as fundamental as it should.
Of course, this requires up-front effort to write scripts that generate config files, and can be limiting if you want to use a new feature that you haven’t included in your template. Maybe you have to make a call to cheat occasionally, but this should be seen as a weakness to be cleaned up … and not a common event. As TDD teaches us, the cheat quickly becomes a timewaster even in the short term.
And yes, it favours products configured by well-understood text files over undocumented binary formats, but then, so should you. Even most of the binary-configured products have an API of some sort that a script can drive anyway.
In order to have config data from disparate systems at all levels of the stack, and possibly managed by different teams, you may well need a federation of interlocking CMDBs rather than a monolith. That’s OK. It can help where you have a patchwork of existing management systems, where you only have control over one part, or want to break up the CMDB project into manageable challenges. Try to integrate though, so you can view dependencies the whole way down the stack.
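One way to picture that integration is a thin layer that merges the per-team CMDBs into a single view. The sketch below is only an illustration: it assumes each source can be read as a mapping of item name to attributes, and the source names and attributes are invented. Each value remembers which source supplied it, and disagreements between sources are reported rather than silently resolved.

```python
from collections import defaultdict


def federated_view(sources):
    """Merge several CMDBs into one view.

    `sources` maps a source name to {item_name: {attribute: value}}.
    Returns (view, conflicts): view[item][attribute] is (value, source_name),
    and conflicts lists (item, attribute) pairs where sources disagree.
    """
    merged = defaultdict(dict)
    conflicts = []
    for source_name, records in sources.items():
        for item, attrs in records.items():
            for key, value in attrs.items():
                if key in merged[item]:
                    # First source wins; a differing value is flagged for cleanup.
                    if merged[item][key][0] != value:
                        conflicts.append((item, key))
                    continue
                merged[item][key] = (value, source_name)
    return dict(merged), conflicts


# Hypothetical sources owned by two different teams.
SOURCES = {
    "network": {"web01": {"ip": "10.0.0.5"}},
    "apps": {"web01": {"port": 8080, "ip": "10.0.0.9"}},
}

view, conflicts = federated_view(SOURCES)
```

Here `view["web01"]` combines the network team’s `ip` with the application team’s `port`, and `conflicts` flags the `ip` mismatch so that someone can fix the wrong CMDB.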
The corollary to all this is that I don’t rate the “discovery” tools out there – better than nothing, yes, but a second-best option. I’d rather manage my environment than find out about it later. I do however like the nice graphical tools they seem to include, which let you explore your dependency stack, i.e. “if this service goes out, what is affected?”.
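You don’t need a graphical tool to answer the “what is affected?” question if the CMDB already records dependencies. A rough sketch, assuming dependencies are stored as “item depends on these items” (all names below are hypothetical):

```python
from collections import deque

# Hypothetical dependency data: each key depends on every item in its list.
DEPENDS_ON = {
    "shop-frontend": ["web01", "orders-api"],
    "orders-api": ["db01"],
    "reporting": ["db01"],
    "web01": [],
    "db01": [],
}


def affected_by(failed, depends_on):
    """Everything that depends on `failed`, directly or transitively."""
    # Invert the edges: for each item, which items depend on it?
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    # Breadth-first walk up the stack from the failed item.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected
```

With this data, `affected_by("db01", DEPENDS_ON)` returns `orders-api`, `reporting` and `shop-frontend`, while `affected_by("web01", DEPENDS_ON)` returns only `shop-frontend`.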
I’ll be blogging a lot more about this topic - examples, how-(not)-to, and the effects on so many other practices. Here’s the bottom line:
Comprehensive, proactive configuration management is absolutely fundamental to Agile SysAdmin.
You could call our ideal the CMDB-Driven Environment.