Avoiding The Needless Multiplication Of Forms

System Deployment

Dr Andrew Moss

2015-10-16

Time to wander into a slightly different topic: now that the mechani.se domain is back up and running it is time to setup a gitolite3 installation on face.mechani.se

Problem Context for gitolite

My way of working on a linux system has evolved over the years because of some specific desires:
The third point is the killer; during a session on one machine I build up a thick and fecund context. Depending on the work it may be edits to source, environmental variables, command history and other forms of state that are local to the machine. Over time each machine acquires a layering of packages and installed artefacts (libraries, modules, random pieces of related source). Even seemingly inconsequential parts of the machine state are useful: the collection of workspaces and windows, positions and combinations are all memory prompts for a particular task.
The original dream (probably not even mine, these things tend to be infectious) was a teleporting environment: effectively hibernate a machine and transport the hibernated image to a new machine to be restored. These are the idle dreams of a grad student who works late through the night and doesn't want to start fresh when he trudges into the office. These dreams never quite found traction, although years of experimenting allowed them to morph into something more useful.
Virtualbox introduced teleportation a few years ago. The reality involves more suckage than the dream. Image files are large and cumbersome to transport. Somewhere I have a custom diff utility that syncs the hard-drive overlays inside a .VDI against a target-set to reduce the diff to a few hundred megs at a time (possible over my positively rural ADSL as well as on a flash drive). It just didn't really work out for us. Versioning entire OS images, avoiding branches and generally playing the maintain-the-repo-consistency game on a larger scale was even less fun that you would think.
It turns out that the answer is very boring and simple - its more of a process than an artefact.

Version control for everything

Most of work falls neatly into two categories:
This clean taxonomic split is the stuff that programmers live for. It suggests a scratch directory that never needs to be backed up or transported off machine, and a set of version control repositories for everything that I would displeased to lose. That is where people balk at the idea of how much hassle it would be to keep everything in sync, and the server-side issue of maintaining a bare-bones repository against each of those projects. I wrote a script. Well, truth be told I wrote quite a few over the years find the right way to work, but in the end they all collapsed into a single script.
Much like a sordid-ploy by Sauron there is a single repository that rules them all, gitenv:
This repository has a very strange property for source-control: the files within act as if they are in continuous change. Normally the state of a repository's contents acts discretely: things do not change in-between git commands. But linking the known_hosts file, and the shell history into this repository means that the contents are always dirty. Because it is alway dirty it always needs to merge against the remote - so each machine has a slightly different history for this repository. It is challenging to work with.
Everything else is simple in comparison, there is a single independent repository for each:
This means that each machine that I use has a collection of a few dozen repositories. This would be a serious pain to maintain by hand. Instead one script takes care of the difficult between the continuous environment repository and the server (and its mirrors), and then works out how close to consensus the rest of the repositories are. Where the actions to establish consensus are simple (i.e. the repository is purely ahead or behind) the script brings it into line automatically. This makes things sane.
Transporting state between machines is the same as using backup/restore. This is absolutely essential - it means that the backup and the restore mechanism are in use every day. When you positively, absolutely need to rely on a system, make sure that you eat your own dog-food. Mmmm chewy. The weird thing about my backup and restore system is that any two machines rarely have exactly the same contents - but they all chase the same consensus state, and progress towards synchronisation is monotonic. This is actually good enough to make sure that nothing is every lost.

Gitolite3

Git servers are nice and easy. Manually keeping track of repository details is an absolute pain in the arse. Thankfully gitosis, and now gitolite have made that process incredibly simple. Despite that simplicity I have not yet worked out how to integrate this into the preseeded process, so for now this is a dangling live piece of state on the server. [Note to self: seeing it like this it is quite obvious running this with the sudo flipped around, root or root->git should make it easy]
Each git server needs a user dedicated to gitolite3:
sudo adduser --system --shell /bin/bash --gecos 'Git version control' --group --disabled-password --home /home/git git # Copy public key into git.pub sudo su git cp ../main/rsa_git.pub git.pub gitolite setup -pk git.pub
The docs make it look much more complex, but on debian if you have installed the gitolite3 package this is all there is to it. Don't reuse a key - it may seem easier in the short-term but it actually makes things much more complex in the long term. Dedicate a key to git, and use an agent properly!
All the repositories inside gitolite are bare - this is the point of a server, guaranteed push. This has been running quite happily against a single server or years, as I do the upgrade I'm setting up a second mirror for the git server. I haven't tried automated the sync between mirrors yet - there is a bit of thought to be had first about whether or not pushes are guaranteed in a system with mirrors. I'm sure it will be fun to find out :)
I am always forgetting the admin repo URLs as there are slight differences in git-urls under the different protocol prefixes, but here it is as simple as:
git clone git@face.mechani.se:gitolite-admin face-gitolite-admin
Inside the admin repo the key is already in place so the config layout becomes completely uniform, conf/gitolite.conf looks like:
repo randomforests RW+ = git repo paperBase RW+ = git
So now the allsync.py script needs some tweaks to handle multiple remotes as mirrors...