In my last post I wrote about backing up my dedicated server and why I decided to use tarsnap. After a couple of months of running tarsnap manually I decided it was way past the time to properly automate it.
The main issue is how many snapshots do you want to store? On the one hand it’s nice to be able to go back in time as far as possible, but on the other hand there’s the issue of how large your archives get (and consequently the cost).
There are three different charges for tarsnap; data sent, data received and data stored. Each is charged on a daily basis and subtracted from a total in your account (you keep an account in credit rather than being billed). If you’re doing backups on a daily basis the data sent and received will be approximately the same regardless of how long you retain the archives for. So the figure to consider is the cost for storing the data.
I decided to go for a model where I had X daily backups, Y weekly backups and Z monthly backups. I also decided I wanted to back up only certain directories, and that I wanted to keep them as separate archives (because I’m dealing with large numbers of files, and this breaks it down a bit – I don’t think it affects costs).
So I went about scripting this. First step was to write a “fake” tarsnap. The reasoning behind this was that it’d allow me to do quick backup runs without any time used for archiving or any costs. It’s basically just a perl script that adds and removes archives from a database file.
Next I wrote a backup script. It’s pretty basic at the moment, but fully automates the creation of archives and deletion of expired ones. You provide it with a list of directories to back up, and how many daily, weekly and monthly archives you want to keep. Then stick it in cron and off it goes.
It’s a bit tailored to my setup, and may only work on FreeBSD (are the date flags the same on other operating systems?). Also, its cleaning of old archives is primitive; it’s based on the number of archives, rather than the age.
I welcome feedback on these scripts and improvements, but bear in mind they’re very much a work in progress.