Automating tarsnap backups

In my last post I wrote about backing up my dedicated server and why I decided to use tarsnap. After a couple of months of running tarsnap manually, I decided it was well past time to automate it properly.

The main question is how many snapshots to store. On the one hand it’s nice to be able to go back in time as far as possible, but on the other hand there’s the issue of how large your archives get (and consequently the cost).

There are three different charges for tarsnap: data sent, data received and data stored. Each is charged on a daily basis and subtracted from a total in your account (you keep an account in credit rather than being billed). If you’re doing backups on a daily basis, the data sent and received will be approximately the same regardless of how long you retain the archives. So the figure to consider is the cost of storing the data.

I decided to go for a model where I had X daily backups, Y weekly backups and Z monthly backups. I also decided I wanted to back up only certain directories, and that I wanted to keep them as separate archives (because I’m dealing with large numbers of files, and this breaks it down a bit – I don’t think it affects costs).
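As a rough illustration of the daily/weekly/monthly model, here is a minimal sketch of how a script might decide which tier a given run belongs to. The names `MONTHLY_DAY`, `WEEKLY_DAY` and `backup_type` are made up for this example and are not taken from my script; it assumes a monthly archive takes precedence over a weekly one when both fall on the same day.

```shell
#!/bin/sh
# Hypothetical settings for this sketch only.
MONTHLY_DAY=1   # day of month (1-31) on which the archive counts as monthly
WEEKLY_DAY=1    # day of week (1 = Monday ... 7 = Sunday) for weekly archives

# backup_type DAY_OF_MONTH DAY_OF_WEEK
# Prints "monthly", "weekly" or "daily" for the given day.
backup_type() {
    dom=${1#0}   # strip a leading zero so "08" is compared as 8
    dow=$2
    if [ "$dom" -eq "$MONTHLY_DAY" ]; then
        echo monthly
    elif [ "$dow" -eq "$WEEKLY_DAY" ]; then
        echo weekly
    else
        echo daily
    fi
}
```

Calling it as `backup_type "$(date +%d)" "$(date +%u)"` would classify today's run; both format strings are standard, so this part should not depend on FreeBSD.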

So I went about scripting this. The first step was to write a “fake” tarsnap. The reasoning behind this was that it’d let me do quick backup runs without spending any time on archiving or incurring any costs. It’s basically just a Perl script that adds and removes archives from a database file.
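To give an idea of what such a stand-in does, here is a minimal sketch written as a shell function rather than Perl. It is not the actual script: the database is assumed to be a plain text file with one archive name per line, `FAKE_DB` is a hypothetical path, and only the tarsnap options `-c`, `-d`, `-f` and `--list-archives` are recognised.

```shell
#!/bin/sh
# Hypothetical location of the fake archive list, one name per line.
FAKE_DB=${FAKE_DB:-/tmp/fake-tarsnap.db}

# fake_tarsnap: accepts a small subset of tarsnap's options and only
# records archive names; nothing is read from disk or uploaded.
fake_tarsnap() {
    mode=""
    name=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -c) mode=create ;;
            -d) mode=delete ;;
            --list-archives) mode=list ;;
            -f) if [ $# -ge 2 ]; then name=$2; shift; fi ;;
            *) ;;   # paths to back up are ignored
        esac
        shift
    done
    touch "$FAKE_DB"
    case "$mode" in
        create)
            if grep -qxF -- "$name" "$FAKE_DB"; then
                echo "fake-tarsnap: archive already exists: $name" >&2
                return 1
            fi
            echo "$name" >> "$FAKE_DB"
            ;;
        delete)
            if ! grep -qxF -- "$name" "$FAKE_DB"; then
                echo "fake-tarsnap: no such archive: $name" >&2
                return 1
            fi
            # grep -v exits non-zero when nothing is left; that is fine here.
            grep -vxF -- "$name" "$FAKE_DB" > "$FAKE_DB.tmp" || true
            mv "$FAKE_DB.tmp" "$FAKE_DB"
            ;;
        list)
            cat "$FAKE_DB"
            ;;
        *)
            echo "usage: fake_tarsnap -c|-d -f NAME [PATH...] | --list-archives" >&2
            return 2
            ;;
    esac
}
```

Because it answers to the same options, a backup script can be pointed at it instead of the real binary while the retention logic is being tested.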

Next I wrote a backup script. It’s pretty basic at the moment, but fully automates the creation of archives and deletion of expired ones. You provide it with a list of directories to back up, and how many daily, weekly and monthly archives you want to keep. Then stick it in cron and off it goes.
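The creation half can be sketched roughly as below. This is an illustration under assumptions, not the script itself: `TARSNAP`, `DIRS`, `archive_name` and `run_backups` are hypothetical names, the directory list is assumed to contain no spaces, and the archive naming scheme (directory, tier, date) is just one reasonable choice.

```shell
#!/bin/sh
# Hypothetical settings for this sketch only.
TARSNAP=${TARSNAP:-tarsnap}            # point this at a fake for dry runs
DIRS=${DIRS:-"/etc /usr/local/etc"}    # space-separated, no spaces in paths

# archive_name DIR TYPE DATE
# Builds a name such as etc-daily-2024-01-15 from its three parts.
archive_name() {
    # Drop the leading slash and turn the remaining slashes into dashes.
    base=$(printf '%s' "$1" | sed -e 's|^/||' -e 's|/|-|g')
    printf '%s-%s-%s\n' "$base" "$2" "$3"
}

# run_backups TYPE
# Creates one archive per directory, each named for today's date.
run_backups() {
    today=$(date +%Y-%m-%d)
    for dir in $DIRS; do
        "$TARSNAP" -c -f "$(archive_name "$dir" "$1" "$today")" "$dir" || return 1
    done
}
```

Putting the date last in a year-month-day form means the names for one directory and tier sort in age order, which makes expiring old ones simpler.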

It’s a bit tailored to my setup, and may only work on FreeBSD (are the date flags the same on other operating systems?). Also, its cleaning of old archives is primitive; it’s based on the number of archives, rather than the age.
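Count-based cleaning of this kind can be sketched as follows. The function name `expired_archives` is hypothetical, and the sketch assumes archive names end in a sortable date (for example `etc-daily-2024-01-15`), so that sorting the names also sorts them by age.

```shell
#!/bin/sh
# expired_archives PREFIX KEEP
# Reads archive names on stdin, one per line, and prints the names that
# start with PREFIX but are not among the newest KEEP of them.
expired_archives() {
    prefix=$1
    keep=$2
    # index() == 1 is a literal "starts with" test, so no regex escaping
    # is needed. Reverse sort puts the newest name first; tail then skips
    # the first KEEP lines and prints the rest.
    awk -v p="$prefix" 'index($0, p) == 1' |
        sort -r |
        tail -n +"$((keep + 1))"
}
```

Feeding it the output of `tarsnap --list-archives` and passing each printed name to `tarsnap -d -f` would then delete everything beyond the newest KEEP archives for that prefix. As noted, this counts archives rather than checking their age, so a missed run simply leaves an older archive around for longer.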

I welcome feedback on these scripts and improvements, but bear in mind they’re very much a work in progress.


9 Comments

  1. Chris

    Thanks for your replies. I managed to create a simple solution by running the monthly/weekly backups on a specific hour like this:

    if [ X"$DOM" = X"$MONTHLY_DAY" ]; then
    if [ X"$HOD" = X"$MONTHLY_HOUR" ]; then

    HOD is the current hour and MONTHLY_HOUR is the hour at which to run monthly backups.

  2. Chris – yeah, it basically needs hacking to add hourly support. It’d need to know the hour of day to do daily/weekly operations on, and otherwise do hourly. I’ve got a copy somewhere that does that, but I never got around to uploading it.

  3. Chris

    I’m using the script but ran into a problem when running it on an hourly basis. Basically, on Mondays the script would keep creating a weekly backup each hour and delete older backups from previous weeks (I’ve set WEEKLY=4). Is there any way to prevent this?
