For years now I've been constantly changing my backup plans, always a little unsatisfied with the current solution. So let me try to write down what I actually want first and then see what options there are. I know there are plenty of tools for this (e.g. unison, git for config files, bup, etc.), but without a little planning even the best tool won't help.

  • Synchronization of config files. I.e. dotfiles in my home directory. As I work on multiple computers and regularly change things, this is quite an essential element. Versioning and diff/merge mechanics would be nice, but are not a must. The mechanism used here should not automatically include all sorts of files, as some are machine-specific and the dot-directories also contain lots of caches etc. I don't want to exclude those explicitly; rather, I'd like to include a certain subset of my config files for syncing.
  • Synchronization of data files. I.e. presentations, wallpapers, papers, all that stuff. No need for versioning here, as I'll practically never want to go back to an earlier version of those files (I more or less hoard them anyway). These files will quite often be moved or renamed though (as they are sorted automatically). For those files, everything under a certain subdirectory should always be included in the syncing process; manually adding things here would get quite tedious.
  • Synchronization of music. This is a tough one, as my music collection is quite large (e.g. it's infeasible to sync with unison). To make things harder: my music player automatically sorts my music by artist etc., and these paths change from time to time as I discover files that are untagged or incorrectly tagged. Having to tell a synchronization program that a file has moved and changed every time I correct a track name in my player is certainly not something I want to do. On the other hand, I don't want or need the same state on all machines (it probably wouldn't hurt, but my internet connection at home is not the fastest). I wouldn't mind carrying a USB disk back and forth between home and work for syncing my music, but I do not want to manually select files for syncing; a high degree of automation is a must here.
  • Protection against dumb things. Especially premature deletions of files. In the ideal case this even holds for checked-out repositories (when accidentally deleting a changed file that is not yet checked in). Ideally, a mechanism to achieve this would allow going back over a selection of time intervals (e.g. like rsnapshot does). However: say I'm on vacation for as long as the longest time interval, come back, and then discover that I accidentally deleted file X the day before I left, but even in the oldest backup version it's already gone. So the solution here should not do anything (in particular, not rotate out older versions) if no files changed.
  • Protection against hardware failures. Assume my laptop explodes and the hard disk is completely destroyed. In this scenario there should be a backup somewhere off-site that is not more than a week old. However, I don't want an automated network sync that I can't see or immediately abort (I might be sitting on a train, connected via GSM, or simply need every bit of the bandwidth for something urgent).
  • The sum of the solutions should be maintainable. I.e. it would be nice not to have five different tools running to solve these problems.
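
To make the "files move or are renamed" requirement for data files a bit more concrete, here is a minimal sketch of how a sync tool could recognize a rename without being told about it: index every file by a hash of its content and compare two indexes. The function names (`file_digest`, `build_index`, `detect_changes`) are made up for illustration and do not come from any of the tools mentioned above. Note the limitation: this only catches pure moves and renames. Correcting a tag in a music file changes the file's bytes, so that case would need something smarter (e.g. hashing only the audio data), which this sketch does not attempt.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_index(root):
    """Map each path (relative to root) to the digest of its content."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = file_digest(full)
    return index


def detect_changes(old_index, new_index):
    """Classify path differences between two indexes.

    Returns (moved, added, removed), where moved is a list of
    (old_path, new_path) pairs with identical content. Files that kept
    their path (changed or not) are ignored here.
    """
    # Paths that vanished, grouped by content digest.
    vanished = {}
    for path, digest in old_index.items():
        if path not in new_index:
            vanished.setdefault(digest, []).append(path)

    moved, added = [], []
    for path, digest in sorted(new_index.items()):
        if path in old_index:
            continue  # same path as before, not a move
        candidates = vanished.get(digest)
        if candidates:
            # Same content disappeared elsewhere: treat it as a move.
            moved.append((candidates.pop(), path))
        else:
            added.append(path)

    removed = sorted(p for paths in vanished.values() for p in paths)
    return moved, added, removed
```

A sync run would call `build_index` on the directory, compare the result with the index stored from the previous run via `detect_changes`, and then replay the `moved` pairs on the other machine as cheap renames instead of deleting and re-transferring the files. Hashing a large collection on every run is slow, so a real tool would presumably cache digests keyed by size and modification time; that part is left out here.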

In my next post on this topic I'll list a few tools I know or use so far, and maybe say a few words on how well they solve some of these problems.
