If you have ever had a hard drive fail without a backup, then you know it can be a painful experience. It happened to me when I was 16, and I lost a hard drive filled with music. Previously I had thought, “eh, what are the chances it happens to me,” as I often mistakenly thought about many situations as a dumb, invincible 16-year-old. But it was far from important; it was only music that I had obtained from the high seas. I mean sure, it took a lot of time to acquire in those days (does anyone else remember it taking 20 minutes to download a single song?), but it was far from irreplaceable. What happens if it’s your family photos or letters or other things that simply cannot be replaced? I learned a valuable lesson that day, and it has stuck with me: backups are important.

There are a couple of important rules to remember when considering backups.

  1. RAID is not a backup.
  2. The 3-2-1 rule for backups states you should have 3 copies of your important data, on 2 separate devices, with 1 being offsite.

I will start with “RAID is not a backup,” and keep it brief, as most people will not have RAID set up in their homes. RAID solves the problem of downtime when a drive fails, and that’s it. It does nothing to protect against accidental file deletions; therefore, it is not a proper backup solution. I’ve seen some people debate this, and they are simply wrong. Period. Moving on.

Now onto 3-2-1, and the software that I use to accomplish this.

The servers sitting down in the basement are just used ThinkCentre PCs bought off of eBay for around $100 each, with 4 x 8TB drives connected to them. Using snapraid, I have 3 drives usable as data drives, and one drive is set to be a parity drive. This means if any one drive out of the 4 fails, I lose nothing. Well, to be accurate, I have snapraid set to sync nightly at 2:00 AM, which means I could lose any data that has changed since the last nightly sync.
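
To make that layout concrete, here is a minimal sketch of what a snapraid config for three data drives and one parity drive could look like. The mount points and the content file locations are hypothetical, and the crontab line in the closing comment is only one possible way to schedule the nightly sync.

```shell
#!/bin/bash
# Print a sample snapraid.conf for three data drives and one parity drive.
# All paths below are hypothetical placeholders; substitute your own mounts.
print_snapraid_conf() {
    cat <<'EOF'
# Parity file, stored on the dedicated parity drive
parity /srv/parity1/snapraid.parity

# Content files (file list and checksums); keep copies on several drives
content /var/snapraid/snapraid.content
content /srv/disk1/snapraid.content
content /srv/disk2/snapraid.content

# The three data drives
data d1 /srv/disk1/
data d2 /srv/disk2/
data d3 /srv/disk3/

# Skip files that tend to change while a sync is running
exclude *.log
EOF
}

# Example crontab entry for a nightly sync at 2:00 AM (an assumption):
#   0 2 * * * snapraid sync
```

Running `print_snapraid_conf` prints the config to the terminal, so it can be reviewed before being saved as the snapraid config file.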

I like this timing because if I am working on something during the day and accidentally delete it, I can recover the files from the sync the night before. It’s also unlikely that I place something on the drives that is both irreplaceable and not still on another device (rule of two devices) before the next sync takes place, so this works for me. But remember how RAID is not a backup? Well, despite its name, snapraid is software that blurs the lines a bit between RAID and a backup solution. It sort of acts like RAID and it sort of acts like a backup solution, but in my opinion, it’s neither. It is, however, a happy medium. I do not consider it a full backup solution, which is why I also use borg. And unless you have a server like mine, with a bunch of drives connected to it, you can skip snapraid and just use borg.

I consider borg a full backup solution. It has deduplication, which means if I have multiple copies of the same file in different folders or even on different machines then it only needs to take up the space of about one of those files. Likewise if I have a big file and only a little of the data changes, it doesn’t need to make a completely new copy of the changed file, just a little more than the changed parts of it. This saves a lot of storage space. Borg is also fast, easy to install, easy to script, supports encryption and compression, and allows for backing up to remote hosts. The backups that borg creates are what I actually consider to be my backups.
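
As a rough illustration, a scripted borg run could look like the sketch below. The repository path, the archive naming scheme, and the retention counts are all assumptions chosen for the example, and the repository is assumed to have been created once beforehand with `borg init`.

```shell
#!/bin/bash
# Hypothetical repository location; point this at your own backup drive.
# One-time setup (not run here): borg init --encryption=repokey "$REPO"
REPO="/mnt/backup/borg-repo"

# Build an archive name from a host name and a date, e.g. "myhost-2024-01-31".
archive_name() {
    printf '%s-%s\n' "$1" "$2"
}

# Back up the paths given as arguments, then thin out old archives.
backup() {
    local name
    name=$(archive_name "$(uname -n)" "$(date +%Y-%m-%d)")
    # Create a compressed archive; borg deduplicates chunks automatically.
    borg create --stats --compression zstd "${REPO}::${name}" "$@" || return 1
    # Keep a limited history; these retention counts are arbitrary examples.
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 "$REPO"
}
```

Calling `backup /srv/data` would then archive that (hypothetical) directory under a name based on the host and today's date.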

And then finally, there is timeshift, which is software used to back up your OS installation. And specifically your OS installation. So I use timeshift to make sure that if my OS gets borked by an update, I can easily revert it.

So those are the three major pieces of software I use for creating my backup solution. Now how does the overall picture look? Let me give an overview, and tie in the 3-2-1 rule to see how I’m doing.

My phone, computers, laptops, etc. have pictures, videos, settings, etc. (copy 1, device 1, site 1) that I do not want to lose, so syncthing runs on them and backs up all those files and folders to the server drives (copy 2, device 2, site 1).

The server has the snapraid parity, which in turn gives the client backups a third copy (copy 3, device 3, site 1). All the server’s own service settings, docker, media, and documents also get their parity (copy 2, device 2, site 1). But as I said, I don’t consider snapraid a complete backup solution so…

Borg makes full backups of all of this to two additional large external drives, giving me another +2 copies and +2 devices of redundancy. Then borg makes backups to an online storage solution, giving another +1 copy, +1 device, +1 site.

So my client devices (phones, laptops, etc.) backups have 6 copies, on 6 devices, at 2 sites. And my server data is backed up with 5 copies, on 5 devices, at 2 sites. It’s for sure overkill, but I also for sure do not need to worry about losing my data.
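
The tally above can be checked with a few lines of shell arithmetic. This is only a sketch: the function name is made up for the example, and it reads “1 being offsite” as needing at least two sites in total.

```shell
#!/bin/bash
# Succeed when a tally satisfies the 3-2-1 rule. Assumption: "1 offsite"
# is interpreted as at least 2 sites (one local, one remote).
meets_321() {
    local copies=$1 devices=$2 sites=$3
    [ "$copies" -ge 3 ] && [ "$devices" -ge 2 ] && [ "$sites" -ge 2 ]
}

# Client data: original + server + parity + 2 external drives + online.
client_copies=$((1 + 1 + 1 + 2 + 1))
# Server data: original + parity + 2 external drives + online.
server_copies=$((1 + 1 + 2 + 1))

meets_321 "$client_copies" "$client_copies" 2 && echo "client data: ok"
meets_321 "$server_copies" "$server_copies" 2 && echo "server data: ok"
```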

Some things I learned through setting up these software solutions:

  1. Make sure to exclude the data drive mounts for timeshift in the timeshift.json file. My data drives are mounted in /srv and timeshift was including them in the snapshot which exceeded the size of the drive and caused it to error.
  2. snapraid-runner is a nice script for running snapraid sync and scrub and easily setting other options. I also learned that for snapraid to run without errors on my data, I needed to add an ignore rule for *.log files, as they were changing during the sync process and causing it to error.
  3. When running snapraid sync or a borg backup, shut down all docker containers and limit anything else that might use a database or change a file while the backup is occurring. Since most of my services run in docker containers, here is an example of how I do this to start snapraid-runner:
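#!/bin/bash
# Stop all docker containers and, only if that succeeds, run snapraid-runner.
docker stop $(docker ps -a -q) && snapraid-runner -c /path/to/config
# The restarts below run whether or not the sync succeeded.
# These services need to start before the other services that depend on them.
docker start redis
docker start nginx
# Start all the other services (this starts every container, including any
# that were already stopped before the script ran).
docker start $(docker ps -a -q)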
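
Because the script above restarts every container, including ones that were deliberately stopped beforehand, one possible variation is to record which containers are running and restart only those. The sketch below is an assumption-laden example, not a drop-in replacement: the function name is made up, and `redis` and `nginx` are simply carried over from the script above as the services to start first.

```shell
#!/bin/bash
# Stop the running containers, run the given command, then restart only the
# containers that were running before. Returns the command's exit status.
run_with_containers_stopped() {
    # Names of the currently running containers, one per line.
    local running
    running=$(docker ps --format '{{.Names}}')

    if [ -n "$running" ]; then
        # Word splitting is intended: one argument per container name.
        docker stop $running || return 1
    fi

    # Run the job passed as arguments and remember how it exited.
    "$@"
    local status=$?

    if [ -n "$running" ]; then
        # Start the dependencies first, but only if they were running.
        local first
        for first in redis nginx; do
            if printf '%s\n' "$running" | grep -qx "$first"; then
                docker start "$first"
            fi
        done
        # Start the rest; containers that are already up are left as they are.
        docker start $running
    fi
    return "$status"
}
```

It could then be called as `run_with_containers_stopped snapraid-runner -c /path/to/config`, and the same wrapper would work for a borg backup command.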