Backups
Like flossing, backups are something we should do all the time, but many of us neglect them for long stretches. Partly because they are a lot of work. Partly because the right way to do them has been a moving target over time.
Backups
The basic idea is to take a snapshot of either your whole system or the parts that are important to you and save it in a safe place where it can be retrieved when needed. Sounds simple, but it is always a moving target. Let me go through my history of backups to illustrate the problems.
My journey
Originally I used floppy disks (ok, I used tape for a couple years until I could afford a floppy disk drive) as my sole storage media. In all cases the floppy disk was bigger than my main memory so I had to switch out floppies multiple times to make copies (the drives were expensive and I never bought 2).
Floppies died easily. The 5.25" would bend and then not rotate. Sometimes they just got crap in them and wouldn’t rotate. The 3.5" were almost armored by comparison and rarely failed mechanically for me. Both, though, would die a quick death when a magnet got too close. So much so that people my age still have a knee-jerk reaction to magnets near computers, even when there is nothing magnetically sensitive in them.
On my first programming job we had a tape drive. That was very expensive back in the day and I was in awe. One cartridge (a QIC like the cover image up top) could hold 20MB of data. The whole OS could be installed on one tape and the machine backed up for weeks on a couple more. Looking back it was a cheap quarter inch cartridge tape drive, but it was a major step forward. I learned about tar (short for tape archive) and never wanted to look at a floppy again.
I had some run-ins with the bigger reel-to-reel systems on the VAX and other systems in the computer rooms, but they looked like a pain and I was glad to be stuck with QICs.
In grad school we had high end workstations and the fad was 8mm tapes. These were actually designed for portable video use, but companies like Exabyte had made SCSI drives for data. Lightning fast, so fast that I could store 2.5GB of data in about an hour (it was the early 90s and we had a tower of 300MB hard drives on that SCSI bus). We could afford to buy these tapes fairly often, which was good given how often they would wear out. It was great, by comparison.
I also bought an 8mm drive at home to back up my systems. This worked well until I started to buy hard drives that were the same size. That’s ok, because industry had brought us the 4mm DAT. Originally for high quality audio, it never caught on. The data version was all the rage though. 12GB in the same time as the 8mm and it was half the size.
We had a RAID array of drives to store the growing number of pictures on. Switching out 4mm DATs was getting to be a pain. I bought a tape changer that would take a cartridge of 4 and could switch them via commands like ‘mt’. Those switchers required a lot of maintenance and would break easily. That was the end of tapes at home.
At work we moved through the LTO series and had tape robots. Seeing how often the maintenance guy had to come out, it was easy to understand why the maintenance contract was close to a programmer’s salary. That was about the end of tapes at work.
CD-ROMs and DVDs could hold 640MB and 4.7GB respectively. While you weren’t going to back up your whole system with either one, you could back up critical data and system recovery images with them. If you went nuts you could write a series of DVDs to back up something large, but you had to be serious.
Around this time hard drives started to get really cheap. So cheap that it started to make sense to use them over tapes. Sure, they weren’t designed for that type of abuse, but the failure rate was on par with tapes and the convenience was way better. To this day I have a toaster and a few SATA drives that I copy things to every now and then.
At work we back things up to the cloud. Often from the cloud. Each of the cloud providers offers a low cost storage tier where getting data out costs about as much as storing it for a month. I have petabytes of data in these and the costs are minimal, but I have to ask for my data back and pay a pretty penny every time I do.
Backups today
I think most people don’t even do backups now. Chromebooks and phones will auto backup to the cloud. This works so well that a good Internet connection, a new machine and a small amount of time is all you need to be up and running after a major failure. Google even made a video about it where a guy in a hazmat suit would take a Chromebook from someone typing, destroy it in various ways and bring a new one back. The typist didn’t miss a beat (it’s almost that good).
All of my machines have options for backup to different cloud services. I could even schedule it to happen every night while I’m asleep. I could recover a file from a specific date with a few mouse clicks, so long as it hasn’t been moved to that low cost tier. These options are really cool, but if you have a lot of data they are pricey.
What do you do on a cloudless day?
So you use the cloud and all your data is magically backed up by service XYZ. That’s great. It’s like having elves painlessly floss your teeth while you sleep. The problem is what do you do when there is no cloud? Your Internet connection could be down, or you have data caps and can’t get all of your family pictures up or down from the cloud. None of us want to think about it, but one day, one of the cloud providers is going to make a mistake and lose something. Data is still safer there than on my computer, but you should have redundancy.
At the same time I have all of this data so I can use it. My family’s photos and videos mean nothing to the cloud, only to my family. Sure, the family shares them in the cloud with one another and I have groups of friends that do the same. But I should own a copy here. I mean really own. Store them and serve them locally.
This may sound crazy but it is important. The good news is that it is getting easier every day.
Local storage
I’ve been looking for options to do this for years. The closest to what I want is called Perkeep (formerly Camlistore). It lets me upload blobs of data and has a web interface for images and video. There is a search feature and preview option for multimedia files. I have a screenshot of a local test version below.
It is fairly simple to upload content to. Each blob is stored by its hash, so as I upload from different backup media I don’t have to worry about cluttering up the storage with multiple copies. They will just be references to the same image.
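To make the "stored by its hash" idea concrete, here is a minimal sketch of a content-addressed store. It is not Perkeep's code or API: the `put_blob` and `count_blobs` names, the SHA-256 choice, the directory layout and the `sha256-` reference format are all my assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def put_blob(store: Path, data: bytes) -> str:
    """Save data under its SHA-256 digest and return a reference to it.

    Hypothetical helper: the layout and the "sha256-" prefix are assumptions,
    not Perkeep's actual on-disk format.
    """
    digest = hashlib.sha256(data).hexdigest()
    path = store / digest[:2] / digest
    if not path.exists():  # same bytes were uploaded before, so nothing to write
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return f"sha256-{digest}"


def count_blobs(store: Path) -> int:
    """Count the blob files currently held in the store."""
    return sum(1 for p in store.rglob("*") if p.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        photo = b"pretend these bytes are a JPEG"
        ref_from_dvd = put_blob(store, photo)      # first copy, from one backup
        ref_from_drive = put_blob(store, photo)    # same photo, from another backup
        put_blob(store, b"a different photo")
        print(ref_from_dvd == ref_from_drive)      # both uploads yield one reference
        print(count_blobs(store))                  # number of files actually stored
```

Because the name of each blob is derived from its bytes, uploading the same photo from an old DVD and from a SATA drive ends up pointing at one stored file.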
Right now it is not reading the metadata from any of the files I have uploaded. There are ways of storing that data in the system. Worst case I will write my own scanner or importer to pull metadata from each file and populate it in the system.
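A first pass at such a scanner could look something like the sketch below. It only collects what the filesystem already knows (size, modification time, a MIME type guessed from the file name) using the Python standard library. Reading EXIF data or pushing the records into Perkeep is not shown, and `scan_metadata` and the record fields are hypothetical names of my own.

```python
import mimetypes
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def scan_metadata(root: Path) -> list[dict]:
    """Walk root and return one metadata record per regular file.

    Hypothetical helper: the field names are placeholders, not a Perkeep schema.
    """
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        info = path.stat()
        mime, _ = mimetypes.guess_type(path.name)  # guess from the extension only
        records.append({
            "path": str(path.relative_to(root)),
            "bytes": info.st_size,
            "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
            "mime": mime or "application/octet-stream",
        })
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "beach.jpg").write_bytes(b"12345")  # stand-in for a real photo
        for record in scan_metadata(root):
            print(record)
```

The records are plain dictionaries on purpose, so whatever importer ends up loading them into the system can pick the fields it needs.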
The server hardware is minimal. I have it running on a 3rd gen i5 and I’m pretty sure I can get it to run on a Raspberry Pi 4 with a USB drive attached. So the investment is anywhere from $100-700 per server. With ZFS as the filesystem I can have RAID redundancy and easy growth options. Compared to the equivalent in the cloud, the system probably pays for itself in about a year.
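The payback estimate is simple arithmetic, sketched below. The numbers in the example are made-up placeholders, not my actual bills: a $400 server (inside the $100-700 range above), an assumed $40 a month for equivalent cloud storage and an assumed $5 a month for power. Plug in your own.

```python
import math


def months_to_break_even(hardware_cost: float,
                         cloud_monthly: float,
                         local_monthly: float = 0.0) -> float:
    """Months until the one-time hardware cost is covered by monthly savings.

    Returns infinity when running locally is not cheaper per month.
    """
    savings = cloud_monthly - local_monthly
    if savings <= 0:
        return math.inf
    return hardware_cost / savings


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    months = months_to_break_even(hardware_cost=400, cloud_monthly=40, local_monthly=5)
    print(round(months, 1))  # 400 / (40 - 5), about 11.4 months
```

With those placeholder figures the server is paid off in a little under a year. A bigger archive or a cheaper machine shortens that, and a small archive may never break even.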
Is this perfect? No, not by a long shot. What gets me excited is that with a small investment in time and machines I can own the data and the whole tower below it. Sure I’ll still back this up to the cloud, but I’ll probably buy another server and replicate it at family houses in different areas. That’s essentially a shoestring Google at that point, and I own it.