Author Topic: Rethinking backup strategy  (Read 706 times)

Rethinking backup strategy
« on: July 23, 2018, 09:30:05 am »
This morning I have a hard drive that has gone buggerup. Natch, it is the one with the most data on it. You may ask, why aren't I using RAID? The answer is that, over the years, I've had more issues with domestic-style RAID 0 than I have ever had with hard drives. I'm still reasonably hopeful that I may be able to pry the data off.

It's not critical, as I do have a backup strategy. We'll have to see how effective it is; at the moment I'm reasonably hopeful, and if it fails I've only myself to blame. That's not the backup strategy I'm talking about.

For better or worse, due to LEGACY, I run 7 drives: 1 internal SSD (240 GB), 4 internal HDs (2 x 2 TB, 1 x 1 TB, 1 x 700 GB) and 2 external HDs (2 x 2 TB). I recognise that multiplicity increases my exposure to failure, although it is one of the more recent drives that has failed. With some exceptions, I have a relatively sensible (hah!) split, keeping most data on one drive and backing that data up (I hope).

But the truth is, any drive failure is going to be inconvenient. So, I'm trying to consider how I can re-think to end up with a fully resilient system.

Sounds like I have to go fully virtualised, with some sort of vSAN, creating snapshots to achieve the resilience. That would need 2 or 3x used storage, but the snapshot archive would only have to be external, powered off most of the time.

I'm having problems getting my head around how to do this, especially what the licensing implications are, whether I can do it on a single machine or will need a server, and whether there are any better ideas out there. I'm not sure how I'd get on trying to do enterprise-grade stuff in a domestic environment (with a domestic budget...).

Option 2 is doing a simple disk image duplication, somehow or the other. Not sure what tech is around to do that but it strikes me as easier in concept.

The challenge: create a resilient environment that will allow easy resurrection in the event of failure, without a restore from backup. 7 or even 30 day RPO. As above, I don't trust RAID in day-to-day operation, but maybe I'm wrong to be so pessimistic.

tiermat

  • According to Jane, I'm a Unisex SpaceAdmin
Re: Rethinking backup strategy
« Reply #1 on: July 23, 2018, 10:51:21 am »
My sarcastic self is saying "Oh, look, how cute, a physical device based backup system. I thought they went out of fashion years ago".

My more helpful self is saying: work out what you can't live without and stick that in the cloud.  Next level down, stick it on a Drobo or the like, so it has hardware resilience, with a second copy on a Seagate (or similar) external drive, following the mantra that data doesn't exist unless it is in 3 locations.  Stuff you don't really care about (MP3s etc.), leave on each local machine.

I used to spend far too much time worrying about how to keep my data safe.  I have now farmed out that concern to someone else and sleep much better for it.  If you tot up the cost (hardware, software and time), you will get a lot of storage, for a long time, in the cloud for the same money!
I feel like Captain Kirk, on a brand new planet every day, a little like King Kong on top of the Empire State

Re: Rethinking backup strategy
« Reply #2 on: July 23, 2018, 11:08:03 am »
Well, yes, that is pretty much what I've done - data wise (or at least, hope I have done, but that's another story).

However, getting up and running "as was" is much more than just recovering data. What I'm thinking around is how to have a duplicate environment to recover into, effectively for the cost of the hard disks.

tiermat

  • According to Jane, I'm a Unisex SpaceAdmin
Re: Rethinking backup strategy
« Reply #3 on: July 23, 2018, 11:19:00 am »
Can I suggest some reading material?

https://landing.google.com/sre/book/chapters/embracing-risk.html

Work out what it is worth to you, how much it will cost to develop (include a calculation where your time is zero value, and one where it has a value >zero), then use those figures to work out what you should be doing.
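A rough sketch of that sum in Python, for anyone who wants to plug in their own numbers. Every figure and function name here is an invented placeholder, not anything from the SRE book or from a real price list.

```python
def diy_cost(hardware, software, hours, hourly_rate):
    """Lifetime cost of a home-built setup: kit plus the time spent on it."""
    return hardware + software + hours * hourly_rate


def cloud_cost(monthly_fee, months):
    """Cost of renting storage from someone else over the same period."""
    return monthly_fee * months


# Placeholder figures only; substitute your own.
hardware = 300.0     # disks, enclosure
software = 0.0       # licences, if any
hours = 40           # time spent building and babysitting it
monthly_fee = 8.0    # hypothetical cloud subscription
months = 36          # comparison period

free_time = diy_cost(hardware, software, hours, hourly_rate=0.0)
paid_time = diy_cost(hardware, software, hours, hourly_rate=20.0)
cloud = cloud_cost(monthly_fee, months)

print(f"DIY, time valued at zero: {free_time:.2f}")
print(f"DIY, time valued at 20/h: {paid_time:.2f}")
print(f"Cloud over {months} months:  {cloud:.2f}")
```

With these made-up inputs the two DIY figures sit either side of the cloud figure, which is the point of running the calculation both ways.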

My answer was "Treat all devices as disposable, don't do anything that requires commercial grade plans"

Again, this means I sleep well at night :)

Kim

  • 2nd in the world
Re: Rethinking backup strategy
« Reply #4 on: July 23, 2018, 11:23:08 am »
RAID works fine[1], but it isn't a backup solution.  Indeed it introduces a new layer of potential failure modes[2] in order to make dealing with a subset of common hardware failures more convenient.

Backups need to be automated so they actually happen, and they need to be designed to handle the most common forms of data loss.  In my world, the most common use-case is "Oh fuckit, I've made a complete mess of this piece of work, I'm pulling Thursday's version out of the backups and starting again.".  Anything that involves restoring entire images, VMs or otherwise, makes that sort of thing a complete pain in the arse.

Real disasters, like visits from the disk fairy, malware, OS updates run amok, being pwned by 1337 h4xx0rz, or your house burning down are much rarer, but a good backup strategy should cope with those too.


My strategy is a product of paranoia[3], combined with the convenience of working with relatively small datasets.  Code, email and documents don't take up much space - if you're regularly creating high-resolution images or videos, it's going to get expensive.

I have a server (running RAID1 to hedge against disk failures), where all the important data lives as a matter of course[4].  Desktops make daily borg backups to the server, mainly to aid recovery of their system-specific configuration if they break.  The server has a disk on which nightly rsync snapshots of all but the junk (temp files, cache directories, that sort of thing) are made.  This is exposed to the users via a read-only mount, so if you want to restore something, you can just go and fish it out of /backups/$date/path/to/file and you only have to wait for the drive to spin up.
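For illustration, one way a dated rsync snapshot like that could be driven, sketched in Python. This is not the actual script: the function name, the exclude patterns and the use of `--link-dest` (which hard-links files unchanged since the previous snapshot, so each night only costs the space of what changed) are all assumptions.

```python
from datetime import date
from pathlib import PurePosixPath


def rsync_snapshot_cmd(source, backup_root, today, previous=None, excludes=()):
    """Build an rsync command that writes a snapshot to backup_root/YYYY-MM-DD."""
    dest = PurePosixPath(backup_root) / today.isoformat()
    cmd = ["rsync", "-a"]
    # Skip the junk: temp files, cache directories and so on.
    for pattern in excludes:
        cmd.append(f"--exclude={pattern}")
    # Assumption: hard-link unchanged files against the previous snapshot.
    if previous is not None:
        link = PurePosixPath(backup_root) / previous.isoformat()
        cmd.append(f"--link-dest={link}")
    # Trailing slashes make rsync copy the directory contents.
    cmd += [f"{source}/", f"{dest}/"]
    return cmd


cmd = rsync_snapshot_cmd(
    "/home",
    "/backups",
    today=date(2018, 7, 23),
    previous=date(2018, 7, 22),
    excludes=[".cache/", "*.tmp"],
)
print(" ".join(cmd))
```

The command is only built and printed here; handing it to `subprocess.run(cmd, check=True)` from a nightly cron job would be the obvious next step.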

Additionally, I have an external drive unplugged on a shelf which holds a single rsync snapshot, which a script nags me to connect and update after 28 days.  That should protect against more serious system failures, although recovery isn't automated.  (There seems little point - over the years the only catastrophic server failures I've had were one worm infection when I was young and naive, and a couple of motherboard failures during the capacitor plague years.  The way I see it, if I have to rebuild the hardware, the overhead of doing a clean OS install and re-populating /home, /etc, important parts of /var, etc. from backups is negligible, and probably worthwhile for de-crufting.  I try to keep one of the desktop machines of the same generation of hardware as the server, so in an emergency I can use it as a source of donor parts.)
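The nagging logic can be very small. A minimal Python sketch, assuming the date of the last update is recorded somewhere (a marker file's timestamp, say); the function name and that bookkeeping are hypothetical, only the 28-day threshold comes from the description above.

```python
from datetime import datetime, timedelta


def offline_copy_is_stale(last_update, now, max_age_days=28):
    """True once the shelf drive has gone max_age_days without an update."""
    return now - last_update >= timedelta(days=max_age_days)


now = datetime(2018, 7, 23, 9, 0)
stale = offline_copy_is_stale(datetime(2018, 6, 20, 9, 0), now)  # 33 days old
fresh = offline_copy_is_stale(datetime(2018, 7, 10, 9, 0), now)  # 13 days old

if stale:
    print("Time to plug in the external drive and update the snapshot.")
```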

I have a smaller disk in a friend's server in another city, to which I make nightly borg backups of the more important parts of the server's filesystem.  This is compressed and encrypted, and generally requires a bit of sysadmin-fu to recover files from, but it's there for worst-case scenarios.  A commercial cloud storage solution would work just as well for this, the main advantage here was that I could sneakernet the original image[5] and save myself three months.


[1] At a domestic level, I wouldn't go near anything but Linux software RAID with a barge pole.  Hardware RAID is strictly for the big boys who can afford to have duplicate hardware on the shelf doing nothing, otherwise you're liable to find yourself up the creek without a working RAID controller.  RAID0 has no redundancy btw - it's just a way to multiply your rate of disk failures in exchange for slightly more performance.
[2] RAID is excellent at faithfully reproducing corruption across multiple drives.
[3] Ignoring the sheer number of person-hours that go into even a reasonably modest computer program, as I get older I find that email and IRC logs are more reliable than my own memory of events.  I don't think it's paranoid to put a bit of effort into maintaining those.
[4] Anything portable, or running Microsoft Windows is considered fundamentally untrustworthy, and not for storing data on.
[5] Insert Tanenbaum quote about a hard drive hurtling up the M6.
To ride the Windcheetah, first, you must embrace the cantilever...

Re: Rethinking backup strategy
« Reply #5 on: July 23, 2018, 12:16:48 pm »
Yeah I did mean RAID 1 mirroring, not RAID 0, and yes I agree with what you say, Kim.

I'm talking about system recovery, not data recovery.

As I said, I do have data backups. What I'm considering is, in a virtual environment, creating a mirror of the VMDKs so, in the event of a failure, you switch across to the other set. Cost should be relatively low, just an external drive that you spin up to do the images, once a month? Convenience and value should be high. 100% restored system, data, program, links.
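The bare mechanics of that monthly copy might look something like the Python sketch below. It assumes the VM is shut down while its disk file is copied (copying a live VMDK is likely to give an inconsistent image), and the function names and paths are made up for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte images never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, dest_dir):
    """Copy one disk image to dest_dir and confirm the copy matches the original."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / Path(source).name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"checksum mismatch after copying {source}")
    return target


# Hypothetical usage, with the guest powered off and the external drive mounted:
# copy_and_verify("D:/VMs/work.vmdk", "E:/vm-mirror/2018-07")
```

The checksum pass doubles the read time, but a mirror you have never verified is not much comfort on the day you need it.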

Only it isn't quite as simple as that, because you also need a way to bring the hypervisor up. That could be another SSD, ready and waiting, or even a HD. Plus, if it were as straightforward as I think it might be, you'd have thought more people would be doing it, and I haven't heard of it being done.

Kim

  • 2nd in the world
Re: Rethinking backup strategy
« Reply #6 on: July 23, 2018, 12:18:25 pm »
Quote
Yeah I did mean RAID 1 mirroring, not RAID 0, and yes I agree with what you say, Kim.

I'm talking about system recovery, not data recovery.

Yeah, that was a crosspost with the one where you clarified that.


Morat

  • I tried to HTFU but something went ping :(
Re: Rethinking backup strategy
« Reply #8 on: July 23, 2018, 02:11:19 pm »
Quote
Yeah I did mean RAID 1 mirroring, not RAID 0, and yes I agree with what you say, Kim.

I'm talking about system recovery, not data recovery.

As I said, I do have data backups. What I'm considering is, in a virtual environment, creating a mirror of the VMDKs so, in the event of a failure, you switch across to the other set. Cost should be relatively low, just an external drive that you spin up to do the images, once a month? Convenience and value should be high. 100% restored system, data, program, links.

Only it isn't quite as simple as that, because you do need to have a way to bring the hypervisor up as well, that could be another SSD, ready and waiting, or even a hd. Plus if it was as straightforward as I think it might be, you'd have thought more people would be doing it and I haven't heard of it being done.

Have you investigated Veeam backup? It might be out of budget but it's extremely good. I've used it to restore VMs for real and it worked beautifully.

They have a free version for physical servers, just in case you're not all virtualised. https://www.veeam.com/windows-endpoint-server-backup-free.html
Tandem Stoker, CX bike abuser (slicks and tarmac) and owner of a sadly neglected MTB.

Re: Rethinking backup strategy
« Reply #9 on: July 23, 2018, 04:03:32 pm »
XigmaNAS (formerly NAS4Free)
ZFS level 3 RAID or mirrors
Snapshots or syncing to duplicate servers.


The cost of the above depends on how much you value your data.


Morat

  • I tried to HTFU but something went ping :(
Re: Rethinking backup strategy
« Reply #10 on: July 24, 2018, 02:20:05 pm »
Quote
Well, yes, that is pretty much what I've done - data wise (or at least, hope I have done, but that's another story).

However, getting up and running "as was" is much more than just recovering data. What I'm thinking around is how to have a duplicate environment to recover into, effectively for the cost of the hard disks.

For that sort of thing, virtualisation is ideal. If you had a beefy PC/server running VMware (or possibly Xen or Hyper-V - I don't know enough about those two) and a backup system to back up VMs to a second backup storage array or that there Cloud, you'd be able to restore full VMs very easily. Unless you're gaming, you can actually run several VMs from a single host* quite happily.
Veeam backup lets you run a VM from the backup set while you migrate it back to primary storage. The migration can be a very slow process, but it gets you up and running in just a few clicks.
Of course, most virtualised environments are designed to provide uninterrupted service through redundancy and HA failover between multiple hosts - but that gets expensive quite quickly due to the duplicated hardware. I suspect a single host and some cloudy backup space would cover your needs, but it depends on what internet speeds you have and your other requirements.

*depends on the size of the host, of course, and the size of the VMs. You can do a lot with multiple minimal CentOS installs using the same resources overall as a single Windows Server VM.

Re: Rethinking backup strategy
« Reply #11 on: July 29, 2018, 09:38:03 am »
Ham,
I'm not sure why you would bring in the complexity of virtualisation and vSANs?
I'm assuming this behemoth full of disks you are using is a desktop PC. If so, you don't say what OS you are using?

I don't trust RAID solutions like Intel's RST (Rapid Storage Technology), but mdadm, ZFS and hardware RAID all work well IME.

I really like separation of concerns. All essential data stored on a Linux server (in my case Debian) that has disks in a RAID1 Mirror. Data shared via Samba or NFS. Desktops and laptops are then pretty much disposable. If my desktop dies, no tears are shed and I can switch to using my laptop until I get the desktop fixed, and vice versa. If I wasn't so concerned about conserving electricity, I'd go for RAID6 or RAID10 here.

Linux server runs ZFS. Snapshots are taken on a daily basis and pushed to a backup server - also running Debian/ZFS. Backup server runs RAID6 and is only switched on for duration of the backup window. Snapshots are retained for 60 days giving me point in time recovery and thus a degree of resistance to cryptoware etc.
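To make the retention rule concrete, here is a small Python sketch of the pruning decision only. It assumes each snapshot can be mapped to the date it was taken; the function name is invented, and it deliberately stops short of calling any ZFS command.

```python
from datetime import date, timedelta


def expired_snapshots(snapshot_dates, today, keep_days=60):
    """Return the snapshot dates that have fallen outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in snapshot_dates if d < cutoff)


snapshots = [date(2018, 5, 1), date(2018, 6, 15), date(2018, 7, 28)]
to_prune = expired_snapshots(snapshots, today=date(2018, 7, 29))
print(to_prune)  # only the 1 May snapshot is more than 60 days old
```

Anything the function returns would then be handed to whatever actually destroys the snapshot on the backup server.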

The trouble with relying on virtualisation and a vSAN etc. will be that many of the features you want - VM snapshots, cloning, VM-level replication - require licensing. My personal experiences with VMware haven't been the greatest either. I do run a hypervisor (Proxmox) but that's a separate beast. If I wanted to look at a vSAN type solution then it would probably involve two physical hosts for resilience.
A Few Apples Short of a Strudel

Re: Rethinking backup strategy
« Reply #12 on: July 30, 2018, 01:53:30 pm »
Quote
Ham,
I'm not sure why you would bring in the complexity of virtualisation and vSANs?

I'm floating the idea to see what people think might be the optimal solution. Not at all sure about vSAN myself

Quote
I'm assuming this behemoth full of disks you are using is a desktop PC. If so, you don't say what OS you are using?

System boot OS is Win10. It's not so much a behemoth as a grew-like-Topsy jobbie; two of the disks are legacy from previous systems and don't have anything much useful on them, although they are used. The OS sits on an SSD where I have left sufficient room to install Linux if I wanted. One of the two 2 TB drives has the "performance" VMDK, the other has data (which OBVIOUSLY is the one that died). I run the VirtualBox hypervisor, and guests in that.

There's a NAS with media files and non-performance VMDK

Oh yeah. Programs are hosed all about the internal HD  ::-)

Quote
I don't trust RAID solutions like Intel's RST (Rapid Storage Technology), but mdadm, ZFS and hardware RAID all work well IME.

I really like separation of concerns. All essential data stored on a Linux server (in my case Debian) that has disks in a RAID1 Mirror. Data shared via Samba or NFS. Desktops and laptops are then pretty much disposable. If my desktop dies, no tears are shed and I can switch to using my laptop until I get the desktop fixed, and vice versa. If I wasn't so concerned about conserving electricity, I'd go for RAID6 or RAID10 here.

Linux server runs ZFS. Snapshots are taken on a daily basis and pushed to a backup server - also running Debian/ZFS. Backup server runs RAID6 and is only switched on for duration of the backup window. Snapshots are retained for 60 days giving me point in time recovery and thus a degree of resistance to cryptoware etc.

The trouble with relying on virtualisation and a vSAN etc. will be that many of the features you want - VM snapshots, cloning, VM-level replication - require licensing. My personal experiences with VMware haven't been the greatest either. I do run a hypervisor (Proxmox) but that's a separate beast. If I wanted to look at a vSAN type solution then it would probably involve two physical hosts for resilience.

Indeed. One of the reasons that I still boot Win is convenience: it's a shared desktop with Mrs Ham, and if I was going to start from some sort of hypervisor I would have to script it through to a Win startup. Previously I had been questioning the value of that.

The other reason is, of course, performance. I don't like the idea of trying to run compute- and IO-intensive apps inside a VM inside a host; ESX running native would be cleaner.

Given our (at work) involvement with VMware, I might be able to get hold of a license if it was worth it, which comes down to whether it really would make any difference. Just creating snapshot copies of the VMDK would prevent the hassle of bringing a dead system back to a recent incarnation.

Morat

  • I tried to HTFU but something went ping :(
Re: Rethinking backup strategy
« Reply #13 on: August 07, 2018, 12:39:20 pm »
Just for clarity, where I said Host I meant a machine running ESXi - not a Windows machine running VMware.
ESXi itself is free, it's the enterprise features that cost.

It's still a big spec for home use.

Re: Rethinking backup strategy
« Reply #14 on: August 07, 2018, 01:19:58 pm »
FWIW I have a big ESXi machine in the corner of my sitting room and it works a treat.

32 GB RAM, 2 x 2 TB HDD, 2 x 120 GB SSD.

I run about 5 or 6 VMs on there, one of which is a backup host with a ZFS partition spread across two slices of the HDDs. At some point I need to get some more disks (the server has enough physical space; I just wanted to get a SATA controller that will play nicely with ESXi).

Any new project I do I can get a completely new Linux VM up and running within 10 minutes.
"Yes please" said Squirrel "biscuits are our favourite things."