- cross-posted to:
- technology@lemmy.world
- technology@lemmy.ml
Hi all,
As self-hosting is not just “home-hosting” I guess this post should also be on-topic here.
At the beginning of the year, BleepingComputer published an interesting post on the biggest cybersecurity stories of 2023.
Item 13 is an interesting one (see the URL of this post). In short: a Danish cloud provider was hit by a ransomware attack that encrypted not only the clients’ data, but also the backups.
For a user, this means that a scenario where not only your VM becomes unusable (the virtual disk storage is encrypted), but the daily backups you made to the cloud provider’s S3 storage are useless as well, might not be as far-fetched as you think.
So … conclusion ??? If you have VMs at a cloud provider and do daily backups, it might be useful to get the storage for these backups from a different provider than the one where you house your VMs.
Anybody any ideas or remarks on this?
The real issue here is backups vs disaster recovery.
Backups can live on the same network. Backups are there for the day to day things that can go wrong. A server disk is corrupted, a user accidentally deletes a file, those kinds of things.
Disaster recovery is what happens when your primary platform is unavailable.
Your cloud provider getting taken down is a disaster recovery situation. The entire thing is unavailable. At this point you’re accepting data loss and starting to spin up in your disaster recovery location.
The fact they were hit by crypto is irrelevant. It could have been an earthquake, flooding, terrorist attack, or anything, but your primary data center was destroyed.
Backups are not meant for that scenario. What you’re looking for is disaster recovery.
Yes. Fair point.
On the other hand, most of the disaster scenarios you mention are solved by geographic redundancy: set up your backup / DRS storage in a datacenter far away from the primary service. A scenario where all services, in all datacenters managed by a cloud provider, are impacted is probably new.
It is something that, considering the current geopolitical situation (which I assume will only get worse), we had better keep in the back of our minds.
It should be obvious from the context here, but you don’t just need geographic separation, you need “everything” separation. If you have all your data in the cloud, and you want disaster recovery capability, then you need at least two independent cloud providers.
I’ve thought about how I could handle disaster recovery for my homelab environment, but I haven’t come up with any good solutions. For example, if my main concern were being hit by crypto, I couldn’t just recover from a regular backup, since I’m not sure how to make a backup without that backup being encrypted alongside everything else. I mainly just back up everything to my file server, which is then synced to the cloud, so in that setup my cloud backups would be lost as well.
Would you have some starting points on how others handle disaster recovery? I’d like to avoid manually making an offline backup, because inevitably I’d forget to do it, which would make it useless anyway.
What cloud backup solution are you using? A lot of them offer additional protection that would keep a history of your files. You can essentially say “once a week create a point in time recovery of all my files” and then you could recover your files from that point in time.
This usually costs extra, and it makes sense why. They’re essentially keeping extra copies of your data for you.
How that is configured allows you to determine your RPO, or recovery point objective.
https://www.imperva.com/learn/availability/recovery-point-objective-rpo/
So you can decide how much data you’re comfortable losing by determining how often those point in time recovery events happen.
Did that make sense?
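To put a number on the RPO idea, here is a minimal Python sketch. The weekly schedule, the incident time and the function names are made up for illustration; the point is only that the worst-case loss is the gap between the incident and the last snapshot taken before it.

```python
from datetime import datetime, timedelta

def latest_recovery_point(snapshots, incident):
    """Return the newest snapshot taken at or before the incident, or None."""
    usable = [s for s in snapshots if s <= incident]
    return max(usable) if usable else None

def data_loss_window(snapshots, incident):
    """Time between the last usable snapshot and the incident."""
    point = latest_recovery_point(snapshots, incident)
    if point is None:
        raise ValueError("no snapshot predates the incident")
    return incident - point

# Hypothetical schedule: one point-in-time snapshot per week (Jan 1, 8, 15, 22).
weekly = [datetime(2024, 1, 1) + timedelta(weeks=i) for i in range(4)]
incident = datetime(2024, 1, 20, 12, 0)
print(data_loss_window(weekly, incident))  # 5 days, 12:00:00
```

With weekly snapshots the loss can approach 7 days; switching to daily snapshots caps it at roughly 24 hours, at the cost of storing more recovery points.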
It does make sense. Thank you. I appreciate the link!
However, my cloud usage is purely as a proxy/load balancer, as none of my cloud providers hold any actual data. They’re just routing traffic, and all data/processing is on premises. What I’m interested in is how to set up something like what you describe, but on premises as well. From a design standpoint, if I wanted to protect myself from a ransomware attack, obviously my cloud backups would be lost, because at some point during a backup they are a mounted filesystem. So I don’t know how to wrap my head around handling this, purely in terms of storage design, since specific tools I can figure out. How does one create a recovery point, and keep it safe from something like this? Just image the entire file system from a live-booted offline environment? Feels like a chicken-and-egg problem to me.
By definition a disaster recovery solution needs to be geographically separate. You’re protecting yourself from catastrophe, and some of those scenarios include your main location burning down, flooding, being hit by a tornado, etc etc.
So you either need to colocate systems with a friend you trust, purchase colocation services from a provider, or use a cloud service to achieve what you’re looking for and truly have a DR solution.
As far as how to do that, the main idea is to have that point in time available on a system that, even if you get compromised, the backups won’t. The old school method here is to use an external hard drive or a tape device, and physically store that offsite. So like use your regular backup mechanism, and in addition to what it’s doing now schedule a daily/weekly/monthly job that backs up to this other device, and then store that away from your main location.
That’s essentially the idea though, and there are any number of solutions you can use to do it.
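As a rough illustration of that extra scheduled job, here is a small Python sketch. Everything in it is an assumption for the sake of the example: the function name, the dated-folder layout, and the idea that the external drive is simply a mounted path. It copies the latest archive onto the device and refuses to overwrite an existing copy, so an older good copy is not replaced by a newer one that may already be encrypted.

```python
import shutil
from datetime import date
from pathlib import Path

def copy_to_offline_device(archive, device_root, today=None):
    """Copy one backup archive into a dated folder on the external device.

    Existing copies are never overwritten, so an older good copy survives
    even if today's archive turns out to be damaged or encrypted.
    """
    today = today or date.today()
    target_dir = Path(device_root) / today.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(archive).name
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.copy2(archive, target)  # copy2 also preserves timestamps
    return target

# Hypothetical usage from a daily/weekly/monthly scheduled job:
#   copy_to_offline_device("/srv/backups/latest.tar.gz", "/mnt/offline")
```

The protection only holds if the device is disconnected and stored offsite between runs; a drive that stays plugged in is just another mounted filesystem.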
So … conclusion ???
Have backups.
Only 2 copies of your data stored in the same place isn’t enough; you want 3 at minimum, and at least 1 should be somewhere else.
Indeed. Whatever you put in a cloud needs backups. Not only at the cloud provider, but also “at home”.
There has been a case of a cloud provider shutting down a few months ago. The provider informed their customers, but only the accounting departments that were responsible for the payments. And several of those companies’ accounting departments did not really understand the message except for “needs no longer be paid”.
So for the rest of the company, the service went down hard after a grace period, when the provider deleted all customer files, including the backups…
What if the data is leaked/compromised?
That’s why you use encryption.
Backups are usually encrypted by most popular backup programs, either by default or as an option (restic, borg, duplicati, veeam, etc…). So that would take care of someone else getting their hands on your backup data.
I never store my actual files on a cloud service, only encrypted backups.
For local data on my devices, my laptop is encrypted with BitLocker, and my Android phone is encrypted by default. My desktop at home is not, though.
It’s just someone else’s computer. I’ve said this since the beginning.
The issue is not cloud vs self-hosted. The question is “who has technical control over all the servers involved”. If you home-host a server and keep a backup of it on a friend’s network, and your username / password pops up on an infostealer website, you will be in just as much trouble!
If you follow the 3-2-1 backup policy, then unless it’s the end of the world you should be fine.
3 backups, 2 different media types, 1 off-site.
If you’re worried about a cloud provider getting attacked, then have 2 off-site.
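For anyone who wants to sanity-check their own setup against that rule, here is a small Python sketch. The `Copy` class, its fields and the example setup are all hypothetical. One assumption goes beyond the plain rule: off-site copies are counted per operator, so two off-site copies run by the same company count as one, which is the scenario this thread is about.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    name: str
    media: str      # e.g. "ssd", "hdd", "tape", "object-storage"
    offsite: bool
    operator: str   # who has technical control: "me", "provider-a", ...

def check_321(copies, min_offsite=1):
    """Return a list of rule violations; an empty list means the rule holds."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than 2 media types")
    # Assumption: off-site copies under the same operator count only once.
    offsite_operators = {c.operator for c in copies if c.offsite}
    if len(offsite_operators) < min_offsite:
        problems.append(f"fewer than {min_offsite} independent off-site copies")
    return problems

setup = [
    Copy("home server", "ssd", offsite=False, operator="me"),
    Copy("home NAS", "hdd", offsite=False, operator="me"),
    Copy("cloud bucket", "object-storage", offsite=True, operator="provider-a"),
]
print(check_321(setup))                 # []
print(check_321(setup, min_offsite=2))  # ['fewer than 2 independent off-site copies']
```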
Easy, I always mirror my cloud. My setup is: the cloud is external, and in my network there is always an identical copy of everything on a simple SMB NAS.
- My house burns to the ground (or, more simply, the NAS breaks) = online backup.
- The online provider gets hacked = no problem, I have a backup at home.
- The hackers burn my house down at the same time they kill my cloud = well, fuck.
PS. Since most syncs go directly to the cloud, it’s just an rclone cronjob every night to back up everything onto the NAS.
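A sketch of what such a nightly job could look like when driven from Python. The remote name `cloud:` and the NAS paths are placeholders, not the poster’s actual config. One caveat: `rclone sync` makes the destination match the source, so files encrypted or deleted in the cloud would be mirrored onto the NAS too. The `--backup-dir` option moves anything that would be overwritten or deleted into a separate directory instead, which keeps older versions around.

```python
import subprocess
from datetime import date

def build_rclone_cmd(remote, nas_root, today=None):
    """Build an `rclone sync` command that archives replaced/deleted files."""
    today = today or date.today()
    return [
        "rclone", "sync", remote, f"{nas_root}/current",
        # Files that sync would overwrite or delete are moved here instead.
        "--backup-dir", f"{nas_root}/archive/{today.isoformat()}",
    ]

def run_nightly(remote, nas_root):
    """Run the sync; raises CalledProcessError if rclone exits non-zero."""
    subprocess.run(build_rclone_cmd(remote, nas_root), check=True)

# Hypothetical cron entry calling this script every night:
#   run_nightly("cloud:", "/mnt/nas")
```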
haha
“the cloud” does not change the fact that if your data does not reside in 2 physical locations you do not have a backup.
so yes, standard practices that have existed… well, since the beginning, still apply.
Well, the issue here is that even if your backup is physically in a different location (you can ask to have your S3 backup storage hosted in a different datacenter than the VMs), if the servers on which the services (VMs or S3) are hosted are managed by the same technical entity, then a ransomware attack on that company can affect both services.
So, get S3 storage for your backups from a completely different company?
I just wonder to what degree this will impact the bandwidth usage of your VM if, say, you do a complete backup of it every day to a host that is considered “off-premises”.
yeah, you can use another cloud provider as backup… if you do it correctly.
personally, my disaster recovery plans dont include entire offsite VMs. i only care about data in a dr situation. so you send incremental daily backups offsite.
containers have made VMs even more irrelevant/ephemeral so focus on the data.
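To illustrate why incremental backups keep the bandwidth down, here is a minimal Python sketch (function names are made up for the example). It hashes every file, compares against the manifest from the previous run, and returns only the paths that are new or changed, which is all that would need to go offsite that day.

```python
import hashlib
from pathlib import Path

def build_manifest(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # read_bytes loads the whole file; fine for a sketch, not for huge files
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest

def changed_since(previous, current):
    """Paths that are new or whose contents differ from the previous run."""
    return sorted(p for p, h in current.items() if previous.get(p) != h)
```

This is file-level only. Tools like restic and borg deduplicate at the chunk level, so a small change inside a large file does not resend the whole file.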
I assume “data” includes your container configuration files in this strategy?
Those are pretty easy to store off site since they shouldn’t change often.
if you back up your vm data to the same provider as you run your vm on you don’t have an ‘off-site’-backup, which is one criterion of the 3-2-1 backup rule.
I’m more worried about what’s going to happen to all the self-hosters out there whenever Cloudflare changes their policy on DNS or their beloved free tunnels. People trust those companies too much. I also did at some point, until I got burned by DynDNS.
We start paying for static IPs. If cloudflare shuts down overnight, a lot of stuff stops working but no data is lost so we can get it back up with some work.
They’re just creating a situation where people forget how to do things without a magic tunnel or whatever. We’ve seen this with other things, and proof of this is the fact that you’re suggesting you’ll require a static IP while in fact you won’t.
Where I live, many ISPs tie public IPs to static IPs if they are using CG-NAT. But of course there are other options as well. My point was that the other options don’t disappear.
Though I do get the point that Cloudflare aren’t giving away something for nothing. The main reason, to me, is to get hobbyists using it so they start using it (on paid plans) in their work, or otherwise get people to upgrade to paid plans. However, the “give something away for free until they can’t live without it then force them to pay” model is pretty classic in tech by now.
However, the “give something away for free until they can’t live without it then force them to pay” model is pretty classic in tech by now.
Yes, this is a problem and a growing one, like a cancer. These new self-hosting and software development trends are essentially someone reconfiguring and mangling the learning, tools and experience of development and sysadmin work to the point where people are required to spend more than ever for absolutely no reason other than profits.
I am my cloud provider. Don’t have duplicate copies of my server yet so I guess I’m kinda fucked.
Well, based on Samsy’s advice, take a backup of your home-server network to a NAS on your home network. (I do hope that your server segment and your home segment are two separate networks, no?) Or better, set up your NAS at a friend’s house (and require MFA or a hardware security key to access it remotely).
But man, I’ll be able to amend all those TODO items that have been accumulating over the last 12 months and fix all those issues while rebuilding my raid.
I mean that’s only if my GITs aren’t hijacked during the ransomware attack.
And I mean, I’ll probably just push the same config to my server and let it on its merry way again.
Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:
| Fewer Letters | More Letters |
|---|---|
| DNS | Domain Name Service/System |
| IP | Internet Protocol |
| NAS | Network-Attached Storage |
| NAT | Network Address Translation |
4 acronyms in this thread; the most compressed thread commented on today has 5 acronyms.
[Thread #410 for this sub, first seen 8th Jan 2024, 07:45]
Dammit, I came here hoping to see at least one “I have a very special set of skills.” Oh well.
Yeah I’d cut bait, rebuild from latest tapes. But also…
I’d put the corrupted backups in an eye-catching container, like a Lisa Frank backpack or Barbie lunchbox, to put on the wall in my office as a cautionary tale.
I will put “multicloud” on my wishlist.
Looking at it from an infosec point of view, cloud providers are an ideal target. All the customers who have just lost all their data and are now complaining to the cloud provider are the ideal pressure mechanism to get the cloud provider to pay out.
Losing a cloud backup should be fine, because it’s a backup. Just re-up your local backup to a new cloud/second physical location, that’s the whole point of having two.
I don’t see a need to run two concurrent cloud backups.
In this case, it was not you, the customer, that got hacked, but the cloud company itself. The ransomware gang encrypted the disks at server level, which impacted all the customers on every server of the cloud provider.
Yeah absolutely, but to you as an individual, it’s the same net effect: your cloud backup is lost. Just re-up your local backup to a different cloud provider.
I wonder if the specifics of the hack would make backing up elsewhere fail. Possibly by spreading the hack to new machines.
In any case, testing backups is important.
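One way to test a backup is to restore it to a scratch directory and compare it against hashes recorded at backup time. A minimal Python sketch, assuming you kept a mapping of relative path to SHA-256 digest when the backup was made (the function names and that mapping format are made up for the example):

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(expected, restored_root):
    """Compare a test restore against hashes recorded at backup time.

    `expected` maps relative path -> SHA-256 hex digest.
    Returns (missing, corrupted) as two sorted lists of relative paths.
    """
    restored_root = Path(restored_root)
    missing, corrupted = [], []
    for rel_path, want in sorted(expected.items()):
        candidate = restored_root / rel_path
        if not candidate.is_file():
            missing.append(rel_path)
        elif sha256_of(candidate) != want:
            corrupted.append(rel_path)
    return missing, corrupted
```

A restore where files exist but no longer match their recorded hashes is the pattern you would expect if the backup itself had been encrypted.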
I have been thinking the same thing.
I have been looking into a way to copy files from our servers to our S3 backup storage without having the access keys stored on the server (as I think we can assume those will be one of the first things the ransomware toolkits look for).
Perhaps a script on a remote machine that initiates an SSH session to the server and does an “s3cmd cp” with the keys entered from stdin? So far, I have not found out how to do this.
Does anybody know if this is possible?
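Not an answer to the stdin question as asked, but a different approach that avoids the problem: let the backup machine pull the data instead of having the server push it. The sketch below (Python; host name, paths and function names are placeholders) streams a tar archive over SSH to the backup machine. The upload to S3 then happens from that machine, so the access keys never touch the server.

```python
import shlex
import subprocess

def build_pull_cmd(host, remote_dir):
    """ssh command that streams a gzip-compressed tar of remote_dir to stdout."""
    # ssh hands the remote command to a shell, so quote the directory.
    remote = f"tar -czf - -C {shlex.quote(remote_dir)} ."
    return ["ssh", host, remote]

def pull_backup(host, remote_dir, local_archive):
    """Run on the backup machine: pull the server's data into a local archive."""
    with open(local_archive, "wb") as out:
        subprocess.run(build_pull_cmd(host, remote_dir), stdout=out, check=True)

# Hypothetical usage on the backup machine:
#   pull_backup("backup@server.example", "/srv/data", "/backups/data.tar.gz")
# The S3 upload is then a separate step on this machine, with keys stored only here.
```

The trade-off is that the backup machine now holds both an SSH key for the server and the S3 keys, so it becomes the system to lock down.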