Overview

Our client came to us after letting go of their previous IT provider for losing critical company data.  The supposedly robust backups that provider had designed did not work when put to the test. Because the backups were poorly configured, the data was lost forever.  We needed to redesign their system and build their faith in our backups for the future.

Client:
Confidential
Industry:
Architecture and Design

Challenge

While considering a new backup system, we discovered that we could actually make a lot of other improvements at the same time.  The old backup system was just a drive connected to their file server.

The configuration we inherited was a weekly clone of the file server’s contents to that drive, which fell short in several ways.  First, the setup gave no indication of whether a clone had succeeded.  Someone had to go in manually each week and confirm that the files had really copied over.  Unfortunately, that check was not done, and the result was catastrophic data loss.
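The weekly manual check described above is the kind of task a script can do. As an illustration only (not the tooling used on this project), here is a minimal sketch that compares every file on the source against the clone by SHA-256 hash. The function names and the "missing"/"differs" labels are our own assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_clone(source: Path, clone: Path) -> list[str]:
    """List files that are missing from the clone or whose contents differ."""
    problems = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue  # skip directories
        rel = src_file.relative_to(source)
        dst_file = clone / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every source file has an identical counterpart in the clone; anything else is a reason to investigate before the backup is needed.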

As designed, the backup was also limited to a snapshot of the file system that could be up to a week old.  This posed an alarming scenario if a user were to start a big project on Monday, work on it all week, and then lose it on Friday.  The weekly snapshot would have missed this project entirely, and the business would lose time and labor.  This represents an insufficient RPO (recovery point objective): the maximum amount of recent work, measured in time, that the business can afford to lose.  In practice, the actual loss is the time between the last useful backup and the failure.
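The Monday-to-Friday scenario above is simple date arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it assumes the worst case is a failure just before the next scheduled backup.

```python
from datetime import datetime, timedelta


def work_lost(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything written after the last good backup is lost."""
    return failure - last_backup


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case loss equals the backup interval, so it must not exceed the RPO."""
    return backup_interval <= rpo
```

For example, with a clone taken on a Sunday night and a failure the following Friday afternoon, `work_lost` returns most of a working week. A weekly interval cannot meet a 24-hour RPO, while a daily one can.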

Additionally, clones make poor backups.  A clone holds only a single snapshot, taken at the time of backup.  Each week’s clone overwrote the previous week’s, so nothing older than the last backup could ever be recovered.  Keeping multiple versions of your business’ data allows for recovery even when one version is corrupted.
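To make the difference concrete, here is a small hypothetical sketch (the function name and the dates are assumptions for illustration). It picks the newest snapshot taken before a file was corrupted. With a single overwritten clone there is often nothing to pick; with dated versions there is.

```python
from datetime import date
from typing import Optional


def latest_clean_snapshot(snapshots: list[date], corrupted_on: date) -> Optional[date]:
    """Return the newest snapshot taken strictly before the corruption date."""
    clean = [s for s in snapshots if s < corrupted_on]
    return max(clean) if clean else None
```

If a file is corrupted on a Wednesday and the only clone was taken the following Sunday, the function returns `None`: the clone faithfully copied the corrupted file. A set of daily versions would still contain Tuesday's copy.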

Another major issue was that the previous IT used inadequate storage: USB drives.  Consumer USB drives are not made to withstand the constant writes of server data, and they tend to die when overused.  Flash-based USB drives in particular tolerate only a limited number of write cycles per cell, often quoted in the thousands, before they wear out.  As set up, those drives also had no way of notifying anyone that they were having issues, so data loss was always unexpected.  It is essential to have systems that monitor your storage; in this case, that was never done.

The final issue was that we needed to get the files off-site, so that the data would be preserved even if the servers were physically damaged. Previously, the data was vulnerable to something as simple as a water leak in the office, which could cause untold and entirely preventable damage.

Goal

Redesign the backup system and allay any future data concerns.

Solution

  1.  Automated Backup Monitoring

We immediately put in a low-cost backup server that can take proper backups and notify us of any issues.  This solved several problems at once.

First, we could take daily backups and keep them versioned so that we could always go back in time to grab data. Given the size of the dataset we were working with, the server could hold up to four years of history.  Of course, we had to wait four years to build up the library, but at least we knew that we were protected from the first date of installation.

Next, we ensured that the unit would tell us every time it took a backup and whenever it had any issues.  This way, if we do not get the success email, we know to investigate the unit.  Also, if a hard drive needs to be replaced, which happens naturally over time, we get advance notice, giving us time to handle it.
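The "no success email means investigate" rule can also be checked automatically, so that silence itself raises an alarm. The following is a minimal sketch under stated assumptions: the function name, the 24-hour interval, and the two-hour grace period are hypothetical, not the settings of the unit we installed.

```python
from datetime import datetime, timedelta
from typing import Optional


def backup_status(
    last_success: Optional[datetime],
    now: datetime,
    expected_interval: timedelta = timedelta(hours=24),  # assumed daily schedule
    grace: timedelta = timedelta(hours=2),  # assumed allowance for slow runs
) -> str:
    """Report OK only if a success was recorded within the expected window."""
    if last_success is None:
        return "ALERT: no successful backup has ever been reported"
    age = now - last_success
    if age > expected_interval + grace:
        return f"ALERT: last successful backup was {age} ago"
    return "OK"
```

Run on a schedule from a machine other than the backup server, a check like this catches the case where the unit is too broken to send any email at all.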

  2.  Backup 3-2-1

Backup 3-2-1 is a framework for how businesses should think about their backups.  You should have 3 copies of your data, on 2 different types of media, with 1 of those copies off-site.
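The rule is easy to express as a check. In the sketch below, the `Copy` class, its field names, and the example labels are all hypothetical; "off-site" is modeled as the copies spanning at least two distinct locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "server disk", "usb drive", "cloud"
    location: str  # e.g. "office", "cloud provider"


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """3 copies, on 2 different media, with at least 1 copy in another location."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = len({c.location for c in copies}) >= 2
    return enough_copies and enough_media and has_offsite
```

A file server plus a USB clone in the same office fails on two counts: only two copies, and both in one location. Adding a third copy held somewhere else satisfies the rule.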

The firm already had an excellent and highly secure cloud storage service, but they weren’t properly utilizing it.  To satisfy our backup framework, we simply moved their main dataset from the file server to their cloud storage.  With the primary copy now living in the cloud, the roles flip: the backup server in the office sits in a different location from the primary data, so it is effectively the off-site backup, while the cloud storage service’s excellent retention serves as the “on-site” copy kept alongside the primary.

Finally, we attached an enterprise-grade drive that could withstand daily writes to the server and configured a second backup system to comply with Backup 3-2-1.  We also enabled file system-level replication to dramatically shorten recovery time, which is what lets us meet the RTO (recovery time objective): the target for how quickly data must be restored after a failure.

Results

With a comprehensive backup and recovery plan in place, this company is now protected against the kind of loss that brought them to us.  Daily backups ensure an RPO of 24 hours, and their RTO is under 30 minutes, thanks to lightning-fast replication recovery.

During our redesign, we leveraged our experience to solve the initial problem and then improved their systems with proper configurations.

For any firm that is unsure of the quality of its backups, RPO and RTO are critical to understanding your setup.  Ask your IT where your firm stands; you may be surprised!

Problem SOLVED