Item-level vs. image-level back-up: Why it’s best to use a combo
- 11 August, 2020 20:32
There are two very different ways to back up a computer: item-level back-up and image-level back-up, and both methods have been used by IT pros for decades. Each comes with its own advantages and disadvantages, which is why most environments use a combination of the two.
An item-level back-up backs up discrete collections of information that are addressed as individual items, and the most common type of item is a file. In fact, if this article were being written several years ago, this would most likely be called file-level back-up.
The other type of item that might be included in an item-level back-up is an object in an object storage system. For many environments, objects are similar to files in that most companies using object storage are simply using it to hold onto what would otherwise be files.
But since they are being stored in an object storage system, they are not files, as files are stored in a file system. The contents are often the same, but they get a different name because they are stored differently.
You typically perform item-level back-up if you are backing up a file server, a Windows or Linux server, or a virtual machine where the back-up agent is running inside the server/VM itself.
The back-up agent decides which files to back up by first looking at the file system, such as C:\. If you are performing a full back-up, it will back up all the files in the file system. If you are performing an incremental back-up, it will back up only the files that have changed since the last back-up.
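As a rough illustration of the selection step described above, the following Python sketch walks a directory tree and picks files for either a full or an incremental back-up. The function name select_files and the use of file modification time as the change signal are assumptions made for this example; real back-up agents may rely on archive bits, change journals or their own catalogues instead.

```python
import os

def select_files(root, last_backup_time=None):
    """Return the files under root that a back-up run would include.

    last_backup_time=None models a full back-up (every file is selected).
    Otherwise only files modified after that timestamp are selected,
    which models an incremental back-up. (Hypothetical helper.)
    """
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if last_backup_time is None or os.path.getmtime(path) > last_backup_time:
                selected.append(path)
    return sorted(selected)
```

Calling select_files(root) models a full back-up, while passing the timestamp of the previous run returns only the files modified since then.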
You are also performing an item-level back-up if you are backing up your object storage system, such as Amazon S3, Azure Blob Storage, or Google Cloud Storage. You may wonder why you’d want to back up these services since they are typically replicated to multiple locations for data resiliency.
However, replication within object storage is typically designed to survive system outages but is not necessarily designed to survive an attack from the outside that deletes the object itself. If an object is deleted – whether on purpose, accidentally or maliciously – the deletion will be replicated across all copies. The only way to protect against this is to perform an item-level back-up of your object storage to another account.
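To make the reasoning above concrete, here is a toy Python model of why replication alone does not protect against deletion. ReplicatedBucket and backup_bucket are hypothetical names invented for this sketch; they do not correspond to any cloud provider's API, and real services offer additional safeguards that this model ignores.

```python
class ReplicatedBucket:
    """Toy object store that mirrors every operation to all replicas."""

    def __init__(self, replicas=3):
        self.replicas = [dict() for _ in range(replicas)]

    def put(self, key, data):
        for replica in self.replicas:
            replica[key] = data

    def delete(self, key):
        # The delete is replicated just like a write, so no copy survives.
        for replica in self.replicas:
            replica.pop(key, None)

    def get(self, key):
        return self.replicas[0].get(key)


def backup_bucket(bucket):
    """Copy every object into a separate, independent store (the back-up)."""
    return dict(bucket.replicas[0])
```

In this model, a back-up taken before the delete still holds the object, while every replica loses it at once.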
The advantage of an item-level back-up is that it is very easy to understand. Install a back-up agent in the appropriate place and it will examine your file or object storage system, find all of the items there and back them up at the appropriate time.
Restoring the entire file or object storage system requires first restoring the full back-up, and then performing restores from each subsequent incremental back-up.
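The restore order described above can be sketched in a few lines of Python. This is a simplified model that assumes each back-up is a mapping from item name to contents and that an incremental marks a deleted item with None; both conventions are assumptions for illustration, not how any particular product stores its data.

```python
def restore(full_backup, incrementals):
    """Rebuild the latest state from a full back-up plus ordered incrementals.

    full_backup:  dict mapping item name -> contents
    incrementals: list of dicts, oldest first; a value of None is a
                  hypothetical marker meaning the item was deleted.
    """
    state = dict(full_backup)          # step 1: restore the full back-up
    for incremental in incrementals:   # step 2: apply each incremental in order
        for name, contents in incremental.items():
            if contents is None:
                state.pop(name, None)
            else:
                state[name] = contents
    return state
```

The order matters: applying the incrementals out of sequence could leave an older version of an item on top of a newer one.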
An image-level back-up is the result of backing up either a physical or virtual device at the block level. This is why – depending on your frame of reference – image-level back-ups are also referred to as drive-level, volume-level, or VM-level back-ups.
The device could be storing a variety of information types, including a standard file system, block storage for a database, or even the boot volume for a physical or virtual machine. Within an image-level back-up, you're backing up the building blocks of the file system, rather than backing up the files themselves.
Prior to the advent of virtualisation, image-level back-ups were rare because backing up the physical drive was a lot harder and required unmounting the file system while you backed up the blocks.
Otherwise you risked a contaminated back-up where some of the blocks are from one point in time and some of the blocks are from another point in time. Snapshot technology, such as that found in the Windows Volume Shadow Copy Service (VSS) or VMware snapshots, solved this underlying problem.
Backing up at the volume level became much more popular once VMs came on the scene. Image-level back-ups allow you to perform a back-up of a VM at the hypervisor level, where your back-up software runs outside the VM and sees the VM as one or more images (e.g. VMDK files in VMware).
Backing up at the image level has a number of advantages. First, it provides faster back-ups and much faster restores. Image-level back-ups avoid the overhead of the file- or object-storage system and go directly to the underlying storage.
Restores can be much faster because file-level back-ups require restoring each file individually, which requires creating a file in the file system, a process that comes with quite a bit of overhead.
This problem really rears its ugly head when restoring very dense filesystems with millions of files, where the process of creating the files during the restore actually takes longer than the process of transferring the data into the files. Image-level restores do not have this problem because they are laying the data straight down at the block level.
Once the changing block issue was addressed with snapshots, back-up systems were presented with the second biggest challenge of image-level back-ups: incremental back-ups. When you are backing up at the drive, volume, or image level, every back-up is effectively a full back-up.
For example, consider a VM represented by a VMDK file. If that VM is running and a single block in the VM changes, the modification time on that image will show that it has changed. A subsequent back-up will then back up the entire VMDK file, even though only a few blocks of data might have changed.
This challenge has also been solved in the VM world via changed block tracking (CBT), a mechanism that keeps track of when the previous back-up was created and which blocks have changed since then. This allows an image-level back-up to perform a block-level incremental back-up by asking which blocks have changed, and then copying only those blocks.
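The block-tracking idea described above can be sketched as a small Python model. TrackedImage, its methods and the tiny BLOCK_SIZE are all hypothetical and chosen for readability; this is not the hypervisor's actual interface, only an illustration of recording which blocks were written and copying just those on the next back-up.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small, for illustration only


class TrackedImage:
    """Toy disk image that remembers which blocks changed since the last back-up."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks
        # Before the first back-up every block counts as changed,
        # so the first back-up is a full one.
        self.changed = set(range(num_blocks))

    def write(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[index] = data
        self.changed.add(index)  # record the change as it happens

    def backup(self):
        """Return only the blocks changed since the previous back-up, then reset."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta
```

In this model the first call to backup() returns every block, and a later call after a single write returns only that one block, which is the behaviour a block-level incremental relies on.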
This leaves us with one final disadvantage of backing up at the image level, and that is the lack of item-level recovery. Customers do not typically want to restore an entire VM; they want to restore a file or two within that VM.
How do you restore a single file from a VM when you backed up the entire VM as a single image? This is also a problem that has been solved by many back-up software companies.
For example, in the case of a VMware VM, back-up products understand the format of VMDK files, which allows them to do a number of different things. Some back-up products allow you to mount the original VMDK files as a virtual volume and grab the files that you need from there, while others can grab just the blocks that are necessary to restore the file you've asked for.
The best of both worlds
This means that at this point in the back-up and recovery industry, most customers are performing image-level back-ups of all of their VMs while still retaining the ability to perform both incremental back-ups and item-level restores. This approach also supports block-level incremental back-ups, which are actually much more efficient than item-level incremental back-ups because they copy only the blocks that changed rather than every file that changed.
Backing up at the VM level also comes with the ability to easily restore the VM as a single image. This makes what we used to call bare-metal recovery so much easier than it used to be. You get all of the bare-metal recovery capabilities that you need without having to jump through hoops to address the changing block issue.
We even have image-level back-ups of physical Windows servers, since most back-up products use Windows VSS to create a snapshot of each file system prior to backing it up. This allows the back-up software to back up at the image level without risking data corruption.
Now that you understand the advantages and disadvantages of each approach, you can make an educated decision as to which is appropriate for you. Most people backing up VMs choose image-level back-ups, and most people backing up file servers use item-level back-ups. Deciding what to do with physical servers might take a bit more research.