Using hard drives for data backup was unimaginable a few
years ago, simply because their cost compared to tapes was prohibitive. Things
have changed now, as hard drive capacities have shot up and costs have come down
dramatically. Disk still isn't cheaper than tape, but it has become a serious
contender for certain applications. The reasons are fairly evident. Hard drives
are fast, so they can perform backup and recovery jobs much faster than tape.
They're also more reliable: the chances of an error during an overnight backup
job are far higher on tape than on disk.
Similarly, recovery operations, especially for small
amounts of data or even individual files, are much faster on disk than on tape.
Note that we said small amounts of data: for large amounts, tape drives are just
as efficient, because they can attain very high throughput when streaming large
volumes of data. So disk-based backup is definitely a technology to watch and
one of the ways ahead for storage. A lot of work is happening in this area, but
prices still haven't come down to a level where mass deployments can happen.
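
To see why tape catches up on large restores, here is a minimal sketch of a
restore-time estimate: a fixed positioning delay plus the transfer time. The
function name and every figure in it are our own illustrative assumptions, not
measurements of any real drive.

```python
def restore_seconds(size_mb, throughput_mb_s, positioning_s):
    """Estimate restore time: fixed positioning delay plus transfer time."""
    return positioning_s + size_mb / throughput_mb_s


# Hypothetical figures, chosen only to show the shape of the trade-off.
# Both media get the same streaming rate; only the positioning delay differs.
DISK = {"throughput_mb_s": 60.0, "positioning_s": 0.01}
TAPE = {"throughput_mb_s": 60.0, "positioning_s": 90.0}  # load tape, wind to file

for size_mb in (1, 100, 100_000):
    disk_t = restore_seconds(size_mb, **DISK)
    tape_t = restore_seconds(size_mb, **TAPE)
    print(f"{size_mb:>7} MB  disk {disk_t:9.1f} s  tape {tape_t:9.1f} s  "
          f"tape/disk {tape_t / disk_t:7.1f}x")
```

With these assumed numbers, the positioning delay dominates a small restore, so
disk wins by a wide margin. On a very large restore the transfer time dominates
and the two come out nearly equal.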
Before going any further, we'd like to differentiate
between backup and archival. Backup is for data that may be required in the
near future, say within a few days or weeks. Archival, on the other hand, is
meant for storing data for a long period of time, running into years. Tape
still remains the most suitable medium for archival. In fact, most disk-based
backup systems also ultimately store everything on tape for archival. So the
first level of backup is from disk to disk, and the second level is from disk
to tape. This concept is known as disk to disk to tape.
There are two ways of doing disk to disk to tape backup. One is to let the
backup software do the job. In this case, the software treats the disk array as
a tape device and performs its backup operations on it. Though this sounds like
a fairly simple process, it poses several challenges. One is the difference in
the very nature of disks and tapes. Tape uses sequential access, while hard
drives use random access, which makes it very difficult for the backup software
to track what data is backed up where. Likewise, multiple backup jobs cause the
disk to become fragmented. This can be a major problem for large-volume
backups, because defragmentation can take ages on high-capacity media.
The other way is to do it in hardware. Storage hardware vendors have taken on
these challenges of software-based disk backup and created their own disk
libraries that emulate tape. These are called hardware-based disk libraries or
virtual tape libraries. They're virtual because they're disks emulating tape,
so the backup software sees the array only as a tape library. Since the array
is actually made up of disks, you get all the benefits of hard disks that we
mentioned.
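
The two levels of disk to disk to tape can be sketched as a simple staging
policy: backups land on disk first, and sets older than a retention window move
on to tape. This is only an illustration under our own assumptions. The
`BackupSet` class, the `migrate_to_tape` function and the 14-day window are
hypothetical and do not come from any particular backup product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BackupSet:
    name: str
    created: date
    location: str = "disk"  # "disk" is the first level, "tape" the second


def migrate_to_tape(backups, today, keep_on_disk_days=14):
    """Mark backup sets older than the disk retention window as moved to tape.

    Returns the names of the sets migrated in this run.
    """
    cutoff = today - timedelta(days=keep_on_disk_days)
    moved = []
    for backup in backups:
        if backup.location == "disk" and backup.created < cutoff:
            backup.location = "tape"  # stand-in for the real disk-to-tape copy job
            moved.append(backup.name)
    return moved


catalog = [
    BackupSet("mail-full", date(2024, 1, 1)),
    BackupSet("mail-incr", date(2024, 1, 20)),
]
print(migrate_to_tape(catalog, today=date(2024, 1, 22)))  # ['mail-full']
```

Recent sets stay on disk, where small restores are quick, while older sets end
up on tape for archival.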
While this does sound promising, there are a number of
things to check before going for such a solution. For one, you need to check
which disk arrays the virtual tape library can back up. Does it allow you to
connect to other disk-based systems such as NAS boxes? How many different types
of tape can it emulate? Does it use the native tape format to back up the data?
How many different backup applications, tape drives and libraries is it
compatible with? In the end, it has to fit seamlessly into your existing
storage system without requiring too many changes.
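
One rough way to run through the questions above is to list what your
environment requires and compare it with what a candidate library claims to
support. The sketch below does this with plain set differences. The category
names and entries are placeholders we made up, not real products or formats.

```python
def compatibility_gaps(required, supported):
    """Return, per category, the required items the candidate does not support."""
    gaps = {}
    for category, needed in required.items():
        missing = set(needed) - set(supported.get(category, ()))
        if missing:
            gaps[category] = sorted(missing)
    return gaps


# Placeholder names only: substitute your own environment and the vendor's list.
required = {
    "source_arrays": {"array-A", "nas-box-1"},
    "emulated_tape_types": {"tape-type-X"},
    "backup_applications": {"backup-app-1"},
}
candidate = {
    "source_arrays": {"array-A"},
    "emulated_tape_types": {"tape-type-X", "tape-type-Y"},
    "backup_applications": {"backup-app-1", "backup-app-2"},
}
print(compatibility_gaps(required, candidate))  # {'source_arrays': ['nas-box-1']}
```

An empty result would mean the candidate covers everything on your list. Any
entry that remains is something to raise with the vendor.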