Backing up is a must for all enterprises, large and small. But backing up is
only part of the story. There is also recovery, which, in the long run, is far
more important. And then you have the need to archive what you've backed up,
so that you can pull out data as and when you need it later. What technology do
you use for performing all these functions? Had you asked this question about
two years ago, you would have got only one answer: tape. Today, however, things
look different, with disk emerging as a strong contender. It's no wonder
that the question that plagues most enterprises is whether to choose disk or
tape, or a combination of the two. And all the hype created by
vendors around both doesn't help much either. In this story, we try to remove
the hype and analyze both disk- and tape-based technologies in detail to help
you decide on the best option.
So what is the reason for this ongoing battle between tape and disk? Is it
just that tape is probably as old as the concept of storage itself, or is it
cost, speed, or time? Well, on the cost front, disk is definitely becoming
cheaper, but it's unlikely to become as cheap as tape, so cost can't be the
reason for tape to fade away. In fact, it is one reason why people will
continue to stick with tape for some time to come. On the speed front, disk
definitely has the edge, which translates directly into the time taken for
backup and restore. So the time advantage is with disk. How does one resolve
this dichotomy?
Before choosing any technology, you need to first assess your own
requirements. If you're backing up data, what kind of backup do you want to
do? Today, you have several options in hand: online, offsite, near-line (near
online), incremental...mirroring, archival and the list goes on. Then put down
the advantages and disadvantages of disk and tape against the options that are
applicable to your organization. The immediate answers are that disk is more
suitable for online, incremental, and mirroring based backups, while tape is
more suitable for offsite, archival and near-line backup.
Is it anti-incumbency?
Despite lots of talk, nobody seems to be taking the bold step of doing away
with either of them. Is it because tape has been the traditional medium of
choice, has been around for ages, and has a good record for longevity and
reliability if handled and managed properly? Or is it because of
anti-incumbency, the inertia that resists change, that has crept in?
Well, not exactly. Faster performance is one reason for everyone to move from
tape to disk for backups, but it's not the only reason. Magnetic tape is
also fragile: it can break easily, is vulnerable to environmental factors such
as humidity and heat, and loses tension. Tape drive heads get dirty and have to
be cleaned, and for such reasons data isn't always restored perfectly.
On the other hand, disks are more durable, last longer, withstand more
overwriting, and you don't need to clean any heads. Disk-based backups are also
easier to manage. Disk backup systems include management
tools, often browser-based, that let you easily configure settings and check
status from anywhere.
What's more, tape can only be read and written sequentially, which makes the
whole process slow unless the drive is fed data fast enough to keep it moving.
So a technique was needed that could write to tape fast enough. Consequently,
multiplexing (also called interleaving) was born. Many backup applications use
multiplexing to send backup streams from multiple file systems or multiple
clients simultaneously to one tape drive. This allows the tape drive to 'stream'
at its rated speed. Many tape drives today would never write anywhere near their
rated speed without interleaving. But restoring data from a multiplexed tape can
take much longer, because one client's data is scattered among everyone else's.
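The trade-off can be illustrated with a toy model. The sketch below is only an illustration under simple assumptions, not how any particular backup product works: the function names `multiplex` and `restore`, the client names, and the idea of fixed-size "blocks" are all hypothetical.

```python
from itertools import zip_longest

def multiplex(streams):
    """Interleave blocks from several backup streams onto one sequential 'tape'.

    streams: dict mapping a (hypothetical) client name to its list of blocks.
    Returns a list of (client, block) tuples in the order written to tape.
    """
    names = list(streams)
    tape = []
    # Take one block from each client in turn, round-robin.
    for row in zip_longest(*(streams[name] for name in names)):
        for name, block in zip(names, row):
            if block is not None:  # this client has run out of blocks
                tape.append((name, block))
    return tape

def restore(tape, client):
    """Read the tape from the start and pick out one client's blocks.

    Returns (blocks, blocks_read), where blocks_read is how far along the
    tape the drive had to go to reach the client's last block.
    """
    blocks = []
    blocks_read = 0
    for position, (name, block) in enumerate(tape, start=1):
        if name == client:
            blocks.append(block)
            blocks_read = position
    return blocks, blocks_read

# Three illustrative clients with three blocks each.
streams = {
    "fileserver": ["f1", "f2", "f3"],
    "mailserver": ["m1", "m2", "m3"],
    "database": ["d1", "d2", "d3"],
}
tape = multiplex(streams)
blocks, blocks_read = restore(tape, "mailserver")
print(blocks, blocks_read)
```

In this toy run the drive passes over 8 of the 9 blocks on the tape to recover the mail server's 3 blocks, which is the restore penalty of multiplexing in miniature.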
As a result, most organizations use both backup devices (tape and disks) for
better results. Once last night's backups are sent to disk, they can then be
easily copied to tape and sent offsite. (These tapes would also not need to be
multiplexed, as the copy is being made locally.)
Here lies a dilemma. When you use tapes, you need multiplexing to be able to
finish backups in time, but multiplexing makes it difficult to restore data
easily.
Glossary
AoE (ATA over Ethernet): Network storage standard for mounting disks on the network.
AIT (Advanced Intelligent Tape): Helical tape scan technology
DLT (Digital Linear Tape): A form of magnetic tape technology
Interleaving: A process of arranging parts of one sequence of
LTO (Linear Tape Open): Family of Open tape standards jointly
SDLT (Super DLT): It provides native capacities of 110 GB, 160
UDO (Ultra Density Optical): Disks storage capacity starting
VTL (Virtual Tape Library): It is an intelligent disk-based
To avoid this dilemma, you need a medium that does not have to be kept
streaming, and so does not need interleaving. But what could that be? Well,
“the answer is blowing in the wind”: disk.
Mixed approach
Disk arrays are used for regular storage, tape is relegated largely to
archiving, and batch backup is done using autoloaders. But there are two other
types of backup in use: D2D (disk to disk) and D2D2T (disk to disk to
tape).
D2D: In this type of backup, data is first kept online on a primary disk
for a short period of time, and is later moved to secondary storage.
D2D2T: In this type, there's a disk array for primary storage and a
secondary array or a VTL that emulates a tape library. The data is stored in the
disk array for primary storage and is then sent to the secondary array. After
30-90 days the data is backed up to tape. This is a type of tiered storage in
which you can retrieve data online for a certain period of time, after which it
is offloaded to be archived at a remote location.
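The D2D2T flow can be sketched as a simple age-based rule. This is a minimal illustration, not a vendor's policy engine: the function name `storage_tier` and the 7-day primary retention are assumptions, while the 90-day cut-off is taken from the upper end of the 30-90 day range mentioned above.

```python
from datetime import date

def storage_tier(backup_date, today, primary_days=7, secondary_days=90):
    """Return the tier where a backup of the given age would live.

    primary_days is an assumed value chosen for illustration;
    secondary_days uses the upper end of the 30-90 day range.
    """
    age = (today - backup_date).days
    if age < 0:
        raise ValueError("backup_date is in the future")
    if age <= primary_days:
        return "primary disk array"
    if age <= secondary_days:
        return "secondary array or VTL"
    return "tape (offsite archive)"

today = date(2024, 3, 1)
for made_on in (date(2024, 2, 28), date(2024, 1, 1), date(2023, 3, 1)):
    print(made_on, "->", storage_tier(made_on, today))
```

Changing `secondary_days` is the lever that matters here: the longer it is, the more data stays retrievable online before it moves to tape.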
But in today's scenario, organizations need to keep more data online, as
there are new regulations, such as Sarbanes-Oxley in the US and the health
regulations governed by HIPAA, that require data to be kept online and to be
accessible quickly for longer periods of time without being changed. Here, the
requirement for quick, hands-on access can be met only when the data is kept
at online or near-online locations and not stored on tapes offsite.
Size and backup window
Most backups are done overnight or at weekends, because that gives a larger
backup window. The backup window is the time between the start of a backup
process and when it ends. Growing data volume is the biggest danger that
enterprises face today. Most organizations want to keep almost everything for
longer periods of time, for a rainy day, and so need ever more storage
capacity.
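Whether a job fits its backup window is simple arithmetic: data volume divided by sustained throughput. The sketch below assumes decimal units (1 GB = 1000 MB) and a constant throughput; the function names and the sample figures (2 TB, 80 MB/sec, an 8-hour window) are illustrative assumptions, not measurements.

```python
def backup_hours(data_gb, throughput_mb_per_s):
    """Hours needed to move data_gb at a constant throughput (1 GB = 1000 MB)."""
    seconds = data_gb * 1000 / throughput_mb_per_s
    return seconds / 3600

def fits_window(data_gb, throughput_mb_per_s, window_hours):
    """True if the backup finishes within the available window."""
    return backup_hours(data_gb, throughput_mb_per_s) <= window_hours

# Illustrative figures only: 2 TB overnight at a sustained 80 MB/sec.
print(round(backup_hours(2000, 80), 2))   # about 6.94 hours
print(fits_window(2000, 80, 8))           # True: fits an 8-hour night
print(fits_window(4000, 80, 8))           # False: doubled data overruns it
```

The last line shows why growing volume is the real danger: the window stays fixed while the data doubles, so the only options are a faster target or a smaller (incremental) backup.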
Disk is now so economical that you could actually keep all of your cyclical
full and incremental backups on disk. Day-to-day restores can then be fulfilled
almost instantaneously from disk. In such cases, tape needs to be used only for
disaster recovery and archival restores.
So one thing to remember is that, at least in the near future, tape will not go
away completely, since there will always be applications where cost per bit is
90% of the buying criteria. But as backup compression gets built into more
products, the economic advantage of tape will gradually decline. So if tape
is to survive in this dynamic environment, it needs to change, not only
in terms of increasing capacities, but also by moving to a more resilient
technological platform.
Need for change
Although the capacity per cartridge of linear devices is projected to increase
with time, it is unlikely to pose an aggressive threat to disk
unless there are major technology breakthroughs. In fact, there's a limit
to the areal recording densities that the current linear tape formats can
support. At this rate, the capacity per cartridge cannot
exceed an uncompressed capacity of a few TBs. This can be countered only when a
new class of tape technology is developed, one that would leapfrog current
technology limitations to deliver capacities and performance well above the
capabilities of current tape formats. It would need some technology that raises
areal density (the amount of data stored per square inch) much further, as
perpendicular magnetic recording does in disks. This new technology platform
would then be able to keep pace with hard disk drive advances. In addition, a
new tape technology must be a sound blend of capacity, performance, and
reliability, with an achievable roadmap and multiple sources of supply. Only
then will it be able to fully meet customer needs, win industry acceptance and,
thus, assure the future of tape within the storage hierarchy.
Expected turnarounds
The makers of blue-laser disks are eyeing the market for enterprise archiving
applications. This space is currently dominated by magnetic tape and, to some
extent, disk arrays. The new-generation media support 2x speed, which is a data
transfer rate of 72 Mbit/sec, making the disks suitable for video recording,
data storage and file backup.
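To put the 72 Mbit/sec figure in storage terms, it helps to convert bits to bytes. The sketch below is plain unit arithmetic; the helper names are made up for illustration, and the 25 GB disk capacity used in the example is an assumption, not a figure from this story.

```python
def mbit_to_mbyte_per_s(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8

def minutes_to_fill(capacity_gb, mbit_per_s):
    """Minutes to write capacity_gb at a constant rate (1 GB = 1000 MB)."""
    seconds = capacity_gb * 1000 / mbit_to_mbyte_per_s(mbit_per_s)
    return seconds / 60

rate = mbit_to_mbyte_per_s(72)
print(rate)                               # 9.0 MB/sec
print(rate * 3600 / 1000)                 # 32.4 GB per hour
print(round(minutes_to_fill(25, 72), 1))  # assumed 25 GB disk: about 46.3 minutes
```

At roughly 9 MB/sec, the rate is adequate for archiving but far too slow for a nightly backup of terabytes.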
Like magnetic tape, optical disks can be removed and stored offsite for
decades. Their main disadvantage is capacity, which doesn't come anywhere close
to that of tape cartridges. But unlike tapes, with
optical disks, searches can be performed at random at sub-second speeds, which
is why vendors are pitching their optical disk technologies as competitors to
tape. There are already a few products in the market that are based on a new UDO
(Ultra Density Optical) laser optical disk format that offers 30 GB per platter.
The archival appliance from Plasmon is one such product, with capacities from
960 GB to 19 TB. As per the roadmap laid out for optical disks, companies plan
to release a 60 GB optical disk by 2007, followed by 120 GB and 240 GB versions
in 2009 and 2011, respectively. With all these coming up, you won't need tapes
for restoring individual records. This is because the disks
will not only be of almost the same capacity but will also be near-line
accessible. Blu-ray disks are also gearing up to be used for archiving data,
first by entering the SOHO space and then large corporates. All this is slated
to become reality soon.
Different organizations have different objectives when they have to back up
data.
In fact, backup and retrieval are not the same as continuous availability of
the data. So you should not lock yourself into one technology solution. Instead,
go for what suits you best for each particular need and scenario. For instance,
if you have data that is seldom accessed, tape is an ideal choice, but if all
you want is a faster backup in a smaller time window, disk may have the edge.