Managing data is a tough task, and keeping pace with ever-increasing data
volumes is a cumbersome job for IT admins in every organization. The data
managed within a typical enterprise grows by over 50% per year, and IT admins
must make sure it is managed and stored effectively without any increase in
their IT budgets. Data management issues commonly faced within an organization
include:
1. Storage and management of email, which is a critical business application.
2. Management of the organization's data centers, which are active 24x7. As
more data is stored in these data centers, completing backups within a limited
time window becomes another daunting task for IT administrators.
Direct Hit!
Applies To: IT managers, storage managers
USP: Tech to manage ever-increasing data within organizations effectively
Google Keywords: Data management techniques
The crux of the modern data management challenge is that this ever-growing
data must be retained for longer and backed up in less time, without any
considerable increase in IT budgets. To manage this, organizations need not
only to look for new technology innovations but also to adopt them to gain a
competitive edge in the market.
The Innovations
A number of disruptive data management techniques might not sound like great
technical innovations at first, but when used properly they can drastically
change the way data is managed. Let us talk about a few such innovations.
Generating Snapshot copies of data
Traditional backup techniques take some time to back up a data set. If an
application updates the data while the backup is running, the backup may not be
valid, or the data may not be usable when restored. One way to tackle this
problem is to generate snapshot copies of the data, so that the backup reads
from a copy that is no longer being updated. To do this, you instantly freeze a
copy of a disk volume and back up or copy data from that frozen copy. Once you
have created a snapshot copy on the disk, you can use it to take backups to
tape or another disk even while live data on the disk is changing.
Every snapshot copy is a complete and consistent image of the data. The user
need not apply incremental updates in sequence to get the data back; reverting
to the most recent snapshot is enough. One key benefit of this approach is that
snapshot copies cannot be changed. Another is that each snapshot holds only the
data that has changed since the previous snapshot, so the cost of keeping a
snapshot is just the incremental space consumed. Moreover, restoring data from
snapshot copies kept on disk is nearly instantaneous and reliable.
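The space-saving behaviour described above can be illustrated with a small copy-on-write model. This is only a sketch under assumptions: the `Volume` class and its methods are hypothetical names invented for illustration, and real storage systems implement snapshots in their own ways. It shows that taking a snapshot copies nothing, and that a snapshot grows only when a live block is overwritten.

```python
class Volume:
    """Toy block volume (hypothetical) whose snapshots keep only changed blocks."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks  # live data
        self.snapshots = []               # one dict per snapshot: block index -> old content

    def snapshot(self):
        # Taking a snapshot copies no data; it only opens an empty change record.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting a live block, preserve its old content once
        # in the most recent snapshot (copy-on-write).
        if self.snapshots:
            saved = self.snapshots[-1]
            if index not in saved:
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # The first preserved copy found in this snapshot or a later one is
        # the block's content at snapshot time; otherwise it never changed.
        for saved in self.snapshots[snap_id:]:
            if index in saved:
                return saved[index]
        return self.blocks[index]


vol = Volume(4)
vol.write(0, b"alpha")
vol.write(1, b"beta")
s0 = vol.snapshot()
vol.write(0, b"ALPHA2")  # live data changes; the snapshot keeps the old block
print(vol.read_snapshot(s0, 0), vol.blocks[0])
```

A backup job could read every block through `read_snapshot` while applications keep calling `write`, which is the property that makes snapshot-based backups consistent.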
Overall, snapshots help consolidate resources, aid data migration and result
in cost savings. To get the maximum benefit, companies should maintain multiple
snapshot copies of data and use a tiered data retention approach.
Thin provisioning
Storage users generally over-estimate their disk space requirements because
they have little idea how those requirements will grow over time. Each user
tends to keep a little 'buffer' in their storage capacity estimates, and these
buffers add up to a huge storage capacity of which less than half actually gets
consumed. Organizations pay a huge sum for the whole of this capacity,
including the part they never use. Thin provisioning avoids this waste: it
dramatically improves storage capacity utilization within an organization and
thereby reduces costs significantly.
Thin provisioning works on the principle that not all users will use the whole
of their allotted space at the same time. Keeping this in mind, storage
administrators allocate a high storage quota to a user but actually reserve
only a fraction of it as physical disk capacity. For instance, if a user is
allotted a quota of 100 GB, the disk space reserved for the user may be only
50 GB. Alongside thin provisioning, the administrator installs a tool that
monitors real disk usage and sends an alert when real usage starts approaching
the reserved space.
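The quota-versus-reserved-space idea can be sketched in a few lines. Everything here is an assumption made for illustration: the `ThinPool` class, its method names and the 80% alert threshold are hypothetical and do not come from any particular storage product. The sketch promises two users 100 GB each while backing them with only 100 GB of physical disk, and flags when real usage nears that limit.

```python
class ThinPool:
    """Toy thin-provisioned pool (hypothetical): quotas may exceed physical capacity."""

    def __init__(self, physical_gb, alert_threshold=0.8):
        self.physical_gb = physical_gb          # disk space actually reserved
        self.alert_threshold = alert_threshold  # assumed fraction that triggers an alert
        self.quota_gb = {}                      # user -> promised quota
        self.used_gb = {}                       # user -> real consumption

    def allocate(self, user, quota_gb):
        # Promising a quota consumes no physical space.
        self.quota_gb[user] = quota_gb
        self.used_gb.setdefault(user, 0)

    def write(self, user, gb):
        if self.used_gb[user] + gb > self.quota_gb[user]:
            raise ValueError("quota exceeded for " + user)
        if self.total_used() + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical space")
        self.used_gb[user] += gb

    def total_used(self):
        return sum(self.used_gb.values())

    def total_promised(self):
        return sum(self.quota_gb.values())

    def needs_alert(self):
        # True once real usage approaches the reserved physical capacity.
        return self.total_used() >= self.alert_threshold * self.physical_gb


pool = ThinPool(physical_gb=100)
pool.allocate("alice", 100)
pool.allocate("bob", 100)
pool.write("alice", 30)
pool.write("bob", 20)
print(pool.total_promised(), pool.total_used(), pool.needs_alert())
```

When `needs_alert()` becomes true, the administrator would add physical capacity before any user hits the pool limit, which is the role of the monitoring tool described above.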
IP SAN
When Storage Area Networks appeared in the market, they were seen as an
expensive storage solution, which only a large data center could afford. But the
speed of access and the security provided by SANs made them an attractive choice
for large data centers.
Then came IP SAN, a Storage Area Network implemented using the Internet SCSI
(iSCSI) protocol. IP SANs provide all the benefits of a SAN (security, speed
and scalability) but at a lower cost, as iSCSI uses regular Ethernet adapters,
cables and switches for storage-server connectivity. Thus an organization
adopting IP SAN can use its existing local area networks for two purposes: data
and storage connectivity. In fact, since this technology is based on IP
standards, one can easily connect remote offices over the WAN, using IP
protocols, to the central data center for consolidation and backup of data.
The organization can migrate from locally attached storage islands to IP SAN
and adopt Fibre Channel (FC) only for some really demanding and
performance-hungry applications.
Mirrored storage arrays
Because of the increased risk of natural and man-made disasters, every
organization has its own disaster recovery plan. Most organizations keep a
backup of their data on tapes at a remote location. But transporting these
tapes from one place to another and then loading them onto an alternate server
to bring the data online consumes a great deal of time. Faster disaster
recovery requires a live secondary site that holds a replica of the production
data at the primary site, and this is what storage mirroring technology makes
possible. Here, two storage systems kept a large distance apart transparently
send data packets to each other, and in the event of a disaster at one site,
the other site can take over almost instantaneously because it has the most
recent copy of all the critical data. Since the mirroring is done by the
storage system, it adds no load to the data center servers. This technology has
brought some dramatic improvements in reliability and disaster readiness.
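A minimal model of this takeover behaviour is sketched below. It assumes synchronous mirroring, where a write is acknowledged only after both sites hold the data; the article does not say which mode a given product uses, and the `MirroredPair` class is a hypothetical name used purely for illustration.

```python
class MirroredPair:
    """Toy pair of mirrored storage sites (hypothetical, synchronous mirroring assumed)."""

    def __init__(self):
        self.primary = {}       # block id -> data at the primary site
        self.secondary = {}     # block id -> data at the remote site
        self.primary_up = True

    def write(self, block_id, data):
        # The server issues a single write; the storage layer updates both
        # sites before acknowledging, so the server carries no extra load.
        if self.primary_up:
            self.primary[block_id] = data
        self.secondary[block_id] = data
        return True

    def fail_primary(self):
        # Simulate a disaster at the primary site.
        self.primary_up = False

    def read(self, block_id):
        # After a failure, reads are served from the surviving site.
        site = self.primary if self.primary_up else self.secondary
        return site[block_id]


pair = MirroredPair()
pair.write("order-1", b"100 units")
pair.fail_primary()
print(pair.read("order-1"))
```

Because the secondary already holds every acknowledged write, takeover needs no tape transport or restore step.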
De-duplication
A recent innovation with huge potential to reduce storage costs is
de-duplication, which avoids storing duplicate copies of data. It searches for
duplicate copies kept on a storage system and eliminates them. This is
beneficial because many files exist in duplicate. For instance, consider a
presentation sent by one user to all other team-mates: there will be a copy in
each user's mailbox, and a few more copies in shared folders. So there is a lot
of scope for detecting and eliminating these duplicate blocks of data.
De-duplication does just that, and it is transparent to the end-user. It saves
storage costs and also speeds up data backup processes.
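The presentation example can be sketched with a block store that identifies blocks by a content hash. This is one common way to detect duplicates, assumed here for illustration; the `DedupStore` class, the 4 KB block size and the use of SHA-256 are choices made for this sketch, not details taken from any specific product.

```python
import hashlib


class DedupStore:
    """Toy de-duplicating store (hypothetical): identical blocks are kept once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, stored once
        self.files = {}   # file name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if no identical block exists already.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        # Rebuilding the file is transparent to the user.
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


# A 16 KB "presentation" made of four distinct 4 KB blocks.
presentation = b"".join(bytes([i]) * 4096 for i in range(4))
store = DedupStore()
for user in ["asha", "ravi", "meena", "john", "shared"]:
    store.put(user + "/deck.ppt", presentation)
print(len(presentation) * 5, store.stored_bytes())
```

Five logical copies occupy the space of one, and a backup of the store only has to move the unique blocks, which is why de-duplication also shortens backups.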
Since de-duplication is a relatively new innovation, the user community has
not yet fully understood its implications. But one can be certain that it is
bound to cause some disruption in the way data is managed.
What to expect in future?
We have seen how adopting disruptive innovations in data management offers a
competitive advantage to the adopters. The pace of innovation in the data
storage industry will only grow in the coming days. Many start-ups are entering
the storage industry, eager to solve specific data management challenges. The
already established storage system vendors are also ramping up innovation,
taking advantage of the technical talent available in countries like India. But
the real winners will be those users who are ready to adopt the disruptive
innovations in their workflow and make good use of them to meet their IT
challenges.