
Managing Information Over Years

PCQ Bureau

Today, data centers face the formidable challenge of provisioning colossal
storage capacity at an affordable price while still meeting ever-increasing
performance demands. The biggest threats facing data centers and the storage
industry today are power consumption, housing space and environmental concerns.
However, many administrators and planners fail to recognize that the value of
data to an organization decreases over time, as it loses its relevance,
freshness, and “popularity”. One question administrators should be asking
themselves is: why should data that is decreasing in value remain in expensive
front-line storage, subject to the same backup, replication, and recovery
policies and procedures as key data? Would it not be useful to have a system or
methodology for analyzing and tracking data freshness, so that storage space
could be freed for fresher, more relevant data, and time- and
bandwidth-consuming data protection policies could be relaxed as data loses its
value?


Direct Hit!

Applies To: Database managers
USP: Learn to manage information based on relevance to the organization
Primary Link: http://bit.ly/8wdR71
Search Engine Keyword: information lifecycle management

The noteworthy leap that the storage industry is being forced to take in this
regard is storage tiering. Here, the capacity to be provisioned is divided into
separate pools of storage space with different cost/performance attributes. At
the top resides the Tier 1 pool, which is the most expensive but also the
highest performing. The bottom tier is occupied by more cost-effective storage
arrays. The next challenge is to devise a sophisticated software layer that
intelligently places data into the different tiers according to its value. This
concept is variously known as data classification or Information Lifecycle
Management (ILM).
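
As a rough illustration of pools with different cost/performance attributes, the
Python sketch below models three tiers and picks the cheapest one that still
meets a latency requirement. The pool names, the relative cost figures, the
latency figures and the helper `cheapest_pool_meeting` are all assumptions made
up for this example; they do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPool:
    name: str
    cost_per_gb: float   # illustrative relative cost, not a real price
    latency_ms: float    # illustrative typical access latency


# Hypothetical three-tier layout, ordered from most to least expensive.
POOLS = [
    TierPool("tier1", cost_per_gb=10.0, latency_ms=1.0),
    TierPool("tier2", cost_per_gb=4.0, latency_ms=5.0),
    TierPool("tier3", cost_per_gb=1.0, latency_ms=20.0),
]


def cheapest_pool_meeting(max_latency_ms, pools=POOLS):
    """Return the least expensive pool whose latency is within the limit."""
    candidates = [p for p in pools if p.latency_ms <= max_latency_ms]
    if not candidates:
        raise ValueError("no pool satisfies the latency requirement")
    return min(candidates, key=lambda p: p.cost_per_gb)
```

With these made-up figures, data that tolerates 5 ms of latency would land in
the middle pool rather than in the expensive top tier.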

What is ILM?



ILM is a concept that encompasses the discovery, classification, analysis, and
maintenance of data across the entire period of its useful life. It adds
structure and context to data, marking the transition from data to information.
ILM is part of the larger concept of Business Continuity Planning, but has
become increasingly prominent in the storage arena in recent years thanks to
several factors, including advancements in data storage management techniques
and the technology that underpins them, and the evolution of the storage
environment, including:


  • Coexistence of Fibre Channel and iSCSI (IP-Storage) in the data center
  • SAS and SATA storage coexisting in storage systems
  • Storage consolidation practices, for reducing the use of solitary “islands
    of data” in direct attached storage (DAS)
  • Regulatory requirements for data archiving and recall (SOX, etc.)

Though many vendors offer ILM services or modules as part of their products,
ILM is above all a concept or a strategy rather than a product. However, as a
practical illustration of what the concept embodies, it is safe to generalize
that many ILM implementations encompass components such as:

  • Database Management

  • Storage System Performance and Monitoring

  • Storage Capacity Planning and Management

  • Business Controls for Data Degradation and EOL

How is this done?



In a tiered storage system, storage is not seen merely as a container of data.
An additional dimension of intelligence is attached to every block, turning
blocks of data into blocks of information.


Data + Intelligence = Information

This intelligence, associated with every block of data, forms vital metadata
that automatically tracks the access patterns to these blocks. Data is therefore
first classified, then moved at the block level from tier to tier based on
frequency of access. At the peak of its popularity, data is stored in the
fastest, most responsive top-tier storage on hand and is subject to the most
stringent replication and backup controls. The ILM system constantly monitors
the data's value in comparison to other data; as the data loses value, it is
migrated down the chain to less expensive, less powerful storage, where it may
not be accessed as frequently or protected as carefully. In the final stage, it
is migrated out of the storage system completely. Data of the lowest value is
either purged from the system or transferred to other media (e.g., written to
tape and delivered to offsite storage), depending on the organization's policy
and regulatory requirements for data end-of-life.
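
The idea of per-block access metadata driving tier placement can be sketched in
a few lines of Python. This is only one possible reading of "frequency of
access": the `BlockTracker` class, the two thresholds and the halving of counts
at the end of each monitoring window are assumptions chosen for illustration,
not a description of how any real ILM product classifies blocks.

```python
from collections import Counter


class BlockTracker:
    """Tracks per-block access counts and maps each block to a tier."""

    def __init__(self, hot_threshold=100, warm_threshold=10):
        # Thresholds are assumed values (accesses per monitoring window).
        self.hot_threshold = hot_threshold
        self.warm_threshold = warm_threshold
        self.access_counts = Counter()

    def record_access(self, block_id, count=1):
        self.access_counts[block_id] += count

    def tier_for(self, block_id):
        hits = self.access_counts[block_id]  # 0 for blocks never seen
        if hits >= self.hot_threshold:
            return "tier1"
        if hits >= self.warm_threshold:
            return "tier2"
        return "tier3"

    def end_window(self):
        """Halve all counts so blocks that stop being read drift downward."""
        for block_id in list(self.access_counts):
            self.access_counts[block_id] //= 2


tracker = BlockTracker()
tracker.record_access("blk-7", 120)
print(tracker.tier_for("blk-7"))   # tier1 while the block is popular
tracker.end_window()
print(tracker.tier_for("blk-7"))   # tier2 after one idle window
```

A block that receives no further accesses keeps losing half its count each
window, so it eventually falls to the lowest tier, where an end-of-life policy
(purge or move to tape) could take over.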

Why ILM?



Having examined how an ILM system can be implemented, we should next look more
closely at the reasons why more and more organizations are accepting the need
for a comprehensive ILM strategy.


Exponential growth of data



With data growth averaging close to 80% to 100% every year, managing storage
effectively has become a challenging task. Storage administrators face limited
budgets, and are charged not only with expanding capacity by purchasing new
hardware wisely to meet projected storage needs, but also with optimizing the
use of existing capacity in order to maximize the investment in current storage
hardware. Moreover, any changes or additions need to be considered carefully, as
the downstream effects of new hardware are often unforeseen and can quickly wipe
out any short-term cost gains.
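
To see what growth of this order means for capacity planning, the short Python
sketch below compounds a starting capacity at the 80% and 100% annual rates
quoted above. The 100 TB starting point and the function name
`projected_capacity` are assumptions for the sake of the example.

```python
def projected_capacity(current_tb, annual_growth, years):
    """Compound growth: capacity after `years` at a fixed annual rate."""
    return current_tb * (1.0 + annual_growth) ** years


# Hypothetical 100 TB installation, projected three years ahead.
for rate in (0.8, 1.0):
    tb = projected_capacity(100, rate, 3)
    print(f"{rate:.0%} growth: {tb:.0f} TB after 3 years")
```

Under these assumptions, the same installation needs roughly six to eight times
its current capacity within three years, which is why optimizing existing
capacity matters as much as buying new hardware.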

Data accessibility/freshness



As mentioned at the beginning of this article, data does not have a constant
value; rather, that value changes with time, relevance, security, or
popularity. Policies and procedures must therefore be put in place to
continuously monitor data and shift its location (and therefore its
accessibility), so that the information in highest demand is in the most
accessible location.

Figure: Carbon dioxide emissions of traditional storage servers versus tiered
storage servers

Cost (TCO) issues



The overall cost of a storage system is measured not just by the initial price
paid for the hardware and its commissioning. The total cost of ownership (TCO)
also includes maintenance, power and cooling expenses, together with the cost of
staffing and training administrators. As storage arrays grow, power usage (for
server operation and cooling) is just one factor that has an enormous impact on
the TCO of a storage solution. If less expensive solutions are available,
administrators should by all means devise a careful plan to incorporate these
components, with some restrictions. Where possible, any additional storage
technology adopted should not require a significant investment of time and
resources to learn its operation. New solutions that are more power- or
space-efficient should be integrated into the array.
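
The cost components listed above can be combined into a simple lifetime cost
comparison. The Python sketch below is illustrative only: the function name
`lifetime_cost`, the five-year horizon and every figure are invented for the
example and are not taken from any vendor or measurement.

```python
def lifetime_cost(acquisition, annual_power_cooling, annual_maintenance,
                  annual_staffing, years):
    """Purchase price plus recurring yearly costs over the service life."""
    annual = annual_power_cooling + annual_maintenance + annual_staffing
    return acquisition + annual * years


# Two hypothetical arrays: A is cheaper to buy, B is cheaper to run.
array_a = lifetime_cost(100_000, 40_000, 10_000, 20_000, years=5)
array_b = lifetime_cost(150_000, 20_000, 10_000, 20_000, years=5)
print(array_a, array_b)  # 450000 400000
```

With these made-up numbers, the array with the lower purchase price ends up
costing more over five years because of its higher power and cooling bill.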

Ability to protect and recover lost data



Because key data has to be protected against loss to ensure business
continuity, the term Continuous Data Protection (CDP) has come into being. It
describes a scheme for ensuring data survival in the face of disasters such as
power or network outages and natural catastrophes, and it incorporates
techniques such as backups, data snapshots and remote replication to do so. To
add to the challenges surrounding data protection, regulatory requirements for
the preservation and archiving of several types of corporate data continue to
mount.

Data of a particularly sensitive or critical nature must be available for
recall within clearly established time limits if circumstances demand it, and
must be kept secure as well. A successful ILM implementation therefore
integrates well with an organization's backup and recovery solutions along
several touch points. ILM dictates that as items age they can be taken offline
completely and migrated to tape storage, for example, yet some data must still
be available for recall even at this point. Since only a percentage of data has
to be protected in the same manner, the ILM solution must be flexible enough to
manage varying CDP requirements.
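
One way to picture protection requirements that vary by tier is a small policy
table, as in the Python sketch below. The tier names, the intervals, the
`regulated` flag and the rule that regulated data always keeps a remote copy are
all assumptions made for illustration; an actual policy would come from the
organization's own backup and compliance requirements.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProtectionPolicy:
    backup_interval_hours: int
    snapshots_per_day: int
    remote_replication: bool


# Hypothetical policies: protection relaxes as data moves down the tiers.
POLICIES = {
    "tier1": ProtectionPolicy(1, 24, True),
    "tier2": ProtectionPolicy(24, 4, False),
    "archive": ProtectionPolicy(168, 0, False),
}


def policy_for(tier, regulated=False):
    """Look up the tier's policy, tightening it for regulated data."""
    policy = POLICIES[tier]
    if regulated and not policy.remote_replication:
        # Assumed rule: regulated data keeps an off-site copy on any tier.
        policy = replace(policy, remote_replication=True)
    return policy
```

Keeping the policy separate from the placement logic lets the same aged data be
stored cheaply while still honoring a recall obligation.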


Green data centers



As mentioned earlier, one of the primary challenges facing data centers today
is power consumption. While the initial cost of acquiring the storage might
have been low, the higher cost of power consumption and cooling means that the
TCO is very high. In addition to the tangible financial burden this adds, the
other, often intangible, concern in such a data center is its environmental
impact. Today, global warming and pollution are major hazards that cannot be
ignored. There are both regulatory and financial incentives for reducing carbon
dioxide emissions, which often results in a direct cost saving due to increased
carbon credits.

Conclusion



Storage tiering in enterprise-class storage is becoming a highly desirable
feature today. It is only a matter of time before the cost, environmental and
performance benchmarks of a tiered system become critical parameters on which
storage procurement decisions are based.

Tiered storage servers implementing ILM offer greater cost advantages and
performance. It is important to realize that storage servers with tiered
storage and ILM enable data centers to reduce footprint, electricity costs, and
CO2 emissions, creating a greener and more eco-friendly data center.
