The first step in planning to optimize power consumption in your datacenter is
finding out how much power your datacenter, racks, servers, and other equipment
consume. The next step is to figure out the pattern of that consumption. For
instance, at what time of day is the load from a particular rack at its maximum,
or when does the overall load of the datacenter reach its minimum? Once you have
gathered such data, it becomes much easier to plan for better optimization of
resources and, hence, of power.
Gathering information
To collect, preserve, and analyze this kind of information you essentially need
two components. The first is a power meter connected to, if not all, then at
least most of the devices in your datacenter. The second is software that can
take the feed from these meters and log and analyze the data.
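As a rough illustration of what that software does, here is a minimal Python sketch that finds the peak and trough hours for each rack, assuming the meters export their readings to a hypothetical readings.csv file with timestamp, rack, and watts columns (the file name and column names are assumptions, not any particular product's format):

# Minimal sketch: find the peak and trough hours of each rack from meter logs.
# Assumes a hypothetical readings.csv with columns: timestamp,rack,watts
import csv
from collections import defaultdict
from datetime import datetime

hourly = defaultdict(list)  # (rack, hour-of-day) -> list of watt readings

with open("readings.csv") as f:
    for row in csv.DictReader(f):
        ts = datetime.fromisoformat(row["timestamp"])
        hourly[(row["rack"], ts.hour)].append(float(row["watts"]))

# Average load per rack per hour of day
avg = {key: sum(vals) / len(vals) for key, vals in hourly.items()}

racks = {rack for rack, _ in avg}
for rack in sorted(racks):
    by_hour = {h: w for (r, h), w in avg.items() if r == rack}
    peak = max(by_hour, key=by_hour.get)
    trough = min(by_hour, key=by_hour.get)
    print(f"{rack}: peak load around {peak:02d}:00 ({by_hour[peak]:.0f} W), "
          f"lowest around {trough:02d}:00 ({by_hour[trough]:.0f} W)")

Even this crude hourly averaging is enough to spot which racks could be candidates for scheduled shutdowns or consolidation.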
One simple way of doing this is to use remotely manageable power outlets that
come with built-in digital power meters, which can measure power in watts and
amperes for every outlet. Such devices come in different form factors, such as
1U rack-mountable chassis, regular power strips, power strips that can be
installed inside racks, and so on.
The added benefit of using such devices is that you can remotely turn the power
supplies on and off over LAN or the Internet. This feature can help in many
ways. Remember the time you had to cold boot a hung server remotely and nobody
was physically available at the datacenter? It is a very common scenario that
many of us have faced. Many companies offer such products, which are called
metered PDUs (Power Distribution Units). One such product we reviewed some time
back was the Raritan Dominion PX, a 1U box with eight power outlets and a design
resembling a 24-port network switch.
You can also get such PDUs from APC, PDUDirect, WTi, and others; most are
similar in functionality. They generally provide a Web-based interface from
which you can cycle power on individual outlets and, at the same time, see the
real-time load (both wattage and amperage). Some of these devices support
logging historical data locally or remotely, but some don't. For those that
don't, you can use a centralized application that pools data from multiple such
PDUs and logs it. One such application, available from Raritan, is called
PowerIQ. It not only logs the data from the PDUs but also reports active power
levels, energy consumption, customer costs, environmental readings, line
capacities, and carbon footprint. Trending and cumulative total reports are
also available at any level, including building, floor, room, rack, user, and
IT device.
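If you want a bare-bones version of such pooling yourself, a minimal sketch of polling the active power reading of a few PDUs over SNMP and logging it could look like the Python snippet below. The PDU addresses and the OID are purely hypothetical placeholders; every vendor publishes its own MIB, so check your PDU's documentation for the real object IDs:

# Minimal sketch: poll a hypothetical per-PDU "active power" OID over SNMP
# and append the readings to a CSV log. OID and addresses are placeholders.
import csv
import time
from datetime import datetime
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

PDUS = ["192.168.1.51", "192.168.1.52"]       # hypothetical PDU addresses
POWER_OID = "1.3.6.1.4.1.99999.1.2.3"         # placeholder OID (vendor-specific)

def read_watts(host):
    error, status, _, var_binds = next(getCmd(
        SnmpEngine(), CommunityData("public"),
        UdpTransportTarget((host, 161)), ContextData(),
        ObjectType(ObjectIdentity(POWER_OID))))
    if error or status:
        raise RuntimeError(f"SNMP error from {host}: {error or status}")
    return int(var_binds[0][1])

with open("pdu_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while True:
        for pdu in PDUS:
            writer.writerow([datetime.now().isoformat(), pdu, read_watts(pdu)])
        f.flush()
        time.sleep(60)   # poll once a minute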
Another benefit of this application is that it lets you create groups of
devices and servers in your datacenter and set boot-up and shutdown times for
each group. So, for instance, if you don't require a bunch of servers and
devices on a particular day of the week or at a particular hour of the day, you
can turn them off automatically. These devices will then boot up and go live
again at the scheduled time. This feature by itself can save a good amount of
power.
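The scheduling logic behind such a feature is straightforward. The sketch below shows the general idea in Python; the pdu_set_outlet() helper, the group names, and the times are all invented for illustration, since PowerIQ and similar tools drive this through their own interfaces:

# Minimal sketch of scheduled group power-up/power-down.
# pdu_set_outlet() is a hypothetical helper that would call the PDU's
# remote-control interface; group names and times are illustrative only.
import time
from datetime import datetime

GROUPS = {
    # group name: (outlets, power-on hour, power-off hour, active weekdays 0=Mon)
    "build-farm": (["pdu1:3", "pdu1:4", "pdu2:1"], 8, 20, {0, 1, 2, 3, 4}),
    "test-rigs":  (["pdu2:5", "pdu2:6"],           9, 18, {0, 1, 2, 3, 4}),
}

def pdu_set_outlet(outlet, on):
    # placeholder: a real implementation would issue the PDU's on/off command
    print(f"{'ON ' if on else 'OFF'} -> {outlet}")

while True:
    now = datetime.now()
    for name, (outlets, on_hour, off_hour, days) in GROUPS.items():
        should_run = now.weekday() in days and on_hour <= now.hour < off_hour
        for outlet in outlets:
            pdu_set_outlet(outlet, should_run)
    time.sleep(300)   # re-evaluate every five minutes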
Redoing the datacenter
The second obvious step towards streamlining the business and preventing budget
leakages is to ensure that the IT systems work efficiently by incorporating
technologies like virtualization, consolidation, and migration to blade
servers, and by having intelligent energy management solutions in place. In
spite of these deployments, most companies end up saving only a marginal
proportion of the expected outcome, bringing them back to level zero or, worse
still, in a few cases, spending more and getting no results. In ground-up
surveys, energy efficiency pops up as the biggest area of concern, and in most
cases the reason is that while the IT functions properly, the auxiliary
'facilities' like UPS systems, cooling devices, and backup power systems remain
unchanged and result in huge leakages in spite of the IT enhancements.
As an illustration, consider a scenario where 10 servers that were earlier
running at 50% utilization each are consolidated onto 6 servers running at 80%
utilization each, eliminating the need for 4 servers and consequently their
space and cooling. Even taking into account the little extra power a blade
server consumes (assuming the consolidation happened from rack servers to
blades), there are still tremendous benefits. But if the cooling systems remain
unchanged, or are modified incorrectly, the new architecture will not get
sufficient cooling, resulting in malfunction, or worse, the magnitude of
cooling and power requirements will be disproportionate to the new
architecture.
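To put rough numbers on this illustration, the sketch below estimates the saving from the consolidation alone. The per-server wattages and the cooling overhead are assumptions chosen for the example, not figures from the scenario:

# Back-of-the-envelope estimate for the 10-to-6 consolidation example.
# The 400 W / 450 W per-server draws and the cooling overhead are assumptions.
rack_server_watts  = 400    # assumed average draw of one rack server
blade_server_watts = 450    # assumed draw of one blade (slightly higher)
cooling_overhead   = 0.6    # assume ~0.6 W of cooling per 1 W of IT load

before_it = 10 * rack_server_watts   # 4000 W of IT load
after_it  = 6 * blade_server_watts   # 2700 W of IT load

before_total = before_it * (1 + cooling_overhead)
after_total  = after_it  * (1 + cooling_overhead)

saving_kwh_per_year = (before_total - after_total) * 24 * 365 / 1000
print(f"IT load drops from {before_it} W to {after_it} W")
print(f"Estimated saving: {saving_kwh_per_year:.0f} kWh per year")

Under these assumptions the consolidation saves roughly 18,000 kWh a year, but only if the cooling plant is resized to match the new, smaller load.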
An ideal starting point towards solving this problem is to understand the
magnitude of savings that can be achieved with a mix and match of 'facility'
changes. There are many free online applications, easily found through Google,
that let you specify your data center capacity, current efficiency, the UPS
systems in use, power costs, lighting details, and so on, and then allow you to
make indicative changes to any of these and watch the savings meter change
dynamically. For example, a 300 kW data center with a 50% IT load, at typical
Indian electricity charges, running on legacy UPS systems with a chilled-water
cooling system on single-path power, is likely to run at an efficiency of 43.8%
with an annual electricity cost of $353,050. If row-based cooling is deployed,
and a high-efficiency cooling system along with blanking panels and
'intelligent' lighting is installed in the server room, the efficiency climbs
to 58.6%, bringing the annual energy cost down to $263,639. Whether the IT
manager sees sense in these deployments for the overall cost benefit the
company would derive is his/her call, but the tools are reasonably
comprehensive in their analysis of loss reductions.
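The core arithmetic these calculators perform is simple enough to sanity-check yourself. The sketch below reproduces the example above, with the electricity tariff back-calculated from the quoted figures (roughly $0.118 per kWh); treat it as an approximation, since the real tools model UPS, cooling, and lighting losses in far more detail:

# Reproduce the 300 kW example: annual cost = IT load / efficiency * hours * tariff.
# The tariff is back-calculated from the quoted figures (~$0.118/kWh) and is
# therefore an approximation of whatever the calculator actually assumes.
it_load_kw = 300 * 0.50        # 300 kW data center at 50% IT load
tariff     = 0.1177            # assumed $ per kWh
hours      = 24 * 365

def annual_cost(efficiency):
    facility_kw = it_load_kw / efficiency   # total power drawn from the grid
    return facility_kw * hours * tariff

print(f"Legacy setup   (43.8% efficient): ${annual_cost(0.438):,.0f} per year")
print(f"Improved setup (58.6% efficient): ${annual_cost(0.586):,.0f} per year")

The outputs land within rounding distance of the $353,050 and $263,639 figures, which shows how directly the facility efficiency number drives the electricity bill.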
This is only the starting step in understanding, or rather defining, the
problem. Having understood where energy efficiency can and should be
incorporated, the next step is to identify relevant technologies that can be
used to achieve better efficiency. One of the simplest methods an enterprise
can deploy is the Hot Aisle data-center architecture. In a scenario where an
enterprise spends a major chunk of its data-center budget on raised flooring
and air conditioning for the servers, blades or otherwise, savings can be
realised straightaway if the Hot Aisle arrangement is used. The fans of a
server push the heat it generates into the room, and the harder the server
works, the more heat is released. This heat is the single biggest contributor
to the requirement for a redundant air-cooling system, so much so that the heat
released by all the servers put together is directly proportional to the
cooling expenditure of the data-center. Hot Aisle architecture is merely the
arrangement of servers such that two rows of servers have their fans facing
each other. The alley (or aisle) between the two rows of fans becomes the Hot
Aisle, and the next alley (facing the front panels of the servers) becomes the
Cold Aisle. First, this architecture allows better measurement of the heat
released and gives a better idea of the cooling requirement of the data center
concerned; second, it collects the hot air in a separate corridor, and this
trapped air can be 'fed' back into the cooling system for better efficiency.
The cold aisle, meanwhile, only needs re-circulated air to keep the servers
functioning. Most state-of-the-art data centers that deploy this architecture
also build an artificial roof over the hot aisle to trap the hot air more
efficiently. From a data-center design point of view, this architecture gives a
better indication of the 'hot spots' or 'red' areas of the data center and
allows for better planning and expansion.
Another technology that CIOs and data center managers can deploy is In-Row
cooling. It goes to show that modifying the cooling architecture of a
data-center can go a long way towards achieving better power efficiency.
Traditional cooling mechanisms use the raised floor of a data-center to pump
air from the bottom and cool each rack from the bottom up. With the advent of
blade servers, the heat output of each blade, and the corresponding cooling it
requires, is directly proportional to the compute load assigned to that
particular blade. In such a scenario, the blade sitting right at the top of the
stack may be the one releasing the most hot air, because it is the one doing
the most work. This creates an artificial 'hot zone' around it, which is
further magnified if hot aisles are absent. A simple way to solve this problem
is to cool sideways; in other words, cool air is pumped not from the bottom up
but along the sides of each server. This way, every server in the rack, across
multiple racks, gets the same amount of cooling. This may not always be needed,
so automating In-Row cooling of this nature can help the cooling system
'understand' which server is working harder than the others, and adjust the
cooling dynamically in real time. If the task assigned to a particular server
is completed and the load shifts to a different server in a different rack, the
cooling changes accordingly to cool the newly loaded server more effectively.
In data-center scenarios where compute power is required in spikes, for
instance a share-broking site where load reduces significantly on weekends,
In-Row cooling can directly contribute to reducing the power expenses of the
enterprise.
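Conceptually, that 'understanding' is just a feedback loop mapping each rack's measured temperature to a cooling output. The heavily simplified sketch below illustrates the idea; the sensor reads, the target temperature, and the controller gain are all invented for illustration and stand in for whatever the cooling vendor's automation actually does:

# Heavily simplified sketch of automated in-row cooling: map each rack's
# measured hot-aisle temperature to a cooling-unit output. The sensor read
# and cooler-control functions are hypothetical placeholders.
import random
import time

TARGET_OUTLET_C = 35.0   # assumed target hot-aisle temperature per rack
GAIN = 5.0               # % of cooling output per degree above target

def read_outlet_temp(rack):
    # placeholder: a real implementation would query the rack's sensor
    return 30.0 + random.uniform(0.0, 10.0)

def set_cooler_output(rack, percent):
    # placeholder: a real implementation would command the in-row unit
    print(f"{rack}: cooling output set to {percent:.0f}%")

def control_step(racks):
    for rack in racks:
        error = read_outlet_temp(rack) - TARGET_OUTLET_C
        output = max(10.0, min(100.0, 50.0 + GAIN * error))  # simple P-controller
        set_cooler_output(rack, output)

while True:
    control_step(["rack-01", "rack-02", "rack-03"])
    time.sleep(30)   # adjust every 30 seconds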