
Preview: Intel's Nehalem-EX CPU for High-End Servers

PCQ Bureau

We got a double treat in our labs this time. The first was Intel's latest Nehalem-EX CPU, the chipmaker's bet for high-capacity SMP servers, which boasts many of the RAS (reliability, availability, and serviceability) features previously found only in top-end server CPUs such as the Itanium. The second was the server Intel sent it in: the latest PowerEdge R810 rack server from Dell. You'll know why that was a treat when you hear the specs of this 2U rack server.


Nehalem-EX CPU



The Nehalem-EX is designed for SMP environments and can scale from 2 to 256 CPUs. That makes it suited to very large workloads such as decision support systems, virtualization, HPC, databases, ERP, CRM, and other mission-critical applications. Intel claims that a single Nehalem-EX based server can replace 20 single-core servers. The LGA 1567 socket based Nehalem-EX is built on the 45 nm fabrication process and supports up to 8 cores in a single CPU. Moreover, each core supports two threads with the help of Hyper-Threading technology, so each 8-core Nehalem-EX provides 16 logical cores.

While each core has 3 MB of L3 cache, the cache interconnect lets every core access all of it, so an 8-core Nehalem-EX makes a whopping 24 MB of L3 cache available to each core. Each CPU has two integrated memory controllers, which support up to 16 DIMMs. A 4-socket server based on the Nehalem-EX will therefore support up to 64 DIMM slots, which can take up to 1 TB of memory. The processor has a base clock speed of 2.26 GHz, which goes up to 2.66 GHz with Turbo Boost.
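The capacity figures above follow from simple multiplication, and a short sketch makes the arithmetic explicit. The class and field names below are illustrative and not from Intel or Dell. The 16 GB per-DIMM ceiling is an assumption borrowed from the R810's specification, since it is the DIMM size that makes 64 slots add up to 1 TB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NehalemExServer:
    """Hypothetical model of a Nehalem-EX server, using the article's figures."""

    sockets: int
    cores_per_socket: int = 8    # up to 8 cores per CPU
    threads_per_core: int = 2    # Hyper-Threading
    l3_mb_per_core: int = 3      # 3 MB of L3 per core, shared across the CPU
    dimms_per_socket: int = 16   # two memory controllers, 16 DIMMs per CPU
    dimm_gb: int = 16            # ASSUMPTION: largest DIMM, per the R810 spec

    @property
    def logical_cores(self) -> int:
        return self.sockets * self.cores_per_socket * self.threads_per_core

    @property
    def l3_mb_per_socket(self) -> int:
        return self.cores_per_socket * self.l3_mb_per_core

    @property
    def dimm_slots(self) -> int:
        return self.sockets * self.dimms_per_socket

    @property
    def max_memory_gb(self) -> int:
        return self.dimm_slots * self.dimm_gb


four_way = NehalemExServer(sockets=4)
print(f"Logical cores : {four_way.logical_cores}")
print(f"L3 per socket : {four_way.l3_mb_per_socket} MB")
print(f"DIMM slots    : {four_way.dimm_slots}")
print(f"Max memory    : {four_way.max_memory_gb} GB")
```

With four sockets this gives 64 DIMM slots and 1024 GB, matching the 1 TB figure quoted for a 4-socket server.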

The other thing we mentioned about the Nehalem-EX was its RAS (reliability, availability, and serviceability) features. Here, a Nehalem-EX based system works with the firmware and OS to recover from hardware errors. It automatically attempts to recover or restart processes so that the machine continues to function normally.


About the Dell R810 Server



As is standard in Dell servers, the R prefix denotes a rack server, and the 0 at the end of the model number (810) denotes that it's based on Intel CPUs (a '5' at the end denotes AMD CPUs). The one we received was powered by two Intel Xeon X7560 processors. This server is meant for virtualization or workload consolidation applications, as well as for mid-size databases. The 2U PowerEdge R810 weighs around 26 kg when fully populated. Its front panel has a DVD-ROM drive and two USB 2.0 ports for connecting flash drives or other USB devices. There is also an LCD panel that displays basic troubleshooting information for the server admin. Two SD card slots inside the server provide redundancy. Six hot-pluggable redundant cooling fans keep the server's internals cool.

The PowerEdge R810's motherboard is based on the Intel 7500 chipset and has 4 CPU sockets. These can be populated with either four Xeon 7500 or two Xeon 6500 series CPUs, in quad-, six-, or eight-core versions. The server came with two eight-core Xeon X7560 CPUs, meaning we had the power of 16 physical or 32 logical cores in our hands. The server has 32 DIMM slots that support DDR3 memory, and luckily for us, they were all populated with 4 GB DDR3 DIMMs, meaning a whopping 128 GB of RAM in a single 2U box! If that's not sufficient for your requirement, each DIMM slot can support up to 16 GB of memory, so the maximum memory capacity supported by this server is 512 GB. The server reached us populated with six 146 GB, 15k RPM SAS drives. You can plug in up to six 2.5-inch SAS or SATA hard drives or SSDs, and the server supports up to 3 TB of storage. It is powered by two hot-pluggable 1100 watt power supplies, configured for 1+1 redundancy. We tested the server with Windows Server R2 64-bit.

Performance



Considering that this is a fairly high-end configuration, we focused on two aspects of its performance. One, we tried to see how performance scales as the number of cores in use increases. Two, we measured the power consumed by the server as more cores were activated, to see whether power consumption increases linearly with the CPU power used. We also ran the CineBench benchmark, a 3D content creation benchmark that measures the performance of a machine's CPU and graphics sub-system.


SunGard & CineBench benchmark results



We used a financial risk analysis application benchmark called SunGard for the job. The application allows you to select the number of threads to use, which essentially controls how much CPU power to extract from the system. It uses a Monte Carlo financial engine to determine the future value of a fictitious portfolio. We ran this test with 8, 16, and 32 threads, and the time taken to determine the future value of the portfolio fell significantly as the number of threads increased. When we moved from 8 to 16 threads, there was a 48% jump in performance, and as we moved further to 32 threads, we saw a 34% jump. We then measured the power consumed by the server with 8, 16, and 32 threads. With 8 threads in use, the server's average power consumption came to 521 watts. As we moved from 8 to 16 threads, it increased to 600 watts, a 15% increase. Further, as we moved from 16 to 32 threads, it increased to 623 watts, a mere 4% increase. In other words, the jump in performance with more cores is far greater than the jump in power consumption. Since power consumption in the data center is something that worries most CIOs, this statistic would come in handy while choosing a new server platform.
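The percentages quoted above can be rechecked from the wattage figures, and the same numbers give a rough feel for how performance per watt changed at each step. The sketch below uses only the figures reported in this article. The function names are illustrative, and the performance-per-watt ratio rests on the assumption that a "48% jump" means 1.48 times the throughput, which the test results do not spell out.

```python
def pct_increase(old: float, new: float) -> float:
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Average power draw in watts, keyed by SunGard thread count (from the article).
power_watts = {8: 521, 16: 600, 32: 623}

# Reported performance jump in percent for each step (from the article).
perf_jump_pct = {(8, 16): 48, (16, 32): 34}


def step_summary(lo: int, hi: int) -> tuple[float, float, float]:
    """Return (power increase %, performance jump %, perf-per-watt ratio)."""
    power_pct = pct_increase(power_watts[lo], power_watts[hi])
    perf_pct = perf_jump_pct[(lo, hi)]
    # ASSUMPTION: an N% "jump in performance" means (1 + N/100)x throughput.
    perf_per_watt_ratio = (1 + perf_pct / 100) / (1 + power_pct / 100)
    return power_pct, perf_pct, perf_per_watt_ratio


for lo, hi in perf_jump_pct:
    power_pct, perf_pct, ratio = step_summary(lo, hi)
    print(
        f"{lo:>2} -> {hi:>2} threads: performance +{perf_pct}%, "
        f"power +{power_pct:.1f}%, perf/watt x{ratio:.2f}"
    )
```

Under that assumption, the performance-per-watt ratio stays above 1 at both steps, which is the point the power measurements make.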

We also ran the CineBench benchmark on the Dell R810, and compared it against a 4-way, six-core Dunnington processor based server. The Dunnington system had 24 cores running inside. The good thing about CineBench is that apart from showing results for the current system, it also shows results that others have recorded on similar systems. Initially, we were a little disappointed by the R810's CineBench results: 14.81 CPU points against the Dunnington's 18.58. After some pondering over why its performance was lower, we found some probable answers. For one, the Dunnington system had 24 cores, against 16 cores in the R810. Plus, the Dunnington CPUs were running at 2.66 GHz vs 2.27 GHz in the R810. Another interesting outcome we observed for Intel-based servers was that a higher number of threads doesn't give a huge performance jump. That's why, even though the R810 had 32 threads against only 24 running in the Dunnington, its performance was lower. The CPU frequency provides a minor advantage in performance. Interestingly, CineBench reported far lower results for a 12C/12T AMD Opteron based system.
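One way to put the core-count and clock-speed explanation in perspective is to normalize the two CineBench scores by physical cores and by clock speed. This is only a rough sketch: the normalization is our own illustration rather than anything CineBench reports, the names are hypothetical, and it ignores architecture, Hyper-Threading, and memory differences.

```python
# CineBench CPU points, physical cores, and clock speeds quoted in the article.
systems = {
    "R810 (Nehalem-EX)": {"score": 14.81, "cores": 16, "ghz": 2.27},
    "Dunnington 4-way": {"score": 18.58, "cores": 24, "ghz": 2.66},
}


def per_core(score: float, cores: int) -> float:
    """CineBench points per physical core."""
    return score / cores


def per_core_ghz(score: float, cores: int, ghz: float) -> float:
    """CineBench points per physical core per GHz (a crude normalization)."""
    return score / (cores * ghz)


for name, s in systems.items():
    print(
        f"{name:<18} per core: {per_core(s['score'], s['cores']):.3f}  "
        f"per core-GHz: {per_core_ghz(s['score'], s['cores'], s['ghz']):.3f}"
    )
```

On both measures the R810 comes out ahead of the Dunnington system, which fits the reading that its lower total score comes from having fewer, slower-clocked cores.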

Bottom line: Whether the Nehalem-EX platform is worth shifting to depends on two things: price and performance. Since we did not know the price of this system, we can't comment on that. In terms of performance, the system is definitely worth evaluating.
