Intel's Dunnington Six-core Processor

PCQ Bureau

Just a few months ago we reviewed Harpertown, Intel's 45 nm Penryn-based Xeon
5400 series processor. This time we had the opportunity to test the next in
line from Intel: the Dunnington processor. The server we received had four
processors with six cores each, 24 cores in all. As not many applications can
use that many cores, this processor is meant for very high-end computing and
for virtualization in large enterprises. It can also be used in cloud
computing, or in rack-optimized and ultra-dense SKUs.

Technology in Dunnington

To understand and appreciate the tech used in Dunnington, we'll start with a
bit of history of the previous generation of server processors. The Xeon 5100
series, codenamed Woodcrest and based on Intel's Core microarchitecture, was
the server and workstation version of the Intel Core 2 processor. The fastest
processor in this category operated at 3.0 GHz, and Intel claimed better
performance and lower energy consumption than previous processors. In Jan 2007,
Intel launched its quad-core, the Core 2 Quad, as the Xeon 3200 series, which
consisted of two separate dual-core dies placed next to each other in one CPU
package. This was targeted at blade servers. The 3300 series was similar to the
3200 series but was manufactured on a 45 nm process and featured the XD bit and
virtualization technology.

Direct Hit!
Applies To:
Data centers

Price: Yet to be released

USP: How six-core processors rev up the
performance of your server

Primary Link: www.intel.com

Keywords: dunnington

True to Intel's tick-tock release cycle of processors, where a tick means a
refresh of the current architecture on a smaller process and a tock means a
brand new architecture, the clock ticked and the Harpertown Xeons were released
in late 2007. This family consisted of dual-die quad-core processors
manufactured on a 45 nm process, and featured a 1333 to 1600 MHz front side bus
with lower TDPs, rated between 50 W and 150 W depending upon the model.

And now Intel has become the first in the x86 processor market to launch a
processor with six cores. Its offering before this was the 7300 series,
codenamed Tigerton, which consisted of two dual-core silicon dies on a single
ceramic module. Boasting greater processing capabilities, Tigerton was based on
Intel's Caneland (Clarksboro) platform. Dunnington, or the 7400 series,
features a single-die six-core design and is based on Intel's 45 nm Penryn
architecture. Like the Harpertown Xeon, it is built from dual-core units, here
three of them clubbed together. Compared to its predecessor, this processor has
significantly more cache: a 16 MB L3 cache shared among all six cores, a 3 MB
L2 cache shared between each pair of cores, and 96 KB of L1 cache. The larger
cache should improve performance, mainly by reducing the latency of accessing
frequently used data. The clock speed, however, remains approximately the same,
ranging from 2.13 to 2.66 GHz. The FSB of Dunnington (1066 MHz) is also much
lower than the 1600 MHz of the Harpertown we reviewed in Jan 2008. This can be
a bottleneck, but the 3 MB cache size is said to reduce it to a great extent.
One good thing is that if you already have a Tigerton server with the mPGA604
socket, you can simply plug this CPU into it. It is compatible with Caneland
chipsets too.

The above image shows six cores in the Dunnington CPU, having 16 MB L3 cache
shared among all the cores.

This 7400 series processor also supports Intel's VT-x virtualization
technology, including Intel FlexMigration and FlexPriority. Earlier, a
successful live virtual machine migration depended upon the compatibility of
the two CPUs between which the migration was being done, and the VM also had to
remain stable after the migration. Intel FlexMigration takes care of these
issues. It also removes the need to buy processors of the same generation to
keep a resource pool compatible: a pool can now span multiple generations of
Xeon processors. This gives you the option of choosing the right server
platform with respect to performance, cost and power for your enterprise.
FlexPriority is another such hardware feature, which helps optimize
virtualization by improving virtual machine access to the task priority
register.

How we tested

To test the performance of the server we ran four benchmarks: SunGard,
Linpack, POVRay and Cinebench. We ran these on Windows Server 2008 64-bit,
first with 2 processors and 8 GB RAM, and then with 4 processors and 16 GB RAM.
The HDDs were configured in RAID 0 so that I/O didn't create any bottleneck
during benchmarking. For the first run we simply took out 2 processors and 8 GB
of RAM and ran the benchmarks. Then we put the processors and RAM back and ran
the benchmarks again to get full system performance. To check power
consumption, we connected the server to the main power supply via a wattmeter
and recorded the maximum, minimum and average power consumption.
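
The power figures boil down to simple statistics over a series of wattmeter
readings. Below is a minimal sketch of that calculation; the function name
summarize_power and the sample readings are our own illustrative assumptions,
not data logged during this review.

```python
from statistics import mean


def summarize_power(samples_watts):
    """Return (minimum, maximum, average) of a list of wattmeter readings."""
    if not samples_watts:
        raise ValueError("need at least one reading")
    return min(samples_watts), max(samples_watts), mean(samples_watts)


# Hypothetical readings in watts, NOT measurements from this review.
readings = [440, 455, 610, 700, 690, 450]
low, high, avg = summarize_power(readings)
print(f"min={low} W, max={high} W, avg={avg:.1f} W")
```

The more readings taken across idle and loaded periods, the more meaningful
the average becomes.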

The performance graph while running SunGard on Dunnington (left) and
Harpertown (right). Dunnington took about 95 seconds less than Harpertown.

Benchmark results

We started with Cinebench 10, which measures the performance of the processor
and the graphics card and reports a Cinebench score for each. The test consists
of two parts: the first is processor intensive and the second is graphics
intensive. The processor test first renders using a single CPU and then repeats
the render using all the cores. The graphics test runs inside a 3D window: an
animated scene is played, starting with a low demand for graphics that is
increased later on, and the processor has to work at maximum speed for the
scene to be displayed properly. A score is generated at the end. The higher the
scores, the better the server performance.

Results: With 2 processors and 8 GB RAM, Cinebench gave a score of 3262
CB-CPU when rendering with 1 CPU and 26816 CB-CPU when rendering with all the
CPUs. The GPU score came to 190 CB-GFX, which is good for a server like this
one. With all 4 CPUs and 16 GB RAM, i.e. the full-blown configuration, this
monster scored 3266 CB-CPU when rendering with 1 CPU, which is of course the
same as in the earlier case. But when rendering with all the CPUs the score
rose to 31372 CB-CPU, an increase of about 17% over the earlier configuration.
However, please note that this benchmark, 'CINEBENCH 10 64-bit', didn't use
more than 16 cores.
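
The scaling figures can be recomputed from the raw Cinebench scores quoted in
the results. The sketch below uses two small helpers of our own (speedup and
percent_gain are illustrative names, not part of Cinebench) to show how far the
multi-CPU score rises over the single-CPU score, and how much the 4-processor
setup gains over the 2-processor one.

```python
def speedup(multi_score, single_score):
    """How many times faster the all-CPU render is than the 1-CPU render."""
    return multi_score / single_score


def percent_gain(new_score, old_score):
    """Percentage increase of new_score relative to old_score."""
    return (new_score - old_score) / old_score * 100.0


# Scores quoted in the results (CB-CPU).
two_socket = {"single": 3262, "multi": 26816}
four_socket = {"single": 3266, "multi": 31372}

for name, cfg in (("2 processors", two_socket), ("4 processors", four_socket)):
    print(f"{name}: {speedup(cfg['multi'], cfg['single']):.2f}x over 1 CPU")

gain = percent_gain(four_socket["multi"], two_socket["multi"])
print(f"4 vs 2 processors: {gain:.1f}% higher multi-CPU score")
```

Measured against the 2-processor score, the gain works out to roughly 17%,
well short of the doubling in core count. That is consistent with the benchmark
not using more than 16 cores.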

For the next benchmark we used POVRay, a ray tracing program that is also
used for CPU benchmarking. It uses the ray tracing rendering technique to
calculate an image by simulating how light travels in the real world. For
benchmarking with POVRay, we used the standard 'benchmark.pov', as this file
uses every internal feature of POVRay and stresses the CPU to its limits.
Another reason for using this file is that it is the standard for all
processors, which makes it easier for others to compare scores.

All 24 cores being utilized while running the SunGard benchmark. Apart from
Linpack, this benchmark was the only one to stretch all 24 cores.

Results: With 2 processors and 8 GB RAM, POVRay rendered an average of
120.38 PPS (pixels per second) over 147456 pixels, and with 4 processors and
16 GB RAM it rendered an average of 120.25 PPS. POVRay used a maximum of 3
cores for executing the benchmark, which is why the extra processors made no
difference.
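
As a rough cross-check, a PPS figure can be turned into a render time. The
sketch below assumes PPS means pixels per second and that the whole image is
rendered at the average rate; render_seconds is our own illustrative helper,
not something POVRay provides.

```python
def render_seconds(total_pixels, pixels_per_second):
    """Estimated render time if every pixel is drawn at the average rate."""
    if pixels_per_second <= 0:
        raise ValueError("pixels_per_second must be positive")
    return total_pixels / pixels_per_second


TOTAL_PIXELS = 147456  # pixel count quoted in the results (equals 384 * 384)

for label, pps in (("2 processors", 120.38), ("4 processors", 120.25)):
    minutes = render_seconds(TOTAL_PIXELS, pps) / 60.0
    print(f"{label}: about {minutes:.1f} minutes")
```

Both configurations land at about 20 minutes, which is what one would expect
when the renderer cannot use the additional cores.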

Then we used SunGard Adaptive Analytics, a component of SunGard's suite of
risk management products. More precisely, the benchmark is a stripped-down
version of the actual product. It uses a Monte Carlo financial engine to
predict the future of a fictitious portfolio. It requires two files to run: the
first contains sample data that represents actual market conditions, and the
second contains a sample customer's investment portfolio. The benchmark score
is the run time in seconds, so the less time it takes to run, the better the
server performed.
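
To give a feel for what a Monte Carlo engine does, here is a minimal sketch
that values a portfolio over many random scenarios. It is not SunGard's
algorithm: the normally distributed yearly return, the parameter values and the
function name simulate_portfolio are all assumptions made for illustration.

```python
import random


def simulate_portfolio(start_value, mean_return, volatility, years, trials, seed=0):
    """Return (mean final value, 5th-percentile final value) over random trials."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    finals = []
    for _ in range(trials):
        value = start_value
        for _ in range(years):
            # Assumed model: each year's return is drawn from a normal distribution.
            value *= 1.0 + rng.gauss(mean_return, volatility)
        finals.append(value)
    finals.sort()
    mean_final = sum(finals) / trials
    bad_case = finals[int(0.05 * trials)]  # simple "bad case" estimate
    return mean_final, bad_case


# Hypothetical portfolio: 100 units, 5% mean return, 15% volatility, 10 years.
mean_final, bad_case = simulate_portfolio(100.0, 0.05, 0.15, 10, 10000)
print(f"mean final value: {mean_final:.1f}, 5th percentile: {bad_case:.1f}")
```

Each trial is independent of the others, which is why this kind of workload
spreads well across many cores.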

Results: In the first test, with 2 processors and 8 GB RAM, the benchmark
took 156.2 seconds to run, and with 4 processors and 16 GB RAM it took only
105.9 seconds. Harpertown with 8 cores and 16 GB RAM took around 200 seconds,
so this Dunnington server took about 47% less time.
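
Since a lower time is better here, the comparison is a time reduction rather
than a score increase. The short sketch below works it out from the times
quoted in the results; time_reduction_percent is our own illustrative helper.

```python
def time_reduction_percent(baseline_seconds, new_seconds):
    """Percentage of the baseline run time that the new run saves."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0


HARPERTOWN_S = 200.0        # "around 200 seconds", so only approximate
DUNNINGTON_2P_S = 156.2
DUNNINGTON_4P_S = 105.9

vs_harpertown = time_reduction_percent(HARPERTOWN_S, DUNNINGTON_4P_S)
vs_two_proc = time_reduction_percent(DUNNINGTON_2P_S, DUNNINGTON_4P_S)
print(f"4-processor Dunnington vs Harpertown: {vs_harpertown:.0f}% less time")
print(f"4 vs 2 Dunnington processors: {vs_two_proc:.0f}% less time")
```

With these numbers the 4-processor Dunnington server needs about 47% less time
than Harpertown, and about 32% less time than its own 2-processor
configuration.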

Comparison of the time taken by different machines to execute SunGard.

Next we ran Linpack, which brings almost any server to its knees. It
measures a system's floating point computing power by making the system solve
an N by N system of linear equations (i.e. Ax = b), and reports how many GFlops
the system can generate. The greater the number of GFlops, the better the
system.
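
The idea behind the benchmark can be sketched in a few lines: solve Ax = b,
time it, and divide the number of floating point operations by the elapsed
time. The code below is a plain-Python illustration, not the optimized Linpack
build we ran. It uses Gaussian elimination with partial pivoting and the
conventional operation count of 2/3 N^3 + 2 N^2; the function names are our
own.

```python
import random
import time


def solve(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def linpack_style_gflops(n, seed=0):
    """Time one random N by N solve and convert it to a GFlops figure."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    solve(a, b)
    elapsed = time.perf_counter() - start
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2  # conventional operation count
    return flops / elapsed / 1e9


print(f"{linpack_style_gflops(100):.4f} GFlops (single-threaded, pure Python)")
```

A pure-Python solver like this runs on one core and will report a tiny
fraction of what a tuned, multi-threaded Linpack achieves, so the number is
only useful for understanding how the metric is derived.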

Results: With 2 processors and 8 GB RAM the system generated 53.69 GFlops,
and with 4 processors and 16 GB RAM it gave 62.02 GFlops, which is lower than
the 65 GFlops generated by Harpertown. We got a lower score for Dunnington
because the Linpack build we had was customized for Harpertown.

To check the minimum power consumption we kept the system idle, and it drew
438 W. The maximum power drawn, as shown by the wattmeter, was 715 W.
