This new family of processors from Intel, called Core i7, is expected to hit
markets all over the globe soon. In the tick-tock model adopted by Intel, each
tick is a shrink of the previous microarchitecture and each tock is a brand new
microarchitecture. Nehalem, the microarchitecture behind Core i7, falls in the
tock part of the cycle. It is the successor to the Core microarchitecture.
Designed from the ground up, the new design is being showcased for the first
time in the Core i7 family of desktop-class processors.
The new microarchitecture incorporates a number of new features that, Intel
claims, result in better-performing and more efficient processors. Let's have a
look at some of the significant changes that these processors come with.
Native quad-core design
One of the most important reasons why Nehalem is a radically new design for
Intel is that, for the first time, the chip manufacturer is producing a 'native'
quad-core processor, where all four cores sit on the same piece of silicon,
similar to AMD's Phenom X4 CPUs. The earlier Core 2 Quad processors, by
contrast, were designed as multi-chip modules (MCMs), in which two dual-core
dies sat together in one package to make up four cores.
The advantages of having a native quad-core over an MCM are significant in
terms of processor energy efficiency, performance, and dynamic scalability.
Inclusive level 3 cache
First showcased on earlier Xeon server chips, level 3 cache now comes to the
desktop: the Core i7 family features a massive 8 MB of it (shared between all
four cores), compared to the 2 MB of the Phenom X4. The cache is also described
as an inclusive level 3 cache. Intel claims that an inclusive cache is more
efficient than an 'exclusive' cache design, even if it does mean that 1 MB of
Nehalem's 8 MB level 3 cache is taken up by storing a copy of the 256 KB level 2
cache inside each processing core.
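The 1 MB figure follows directly from the cache sizes quoted above. A minimal Python sketch of the arithmetic is given here; the function and variable names are our own, and it counts only the level 2 copies mentioned in the text.

```python
# Cache sizes as quoted in the article; all names are illustrative.
L2_PER_CORE_KB = 256
CORES = 4
L3_TOTAL_KB = 8 * 1024


def inclusive_overhead_kb(l2_per_core_kb: int, cores: int) -> int:
    """L3 space occupied by copies of every core's L2 in an inclusive design."""
    return l2_per_core_kb * cores


duplicated_kb = inclusive_overhead_kb(L2_PER_CORE_KB, CORES)
unique_l3_kb = L3_TOTAL_KB - duplicated_kb
overhead_pct = 100 * duplicated_kb / L3_TOTAL_KB

print(f"L2 copies held in L3: {duplicated_kb} KB")
print(f"L3 left for other data: {unique_l3_kb} KB")
print(f"Share of L3 used for copies: {overhead_pct:.1f}%")
```

In other words, one eighth of the level 3 cache duplicates data that already sits closer to the cores, which is the cost Intel accepts for the inclusive design.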
Integrated memory controller
By modularizing the design of the CPU and the Northbridge, the memory controller
has been brought onto the Nehalem CPU die, so the processor now talks to DDR3
memory directly. The conventional front side bus (FSB) is replaced by a new
point-to-point link called the QuickPath Interconnect, which takes over the
FSB's role of connecting the CPU to the rest of the system, such as the chipset
with its PCI Express controller. Together, these changes reduce latency and
improve performance considerably.
Hyper-threading
Another feature worth mentioning is Hyper-threading. Using spare resources of a
core to execute a second process thread, Hyper-threading enables a quad-core
Nehalem processor to accept and process eight threads simultaneously, making it
even more massively parallel and powerful than the current Core 2 Quad CPUs.
Note: The last three bars in both graphs show scores obtained by the Core i7 when overclocked to different frequencies.
New socket
As the Nehalem CPU communicates directly with memory, an additional bank of
connections to the motherboard is needed. The current Socket LGA775 doesn't have
enough pins to accommodate the memory controller, so Nehalem CPUs require the
new Socket LGA1366 which has 1,366 connections to the motherboard rather than
just 775. The only drawback here is that the two sockets are not compatible in
any way, so along with a new motherboard, you'll also need a new, bigger CPU
cooler for Nehalem-based processors.
Grilling CORE i7
Core 2 Quad v/s Core i7 v/s Core i7 overclocked
The processor we tested was the 3.2 GHz Intel Core i7 965 Extreme Edition. It is
the fastest of the three processors in the Core i7 family and also the most
expensive.
But before we could start running benchmarks on the new processor, we needed a
baseline for comparison. Therefore, we decided to turn this review into a
full-scale battle between the Core i7 Extreme and another quad-core processor,
the 2.66 GHz Core 2 Quad Q6700. We also couldn't resist the temptation of
pushing the envelope by overclocking the Core i7. What initially raised
eyebrows proved to be quite fruitful and informative in the end.
The power consumption of the Core i7, both when idle and when running intense applications, is less than that of the Core 2 Quad.
All tests were carried out first on the older quad-core to set a baseline. Then
the Core i7 was grilled using the same procedure. Finally, the tests were
repeated a number of times with the Core i7 overclocked.
The test bed
The bigger new chip was run on the Intel Extreme DX58SO motherboard with 4 GB of
DDR3 RAM, a Sapphire Radeon X1950 XTX graphics card and a 400 GB, 7200 rpm HDD.
A similar setup was used to run the older quad-core processor. The
motherboard used for the Core 2 Quad was the Gigabyte GA-EP45-UD3P with 4 GB of
DDR2 RAM and the same graphics card and HDD. In both cases, 32-bit Windows Vista
Ultimate edition was used.
PCMark05
This synthetic benchmark was used to start the testing process. A set of tests
was chosen that required the CPU to flex its muscles to the extreme. The Core 2
Quad was the first to take a shot at the benchmark. After tests like file, audio
and image compression and other multi-threaded ones, the processor's CPU score
came out to be a decent 6415. Next, we tested the new Core i7 Extreme, for which
we had high hopes. With all its new features and a native quad-core design, the
i7 didn't just surpass the mark set by its older cousin; it smashed it with a
CPU score of 10,995, about 71 percent higher. As if this wasn't enough, the
overclocked i7 more than doubled the older CPU's result, reaching a CPU score of
13,118 when running at 4.12 GHz.
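To put these scores in perspective, the short Python sketch below works out each result relative to the Core 2 Quad and also divides each score by the clock speed. The per-GHz figure is our own rough normalization, not a PCMark metric, and it assumes the score scales roughly with clock speed, which is only an approximation.

```python
# (CPU score, clock in GHz) as reported in this review.
results = {
    "Core 2 Quad": (6415, 2.66),
    "Core i7 965": (10995, 3.20),
    "Core i7 965 overclocked": (13118, 4.12),
}


def speedup(score: float, baseline_score: float) -> float:
    """How many times faster a result is than the baseline."""
    return score / baseline_score


def score_per_ghz(score: float, clock_ghz: float) -> float:
    """Rough per-clock figure (an illustrative normalization, not a PCMark metric)."""
    return score / clock_ghz


base_score, _ = results["Core 2 Quad"]
for name, (score, clock) in results.items():
    print(
        f"{name}: {speedup(score, base_score):.2f}x baseline, "
        f"{score_per_ghz(score, clock):.0f} points per GHz"
    )
```

The per-GHz column suggests that the gain is not down to the higher clock speed alone; the new architecture appears to do noticeably more work per cycle in this test.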
Block Diagram of the Intel X58 Express Chipset
POV Ray
This is another industry benchmark, which tests the CPU's capability by
rendering high-definition images with intricate details of light, shadows,
reflections and refractions. After the beating it received in the PCMark test,
the Core 2 Quad was again put to the test first. The render average for this
CPU came out to be 45,010 pixels per second (PPS), with a total time of 4.37
seconds. For a powerful desktop-class processor, that score is more than
decent. When the same test was run on the Core i7, the outcome managed to amaze
us again. In a total time of 2.06 seconds, the render average came out to be
95,325 PPS. In other words, the render finished in less than half the time, at
a little more than twice the throughput. When overclocked and running at 4.12
GHz, it took only 1.81 seconds to render at an overwhelming 113,359 PPS.
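The gain can be read off either the render time or the pixels-per-second figure, and the two should roughly agree. The Python sketch below, with names of our own choosing, computes both ratios from the numbers reported above.

```python
# POV-Ray results as reported in this review; names are illustrative.
runs = {
    "Core 2 Quad": {"pps": 45010, "seconds": 4.37},
    "Core i7": {"pps": 95325, "seconds": 2.06},
    "Core i7 @ 4.12 GHz": {"pps": 113359, "seconds": 1.81},
}


def throughput_gain(run: dict, base: dict) -> float:
    """Ratio of pixels rendered per second."""
    return run["pps"] / base["pps"]


def time_gain(run: dict, base: dict) -> float:
    """Ratio of render times (baseline time divided by this run's time)."""
    return base["seconds"] / run["seconds"]


base = runs["Core 2 Quad"]
for name, run in runs.items():
    print(
        f"{name}: {throughput_gain(run, base):.2f}x by throughput, "
        f"{time_gain(run, base):.2f}x by render time"
    )
```

At stock speed the two ratios match closely. For the overclocked run they differ somewhat, which may simply reflect rounding in the reported times.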
CINEBENCH Release 10
This benchmark, which also checks the multi-threading capability of a processor,
was no exception. The Core i7 again came out the winner by a huge margin. When
using one core, the Core 2 Quad scored 2163, whereas the Core i7 scored 3809,
and an even higher 4439 when overclocked. When using all four cores, the Core 2
Quad scored 6375, whereas the Core i7 leaped to 15,533, and to an even higher
16,596 when running at 4.0 GHz. The multiprocessor speedup of the Core 2 Quad
was 2.95x, compared to 4.08x for the i7. In the OpenGL test, too, the Core 2
Quad system was left behind at 101, with the Core i7 scoring 172, and 203 when
overclocked.
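The multiprocessor speedup is simply the all-core score divided by the single-core score. The Python sketch below reproduces it from the scores above and also expresses it as a fraction of the ideal 4x for four cores; that efficiency figure is our own addition, not something CINEBENCH reports.

```python
# (single-core score, all-core score) as reported in this review.
scores = {
    "Core 2 Quad": (2163, 6375),
    "Core i7": (3809, 15533),
    "Core i7 overclocked": (4439, 16596),
}

PHYSICAL_CORES = 4  # both processors are quad-core


def mp_speedup(single: float, multi: float) -> float:
    """All-core score relative to the single-core score."""
    return multi / single


def scaling_efficiency(single: float, multi: float, cores: int = PHYSICAL_CORES) -> float:
    """Speedup as a fraction of the ideal 'one times the core count'."""
    return mp_speedup(single, multi) / cores


for name, (single, multi) in scores.items():
    print(
        f"{name}: {mp_speedup(single, multi):.2f}x speedup, "
        f"{scaling_efficiency(single, multi):.0%} of ideal four-core scaling"
    )
```

A speedup slightly above 4x on four physical cores is plausible for the Core i7 because Hyper-threading lets it run eight threads at once. The overclocked row should be read with caution, since the single-core and all-core runs may not have been taken at the same frequency.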
Virtualization
What might seem like an odd thing to test is actually quite important and
relevant for people who use it. Because desktop-level virtualization is
being used in the real world, we couldn't ignore this area.
To get an idea of the processor's performance when running several virtual
machines, we used the CPU test of PCMark05 again. Using Windows Virtual PC, we
created three virtual desktop systems running Windows Vista Ultimate edition
(32-bit). And as if that wasn't enough of a load in itself, we ran the benchmark
simultaneously on all three. The average of the three results was used to
compare the two processors and, as expected, the Core i7 took the cake. The
average CPU score of the three virtual machines running on the older Core 2
Quad was 1288, compared to a staggering 4255 on the Core i7, more than three
times as high. Scores aside, the system running on the Core i7 was much more
responsive and lag-free. We refrained from overclocking the processor in this
case, as the load seemed overwhelming and we feared it could permanently damage
the chip.
Conclusion
We had high hopes for Nehalem. The new microarchitecture incorporates
significant changes, so we expected it to perform well. But what we did not
expect was performance nearly double that of its closest predecessor. Nor did
we anticipate that it would hammer the Core 2 Quad while also consuming less
power. Overall, it can be safely said that the new processor is extremely
powerful, consumes less power and has given desktop performance a whole new
definition.
Bottom line: The new processor, based on the Nehalem microarchitecture, is
extremely powerful and more power efficient than earlier versions. The only
factor that bothers us is its steep price.