
Next Gen Processors

PCQ Bureau

Through the years, computing has evolved at breakneck speed, and processors have been at the forefront of that evolution, the driving force behind all innovation. As clock speeds rise, every other component has to be revamped to keep up with the processor. This has generally led to a far better overall computing experience.


The speed race has obviously meant that we have reached the limits of single-core processors. As NetBurst has no doubt taught everyone, clocking a core too high will give you some pretty serious burns, and if you are in the server market, that's never a good thing.

However, going by the current trend showcased at events like IDF and CeBIT, it is clear that the next bet from these companies is not on dual cores but on multi-cores. Everything we have seen, from roadmaps to demonstrations, has got us rubbing our hands with glee in anticipation.

While speeds are important, an interesting shift we are witnessing is that manufacturers (some early, some like Intel later on) have finally realized that clocking the core higher is not the only way.


Future development is, thus, focused more and more on overall system performance. From removing bottlenecks in bus speeds to supporting higher-clocked RAM, it's all coming together to give you performance like never before.

Even Intel has come down from its high horse of high clock speeds and is getting serious about performance per watt. Indeed, in the past couple of IDF sessions, performance/watt has been its mantra. Other companies realized this quite some time back, and the rival Opterons and UltraSPARCs have already been running cooler and more efficiently.

Number crunching

With your business growing, you obviously need to cram ever more processing power into as small an area as possible. The space constraint obviously gets in the way of heat management, but today it's possible to get some heavy-duty power in relatively small packages. Currently we know them as dual-core processors, but that is really just the first stage, a stepping stone, as we make a true transition to multi-core, multi-processor systems that can give you four times or more the current performance levels in an equal, if not smaller, footprint.

The Cell Processor

This is perhaps the most promising of these technologies, and it comes in a grossly under-rated package. The Cell processor from IBM (developed jointly with Sony and Toshiba) finds itself rather under-utilised in the yet-to-be-released PlayStation 3. The Cell is frankly a work of sheer genius. It is made up of eight co-processing cores called SPEs, or Synergistic Processing Elements, all connected to an arbiter called the PPE, or PowerPC Processing Element, which is the chip's PowerPC-based core. The PPE decides the task for each SPE and doles it out accordingly. The claimed processing power is around 2 Teraflops!
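To picture how work flows through the chip, here is a minimal conceptual sketch of the PPE/SPE split, written with ordinary POSIX threads purely for illustration; the worker routine and task counts are hypothetical, and this is not the real Cell SDK.

/* build: cc -O2 -pthread cell_sketch.c
 * Conceptual sketch only: a "PPE" hands independent work items to eight
 * worker contexts, the way the real PPE doles tasks out to the SPEs.
 * Plain POSIX threads stand in for the Cell SDK here. */
#include <pthread.h>
#include <stdio.h>

#define NUM_SPES  8      /* one worker per (simulated) SPE */
#define NUM_TASKS 64     /* hypothetical batch of independent tasks */

typedef struct {
    int first, last;     /* range of task ids this worker handles */
} work_slice;

static void *spe_worker(void *arg)
{
    work_slice *slice = (work_slice *)arg;
    for (int i = slice->first; i < slice->last; i++) {
        /* the actual number crunching for task i would go here */
    }
    return NULL;
}

int main(void)
{
    pthread_t  spes[NUM_SPES];
    work_slice slices[NUM_SPES];
    int per_spe = NUM_TASKS / NUM_SPES;

    /* the "PPE" splits the batch and doles a slice out to each "SPE" */
    for (int s = 0; s < NUM_SPES; s++) {
        slices[s].first = s * per_spe;
        slices[s].last  = slices[s].first + per_spe;
        pthread_create(&spes[s], NULL, spe_worker, &slices[s]);
    }
    for (int s = 0; s < NUM_SPES; s++)
        pthread_join(spes[s], NULL);

    printf("all %d tasks processed across %d workers\n", NUM_TASKS, NUM_SPES);
    return 0;
}

The point is simply that the PPE's job is coordination, while the heavy lifting happens in parallel on the SPEs.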



Considering that the PlayStation 3 will be no larger than your average DVD player, that's a very powerful and space-efficient solution right there! The amazing part about the Cell is that, since it is inherently a system-on-chip design, scaling up the processor itself is a simple matter of adding more SPEs. In fact, entire Cell processors (PPEs, SPEs et al) can be connected to one another for tremendous computing power. Of course, we haven't seen a demo of this yet, but we are sure the potential for a killer server-grade processor is definitely there.



Imagine the possibilities: you could take your entire server cluster and replace it with just a Cell processor-based server, perhaps the size of just 1U! Cell processor-based servers definitely can't come out fast enough!

Intel's NGMA (Next Gen Micro Architecture)

Intel has mostly been quiet about its processors recently, clearly because of the thrashing it has received at the hands of AMD. Having said that, Intel is going strong on dual- and multi-core processors to give you maximum performance per square inch!

Intel has made a dramatic shift from pure clock speeds towards being power conscious, its Tulsa processor being a case in point. Granted, the processor is merely dual core, but other features like the 16 MB cache and the thinner 65 nm fabrication process boost performance tremendously. The Blackford chipset, which Tulsa runs on, offers a total memory bandwidth of 17 GB/s and a total RAM capability of 64 GB!


The other upcoming processors from Intel are Woodcrest, featuring a 4 MB L2 cache, and its successor Whitefield, which will be a quad core with a 16 MB L2 cache! All these multi-cores, coupled with multiple processors, will lead to an unprecedented jump in the calculation power each server provides you with.

Sun's Niagara

The chip has been rechristened UltraSPARC T1 (Niagara being the code name) and is a very good example of Sun's engineering brilliance. This is the first time they have implemented their throughput philosophy in silicon. It is also a perfect example of how CPUs of the future will be, which is why it finds mention here.



Niagara runs at 1.2 GHz, and each core has a 32 KB L1 data cache and a 16 KB L1 instruction cache. These numbers really aren't too impressive considering the UltraSPARC IV+ (Sun's earlier dual core) is faster, clock for clock, by around 20 percent and has a 2 MB L2 cache vis-à-vis 3 MB for Niagara. So for single-core performance, the UltraSPARC IV+ beats the Niagara hands down. If you look at overall performance, however, the eight-core Niagara, with help from its revolutionary crossbar memory controller, beats all its previous brethren by a huge margin. The focus is on TLP (Thread Level Parallelism), and a maximum of 32 different program threads can be processed simultaneously! This is exactly the kind of performance we are looking for to fit our objective of maximum computing per square inch!
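To see the throughput idea in code, here is a minimal sketch assuming POSIX threads and a hypothetical handle_request() routine: 32 software threads kept busy at once, which is exactly the kind of load the T1's 32 hardware threads are designed to soak up.

/* build: cc -O2 -pthread niagara_sketch.c
 * Throughput computing in miniature: 32 concurrent software threads, matching
 * the 32 hardware threads Niagara exposes (8 cores x 4 threads per core),
 * each servicing its own stream of independent requests. */
#include <pthread.h>
#include <stdio.h>

#define HW_THREADS       32
#define REQUESTS_PER_THR 100   /* hypothetical workload per thread */

static void handle_request(int thread_id, int request_id)
{
    /* stand-in for real work: parse a request, hit the cache, reply */
    (void)thread_id;
    (void)request_id;
}

static void *request_loop(void *arg)
{
    int id = *(int *)arg;
    for (int r = 0; r < REQUESTS_PER_THR; r++)
        handle_request(id, r);
    return NULL;
}

int main(void)
{
    pthread_t threads[HW_THREADS];
    int ids[HW_THREADS];

    for (int t = 0; t < HW_THREADS; t++) {
        ids[t] = t;
        pthread_create(&threads[t], NULL, request_loop, &ids[t]);
    }
    for (int t = 0; t < HW_THREADS; t++)
        pthread_join(threads[t], NULL);

    printf("%d threads serviced %d requests each\n", HW_THREADS, REQUESTS_PER_THR);
    return 0;
}

No single thread here is fast; the win comes from keeping all 32 busy at the same time, which is precisely Niagara's design goal.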


Cluster computing

Cluster computing could actually both gain and lose from individual servers getting so powerful while costing, for all practical purposes, the same. Those with basic requirements could perhaps do away with their clusters altogether (depending, of course, on the type of apps you run on your servers) and just buy a couple of thin servers, making data-center management seem like a vacation!

Those with much higher performance requirements would obviously make a cluster of these ever more powerful systems and get much more performance out of them.

Power consumption and heat

In our quest for maximum computing per square inch, we obviously cannot neglect power consumption. Electricity bills are perhaps the bane of every data center.



The costs increase basically on two counts: running the servers and then keeping them cool. If we have processors with a lower TDP, and thus less heat generation, both these costs can be cut quite significantly. The processor manufacturers are not oblivious to these needs and are pulling out all the stops to give you maximum power at the lowest energy bills.

Apart from the usual L1 and L2 caches, the Itanium houses a massive L3 cache

Intel's NGMA

Up until now, Intel's philosophy has been 'the highest clock speed will win you the world'. But not anymore. With the last IDF (Fall 2005), Intel has undergone a paradigm shift in its concept of performance. It is now focusing on performance/watt. This means more efficient overall system performance, something that AMD has been touting for years.

General Purpose GPUs

This is an interesting concept being pursued by the graphics-card vendors; in particular, ATi has been backing it quite strongly. They say that GPUs today (theirs, anyway) have enough processing power to offload some of the tasks from the CPU. The first step they have taken towards this is in decoding multimedia content.



In fact, both NVIDIA and ATi have successfully demonstrated that while using their graphics cards on a system, the load on the processor (as far as decoding multimedia content goes) gets significantly reduced. This leads to overall system efficiency, as your CPU is free to perform other tasks while multimedia processing goes on in the background.



Thanks to the sheer optimization GPUs undergo for the ever heavier shader workloads in games, they are more like super-tuned computing machines. The graphics-card companies want to exploit this fact. The result could very well be high levels of parallel processing and PCs optimized for multi-tasking.



ATi is taking this concept further. They claim that their X1900 XTX has a total computing power of 500+ GFLOPS, compared to a mere 80 GFLOPS (maximum) for dual-core Pentium 4s. Obviously, this means that calculation-intensive tasks can easily be done by the GPU instead of loading up the CPU. Thus, ATi says it will not require a dedicated physics engine to deliver lifelike realism: the claim is that, with the gigaflops of power available, their cards can do rendering, physics calculations and more!
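As a rough illustration of why such workloads suit a GPU, here is a sketch of a per-particle physics step written as a plain C loop; the particle count and the simple gravity integration are our own illustrative choices, and the actual offload would go through the vendor's GPGPU interface, which is not shown. What matters is that every iteration is independent, so hundreds of shader units can chew through it in parallel.

/* build: cc -O2 gpgpu_sketch.c
 * Illustrative only: a per-particle update where each element is independent
 * of the others. This is the kind of data-parallel loop that maps well onto
 * a GPU's many shader units. */
#include <stdio.h>

#define N_PARTICLES 10000            /* hypothetical scene size */

typedef struct { float x, y, z; } vec3;

static vec3 pos[N_PARTICLES], vel[N_PARTICLES];

static void step_physics(float dt)
{
    /* each iteration touches only its own particle, so all of them
     * could run in parallel */
    for (int i = 0; i < N_PARTICLES; i++) {
        vel[i].z -= 9.81f * dt;      /* gravity */
        pos[i].x += vel[i].x * dt;
        pos[i].y += vel[i].y * dt;
        pos[i].z += vel[i].z * dt;
    }
}

int main(void)
{
    step_physics(0.016f);            /* one frame at roughly 60 fps */
    printf("updated %d particles\n", N_PARTICLES);
    return 0;
}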


Intel has also moved to a unified micro-architecture across platforms. So whether it is Yonah for notebooks, Conroe for the desktop or Woodcrest for servers, the basic architecture remains the same. The base architecture is actually a Banias derivative which, being an architecture designed for notebooks, makes it extremely frugal on electricity.

Intel has just launched its low-voltage Xeon processor, the first of its kind (from Intel, anyway) to feature dual cores as well as low power consumption. The processor has a TDP of just 31 W! TDP, of course, is the power the chip draws while running conventional software at its maximum, and is usually around 90 percent of the maximum power the chip is ever expected to require.

If you compare this 31 W to the 110 W of the single-core (Irwindale) processors in dual-processor servers, you begin to appreciate the drop in power consumption: 31 W is roughly 28 percent of 110 W, so consumption has been cut to less than 30 percent of what it was. So even though the initial investment might be a bit high, the running costs will make up for it in no time.
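To see what that gap means on the electricity bill, here is a back-of-the-envelope sketch; the tariff and the assumption of round-the-clock operation at TDP are ours, not Intel's figures, and cooling costs are left out entirely.

/* build: cc -O2 power_sketch.c
 * Rough annual running-cost comparison for one processor, using the TDP
 * figures quoted above and a hypothetical electricity tariff. */
#include <stdio.h>

int main(void)
{
    const double old_watts      = 110.0;   /* single-core Irwindale-class part */
    const double new_watts      = 31.0;    /* low-voltage dual-core Xeon */
    const double hours_per_year = 24.0 * 365.0;
    const double rate_per_kwh   = 0.10;    /* hypothetical tariff */

    double old_kwh = old_watts * hours_per_year / 1000.0;
    double new_kwh = new_watts * hours_per_year / 1000.0;

    printf("old: %.0f kWh/yr, new: %.0f kWh/yr, saving %.2f per CPU per year\n",
           old_kwh, new_kwh, (old_kwh - new_kwh) * rate_per_kwh);
    return 0;
}

Multiply that saving by a rack full of processors, and again for the cooling load, and the lower TDP starts paying for itself quickly.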

Sun's Niagara

While the eight cores of Niagara are pretty impressive in themselves, the other incredible thing about it is the sheer frugality of its power consumption. While PowerPCs can consume around 100 W of power per core, and Intel's Paxville and single-core Cranford use between 110 and 135 W, the Niagara uses a maximum of 72 W! So if you have a large data center, the saving in terms of power alone would perhaps be worth the switch. The amazingly low power consumption also means that you can pack your data center a lot more densely, as these systems will run far cooler than your current rack.

Compared with the LV Xeon, this 72 W is impressive because Niagara has eight cores while the LV Xeon is merely dual core. The obvious limitation here is that Niagara supports only Solaris 10, while the Xeon will let you load pretty much any OS on it.

AMD Opterons

AMD has actually been having quite a good run with its recent Opteron series. No longer are AMD's processors plagued by the heating issues they had earlier. The current Opterons run at a TDP of 85-90 W, which is significantly cooler than Intel's Xeons or even Itaniums.

So clearly, the shift across all the manufacturers has been from making the fastest processors to making processors that are more economical and that run cooler and more efficiently. The focus is moving from the fastest clock speed to increasing overall system performance, and it couldn't have come sooner.

Virtualization technology

After years of virtualization being confined to top-end servers, it has finally trickled down, and even desktops are coming with VT (virtualization technology) at the processor level! All major processor vendors, from Intel to IBM to AMD to Sun, implement VT in one form or another.

This greatly increases the ease with which you can run multiple operating environments/systems without resorting to any sort of software optimization. You will be able to run completely independent environments, which may or may not run the same OS, virtualized right down to the hardware level. Dual and multi-cores will facilitate this even more by providing you multiple processors (or cores) at the hardware level. Each environment will then probably get mapped to a core: the more cores you have, the better your VT performance!
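Checking whether a processor has this support is straightforward. Here is a minimal sketch, assuming GCC's <cpuid.h> helper on an x86 machine: bit 5 of ECX in CPUID leaf 1 advertises Intel's VMX, and bit 2 of ECX in extended leaf 0x80000001 advertises AMD's SVM (Pacifica).

/* build: cc -O2 vtcheck.c  (x86/x86-64, GCC or compatible) */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* standard leaf 1: ECX bit 5 = VMX (Intel VT) */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 5)))
        printf("Intel VT (VMX) supported\n");

    /* extended leaf 0x80000001: ECX bit 2 = SVM (AMD-V / Pacifica) */
    if (__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 2)))
        printf("AMD-V (SVM/Pacifica) supported\n");

    return 0;
}

Keep in mind that the BIOS can still leave the feature disabled even when the flag is present.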

The Opteron has a hefty L2 cache which saves CPU clocks while computing

Note that we are still talking of cores per processor (Sun's T1 gives you up to eight!). Once you combine these multi-cores into multi-processor systems, the true impact on performance begins to emerge. Virtualization promises to make administration tremendously simpler. No longer will you have to go from PC to PC doing routine maintenance tasks, nor will you have to rely on individuals to maintain their machines. All you need to do is create two different virtual machines: one for the user, and the other for you to use (remotely, at that!) for running your maintenance tasks. In fact, you can do both these things simultaneously with almost no downtime!

If processing power grows enough (and it will, by the looks of it), you might even be able to club your servers doing different tasks into a single machine! Say you have a mail server, a web server and a print server. Instead of having multiple machines, hardware-level VT will let you combine all of these into a single server with the various environments virtualized. This saves you space, money and running costs.

The biggest benefit of the hardware-level VT that processors today carry is that each environment runs completely independently of the others. So even if one of them crashes or gets infected by a virus, the others run unaffected. Moreover, you can simply reset the crashed environment without having to restart the whole machine!

All major vendors today provide VT at the processor level. While Intel calls it Intel Virtualization Technology, AMD calls it Pacifica, and Sun, Logical Domains.

Is the software optimized?

So far, we've seen technologies and innovations being incorporated in the CPUs of today and tomorrow. But performance is a mix of software and hardware, and having the fastest hardware will get you nowhere if you use software that can't take advantage of it.

That might be one of the bottlenecks initially. Hardware normally leaps ahead and then the software catches up. So even though you might invest in a super-fast system right now, the benefits might not show up till software vendors release applications that exploit its capabilities. Still, the time lag might be a small price to pay for the leap in performance we can expect in the coming quarters.

Licensing issue

With all these myriad cores changing the very definition of a processor, it is natural that the legal eagles have their work cut out for them. Earlier, processors used to be simple devices and licensing was done per processor, or indeed per box. We then migrated to dual- and multi-processor architectures, which complicated licensing further. Should firms charge licensing per server box, or on a per-processor basis? The dilemma arises because multi-processor systems might replace several installed servers, reducing the total deployment. The company would then need fewer licenses, which translates into a loss of revenue for software vendors.

Things get even more complex when we enter the realm of multi-core, multi-processor systems. The tremendous jump in performance will mean even fewer licenses are required, which means software companies could start charging per core!

All this could translate into some pretty high costs while migrating to multi-core and/or multi-processor systems, and is something you should ascertain before spending your money.

There is no doubt that processing power will go through the roof in the immediate future. But unlike in earlier times, these processors will also run cooler, be more energy efficient and give you many important features like virtualization and dual/multi-core designs.

The only hitch, as we mentioned, could be the perennial lag while software catches up with hardware, and the licensing. Let's hope that the software companies show maturity and come up with customer-friendly licensing agreements that don't penalize you for going in for a more powerful server. All said and done, though, the future of CPUs is definitely an exciting one!
