[Theregister] POWER8 Details revealed

insertcarehere · Aug 28, 2013

Theregister said:
http://www.theregister.co.uk/2013/08/27/ibm_power8_server_chip/?page=1

The Power8 chip is implemented in IBM's familiar high-k metal gate processes, which include copper and silicon-on-insulator technologies in a 22-nanometer process. The precise transistor count was not given during the presentation, but the Power8 chip weighs in at 650 square millimetres; this is a bit bigger than Power7+, which used a 32-nanometer process, had 2.1 billion transistors, and a surface area of 567 square millimetres.

The Power8 core has a total of sixteen execution pipes. These include two load store units (LSUs) and a condition register unit (CRU), a branch register unit (BRU), and two instruction fetch units (IFUs). There are two fixed-point units (FXUs), two vector math units (VMXs), a decimal floating unit (DFU), and one cryptographic unit (not labeled in the core diagram above).

Each core now has eight threads implemented using simultaneous multithreading (what IBM calls SMT8), instead of four threads per core with the Power7 and Power7+ chips. And like earlier Power chips, this SMT is dynamically tuneable so a core can have one, two, four, or eight threads fired up.

Each core has 512KB of SRAM memory etched right near it. A segmented NUMA-like L3 cache using what IBM calls a "non-uniform cache architecture" or NUCA for short, spans all twelve cores on the die, for a total of 96MB of L3 cache. That's only 8MB of L3 cache per core, compared to 10MB per core for the Power7+ chip announced last year, but the Power8 has a much more sophisticated main memory subsystem and an L4 cache that obviates the need for so much L3 cache on the die. (More on that in a second.) The L3 cache is implemented using embedded DRAM, as was the case with the Power7 and Power7+ processors.

At a 4GHz clock speed, you can move data into L3 cache from the external L4 cache at 128GB/sec and from the L3 cache out to L4 at 64GB/sec. Data can be crammed into L2 cache from L3 at 128GB/sec (or back out at the same bandwidth). The pipe from L2 cache into the cores has 256GB/sec of bandwidth, but only 64GB/sec in the other direction. Add it all up, across a twelve-core Power8 chip that works out to 4TB/sec of L2 cache bandwidth and 3TB/sec of L3 cache bandwidth.

Hopefully this will be some competition in the top end to Intel's server products. I can't imagine how many transistors make up 650+mm^2 on a 22nm process.
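The aggregate bandwidth figures in the quoted article can be sanity-checked with a bit of arithmetic. The sketch below is only a back-of-the-envelope check: the per-core numbers are taken from the quote, but the way they are summed (both directions added together, then multiplied by twelve cores) is an assumption about how the totals were derived, not something the article spells out.

```python
# Back-of-the-envelope check of the aggregate cache bandwidth figures.
# Per-core numbers (at 4GHz) come from the quoted article; summing both
# directions and multiplying by the core count is an assumption.
CORES = 12

L2_TO_CORE_GBPS = 256  # L2 cache into the core
CORE_TO_L2_GBPS = 64   # core back out to L2
L3_TO_L2_GBPS = 128    # L3 into L2
L2_TO_L3_GBPS = 128    # L2 back out to L3


def aggregate_gbps(inbound, outbound, cores=CORES):
    """Sum both directions for one core, then scale to the whole chip."""
    return (inbound + outbound) * cores


l2_total = aggregate_gbps(L2_TO_CORE_GBPS, CORE_TO_L2_GBPS)
l3_total = aggregate_gbps(L3_TO_L2_GBPS, L2_TO_L3_GBPS)

print(f"L2: {l2_total} GB/sec (~{l2_total / 1000:.1f} TB/sec)")
print(f"L3: {l3_total} GB/sec (~{l3_total / 1000:.1f} TB/sec)")
```

Under that assumption the totals come to 3840GB/sec for L2 and 3072GB/sec for L3, which round to the 4TB/sec and 3TB/sec quoted above.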

Idontcare · Aug 28, 2013

That thing is a beast!

Enigmoid · Aug 28, 2013

That's one big chip. I'm kinda scared about how much power it must suck up.

AtenRa · Aug 28, 2013

It's the work done with this power that we care about.

thunng8 · Aug 28, 2013

insertcarehere said:
Hopefully this will be some competition in the top end to Intel's server products. I can't imagine how many transistors make up 650+mm^2 on a 22nm process.

What do you mean by this? The Power7+ is already a much faster processor than the top Xeons in server workloads, and this announcement will only stretch IBM's lead.

mikk · Aug 28, 2013

thunng8 said:
What do you mean by this? The Power7+ is already a much faster processor than the top Xeons in server workloads, and this announcement will only stretch IBM's lead.

IVB-EX is a game changer; it will have the lead against Power 7+. Power 8 is coming in H2 2014 and is a Haswell-EX competitor.

thunng8 · Aug 28, 2013

mikk said:
IVB-EX is a game changer; it will have the lead against Power 7+. Power 8 is coming in H2 2014 and is a Haswell-EX competitor.

Ivy Bridge-EX has slipped until at least Q1 2014. Power7+ came out in 2012. Power8 is due mid-2014.

Chiropteran · Aug 28, 2013

Are these going to be sold in any systems for under $10,000?

thunng8 · Aug 28, 2013

Chiropteran said:
Are these going to be sold in any systems for under $10,000?

Current Power7+ systems start from around 6k, so I'd imagine the entry price for Power8 should be similar.

http://www-03.ibm.com/systems/power/hardware/710/browse_aix.html

insertcarehere · Aug 28, 2013

thunng8 said:
Ivy Bridge-EX has slipped until at least Q1 2014. Power7+ came out in 2012. Power8 is due mid-2014.

Considering that they have added 4-processor support to LGA2011, low end Power7+/Power8 systems will face competition from 4P IVB-EP processor servers very soon. I am not very familiar with server features, but what important features do Xeon E7/POWER provide over the 'lesser' server platforms?

BrightCandle · Aug 28, 2013

Now what we need is Windows ported to Power8, a Visual C++ with cross compilation, some consumer level motherboards, Graphics drivers and we can finally get a decent machine with some progress. Hmm sounds like a lot of work.

Chiropteran · Aug 28, 2013

thunng8 said:
Current Power7+ systems start from around 6k, so I'd imagine the entry price for Power8 should be similar.

http://www-03.ibm.com/systems/power/hardware/710/browse_aix.html

Ah, cool. I didn't realize they published prices on the website now. I remember looking because I was curious many years ago and it seemed to be one of those "if you don't already know the price, you can't afford it" type things.

rainy · Aug 28, 2013

insertcarehere said:
Hopefully this will be some competition in the top end to Intel's server products. I can't imagine how many transistors make up 650+mm^2 on a 22nm process.

"Hopefully" and "some" aren't appropriate words in my opinion. Honestly, I'm not into server CPUs, but does Intel have anything comparable to that beast?

mikk · Aug 28, 2013

thunng8 said:
Ivy Bridge-EX has slipped until at least Q1 2014. Power7+ came out in 2012. Power8 is due mid-2014.

Power 7+ came out in H1 2013, or at least that was its first availability. A paper launch doesn't count. A big lead is expected for IVB-EX over Power 7+. The current plan is H2 2014 for Power 8 and HSW-EX.

thunng8 · Aug 28, 2013

mikk said:
Power 7+ came out in H1 2013, or at least that was its first availability. A paper launch doesn't count. A big lead is expected for IVB-EX over Power 7+. The current plan is H2 2014 for Power 8 and HSW-EX.

At least get your facts right. Power7+ shipped in the first servers in October 2012.

Both the Power 770+ and Power 780+ machines will be generally available on October 19

http://www.theregister.co.uk/2012/10/03/ibm_power7_plus_server_launch/?page=2

And how is Haswell-EX shipping in H2 2014 when Ivy Bridge-EX is only shipping in H1 2014? Ivy Bridge-EX would have the shortest lifespan in the history of Intel high-end server chips (Westmere-EX will be close to 3 years old before being replaced by Ivy Bridge-EX).

The Register article quotes IBM saying that Power8 will ship in systems in mid-2014.

thunng8 · Aug 28, 2013

BrightCandle said:
Now what we need is Windows ported to Power8, a Visual C++ with cross compilation, some consumer level motherboards, Graphics drivers and we can finally get a decent machine with some progress. Hmm sounds like a lot of work.

That in fact might be possible in the future as IBM has opened up Power8 for licensing.

There was an announcement last month and Google and Nvidia are some of the early partners.

http://en.wikipedia.org/wiki/OpenPOWER_Consortium

insertcarehere · Aug 28, 2013

rainy said:
"Hopefully" and "some" aren't appropriate words in my opinion. Honestly, I'm not into server CPUs, but does Intel have anything comparable to that beast?

It's going to be competing against 12-core IVB-EP and 15-core IVB-EX right at launch, not to mention later Haswell-based derivatives, so it's closer than it may seem. Of course, each POWER8 core probably has insane IPC and multithreading capability (8-way SMT!), so I can't really tell.

SunnyD · Aug 28, 2013

BrightCandle said:
Now what we need is Windows ported to Power8, a Visual C++ with cross compilation, some consumer level motherboards, Graphics drivers and we can finally get a decent machine with some progress. Hmm sounds like a lot of work.

Porting Windows is a mostly pointless endeavor. Go look at NT 3.51. It ran on DEC Alpha, MIPS and PowerPC. The problem is there were largely no APPS that ran on those platforms under it.

Having an OS that runs on a given platform is one thing. Having the dev tools is another (a good step). But when you deal in a closed software binary ecosystem like Windows, having multiple hardware architectures that require different binaries makes the entire endeavor pointless, since you have to deliver multiple binaries of the same application for multiple architectures. Otherwise you need an emulator like the one DEC provided with FX!32.

Server/workstation platforms like this are better served by *nix, custom OS platforms (usually based on *nix) and the like.

Homeles · Aug 28, 2013

This will also be competing against SPARC and the remnants of Itanium.

NTMBK · Aug 28, 2013

BrightCandle said:
Now what we need is Windows ported to Power8, a Visual C++ with cross compilation, some consumer level motherboards, Graphics drivers and we can finally get a decent machine with some progress. Hmm sounds like a lot of work.

If you're going to buy that kind of machine, why not try Xeon E7s instead? IVB E7 should be a beast.

NTMBK · Aug 28, 2013

The PCIe stuff excited me most.

These integrated PCI-Express 3.0 controllers on the Power8 die provide the transport layer for what IBM is calling the Coherence Attach Processor Interface, or CAPI. And this interface will allow accelerators plugged into the PCI bus of a system - possibly GPU coprocessors or field programmable gate arrays - to easily access data and follow pointers in main memory just like processors themselves do.

GPGPU accessing main memory and using its pointer structures should be very, very useful. The sooner we get this on x86 the better.

AtenRa · Aug 28, 2013

NTMBK said:
The PCIe stuff excited me most.

GPGPU accessing main memory and using its pointer structures should be very, very useful. The sooner we get this on x86 the better.

hUMA is coming to x86 soon.

NTMBK · Aug 28, 2013

AtenRa said:
hUMA is coming to x86 soon.

Not soon enough. I want my Kaveri 6 months ago, dammit.

SocketF · Aug 28, 2013

Idontcare said:
That thing is a beast!

Totally agree. Now we just need AMD to become a member of the OpenPOWER club and design an x86 decoder for that animal. Then they can ditch their Babydozer line.

A dual core version with 16 threads 16MB L3 @5 GHz would be enough for me :ninja:

aigomorla · Aug 28, 2013

I count 12 physical cores with 8MB of cache each, making it 96MB of L3 cache total.

That thing is a beast...
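For anyone who wants to redo the count, here is a minimal sketch of the per-chip totals implied by the quoted article. The core count, 8MB of L3 per core, 512KB of SRAM per core and the SMT modes come from the quote; treating that 512KB of SRAM as the per-core L2 cache is an assumption, and the variable names are just illustrative.

```python
# Chip-level totals derived from the per-core figures in the quoted article.
CORES = 12
L3_PER_CORE_MB = 8
L2_PER_CORE_KB = 512     # assumption: the 512KB of SRAM per core is the L2
SMT_MODES = (1, 2, 4, 8)  # thread counts a core can be tuned to

l3_total_mb = CORES * L3_PER_CORE_MB
l2_total_mb = CORES * L2_PER_CORE_KB / 1024

# Hardware threads per chip for each SMT mode.
threads_by_mode = {mode: CORES * mode for mode in SMT_MODES}

print(f"L3 total: {l3_total_mb}MB")
print(f"L2 total: {l2_total_mb:.0f}MB")
for mode, threads in threads_by_mode.items():
    print(f"SMT{mode}: {threads} threads per chip")
```

That matches the 96MB of L3 counted above, and with SMT8 enabled a twelve-core chip would expose 96 hardware threads.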
