Question What's going on with Power?

NTMBK · Jun 4, 2020

Semiaccurate have a paywalled story up about Power: https://www.semiaccurate.com/2020/06/03/is-ibm-killing-off-power/ Apparently something big is happening, though it's definitely not being killed. Anyone got any ideas what it might be? Random thoughts that occur to me are

They could be selling the business
They could be switching fab (ditching Samsung/GloFo and going to TSMC or Intel)
They could be changing business model (leaning more into OpenPOWER? Killing OpenPOWER?)

Or something else entirely!

moinmoin · Jun 4, 2020

Since IBM bought Red Hat I hope they make Power fit better with their whole open source strategy. That and a bigger, more public push for Power in general would be nice to have.

DrMrLordX · Jun 4, 2020

Technically, they can't kill OpenPOWER. They can stop submitting designs to it though.

LightningZ71 · Jun 4, 2020

If they want to stay relevant, they need to get with a leading-edge foundry. GloFo isn't that anymore...

ksec · Jun 4, 2020

LightningZ71 said:
If they want to stay relevant, they need to get with a leading-edge foundry. GloFo isn't that anymore...

Well, they switched to Samsung, which is what led to the one-year delay in POWER10.

Microwatt demonstrates what the ISA in its simplest form is capable of, and since POWER is actually well supported I thought it might take off. But it seems the industry and market have chosen ARM.

https://github.com/antonblanchard/microwatt

LightningZ71 · Jun 4, 2020

Power is a great architecture for its target audience. Unfortunately, that target audience is both shrinking and finding that alternative solutions are just as cost effective and capable of performing the needed tasks. There are still a few holdouts that are tightly targeted at Power, but I really can't see it being in any way relevant in five years.

NTMBK · Jun 6, 2020

LightningZ71 said:
Power is a great architecture for its target audience. Unfortunately, that target audience is both shrinking and finding that alternative solutions are just as cost effective and capable of performing the needed tasks. There are still a few holdouts that are tightly targeted at Power, but I really can't see it being in any way relevant in five years.

Yeah, it feels like it's on its way out. OpenPOWER doesn't really seem to have gone anywhere. I think they were hoping the hyperscalers would take it up, but they all seem to be buying into ARM servers instead. But the article claims that "even the far future" projects are still on the roadmap, so it doesn't sound like IBM are letting it fade away just yet. All very mysterious.

LightningZ71 · Jun 6, 2020

Look at it like this: if you are willing to target your application at Power, it means that you've likely got the ability to target almost any architecture that you want. If that's the case, why would you leave yourself in a single-vendor lock-in situation? You could switch to targeting ARM, another RISCy architecture, which has several vendors pouring a lot of money into it and developing high-reliability, high-performance solutions for the server space. Aside from a few exclusive features, Power just isn't bringing anything unique to the table. All they are effectively doing is providing updated systems to support legacy Power code for situations where a company needs better performance but doesn't want to switch software or redevelop it for a new architecture. In essence, to use a loose analogy, it's where COBOL was about a decade ago. Eventually, all of that code will be replaced with something more modern.

Gideon · Aug 17, 2020

Well, it looks like Samsung 7nm actually can fab large chips (finally). POWER10 has been revealed:

IBM reveals 7nm POWER10 processor with DDR5 and PCIe 5.0 support - VideoCardz.com

IBM Reveals Next-Generation IBM POWER10 Processor with PCIe 5.0 and DDR5: new CPU co-optimized for Red Hat OpenShift for enterprise hybrid cloud. ARMONK, N.Y., Aug. 17, 2020 – IBM today revealed the next generation of its IBM POWER central processing... (videocardz.com)

Specs from Andreas Schilling:

- 602 mm² die size
- 18 billion transistors
- 16 cores (15 active) per die
- SMT4/SMT8
- 48/32 KB L1 cache (I/D)
- 2 MB L2 cache
- 128 MB L3 cache
- Single Chip Module (SCM) / Dual Chip Module (DCM)

NostaSeronx · Aug 17, 2020

Seems they improved the CMT granularity.

POWER9 splits into two independent blocks w/ up to 1-4 threads being processed per.
POWER10 splits into four independent blocks with up to 1-2 threads being processed per.

Won't be surprised if they launch a 64/128 SMT2-core version.

DrMrLordX · Aug 17, 2020

Is that a boutique node just for POWER10, or a common node that Samsung can/will use for other products?

moinmoin · Aug 17, 2020

@NostaSeronx In the die shot cores still show mirrored halves, not mirrored quarters.

NTMBK · Aug 17, 2020

And POWER11 is officially "in development":

Nice to see POWER10 come out! Interestingly they only sell up to 15 enabled cores- sounds like they're struggling with yield a bit.

Also interesting- GDDR DIMMS!

Equally interesting- their slides call out "FPGAs and ASICs" as attached accelerators, but no mention of GPUs. Between that and the new focus on CPU AI performance, is the partnership with NVidia dead?

NTMBK · Aug 17, 2020

And a breakdown of the core changes:

DrMrLordX · Aug 17, 2020

NTMBK said:
Also interesting- GDDR DIMMS!

Weird. They're using the same interface to connect to DRAM, GDDR DRAM, and storage? Simultaneously? Sounds like they're trying to do something Optane-like but without Optane.

Equally interesting- their slides call out "FPGAs and ASICs" as attached accelerators, but no mention of GPUs. Between that and the new focus on CPU AI performance, is the partnership with NVidia dead?

Maybe. I think they're trying to emphasize OpenCAPI which, to date, hasn't exactly had a blinding array of products available to utilize the interface. Even nVidia never used it (POWER9 systems like Summit used NVLink).

VirtualLarry · Aug 17, 2020

Speaking of Summit...are there any SuperComputer design wins for Power10? Or any announced (larger) customers at all?

moinmoin · Aug 17, 2020

DrMrLordX said:
Weird. They're using the same interface to connect to DRAM, GDDR DRAM, and storage? Simultaneously? Sounds like they're trying to do something Optane-like but without Optane.

I think that's not necessarily like Optane (which is storage tier masking itself as slow memory) but more like AMD's SerDes/PCIe PHY that is agnostic enough to simultaneously support further SATA and USB connections in place of lanes.

Though going by the slide I'd expect further logic to be necessary to actually connect all the tiers, and it comes with the caveat of additional latency that likely wouldn't fly on the desktop:
- Technology agnostic: near/main/storage tiers
- Minimal (< 10ns latency) add vs DDR direct attach

Jimzz · Aug 17, 2020

Yeah, my thought is they are going to spin it off or try to sell it. My buddy has told me a lot of POWER engineering positions are not being backfilled. That and some other things are going on inside as well.

KompuKare · Aug 17, 2020

If I'm reading this correctly that's a huge density difference between this with its
18 billion transistors in 602mm²
and Renoir with its
9.8 billion transistors in 156mm²
Is that explainable just by the differences in density between cache, iGPU and so on, or IBM traded density for speed?

Markfw · Aug 17, 2020

KompuKare said:
If I'm reading this correctly that's a huge density difference between this with its
18 billion transistors in 602mm²
and Renoir with its
9.8 billion transistors in 156mm²
Is that explainable just by the differences in density between cache, iGPU and so on, or IBM traded density for speed?

Looks like 2x the density for TSMC ... Not that good for Samsung....

Jimzz · Aug 17, 2020

KompuKare said:
If I'm reading this correctly that's a huge density difference between this with its
18 billion transistors in 602mm²
and Renoir with its
9.8 billion transistors in 156mm²
Is that explainable just by the differences in density between cache, iGPU and so on, or IBM traded density for speed?

Renoir is half the CPUs, a LOT less cache, PCIe 3.0 vs 5.0, memory channels, etc...

I still think TSMC's 7nm is probably better than Samsung's in this area as well, but there are many different things going on in those two.

Markfw · Aug 17, 2020

Jimzz said:
Renoir is half the CPUs, a LOT less cache, PCIe 3.0 vs 5.0, etc...

I still think TSMC's 7nm is probably better than Samsung's in this area as well, but there are many different things going on in those two.

If you multiply 156*4 you get 624 mm², very close to the 602 mm² die size, but that area would hold 9.8*4 billion transistors, or almost 40 billion, roughly twice the Samsung density.
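For anyone who wants to check that arithmetic, here is a minimal Python sketch using only the figures quoted in this thread. The function and variable names are made up for illustration, and "transistors" is taken at face value even though IBM's count may include other devices.

```python
def density_m_per_mm2(devices: float, area_mm2: float) -> float:
    """Return density in millions of devices per mm^2."""
    return devices / area_mm2 / 1e6

# Figures as quoted in the thread, not independently verified.
power10 = density_m_per_mm2(18e9, 602)   # Samsung 7nm
renoir = density_m_per_mm2(9.8e9, 156)   # TSMC 7nm

# Scale Renoir up to roughly POWER10's footprint, as in the post above.
scaled_area = 156 * 4            # mm^2
scaled_transistors = 9.8e9 * 4   # devices at Renoir's density

print(f"POWER10: {power10:.1f} M/mm^2")
print(f"Renoir:  {renoir:.1f} M/mm^2")
print(f"Ratio:   {renoir / power10:.2f}x")
print(f"4x Renoir: {scaled_area} mm^2, {scaled_transistors / 1e9:.1f}B transistors")
```

This only compares whole-die averages, so it says nothing about how much of the gap comes from the process and how much from what is on each die.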

JasonLD · Aug 17, 2020

Markfw said:
Looks like 2x the density for TSMC ... Not that good for Samsung....

Power9 had an 8B transistor count on the 14nm GF process with a die size of 693.37 mm², while each Zen 1 chiplet had a 4.8B transistor count with a die size of 212.97 mm² using the same process.
I think it has more to do with the Power architecture's design choice of using relaxed density (for clock speed and heat?) than with the density of Samsung's 7nm itself.
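A quick way to test this point is to compare the AMD-to-IBM density ratio in both generations. The sketch below is only a rough check under the assumption that the quoted transistor counts and die sizes are accurate; the `Die` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Die:
    name: str
    transistors: float  # count as quoted in the thread
    area_mm2: float

    @property
    def density(self) -> float:
        """Millions of transistors per mm^2."""
        return self.transistors / self.area_mm2 / 1e6


# All figures come from posts in this thread, not from vendor datasheets.
power9 = Die("POWER9 (GF 14nm)", 8e9, 693.37)
zen1 = Die("Zen 1 (GF 14nm)", 4.8e9, 212.97)
power10 = Die("POWER10 (Samsung 7nm)", 18e9, 602)
renoir = Die("Renoir (TSMC 7nm)", 9.8e9, 156)

ratio_14nm = zen1.density / power9.density
ratio_7nm = renoir.density / power10.density

for die in (power9, zen1, power10, renoir):
    print(f"{die.name:24s} {die.density:5.1f} M/mm^2")
print(f"AMD/IBM density ratio, 14nm generation: {ratio_14nm:.2f}x")
print(f"AMD/IBM density ratio, 7nm generation:  {ratio_7nm:.2f}x")
```

If the two ratios come out similar, that is at least consistent with the gap being mostly a design choice rather than a foundry difference, though four data points prove nothing on their own.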

Markfw · Aug 17, 2020

JasonLD said:
Power9 had an 8B transistor count on the 14nm GF process with a die size of 693.37 mm², while each Zen 1 chiplet had a 4.8B transistor count with a die size of 212.97 mm² using the same process.
I think it has more to do with the Power architecture's design choice of using relaxed density (for clock speed and heat?) than with the density of Samsung's 7nm itself.

I don't pretend I know the details, I was just doing the math.

thetrashcan · Aug 18, 2020

KompuKare said:
If I'm reading this correctly that's a huge density difference between this with its
18 billion transistors in 602mm²
and Renoir with its
9.8 billion transistors in 156mm²
Is that explainable just by the differences in density between cache, iGPU and so on, or IBM traded density for speed?

I think there are a couple of contributing factors to the lower density, beyond process differences between TSMC and Samsung 7nm and relaxed design rules to allow for higher clock speeds. To be clear, I suspect that a large part of the density differences are the result of process differences, but we also need to consider that large portions of the Power10 die are made up of structures that are not typically very transistor dense - or "device" dense, to use IBM's terminology, since I believe they are including capacitors and transistors in their device count (because of eDRAM).

1. I/O
IBM has a truly immense amount of off-chip I/O with Power10 - overall, we are looking at 304 SerDes lanes operating at up to 32GT/s (16x8 OMI + 4x(32+4) PowerAXON + 2x16 PCIe5). This occupies the entire perimeter of the chip and accounts for around 185 mm² (~30% of the die size). Off-chip I/O is known to scale poorly with process improvements - in fact, that is why the I/O die of AMD's Rome is made on GF 12nm, rather than TSMC 7nm.
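The lane count and area share above can be reproduced with a few lines of Python. This is just a sketch of the arithmetic: the dictionary layout and names are illustrative, and the 185 mm² figure is the poster's own estimate.

```python
# (number of links, lanes per link), as given in the breakdown above.
serdes_groups = {
    "OMI": (16, 8),
    "PowerAXON": (4, 32 + 4),
    "PCIe5": (2, 16),
}

lanes = {name: links * width for name, (links, width) in serdes_groups.items()}
total_lanes = sum(lanes.values())

io_area_mm2 = 185    # poster's estimate of the I/O perimeter area
die_area_mm2 = 602
io_share = io_area_mm2 / die_area_mm2

for name, count in lanes.items():
    print(f"{name:10s} {count:4d} lanes")
print(f"Total      {total_lanes:4d} lanes")
print(f"I/O share of die: {io_share:.1%}")
```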

2. eDRAM
IBM is still using eDRAM for its L3, which skews things slightly - eDRAM is 2 "devices" per bit, one transistor and one capacitor, compared to SRAM, which is 6(+) transistors per bit. IIRC eDRAM has historically been less device-dense than SRAM because the capacitors are larger than transistors - though overall it is still smaller on a per-bit basis. IBM is still achieving ~9.1Mb/mm² with its eDRAM L3, compared to 7.6Mb/mm² for the SRAM L3 on the Rome CCD; not a super useful comparison, since they are on different processes, but it does illustrate that eDRAM has density advantages.

The cache regions appear to account for ~112mm² (~19% of the die), which includes 2.15B devices (2^30 bits * 2 devices) for the cache bits, plus some percentage for whatever ECC scheme has been implemented, plus whatever is necessary for the eDRAM control and on-chip network. So the remaining ~490mm² accounts for <15.85B devices, which puts the remaining die at <32.3M devices per mm², rather than the 29.9M devices per mm² assumed initially - presumably the vast majority of these remaining devices (if not all of them) are transistors and not capacitors, since we are excluding eDRAM.
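The estimate above can be worked through step by step. This is a sketch under the post's own assumptions: two devices per eDRAM bit, a 112 mm² cache region, and 128 MB read as 2^30 bits. It ignores ECC and control logic, so the result is an upper bound, and the constant names are made up for illustration.

```python
TOTAL_DEVICES = 18e9
DIE_AREA_MM2 = 602
CACHE_AREA_MM2 = 112         # estimated from the die shot in the post
L3_BITS = 128 * 2**20 * 8    # 128 MB of L3, taken as 2**30 bits
DEVICES_PER_EDRAM_BIT = 2    # one transistor plus one capacitor

edram_devices = L3_BITS * DEVICES_PER_EDRAM_BIT
remaining_devices = TOTAL_DEVICES - edram_devices
remaining_area = DIE_AREA_MM2 - CACHE_AREA_MM2

whole_die_density = TOTAL_DEVICES / DIE_AREA_MM2 / 1e6
remaining_density = remaining_devices / remaining_area / 1e6
l3_mbit_per_mm2 = (L3_BITS / 2**20) / CACHE_AREA_MM2

print(f"eDRAM bit devices:   {edram_devices / 1e9:.2f}B")
print(f"Remaining devices:   {remaining_devices / 1e9:.2f}B over {remaining_area} mm^2")
print(f"Whole-die density:   {whole_die_density:.1f} M/mm^2")
print(f"Non-cache density:   {remaining_density:.1f} M/mm^2 (upper bound)")
print(f"L3 bit density:      {l3_mbit_per_mm2:.1f} Mb/mm^2")
```

Excluding the cache only moves the average by a couple of million devices per mm², so the eDRAM alone does not explain the gap to Renoir.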
