If I'm reading this correctly that's a huge density difference between this with its
18 billion transistors in 602mm²
and Renoir with its
9.8 billion transistors in 156mm²
Is that explainable just by the differences in density between cache, iGPU and so on, or did IBM trade density for speed?
I think there are a couple of contributing factors to the lower density, beyond the process differences between TSMC and Samsung 7nm and the relaxed design rules that allow for higher clock speeds. To be clear, I suspect that a large part of the density difference is the result of process differences, but we also need to consider that large portions of the Power10 die are made up of structures that are not typically very transistor dense - or "device" dense, to use IBM's terminology, since I believe they are including both capacitors and transistors in their device count (because of eDRAM).
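For reference, here is the raw comparison as a quick Python sketch. It uses only the device counts and die areas quoted above; the variable and function names are mine, and the result is a whole-die average that says nothing about how density varies across each chip.

```python
# Whole-die average density from the figures quoted in the question.
# (device count, die area in mm^2)
chips = {
    "Power10": (18.0e9, 602.0),
    "Renoir": (9.8e9, 156.0),
}


def density_m_per_mm2(devices, area_mm2):
    """Average density in millions of devices per mm^2."""
    return devices / area_mm2 / 1e6


densities = {name: density_m_per_mm2(*spec) for name, spec in chips.items()}
for name, value in densities.items():
    print(f"{name}: {value:.1f}M devices/mm^2")

# How much denser Renoir is than Power10 on average.
ratio = densities["Renoir"] / densities["Power10"]
print(f"Renoir / Power10: {ratio:.2f}x")
```

So on paper Renoir is a bit over twice as dense, which is the gap the points below try to account for.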
1. I/O
IBM has a truly immense amount of off-chip I/O with Power10 - overall, we are looking at 304 SerDes operating at up to 32GT/s (16x8 OMI + 4x(32+4) PowerAXON + 2x16 PCIe5). This occupies the entire perimeter of the chip and accounts for roughly 185 mm² (~30% of the die size). Off-chip I/O is known to scale poorly with process improvements - in fact, that is why the I/O die of AMD's Rome is made on GF 12nm, rather than TSMC 7nm.
2. eDRAM
IBM is still using eDRAM for its L3, which skews things slightly - eDRAM is 2 "devices" per bit, one transistor and one capacitor, compared to SRAM, which is 6(+) transistors per bit. IIRC eDRAM has historically been less device-dense than SRAM because the capacitors are larger than transistors - though overall it is still smaller on a per-bit basis. IBM is still achieving ~9.1Mb/mm² with its eDRAM L3, compared to 7.6Mb/mm² for the SRAM L3 on the Rome CCD; not a super useful comparison, since they are on different processes, but it does illustrate that eDRAM has density advantages.
The cache regions appear to account for ~112mm² (~19% of the die), which includes 2.15B devices (2^30 bits * 2 devices) for the cache bits, plus some percentage for whatever ECC scheme has been implemented, plus whatever is necessary for the eDRAM control and on-chip network. So the remaining ~490mm² accounts for <15.85B devices, which puts the remaining die at <32.3M devices per mm², rather than the 29.9M devices per mm² assumed initially - presumably the vast majority of these remaining devices (if not all of them) are transistors and not capacitors, since we are excluding eDRAM.
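If anyone wants to check the arithmetic, here is a rough Python sketch of the budget above. The areas are my own estimates from the die shot, the 2 devices per bit figure is my assumption about how IBM counts eDRAM, and I am treating 1 Mb as 2^20 bits; ECC and control logic are ignored, so the "rest of die" figure is an upper bound.

```python
# All inputs are the estimates quoted in this post, not official figures.
DIE_AREA_MM2 = 602.0
TOTAL_DEVICES = 18.0e9
IO_AREA_MM2 = 185.0        # estimated perimeter I/O area
CACHE_AREA_MM2 = 112.0     # estimated L3 region area
CACHE_BITS = 2**30         # data bits only, no ECC
DEVICES_PER_EDRAM_BIT = 2  # assumed: one transistor + one capacitor

# SerDes count: 16x8 OMI + 4x(32+4) PowerAXON + 2x16 PCIe5
serdes = 16 * 8 + 4 * (32 + 4) + 2 * 16

io_fraction = IO_AREA_MM2 / DIE_AREA_MM2
cache_fraction = CACHE_AREA_MM2 / DIE_AREA_MM2

# Bit density of the L3 region, taking 1 Mb = 2**20 bits.
cache_mbit_per_mm2 = CACHE_BITS / 2**20 / CACHE_AREA_MM2

# Devices attributable to the cache bits, and what is left for the rest.
cache_devices = CACHE_BITS * DEVICES_PER_EDRAM_BIT
rest_area_mm2 = DIE_AREA_MM2 - CACHE_AREA_MM2
rest_devices = TOTAL_DEVICES - cache_devices
rest_density = rest_devices / rest_area_mm2 / 1e6  # M devices per mm^2

print(f"SerDes lanes: {serdes}")
print(f"I/O share of die: {io_fraction:.1%}")
print(f"Cache share of die: {cache_fraction:.1%}")
print(f"L3 bit density: {cache_mbit_per_mm2:.2f} Mb/mm^2")
print(f"Rest of die: <{rest_density:.2f}M devices/mm^2")
```

The last figure comes out at about 32.35M devices per mm², which matches the number above up to rounding.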