The interesting question becomes what the IPC of Zen 2 will be.
AMD claimed that they had major changes they could use to improve Zen 2, perhaps in time to counter an Intel 10nm Skylake shrink.
I'd be very interested to see everyone's thoughts on the kind of IPC that we might see in Zen 2. I think that we may see better-than-Skylake IPC, at least in some benchmarks. On average, though, it is hard to say, but certainly the gap could close.
The design was completed months ago, which means they probably have the floorplan all ready to go, except for some minor tweaks.
Makes me wonder about GloFo's 5+ GHz claims for 7nm, if it ever shows up in H2 2018/Q1 2019.
Historically, IBM has been very good with clock speeds, and we know that there has been a lot of collaboration between IBM, AMD, and GF on this front.
It really isn't possible to say what the next generation will look like. Past performance is no assurance of future success. Even the people working on it right now might not know. There might be unforeseen difficulties with 7nm, or it might go smoothly. Certainly Intel, whose 10nm process is comparable to the TSMC/GF 7nm nodes, has experienced a ton of challenges ramping up 10nm, leading to an unprecedented 3-year delay.
Another possibility is that AMD might elect to go with TSMC instead for Zen 2. That would be a major departure, but quite possible. Alternatively, AMD may want a second supplier, much as Nvidia has done with Samsung for its GPUs.
One thing I am noticing about the VRMs on some of the high-end X470 boards is that they are extreme overkill.
http://www.overclock.net/forum/27199649-post2954.html
https://www.hardwareluxx.de/community/f12/pga-am4-mainboard-vrm-liste-1155146.html
Excepting the ASUS TUF, anything but the flagship Gigabyte board, and any MSI board, of course...
Could it be that Ryzen 2 next year will have 12 core CPUs for the desktop platform?
The way the CCX is made, this would require a more complex design for the L3 caches. In a fully connected topology with n nodes, the number of connections is n(n - 1) / 2.
For a 4 core CCX, that is 4 x (4 - 1)/2, or 6 connections. In a 6 core CCX, that would require 6 x (6 - 1)/2, or 15 connections. That is 2.5x as many connections, but only 1.5x as many cores. You'll need a lot more L3 cache too. If they do go the 6 core route, they may be forced to abandon the fully connected topology in favor of something else, perhaps a ring or a partial connection that minimizes the number of hops.
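For what it's worth, here is a quick Python sketch of that arithmetic. The function names are my own, and the ring figures assume a simple bidirectional ring, which is just one possible alternative for illustration, not anything AMD has announced.

```python
def full_mesh_links(n):
    """Links needed to connect every pair of n cores directly: n(n - 1) / 2."""
    return n * (n - 1) // 2


def ring_links(n):
    """Links in a simple bidirectional ring (assumed alternative, n >= 3)."""
    return n


def ring_max_hops(n):
    """Worst-case hop count between two cores on a bidirectional ring."""
    return n // 2


# Compare a 4 core CCX with a hypothetical 6 core CCX.
for cores in (4, 6):
    print(
        f"{cores} cores: full mesh = {full_mesh_links(cores)} links (1 hop max), "
        f"ring = {ring_links(cores)} links ({ring_max_hops(cores)} hops max)"
    )
```

The trade-off shows up right away: a ring keeps the link count equal to the core count, but the worst-case distance grows with the ring size, whereas the full mesh stays at one hop and pays for it in wiring.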
Perhaps a 3-CCX design with 4 cores each might be another option. I think that AMD should add an L4 cache to facilitate inter-CCX connectivity. It doesn't have to be a big cache, but it might reduce latency.
In many ways, this resembles network topology design, with very similar trade-offs.