Question Zen 6 Speculation Thread

Kepler_L2 · Nov 11, 2024

jpiniero said:
Case in point - I wouldn't be surprised if the PS6 uses N3. Yes... for a console that might be coming out no earlier than the end of 2028.

It's holiday 2027 afaik and probably using chiplets aka multiple nodes.

LightningZ71 · Nov 11, 2024

In that year, I would expect the IOD to be N4C or even some sort of N3"C" derivative. The compute die would likely be trailing the leading edge by at least a partial node, so maybe a Vanilla N2, or N3P. The GPU die would depend on where AMD/NVIDIA is with their design libraries.

adroc_thurston · Nov 11, 2024

LightningZ71 said:
The GPU die would depend on where AMD/NVIDIA is with their design libraries.

There is none.
Part of the SOC.

LightningZ71 said:
The compute die would likely be trailing the leading edge by at least a partial node, so maybe a Vanilla N2, or N3P

Vanilla N2 doesn't exist when N2P ramps.
But yeah, N3P is what client gets.

FlameTail · Nov 11, 2024

adroc_thurston said:
Vanilla N2 doesn't exist when N2P ramps

Meaning that the N2 production line will be converted to N2P?

adroc_thurston · Nov 12, 2024

FlameTail said:
Meaning that the N2 production line will be converted to N2P?

Yeah, nodelets roll over just like that.

Doug S · Nov 12, 2024

Thibsie said:
Isn't this what RiscV (with its pros and cons) is all about ?

What RISCV are you talking about? There is such a proliferation out there you can call it almost anything you want. Everyone in the RISCV world was patting themselves on the back for the RVA23 profile that's supposed to solve all their problems by specifying a more realistic set of what is mandatory, and it does - but it also has tons of extensions. You thought ARM had a lot of optional stuff? You ain't seen nothing yet!

I haven't dug into it much because frankly I just don't give a crap about RISCV, and it will depend on exactly what extensions get adopted, but I'm pretty sure those complex addressing modes are in there. So sure, if you use a stripped down RISCV you can claim it is a pure RISC that has only simple loads/stores. But those early pure RISCs made multiplication support optional - and so does RISCV if you choose the most stripped down version. If RISCV ever makes it into Android phones in any number, I'll bet it will be with something that looks very much like ARM in terms of complex addressing modes. Because it would be dumb to do otherwise.

eek2121 · Nov 12, 2024

Kepler_L2 said:
https://videocardz.com/newz/amd-reportedly-working-on-2nm-zen6-microarchitecture-codenamed-morpheus

I said client.

Client = N3. Server = N3. Dense = N2.

Doug S said:
What RISCV are you talking about? There is such a proliferation out there you can call it almost anything you want. Everyone in the RISCV world was patting themselves on the back for the RVA23 profile that's supposed to solve all their problems by specifying a more realistic set of what is mandatory, and it does - but it also has tons of extensions. You thought ARM had a lot of optional stuff? You ain't seen nothing yet!

I haven't dug into it much because frankly I just don't give a crap about RISCV, and it will depend on exactly what extensions get adopted, but I'm pretty sure those complex addressing modes are in there. So sure, if you use a stripped down RISCV you can claim it is a pure RISC that has only simple loads/stores. But those early pure RISCs made multiplication support optional - and so does RISCV if you choose the most stripped down version. If RISCV ever makes it into Android phones in any number, I'll bet it will be with something that looks very much like ARM in terms of complex addressing modes. Because it would be dumb to do otherwise.

It takes years to design/develop/manufacture a CPU. Zen 5 development started 5 years ago, for comparison.

yuri69 · Nov 12, 2024

eek2121 said:
Zen 5 development started 5 years ago, for comparison.

In H1 2018 Mr. Clark reported that AMD was already working on Zen 5.

Btw, there are leaked AMD mobile roadmaps for 2025-2026. They list no Zen 6 in 2026, just refreshes. So the theory of H2 2026/2027 fits.

DrMrLordX · Nov 13, 2024

yuri69 said:
Btw, there are leaked AMD mobile roadmaps for 2025-2026. They list no Zen 6 in 2026, just refreshes. So the theory of H2 2026/2027 fits.

Interesting. They're actually going to refresh Strix and Kraken? Wonder if those will wind up being like Rembrandt.

poke01 · Nov 13, 2024

Qualcomm has a good opportunity to go all out and not mess up in 2026.

FlameTail · Nov 13, 2024

poke01 said:
Qualcomm has a good opportunity to go all out and not mess up in 2026.

Qualcomm had a great opportunity this year:
- Only silicon vendor for Windows-on-ARM
- First to Microsoft's Copilot+ Initiative
- Beginning of the great upgrade cycle that will be stimulated by Microsoft phasing out support for Windows 10 in 2025.

Yet they didn't go all out (example: not investing in a sufficiently large GPU for X Elite), and messed up a few things (such as the Dev Kit cancellation).

poke01 · Nov 13, 2024

FlameTail said:
Qualcomm had a great opportunity this year:
- Only silicon vendor for Windows-on-ARM
- First to Microsoft's Copilot+ Initiative
- Beginning of the great upgrade cycle that will be stimulated by Microsoft phasing out support for Windows 10 in 2025.

Yet they didn't go all out (example: not investing in a sufficiently large GPU for X Elite), and messed up a few things (such as the Dev Kit cancellation).

At this rate only Nvidia can save WoA; if they fail, then this platform has no chance of ever taking off.

Nvidia tried before during the Tegra days but that was a bad attempt. I can see why AMD has no new SKUs planned till late 2026/2027. WoA isn’t a threat to x86 so far.

Thibsie · Nov 13, 2024

I thought AMD was rumoured to be coming out with an ARM SoC themselves?

Doug S · Nov 13, 2024

I just don't buy Windows on ARM ever having more than at most 10% of the Windows market, and even that's probably optimistic. Sure in reality it is fine for most people but they will hear tales from people for whom it is a problem. All one needs to be a naysayer about ARM PCs is one application, one game, one driver for an 8 year old printer that they're still using that they can't properly run and they'll tell others to stay away.

Honestly smartphones are a bigger threat to x86 than ARM PCs, because a lot of younger people grow up used to only using a smartphone for their own "personal computing" needs. Their only exposure to a PC is at school and at work, so it carries the baggage of a negative association with being forced to do stuff they don't want to do in a place they don't want to be, to the point where they will only buy a PC for personal use if they have absolutely no choice.

yuri69 · Nov 13, 2024

DrMrLordX said:
Interesting. They're actually going to refresh Strix and Kraken? Wonder if those will wind up being like Rembrandt.

The refreshes are a must since AMD has to wave *something* around at CES in January 2026. Let's hope they won't just overclock the AI engine again but do something meaningful like Rembrandt.

SteinFG · Nov 13, 2024

Considering that those are classified as refreshes, there will be no new silicon. Same stuff, but now Ryzen 400 series.

LightningZ71 · Nov 13, 2024

With the length of time involved between now and Zen 6, you'd hope that it's more than just new numbers for the same chip and that, instead, it's an errata respin that addresses a few things that hurt performance, along with a very, very modest clock bump.

Personally, my unicorn is AMD releasing a Strix Point version with the pre-AI design with no NPU and the original MALL cache instead, even if it is just for handhelds.

FlameTail · Nov 13, 2024

What about Bald Eagle Point?

AMD Bald Eagle Point leaks as Zen 5 APU lineup with major cache upgrade

AMD could be preparing new laptop APUs to follow up the Zen 5-based Strix Point chips. Named “Bald Eagle Point”, the APUs also reportedly utilize Zen 5/5c cores like Strix Point alongside an RDNA 3.5 iGPU. However, the APUs apparently bring a major addition to the cache structure.

www.notebookcheck.net

Or was that a meme?

LightningZ71 · Nov 13, 2024

Change to the caching structure could be them switching the design to the 16 core CCX layout of Turin-D, albeit with only 12 cores implemented. I don't see losing the 8MB L3 of the C core CCX as that big of a deal if they can all share the same L3 at 16MB. I doubt that they expand the die size by just adding the MALL cache alone and don't see them axing the NPU.

Meteor Late · Nov 13, 2024

Doug S said:
I just don't buy Windows on ARM ever having more than at most 10% of the Windows market, and even that's probably optimistic. Sure in reality it is fine for most people but they will hear tales from people for whom it is a problem. All one needs to be a naysayer about ARM PCs is one application, one game, one driver for an 8 year old printer that they're still using that they can't properly run and they'll tell others to stay away.

Honestly smartphones are a bigger threat to x86 than ARM PCs, because a lot of younger people grow up used to only using a smartphone for their own "personal computing" needs. Their only exposure to a PC is at school and at work, so it carries the baggage of a negative association with being forced to do stuff they don't want to do in a place they don't want to be, to the point where they will only buy a PC for personal use if they have absolutely no choice.

An 8-year-old printer is not really old, especially given how slowly printers improve. So yeah, it's ok if you can run emulated apps with acceptable performance, but if people cannot use their devices, then it's as good as dead.

Kepler_L2 · Nov 13, 2024

FlameTail said:
What about Bald Eagle Point?

AMD Bald Eagle Point leaks as Zen 5 APU lineup with major cache upgrade

AMD could be preparing new laptop APUs to follow up the Zen 5-based Strix Point chips. Named “Bald Eagle Point”, the APUs also reportedly utilize Zen 5/5c cores like Strix Point alongside an RDNA 3.5 iGPU. However, the APUs apparently bring a major addition to the cache structure.

www.notebookcheck.net

Or was that a meme?

Strix refresh has a different name.

Abwx · Nov 13, 2024

gdansk said:
It's a shrink of Zen 1 with improved clock rates that took 13 months to follow Zen.

It was a little more than a die shrink.

AMD Ryzen 7000 review: How fast the 7950X and 7700X are

AMD Ryzen 7000 is here. In the in-depth review, the Ryzen 9 7950X and Ryzen 7 7700X with Zen 4 cores take on Ryzen 5000 and Intel Core.

www.computerbase.de

OneEng2 · Nov 14, 2024

MS_AT said:
The crazy wide execution of AVX512 is arguably not the biggest advantage the AVX512 brought to the table for x64, but people focus on that stupid number because it is in the name and Intel in its infinite wisdom mandated that 512b execution must be supported, while 256b and 128b are optional extensions

Intel got rid of it, because E-cores implementing AVX512 according to specification would be either terribly slow or the area cost would grow to the point e-cores could no longer be so easily spammed in such quantities as they needed to spam them. Of course they got rid of the support in the next release cycle after introducing AVX512 support to client devices to make software developers happier... AVX10 is the effort to get the benefits back without having to pay the cost of 512b shuffle units.

I also don't understand the comment about software compilers support. Two of the three biggest C/C++ compilers are open source and Intel has no power to force them to do anything. Not to mention you could fork them and add the support back if you needed. If anything, AMD is so successful due to Intel's work to add the support for AVX512 to those compilers. And Intel's ICX (clang's fork, AOCC is AMD's fork) still compiles code fine for Zen; for example, Y-cruncher is using that for Zen5 optimized binaries, as in Mystical's evaluation it's still doing a better job than upstream clang. So while I understand Intel's CPU execution leaves a lot to be desired, their software efforts in the open source domain are more prominent than AMD's own.

But yes, that's probably off-topic here

Granted that Intel "got rid of it" because E cores couldn't do it, but I believe that this was a die size issue for BOTH E and P cores.

Good point on the non-Intel compilers; however, my point is that if Intel creates new instructions in their processors, they don't release this fact early enough for AMD to include the same instructions in the same time window. This has given Intel a generation of added performance every time Intel did this before AMD could spin up a new design that included the support.

The question for AVX512 becomes "Is the juice worth the squeeze?". I believe it is, due to the growing number of applications that support it on desktop/laptop, and mostly the huge gains found in many applications in DC.

Intel's recent design decisions seem to leave DC concerns on the back burner. Seems like a strategic mistake to me. We will see.

DisEnchantment · Nov 16, 2024

Z5 had a few regressions in 128b SSE int add and 256b AVX2 int add throughput/latencies. Like, the throughput is halved. I suppose it has something to do with the new changes for unifying int scheduling.
AMD seems really keen on maximizing usage of the scheduling resources by going for unified scheduling.
On N3, they would have more power/XTor budget to improve on this scheduler without taking a major efficiency hit, so a win here is highly probable. This also means they are not going to have to grow the scheduling queues by much, if at all.

The L3 was a good job on Z5, with support for more clients, a reduction in latency of 3.5 cycles, and area improvements. I suspect we can't expect much more improvement in this regard.
The number of ALUs would very likely remain the same, considering that the maximum that can currently be used at once, as exhibited by hand-written benchmarks, is 4.5 - 5.
There are lots of improvements to be made in the front end too, with instruction latency being an issue in many microbenchmarks. There is also the dual decode thing, which seems not to be working yet.
Then, when these bottlenecks are removed, they would need to improve the hit rate from L1 and L2 without hurting latency, e.g., by increasing L2 and improving L1 prefetching. I doubt L1 would change in capacity, but things may happen.

I am curious whether there are improvements to be had from increasing the BW from L1 to int, since currently it is 64B/cycle only for FP and instruction decode.
With increased XTor availability, they can get the budget to improve throughput on 128b/256b SSE/AVX floating point ops.
Bump up some resources here and there, get a few hundred MHz more. Then some efficiency gains from the new interconnect, with higher BW and lower latency.

If they remove the 2x GMI PHYs/IFOP links with ~7mm2 area and replace them with small 2mm2 beachhead fan-out areas, for a total of ~4mm2 of area dedicated to IF, and considering the density jump they could get from N3E, a CCD of 60mm2 (-10%) with 10.6B XTor (+25%) would be possible.
This is assuming a conservative 185 MTr/mm2 for N3E/P.

Quite a lot of improvements to make without major core architectural rework I would imagine.
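
The CCD area and transistor estimate above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the post (60mm2, 10.6B XTor, 185 MTr/mm2, -10%, +25%). The function and variable names are made up for illustration, and it assumes a single average density across the whole die, which real dies do not have.

```python
# Back-of-envelope check of the CCD estimate, using only numbers from the post.
# Assumption: one average density applies to the whole die (a simplification).

def transistors_billion(area_mm2: float, density_mtr_per_mm2: float) -> float:
    """Transistor count in billions for a given area and average density."""
    return area_mm2 * density_mtr_per_mm2 / 1000.0

def implied_density(transistors_b: float, area_mm2: float) -> float:
    """Average density in MTr/mm2 implied by a transistor count and an area."""
    return transistors_b * 1000.0 / area_mm2

ccd_area = 60.0      # mm2, from the post
ccd_xtors = 10.6     # billions, from the post
n3_density = 185.0   # MTr/mm2, the post's "conservative" N3E/P figure

# What 60mm2 would hold if every mm2 reached the quoted density.
full_density_xtors = transistors_billion(ccd_area, n3_density)

# Average density needed to fit 10.6B transistors into 60mm2.
avg_density = implied_density(ccd_xtors, ccd_area)

# Baseline die implied by the post's "-10%" area and "+25%" transistor deltas.
baseline_area = ccd_area / (1 - 0.10)
baseline_xtors = ccd_xtors / (1 + 0.25)

print(f"60mm2 at 185 MTr/mm2: {full_density_xtors:.1f}B")
print(f"density implied by 10.6B in 60mm2: {avg_density:.0f} MTr/mm2")
print(f"implied baseline: {baseline_area:.1f}mm2, {baseline_xtors:.2f}B")
```

With these inputs, 60mm2 at the full 185 MTr/mm2 would hold about 11.1B transistors, so 10.6B corresponds to an average of roughly 177 MTr/mm2 over the die, and the quoted percentages imply a baseline of about 66.7mm2 and 8.5B.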

MS_AT · Nov 18, 2024

DisEnchantment said:
Z5 had a few regressions in 128b SSE int add and 256b AVX2 int add throughput/latencies. Like, the throughput is halved. I suppose it has something to do with the new changes for unifying int scheduling.

Not quite. First of all, the INT scheduler has nothing to do with SSE and AVX2 integer operations; the FP/SIMD scheduler is responsible for those. The throughput is not halved for those operations, but if you max the schedulers out, you will get one extra cycle of delay on one-cycle instructions. Since SIMD integer adds are natively 1 cycle, they get the latency hit. Throughput stays the same: 4 int adds at whatever SIMD width you want. This is for desktop Zen 5.
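
The latency vs throughput distinction here can be illustrated with a toy model. This is a hedged sketch, not a model of the real Zen 5 scheduler: it takes the 4 adds per cycle from the post, compares a 1-cycle add with a 2-cycle add (the extra cycle of delay mentioned above), and assumes ideal pipelining. The function names are hypothetical.

```python
import math

def independent_adds_cycles(n: int, latency: int, width: int = 4) -> int:
    """Cycles to finish n independent adds on an idealized pipeline
    that issues `width` adds per cycle, each taking `latency` cycles."""
    if n == 0:
        return 0
    # The last group issues at cycle ceil(n/width) and completes
    # latency - 1 cycles after that.
    return math.ceil(n / width) + latency - 1

def dependent_chain_cycles(n: int, latency: int) -> int:
    """Cycles to finish n adds where each one needs the previous result."""
    return n * latency

n = 1000
for latency in (1, 2):
    print(
        f"latency={latency}: "
        f"independent={independent_adds_cycles(n, latency)} cycles, "
        f"dependent chain={dependent_chain_cycles(n, latency)} cycles"
    )
```

In this toy model, the extra cycle of latency barely changes the time for independent adds (250 vs 251 cycles for 1000 adds), while a fully dependent chain takes twice as long (1000 vs 2000 cycles). That is the sense in which throughput stays the same while latency takes the hit.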
