Question Zen 6 Speculation Thread


uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
Genuine question anyway: what's with the obsession with 8+32? Currently AMD are competitive with 16+0 against 8+16, or 32 threads to 32 threads. I don't see why they also wouldn't be competitive with say 8+16 vs 8+32 on Intel's side (or both 48 threads) in the future?
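For anyone who wants to check the thread math above, here is a minimal sketch. It assumes 2-way SMT on AMD's big and dense cores and on Intel's P-cores, and no SMT on Intel's E-cores; the helper name `thread_count` is made up for illustration.

```python
def thread_count(big_cores: int, small_cores: int,
                 big_smt: int = 2, small_smt: int = 1) -> int:
    """Hardware threads for a 'big + small' core configuration.

    big_smt / small_smt are threads per core (assumed values, see above).
    """
    return big_cores * big_smt + small_cores * small_smt


# Today: AMD 16+0 versus Intel 8+16
amd_now = thread_count(16, 0)                  # 16 cores with SMT
intel_now = thread_count(8, 16)                # 8 P-cores with SMT + 16 E-cores

# Hypothetical future: AMD 8+16 (dense cores keep SMT) versus Intel 8+32
amd_next = thread_count(8, 16, small_smt=2)
intel_next = thread_count(8, 32)

print(amd_now, intel_now, amd_next, intel_next)
```

Under those assumptions both pairings come out even on thread count, which is the point of the question.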
 
Reactions: Tlh97 and marees

inquiss

Member
Oct 13, 2010
186
266
136
Genuine question anyway: what's with the obsession with 8+32? Currently AMD are competitive with 16+0 against 8+16, or 32 threads to 32 threads. I don't see why they also wouldn't be competitive with say 8+16 vs 8+32 on Intel's side (or both 48 threads) in the future?
I think that obsession exists only in certain members' heads.

I don't get it, if you want more nT performance we have better options now. What's so important about having them in a client package where they will be memory starved?
 

Timorous

Golden Member
Oct 27, 2008
1,748
3,240
136
I think that obsession exists only in certain members' heads.

I don't get it, if you want more nT performance we have better options now. What's so important about having them in a client package where they will be memory starved?

Hobbyists who want to play with a lot of MT performance but don't actually make any money from it, so they don't want to spend a lot. I.e., a super tiny niche of the market.
 

dr1337

Senior member
May 25, 2020
416
685
136
Hobbyists who want to play with a lot of MT performance but don't actually make any money from it, so they don't want to spend a lot. I.e., a super tiny niche of the market.
People don't know what they want until they have it. MT performance is relevant every day, but because 99% of people don't spend money on premium products they think it's a niche case... until years later, when the high-end tech finally trickles down to the mid-range and low end.

Look at NVMe SSDs for example: for years they were dismissed as a gimmick and even hated, because Apple was an early adopter and they were expensive in early gens. Now there isn't a soul who recommends a new build without one. And I have yet to see someone say, man, my gen 4 drive is just too fast...

Considering EPYC is a thing idk why they wouldn't just abandon the 8c die completely, as long as yields are good enough obvs. The core logic is getting so much smaller than cache logic and it helps solve some of the MCM latency issues.

Also, it's a thing that people dislike the 7950X3D because its split architecture is less useful when software gets placed on the wrong die. Customers actively seem to want more cores per die rather than just more dies.
 
Reactions: Wolverine2349

Gideon

Golden Member
Nov 27, 2007
1,771
4,132
136
On a serious note though in a lot of workloads Zen 5 is going to be memory bandwidth bound. Zen 4 already is in some of the most rigorous workloads, and even in some games etc you can see benefits to more memory bandwidth (see: launch day Starfield).
The memory bandwidth limitation will certainly be serious for many tasks; however, Turin Dense will essentially have exactly the same memory bandwidth per core (I'm expecting it to support DDR5-6000 (PC5-48000), otherwise it's quite a bit less), so there must be workloads that benefit from it.
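A rough back-of-the-envelope check of the per-core bandwidth claim. The configurations are assumptions that the post does not spell out: Turin Dense as 192 cores on 12 DDR5 channels, and a hypothetical 32-core AM5 part on 2 channels, both at DDR5-6000 with 64-bit channels. The function name is illustrative only.

```python
def per_core_bandwidth_gbs(channels: int, cores: int,
                           mt_per_s: int = 6000,
                           bytes_per_transfer: int = 8) -> float:
    """Peak theoretical DRAM bandwidth per core, in GB/s.

    A 64-bit channel moves 8 bytes per transfer, so DDR5-6000 gives
    6000 MT/s * 8 B = 48,000 MB/s per channel (hence PC5-48000).
    """
    total_gbs = channels * mt_per_s * bytes_per_transfer / 1000
    return total_gbs / cores


# Assumed configurations (not stated in the post):
turin_dense = per_core_bandwidth_gbs(channels=12, cores=192)
am5_32_core = per_core_bandwidth_gbs(channels=2, cores=32)

print(f"Turin Dense: {turin_dense:.1f} GB/s per core")
print(f"32-core AM5: {am5_32_core:.1f} GB/s per core")
```

With these assumed numbers the two land on the same peak figure per core. That is peak theoretical bandwidth only; it says nothing about latency or achieved throughput.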

Regardless, I agree that a 32-core AM5 processor makes little sense (from AMD's point of view). A 24-core (8 + 16) version could be an option for chart-topping that makes slightly more sense, but as Zen 6 moves over to laptop SoCs, it makes little sense to introduce that for a single generation.
 
Reactions: Tlh97 and marees

Tuna-Fish

Golden Member
Mar 4, 2011
1,475
1,978
136
They'll have to rev it to LPCAMM3 to support LPDDR6. It has different channel width, additional signals, etc. so it can't use the same module.

LPCAMM is not an official JEDEC name, but seems to have caught on with the memory makers. Officially, LPDDR5X with a compression-attached connector is called "LPDDR5X CAMM2". Note that LPDDR5 CAMM2 and DDR5 CAMM2 are entirely incompatible, including mechanically.

Similarly, they seem to be going with just "LPDDR6 CAMM2" for the module with wider bus that's used for LPDDR6.

No, none of this makes any sense. I hope they make the LPCAMM thing an official name to distinguish between the module types.

SMT4 baybeeeee

I know you are probably joking, but I'd like to note that an increasing number of clients are using their CPUs with SMT disabled. Removing SMT altogether is a much more likely thing to happen than SMT4.

So you have no source, meaning your claim is moot.

Unfortunately, I have no neat generally available source to link for you either, but this is real, not just Adroc weirdness. The recent JEDEC symposium was instructive: the split for every talk where it mattered was "client/mobile" and "server". But none of those talks will be generally available on the internet, probably ever, because JEDEC likes to gatekeep their stuff.

Some say 2025/2026. And Zen6 and ArrowLakeRefresh is when?
The current JEDEC target date for the DDR6 spec is "mid-2025". It takes time to go from spec release to mass production that can support a mass-market release: about 16 months from DDR5 to Alder Lake, but that was abnormally short because the spec release was repeatedly delayed, and chips were ready very quickly after the spec. A more normal transition is the DDR4 one, where it took two years from spec release to Haswell-E, and more than that before high-volume products shipped.

Unless Zen6 is a 2027 product, it will likely ship with DDR5, on AM5.
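The timeline reasoning above works out roughly like this. This is a sketch that pins "mid-2025" to 1 July 2025 purely for the sake of arithmetic; that date and the `add_months` helper are assumptions, not anything JEDEC has published.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the first day of the month `months` after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


# Assumption: "mid-2025" taken as 1 July 2025
ddr6_spec = date(2025, 7, 1)

# Spec-to-product gaps quoted in the post
fast_case = add_months(ddr6_spec, 16)    # DDR5 spec to Alder Lake
normal_case = add_months(ddr6_spec, 24)  # DDR4 spec to Haswell-E

print("Fast (DDR5-like) transition:  ", fast_case.strftime("%b %Y"))
print("Normal (DDR4-like) transition:", normal_case.strftime("%b %Y"))
```

Even the abnormally short DDR5-style gap lands in late 2026, and the DDR4-style gap lands in mid-2027, which is why a pre-2027 Zen6 on DDR6 looks unlikely.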
 

Joe NYC

Platinum Member
Jun 26, 2021
2,487
3,386
106
That's not how cache coherency works on AMD parts.
On L3 miss you probe the c/sIOD directory and bang!

Like I said, it would take some advanced algorithm. I bet AMD has been working on something, and these upcoming faster connections to CCDs with advanced packaging provide incentive to implement something.
 

StefanR5R

Elite Member
Dec 10, 2016
5,914
8,826
136
(on more cores in client computing)
People don't know what they want until they have it. MT performance is relevant every day, but because 99% of people don't spend money on premium products they think it's a niche case... until years later, when the high-end tech finally trickles down to the mid-range and low end.
I agree that high core count computing is highly relevant to our daily lives (in the postindustrial parts of the world). But I qualify that these computing applications generally involve the processing of large amounts of data. (And tend to happen remotely from the end user.) — In contrast, high core count computing on small data exists but is irrelevant in most people's lives. — And yes, large dataset processing is trickling down to client computing all the time. But not by means of, say, a CPU with 32 very wide cores sitting on an LGA 1718 socket (edit, let alone a 1140 contacts BGA, for that matter).
 

Wolverine2349

Senior member
Oct 9, 2022
395
121
76
On topic, there has been at least one poster in the Zen 5 speculation thread claiming that Zen 6 sticks with ≤16 cores (and ≤8 cores per CCX) in the client segment.

Would be extremely disappointing if true. Not that we really need more than 16 cores on desktop, but we do need more than 8 on one CCX within one CCD. That means no cross-CCX latency penalty and the best all-around CPU for all types of games, even the occasional but increasingly common ones that can scale beyond 8 cores. Dual CCX and dual CCD setups have a bad cross-latency penalty and are not good for gaming.

Based on the description that Zen 6 is going to have 8 core, 16 core and 32 core CCDs, who knows. My concern, though, is that the 32 core CCDs are really just Zen 6C e-cores. The 16 core CCDs may be reserved for Threadripper, with 8 core CCDs again for mainstream desktop. Even if 16 core CCDs make it to mainstream desktop, they could be dual 8 core CCXs, which brings us back to Zen 2 latency levels (though of course with superior IPC), just with double the cores per CCX. The 8 core CCX began with Zen 3, but Zen 3 never had multiple CCXs per CCD, only one. And even if 16 core CCDs never go to mainstream desktop they still sadly may be dual 8 core CCX CCDs anyways on Threadripper and EPYC, so not even an option to pay up and get more than 8 big cores per CCX? And Zen 2 had terrible cross CCX latency beyond 4 cores, which is why the quad core 3300X was so good for light threaded apps, as it had one 4 core CCX.

Hopefully we get 12-16 core CCXs in single CCD with Zen 6.

Though I'm not holding my breath. In fact, doing the math on how these companies handle fabs and yields, it's unfortunately probable that the 32 core CCDs are Zen 6C e-cores and the 16 core CCDs are dual 8 core CCXs. It's cheaper that way: why would AMD make a dedicated 16 core CCX and dedicated 8 core CCXs when they can just cram two together for the higher binned parts and use one 8 core CCX for the lower binned ones? Hope to be wrong, but it's a gut feeling, unless I am mistaken about how single-CCX dies are made and how they yield compared to CCDs?

Last time we had more than 8 big cores on a single CCX/ring/die was Intel Comet Lake with 10 of them. But that arch is so outdated and so far behind in IPC compared to Zen 3 and newer. Intel is bigger with more resources and had its own node, so it was easier for them then. That was then, though; Intel is struggling more now compared to 2020, which is why they came up with e-cores and why their power consumption is so high.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,487
3,386
106
That really, really is not how cache coherency works!

I can safely say that they don't.
Abandon all hope.

I am going to keep a glimmer of hope on this one.

This would also help CCXs inside mobile chips, such as the 4 core Zen 5 complex and the 8 core Zen 5c complex (and their L3s) in Strix Point (which may be too optimistic).

Since Strix Halo is such a science project, maybe there...
 

StefanR5R

Elite Member
Dec 10, 2016
5,914
8,826
136
And even if 16 core CCDs never go to mainstream desktop they still sadly may be dual 8 core CCX CCDs anyways on Threadripper and EPYC, so not even an option to pay up and get more than 8 big cores per CCX?
Three thoughts:
  • Dual-CCX chiplets made sense for a) Zen 1…2 with their small 4-core CCXs, and b) for Bergamo with its smallish 8-dense-core halved-L3$ CCXs.
    Vice versa, there appears little reason to put two 8-large-core full-fat-L3$ CCXs onto a single chiplet. Maybe I am missing something though.
  • Maybe they'll make a CCD with a single 16-large-core full-fat-L3$ CCX for database servers?
  • 8-core CCXs actually seem fine to me for many workstation and HPC server uses (also a variety of other less compute oriented server uses) if these limited core count CCXs mean that the caches are fast and energy efficient. The latter has been true ever since Zen 1, if I am not mistaken.
 
Reactions: Tlh97

Wolverine2349

Senior member
Oct 9, 2022
395
121
76
Three thoughts:
  • Dual-CCX chiplets made sense for a) Zen 1…2 with their small 4-core CCXs, and b) for Bergamo with its smallish 8-dense-core halved-L3$ CCXs.
    Vice versa, there appears little reason to put two 8-large-core full-fat-L3$ CCXs onto a single chiplet. Maybe I am missing something though.
  • Maybe they'll make a CCD with a single 16-large-core full-fat-L3$ CCX for database servers?
  • 8-core CCXs actually seem fine to me for many workstation and HPC server uses (also a variety of other less compute oriented server uses) if these limited core count CCXs mean that the caches are fast and energy efficient. The latter has been true ever since Zen 1, if I am not mistaken.

Hopefully they have a 12-16 core single full-fledged CCX on the mainstream consumer/client platform, for more than 8 cores with superior core-to-core latency and no crossing of CCXs or CCDs.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,487
3,386
106
The whole idea of CCX is that they don't interact.
If you want large shared LLC, MALL is your friend.

MALL seems like a fine solution for a GPU alone, or for an APU where the CPU and GPU communicate, but not so great between 2 CCDs or CCXs. L3 on CCD or on V-Cache is just much faster than MALL.

I am not suggesting sharing of L3 caches, just of the L3s being a victim cache of each other. Sort of like MALL, but the storage of the data would not be with the memory controller, it would be dumped into unused area of one of the CCDs.

Latency from the 2nd CCD would be worse than from MALL, but better than from memory. But the beauty is that the SRAM is already there, on the 2nd CCD. And there could be more of it with V-Cache. V-Cache on both CCDs would come to productive use. In gaming, the 2 CCD CPU would outperform the 1 CCD CPU, and these premium 16 core, 2 CCD CPUs would become more popular. That would solve the unusual situations and corner cases of 7950X3D-like CPUs.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,549
5,116
96
but not so great between 2 CCDs or CCXs.
It's a shared LLC. Gets the job done.
L3 on CCD or on V-Cache is just much faster than MALL
That's why they're private!
just of the L3s being a victim cache of each other
Needlessly complicated for no gains?
But the beauty is that the SRAM is already there, on the 2nd CCD
Yes. FOR the 2nd CCD.
They're private caches.
Like, you can't share them; snoop, probe and tag check all happen on the IOD.
 
Reactions: Tlh97 and Joe NYC

jpiniero

Lifer
Oct 1, 2010
15,176
5,717
136
To me it seems like AMD should streamline their desktop parts:

- X3D for DIY/gamers
- The APU with a tiny IGP but a giant NPU cuz AI AI AI for OEMs, at least until AI hype ends.
- Small core product at Samsung/whatever is the cheapest viable node for the cheapskates
 
Reactions: Tlh97 and Joe NYC

adroc_thurston

Diamond Member
Jul 2, 2023
3,549
5,116
96