Question Zen 6 Speculation Thread

Page 52 - AnandTech community

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
I mean, if a CCD increases from 8 to 12 cores, that tells you where the focus is for AMD. The transistor budget will be mostly spent on increasing MT, and only a small part will be reserved for ST.
It suggests that and the slide does too. Is there any reason to be optimistic for someone who realistically doesn't need more than 12 cores? The 12 core CCD may help real world performance. They might even get another few hundred megahertz. It'll have a new IOd which should help real world performance. But as Strix Halo will show shortly, that does little for 1T performance.

The gap in 1T performance will grow, not shrink. But at least consumer Zen 6 should be able to extract more MT in workloads with a lot of bus contention.
 

Meteor Late

Senior member
Dec 15, 2023
266
291
96
It suggests that and the slide does too. Is there any reason to be optimistic for someone who realistically doesn't need more than 12 cores?

Nope, the target is not the vast majority of consumers, but server.
As usual, it's all about priorities. If you stay at 8 cores, you can spend many more transistors on single-core performance, or add more L3 cache for gaming, etc.
 

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
It suggests that and the slide does too. Is there any reason to be optimistic for someone who realistically doesn't need more than 12 cores? The 12 core CCD may help real world performance. They might even get another few hundred megahertz. It'll have a new IOd which should help real world performance. But as Strix Halo will show shortly that does little for 1T performance.

The gap in 1T performance will grow, not shrink. But at least consumer Zen 6 should be able to extract more MT in workloads with a lot of bus contention.
Since a 9950X has the ability to boost all cores to 5.7GHz, or close to it, given enough cooling, isn't the real issue with increasing MT frequency one more of thermal design? By thermal design I mean mitigating hot spots on the core and improving process to allow for faster switching with less heat. My point is that I'm not sure increasing max frequency is necessary but rather designing so that the cores can get closer to this frequency with less heat generated and in a way that the heat is more easily transferred to the cooling solution. I guess I'm saying create less heat and improve the heat transfer and that will automatically boost frequency.

One thing I've noticed with the 9950X is that it will boost to 5.7GHz on a couple of cores quite often and the Vcore will never hit 1.4V. My old Raptors, on the other hand, would often touch 1.5V boosting to the same or lower frequencies. The voltage fluctuations were wild on Raptor, even after the microcode update. This may have been due to errors on Intel's part with the programming, or it may be that Raptor often needed extremely high voltages to be stable.

AMD's boosting behavior and Vcore seem to be more in line with what I would expect to see. Raptor's behavior always bugged me and I could never contain it short of lowering max frequency to 4.8GHz or so.
 

gdansk

Diamond Member
Feb 8, 2011
3,768
6,015
136
Since a 9950X has the ability to boost all cores to 5.7GHz, or close to it, given enough cooling, isn't the real issue with increasing MT frequency one more of thermal design?
For workloads that don't communicate across threads. But that's the embarrassingly parallel class of problems like Cinebench.

For other problems you have a very fast core doing nothing because it is waiting on write or read of data shared by another thread which happens to be on the other CCD.

Larger CCD and new IOd should help with the latter but not so much with the former. N3 will do more for Cinebench. Although if it has more memory bandwidth that'll help with R24.
 

Meteor Late

Senior member
Dec 15, 2023
266
291
96
5% IPC after 2-3 years would be abysmal. AMD will quickly become irrelevant if they keep having mediocre gains like Zen5. Their competitors aren't standing still.

They care mostly about servers, that's the thing. General consumers are mostly irrelevant for AMD, more so with each passing year. Strix Halo is more proof: with how much it will cost, it seems targeted at AI and not much more, instead of giving users something closer to a PS5 or Series X APU for a good price.
But I don't think IPC will be 5%, that would be embarrassing. 10% is more what I expect, and with at least a 5% clock bump I doubt the ST gain will be less than 15%.
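The 15% figure follows from compounding: IPC and clock gains multiply, so 1.10 x 1.05 = 1.155, or roughly 15.5%. A minimal sketch of that arithmetic, where `st_gain` is a made-up helper name and the uplift values are the guesses from this post, not measurements:

```python
def st_gain(ipc_uplift: float, clock_uplift: float) -> float:
    """Combined single-thread uplift as a fraction (0.155 means 15.5%).

    IPC and clock gains multiply rather than add.
    """
    return (1.0 + ipc_uplift) * (1.0 + clock_uplift) - 1.0


# Guessed IPC uplifts from the thread, each paired with a 5% clock bump.
for ipc in (0.05, 0.10, 0.15):
    print(f"IPC +{ipc:.0%}, clock +5% -> ST +{st_gain(ipc, 0.05):.1%}")
```

This treats ST performance as simply IPC times frequency, which ignores memory-bound cases where a clock bump helps less.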
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
26,677
15,731
136
5% IPC after 2-3 years would be abysmal. AMD will quickly become irrelevant if they keep having mediocre gains like Zen5. Their competitors aren't standing still.
Just because gaming is like 5% better does not mean Zen 5 is horrible. Right now we are running a PG PPSE "test" and I have a Turin doing better than a 7950X with it running literally 50% slower. (That's PER CPU, not total.)

The 9950X is also WAY ahead of the 7950X per core, like 40% faster.

You seem to think that gaming is the only thing that makes a CPU.
 

GTracing

Senior member
Aug 6, 2021
275
645
106
Just because gaming is like 5% better does not mean Zen 5 is horrible. Right now we are running a PG PPSE "test" and I have a Turin doing better than a 7950x with it running literally 50% slower. (thats PER CPU not total)

the 9950x is also WAY ahead of the 7950x per core, like 40% faster.

You seem to think that gaming is the only thing that makes a CPU.
I didn't say anything about gaming. At my day job I'm a Flutter developer. It's hard to find benchmarks specifically for Dart, but general software development benchmarks show that Zen5 isn't much faster than Zen4 in this area. In comparison, Zen4 was a huge improvement over Zen3. Most client workloads are similar. I think your example is pretty niche.

I do not think that Zen5 is horrible, but it's not as big of an improvement as Zen4, Zen3, or Zen2 were. And that's with a nominal 13% IPC improvement. A theoretical Zen6 with 5% higher IPC would be horrible.
 
Reactions: Tlh97

Win2012R2

Senior member
Dec 5, 2024
647
609
96
5% IPC after 2-3 years would be abysmal
x86 has been optimised to death: the most common stuff is already like 1 cycle, superscalarity went through the roof (but that really works best with well optimised code), and branch prediction is at crazy good levels. The only good ways left to extract extra perf are SIMD and more cores.

Zen 6 with 2 x 12 cores, 3D cache and much better memory support will be awesome; supporting APX will be the cherry on top.
 

Meteor Late

Senior member
Dec 15, 2023
266
291
96
Why not 8 cores but much more L3 cache for the same area? I think that would be better for consumers.
 

Win2012R2

Senior member
Dec 5, 2024
647
609
96
Why not 8 cores but much more L3 cache for the same area? I think that would be better for consumers.
It's possible in the BIOS (mine at least) to disable individual cores, so one can always cut down 12 to 8.

There will be some factory binned very high clocked 8-10 core chips for sure!
 

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
5% IPC after 2-3 years would be abysmal. AMD will quickly become irrelevant if they keep having mediocre gains like Zen5. Their competitors aren't standing still.
Let me clarify my thought. I think there will be some applications that may see dramatic increases in ST performance, but mainly due to extensive use of special instructions. But looking at overall ST IPC I think the days of +19% ST gains are over. The low hanging fruit is picked and there is only so much instruction parallelism that can be extracted from code. I think we're looking at somewhere between a 5 and 10% IPC increase for the next generation of x86 CPUs. Hopefully I am wrong, but both Intel and AMD seem to have had a pretty sudden drop in IPC gains from last generation to this one and I think it's not an anomaly but a trend. Again, I don't want to be a doomsayer, it's just my opinion based on current trends.

Skymont is a good example. Huge increase in ST IPC because it was so far behind Raptor and Zen 5 to begin with, but it's still 10% or so behind. When you get near that asymptote, gains are hard to come by.

Hey, somebody has to make outrageous claims (like I invented the question mark) to keep things lively around here!
 

OneEng2

Senior member
Sep 19, 2022
385
590
106
It may not be a design focus of Zen 6 but 1T integer performance yields the most benefit to the most users.

And outside of a few people here we're buying interactive workstations or consumer PCs. Zen 6 isn't going to be good for us, it'll be behind the ARM competition for both workstation and laptop. In certain areas in the server space (core spam) it's likely it will lose to Darkmont and its derivatives.
Not on PCs. The Windows OS alone runs a crazy number of tasks. The number of thread handles in use at any given time on most PCs is in the hundreds (note, many of them are idle, but a significant number aren't).

A 10-15% IPC increase and going from 8 or 16 cores (double the number of threads) to 12 or 24 cores is a very big deal. Additionally, I expect a large increase in throughput and lower latency with the new IOD and faster RAM.
For workloads that don't communicate across threads. But that's the embarrassingly parallel class of problems like Cinebench.

For other problems you have a very fast core doing nothing because it is waiting on write or read of data shared by another thread which happens to be on the other CCD.

Larger CCD and new IOd should help with the latter but not so much with the former. N3 will do more for Cinebench. Although if it has more memory bandwidth that'll help with R24.
That isn't how threading in general works. Dependencies between threads are intentionally programmed in, and many threads can READ the same memory at the same time. Only writes require semaphores ... and even THAT only holds up the execution of the threads that are contending for that specific variable at the same time.
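A minimal sketch of that pattern, in Python for brevity. The names `SAMPLES`, `total`, `total_lock` and `worker` are made up for illustration. It assumes nobody writes to the shared read-only data while the workers run; if there were a concurrent writer, the readers would need synchronization too.

```python
import threading

# Read-only data shared by every worker. Concurrent reads need no lock
# as long as nothing writes to it while the workers are running.
SAMPLES = tuple(range(1_000))

total = 0                      # shared and written by every worker
total_lock = threading.Lock()  # guards writes to `total` only


def worker(start: int, stop: int) -> None:
    global total
    partial = sum(SAMPLES[start:stop])  # lock-free read of shared data
    with total_lock:                    # only the write is serialized
        total += partial


threads = [
    threading.Thread(target=worker, args=(i, i + 250))
    for i in range(0, 1_000, 250)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(total)  # 499500, the sum of 0..999
```

Note that CPython's global interpreter lock means this shows the locking pattern rather than a real parallel speedup; the same structure in C++ or Rust would run the reads truly in parallel.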

Cinebench is, IMHO, a low compute, high bandwidth benchmark that is able to very easily break its task (rendering frames of images in a video) into fully independent threads. I agree that this is unusual and that the majority of threaded applications are not able to scale to an infinite number of cores and threads as this particular task does.

FWIW, I think Zen 6 is going to knock the socks off of CB.
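For the applications that don't scale like CB, Amdahl's law gives a rough feel for what going from 8 to 12 cores buys. This is a textbook model brought in for illustration, not something measured on any Zen part, and the parallel fractions below are arbitrary examples:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative parallel fractions, not measurements of any real workload.
for p in (0.50, 0.90, 0.99):
    s8 = amdahl_speedup(p, 8)
    s12 = amdahl_speedup(p, 12)
    print(f"p={p:.2f}: 8 cores {s8:.2f}x, 12 cores {s12:.2f}x, "
          f"gain from the extra cores {s12 / s8 - 1:.1%}")
```

The closer the parallel fraction gets to 1, the closer the 8-to-12 step gets to the full 50%; anything with a meaningful serial part sees much less.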
5% IPC after 2-3 years would be abysmal. AMD will quickly become irrelevant if they keep having mediocre gains like Zen5. Their competitors aren't standing still.
10-15% IPC, and AMD will be very relevant.

Competitors also are limited by the decreasing improvements in process technology and the increasing times between node improvements.

It isn't like people are constantly coming up with new original design ideas that boost performance by 20% or more. It is getting to be pretty thin pickings in the "whoa, look at that performance" design changes.... and even back in the day, it was done by adding LOTS more transistors.

Take the discussions on AVX512. WHY is AVX512 performing SO much better in certain applications than non-AVX512? Answer: it provides a super-wide (512-bit) data path, as opposed to the 64-bit GPRs and the instructions that operate on those registers ... so of course, in apps that need to crunch lots of information, AVX512 is going to be up to 8 times faster ... but it takes a lot more transistors to do it.
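The "up to 8 times" is just the width ratio: 512 / 64 = 8 elements per instruction when the elements are 64-bit. A quick sketch of that arithmetic (`lanes` is a made-up helper; this is an upper bound from register width alone and says nothing about clocks, memory bandwidth, or how well the code vectorizes):

```python
VECTOR_BITS = 512  # AVX-512 register width
GPR_BITS = 64      # x86-64 general-purpose register width


def lanes(vector_bits: int, element_bits: int) -> int:
    """How many elements of a given width fit in one vector register."""
    return vector_bits // element_bits


print(lanes(VECTOR_BITS, GPR_BITS))  # 8 lanes of 64-bit elements
print(lanes(VECTOR_BITS, 32))        # 16 lanes of 32-bit elements
```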
I didn't say anything about gaming. At my day job I'm a Flutter developer. Its hard to find benchmarks specifically for Dart, but general software development benchmarks show that Zen5 isn't much faster than Zen4 in this area. In comparison, Zen4 was a huge improvement over Zen3. Most client workloads are similar. I think your example is pretty niche.

I do not think that Zen5 is horrible, but it's not as big of an improvement as Zen4, Zen3, or Zen2 were. And that's with a nominal 13% ipc improvement. A theoretical Zen6 with 5% higher ipc would be horrible.
I love Flutter. Great environment for doing cross-platform SIMPLE applications. At its core, it has a GUI thread and a main thread. It has ways to code like you actually have threads, but not like Win32 and C++ at all. And you do realize, I assume, that hardly anyone uses Flutter for PC applications.

Zen 5 was a huge improvement over Zen 4 for server workloads, and was a decent improvement in client in performance as well as efficiency.

My belief is that you can't get big gains in performance without gains in process technology. Since process technology is slowing down to a crawl compared to the past few decades, I think performance gains (and therefore processor releases) are going to slow down as well.
 
Reactions: Tlh97

reaperrr3

Member
May 31, 2024
55
188
66
I didn't say anything about gaming. At my day job I'm a Flutter developer. Its hard to find benchmarks specifically for Dart, but general software development benchmarks show that Zen5 isn't much faster than Zen4 in this area. In comparison, Zen4 was a huge improvement over Zen3. Most client workloads are similar. I think your example is pretty niche.

I do not think that Zen5 is horrible, but it's not as big of an improvement as Zen4, Zen3, or Zen2 were. And that's with a nominal 13% ipc improvement. A theoretical Zen6 with 5% higher ipc would be horrible.
Contrary to what some people here seem to think, I believe Zen6 will be a much more substantial uplift than Zen5 in terms of absolute performance (for client, in any case).

Zen5's problems in client seem to stem from a mix of DC focus (fullrate AVX512, which ate a lot of the transistor budget and is probably the main reason why stuff like the int PRF wasn't upgraded much), tiny process uplift (N4P vs. N5P isn't much of an upgrade) and the Zen4 IOD/GMI3 links being a bit insufficient to keep Zen5 fed, + the loss of optimizations due to the clean-slate redesign of the core.
The lack of any L2/L3 capacity upgrades didn't help, either.

The "shrink + improvements/optimizations" gens - Zen2 and Zen4 - have both been rather solid in terms of absolute performance uplift.

Zen6 will likely be focused on dealing with the bottlenecks outside the core, as well as the integer bottlenecks inside the core.

My guess would be

Core improvements:
- Int scheduler entries for ALU/AGU upgraded from 88/56 to 96/64 or whereabouts
- Int PRF upgraded from 240 to at least 288, perhaps even 336 entries (336 would mean 56 per ALU, like Zen4 had)
- ROB upgraded from 448 to at least 512 entries
- smaller other upgrades throughout the core, including in the FPU area
- return of some optimizations that accelerate some ops (or no-ops via NOPS fusion) substantially
- at least 300, but more likely 500-600 and maybe even 700-800 MHz turbo clock uplift (not all-core, but at least for some of them), thanks to N3P + smart usage of 2-2 and 3-2 fin transistors where it's worth it

Uncore improvements:
- although no increase in L3 per core, cache/bandwidth-sensitive heterogeneous workloads (aka not all threads equally heavy) will benefit from the 50% larger L3 per CCD
- less cross-CCD context switch penalties due to bigger CCDs + faster chiplet interconnect
- bandwidth improvements from faster chiplet connection + faster memory support

Wouldn't surprise me if INT-heavy workloads - and therefore many client and semi-professional workloads - would see a bigger effective IPC uplift on Zen6 than what Zen5 gave us.
And then the turbo clock bumps and more cores on top of that.
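The per-ALU and per-CCD figures in the lists above are easy to sanity-check. A rough sketch, assuming 6 integer ALUs per Zen 5 core, a 224-entry integer PRF shared by 4 ALUs on Zen 4, and 4 MB of L3 per core (32 MB per 8-core CCD). Those three numbers come from public core diagrams and spec sheets, not from this post, and the Zen 6 PRF sizes are pure guesses:

```python
# Assumptions (public figures, not from the post): ALU counts, the Zen 4
# integer PRF size, and 4 MB of L3 per core.
ZEN4_INT_PRF, ZEN4_ALUS = 224, 4
ZEN5_INT_PRF, ZEN5_ALUS = 240, 6
ZEN6_PRF_GUESSES = (288, 336)  # speculative sizes from the list above
L3_MB_PER_CORE = 4


def prf_per_alu(entries: int, alus: int) -> float:
    """Integer PRF entries available per ALU."""
    return entries / alus


def l3_growth(old_cores: int, new_cores: int) -> float:
    """Fractional L3 growth per CCD at constant L3 per core."""
    old_mb = old_cores * L3_MB_PER_CORE
    new_mb = new_cores * L3_MB_PER_CORE
    return new_mb / old_mb - 1.0


print("Zen 4:", prf_per_alu(ZEN4_INT_PRF, ZEN4_ALUS))  # 56.0
print("Zen 5:", prf_per_alu(ZEN5_INT_PRF, ZEN5_ALUS))  # 40.0
for guess in ZEN6_PRF_GUESSES:
    # Assumes the ALU count stays at 6, which is not confirmed anywhere.
    print(f"Zen 6 guess ({guess}):", prf_per_alu(guess, ZEN5_ALUS))
print(f"L3 per CCD, 8 -> 12 cores: +{l3_growth(8, 12):.0%}")  # +50%
```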
 

DrMrLordX

Lifer
Apr 27, 2000
22,368
12,175
136
Zen5's problems in client seem to stem from a mix of DC focus (fullrate AVX512, which ate a lot of the transistor budget and is probably the main reason why stuff like the int PRF wasn't upgraded much), tiny process uplift (N4P vs. N5P isn't much of an upgrade) and the Zen4 IOD/GMI3 links being a bit insufficient to keep Zen5 fed, + the loss of optimizations due to the clean-slate redesign of the core.
The lack of any L2/L3 capacity upgrades didn't help, either.
NPU integration didn't do Strix Point any favors either.
 

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
Contrary to what some people here seem to think, I believe Zen6 will be a much more substantial uplift than Zen5 in terms of absolute performance (for client, in any case).

Zen5's problems in client seem to stem from a mix of DC focus (fullrate AVX512, which ate a lot of the transistor budget and is probably the main reason why stuff like the int PRF wasn't upgraded much), tiny process uplift (N4P vs. N5P isn't much of an upgrade) and the Zen4 IOD/GMI3 links being a bit insufficient to keep Zen5 fed, + the loss of optimizations due to the clean-slate redesign of the core.
The lack of any L2/L3 capacity upgrades didn't help, either.

The "shrink + improvements/optimizations" gens - Zen2 and Zen4 - have both been rather solid in terms of absolute performance uplift.

Zen6 will likely be focused on dealing with the bottlenecks outside the core, as well as the integer bottlenecks inside the core.

My guess would be

Core improvements:
- Int scheduler entries for ALU/AGU upgraded from 88/56 to 96/64 or whereabouts
- Int PRF upgraded from 240 to at least 288, perhaps even 336 entries (336 would mean 56 per ALU, like Zen4 had)
- ROB upgraded from 448 to at least 512 entries
- smaller other upgrades throughout the core, including in the FPU area
- return of some optimizations that accelerate some ops (or no-ops via NOPS fusion) substantially
- at least 300, but more likely 500-600 and maybe even 700-800 mhz turbo clock uplift (not all-core, but at least for some of them), thanks to N3P + smart usage of 2-2 and 3-2 fin transistors where it's worth it

Uncore improvements:
- although no increase in L3 per core, cache/bandwidth-sensitive heterogenous workloads (aka not all threads equally heavy) will benefit from the 50% larger L3 per CCD
- less cross-CCD context switch penalties due to bigger CCDs + faster chiplet interconnect
- bandwidth improvements from faster chiplet connection + faster memory support

Wouldn't surprise me if INT-heavy workloads - and therefore many client and semi-professional workloads - would see a bigger effective IPC uplift on Zen6 than what Zen5 gave us.
And then the turbo clock bumps and more cores on top of that.
Great post. I am going to be the contrarian here. Over 6GHz is the equivalent of the sub 2 hour marathon. It can be done but only with cheating. Either wind aided for the runners or with resulting catastrophic degradation for the silicon. As always, I'd love to see it. But I'm not thinking it will happen in this decade with ambient cooling.

As I sit here and edit video my computer is generally silent. Then I apply Mercalli video stabilization to a clip, 10 seconds of heavy compute, and I hear the fans spin up for 10 seconds and then spin down. The interesting thing is that the AIO is unlikely to be heat soaked in 10 seconds; the fans are responding to the temporary increase in die temp, which dissipates seconds after the heavy compute concludes. I'm thinking I might have the fans spin up much more slowly, to the point where they almost don't spin up at all, because while they are starting to ramp up the core temps will already be coming down. If the temps stay high for a while, which means the coolant is warming up, they will of course keep running fast.

Going to try it. Yup, fan just started to ramp up and then backed down. Core temps did the same quick up down they would have done if the fans spun up or not.
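The idea can be sketched as a rate-limited fan curve: ramp up slowly, ramp down quickly. This is a toy simulation, not any vendor's fan-control API; the function name, the 80C threshold, the duty percentages and the ramp rates are all made-up values for illustration.

```python
def fan_duty(temps_c, threshold_c=80.0, idle_pct=30.0, high_pct=100.0,
             up_pct_per_s=2.0, down_pct_per_s=10.0):
    """Per-second fan duty for a list of per-second die temperatures.

    The duty moves toward its target by a limited step each second, so a
    short temperature spike barely registers while a sustained load still
    reaches full speed.
    """
    duty = idle_pct
    history = []
    for temp in temps_c:
        target = high_pct if temp > threshold_c else idle_pct
        if target > duty:
            duty = min(target, duty + up_pct_per_s)    # slow ramp up
        else:
            duty = max(target, duty - down_pct_per_s)  # quick ramp down
        history.append(duty)
    return history


burst = [45.0] * 5 + [88.0] * 10 + [45.0] * 5  # 10 s of heavy compute
sustained = [45.0] * 5 + [88.0] * 40           # long render

print(max(fan_duty(burst)))     # 50.0: fans only get partway up
print(fan_duty(sustained)[-1])  # 100.0: full speed once the load persists
```

Most BIOS and fan-control tools expose something similar as a step-up time or hysteresis setting, though the exact names vary.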
 
Reactions: Tlh97

poke01

Diamond Member
Mar 8, 2022
3,035
4,008
106
AMD will be fine; they've got leadership on desktop and servers. Intel is currently producing wack P cores.
 

adroc_thurston

Diamond Member
Jul 2, 2023
4,714
6,501
96
AMD will be fine they got leadership on desktop and servers. Intel is currently producing wack P cores..
Intel's not gonna be asleep at the wheel forever. Unified Core is coming, after all.
But at least ARM self-immolated Neoverse since any X5 derivative will have beyond prohibitive costs for server applications due to gigabloat.
 