Discussion Zen 5 Speculation (EPYC Turin and Strix Point/Granite Ridge - Ryzen 9000)


Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,243
1,679
136
The supply of V-cache dies being a limiting factor, using two of them in a single CPU prevents the "birth" of another CPU.

Suppose AMD has 10 V-cache dies. They can make 10 X3D CPUs with that or they can choose to make 5 X3D CPUs. Which one nets them the most profit?

It's the same issue that prevented Intel from going all in with eDRAM. You can produce 100,000 CPUs a month in a fab without eDRAM but with eDRAM, the production rate drops to about 45,000 per month. Which one do you think the bean counters gonna approve of? They can't sell the eDRAM CPU for twice as much coz it doesn't offer twice as much value. So their only option was to drop it.
Approaching this from a different angle, they clearly stated they are working on "cool differentiators" plural, vs previous gen X3D models. There is only a small handful of things they could conceivably do, and I covered almost all of them.

The question really is, which things did they actually choose/manage to do?

a) clockspeed parity
b) improved thermal interface
c) fully unlocked overclocking
d) increased cache size
e) dual V$ chiplet

If anyone else has any ideas what else they could possibly do here, add to the list.

I suppose if they wanted to, they could offer two dual CCD skus, one with a single V$ chiplet and a price premium model with dual V$. 9950X3D and 9955X3D, if you will.
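
A rough sketch of the die-supply argument quoted at the top of this post, assuming V-cache dies really are the bottleneck. The per-CPU margins below are made-up placeholders, not AMD figures, and the function names are only for illustration:

```python
def cpus_from_dies(dies: int, dies_per_cpu: int) -> int:
    """How many CPUs a fixed pool of V-cache dies can build."""
    return dies // dies_per_cpu


def total_margin(dies: int, dies_per_cpu: int, margin_per_cpu: float) -> float:
    """Total margin if every available die goes into one kind of CPU."""
    return cpus_from_dies(dies, dies_per_cpu) * margin_per_cpu


def break_even_dual_margin(single_margin: float) -> float:
    """Margin a dual V$ CPU must earn to match two single V$ CPUs."""
    return 2 * single_margin


if __name__ == "__main__":
    dies = 10                  # the "10 V-cache dies" from the quote
    single_margin = 100.0      # hypothetical margin per single V$ CPU
    dual_margin = 150.0        # hypothetical margin per dual V$ CPU

    print(total_margin(dies, 1, single_margin))   # ten single V$ CPUs
    print(total_margin(dies, 2, dual_margin))     # five dual V$ CPUs
    print(break_even_dual_margin(single_margin))  # margin needed to break even
```

If dies are not actually the constraint, the comparison changes, since a dual V$ model would then add margin on top of existing sales rather than displace a second CPU.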
 
Jul 27, 2020
17,892
11,678
116
a) clockspeed parity <<< maybe
b) improved thermal interface <<< maybe
c) fully unlocked overclocking <<< partially (not fully but maybe better than every X3D before)
d) increased cache size <<< unlikely
e) dual V$ chiplet <<< unlikely

@adroc_thurston , what do you think?
 

DavidC1

Senior member
Dec 29, 2023
387
576
96
It's the same issue that prevented Intel from going all in with eDRAM. You can produce 100,000 CPUs a month in a fab without eDRAM but with eDRAM, the production rate drops to about 45,000 per month. Which one do you think the bean counters gonna approve of? They can't sell the eDRAM CPU for twice as much coz it doesn't offer twice as much value. So their only option was to drop it.
I think the real issue is that it's a constant fight between consumers who want the most for as little as possible and companies looking to charge as much as possible for as little as possible.

Silicon is called the 21st century gold mine for a reason. You take sand, the most common material on the Planet, and after many refinements sell for 10x the cost of the material. Most is still R&D and cost in tools anyway.

You can sell a high end CPU with eDRAM or V-cache, but it's not like you can take a regular CPU and just slap it on. It costs thermals and thus clocks, so the performance ain't exactly the same in all applications. And now you pay for much more silicon, which by itself is cheap as dirt, but making it work as eDRAM or cache is not. You have to pay for the factory space.

You aren't amortizing the cost of eDRAM by tens of millions on cheapie $400 laptops, it's entirely on the high end chips.
 

DavidC1

Senior member
Dec 29, 2023
387
576
96
Thoughts on Zen 5:

Based on David Huang's analysis, it seems the architecture isn't a straight up boost, but a reduction in some aspects, a compromise made to make the design more efficient whether in area or power or both.

Also based on his tests, the dual decoder setup is relatively primitive compared even to the one in Tremont, which can reach 5-wide decode (limited by the backend) in some cases for single thread, and definitely a big step back from Gracemont, which can do 5-wide all the time, at least for ST.

If it's true that it can do 8-wide only with SMT, and if we take the Skymont chief architect's comment that 3/6/9/12 is the optimal clustered decoder setup, then AMD may have done 2x4 for the sake of increasing performance on servers while making it more power efficient for PCs by having one cluster off.

At the end of the article Huang says "Intel's most dynamic team, the Atom one having similar conclusions to AMD's".
 
Jul 27, 2020
17,892
11,678
116
Installed latest librecalc:
First run: 14s
Second run: 9s
The thing with this benchmark is, it looks pretty heavy and gives you a sense that the PC is struggling to do a lot at once and it's really hard to get the time lower. I mean, the price difference between my 12700K and your 7950X3D probably doesn't justify the two second time difference but it really shows how much ST performance you need to throw at this benchmark for it to budge. It would be crazy if a simple 9600X has enough ST oomph to do the first run in 11s.
 

Timmah!

Golden Member
Jul 24, 2010
1,463
729
136
The supply of V-cache dies being a limiting factor, using two of them in a single CPU prevents the "birth" of another CPU.

Suppose AMD has 10 V-cache dies. They can make 10 X3D CPUs with that or they can choose to make 5 X3D CPUs. Which one nets them the most profit?

It's the same issue that prevented Intel from going all in with eDRAM. You can produce 100,000 CPUs a month in a fab without eDRAM but with eDRAM, the production rate drops to about 45,000 per month. Which one do you think the bean counters gonna approve of? They can't sell the eDRAM CPU for twice as much coz it doesn't offer twice as much value. So their only option was to drop it.
I think this whole “supply of V-cache dies being the limiting factor” argument is a bit overstated. If they were so hard to come by, I don't think we would see them in the “client” space at all; everything would be going into servers.

And if we did, there would be a significant price increase for V-cache equipped products, and that did not really happen either.

Finally, we have been told over and over by Adroc over the past year how the DIY space is minuscule and of no consequence to AMD, how we literally don't exist. And then the 16c CPU, by virtue of being the most expensive, is surely gonna be only a small part of overall sales there, but AMD can't find any additional V-cache dies for those?
 
Jul 27, 2020
17,892
11,678
116
I think this whole “supply of V-cache dies being the limiting factor” argument is a bit overstated. If they were so hard to come by, I don't think we would see them in the “client” space at all; everything would be going into servers.
Maybe these consumer V-cache dies are rejected dies that didn't make the cut for server requirements? Like suppose they have some stringent engineering validation routine that pinpoints how many errors may occur in a million reads/writes. If certain dies are not meeting this quality metric, AMD can't afford the bad publicity of an enterprise customer saying that their important financial data (even a single flipped bit can cause chaos there) was corrupted on an AMD server. So they need to recoup their investment, and what better way than to throw these dies at us mere peasants.

Maybe adroc is right and their first priority for high quality stuff really is the server/workstation market. We only get the scraps.
 

Jan Olšan

Senior member
Jan 12, 2017
312
402
136
Not for Apple it seems. None of their SoCs seem to have issues at launch. That's vertical integration and a great validation team for you.
So I wouldn't say laptops are hard, but Intel needs to step up their standards and not release such products that are a waste of sand.
They had a bug that was causing accelerated wear of their (fixed) SSDs. I'll take any perceived or real gripes with AMD or Intel over that, any day.
 
Jul 27, 2020
17,892
11,678
116
Look on eBay:





When the price difference between identical core count server CPUs due to V-cache is more than $2000, no way AMD is just gonna give away their V-cache dies. It also cuts into their server marketshare because people and even companies could start using the dual V-cache CPUs for their commercial workloads instead of investing in a server.
 
Reactions: Tlh97 and Markfw

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,243
1,679
136
Look on eBay:

View attachment 102159

View attachment 102160

When the price difference between identical core count server CPUs due to V-cache is more than $2000, no way AMD is just gonna give away their V-cache dies. It also cuts into their server marketshare because people and even companies could start using the dual V-cache CPUs for their commercial workloads instead of investing in a server.
I didn't know you could easily substitute a small 16 core desktop CPU for a 96-128-192 core server CPU.

Also, it was mentioned previously in this thread that Turin doesn't share silicon with Granite Ridge, so it may not be as simple as choosing the most profitable end product for a given chiplet to be used in.

As I mentioned above, AMD could choose to release both single and dual V$ models and charge a price premium that is not too far off from the per-CCD premium in the Genoa-X parts you listed. $700 for a 9950X3D and $800 for a 9955X3D with dual V$ gives them a $100 premium. That's far, far away from "giving the V$ dies away".
 
Reactions: Tlh97 and Timmah!
Jul 27, 2020
17,892
11,678
116
$700 for a 9950X3D and $800 for a 9955X3D with dual V$ gives them a $100 premium. That's far, far away from "giving the V$ dies away".
Problem is, some enterprise workloads see a huge boost from increased cache. Giving two cache dies would boost the workloads enough for some enterprise customers to consider these dual V-cache consumer CPUs for their workloads, thus leading to loss of server CPU sales to AMD.

Let's forget companies in general. What if AMD releases a dual V-cache consumer chip and the cloud providers scoop up ALL current and future inventory of next two years for upgrading their servers because the value proposition is too good to pass up? Can AMD say no? Will their board let Lisa say no? I'm saying this is a possibility because the production capacity of V-cache CPUs probably numbers in the hundreds of thousands per year whereas vanilla CPUs are in the millions. We'll be the losers if that happens and we'll only get those used V-cache CPUs to run in server mobos after they are too old for the cloud providers (meaning 5 years old at least).
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,243
1,679
136
Problem is, some enterprise workloads see a huge boost from increased cache. Giving two cache dies would boost the workloads enough for some enterprise customers to consider these dual V-cache consumer CPUs for their workloads, thus leading to loss of server CPU sales to AMD.

Let's forget companies in general. What if AMD releases a dual V-cache consumer chip and the cloud providers scoop up ALL current and future inventory of next two years for upgrading their servers because the value proposition is too good to pass up? Can AMD say no? Will their board let Lisa say no? I'm saying this is a possibility because the production capacity of V-cache CPUs probably numbers in the hundreds of thousands per year whereas vanilla CPUs are in the millions. We'll be the losers if that happens and we'll only get those used V-cache CPUs to run in server mobos after they are too old for the cloud providers (meaning 5 years old at least).
Let's forget that there are inherent memory capacity limitations to consumer platform and only one socket per motherboard. A cloud provider couldn't possibly make buying consumer cpus work due to the considerably lower rack density possible. Datacenter rackspace isn't cheap, and you could conceivably need 4-8x or even more rackspace for the same core count.
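
To put a rough number on the density point: a minimal sketch that only counts how many single-socket desktop CPUs it takes to match the core count of one server node. The dual-socket node is an assumption, the 16-core desktop part and the 96/128/192 core counts are the ones mentioned in this thread, and it says nothing about actual rack units, memory capacity, or power:

```python
import math


def desktop_cpus_needed(server_cores_per_socket: int,
                        sockets_per_node: int = 2,
                        desktop_cores: int = 16) -> int:
    """Desktop CPUs (one per board) needed to match one server node's cores."""
    node_cores = server_cores_per_socket * sockets_per_node
    return math.ceil(node_cores / desktop_cores)


if __name__ == "__main__":
    for cores in (96, 128, 192):
        boards = desktop_cpus_needed(cores)
        print(f"{cores}-core sockets, dual-socket node: {boards} desktop boards")
```

Every one of those boards needs its own slot, power, and networking, which is where the extra rackspace comes from.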
 
Jul 27, 2020
17,892
11,678
116
Datacenter rackspace isn't cheap, and you could conceivably need 4-8x or even more rackspace for the same core count.
How about mini-ITX mobos stacked together in tight spaces? These cloud providers are crazy. They'll do anything.

A 32 thread V-cache instance that they can rent out to a customer for 5 years at even $200 a month would pay itself many times over for them.
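
Back-of-the-envelope version of that claim, as a sketch only. The $200 a month and 5 years come from the sentence above; the $800 hardware figure is just the hypothetical dual V$ CPU price floated earlier in the thread, and it ignores the motherboard, RAM, power, cooling, rackspace, and any idle time:

```python
def rental_revenue(monthly_rate: float, years: int) -> float:
    """Gross revenue from renting one instance continuously."""
    return monthly_rate * 12 * years


def payback_multiple(revenue: float, hardware_cost: float) -> float:
    """How many times over the revenue covers the hardware cost."""
    return revenue / hardware_cost


if __name__ == "__main__":
    revenue = rental_revenue(200.0, 5)
    print(revenue)                            # gross revenue over 5 years
    print(payback_multiple(revenue, 800.0))   # multiple of the CPU price alone
```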
 

Timmah!

Golden Member
Jul 24, 2010
1,463
729
136
Problem is, some enterprise workloads see a huge boost from increased cache. Giving two cache dies would boost the workloads enough for some enterprise customers to consider these dual V-cache consumer CPUs for their workloads, thus leading to loss of server CPU sales to AMD.
I really doubt this is gonna happen to an extent that would be felt by AMD, and anyway, even if it did, we really should not care one bit about what is most profitable for them.
 

Hail The Brain Slug

Diamond Member
Oct 10, 2005
3,243
1,679
136
Just curious, why would one even need unlocked OC mode in X3D. This is essentially a relic of the past for modern CPUs.
As it stands 7000 series X3D cannot have the voltage increased, nor the clockspeed increased. Ignoring static overclocking, that means you can't use PBO to tune the 3D cache chiplet at all outside reducing voltage with CO. No squeezing out 100-200MHz boost clocks.

Edit: if there's more nuance to the limitations than I implied, @Det0x would know.
 

eek2121

Diamond Member
Aug 2, 2005
3,046
4,268
136
Look on eBay:

View attachment 102159

View attachment 102160

When the price difference between identical core count server CPUs due to V-cache is more than $2000, no way AMD is just gonna give away their V-cache dies. It also cuts into their server marketshare because people and even companies could start using the dual V-cache CPUs for their commercial workloads instead of investing in a server.


ERM, you need to divide that by the number of CCDs with V-Cache. They may appear to make slightly more with EPYC, but without knowing retailer margins we have no way of knowing for sure.
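
A quick sketch of that division. The attachments aren't visible here, so the inputs are placeholders: the $2000 delta is the figure quoted above, and 12 V-cache CCDs per server CPU is an assumption for illustration, not something read off the listings:

```python
def premium_per_vcache_ccd(price_delta: float, vcache_ccds: int) -> float:
    """Spread a V-cache price premium across the CCDs that carry V-cache."""
    if vcache_ccds <= 0:
        raise ValueError("vcache_ccds must be positive")
    return price_delta / vcache_ccds


if __name__ == "__main__":
    # Hypothetical server case: $2000 delta over an assumed 12 V-cache CCDs.
    print(round(premium_per_vcache_ccd(2000.0, 12), 2))
    # Desktop case from the thread: $100 premium for one extra V-cache die.
    print(premium_per_vcache_ccd(100.0, 1))
```

These are listing prices, not margins, so this only bounds what a buyer pays per V-cache CCD.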

I will wager a guess and say AMD does have a somewhat higher margin with EPYC and they use that to help offset client, but then again, it likely will vary based on the model and whether the part is high volume or not.

As to why AMD would settle for lower margins on some parts? Revenue. The cheaper the parts, the more they can move. It is a huge balancing act and it takes experience and understanding the market. I don’t envy the bookkeepers, though the entire world of production/manufacturing, distribution, etc has always been fascinating to me, especially because of lean…

I miss being in the sector TBH, it has been almost a decade since I worked in that area.

At any rate, I do hope we get a dual-CCD V-Cache part. AMD has the opportunity to sell me a holy-grail type of product. They only have to make it.

Also, there is another possibility: AMD could differentiate based on the size of the V-Cache.
 
Reactions: Tlh97 and Joe NYC
Jul 27, 2020
17,892
11,678
116
At any rate, I do hope we get a dual-CCD V-Cache part. AMD has the opportunity to sell me a holy-grail type of product. They only have to make it.

Also, there is another possibility: AMD could differentiate based on the size of the V-Cache.
Personally I have zero hope that any of these fantasies will come true. There is just no competition to force AMD's hand.
 
Reactions: Tlh97 and Makaveli

Det0x

Golden Member
Sep 11, 2014
1,055
3,090
136
What AMD needs is something like Intel Thread Director, a hardware solution that dynamically and in a fraction of a millisecond, schedules the workload on the appropriate die. They could make things easier for themselves by doing the "software hints" thing that Intel has done and then game developers or any other developers with cache hungry software can hint to the OS to run their instructions on the V-cache die only.

The jumping between a cache CCD and a frequency CCD on dual CCD X3D CPUs is an issue with regard to prioritizing threads, and AMD's solution for it is really clunky and obviously suboptimal, as proven by the fact that people run software to assign specific games or apps to certain cores.

However, inter-CCD communication is still an issue with all dual CCD CPUs, including both the 7950X and 7950X3D. How big is that latency penalty? I'm asking because we currently seem to be obsessed with how an X3D CPU with dual cache CCDs would perform. We only have the data from the vanilla and the 'uneven' X3D models though. I am a fan of simpler designs, and I enjoy the peace of mind of my 7950X not needing more thread scheduling logic, but I do realize that this CPU also has a small penalty when data moves between CCDs. A CPU with two X3D cache CCDs would also have this penalty. Can anyone make a qualified guess at how much of an issue that would be compared to the 7950X and 7950X3D? Would it, for instance, be more of an issue in applications that benefit more from more cache? As much as I dislike the need for cache/frequency logic, I'm open to the idea that inter-CCD communication could be a significant part of the problem, and that is not going away in the upcoming generation even if AMD (unlikely) offers dual cache CCD X3D CPUs.
Windows-key + G --> "This is a game" keeps the program running on CCD0 (no scheduling issues)
It needs an updated Game Bar and AMD Windows driver package, with the Balanced power plan, to work properly

V-cache on both CCDs would limit max ST boost clocks because of the voltage limit (~1.2 V vs ~1.35 V)
 

Timmah!

Golden Member
Jul 24, 2010
1,463
729
136
Windows-key + G --> "This is a game" keeps the program running on CCD0 (no scheduling issues)
It needs an updated Game Bar and AMD Windows driver package, with the Balanced power plan, to work properly

V-cache on both CCDs would limit max ST boost clocks because of the voltage limit (~1.2 V vs ~1.35 V)
That might not be the case with Zen 5 if it needs just 1.2 V for its max clocks, as per one of the rumors, if I understood it correctly.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,331
2,942
106
Problem is, AMD makes a lot more by placing a single cache die on a CPU than if they were to place two dies on the same CPU. By placing two dies on the same CPU, they would have to recover the extra margin directly from the customer, which would make the CPU very expensive and too unattractive for mass market appeal.

Among the V-Cache possibilities, there is the option of a single die of 2x the size, covering the full CCD with 2x the SRAM. And if V-Cache covered the whole chip, AMD could use Wafer on Wafer stacking.

Wafer on Wafer stacking should be so much cheaper than the existing method of individually placing 1 SRAM die and 2 filler dies that the cost savings on stacking could pay for the 2x die size.

The hurdle to overcome is conducting heat through V-Cache die as effectively as currently the heat can be conducted through the filler dies. This doesn't seem like the hardest problem in the field of semiconductors to overcome...

What AMD needs is something like Intel Thread Director, a hardware solution that dynamically and in a fraction of a millisecond, schedules the workload on the appropriate die. They could make things easier for themselves by doing the "software hints" thing that Intel has done and then game developers or any other developers with cache hungry software can hint to the OS to run their instructions on the V-cache die only.

If V-Cache chips can clock as high as non-V-Cache ones, that kludge of a solution that is the 7950X3D would disappear instantly, rather than piling a scheduling kludge on top of a hardware kludge.
 