Info 64MB V-Cache on 5XXX Zen3 Average +15% in Games

Page 25

Kedas

Senior member
Dec 6, 2018
355
339
136
Well, now we know how they will bridge the long wait for Zen4 on AM5 in Q4 2022.
Production of V-Cache starts at the end of this year, which is too early for Zen4, so this is certainly coming to AM4.
Lisa said +15% is "like an entire architectural generation".
 
Reactions: Tlh97 and Gideon

leoneazzurro

Golden Member
Jul 26, 2016
1,005
1,599
136
In my opinion, your expectations are too high. According to the Chinese forum about GoldenCove and Cortex X2, Zen 4 is a 20-25% increase in IPC.

Yeah I read about similar figures but I think we will see some clock uplift, too, thanks to the new process. A +35-40% over vanilla Zen3 seems doable.
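As a rough check on how those numbers combine: performance scales roughly as IPC times clock, so the two uplifts multiply rather than add. A minimal Python sketch, assuming the 20-25% IPC figure quoted above and the 35-40% total suggested here (the function name is only for illustration):

```python
def required_clock_uplift(total_uplift: float, ipc_uplift: float) -> float:
    """Clock uplift needed so that (1 + ipc) * (1 + clock) == 1 + total."""
    return (1.0 + total_uplift) / (1.0 + ipc_uplift) - 1.0


# Sweep the rumoured IPC range against the hoped-for total range.
for ipc in (0.20, 0.25):
    for total in (0.35, 0.40):
        clock = required_clock_uplift(total, ipc)
        print(f"IPC +{ipc:.0%}, total +{total:.0%} -> clock +{clock:.1%}")
```

Under those assumptions the clock uplift would have to land somewhere between about 8% and 17%.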
 
Reactions: Tlh97

Asterox

Golden Member
May 15, 2012
1,028
1,786
136
At the risk of turning this into the Zen4 thread, I think Chips and Cheese (https://chipsandcheese.com/2021/02/05/amds-past-and-future-cpus/) is so far the only one who has dared to give an actual number (29% IPC uplift). At the time I was ready to dismiss that as nonsense, since it was too early for a CPU that still had 18 months until release, but they have been writing some solid (though non-leaking) content since then, so I'm more on the fence now.

I think it is tempting to believe the higher numbers because Intel is increasing IPC by almost 20% for 2 gens in a row on desktop (though Cypress Cove had other issues) within a single year, while Zen4 looks like it is taking 2 years. Belief doesn't make a product though.

I believe that a lot of the structures that could be changed are not really sized in the Gigabyte leaks (stuff like various queues, e.g. the ROB or the internal register file), so there are definitely possibilities, but also very few hints about what changes they made and how much those help.

This is an important detail. For concrete confirmation, we still have to wait another year.

"I was told from a trusted source that a Genoa engineering sample (Zen 4 server chip) was 29% faster than a Milan (Zen 3) chip with the same core config at the same clocks."
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
I don't think Epyc can be compared with the mainstream AM4/AM5 platform. Epyc may have a higher power budget, more cache (3D?), etc. From what I've heard, the Ryzen 7xxx (Zen4) has 20-25% higher IPC and the same maximum clock speed as the Ryzen 5xxx (Zen3). The information supposedly comes from a person who has access to this knowledge, whatever that means. Of course, as always, we will only know whether it's true once the launch tests are out.

PS
The more the better
 

andermans

Member
Sep 11, 2020
151
153
76
Sunny / CypressCove and GoldenCove are, in fact, not one year apart.
Project work:
SunnyCove 2016-2018 (release 2019)
GoldenCove 2018-2020 (release 2021)

WillowCove (x86 SunnyCove with a redesigned cache subsystem) and RaptorCove were certainly being worked on in parallel.
I think work was also going on in parallel on RedwoodCove, which was finished in June. Perhaps work on RedwoodCove has been going on since 2017, which would give 3-4 years of design.
The only certainty is SunnyCove, for which Intel admitted that work started in 2016; assuming 18-month design cycles, it would have been completed in 2018 and made in silicon for release in 2019.


I agree with you wrt the design, but there seems to be a lot of perception around being stuck on Skylake derivatives that has only really been resolved by Tigerlake-H and Rocketlake. So if you look at high end products in the market ... (+ server had the same thing where Icelake just came out this year and Sapphire Rapids is early next year already)
 

Joe NYC

Platinum Member
Jun 26, 2021
2,329
2,929
106
I don't think you would see AMD do that. They would just wait until Zen 4. Threadripper doesn't have any competition and won't until Sapphire Rapids-X shows up and who knows when that will be.

There IS going to be a Zen 3 Threadripper (vanilla and/or 3D). Some leaks trickling out confirm its existence.

Given it is so late in the game for vanilla Threadripper, it would make sense to skip vanilla and jump straight to 3D V-Cache Threadripper, even if it adds a month or two to the latest (leaked) schedule of November 2021.

It would make more sense than a very late release of vanilla Threadripper and skipping Threadripper 3D.
 

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
But, what are we really expecting from Zen3 Threadripper? I'm still not expecting a lot from Zen3 TR over Zen2 TR. The big gain from Zen3 seems to have been ST tasks. Maybe when dealing with programs that exist in the 8 thread realm, you might realize significant gains, as there can be one or two threads per CCD, but anything that would heavily load all the cores probably won't see worthwhile gains. Now, a Zen3D based product might be a bigger deal, but, I have to think that the demand there would be massive for their supercomputer projects, or, for those "Licensed by the core" products where they can have a product that has 8 CCDs, each with a maximum clock single core, and a 4 stack of cache, giving 288MB of L3 per core.
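For anyone wondering where the 288MB comes from, here is the arithmetic as a small Python sketch. It assumes 32MB of L3 on the base Zen 3 CCD and 64MB per stacked V-Cache layer (the figure in the thread title); the function names are made up for illustration.

```python
BASE_L3_MB = 32        # L3 on the base Zen 3 CCD
VCACHE_LAYER_MB = 64   # per stacked V-Cache layer (figure from the thread title)


def l3_per_ccd_mb(layers: int) -> int:
    """Total L3 on one CCD with the given number of stacked cache layers."""
    return BASE_L3_MB + layers * VCACHE_LAYER_MB


def l3_per_active_core_mb(layers: int, active_cores_per_ccd: int) -> float:
    """L3 available per core when only some cores per CCD are enabled."""
    return l3_per_ccd_mb(layers) / active_cores_per_ccd


print(l3_per_ccd_mb(1))             # single layer
print(l3_per_ccd_mb(4))             # the hypothetical 4-high stack
print(l3_per_active_core_mb(4, 1))  # one enabled core per CCD
print(8 * l3_per_ccd_mb(4))         # 8 CCDs in total
```

With a single enabled core per CCD, the whole 32 + 4 x 64 = 288MB on that CCD belongs to that one core.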
 
Reactions: Tlh97 and Joe NYC
Jul 27, 2020
17,855
11,645
116
IBM Telum's virtual L3 concept is pretty amazing. This could be another way that AMD could increase their ST performance. Free up the space consumed by the large L3 cache, put a core(s) in that same space and use virtual L3. Less hot data for the single thread gets evicted into the L2 cache of a less heavily used adjacent core and in case of a cache hit for that data, it will be significantly quicker to fetch that data than going out to a slow L3 cache, around 7ns vs. almost 19ns for the L3.
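To put the 7ns vs. 19ns figures in perspective, here is a rough Python sketch of the average latency seen by an L2 miss. It is a toy model, not a measurement: the 80ns memory latency and the 80% hit rate for a conventional L3 are placeholder assumptions, and the function names are made up.

```python
VIRTUAL_L3_HIT_NS = 7.0   # figure quoted in the post
REAL_L3_HIT_NS = 19.0     # figure quoted in the post
MEMORY_NS = 80.0          # ASSUMPTION: placeholder DRAM latency


def avg_latency_ns(hit_rate: float, hit_ns: float, miss_ns: float = MEMORY_NS) -> float:
    """Expected latency of an L2 miss: a hit costs hit_ns, a miss goes to memory."""
    return hit_rate * hit_ns + (1.0 - hit_rate) * miss_ns


def break_even_hit_rate(real_l3_hit_rate: float) -> float:
    """Virtual-L3 hit rate that matches a conventional L3's average latency."""
    target = avg_latency_ns(real_l3_hit_rate, REAL_L3_HIT_NS)
    return (MEMORY_NS - target) / (MEMORY_NS - VIRTUAL_L3_HIT_NS)


real_rate = 0.80  # ASSUMPTION: placeholder hit rate for a conventional L3
print(f"conventional L3: {avg_latency_ns(real_rate, REAL_L3_HIT_NS):.1f} ns average")
print(f"virtual L3 breaks even at a {break_even_hit_rate(real_rate):.1%} hit rate")
```

With these placeholder numbers the virtual L3 only needs to hit about 67% of the time to match a conventional L3 that hits 80% of the time. The model ignores the power and L2 management overhead entirely.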
 

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
While it certainly is a latency improvement, it's going to be a power hit. Plus, it's adding overhead to adjacent L2 cache management.

I can see AMD going for a larger L2, to the order of 2MB even, and using 3D stacking for all L3 cache, for their desktop chips.
 
Jul 27, 2020
17,855
11,645
116
While it certainly is a latency improvement, it's going to be a power hit. Plus, it's adding overhead to adjacent L2 cache management.

I can see AMD going for a larger L2, to the order of 2MB even, and using 3D stacking for all L3 cache, for their desktop chips.
A larger L2 would also incur a power hit. So the question is, would the larger L2 with an existent L3 be able to justify the performance improvement over smaller L2 with a virtual L3 scheme? Maybe I'm biased but my mind says that the virtual L3 will conserve more power while delivering better performance, unless the workload is so multicore friendly that all the cores are max utilized and managing the virtual L3 with their workload leads to performance degradation. That might be the case up to a certain number of cores. But very few workloads scale perfectly to a large number of cores so a virtual L3 scheme might end up boosting performance more often than not. Am I making sense?
 

Asterox

Golden Member
May 15, 2012
1,028
1,786
136

Hitman928

Diamond Member
Apr 15, 2012
5,593
8,770
136
A larger L2 would also incur a power hit. So the question is, would the larger L2 with an existent L3 be able to justify the performance improvement over smaller L2 with a virtual L3 scheme? Maybe I'm biased but my mind says that the virtual L3 will conserve more power while delivering better performance, unless the workload is so multicore friendly that all the cores are max utilized and managing the virtual L3 with their workload leads to performance degradation. That might be the case up to a certain number of cores. But very few workloads scale perfectly to a large number of cores so a virtual L3 scheme might end up boosting performance more often than not. Am I making sense?

If they got rid of L3, the L2 would have to grow much more so you would need to compare a Zen core with a bit more L2 cache than current + L3 cache (+stacks) versus a Zen core with comparatively huge L2 caches (possibly +stacks) and virtual L3.
 
Reactions: Tlh97
Jul 27, 2020
17,855
11,645
116
If they got rid of L3, the L2 would have to grow much more so you would need to compare a Zen core with a bit more L2 cache than current + L3 cache (+stacks) versus a Zen core with comparatively huge L2 caches (possibly +stacks) and virtual L3.
That would be a very interesting comparison.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
IBM Telum's virtual L3 concept is pretty amazing.
What did you read about it that's amazing, and where? I only read Ian's take, and having previously read plenty of cache-related patents by AMD, I'm honestly not seeing how it's anything special, never mind "magic" as in Ian's words. AMD's patents point to Zen moving in a pretty similar direction.
 
Reactions: Tlh97

HurleyBird

Platinum Member
Apr 22, 2003
2,725
1,342
136
I think it is tempting to believe the higher numbers because Intel is increasing IPC by almost 20% for 2 gens in a row on desktop (though Cypress Cove had other issues) within a single year, while Zen4 looks like it is taking 2 years. Belief doesn't make a product though.

Rocket Lake was "up to" 19%. Actual IPC increase was much lower.

That said, I think Zen4 is going to be a >20% IPC increase. For me, it's a combination of all the early whispers pointing in this direction and AMD sticking to 16 cores on mainstream desktop when they could easily have increased the core count, and at one point seem to have planned to. I don't see AMD doing that unless they're confident in 16-core Zen 4 soundly beating, if not Raptor Lake, then at least Alder Lake.

Keep in mind that a portion of Zen 4's IPC increase is owed to the new IOD.

Of course, it's possible that the new IOD still features links for 3 CCDs, in which case they could be holding the 24-core in reserve. Odds are that they also have Zen 4-3D in reserve.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,329
2,929
106
But, what are we really expecting from Zen3 Threadripper? I'm still not expecting a lot from Zen3 TR over Zen2 TR. The big gain from Zen3 seems to have been ST tasks. Maybe when dealing with programs that exist in the 8 thread realm, you might realize significant gains, as there can be one or two threads per CCD, but anything that would heavily load all the cores probably won't see worthwhile gains. Now, a Zen3D based product might be a bigger deal, but, I have to think that the demand there would be massive for their supercomputer projects, or, for those "Licensed by the core" products where they can have a product that has 8 CCDs, each with a maximum clock single core, and a 4 stack of cache, giving 288MB of L3 per core.

If you count them, AMD could in theory release 6 different no-volume Threadripper CPU products, on 4 (?) different motherboards / chipsets, in a span of 12-18 months. It is borderline insane:

CPUs
Zen 3 Threadripper
Zen 3 Threadripper Pro
Zen 3D Threadripper
Zen 3D Threadripper Pro
Zen 4 Threadripper
Zen 4 Threadripper Pro

Motherboard (chipset):
4 channel Zen 3 / 3D
8 channel Zen 3 / 3D
4 channel Zen 4
8 channel Zen 4


Cutting this list in half would be a good start for a no-volume CPU positioned in the workstation category, a category whose share of the CPU market is asterisk-level.

The best place to start cutting is, IMO:
Zen 3 Threadripper
Zen 3 Threadripper Pro
because they are so late to the game that they will be outperformed by regular desktop Ryzen Zen 3D near the time of their launch.

As for there not being enough stacked L3 cache to go around for Threadripper, this is such a low-volume product that it would not make a dent.

Threadripper has just not been a master stroke of AMD marketing and product positioning departments. I wonder if it made any money even with super high prices but no volume...
 

Joe NYC

Platinum Member
Jun 26, 2021
2,329
2,929
106
While it certainly is a latency improvement, it's going to be a power hit. Plus, it's adding overhead to adjacent L2 cache management.

I can see AMD going for a larger L2, to the order of 2MB even, and using 3D stacking for all L3 cache, for their desktop chips.

Or, as someone already suggested in this thread, V-Cache could hold both L2 and L3. They are already adjacent on the Zen 3 die, so it would mean enlarging the stacked dies by a few percent.

Say 512k on base die and then 512k on stacked dies.
 
Reactions: Tlh97 and Vattila

Joe NYC

Platinum Member
Jun 26, 2021
2,329
2,929
106
Keep in mind that a portion of Zen 4's IPC increase is owed to the new IOD.

In what area?

I could see some performance increase from the memory controller, and perhaps some way, using the IOD, to turn all of the individual shared L3s into one gigantic shared L3.

Of course, it's possible that the new IOD still features links for 3 CCDs, in which case they could be holding the 24-core in reserve.

It seems that each quadrant of the Epyc I/O die is able to communicate with 3 CCDs, which would mean that the desktop IOD should have one spare link if it is reusing 1/4 of the Epyc IOD.

Odds are that they also have Zen 4-3D in reserve.

It is almost certain that it is coming...
 
Reactions: Tlh97

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
That's a lot of distance to cover for an L2 cache. That's also a lot more connections that will need to be driven between the die.

I don't see any sort of advantage to having L2 on the stack.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
I don't know to what extent some of that advanced packaging will be seen in desktop parts. Desktop chips are already close to the point where cooling is a challenge and stacking anything with hotspots compounds the problem.

It's certainly a good fit for the server space. The clock speeds are lower and the power is being spread out over a larger area.
 
Reactions: Tlh97 and Vattila

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
Reactions: Vattila

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
I read that as more of a "this is where the technology is capable of going" and not as "this is where we are definitely going on our roadmap with it" kind of thing. Yes, it can be done, but should we?
Depends on the costs. Going MCM with Zen and chiplets with Zen 2 allowed AMD to achieve something that they couldn't have done with a monolithic equivalent. There may come instances where the same is true for stacked dies.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,329
2,929
106
That's a lot of distance to cover for an L2 cache. That's also a lot more connections that will need to be driven between the die.

It is actually less distance to cover, not more.

Suppose AMD did in fact go from 512k to 2 MB. That means doubling both dimensions of the L2 cache rectangle, which could be a millimeter or more.

Distance up is only 50 microns
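A quick sketch of that geometry in Python. It assumes cache area scales linearly with capacity and the aspect ratio stays the same; the 0.5 mm x 1.0 mm footprint for the 512k L2 is a made-up placeholder, not a measured Zen 3 figure, and the function name is only for illustration.

```python
import math

BASE_KB = 512
TARGET_KB = 2048
BASE_W_MM, BASE_H_MM = 0.5, 1.0  # ASSUMPTION: placeholder footprint of the 512k L2
STACK_HOP_UM = 50.0              # vertical distance quoted in the post


def side_scale(base_kb: int, target_kb: int) -> float:
    """Linear scale factor if area grows with capacity at a fixed aspect ratio."""
    return math.sqrt(target_kb / base_kb)


scale = side_scale(BASE_KB, TARGET_KB)
new_w_mm, new_h_mm = BASE_W_MM * scale, BASE_H_MM * scale
extra_long_side_um = (new_h_mm - BASE_H_MM) * 1000.0

print(f"side scale: {scale:.1f}x -> {new_w_mm:.1f} mm x {new_h_mm:.1f} mm")
print(f"long side grows by {extra_long_side_um:.0f} um vs. {STACK_HOP_UM:.0f} um going up")
```

A 4x capacity increase is a 2x increase in each dimension, so with this placeholder footprint the long side grows by a full millimeter, about 20 times the 50 micron hop up to the stacked die.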

I don't see any sort of advantage to having L2 on the stack.

- Increasing the total L2 capacity without increasing the base CCD size
- Shorter distances
- Increasing area eligible for stacking by another 10-20%
 
Reactions: Tlh97 and Vattila