Question AMD Rembrandt/Zen 3+ APU Speculation and Discussion

izaic3

Member
Nov 19, 2019
61
96
91
Alright, so we've had some leaks so far. I don't know if any of it's been confirmed yet, as it's pretty early, but here is what I've surmised so far (massive grain of salt of course):

If it turns out to have RDNA 2 and 12 CUs, I could see iGPU performance potentially almost doubling over Cezanne.

If I've made any mistakes or gotten anything wrong, please let me know. I'd also love to hear more knowledgeable people weigh in on their expectations.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Intel expects they have the graphics leadership with Meteor Lake and Arrow Lake.

I experienced this many times before. The leapfrog happens but always within reach of the competitor. If it sounds much better than it turns out to be, it's all marketing. Made it sound like Iris Xe would absolutely spank Vega 8. Same with RDNA2 spanking Iris Xe. Sure 30% is great but not huge for graphics.

No magic with these things. It's an iGPU so you sacrifice things.
 

mikk

Diamond Member
May 15, 2012
4,172
2,210
136
1. No it wouldn't. Too few ROPs.

2. Add 50% more units and the GPU power alone would approach 60W under load, not including CPU or SoC power consumption.

3. You're heavily overestimating the performance boost from memory. You have been from the very beginning. You just refuse to listen.


I'm not. A 50% bigger Vega 8 would have been more bandwidth starved than the current Vega. The potential improvement comes from the combination of 6nm, 50% more units, and 50-100% more bandwidth. I think people were expecting an RDNA2 increase based on the high-end Vega.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
1. No it wouldn't. Too few ROPs.

2. Add 50% more units and the GPU power alone would approach 60W under load, not including CPU or SoC power consumption.

3. You're heavily overestimating the performance boost from memory. You have been from the very beginning. You just refuse to listen.

This seems to be a great example of the pitfalls of extrapolating using a part in a completely different power and thermal envelope.

Chip designs have an optimal point, so the behavior will change depending on the part. We straight up assumed 50% more CUs with 50% more efficiency would end up at 2-2.2x. It can reach 2x, but only if we give it much more power, which is different from the dGPUs, which really did get 1.5x × 1.5x between Vega and RDNA2.
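For anyone who wants to see where that 2-2.2x expectation comes from, here is a minimal sketch of the naive multiplication. The helper name `compound_scaling` is made up for illustration, and the model simply assumes the two gains are independent and that nothing else (power, bandwidth) is a limit, which is the assumption that breaks down in a small power envelope.

```python
def compound_scaling(*factors: float) -> float:
    """Multiply independent scaling factors together (naive model)."""
    result = 1.0
    for factor in factors:
        result *= factor
    return result

# Figures as stated in the post; treating them as independent is an assumption.
cu_scaling = 1.5          # 50% more CUs
efficiency_scaling = 1.5  # 50% more efficiency

naive = compound_scaling(cu_scaling, efficiency_scaling)
print(f"naive combined scaling: {naive:.2f}x")
```

The naive product lands slightly above the 2-2.2x range people were quoting, so even that range already assumed some loss.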

Remember the speculation that Vega 8 was much more efficient than Vega 64 because it was running at a much more efficient clock setting? Seems to be true. So if AMD had settled for less performance on the Vega 64, RDNA wouldn't have looked as impressive on the perf/watt graph. But AMD needed that extra performance to justify higher costs because they had to use HBM.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,423
2,914
136
At 25W it must be power bound, that's why there are much smaller differences between the 6600H and the 6800H. I saw the same with Haswell iGPUs. The 15W 40EU version was only 5-10% faster than the 15W 20EU version. 28W 40EU was an additional 10% faster. Really it was memory bound as well but at 15W power bound too.
So a 28W 40EU Haswell IGP was only 21% (100*1.1*1.1) faster than a 15W 20EU version? That is very underwhelming.
BTW, with only a measly 10% gain from increasing TDP from 15W to 28W, which pretty much means the IGP could use >2x more W, I can't really consider it power bound. If it was power bound, I would expect much bigger gains.
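A quick sketch of that arithmetic, assuming the upper 10% figure for both steps. The function and variable names are mine, purely for illustration, and the TDP figure is the package limit rather than the IGP's actual share of it.

```python
def compound_gain_percent(*gains_percent: float) -> float:
    """Combine successive percentage gains multiplicatively."""
    total = 1.0
    for gain in gains_percent:
        total *= 1.0 + gain / 100.0
    return (total - 1.0) * 100.0

# Figures quoted in the thread; 10% is the upper end of the "5-10%" range.
eu_doubling_gain = 10.0  # 20EU -> 40EU at 15W
tdp_gain = 10.0          # 15W -> 28W with 40EU

total_gain = compound_gain_percent(eu_doubling_gain, tdp_gain)
tdp_increase = (28 / 15 - 1) * 100  # package TDP increase in percent

print(f"combined gain: {total_gain:.0f}%")
print(f"TDP increase: {tdp_increase:.0f}% for a {tdp_gain:.0f}% performance gain")
```

With the lower 5% figure for the EU doubling, the same helper gives a smaller combined gain, so 21% is the optimistic end.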
 

uzzi38

Platinum Member
Oct 16, 2019
2,702
6,405
146
I'm not. A 50% bigger Vega 8 would have been more bandwidth starved than the current Vega. The potential improvement comes from the combination of 6nm, 50% more units, and 50-100% more bandwidth. I think people were expecting an RDNA2 increase based on the high-end Vega.

GPU performance is not just more compute + more bandwidth == more performance. Vega already couldn't go any further without adding another Shader Engine. More CUs would not have helped.

Let's push Arrow Lake to one side, seeing as that's now a confirmed 2024 product and it's now clear that it competes against Strix Point, which, suffice to say, will almost certainly sport vastly better iGPU performance.

As for Meteor Lake, this is probably Intel's best chance at taking the iGPU crown again. We'll have to see. Before I thought AMD would need LPDDR5X to do it, but looking at Rembrandt, they could probably use the node shrink to just bring that same peak performance down to the 25-28W device range instead. To be competitive in iGPU performance up to higher performance tiers, all it would take is 1-2 WGPs added to each Shader Array.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
So a 28W 40EU Haswell IGP was only 21% (100*1.1*1.1) faster than a 15W 20EU version? That is very underwhelming.
BTW, with only a measly 10% gain from increasing TDP from 15W to 28W, which pretty much means the IGP could use >2x more W, I can't really consider it power bound. If it was power bound, I would expect much bigger gains.

Yes, you have a good point. The Haswell and Broadwell iGPUs needed the eDRAM to uncork their performance. The eDRAM gave a 30% improvement at the same power level.

Apologies for making it confusing. My point was that you gained more going from 15-28W than doubling the EUs. At 15W the gain was almost negligible. Just by going to 28W it kinda opened it up. Out of 25% gains I'd say doubling EUs equaled about 7% and going to 28W about 15%.

You have to say while it's not the number 1 reason for the lack of increase, it still was a significant factor. Later we got to see it when Broadwell introduced eDRAM for the U parts.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Before I thought AMD would need LPDDR5X to do it, but looking at Rembrandt, they could probably use the node shrink to just bring that same peak performance down to the 25-28W device range instead.

This makes sense. If we compare RX 6800 to RX 680M, the former has 5x the compute. The RX 6800 has 5x bandwidth compared to LPDDR5 version, and 6.6x compared to DDR5. It also has the Infinity Cache.

However, the RX 6800 has to run at 4x the resolution, at Ultra settings including textures at higher expected fps.

Compared to Vega 8, we know it doesn't need LPDDR4-4266. Probably 3600 is optimal. And RDNA2 probably improves bandwidth saving techniques? Then something like 5200 is all that's needed to get the 60-70% gain.
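The RX 6800 vs RX 680M ratios can be sanity-checked with a rough sketch. The spec figures below are my assumptions taken from public spec sheets, not from this thread: RX 6800 with 60 CUs and 16 Gbps GDDR6 on a 256-bit bus, RX 680M with 12 CUs on a 128-bit memory interface. Infinity Cache is ignored, so this is raw DRAM bandwidth only.

```python
def bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s: transfers per second times bytes per transfer."""
    return transfer_rate_mt_s * bus_width_bits / 8 / 1000

# Assumed specs (see the note above), not measured values.
rx6800_bw = bandwidth_gb_s(16000, 256)  # GDDR6, 16 Gbps per pin
lpddr5_bw = bandwidth_gb_s(6400, 128)   # LPDDR5-6400
ddr5_bw = bandwidth_gb_s(4800, 128)     # DDR5-4800

compute_ratio = 60 / 12  # CU count only, ignoring clock differences

print(f"compute ratio (CUs): {compute_ratio:.1f}x")
print(f"bandwidth vs LPDDR5-6400: {rx6800_bw / lpddr5_bw:.1f}x")
print(f"bandwidth vs DDR5-4800: {rx6800_bw / ddr5_bw:.1f}x")
```

The same helper works for the other speeds mentioned here, for example `bandwidth_gb_s(5200, 128)`, again assuming a 128-bit interface.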
 

mikk

Diamond Member
May 15, 2012
4,172
2,210
136
I experienced this many times before. The leapfrog happens but always within reach of the competitor. If it sounds much better than it turns out to be, it's all marketing. Made it sound like Iris Xe would absolutely spank Vega 8. Same with RDNA2 spanking Iris Xe. Sure 30% is great but not huge for graphics.

No magic with these things. It's an iGPU so you sacrifice things.


There is only a 52% difference between the i7-11370H with LPDDR4x-4266 and the 6800H at 54W with LPDDR5-6400. As I said, AMD needs another big improvement, otherwise no chance. The good thing is AMD should be aware they can't keep the same number of shader units for 2-3 years like they did with Vega for several generations, as the GPU clock speed is already quite high on Rembrandt and there is not much to gain there. Maybe 16CU is the way to go for the 5nm generation. They also really need to rethink their U-lineup SKU strategy, it's just dumb. 6 H SKUs get 12CUs when basically all of them ship with a faster dGPU, whereas only 1 U SKU gets the full 12CU. And 3 of the U SKUs are old gen using a lower-clocked Vega.
 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
Sure 30% is great but not huge for graphics.

No magic with these things. It's an iGPU so you sacrifice things.

I think you're underselling 30%. If we saw that kind of jump in a CPU no one would call it anything less than huge.

To better illustrate, a 30% gain means going from 46 FPS to 60 FPS on average. That's definitely significant and makes a massive difference for the space that this product exists in.
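If it helps, here is a small sketch of that example in both FPS and frame time. The helper names are hypothetical; the 46 FPS baseline and 30% gain are the figures from this post.

```python
def scaled_fps(base_fps: float, gain_percent: float) -> float:
    """Frame rate after applying a percentage performance gain."""
    return base_fps * (1.0 + gain_percent / 100.0)

def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds."""
    return 1000.0 / fps

base = 46.0
boosted = scaled_fps(base, 30.0)

print(f"{base:.0f} FPS -> {boosted:.1f} FPS")
print(f"{frame_time_ms(base):.1f} ms -> {frame_time_ms(boosted):.1f} ms per frame")
```

Looking at frame time makes the point a bit clearer: the gain shaves roughly 5 ms off every frame at this performance level.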
 

Heartbreaker

Diamond Member
Apr 3, 2006
4,262
5,259
136
amazing to say the least


If you don't look too closely at the results. It only looks that way when you compare with FSR enabled on AMD and the Nvidia equivalent disabled.

Here's a chart where it fits in when comparing Apples to Apples: It trades blows with the MX 450, and is well below a mobile GTX 1650.

 

ryanjagtap

Member
Sep 25, 2021
110
132
96
I think it is a good first implementation of RDNA2 as an iGPU, even if it doesn't match their claims of 2x performance at the same TDP. I hope they enhance it further as required, whether by increasing CUs (if it is not bandwidth starved) or by clock optimizations, though it is already clocked pretty high. I don't think they will get everything right on the first try, and we have not had many independent reviews yet to know all its nuances. Just as ADL is a good 'conceptual' generation with Raptor Lake smoothing out its issues, I think the next iteration of these iGPUs will smooth out the problems we find in this generation.
 

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
You can really see the impact that not having any infinity cache is having on the RDNA2 architecture. If they can manage a way to get 16MB of it on the next gen APU, it would be good for another nice bump in performance.
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
And suddenly RX 570 perf on desktop with fast DDR5 RAM does not sound that crazy.
It's a far cry from RX 570 performance. That headline references a ComputerBase screenshot where the GTX 1060 is 20% faster in Cyberpunk 2077, a game where Pascal underperforms.

 

ryanjagtap

Member
Sep 25, 2021
110
132
96
You can really see the impact that not having any infinity cache is having on the RDNA2 architecture. If they can manage a way to get 16MB of it on the next gen APU, it would be good for another nice bump in performance.
I think they were developing some kind of unified cache that would be more efficient in a monolithic die like the APUs and would not increase the die size.
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136
It's a far cry from RX 570 performance. That headline references a ComputerBase screenshot where the GTX 1060 is 20% faster in Cyberpunk 2077, a game where Pascal underperforms.


The graph there shows that it is 25% slower than a desktop GTX 1650 with a 12900K and 50% slower than a desktop RX 580 with a 12900K. We are talking about a mobile APU with just DDR5-4800 here... DDR5-4800 is like DDR4-2133, so the desktop version should be a lot faster, especially with good RAM.
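As a side note on reading those percentages: being "X% slower" is not the same as needing an "X% uplift" to catch up. A minimal sketch of the conversion, with a hypothetical helper name and the two figures from the graph as inputs:

```python
def required_uplift_percent(slower_percent: float) -> float:
    """Gain needed to match a part that you currently trail by slower_percent."""
    if not 0 <= slower_percent < 100:
        raise ValueError("slower_percent must be in [0, 100)")
    return (1.0 / (1.0 - slower_percent / 100.0) - 1.0) * 100.0

for card, deficit in (("GTX 1650", 25.0), ("RX 580", 50.0)):
    uplift = required_uplift_percent(deficit)
    print(f"{deficit:.0f}% slower than {card}: needs {uplift:.1f}% more performance to match")
```

So matching the first card takes about a third more performance, and matching the second takes a doubling.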
 

tamz_msc

Diamond Member
Jan 5, 2017
3,865
3,729
136
The graph there shows that it is 25% slower than a desktop GTX 1650 with a 12900K and 50% slower than a desktop RX 580 with a 12900K. We are talking about a mobile APU with just DDR5-4800 here... DDR5-4800 is like DDR4-2133, so the desktop version should be a lot faster, especially with good RAM.
The point is that results from one benchmark don't make the 680M equivalent to a GTX 1060-level discrete GPU.
 

LightningZ71

Golden Member
Mar 10, 2017
1,659
1,942
136
I think they were developing some kind of unified cache that would be more efficient in a monolithic die like the APUs and would not increase the die size.
Increasing the L2 size and moving to a 32MB SLC with good partitioning rules (always reserve x% for iGPU, etc.) would do a lot for overall APU performance. That should be workable on N5.
 