Question AMD Rembrandt/Zen 3+ APU Speculation and Discussion


izaic3

Member
Nov 19, 2019
61
96
91
Alright, so we've had some leaks so far. I don't know if any of it's been confirmed yet, as it's pretty early, but here is what I've surmised so far (massive grain of salt of course):

If it turns out to have RDNA 2 and 12 CUs, I could see iGPU performance potentially almost doubling over Cezanne.

If I've made any mistakes or gotten anything wrong, please let me know. I'd also love to hear more knowledgeable people weigh in on their expectations.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
I'm a bit less optimistic. With 128-bit DDR5 implementations, we can reasonably expect about 75-80% of the memory bandwidth of the RX560. Rumors for the CU/WGP count suggest that it will have a FLOPS count in the ballpark of the RX560.

That suggests to me that we're looking at similar performance to the RX560X mobile GPU from four years ago, which was around the mobile 1050/ti in performance. That's not bad, and is good enough for decent 1080p gaming in mobile.
You say you're less optimistic, but this seems like a pretty good idea of what to expect tbh.

But just one thing. FLOPs count should be significantly higher than the RX560. Rembrandt's iGPU is clocked higher than I first expected.

3/4 of the shaders, but clocks go from 1.275GHz -> 2.xGHz.

 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,429
2,914
136
RX 560M 60-80W: 16CU(1024SP), 1.29GHz => 2642 GFlops
Rembrandt IGP 15W: 12CU(768SP), 1.6GHz => 2458 GFlops
Rembrandt IGP 45W: 12CU(768SP), 2GHz => 3072 GFlops
Rembrandt IGP 95W: 12CU(768SP), 2.5GHz => 3840 GFlops

I think Rembrandt can sustain these clocks at the given TDP, maybe even more.

At 15W it has 7% fewer GFlops, but IPC is ~20% higher as I calculated here.
So even at 15W it should be ~10% faster than the RX 560M.

At 45W it has 16% more GFlops; with +20% IPC that would mean ~40% higher performance than the RX 560M.
At 95W it has 45% more GFlops; with +20% IPC that would mean ~74% higher performance than the RX 560M.

Limited bandwidth can cripple the performance; the question is by how much.
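
The arithmetic in this post can be reproduced with a short sketch. It assumes peak FP32 throughput of 2 FLOPs (one fused multiply-add) per shader per clock and takes the ~20% IPC uplift as given by the poster; the function names are made up for illustration, and memory bandwidth is ignored entirely.

```python
# Peak FP32 throughput, assuming 2 FLOPs (one fused multiply-add) per shader per clock.
def fp32_gflops(shaders: int, clock_ghz: float) -> float:
    return 2 * shaders * clock_ghz


# RX 560M figures as quoted in the post: 16 CU (1024 SP) at 1.29 GHz.
RX560M_GFLOPS = fp32_gflops(1024, 1.29)

# Assumption taken from the post, not measured here: ~20% higher IPC than Polaris.
IPC_UPLIFT = 1.20


def relative_perf(clock_ghz: float, shaders: int = 768, ipc: float = IPC_UPLIFT) -> float:
    """Estimated performance relative to the RX 560M (1.0 = parity), ignoring bandwidth."""
    return fp32_gflops(shaders, clock_ghz) / RX560M_GFLOPS * ipc


if __name__ == "__main__":
    # TDP labels and clocks are the speculative ones listed in the post.
    for tdp, clock in {"15W": 1.6, "45W": 2.0, "95W": 2.5}.items():
        print(f"{tdp}: {fp32_gflops(768, clock):.0f} GFlops, "
              f"{relative_perf(clock):.2f}x RX 560M")
```

Under these assumptions the three configurations come out at roughly 1.12x, 1.40x and 1.74x the RX 560M, in line with the percentages quoted above.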
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,429
2,914
136
Doesn't matter if it only has half the ROPs. Or at least, it should even out.

Does anyone have info on how many ROPs Rembrandt has? If it's like Vega IGPs, it only has 8. DDR5 might enable 16 to work well after all.
I don't see why it should have only 8 ROPs. I think it will have 16-32 ROPs. Let's not forget that the Vega IGP also had fewer CUs.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Doesn't matter if it only has half the ROPs. Or at least, it should even out.

Does anyone have info on how many ROPs Rembrandt has? If it's like Vega IGPs, it only has 8. DDR5 might enable 16 to work well after all.
4 RBs per Shader Engine, so that's 32 ROPs, I believe? It has 1 Shader Engine and 2 Shader Arrays, unlike Van Gogh. Ultimately, that means it has twice the number of ROPs of the RX560.

Not that it mattered anyway. Raster throughput per ROP scales with clocks, and Rembrandt still clocks nearly twice as high.

EDIT: I'm assuming that Rembrandt uses RDNA2's RB+s here. If it's the same RBs as RDNA1 and GCN then we'd instead be looking at 16 ROPs.

Although I would be rather surprised to see RDNA1 RBs again.
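
A rough sketch of the ROP arithmetic in this post: it assumes 8 ROPs per RB+ and 4 per older RB (which is what the 32 vs 16 figures imply), 16 ROPs at 1.275GHz for the RX560, and a placeholder 2.0GHz for Rembrandt's "2.x" clock. The helper names are hypothetical.

```python
def rop_count(render_backends: int, rops_per_backend: int) -> int:
    """Total ROPs for a given number of render backends."""
    return render_backends * rops_per_backend


def pixel_fillrate_gpix(rops: int, clock_ghz: float) -> float:
    """Peak pixel fill rate in GPixel/s, assuming one pixel per ROP per clock."""
    return rops * clock_ghz


if __name__ == "__main__":
    # RX560 reference point: 16 ROPs (implied by the post) at 1.275 GHz.
    rx560 = pixel_fillrate_gpix(16, 1.275)
    # 4 render backends per Shader Engine, 1 Shader Engine, as stated in the post.
    for label, per_rb in (("RB+ (assumed 8 ROPs each)", 8),
                          ("older RB (assumed 4 ROPs each)", 4)):
        rops = rop_count(4, per_rb)
        rate = pixel_fillrate_gpix(rops, 2.0)  # 2.0 GHz is a placeholder clock
        print(f"{label}: {rops} ROPs, {rate:.1f} GPix/s, {rate / rx560:.2f}x RX560")
```

Even in the 16 ROP case the peak fill rate ends up above the RX560 purely from the clock difference, which is the point being made about ROP count mattering less than it first appears.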
 

eek2121

Diamond Member
Aug 2, 2005
3,051
4,276
136
Far more? Why, 'cos of 6nm? I'd be surprised.
Sorry, I have not been around much.

One of the benefits of EUV is that it speeds up the production process. This allows AMD to output more chips in a given period of time than they would without using EUV.

I don’t recall the exact numbers, but TSMC noted that the total machine time was significantly reduced.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
Sorry, I have not been around much.

One of the benefits of EUV is that it speeds up the production process. This allows AMD to output more chips in a given period of time than they would without using EUV.

I don’t recall the exact numbers, but TSMC noted that the total machine time was significantly reduced.
Single vs dual or quad patterning. It depends on how many layers use EUV and the size of the etching on the layer.
 

Thibsie

Senior member
Apr 25, 2017
811
888
136
Single vs dual or quad patterning. It depends on how many layers use EUV and the size of the etching on the layer.
Sorry, I have not been around much.

One of the benefits of EUV is that it speeds up the production process. This allows AMD to output more chips in a given period of time than they would without using EUV.

I don’t recall the exact numbers, but TSMC noted that the total machine time was significantly reduced.

Good if reduction is significant though.
Numbers would be helpful indeed.
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
Good if reduction is significant though.
Numbers would be helpful indeed.
The numbers vary. There is no fixed value.

For example, the levels that presently need 4 passes to resolve a feature can be replaced with 1 pass using EUV. On how many layers do you use EUV? How many machines do you have? What is the relative throughput? It depends, as EUV is still slower per stage, but there are a lot fewer stages.

You also get fewer defects with EUV on average, so more good die per wafer.
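
To make the "it depends" concrete, here is a toy model of the trade-off described above. Every number in it is hypothetical (layer counts, 4 passes per multi-patterned level, an EUV pass costing twice a DUV pass); it only illustrates how the answer moves with those inputs and says nothing about real TSMC figures.

```python
def litho_passes(critical_layers: int, euv_layers: int, duv_passes_per_layer: int = 4) -> int:
    """Exposure passes over the critical layers when `euv_layers` of them
    switch from multi-patterned DUV to single-pass EUV."""
    if not 0 <= euv_layers <= critical_layers:
        raise ValueError("euv_layers must be between 0 and critical_layers")
    duv_layers = critical_layers - euv_layers
    return duv_layers * duv_passes_per_layer + euv_layers


def relative_litho_time(critical_layers: int, euv_layers: int,
                        duv_passes_per_layer: int = 4,
                        euv_pass_cost: float = 2.0) -> float:
    """Litho time relative to an all-DUV flow (1.0 = no change).

    A DUV pass costs 1.0 time unit; `euv_pass_cost` is a made-up factor
    standing in for EUV being slower per stage.
    """
    baseline = critical_layers * duv_passes_per_layer
    duv_time = (critical_layers - euv_layers) * duv_passes_per_layer
    euv_time = euv_layers * euv_pass_cost
    return (duv_time + euv_time) / baseline


if __name__ == "__main__":
    # Hypothetical process: 10 critical layers, 5 of them moved to EUV.
    print(litho_passes(10, 5))          # fewer passes than the 40 of an all-DUV flow
    print(relative_litho_time(10, 5))   # below 1.0 means less total machine time
```

Raising `euv_pass_cost` or lowering `euv_layers` shrinks the benefit, which is why no single figure applies across processes.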
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,692
136
I think It will have 16-32 ROPs.
4 RBs per Shader Engine, so that's 32 ROPs, I believe? It has 1 Shader Engine and 2 Shader Arrays, unlike Van Gogh. Ultimately, that means it has twice the number of ROPs of the RX560.

...

EDIT: I'm assuming that Rembrandt uses RDNA2's RB+s here. If it's the same RBs as RDNA1 and GCN then we'd instead be looking at 16 ROPs.

Although I would be rather surprised to see RDNA1 RBs again.

32 ROPs seem a little excessive for an IGP. 16 would seem realistic. Like Raven/Picasso. If we assume a 12CU design, that'd be 768SP/48TMU/16ROP. Coupled with a 2GHz+ clockspeed, and DDR5 that'd certainly be in RX560+ territory.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
32 ROPs seem a little excessive for an IGP. 16 would seem realistic. Like Raven/Picasso. If we assume a 12CU design, that'd be 768SP/48TMU/16ROP. Coupled with a 2GHz+ clockspeed, and DDR5 that'd certainly be in RX560+ territory.
I think it's very possible it will be 32 ROPs, but not because Rembrandt needs it - rather to simplify the design work needed for future APUs. I'd imagine LP5X support is going to come with either Strix Point or Phoenix, and then you will also have a system level cache by the time the former releases too. Particularly at that latter time, it'd make a lot of sense to have 32 ROPs whilst also bumping up the WGP count to 10+.
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136
I'm a bit less optimistic. With 128-bit DDR5 implementations, we can reasonably expect about 75-80% of the memory bandwidth of the RX560. Rumors for the CU/WGP count suggest that it will have a FLOPS count in the ballpark of the RX560.

That suggests to me that we're looking at similar performance to the RX560X mobile GPU from four years ago, which was around the mobile 1050/ti in performance. That's not bad, and is good enough for decent 1080p gaming in mobile.

When comparing to Polaris you need to remember that Polaris is very memory inefficient. Just take a look at how the 64-bit RX550 pretty much dies vs the 128-bit version, and the 128-bit RX550 has the same bandwidth as the RX560, which is x2 in everything... The RX560 should be performing a lot faster for its size, but it does not due to memory bandwidth limits.

RDNA2 is expected to be faster than Polaris at a similar bandwidth. Think about this: the 5600G is already matching the 128-bit RX550, and in some cases it beats it by a large margin with just single-rank DDR4-3200... that is what? 30% of the effective bandwidth of the RX550/560? RMB has a bigger IGP and better memory efficiency; something has to go very, very wrong for it to perform like a 560.
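
For reference, peak theoretical bandwidth is easy to sketch. The RX 560 figure below assumes 7 Gbps GDDR5 on a 128-bit bus, and the DDR5 speed grades are illustrative guesses rather than anything confirmed in this thread. These are peak numbers only; an IGP shares its bandwidth with the CPU, so the effective share is lower.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mtps: int) -> float:
    """Peak theoretical bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_mtps / 1000


if __name__ == "__main__":
    # Assumed reference point: RX 560 with 7 Gbps GDDR5 on a 128-bit bus.
    rx560 = peak_bandwidth_gbs(128, 7000)
    # Dual-channel (128-bit) system memory; the DDR5 grades are illustrative only.
    for name, rate in (("DDR4-3200", 3200), ("DDR5-4800", 4800), ("DDR5-5600", 5600)):
        bw = peak_bandwidth_gbs(128, rate)
        print(f"{name}: {bw:.1f} GB/s, {bw / rx560:.0%} of RX 560 peak")
```

The gap between the peak ratio for DDR4-3200 and the much lower "effective" figure guessed at above is the part that depends on CPU contention and real-world memory efficiency, which this sketch does not model.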
 

Spicy

Member
Oct 5, 2021
46
48
51
Supposedly, N6 both increases die per wafer, since it is denser while staying design-rule compatible with N7, and increases wafer throughput per line.
GF's 12nm was compatible with its 14nm, but without a density benefit. To get a smaller die you need a new design and new masks (Zen+ had the same die area as Zen).
 

maddie

Diamond Member
Jul 18, 2010
4,787
4,771
136
GF's 12nm was compatible with its 14nm, but without a density benefit. To get a smaller die you need a new design and new masks (Zen+ had the same die area as Zen).
Zen+ was the same size by choice. 12nm did have a small density improvement.
You will need some new masks anyhow as 6nm has EUV for several layers and 7nm doesn't. Admittedly, if you're going denser, all the masks would change.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
When comparing to Polaris you need to remember that Polaris is very memory inefficient. Just take a look at how the 64-bit RX550 pretty much dies vs the 128-bit version, and the 128-bit RX550 has the same bandwidth as the RX560, which is x2 in everything... The RX560 should be performing a lot faster for its size, but it does not due to memory bandwidth limits.

RDNA2 is expected to be faster than Polaris at a similar bandwidth. Think about this: the 5600G is already matching the 128-bit RX550, and in some cases it beats it by a large margin with just single-rank DDR4-3200... that is what? 30% of the effective bandwidth of the RX550/560? RMB has a bigger IGP and better memory efficiency; something has to go very, very wrong for it to perform like a 560.
Huh, TIL there's a Radeon 550 which is 64b, and the RX550 is separate.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
14LPE 78CPP/9T
14LPP adds 84CPP/9T & 84CPP/10.5T
GF must have added 78CPP/10.5T sometime after but before the plus below.
14LPP+ adds 84CPP/7.5T
12LP improves the performance/power of above.

Density improvement comes from 84CPP/9T to 84CPP/7.5T, which are library swaps. Transistor dimensions are unchanged between 14LPP (dual-gate transistor//KAIST transistor) and 12LP (tri-gate transistor//IBM-GF transistor).

Zen and Zen+ remained on 78CPP/10.5T Std Cells
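
As a rough way to compare the libraries listed above, standard-cell footprint can be approximated as CPP times track height. This sketch assumes the metal pitch is identical across libraries (not stated in the post), so the result is only a relative proxy and not real cell area; the function names are made up.

```python
def cell_footprint(cpp_nm: float, tracks: float) -> float:
    """Relative standard-cell footprint proxy: contacted poly pitch x track height.
    Units are arbitrary because the metal pitch is assumed constant and left out."""
    return cpp_nm * tracks


def footprint_ratio(new: tuple, old: tuple) -> float:
    """Footprint of the `new` (cpp, tracks) library relative to `old` (1.0 = same size)."""
    return cell_footprint(*new) / cell_footprint(*old)


if __name__ == "__main__":
    # The library swap named in the post: 84CPP/9T -> 84CPP/7.5T.
    print(f"{footprint_ratio((84, 7.5), (84, 9)):.3f}")
    # The 78CPP/10.5T cells that Zen and Zen+ are said to have stayed on, vs 84CPP/9T.
    print(f"{footprint_ratio((78, 10.5), (84, 9)):.3f}")
```

On this proxy the 9T to 7.5T swap shrinks the cell by about a sixth, while 78CPP/10.5T is larger than 84CPP/9T, consistent with a design that stays on the taller cells seeing no shrink.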
 

DrMrLordX

Lifer
Apr 27, 2000
21,807
11,161
136
Interesting. Looks like they're just going to slide B2-stepping Vermeer into the same position as the existing B0-stepping Vermeer. Same product name, basically the same sku, etc. Gonna be like the updated Switch. You'll have to check product codes to know what you're getting.
 

Spicy

Member
Oct 5, 2021
46
48
51
Density improvement comes from 84CPP/9T to 84CPP/7.5T, which are library swaps. Transistor dimensions are unchanged between 14LPP (dual-gate transistor//KAIST transistor) and 12LP (tri-gate transistor//IBM-GF transistor).

Zen and Zen+ remained on 78CPP/10.5T Std Cells
According to AnandTech, it's 9T (for Zen and Zen+)
 