Question AMD Rembrandt/Zen 3+ APU Speculation and Discussion


izaic3

Member
Nov 19, 2019
61
96
91
Alright, so we've had some leaks so far. I don't know if any of it's been confirmed yet, as it's pretty early, but here is what I've surmised so far (massive grain of salt of course):

If it turns out to have RDNA 2 and 12 CUs, I could see iGPU performance potentially almost doubling over Cezanne.

If I've made any mistakes or gotten anything wrong, please let me know. I'd also love to hear more knowledgeable people weigh in on their expectations.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
The comparison Linus made is not really valid because the Steam Deck and the Aya Neo were running different OSes, and the game runs through Proton on the Steam Deck. Unless the Aya Neo was also running SteamOS and I completely missed that?
Doom Eternal is known to run slower on Linux via Proton.
Yeah, I posted an article with that information above. About 10-15% slower in DE than on Windows, depending on the drivers used.
 

mikk

Diamond Member
May 15, 2012
4,175
2,211
136
Even if it doesn't have complete HW decode, they could still make a hybrid solution like they did for VP9. Won't be as efficient of course.


I think they already have hybrid AV1. The issue is that it's a really subpar solution for a mobile device because it's both less efficient and slower.
 

Bigos

Member
Jun 2, 2019
138
322
136
Maybe AV1 decoding needs more CUs than are available on Rembrandt for good performance? I mean, there is the fixed-function part of the decode and there is the per-pixel (per-region?) part that might use the CUs?

That, or they just want to cut down on die space. With fewer CUs, the other parts of the GPU take up a much larger share of the area. I would assume the Rembrandt GPU is meant to take as little space as possible, so maybe dropping AV1 saved them enough mm^2? Look at the Renoir die shot: it looks like the 8 CUs take at most half of the GPU side.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Maybe AV1 decoding needs more CUs than are available on Rembrandt for good performance? I mean, there is the fixed-function part of the decode and there is the per-pixel (per-region?) part that might use the CUs?

That, or they just want to cut down on die space. With fewer CUs, the other parts of the GPU take up a much larger share of the area. I would assume the Rembrandt GPU is meant to take as little space as possible, so maybe dropping AV1 saved them enough mm^2? Look at the Renoir die shot: it looks like the 8 CUs take at most half of the GPU side.

Van Gogh has fewer CUs and has AV1 decode.

It's either AMD being stingy with die space or a broken implementation/firmware
 

Bigos

Member
Jun 2, 2019
138
322
136
It looks like I have confused Rembrandt with Raphael. If the RMB with fat iGPU won't have AV1 then that is indeed retarded.

But AV1 seems to be supported by Sienna Cichlid (Navi21) and up: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_get.c#L619

And Yellow Carp is included: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/common/amd_family.h#L116

And Yellow Carp is Rembrandt: http://www.portvapes.co.uk/?id=Latest-exam-1Z0-876-Dumps&exid=thread...ectures-thread.2579999/page-219#post-40515918

Might be wrong, though.
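
For anyone who wants to check this on real hardware rather than from the Mesa source: on Linux, `vainfo` lists the VA-API profiles the driver exposes, and an AV1 decode path normally shows up as a `VAProfileAV1Profile0` line with a `VAEntrypointVLD` entrypoint. Below is a minimal sketch of a parser for that output. The helper names and the sample text are made up for illustration, and the samples are not real output from Rembrandt.

```python
def decode_profiles(vainfo_text):
    """Return the set of VA-API profiles that expose a decode (VLD) entrypoint."""
    profiles = set()
    for line in vainfo_text.splitlines():
        if ":" not in line:
            continue
        # Profile lines look like "VAProfileX : VAEntrypointY".
        profile, entrypoint = (part.strip() for part in line.split(":", 1))
        if profile.startswith("VAProfile") and entrypoint == "VAEntrypointVLD":
            profiles.add(profile)
    return profiles


def has_av1_decode(vainfo_text):
    """True if any AV1 profile is listed with a decode entrypoint."""
    return any(p.startswith("VAProfileAV1") for p in decode_profiles(vainfo_text))


# Hypothetical sample output, hand-written for this example.
SAMPLE_WITH_AV1 = """\
vainfo: Supported profile and entrypoints
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
"""

# Same list without the AV1 line, i.e. only AVC, HEVC and VP9 decode.
SAMPLE_WITHOUT_AV1 = """\
vainfo: Supported profile and entrypoints
      VAProfileH264High               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
"""

if __name__ == "__main__":
    print("with AV1 line:", has_av1_decode(SAMPLE_WITH_AV1))
    print("without AV1 line:", has_av1_decode(SAMPLE_WITHOUT_AV1))
```

In practice you would feed it the captured output of `vainfo` instead of the sample strings. Note this only tells you what the installed driver advertises, not what the silicon could do with different firmware.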
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
It looks like I have confused Rembrandt with Raphael. If the RMB with fat iGPU won't have AV1 then that is indeed retarded.

But AV1 seems to be supported by Sienna Cichlid (Navi21) and up: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gallium/drivers/radeonsi/si_get.c#L619

And Yellow Carp is included: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/common/amd_family.h#L116

And Yellow Carp is Rembrandt: http://www.portvapes.co.uk/?id=Latest-exam-1Z0-876-Dumps&exid=thread...ectures-thread.2579999/page-219#post-40515918

Might be wrong, though.
@uzzi38 mentioned the lack of AV1 in Rembrandt in the Cezanne thread. The relevant patch mail for VCN3 support in Rembrandt only mentions AVC, HEVC, VP9 and JPEG.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Finally got around to trying Doom Eternal in the same area as Linus tested on the Steam Deck on my 5800H. Mostly medium settings; I turned off motion blur and reflections, and also dropped particles and volumetric lighting to low.

iGPU alone averaged 25W (no CPU power included), 45-65fps in that open area with enemies. Only way I could reliably hit the same 55-70W Linus was talking about would be to just drop everything to low really.
 

Shivansps

Diamond Member
Sep 11, 2013
3,873
1,527
136
Finally got around to trying Doom Eternal in the same area as Linus tested on the Steam Deck on my 5800H. Mostly medium settings; I turned off motion blur and reflections, and also dropped particles and volumetric lighting to low.

iGPU alone averaged 25W (no CPU power included), 45-65fps in that open area with enemies. Only way I could reliably hit the same 55-70W Linus was talking about would be to just drop everything to low really.

I don't know enough about how LPDDR4X/5 works compared to regular DDR4. But are the memory sticks on the 5800H dual rank or single rank?
 

GreenReaper

Junior Member
Aug 15, 2018
8
6
81
Van Gogh has fewer CUs and has AV1 decode.

It's either AMD being stingy with die space or a broken implementation/firmware
Or power, perhaps? PCIe 4 is more expensive. And perhaps the thinking is that AVX2 decode will be good enough.
It seems like it can compete with the best hardware decode performance-wise - at least, on one of the fastest CPUs.
(They've been pushing updates for older CPUs, too - my x120e netbook, limited to SSSE3, appreciates it!)

Don't know the relative power cost, though - I imagine it's not ideal. Encoding is faster, too, but still very slow.

Another possibility: reducing the cost of decode block licensing (if AMD is doing so) and uncertain patent exposure.
Splashing out for AV1 support could be justified for a standalone card, but maybe not for a relatively low-margin APU.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,243
136
It's either AMD being stingy with die space or a broken implementation/firmware
Lack of AV1 decoding could be attributed to AMD cutting corners by not including full-blown VCN3 blocks. They don't seem small on RDNA2 series GPUs.
It can't be a broken implementation, since the same IP already works in RDNA2 GPUs.
They were also stingy in not integrating any IC at all.

But RMB with Zen3+ could still be very impressive.
In addition to being fabbed on N6, LPDDR5 in a 4x32-bit channel arrangement will help the RDNA2 GPU greatly, along with some reduction in power.
This will be the APU that most Aya Neo makers and clones are looking for.
With cTDP down to 9W (U series), you are looking at a decent Steam Deck competitor (you can also install Steam if you want).
The Zen3+ CPU part will be way ahead of the Steam Deck.
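
To put rough numbers on the 4x32-bit LPDDR5 point, here is a quick back-of-the-envelope sketch of peak theoretical bandwidth. The memory speeds are assumptions, not anything confirmed in this thread: LPDDR5-6400 for Rembrandt, dual-channel DDR4-3200 as a typical Cezanne laptop configuration, and the GDDR5 version of the GTX 1650 (128-bit at 8 Gbps per pin) for comparison. The function name is made up for the example.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_mtps):
    """Peak theoretical bandwidth in GB/s.

    bus_width_bits: total bus width in bits (e.g. 4 channels x 32 bits = 128)
    data_rate_mtps: transfers per second per pin, in MT/s
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps / 1000


# Assumed configurations (speeds are illustrative, see the note above).
configs = {
    "DDR4-3200, 2x64-bit (assumed Cezanne laptop)": (2 * 64, 3200),
    "LPDDR5-6400, 4x32-bit (assumed Rembrandt)": (4 * 32, 6400),
    "GTX 1650 GDDR5, 128-bit at 8 Gbps": (128, 8000),
}

if __name__ == "__main__":
    for name, (width, rate) in configs.items():
        print(f"{name}: {peak_bandwidth_gbs(width, rate):.1f} GB/s")
```

These are peak figures only. Sustained bandwidth is lower, and on an APU the CPU cores draw from the same pool.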
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Lack of AV1 decoding could be attributed to AMD cutting corners by not including full-blown VCN3 blocks. They don't seem small on RDNA2 series GPUs.
It can't be a broken implementation, since the same IP already works in RDNA2 GPUs.
They were also stingy in not integrating any IC at all.

But RMB with Zen3+ could still be very impressive.
In addition to being fabbed on N6, LPDDR5 in a 4x32-bit channel arrangement will help the RDNA2 GPU greatly, along with some reduction in power.
This will be the APU that most Aya Neo makers and clones are looking for.
With cTDP down to 9W (U series), you are looking at a decent Steam Deck competitor (you can also install Steam if you want).
The Zen3+ CPU part will be way ahead of the Steam Deck.

The updated VCN numbering for Rembrandt leads me to believe AMD decided to redesign the decode IP blocks for greater power efficiency - it's an area where Tiger Lake currently has a major advantage over Renoir/Cezanne after all. If that's the case, I can understand why it's cut down vs full RDNA2.

I still think Rembrandt will be extremely good overall, but I'd think the inclusion of a small 16MB IC would be enough to solidify its position as an entry 1080p solution.

From a GPU core resources point of view RMB should be capable enough to compete with a 1650, but total memory bandwidth is below that GPU and there's a CPU to feed at the same time. And RDNA2 does seem slightly improved vs RDNA1 in how much performance they can get at a certain mem bw (accounting for IC and approximating effective bandwidth via hit-rates provided by AMD, the 6600XT should have less effective bandwidth at 1440p yet generally performs on par or better than it).

But even then total memory bandwidth falls behind the 1650, and that GPU doesn't have to share bandwidth with an 8 core CPU at the same time. 16MB would have been plenty to keep that GPU fed.
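
A rough way to see why even a small IC would matter is the simple hit-rate model sketched below. It is only one common approximation, not AMD's own method: if a fraction of GPU memory requests hit the cache, DRAM only has to serve the misses, so the request bandwidth the GPU can sustain scales by 1 / (1 - hit rate). The function name, the hit rates, and the CPU share are all hypothetical placeholders, and the 102.4 GB/s figure assumes LPDDR5-6400 on a 128-bit bus.

```python
def effective_gpu_bandwidth_gbs(dram_gbs, hit_rate, cpu_share_gbs=0.0):
    """Approximate bandwidth the GPU 'sees' with a last-level cache in front of DRAM.

    dram_gbs:      total DRAM bandwidth in GB/s
    hit_rate:      fraction of GPU requests served by the cache, in [0, 1)
    cpu_share_gbs: DRAM bandwidth consumed by the CPU (not cached in this model)
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    if cpu_share_gbs < 0.0 or cpu_share_gbs > dram_gbs:
        raise ValueError("cpu_share_gbs must be between 0 and dram_gbs")
    gpu_dram_gbs = dram_gbs - cpu_share_gbs
    # Only misses reach DRAM, so total request bandwidth is DRAM / miss rate.
    return gpu_dram_gbs / (1.0 - hit_rate)


if __name__ == "__main__":
    dram = 102.4      # assumed: LPDDR5-6400 on a 128-bit bus
    cpu_share = 10.0  # hypothetical CPU traffic in GB/s
    for hit_rate in (0.0, 0.2, 0.4):  # hypothetical hit rates
        eff = effective_gpu_bandwidth_gbs(dram, hit_rate, cpu_share)
        print(f"hit rate {hit_rate:.0%}: ~{eff:.1f} GB/s effective")
```

The model ignores the cache's own bandwidth limit and any latency effects, so treat the output as a trend rather than a prediction.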

From the sounds of it though we'll be waiting until Strix Point for an IC on die/package, which is sadge.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,687
6,243
136
The updated VCN numbering for Rembrandt leads me to believe AMD decided to redesign the decode IP blocks for greater power efficiency - it's an area where Tiger Lake currently has a major advantage over Renoir/Cezanne after all. If that's the case, I can understand why it's cut down vs full RDNA2.

I still think Rembrandt will be extremely good overall, but I'd think the inclusion of a small 16MB IC would be enough to solidify its position as an entry 1080p solution.

From a GPU core resources point of view RMB should be capable enough to compete with a 1650, but total memory bandwidth is below that GPU and there's a CPU to feed at the same time. And RDNA2 does seem slightly improved vs RDNA1 in how much performance they can get at a certain mem bw (accounting for IC and approximating effective bandwidth via hit-rates provided by AMD, the 6600XT should have less effective bandwidth at 1440p yet generally performs on par or better than it).

But even then total memory bandwidth falls behind the 1650, and that GPU doesn't have to share bandwidth with an 8 core CPU at the same time. 16MB would have been plenty to keep that GPU fed.

From the sounds of it though we'll be waiting until Strix Point for an IC on die/package, which is sadge.
Agreed.
Are the CVML blocks also present on RMB btw?
On mobile SoCs you can delegate all the background processing for hotword detection and face detection (e.g. Windows Hello) to the DSP and power down the cores to prolong battery life.
Another thing missing is the AOP (Always On Processor) found in mobile SoCs, which allows all core clusters to be powered down, leaving only DRAM self-refresh and interrupt handling active.
 

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
Agreed.
Are the CVML blocks also present on RMB btw?
On mobile SoCs you can delegate all the background processing for hotword detection and face detection (e.g. Windows Hello) to the DSP and power down the cores to prolong battery life.
Another thing missing is the AOP (Always On Processor) found in mobile SoCs, which allows all core clusters to be powered down, leaving only DRAM self-refresh and interrupt handling active.

I think it is, but I'm going off what a friend said a long while back, I could be wrong.
 