Question Speculation: RDNA2 + CDNA Architectures thread


uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
Quite an interesting (for me) patch by AMD:
call psp_int_ta_microcode() to parse the ta firmware.

RAS firmware is added to Sienna. This likely means Sienna will end up being a Pro card as well. (So there could be a Radeon VII Pro-like card derived from Sienna. The 5700 Pro cards had no RAS features that I know of, or at least none enabled.)
 
Last edited:

Geranium

Member
Apr 22, 2020
83
101
61
I fail to see the point you are making.

Financing for RDNA iteration R&D need not have anything to do with CDNA development directions - quite the opposite in fact, which was my point, CDNA frees them from uArch design dependency on the likes of Sony and MS.

Besides which, the Apple part of their custom division only accounts for a small fraction of their Sony/MS business, and likely did not even exist in the time of RDNA's early development - the fact that Navi 12 is a more errata free iteration on the same RDNA1 uArch in Navi 10 and 14 supports this.

Given Apple are breaking away from contracting AMD it was and is clearly a good direction to be going in - doubly so if there is any lack of certainty over mid term console refreshes, let alone another full console generation after PS5/XSX.

Though returning to your "keeping afloat" point, to be brutally accurate, the entire Radeon/RTG division first kept the CPU division afloat during the Bulldozer mess - back then consoles were not the only driving earner for AMD during the early GCN era, before either intrinsic uArch shortfalls or iteration R&D mismanagement broke its scalability.

The pendulum then swung around during the late Volcanic Islands to Polaris/Vega timeframe, when the newly rebounding CPU group under Zen was keeping the GPU group afloat, in combination with semi custom deals from various sources*.

*Including mid term and next gen Sony/MS console SoC's, Subor, Apple, and the Hygon licensing deal.
AMD sells quite a good number of GPUs to Apple. If not, why has Apple always gotten the full die of AMD GPUs first since 2012/13?
#. The R7 260X had full Bonaire, while the R7 360 was a cut-down version and the full die went to Apple.
#. The full Polaris11 die was only available to Apple for the first year.
#. Most of the first Vega shipment went to Apple as Pro Vega 64 and Pro Vega 48.
#. The full Vega20 die is only available to Apple and datacenter, while consumers only got the 60CU version. Even the Pro VII is only 60CU.
#. The full Navi14 die is only available in Apple's Pro RX 5500M and Pro 5500.
Companies don't do business like this with a small partner. They do this kind of business where good money is made.

And Navi12 is not a custom GPU. Vega12 was not either. Custom ones have their own names, as opposed to general names like Navi1x/Vega1x. Navi12 may not even be an Apple exclusive either; it just happens that Apple is the only one who uses it, just like the Vega12 GPU.

Apple breaking away will surely hurt AMD's GPU division if AMD doesn't increase its sales on the Windows side of the market.
 

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136
Apple breaking away will surely hurt AMD's GPU division if AMD doesn't increase its sales on the Windows side of the market.
Perhaps, but nothing more than a pinprick compared to the impact of their previous low PC market share and what it would look like if they lost their console deals.
 
Reactions: Tlh97

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136
And Navi12 is not a custom GPU. Vega12 was not either. Custom ones have their own names, as opposed to general names like Navi1x/Vega1x. Navi12 may not even be an Apple exclusive either; it just happens that Apple is the only one who uses it, just like the Vega12 GPU.
It is custom in so much as it is a different stepping with more fixed uArch errata from Navi 10, and it also has HBM controllers instead of GDDR6 - which is not a trivial amount of silicon to change out.

Vega 12 however is completely custom made for Apple.

There is no equivalent chip like it in any other line up that I have seen - it is not half of any larger GPU, it is not an exact double of the Raven Ridge GPU.

It is just a chip entirely by itself in design, and the only other discrete Vega uArch GPU apart from Vega 10 on the 16/14/12nm processes.

We can speculate that perhaps V12 was originally intended to be a twin brother to Vega 11 (which AMD has stated is not RR GPU), but speculation is all we have on its origins as AMD have been pretty tight lipped about it as things go.
 
Reactions: Tlh97

Konan

Senior member
Jul 28, 2017
360
291
106
Sharing if not seen yet. Can google translate.
Chiphell post from leaker wjm47196 regarding RDNA2. You can search his name for his track record if needed. (He was correct on Polaris 30 (Radeon RX 590) launching in Q4 2018, Radeon VII in Q1 2019, 7nm Navi mainstream cards arriving before the high-end enthusiast-grade variants in 2019, and lastly AMD unveiling RDNA 2 at CES earlier this year, so quite accurate.)

Summary -
  • All the current RDNA2 rumors are fake and this is because..
  • AMD has not even finished designing the PCB yet, the toolkits for manufacturing are also not ready. Rumours claiming X times of performance are apparently BS
  • Due to Covid, the AMD engineers from US/CAN are needed and can't travel
  • However, Q4 launch for RDNA2 is still on
  • Don't expect AIB RDNA2 boards for launch | AMD will be sending PCB design resources to AIB in the next 2 weeks.
  • GPU validation sample was sent to Shanghai (apparently different from final mass producing PCB) for driver development (AMD's GPU driver is coded in AMD Shanghai?)
  • wjm47196 also said the Flagship RDNA2 has 16GB VRAM (and with the famous leaker saying this - it could even mean HBM2 could be back on the table now) and that Ampere will launch first in September
 

DiogoDX

Senior member
Oct 11, 2012
747
279
136
Sharing if not seen yet. Can google translate.
Chiphell post from leaker wjm47196 regarding RDNA2. You can search his name for his track record if needed. (He was correct on Polaris 30 (Radeon RX 590) launching in Q4 2018, Radeon VII in Q1 2019, 7nm Navi mainstream cards arriving before the high-end enthusiast-grade variants in 2019, and lastly AMD unveiling RDNA 2 at CES earlier this year, so quite accurate.)

Summary -
  • All the current RDNA2 rumors are fake and this is because..
  • AMD has not even finished designing the PCB yet, the toolkits for manufacturing are also not ready. Rumours claiming X times of performance are apparently BS
  • Due to Covid, the AMD engineers from US/CAN are needed and can't travel
  • However, Q4 launch for RDNA2 is still on
  • Don't expect AIB RDNA2 boards for launch | AMD will be sending PCB design resources to AIB in the next 2 weeks.
  • GPU validation sample was sent to Shanghai (apparently different from final mass producing PCB) for driver development (AMD's GPU driver is coded in AMD Shanghai?)
  • wjm47196 also said the Flagship RDNA2 has 16GB VRAM (and with the famous leaker saying this - it could even mean HBM2 could be back on the table now) and that Ampere will launch first in September
This makes much more sense and mirrors the RTG of the last few years: late, no custom cards near launch, and infant drivers. Let's see if at least the performance will be good and not another Vega disaster.
 
Last edited:

uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
Sharing if not seen yet. Can google translate.
Chiphell post from leaker wjm47196 regarding RDNA2. You can search his name for his track record if needed. (He was correct on Polaris 30 (Radeon RX 590) launching in Q4 2018, Radeon VII in Q1 2019, 7nm Navi mainstream cards arriving before the high-end enthusiast-grade variants in 2019, and lastly AMD unveiling RDNA 2 at CES earlier this year, so quite accurate.)

Summary -
  • All the current RDNA2 rumors are fake and this is because..
  • AMD has not even finished designing the PCB yet, the toolkits for manufacturing are also not ready. Rumours claiming X times of performance are apparently BS
  • Due to Covid, the AMD engineers from US/CAN are needed and can't travel
  • However, Q4 launch for RDNA2 is still on
  • Don't expect AIB RDNA2 boards for launch | AMD will be sending PCB design resources to AIB in the next 2 weeks.
  • GPU validation sample was sent to Shanghai (apparently different from final mass producing PCB) for driver development (AMD's GPU driver is coded in AMD Shanghai?)
  • wjm47196 also said the Flagship RDNA2 has 16GB VRAM (and with the famous leaker saying this - it could even mean HBM2 could be back on the table now) and that Ampere will launch first in September

I'm 100% certain portions of this are either wrong or not referring to Navi21.

The die size, however, I am confident in.
 
Reactions: Tlh97 and Glo.

uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
Last edited:

Konan

Senior member
Jul 28, 2017
360
291
106
Speaking of which, this talks about one of them.


GDDR6, not HBM2. For consumer dies anyway.

I agree about GDDR6. I think that HBM2 is too expensive.

Die size seems reasonable too.
 
Reactions: Tlh97 and uzzi38

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136
A quick Google translate from an associated webpage brings this text:

"Looking at the embedded patches (not found) without going through reviews on The amd-gfx Archives, Sienna Cichlid seems to allow connectivity with up to 4 GPUs."

4 GPU connectivity sounds like a heck of a lot for a company expecting very little from xfire / mgpu going forward.
 

uzzi38

Platinum Member
Oct 16, 2019
2,705
6,427
146
A quick Google translate from an associated webpage brings this text:

"Looking at the embedded patches (not found) without going through reviews on The amd-gfx Archives, Sienna Cichlid seems to allow connectivity with up to 4 GPUs."

4 GPU connectivity sounds like a heck of a lot for a company expecting very little from xfire / mgpu going forward.

It's worth remembering that Instinct from here on out will have 0 display capabilities.

AMD need RDNA to handle certain tasks, such as VDI. That's why you'll see mGPU support and IF bridges on RDNA2, even if AMD considers xFire dead.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
A quick Google translate from an associated webpage brings this text:

"Looking at the embedded patches (not found) without going through reviews on The amd-gfx Archives, Sienna Cichlid seems to allow connectivity with up to 4 GPUs."

4 GPU connectivity sounds like a heck of a lot for a company expecting very little from xfire / mgpu going forward.
Sienna will be a Pro part as well, like the VII Pro. It has SRIOV, enhanced RAS, and self diagnostics to inform the user if a fault threshold is reached. This is also used for RMA. It has non-volatile storage to save faults detected by the self diagnosis, and when a threshold is reached it will not initialize anymore unless forced.
If not games, other loads would benefit from the XGMI.

Update:
I find it awkward that we are replying to a forum post in quick succession. Working from home really changed my browsing habits.
I went to the office on Monday to try working from there as part of a gradual return to normalcy, but I now find it weird to work from the office.
 
Last edited:

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136

Something actually worth discussing for once.

Also I said in another thread there's a possibility for RDNA2 on the 28th, well looks like there's not.
Those numbers are really weird.

9.5 FP64 TFLOPS, 42 FP32 TFLOPS and 150 FP16 TFLOPS.

The 150 FP16 TFLOPS number makes sense from tensor/matrix logic, but FP32 is insane at 42 TFLOPS.

I can only assume that ML focused HW augments FP32 numbers too for ML work.

The 9.5 FP64 TFLOPS makes perfect sense though - you only need 1.16 GHz to reach that number at half rate with 128 CUs in old GCN reckoning.
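
For anyone wanting to check that figure, here is a rough sketch of the arithmetic in Python. It assumes the usual GCN accounting of 64 FP32 lanes per CU and 2 FLOPs per lane per clock (one fused multiply-add); neither number comes from the leak itself, and the constant and function names are purely illustrative.

```python
# Assumptions (not from the leak): 64 FP32 lanes per CU and
# 2 FLOPs per lane per clock (one fused multiply-add), as in GCN.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2


def peak_tflops(cus, clock_ghz, rate=1.0):
    """Peak throughput in TFLOPS.

    rate is the fraction of the FP32 rate, e.g. 0.5 for half-rate FP64.
    """
    gflops = cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * clock_ghz * rate
    return gflops / 1000.0


# 128 CUs at 1.16 GHz, as in the post above.
fp64 = peak_tflops(128, 1.16, rate=0.5)
fp32 = peak_tflops(128, 1.16)
print(f"FP64 (half rate): {fp64:.2f} TFLOPS")
print(f"FP32 (full rate): {fp32:.2f} TFLOPS")
```

Under these assumptions the half-rate FP64 result lands at roughly 9.5 TFLOPS, while full-rate FP32 at the same clock comes out at about 19 TFLOPS, well short of the quoted 42.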
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136

Something actually worth discussing for once.

Also I said in another thread there's a possibility for RDNA2 on the 28th, well looks like there's not.


Doubt.

def FeatureISAVersion9_0_8 : FeatureSet<
[FeatureGFX9,
HalfRate64Ops,

This is one of those exclusives compiled from open info. They just can't help themselves can they?
Ironic that they feel they need to confirm AMD's publicly shared information.
 
Last edited:

TESKATLIPOKA

Platinum Member
May 1, 2020
2,507
2,990
136
Those numbers are really weird.
....
The 9.5 FP64 TFLOPS makes perfect sense though - you only need 1.16 GHz to reach that number at half rate with 128 CUs in old GCN reckoning.
It doesn't make any sense to me to clock FP64 cores so low. If they want 9.5 TFLOPS in FP64, then it's much better to clock the FP64 cores the same as the rest of the chip and save a lot of space by having fewer FP64 cores in the GPU.
 

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136
It doesn't make any sense to me to clock FP64 cores so low. If they want 9.5 TFLOPS in FP64, then it's much better to clock the FP64 cores the same as the rest of the chip and save a lot of space by having fewer FP64 cores in the GPU.
It does if you are packing in 8 of them in a single rack.

Remember this is CDNA meant for servers, datacenters and some workstations.

Best not to think of it as a GPU at all anymore - absolute perf per card matters less than perf/watt per system or rack when you go to server level.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,747
6,598
136
It does if you are packing in 8 of them in a single rack.

Remember this is CDNA meant for servers, datacenters and some workstations.

Best not to think of it as a GPU at all anymore - absolute perf per card matters less than perf/watt per system or rack when you go to server level.
I think he is saying that half-rate DPFP does not make sense with the exclusive info. LLVM considers CDNA/Arcturus to support HalfRate64Ops.
If a card can do 40+ FP32 TFLOPS, it will do half of that in DPFP, i.e. 20 TF, at the same clocks. Conversely, if it does 10 TF DPFP it will do 20 TF FP32.
That is how MI50/60 work as well.
It would be strange to drop clocks to half of the FP32 clocks to get those DPFP values.
Either LLVM is wrong or the exclusive info is made up.
 
Last edited:
Reactions: Tlh97 and raghu78

TESKATLIPOKA

Platinum Member
May 1, 2020
2,507
2,990
136
It does if you are packing in 8 of them in a single rack.

Remember this is CDNA meant for servers, datacenters and some workstations.

Best not to think of it as a GPU at all anymore - absolute perf per card matters less than perf/watt per system or rack when you go to server level.
As DisEnchantment mentioned, I wasn't talking about perf/W.
If FP32 is 42 TFLOPS then FP64 shouldn't be 9.5 TFLOPS, but 21 TFLOPS if it's 1/2 rate or 10.5 TFLOPS if it's 1/4 rate. To get 9.5 TFLOPS you would need a different clock speed, and that doesn't make a lot of sense.
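
A quick way to see the mismatch is to work out which FP64:FP32 ratio the quoted figures would actually imply. The sketch below uses only the rumoured numbers from this thread (42 and 9.5 TFLOPS) and assumes both would run at the same clock; the variable names are illustrative.

```python
from fractions import Fraction

# Rumoured figures discussed in this thread, not confirmed specs.
fp32_tflops = 42.0
quoted_fp64_tflops = 9.5

# FP64 throughput expected at common hardware rates,
# assuming FP32 and FP64 run at the same clock.
implied = {rate: fp32_tflops * float(rate)
           for rate in (Fraction(1, 2), Fraction(1, 4))}
for rate, tflops in implied.items():
    print(f"FP64 at {rate} rate: {tflops:.1f} TFLOPS")

# Ratio the quoted numbers would actually require.
implied_ratio = quoted_fp64_tflops / fp32_tflops
print(f"Ratio implied by the quoted figures: {implied_ratio:.3f}")
```

The implied ratio falls between 1/4 and 1/5, which matches neither a half-rate nor a quarter-rate design at a single clock.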
 

soresu

Diamond Member
Dec 19, 2014
3,206
2,474
136
As DisEnchantment mentioned, I wasn't talking about perf/W.
If FP32 is 42 TFLOPS then FP64 shouldn't be 9.5 TFLOPS, but 21 TFLOPS if it's 1/2 rate or 10.5 TFLOPS if it's 1/4 rate. To get 9.5 TFLOPS you would need a different clock speed, and that doesn't make a lot of sense.
They may be quoting figures for FP32 operations that favour ML workloads specifically - likely operations accelerated by the new matrix/tensor logic.

I would expect 19 TFLOPS for truly general purpose workloads - unless they have doubled the per CU FP32 compute capacity somehow.

Edit: nVidia's recent Ampere announcement was being creative on their figures apparently*, AMD may have just decided fair is fair if nVidia are going to play games.

*something to do with sparse numbers or the like - I think it's similar to geometry culling, except with unnecessary tensor computation and then quoting a performance figure that acts as if those operations were computed anyway.

I have heard speculation that nVidia's RT gigaray figures are similarly inflated based on how Turing performs with denoising and DLSS as if it is handling far more rays per second than it actually is.
 
Last edited:

TESKATLIPOKA

Platinum Member
May 1, 2020
2,507
2,990
136
Everything is possible.
19 Tflops? That would be for example 72CU at 2.05GHz, that is doable.
I have to wonder whether RDNA2 is really so much better than RDNA1 that AMD didn't bother to make a bigger RDNA1 chip to combat the 2080 Ti, or whether it was because of some limitation in RDNA1.
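
As a rough check on the 72CU at 2.05GHz example, the sketch below computes the FP32 figure and also inverts it to get the clock needed for a 19 TFLOPS target. It assumes RDNA2 keeps 64 FP32 lanes per CU and 2 FLOPs per lane per clock as in RDNA1, which is a guess at this point; the 64 and 80 CU entries are hypothetical comparison points, not leaked configurations.

```python
def fp32_tflops(cus, clock_ghz, lanes_per_cu=64, flops_per_lane=2):
    """Peak FP32 TFLOPS for a given CU count and clock (assumed layout)."""
    return cus * lanes_per_cu * flops_per_lane * clock_ghz / 1000.0


def clock_for_target(target_tflops, cus, lanes_per_cu=64, flops_per_lane=2):
    """Clock in GHz needed to reach target_tflops with the given CU count."""
    return target_tflops * 1000.0 / (cus * lanes_per_cu * flops_per_lane)


# The example from the post above: 72 CUs at 2.05 GHz.
print(f"72 CU @ 2.05 GHz: {fp32_tflops(72, 2.05):.2f} TFLOPS")

# Clock required for 19 TFLOPS; 64 and 80 CUs are hypothetical.
for cus in (64, 72, 80):
    print(f"{cus} CU needs {clock_for_target(19.0, cus):.2f} GHz for 19 TFLOPS")
```

With these assumptions 72 CUs at 2.05GHz gives a little under 19 TFLOPS, so the example is in the right range, and a wider chip would reach the same figure at a lower clock.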
 
Last edited:

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
Everything is possible.
19 Tflops? That would be for example 72CU at 2.05GHz, that is doable.
I have to wonder whether RDNA2 is really so much better than RDNA1 that AMD didn't bother to make a bigger RDNA1 chip to combat the 2080 Ti, or whether it was because of some limitation in RDNA1.

I think AMD knew from the outset that RDNA 1 was just a stepping stone (which they have mentioned) and with limited 7nm capacity (at that time) they went with the mainstream market, which significantly outsells the high end market.
 