Question Speculation: RDNA2 + CDNA Architectures thread

Page 73

uzzi38

Platinum Member
Oct 16, 2019
2,703
6,405
146
All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to say 505mm^2 for Navi21, which other people have backed up. Even so, take the following with a pinch of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html
 

DJinPrime

Member
Sep 9, 2020
87
89
51
3) The RX 5700 XT already had a high TDP and AMD is done playing that game.
So, not scalable at higher speed or core counts or both then. It was a small chip on 7nm from TSMC, and I don't remember hearing about yield issues.

Navi was by design targeting the mid-range performance level. Simple economics as greater sales are to be had in the lower to mid-range vs the higher end market. Putting Big Navi on the back burner doesn't mean they couldn't have made it earlier if it made sense at the time to do so. It was in AMD's best interest to use their resources and wafer supply for their other offerings and commitments at that time.
Also not scalable then, since it was only designed for mid range performance. If you believe that, then it means there's something about the design that doesn't work at high performance. But that's not how GPUs have historically been designed. All the lower tiers are basically the big chip with clusters removed; there's not a whole lot of difference in design. Whatever is their biggest chip is their best chip. Where the performance lands is what it is.
Let's do some simple math. Including all the costs of the card, the 5700 ended up selling for $400. So, you think that basically having the same board, maybe adding another 4GB of memory, and having a 50% bigger chip that sells for $800-$1000 wouldn't have bigger margins? And it's not like you're wasting the chip: bin the bad ones down to 5700 and 5500. If that's really how it is, no wonder NV is on top, AMD is not even trying. I prefer to think that AMD tried but could not scale up for some technical reason. No shame there, sometimes things just don't work the way you expected and the next version will be better. Also, I hope you're wrong about the wafer availability, because there's tons more pressure now at 7nm: new Xbox, PS5, Zen 3, new iPhones. If they have the same mentality, how many Big Navi will even be made?
Why bother even, with the price of 3070 and 3080, might as well not even make any big Navi. /s
 

Hitman928

Diamond Member
Apr 15, 2012
5,622
8,847
136
So, not scalable at higher speed or core counts or both then. It was a small chip on 7nm from TSMC, and I don't remember hearing about yield issues.


Also not scalable then, since it was only designed for mid range performance. If you believe that, then it means there's something about the design that doesn't work at high performance. But that's not how GPUs have historically been designed. All the lower tiers are basically the big chip with clusters removed; there's not a whole lot of difference in design. Whatever is their biggest chip is their best chip. Where the performance lands is what it is.
Let's do some simple math. Including all the costs of the card, the 5700 ended up selling for $400. So, you think that basically having the same board, maybe adding another 4GB of memory, and having a 50% bigger chip that sells for $800-$1000 wouldn't have bigger margins? And it's not like you're wasting the chip: bin the bad ones down to 5700 and 5500. If that's really how it is, no wonder NV is on top, AMD is not even trying. I prefer to think that AMD tried but could not scale up for some technical reason. No shame there, sometimes things just don't work the way you expected and the next version will be better. Also, I hope you're wrong about the wafer availability, because there's tons more pressure now at 7nm: new Xbox, PS5, Zen 3, new iPhones. If they have the same mentality, how many Big Navi will even be made?
Why bother even, with the price of 3070 and 3080, might as well not even make any big Navi. /s

GPUs are AMD's least profitable 7nm product. Priority was given to more profitable product lines both in terms of development funding and wafer supply. AMD's GPU group had been stripped down to a bare bones operation to give more funding to the Zen team and the graphics team are only recently seeing significant increases in funding. Without the required funding, AMD decided to focus on the higher volume segments to try and stay relevant in regards to market share. Now that they have more funding they are expanding to a full product lineup. That's it, it's not that complicated.

Ongoing wafer availability is a concern and no one knows how many wafers AMD has purchased/is purchasing for RDNA2. It might be tight for a bit with Zen3, RDNA2, and consoles all launching at the same time, but with mobile chips migrating to 5 nm, that should free up a lot of space for the medium to long term until AMD makes the move to 5 nm as well.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
Hmm, is he referring to this patent?

ADAPTIVE CACHE RECONFIGURATION VIA CLUSTERING

That's quite a big change to the cache subsystem
There is crazy impressive research from AMD, that pretty much IS this patent....


Just a quote:
"We extensively evaluate our proposal across 28 GPGPU applications. Our dynamic scheme boosts performance by 22% (up to 52%) and energy efficiency by 49% for the applications that exhibit high data replication and cache sensitivity without degrading the performance of the other applications. This is achieved at a modest area overhead of 0.09 mm²/core."
 

Hitman928

Diamond Member
Apr 15, 2012
5,622
8,847
136
There is crazy impressive research from AMD, that pretty much IS this patent....


Just a quote:
"We extensively evaluate our proposal across 28 GPGPU applications. Our dynamic scheme boosts performance by 22% (up to 52%) and energy efficiency by 49% for the applications that exhibit high data replication and cache sensitivity without degrading the performance of the other applications. This is achieved at a modest area overhead of 0.09 mm²/core."

I'm assuming by core they mean compute unit? If it's stream processor, that modest overhead is not so modest, lol.
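
To put rough numbers on the two readings of "core", here is a back-of-envelope sketch that uses the RX 5700 XT (Navi 10: 40 compute units, 2,560 stream processors, roughly 251 mm² die) purely as a yardstick. The 0.09 mm²/core figure is the one from the quote; the paper's process node and core definition may not match Navi 10, so this is an illustration and not a measurement.

```python
# Back-of-envelope: total area overhead for two readings of "core".
# Assumption: RX 5700 XT (Navi 10) figures are used purely as a yardstick.
AREA_PER_CORE_MM2 = 0.09          # figure quoted from the paper
NAVI10_CUS = 40
NAVI10_STREAM_PROCESSORS = 2560
NAVI10_DIE_MM2 = 251


def total_overhead_mm2(core_count, per_core=AREA_PER_CORE_MM2):
    """Total added area if every 'core' pays the per-core overhead."""
    return core_count * per_core


per_cu = total_overhead_mm2(NAVI10_CUS)
per_sp = total_overhead_mm2(NAVI10_STREAM_PROCESSORS)

print(f"core = compute unit:     {per_cu:.1f} mm^2 ({per_cu / NAVI10_DIE_MM2:.1%} of die)")
print(f"core = stream processor: {per_sp:.1f} mm^2 ({per_sp / NAVI10_DIE_MM2:.1%} of die)")
```

Read per compute unit, the overhead is a few mm² in total. Read per stream processor, it would be most of a Navi 10 die, which is why the compute-unit reading looks like the plausible one.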
 
Last edited:

eek2121

Diamond Member
Aug 2, 2005
3,053
4,281
136
GPUs are AMD's least profitable 7nm product. Priority was given to more profitable product lines both in terms of development funding and wafer supply. AMD's GPU group had been stripped down to a bare bones operation to give more funding to the Zen team and the graphics team are only recently seeing significant increases in funding. Without the required funding, AMD decided to focus on the higher volume segments to try and stay relevant in regards to market share. Now that they have more funding they are expanding to a full product lineup. That's it, it's not that complicated.

Ongoing wafer availability is a concern and no one knows how many wafers AMD has purchased/is purchasing for RDNA2. It might be tight for a bit with Zen3, RDNA2, and consoles all launching at the same time, but with mobile chips migrating to 5 nm, that should free up a lot of space for the medium to long term until AMD makes the move to 5 nm as well.

I would ignore him, he is just trolling people.
 

Krteq

Senior member
May 22, 2015
993
672
136
There is crazy impressive research from AMD, that pretty much IS this patent....


Just a quote:
"We extensively evaluate our proposal across 28 GPGPU applications. Our dynamic scheme boosts performance by 22% (up to 52%) and energy efficiency by 49% for the applications that exhibit high data replication and cache sensitivity without degrading the performance of the other applications. This is achieved at a modest area overhead of 0.09 mm²/core."
Nice

But after some research I assume both patents are more related to CDNA than RDNA
 

blckgrffn

Diamond Member
May 1, 2003
9,198
3,185
136
www.teamjuchems.com
Do you really consider it a loss? I paid $800 for my 1080ti and I hope to retire it this year, but it won’t be a loss at all for me. I got my money’s worth out of it, and I can’t even sell it (it is going in another PC).

That's fair to ask - but if I could sell today and get nearly $400 and then in five weeks there is such a bountiful crop of AMD Navi cards that I lose ~$200 on resale, then that seems like lost money to me. Like deciding when to sell a stock...

And I've got a year's use out of this thing, so I could look at it as ~no cost per month of usage (sell now) or ~$20 per month (sell post launch) OR just pass it down to my son like I intended to and, like you, just get years of functional use out of it.

If used GPU prices hadn't been so crazy last year I probably would have tried to find a Vega 56 or something to nurse myself into RDNA2. I was so close to buying a Fury Nano on eBay for ~$105 shipped - I am kind of annoyed I didn't because of how niche that card was (I put in an offer for $100 and he countered at $105 and I let it expire)
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,761
14,785
136
Anyone else passing the time by reading the insane theories online?
I am getting bored about
1) Hearing about what AMD is going to do with their next video cards...
2) hearing about what Intel is GOING to do with anything !
3) Hearing about what AMD is doing with Zen3....
4) hearing about when we might be able to buy a 3000 series Ampere.

Come on ! I want something to discuss !!! 3000 is faster, but you can't buy it, so that leaves.....


NOTHING !!
 

reb0rn

Senior member
Dec 31, 2009
222
58
101
I must say there is so much misinformation that no one can even speculate on memory bandwidth to start from
like 16GB can only be 256/512bit or HBM2
and 12GB is 384bit

if it's just 256bit I can't see it being much, if at all, faster than the 3070
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
I am getting bored about
1) Hearing about what AMD is going to do with their next video cards...
2) hearing about what Intel is GOING to do with anything !
3) Hearing about what AMD is doing with Zen3....
4) hearing about when we might be able to buy a 3000 series Ampere.

Come on ! I want something to discuss !!! 3000 is faster, but you can't buy it, so that leaves.....


NOTHING !!
I personally love, maybe even loathe the threads on other sites of what AMD should do. How AMD can counter bots. How AMD has failed already based on leaks, namely the 256 bit bus engineering sample card, or that they should sell RTG off to NVidia.
 

Saylick

Diamond Member
Sep 10, 2012
3,394
7,158
136
Found this on Reddit. 20% IPC gains incoming?

https://adwaitjog.github.io/docs/pdf/sharedl1-pact20.pdf

Abstract:
Graphics Processing Units (GPUs) concurrently execute thousands of threads, which makes them effective for achieving high throughput for a wide range of applications. However, the memory wall often limits peak throughput. GPUs use caches to address this limitation, and hence several prior works have focused on improving cache hit rates, which in turn can improve throughput for memory intensive applications. However, almost all of the prior works assume a conventional cache hierarchy where each GPU core has a private local L1 cache and all cores share the L2 cache. Our analysis shows that this canonical organization does not allow optimal utilization of caches because the private nature of L1 caches allows multiple copies of the same cache line to get replicated across cores.
We introduce a new shared L1 cache organization, where all cores collectively cache a single copy of the data at only one location (core), leading to zero data replication. We achieve this by allowing each core to cache only a non-overlapping slice of the entire address range. Such a design is useful for significantly improving the collective L1 hit rates but incurs latency overheads from additional communications when a core requests data not allowed to be present in its own cache. While many workloads can tolerate this additional latency, several workloads show performance sensitivities. Therefore, we develop lightweight communication optimization techniques and a run-time mechanism that considers the latency-tolerance characteristics of applications to decide which applications should execute in private versus shared L1 cache organization and reconfigures the caches accordingly. In effect, we achieve significant performance and energy efficiency improvements, at a modest hardware cost, for applications that prefer the shared organization, with little to no impact on other applications.
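
For anyone who wants the idea in concrete terms, here is a toy sketch of private L1s versus an address-sliced shared L1. The abstract does not say how addresses map to cores, so the 128-byte line size, the modulo interleaving in `home_core`, the unlimited cache capacity, and the access trace are all assumptions made up for illustration.

```python
# Toy model: private L1s vs. an address-sliced shared L1.
# Assumptions (not from the paper): 128-byte lines, modulo interleaving,
# unlimited cache capacity, and a made-up access trace.
LINE_BYTES = 128


def home_core(addr, num_cores, line_bytes=LINE_BYTES):
    """Core whose L1 slice may hold this address (hypothetical mapping)."""
    return (addr // line_bytes) % num_cores


def cached_copies(trace, num_cores, shared):
    """Count line copies held across all L1s after replaying (core, addr) accesses."""
    l1 = [set() for _ in range(num_cores)]
    for core, addr in trace:
        line = addr // LINE_BYTES
        # Shared: the line lives only in its home slice. Private: in the requester.
        owner = home_core(addr, num_cores) if shared else core
        l1[owner].add(line)
    return sum(len(lines) for lines in l1)


# Four cores all read the same eight cache lines.
NUM_CORES = 4
trace = [(core, line * LINE_BYTES) for core in range(NUM_CORES) for line in range(8)]

private_copies = cached_copies(trace, NUM_CORES, shared=False)
shared_copies = cached_copies(trace, NUM_CORES, shared=True)
remote = sum(1 for core, addr in trace if home_core(addr, NUM_CORES) != core)

print(f"private L1 copies: {private_copies}")
print(f"shared L1 copies:  {shared_copies}")
print(f"remote accesses in shared mode: {remote} of {len(trace)}")
```

In this trace the private organization ends up with 32 line copies and the shared one with 8, but 24 of the 32 accesses have to go to another core's slice. That is the replication versus latency trade-off the abstract describes, and why they switch between the two modes at run time.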
 
Last edited:

blckgrffn

Diamond Member
May 1, 2003
9,198
3,185
136
www.teamjuchems.com
I must say there is so much misinformation that no one can even speculate on memory bandwidth to start from
like 16GB can only be 256/512bit or HBM2
and 12GB is 384bit

if it's just 256bit I can't see it being much, if at all, faster than the 3070

Fine. I promise that Big Navi is Hawaii reborn with a 512 bit bus. And infinity cache. Pinky swear.

Don’t ask for sources because I don’t have any. I just can’t let this train slow down.

If only I had a YouTube channel where I got paid per view 🤔
 

DiogoDX

Senior member
Oct 11, 2012
746
277
136
I must say there is so much misinformation that no one can even speculate on memory bandwidth to start from
like 16GB can only be 256/512bit or HBM2
and 12GB is 384bit

if it's just 256bit I can't see it being much, if at all, faster than the 3070
12GB can be 192bits too.
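
A quick way to check which bus widths line up with 12GB and 16GB is to enumerate the uniform GDDR6 layouts. This is a sketch under stated assumptions: every chip has a 32-bit interface, chips come in 8Gb (1GB) or 16Gb (2GB) densities, and clamshell mode puts two chips on each 32-bit channel. Mixed-density or uneven layouts are deliberately left out, and the list of bus widths is just a set of common values.

```python
# Enumerate uniform GDDR6 layouts that reach a target capacity.
# Assumptions: every chip has a 32-bit interface, chips are 8 Gb (1 GB) or
# 16 Gb (2 GB), and clamshell mode puts two chips on each 32-bit channel.
CHIP_SIZES_GB = (1, 2)
BUS_WIDTHS = (128, 192, 256, 320, 352, 384, 512)


def layouts(target_gb):
    """Return (bus_width_bits, chip_gb, clamshell) tuples that give target_gb."""
    found = []
    for bus in BUS_WIDTHS:
        channels = bus // 32
        for chip_gb in CHIP_SIZES_GB:
            for clamshell in (False, True):
                chips = channels * (2 if clamshell else 1)
                if chips * chip_gb == target_gb:
                    found.append((bus, chip_gb, clamshell))
    return found


for capacity in (12, 16):
    for bus, chip_gb, clamshell in layouts(capacity):
        mode = "clamshell" if clamshell else "normal"
        print(f"{capacity} GB: {bus}-bit, {chip_gb} GB chips, {mode}")
```

Under these assumptions 12GB comes out at 192-bit or 384-bit, and 16GB at 128-bit, 256-bit or 512-bit. Anything else needs chips of different sizes on the same bus.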
 

eek2121

Diamond Member
Aug 2, 2005
3,053
4,281
136
That's fair to ask - but if I could sell today and get nearly $400 and then in five weeks there is such a bountiful crop of AMD Navi cards that I lose ~$200 on resale, then that seems like lost money to me. Like deciding when to sell a stock...

And I've got a year's use out of this thing, so I could look at it as ~no cost per month of usage (sell now) or ~$20 per month (sell post launch) OR just pass it down to my son like I intended to and, like you, just get years of functional use out of it.

If used GPU prices hadn't been so crazy last year I probably would have tried to find a Vega 56 or something to nurse myself into RDNA2. I was so close to buying a Fury Nano on eBay for ~$105 shipped - I am kind of annoyed I didn't because of how niche that card was (I put in an offer for $100 and he countered at $105 and I let it expire)

Maybe if you consider GPUs an investment? I usually end up giving them away. I routinely rebuild PCs and give them to family, friends, and those less fortunate (not necessarily in that order). To me it's a sunk cost. A part of my hobby.

I must say there is so much misinformation that no one can even speculate on memory bandwidth to start from
like 16GB can only be 256/512bit or HBM2
and 12GB is 384bit

if it's just 256bit I can't see it being much, if at all, faster than the 3070
What's funny is that the bus size can actually be any size. Most people don't realize this, but yes, it's possible to have 16gb of GDDR6 and a 352 or 384 bit bus. There are a number of ways to do this (though to be fair they aren't used as far as I'm aware). I'll leave it to your imagination to figure this out.

Found this on Reddit. 20% IPC gains incoming?

https://adwaitjog.github.io/docs/pdf/sharedl1-pact20.pdf

Abstract:

It is my understanding that the actual "IPC" (in quotes because can one really use the term 'IPC'?) of the architecture, including everything (rendering, shaders, etc.) is closer to 7%. We will see, however. My information is based mostly on console related stuff. I've seen numerous rumors and leaks that indicate that PC RDNA2 parts are at least somewhat different from console parts, but I'm not sure those changes will help "IPC". AMD is going to reach performance by scaling CU count upwards. An "IPC" increase isn't needed. It's just icing on the cake. Coincidentally, a 50% perf/watt increase would allow them to have a 72CU part run at the same TDP and same clocks as the RX 5700 XT. Food for thought. Assuming that they are able to scale up performance with CU count, well...
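
As a sanity check on the CU arithmetic, here is a deliberately naive sketch. It assumes power scales linearly with CU count at fixed clocks and that the whole perf/watt gain is spent on more CUs; memory, fabric and board power are ignored, and the function names are made up for illustration. Real designs are not this linear.

```python
# Naive scaling: how many CUs fit in the same power budget at the same clocks?
# Assumptions: power scales linearly with CU count, and the whole perf/watt
# gain is spent on more CUs. Memory, fabric and board power are ignored.
BASE_CUS = 40  # RX 5700 XT


def cus_at_same_power(perf_per_watt_gain, base_cus=BASE_CUS):
    """CU count affordable at unchanged TDP and clocks under the linear model."""
    return base_cus * (1 + perf_per_watt_gain)


def gain_needed(target_cus, base_cus=BASE_CUS):
    """Perf/watt gain required to power target_cus at unchanged TDP and clocks."""
    return target_cus / base_cus - 1


print(f"CUs affordable with +50% perf/watt: {cus_at_same_power(0.50):.0f}")
print(f"Gain needed for 72 CUs: {gain_needed(72):.0%}")
```

Under this model a 50% gain covers about 60 CUs, and 72 CUs at the same TDP would need roughly an 80% gain, somewhat lower clocks, or a higher power budget.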

I know some people here may not understand the concept of AMD delivering solid execution, but they've been literally "executing" Intel. Anyone that claims they can't do the same thing to NVIDIA should stop posting here and short AMD stock.

EDIT: As an addendum to why "IPC" isn't really valid for GPUs, the "TFLOPs" measurement is the closest you'd get to IPC, which as you can see is wildly abused (NVIDIA claims double FP32 TFLOPs with the 3080 over the 2080ti, yet as we've witnessed, it performs 20-30% faster). Once you start factoring in geometry, textures, clocks, shaders, etc. all bets are off.

EDIT 2: As an example of why IPC can't really be measured. Vega64 has 12.66 TFLOPs of compute power, or nearly 30% more than the RX 5700 XT. However, you'll note that the RX 5700 XT beats the Vega 64 soundly in gaming. No, AMD isn't making up the TFLOPs number. Vega has really strong compute performance, but not so great gaming performance.
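
For reference, the paper TFLOPs numbers fall straight out of shader count and clock. The sketch below assumes one fused multiply-add (2 FLOPs) per stream processor per clock and uses the published boost clocks (1546 MHz for Vega 64, 1905 MHz for the RX 5700 XT); sustained clocks in games are lower, so treat these as theoretical peaks.

```python
# Peak FP32 throughput = 2 FLOPs (one fused multiply-add) x shaders x clock.
# Assumption: published boost clocks are used; sustained clocks are lower.
def peak_fp32_tflops(stream_processors, clock_mhz):
    """Theoretical peak FP32 TFLOPs with one FMA per shader per clock."""
    return 2 * stream_processors * clock_mhz * 1e6 / 1e12


vega64 = peak_fp32_tflops(4096, 1546)      # RX Vega 64
rx5700xt = peak_fp32_tflops(2560, 1905)    # RX 5700 XT

print(f"Vega 64:    {vega64:.2f} TFLOPs")
print(f"RX 5700 XT: {rx5700xt:.2f} TFLOPs")
print(f"Vega 64 advantage on paper: {vega64 / rx5700xt - 1:.0%}")
```

The formula says nothing about geometry, texturing, caches or how well the shaders are kept fed, which is why the paper advantage does not show up in games.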
 
Last edited:

Tup3x

Golden Member
Dec 31, 2016
1,012
1,002
136
Well, the new Xbox does have this weird memory layout. It might work for it since the CPU will mainly use the slower pool, but I'm not so sure how well it would work for a top end discrete GPU. The GTX 550 Ti also had a similar thing.
 

pandemonium

Golden Member
Mar 17, 2011
1,777
76
91
It really is.

If I'm understanding what they're laying out in theory, they want to basically AI the entire pipeline from the start, by task.

Given their wide range of compute tests they used, I can see this having an impact on real-time rendering. Like DLSS improving vastly over a generation, this could have broad ramifications for how efficiently GPGPUs handle their tasks.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,070
7,492
136
It really is.

If I'm understanding what they're laying out in theory, they want to basically AI the entire pipeline from the start, by task.

Given their wide range of compute tests they used, I can see this having an impact on real-time rendering. Like DLSS improving vastly over a generation, this could have broad ramifications for how efficiently GPGPUs handle their tasks.

- But is it going to be ready for a top to bottom RDNA2 stack? This type of radical technological shift looks like it would be a prime candidate for a pipe cleaner product or a mid gen refresh, not a top to bottom stack launch.

Wonder if this is the kind of thing that's being kept in the pipe for an RDNA3 launch or even further down the line.

After all the promises of the new pathways and discard accelerators etc in Vega and Polaris I would be less surprised if AMD managed to bork the physical design so the feature is useless than not.
 