AMD Vega (FE and RX) Benchmarks [Updated Aug 10 - RX Vega 64 Unboxing]

Page 66
Status
Not open for further replies.

tential

Diamond Member
May 13, 2008
7,355
642
121
Just as I predicted. Vega 11 will replace Polaris. Most likely will be similar to Vega 10 just with additional power saving features similar to Polaris 11 vs Polaris 10.
 
May 11, 2008
20,055
1,290
126
Pixel fillrate is definitely miles behind competitors' products. Even the GTX 1070 is quite a bit ahead.

If I am not mistaken:

With tile-based rendering, the idea is to split the screen into small tiles and to load all the data for a tile into on-chip storage, which avoids repeated accesses to external memory (such as HBM2 or GDDR). Reading, modifying and writing that data can then be done locally in (I suspect) the L2 cache. External memory is only needed to load the cache and to store the end result back. Generally speaking.
The ROPs (raster operations units) now work closely with the GPU's L2 cache, so I suspect the L2 cache has grown quite a bit to help reduce external memory access.
Also, the draw stream binner, which has the task of culling non-visible polygons, will help with hidden surface removal.

The trick with culling non-visible polygons (hidden surface removal) is that you also have to do fewer operations on pixels. A lower pixel fill rate is then less of an issue if the culling is successful. It all depends on the implementation, of course (and driver support)...

At least, that is the theory.
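
To make the binning idea a bit more concrete, here is a minimal sketch in Python. It is not how Vega or any real GPU does it; the 16-pixel tile size, the function name `bin_triangles` and the bounding-box test are all assumptions chosen for illustration. It only shows the first step: working out which screen tiles each triangle might touch, so that each tile can later be processed with its data held on chip.

```python
import math
from collections import defaultdict

def bin_triangles(triangles, width, height, tile=16):
    """Return {(tile_x, tile_y): [triangle indices]} using bounding-box overlap.

    Hypothetical helper: the tile size and the conservative bounding-box test
    are illustrative assumptions, not a description of real hardware.
    """
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for index, tri in enumerate(triangles):
        xs = [math.floor(x) for x, _ in tri]
        ys = [math.floor(y) for _, y in tri]
        # Tile range covered by the bounding box, clamped to the screen.
        tx0, tx1 = max(0, min(xs) // tile), min(max_tx, max(xs) // tile)
        ty0, ty1 = max(0, min(ys) // tile), min(max_ty, max(ys) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)

triangles = [
    [(1, 1), (10, 1), (1, 10)],      # fits inside a single tile
    [(20, 20), (40, 20), (20, 40)],  # bounding box spans four tiles
]
bins = bin_triangles(triangles, width=64, height=64)
for tile_xy in sorted(bins):
    print(tile_xy, bins[tile_xy])
```

A triangle that lies entirely off screen ends up with an empty tile range and is never binned, which is a very crude form of the culling described above.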
 

Tup3x

Golden Member
Dec 31, 2016
1,012
1,002
136
If I am not mistaken:

With tile-based rendering, the idea is to split the screen into small tiles and to load all the data for a tile into on-chip storage, which avoids repeated accesses to external memory (such as HBM2 or GDDR). Reading, modifying and writing that data can then be done locally in (I suspect) the L2 cache. External memory is only needed to load the cache and to store the end result back. Generally speaking.
The ROPs (raster operations units) now work closely with the GPU's L2 cache, so I suspect the L2 cache has grown quite a bit to help reduce external memory access.
Also, the draw stream binner, which has the task of culling non-visible polygons, will help with hidden surface removal.

The trick with culling non-visible polygons (hidden surface removal) is that you also have to do fewer operations on pixels. A lower pixel fill rate is then less of an issue if the culling is successful. It all depends on the implementation, of course (and driver support)...

At least, that is the theory.
Yet NVIDIA keeps adding more and more ROPs while using tile-based rendering, with good results.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Nvidia went from 2880:48 shaders:rops, to 1280:48 just two generations later. They must be doing something right. In the same time period (and same competing product stacks), AMD went from 2816:64 to 2304:32. The performance-per-flop and relative performance-per-area don't make me think anything right is going on there with AMD's decisions. Pixel fillrate must still be very important, but only one company is willing or able to chase higher amounts.
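
For what it's worth, the ratios behind those numbers can be worked out in a few lines. This is just arithmetic on the shader and ROP counts quoted above; the labels are placeholders, and it says nothing about clocks or real-world performance.

```python
# Shader:ROP counts exactly as quoted in the post; labels are placeholders.
configs = {
    "NVIDIA, earlier": (2880, 48),
    "NVIDIA, two generations later": (1280, 48),
    "AMD, earlier": (2816, 64),
    "AMD, later": (2304, 32),
}

def shaders_per_rop(shaders, rops):
    """How many shaders each ROP has to serve, on paper."""
    return shaders / rops

ratios = {name: shaders_per_rop(*cfg) for name, cfg in configs.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} shaders per ROP")
```

On these figures NVIDIA went from 60 to roughly 27 shaders per ROP, while AMD went from 44 to 72, which is the opposite direction.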
 
May 11, 2008
20,055
1,290
126
Yet NVIDIA keeps adding more and more ROPs while using tile-based rendering, with good results.

Implementations differ of course, since they are all heavily patented.
Also, as mentioned in my other post, tile-based rendering is very difficult to implement. Nvidia acquired the TBR technology when it bought 3dfx's assets in late 2000. Yet it took until Maxwell (2014) before they used it for high-end desktop PC graphics renderers.
AMD is just getting started. Give them some time.

EDIT:
Of course, that assumes AMD really does do TBR and it is not a marketing PR stunt.
 

n0x1ous

Platinum Member
Sep 9, 2010
2,572
248
106
I really thought we would see 128 ROPs in Vega when it was announced that it was supposed to be a major overhaul of GCN.
 
May 11, 2008
20,055
1,290
126
Nvidia went from 2880:48 shaders:rops, to 1280:48 just two generations later. They must be doing something right. In the same time period (and same competing product stacks), AMD went from 2816:64 to 2304:32. The performance-per-flop and relative performance-per-area don't make me think anything right is going on there with AMD's decisions. Pixel fillrate must still be very important, but only one company is willing or able to chase higher amounts.

I think that has something to do with the (Nvidia) warp size of 32 and the (AMD) wave size of 64.
If not all CUs are fully utilized, AMD has a problem.
But that is just my understanding from reading the technical information; I am not a GPU programmer.
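
A rough way to picture the utilization point, as a sketch only: if work is issued in fixed-width groups (32 or 64 lanes), any lanes in the last group that have no thread assigned sit idle. The function name and the simple "round up to whole groups" model below are assumptions for illustration; real schedulers are far more involved.

```python
import math

def lane_utilization(active_threads, group_width):
    """Fraction of lanes doing useful work when threads are packed into
    fixed-width groups (hypothetical model, not real scheduler behaviour)."""
    if active_threads <= 0 or group_width <= 0:
        raise ValueError("thread count and group width must be positive")
    groups = math.ceil(active_threads / group_width)
    return active_threads / (groups * group_width)

for threads in (32, 96, 128):
    print(threads,
          f"32-wide: {lane_utilization(threads, 32):.2f}",
          f"64-wide: {lane_utilization(threads, 64):.2f}")
```

In this toy model, a workload of 32 or 96 threads fills 32-wide groups completely but leaves a quarter to a half of the lanes idle in 64-wide groups, while 128 threads fills both.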
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
I really thought we would see 128 ROPs in Vega when it was announced that it was supposed to be a major overhaul of GCN.

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
 
Reactions: Head1985

advt.naveen

Junior Member
May 17, 2013
20
7
81
I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
Does that mean the Vega architecture (I'm talking about the micro-architecture, not RTL code writing) was designed entirely by an inexperienced team?
If so, then what does Raja's team in North America do? I thought ROP and CU counts are part of architecture design, and that this is taken care of by Raja and a few other members of the team. Several RTL teams then work according to the documented architecture.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
That's a fairly ridiculous thing to suggest. Given internal documentation and schematics, it's all down to their resources and ingenuity. There isn't anything hidden from them.

If AMD doesn't have those documents, we have far bigger problems than what employees they have.
 

Head1985

Golden Member
Jul 8, 2014
1,866
699
136
I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
Yeah, Vega should have 6 SE / 96 ROPs, resulting in +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus, resulting in 768 GB/s of bandwidth, or 4096-bit with 1 TB/s. In the current Vega there are just so many bottlenecks.
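
The bandwidth figures follow from bus width times per-pin data rate. As a quick check, assuming about 2 Gbps per pin (an assumption, back-solved from the 768 GB/s and 1 TB/s numbers above; the function name is made up):

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps_per_pin=2.0):
    """Theoretical peak bandwidth in GB/s.

    data_rate_gbps_per_pin defaults to 2.0, which is an assumption that
    matches the figures quoted in the post, not a measured value.
    """
    return bus_width_bits * data_rate_gbps_per_pin / 8  # 8 bits per byte

for width in (2048, 3072, 4096):
    print(f"{width}-bit: {peak_bandwidth_gb_s(width):.0f} GB/s")
```

At that data rate a 2048-bit bus tops out at 512 GB/s on paper, so each extra 1024 bits of bus adds another 256 GB/s.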
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Yeah, Vega should have 6 SE / 96 ROPs, resulting in +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus, resulting in 768 GB/s of bandwidth, or 4096-bit with 1 TB/s. In the current Vega there are just so many bottlenecks.
How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters came to this conclusion?

"What about cosmic radiation? What's the plan there?"
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
I believe that this was a test that AMD set the rules for, so you'd be a fool to believe that it had any chance of coming out in a way that doesn't make them look good.
You should watch the video rather than making inaccurate assumptions.

They could have used a 960, and had the same visual results.

Heck, just get a 1050, put it up against the Vega Rx, and the exact same setup, and then, nvidia could say, look, our GPU + monitor costs $500 less than Vega RX + monitor! You can't tell the difference.

This is yet more smoke & mirrors.

Frankly, this is the exact same kind of crap Intel is doing with the "glue" comments on Epyc & Threadripper. This is beneath AMD.
Have you done these comparisons? Because if you haven't you have no way of knowing, never mind verifying, the accuracy of your claims.

There's a lot of people seemingly saying anything they can to sway people's opinions. I don't see the point, really.

I am a little more surprised that HardOCP repeated this kind of silly test that AMD was doing at its tour.

Actually, except for not taking measurements, [H] tests like this all of the time.

Again, someone else simply making completely misleading statements. Why?

Because he could just as well have used a GTX 1070 and had the same results. All this kind of testing proves is that people are poor at determining frame rate once it gets over a reasonable threshold. Especially if you have some kind of variable sync on top.
They weren't attempting to determine frame rates at all. And without actually performing the test with a 1070 there is no way to claim any accuracy regarding your statement. It's simply hyperbole.
 
Reactions: Kuosimodo

Head1985

Golden Member
Jul 8, 2014
1,866
699
136
How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters came to this conclusion?
Pretty likely at this point. Fury X and Vega still have tons of bottlenecks because of 4x SE and low memory bandwidth.
Maybe they know they have bottlenecks, but they surely don't know how to fix them (6x SE and 96 ROPs).
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Pretty likely at this point. Fury X and Vega still have tons of bottlenecks because of 4x SE and low memory bandwidth.
Maybe they know they have bottlenecks, but they surely don't know how to fix them (6x SE and 96 ROPs).
Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 4 SEs are fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.

Just because they're lagging behind NVIDIA in a theoretical sense on some aspects (like max pixel fill rate), doesn't mean that's what's holding back their performance.

If it was as simple as more ROPs they would have done it. Vega unlike Fiji had room to spare.
 
Reactions: Kuosimodo and Krteq

tential

Diamond Member
May 13, 2008
7,355
642
121
Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 4 SEs are fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.

Just because they're lagging behind NVIDIA in a theoretical sense on some aspects (like max pixel fill rate), doesn't mean that's what's holding back their performance.

If it was as simple as more ROPs they would have done it. Vega unlike Fiji had room to spare.
Ya, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
 

Zstream

Diamond Member
Oct 24, 2005
3,396
277
136
Ya, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.

Which reminds me, my settings keep reverting back to default. I'll have to look into that.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 4 SEs are fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.

Just because they're lagging behind NVIDIA in a theoretical sense on some aspects (like max pixel fill rate), doesn't mean that's what's holding back their performance.

If it was as simple as more ROPs they would have done it. Vega unlike Fiji had room to spare.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
 
Reactions: VirtualLarry

.vodka

Golden Member
Dec 5, 2014
1,203
1,537
136
If that's the case, now that they have some nice cash coming in thanks to Ryzen, they should hire these guys again to fix the mess.
 

CatMerc

Golden Member
Jul 16, 2016
1,114
1,153
136
Ya, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
1. Do yields mean anything to you? Because yes, AMD knows best about their spread of chips and what voltage they would need to get a certain clockspeed with x% of chips.
2. AMD overvolting their chips is a myth. NVIDIA has even bigger voltage safety guards than AMD, and they undervolt just as well.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
If it's completely alien technology to them, they wouldn't be able to create chips from it. It's not like Polaris or Fiji have the same floorplan or configuration of previous designs.

Any company keeps documentation, and even if by some bizarre stroke of ignorance AMD managed not to do so, they have enough smart people to reverse engineer the thing. We have a ton of people on forums like Beyond3D who know GCN chips very intimately, so to think AMD employees with all the support and equipment can't do so is just absurd.
 
Reactions: Kuosimodo

advt.naveen

Junior Member
May 17, 2013
20
7
81
Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.
GCN is one of the industry's leading technologies; no company throws its important tech to a totally inexperienced team.

 

tential

Diamond Member
May 13, 2008
7,355
642
121
GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.
GCN is one of the industry's leading technologies; no company throws its important tech to a totally inexperienced team.

They throw it to the best team they can afford.... That's the point being made.
 