EXCellR8
Diamond Member
- Sep 1, 2010
if this architecture doesn't work out at least we'll be able to say... "what happens in Vega, stays in Vega"
... I'll show myself out
Pixel fillrate is definitely miles behind competitors' products. Even a GTX 1070 is quite a bit ahead.
Yet NVIDIA keeps adding more and more ROPs while using tile-based rendering, with good results.

If I am not mistaken:
With tile-based rendering, the idea is to split the screen into small tiles; all the data for a tile is loaded and stored locally on chip, preventing external memory accesses (to HBM2 or GDDR). Any reading, modifying, and writing of data can then be done locally, in (I suspect) the L2 cache. Only the initial load of the cache from external memory, and the store of the end result back to external memory, are needed. Generally speaking.
The ROP (Raster Operations) units now work closely with the GPU's L2 cache, so I suspect the L2 cache has grown quite a bit to help reduce external memory accesses.
The draw-stream binning rasterizer, which has the task of culling non-visible polygons, will also help with hidden surface removal.
The trick with culling non-visible polygons (hidden surface removal) is that you then have to perform fewer operations on pixels, so a lower pixel fillrate is less of an issue if the culling is successful. It all depends on the implementation, of course (and on driver support)...
At least, that is the theory.
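That theory can be sketched in a few lines. This is a toy illustration of my own (tile sizes, data structures, and the single-write-back behavior are all assumptions for clarity, not how NVIDIA's or AMD's hardware actually works): triangles are binned to the screen tiles they overlap, each tile is processed entirely in a small local buffer, and only the finished tile is written back to "external memory" once.

```python
# Toy sketch of tile-based rendering. Illustrative only: real hardware
# rasterizes, depth-tests, and shades per tile; here we just show the
# binning and the tile-local buffer with a single write-back per tile.

TILE = 16  # tile edge in pixels (hypothetical size)

def aabb(tri):
    """Axis-aligned bounding box of a 2D triangle [(x, y), ...]."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return min(xs), min(ys), max(xs), max(ys)

def bin_triangles(tris, width, height):
    """Assign each triangle to every screen tile its bounding box overlaps."""
    bins = {}
    for tri in tris:
        x0, y0, x1, y1 = aabb(tri)
        for ty in range(max(0, y0 // TILE), min(height - 1, y1) // TILE + 1):
            for tx in range(max(0, x0 // TILE), min(width - 1, x1) // TILE + 1):
                bins.setdefault((tx, ty), []).append(tri)
    return bins

def render(tris, width, height):
    framebuffer = {}  # stands in for external memory (HBM2 / GDDR)
    for (tx, ty), tile_tris in bin_triangles(tris, width, height).items():
        tile_buf = {}  # stands in for on-chip storage (e.g. the L2 cache)
        for tri in tile_tris:
            # a real GPU would rasterize and depth-test every covered pixel
            # here; we only mark the tile-origin pixel for brevity
            tile_buf[(tx * TILE, ty * TILE)] = True
        framebuffer.update(tile_buf)  # single write-back per finished tile
    return framebuffer

# One triangle spanning 40x40 pixels touches a 3x3 block of 16-pixel tiles.
fb = render([[(0, 0), (40, 0), (0, 40)]], 64, 64)
```

The point of the sketch is the memory-traffic pattern: everything between the bin read and the final `framebuffer.update` stays in `tile_buf`, which is exactly where hidden-surface removal pays off, since culled triangles never touch the tile buffer at all.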
Yet NVIDIA keeps adding more and more ROPs while using tile based rendering, with good results.
Nvidia went from 2880:48 shaders:ROPs to 1280:48 just two generations later; they must be doing something right. In the same period (and in the same competing product stacks), AMD went from 2816:64 to 2304:32. The performance per flop and the relative performance per area don't make me think anything right is going on with AMD's decisions. Pixel fillrate must still be very important, but only one company is willing or able to chase higher amounts.
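For a rough sanity check on those shader:ROP configurations: peak pixel fillrate is just ROPs × core clock. The clocks below are approximate reference boost clocks (my assumption), so treat the results as ballpark figures only.

```python
# Rough peak pixel-fillrate arithmetic for the configurations mentioned
# above. Clocks are approximate reference boost clocks, not guaranteed.

def gpixels_per_s(rops, clock_mhz):
    """Peak fillrate in GPixels/s = ROPs * clock (MHz) / 1000."""
    return rops * clock_mhz / 1000.0

cards = {
    "GTX 780 Ti (2880:48)": (48, 928),   # ~45 GPix/s
    "GTX 1060  (1280:48)": (48, 1708),   # ~82 GPix/s
    "R9 290X   (2816:64)": (64, 1000),   # ~64 GPix/s
    "RX 480    (2304:32)": (32, 1266),   # ~41 GPix/s
}

for name, (rops, mhz) in cards.items():
    print(f"{name}: ~{gpixels_per_s(rops, mhz):.0f} GPix/s")
```

Even as ballpark numbers, they show why clock speed matters as much as ROP count: the 1060 keeps 48 ROPs but nearly doubles the clock, while the RX 480 halves the ROPs and only partially compensates with frequency.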
I really thought we would see 128 ROPs in Vega when it was announced, since it was supposed to be a major overhaul of GCN.
Does that mean the Vega architecture (I'm talking about the micro-architecture, not the RTL coding) was fully designed by an inexperienced team?

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
That's a fairly ridiculous thing to suggest. Given internal documentation and schematics, it's all down to their resources and ingenuity. There isn't anything hidden from them.

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
Yeah, Vega should have 6 SEs/96 ROPs, giving +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus for 768 GB/s of bandwidth, or a 4096-bit bus for 1 TB/s. The current Vega just has so many bottlenecks.

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters come to the same conclusion?

Yeah, Vega should have 6 SEs/96 ROPs, giving +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus for 768 GB/s of bandwidth, or a 4096-bit bus for 1 TB/s. The current Vega just has so many bottlenecks.
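The bandwidth figures being thrown around follow directly from the HBM bus-width arithmetic: bandwidth (GB/s) = bus width (bits) × per-pin data rate (Gbit/s) / 8. The 2.0 Gbps per-pin rate below is an assumption that matches the poster's numbers; shipping Vega 64 actually runs its HBM2 at about 1.89 Gbps on a 2048-bit bus.

```python
# HBM bandwidth arithmetic behind the 768 GB/s and 1 TB/s figures above.
# Each HBM stack contributes a 1024-bit-wide interface.

def bandwidth_gbs(bus_bits, gbps_per_pin):
    """Memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_bits * gbps_per_pin / 8

print(bandwidth_gbs(2048, 1.89))  # Vega 64 as shipped: ~484 GB/s
print(bandwidth_gbs(3072, 2.0))   # hypothetical 3-stack Vega: 768 GB/s
print(bandwidth_gbs(4096, 2.0))   # hypothetical 4-stack Vega: 1024 GB/s (~1 TB/s)
```

So a 3-stack or 4-stack Vega at 2.0 Gbps would indeed land on 768 GB/s and ~1 TB/s respectively, at the cost of a larger interposer and more HBM2 stacks.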
You should watch the video rather than making inaccurate assumptions.

I believe that this was a test that AMD set the rules for, so you'd be a fool to believe it had any chance of coming out in a way that doesn't make them look good.
Have you done these comparisons? Because if you haven't, you have no way of knowing, never mind verifying, the accuracy of your claims.

They could have used a 960 and had the same visual results.
Heck, just get a 1050, put it up against RX Vega with the exact same setup, and then NVIDIA could say: look, our GPU + monitor costs $500 less than RX Vega + monitor, and you can't tell the difference.
This is yet more smoke & mirrors.
Frankly, this is the exact same kind of crap Intel is doing with the "glue" comments on Epyc & Threadripper. This is beneath AMD.
I am a little more surprised that HardOCP repeated the kind of silly test that AMD was doing on its tour.
They weren't attempting to determine frame rates at all. And without actually performing the test with a 1070, there is no way to claim any accuracy for your statement; it's simply hyperbole.

Because he could just as well have used a GTX 1070 and had the same results. All this kind of testing proves is that people are poor at judging frame rate once it gets over a reasonable threshold, especially with some kind of variable sync on top.
Pretty big at this point. Fury X and Vega still have tons of bottlenecks because of the 4 shader engines and low memory bandwidth.

How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters come to the same conclusion?
Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 6 SEs is fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.

Pretty big at this point. Fury X and Vega still have tons of bottlenecks because of the 4 shader engines and low memory bandwidth.
Maybe they know they have bottlenecks, but surely they don't know how to fix them (6x SE and 96 ROPs).
Yeah, AMD is so smart we could never know more than them.

Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 6 SEs is fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.
Just because they're lagging behind NVIDIA in a theoretical sense on some aspects (like max pixel fill rate), doesn't mean that's what's holding back their performance.
If it was as simple as more ROPs, they would have done it; Vega, unlike Fiji, had room to spare.
Yeah, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
1. Do yields mean anything to you? Because yes, AMD knows best about their spread of chips and what voltage they need to hit a certain clock speed on x% of chips.

Yeah, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
If it were completely alien technology to them, they wouldn't be able to create chips from it. It's not as if Polaris or Fiji have the same floorplan or configuration as previous designs.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
They throw it to the best team they can afford... that's the point being made.

GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.
GCN is one of the industry's leading technologies; no company hands its important tech to a totally inexperienced team.