AMD Vega (FE and RX) Benchmarks [Updated Aug 10 - RX Vega 64 Unboxing]


Paul98

Diamond Member
Jan 31, 2010
3,732
199
106
Hopefully moving forward AMD will have a far more scalable architecture - I don't have any deep knowledge on this. It seems that NVidia just scales its chips up from a base design; you can see it just by looking at the different specs from, say, the 1060 to the 1080 Ti. Whereas AMD has far more that stays static when they scale up.
 
Reactions: GodisanAtheist

urvile

Golden Member
Aug 3, 2017
1,575
474
96
I don't know why you guys are still speculating about gaming performance. Vega cards will sell out 5 minutes after launch, or go for $1000+, because of miners. And it will continue for a few years. You won't be able to buy Vega for gaming unless you pay $1000+, so don't bother.
Bitcoin is up $700 from two days ago and at an all-time high, and other coins are going up as well. There will be an even bigger mining craze in a few days/weeks than the current one.

I have read that is why AMD are selling Vega cards in gaming bundles - in a bid to discourage miners. If the benchmarks are good I will still buy the AIO variant, but it all depends on price relative to the GTX 1080 Ti. I have waited all this time for Vega because I have a FreeSync monitor, but I get the feeling I will be buying an AIO GTX 1080 Ti. Which will be slightly annoying, given I could have bought one months ago.
 
Reactions: Kuosimodo

IllogicalGlory

Senior member
Mar 8, 2013
934
346
136
There are lots of reasons why such an extrapolation of expected performance is just silly.
Are you expecting a performance per shader per clock regression? Because if not, it seems like a reasonable extrapolation. If you're looking for justification from a technical standpoint, it may be silly, as we know little or nothing about the details of Volta, but looking at the trend of NVIDIA's designs and generational updates, it seems that 40-70% performance increases at the same power are the order of the day.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
www.facebook.com
Hopefully moving forward AMD will have a far more scalable architecture - I don't have any deep knowledge on this. It seems that NVidia just scales its chips up from a base design; you can see it just by looking at the different specs from, say, the 1060 to the 1080 Ti. Whereas AMD has far more that stays static when they scale up.

AMD had issues scaling down, too. Polaris 11 is about 15% slower than GP107 and uses about 40-50% more juice.
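A quick back-of-envelope calculation (a sketch using the figures in the post above, with the power delta taken at its 45% midpoint, not measured data) shows how large that perf/W gap actually is:

```python
# Rough perf-per-watt gap implied by "about 15% slower, 40-50% more juice".
# GP107 is normalized to 1.0 on both axes; 1.45 is the midpoint of the
# quoted power range. Illustrative only, not measured data.
gp107_perf, gp107_power = 1.00, 1.00
p11_perf, p11_power = 0.85, 1.45

gp107_eff = gp107_perf / gp107_power
p11_eff = p11_perf / p11_power

print(f"GP107 perf/W advantage: ~{gp107_eff / p11_eff:.1f}x")  # roughly 1.7x
```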
 
Last edited:

Elixer

Lifer
May 7, 2002
10,376
762
126
I don't know why you guys are still speculating about gaming performance. Vega cards will sell out 5 minutes after launch, or go for $1000+, because of miners. And it will continue for a few years. You won't be able to buy Vega for gaming unless you pay $1000+, so don't bother.
A few years? Vega won't be around in a few years, that will be Navi.

Yes, all kidding aside, if (and it still is a big if) Vega turns out to be a cryptomining beast in things other than Monero, then yes, we will see lots of Polaris cards up for sale crashing that market, and little stock of Vega, unless AMD farms out Vega to Samsung's fabs.

I actually wouldn't mind seeing AMD use another foundry, and we can finally get to the bottom of how good (or bad) GloFo has been for them.

BTW, the "bundles" AMD has wouldn't actually stop any miners, since it seems you can just pay $100 more for the card and that is it. You aren't forced to buy the other stuff.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
7,069
7,492
136
Hopefully moving forward AMD will have a far more scalable architecture - I don't have any deep knowledge on this. It seems that NVidia just scales its chips up from a base design; you can see it just by looking at the different specs from, say, the 1060 to the 1080 Ti. Whereas AMD has far more that stays static when they scale up.

- There is a lot of speculation that Navi is going to be a Zen-style MCM, but I'm starting to think AMD is going to make its GCN arch scalable within the die, like NV has managed to do since Fermi with the Graphics Processing Cluster (each cluster on its own could be a video card, and multiple clusters are linked together through the L2 cache to create, yup, a scalable arch with fixed processing ratios up and down the product stack).

Source: http://www.anandtech.com/show/2977/...tx-470-6-months-late-was-it-worth-the-wait-/3

NV has basically tweaked their Fermi arch to get the GPC as efficient as possible from Fermi through Pascal while adding functionality around the edges (such as tile based rendering). Then they just plug more GPC units together or take units out as the node and chip position require and presto, a complete line-up.

Source (and you can keep going through the 980 and 1080 reviews): http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/2

On the other hand GCN, while a fantastic arch when it first launched, is a fixed arch that caps out at 4 shader engines according to AMD themselves (Source: http://www.anandtech.com/show/9390/the-amd-radeon-r9-fury-x-review/4) and has effectively hit the wall. Vega is still Fiji, with tweaking around the edges to try and extract maximum performance. We all knew this on some level when Vega was announced with the exact same shader count as Fiji.

Navi will likely bring per-die scalability in line with what NV has been doing for ages, but I think we will still have to wait a generation or two before we see a GPU that scales past a single die on an interposer. Vega is GCN brought to the brink, but I assume a lot of the fringe development put into Vega will allow AMD to focus on the "clean sheet" arch redesign they desperately need.
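To make the "plug more clusters together" idea concrete, here is a minimal sketch of a fixed-ratio building block generating a product stack. The GPC/SM/core figures follow the public consumer-Pascal specs; ROPs and memory are left out since they scale with the memory controllers rather than the GPCs:

```python
# Cluster scaling: a product stack built by multiplying one self-contained
# building block (the GPC in consumer Pascal).
SMS_PER_GPC = 5       # consumer Pascal packs 5 SMs into each GPC
CORES_PER_SM = 128    # 128 CUDA cores per consumer-Pascal SM

def stack_entry(chip, gpcs):
    sms = gpcs * SMS_PER_GPC
    return {
        "chip": chip,
        "gpcs": gpcs,
        "cuda_cores": sms * CORES_PER_SM,
        "tris_per_clock": gpcs,  # one raster engine (1 tri/clk) per GPC
    }

# GP106 (GTX 1060), GP104 (GTX 1080), GP102 (GTX 1080 Ti, full die)
for chip, gpcs in [("GP106", 2), ("GP104", 4), ("GP102", 6)]:
    print(stack_entry(chip, gpcs))
```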
 

coercitiv

Diamond Member
Jan 24, 2014
6,403
12,864
136
BTW, the "bundles" AMD has, wouldn't actually stop any miners, since, it seems you can just pay $100 more for the card, and that is it. You aren't forced to by the other stuff.
Bingo!

When someone charges $100 for a premium Founders Edition, it's smart & efficient marketing. When someone else charges $100 extra for a premium Limited Edition and throws in some optional discounts, it's a desperate move to grab some attention.
 

railven

Diamond Member
Mar 25, 2010
6,604
561
126
Well, to be fair, all the bundles come with the two games, which require no effort to claim.

So I'd expect a bunch of keys going for dirt cheap if miners do gobble up the bundles.
 
Reactions: Kuosimodo

urvile

Golden Member
Aug 3, 2017
1,575
474
96
Well, Vega will be on pre-order here soon. However, I am not comfortable with pre-ordering it. Why do I keep thinking GTX 1080 Ti? Specifically the Gigabyte AORUS Waterforce Xtreme Edition. Why?
 

amenx

Diamond Member
Dec 17, 2004
4,012
2,284
136
Well, Vega will be on pre-order here soon. However, I am not comfortable with pre-ordering it. Why do I keep thinking GTX 1080 Ti? Specifically the Gigabyte AORUS Waterforce Xtreme Edition. Why?
Can't see why anyone would pre-order anything, from either side, without full and comprehensive product reviews.
 
Reactions: Kuosimodo and Konan

urvile

Golden Member
Aug 3, 2017
1,575
474
96
Well, Vega will be on pre-order here soon. However, I am not comfortable with pre-ordering it. Why do I keep thinking GTX 1080 Ti? Specifically the Gigabyte AORUS Waterforce Xtreme Edition. Why?
Can't see why anyone would pre-order anything, from either side, without full and comprehensive product reviews.

Sometimes I like to live dangerously. I am sure plenty of people purchased a (reference) GTX 1080 at full price when they were available for pre-order. Horses for courses, I guess.
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
Because the 4096-core GCN chips are bottlenecked by low ROP counts (64) and a narrow front-end (4 triangles/clock). In contrast, both GM200 (980 Ti) and GP102 (1080 Ti) have 96 ROPs and can do 6 triangles/clock.
Also, as noted previously, Vega isn't delivering expected memory bandwidth. They wanted to double bandwidth per pin over Fury X (this is even touted in the release day slides) but only got ~1.8x per pin, and that only by overvolting and overclocking. And it seems to be thermally throttled as well (liquid cooling or better air cooling on AIB cards might help with that, at least).
Tile-based rasterization helps lift this problem, because you are basically culling unused parts of the displayed image. Maxwell and consumer Pascal GPUs are the same at the low level of the architecture: 128 cores per 256 KB of register file. The GTX 980 Ti has 96 ROPs with over 300 GB/s of memory bandwidth, and the GTX 1080 has 64 ROPs with over 300 GB/s of memory bandwidth and almost the same number of CUDA cores, but the GTX 1080 is clocked much higher. And there is no problem there.
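Putting numbers on the 96-ROP vs 64-ROP comparison (reference boost clocks; sustained clocks differ, so this is only a rough sketch):

```python
# Theoretical pixel fill rate = ROPs x clock. The GTX 1080's clock advantage
# roughly cancels its ROP deficit versus the GTX 980 Ti.
cards = {
    "GTX 980 Ti (GM200, 96 ROPs)": (96, 1075),  # ROPs, reference boost MHz
    "GTX 1080 (GP104, 64 ROPs)":   (64, 1733),
}

for name, (rops, mhz) in cards.items():
    print(f"{name}: {rops * mhz / 1000:.0f} Gpix/s theoretical")
# ~103 vs ~111 Gpix/s
```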

I genuinely suggest withholding judgement about Vega until the software matures. Fiji was bottlenecked by load balancing, massive stalls, and underutilization of the GPU. Vega should not have this problem, for three reasons. First, memory management that helps lift stalls by portioning the data available to the cores. Second, a new scheduler that is able to work with more than 4 shader engines (this is the basis not only for Vega, but for future architectures). Third, load balancing that clocks the shader engines up and down depending on the load they are working on: each shader engine works on its own part of the display, and some parts of the display may have more complex scenes loaded with more geometry, so this technique should improve the balancing (upclocking the shader engines with a heavier load and downclocking the ones with a smaller load). This alone should reduce stalls in the pipeline. The last feature relies entirely on the GPU's BIOS working in perfect sync with the drivers and the software.

And one last bit: Primitive Shaders allow GCN5 to register 10 triangles/clock with 4 shader engines. This alone should make Vega 40% faster per clock than Fiji. All of the architectural improvements, together with higher clock speeds, should make Vega two times faster than the Fury X, and around 30% faster than the GTX 1080 Ti.

I have written this before, and I am writing it again today: there is nothing in the hardware of GCN5 that could result in bottlenecking. Software - that is a different story.
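For reference, the arithmetic behind that claim works out as follows; this simply multiplies the figures given above (a 40% per-clock uplift and the clock bump from the Fury X's 1050 MHz to Vega's ~1600 MHz), it is not a measurement:

```python
# Multiplying out the claimed figures, not measured results.
per_clock_gain = 1.40   # claimed per-clock uplift over Fiji
fury_x_clock = 1.05     # GHz
vega_clock = 1.60       # GHz (claimed boost)

overall = per_clock_gain * (vega_clock / fury_x_clock)
print(f"Implied speedup over Fury X: {overall:.2f}x")  # ~2.13x
```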
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
I am just extrapolating GV102 perf from GV100 perf: 15 TFLOPS vs GP100's 10.6 TFLOPS. That's a 41.5% perf increase.

http://www.anandtech.com/show/11367...v100-gpu-and-tesla-v100-accelerator-announced

We can surely expect a 40-50% perf increase over GP102 from GV102.
A 40% performance increase is something you will only get by changing the architectural layout from 128 cores per 256 KB of register file to 64 cores per 256 KB of register file.

In other words: look up performance comparisons between the GP100 and the GTX Titan X (Pascal), and you will get a picture of how consumer GV chips will compare to Pascal, clock for clock, core for core.
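For reference, the 41.5% in the quote falls straight out of the FP32 throughput formula (2 FLOPs per core per clock at the published boost clocks); a quick sketch:

```python
# FP32 TFLOPS = 2 FLOPs (one FMA) per core per clock.
def tflops(cores, boost_ghz):
    return 2 * cores * boost_ghz / 1000

gp100 = tflops(3584, 1.480)   # Tesla P100 -> ~10.6 TFLOPS
gv100 = tflops(5120, 1.455)   # Tesla V100 -> ~14.9 TFLOPS

print(f"GP100 ~{gp100:.1f} TFLOPS, GV100 ~{gv100:.1f} TFLOPS, "
      f"increase ~{gv100 / gp100 - 1:.0%}")  # about +40%
```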
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
A few years? Vega won't be around in a few years, that will be Navi.

Yes, all kidding aside, if (and it still is a big if) Vega turns out to be a cryptomining beast in things other than Monero, then yes, we will see lots of Polaris cards up for sale crashing that market, and little stock of Vega, unless AMD farms out Vega to Samsung's fabs.

I actually wouldn't mind seeing AMD use another foundry, and we can finally get to the bottom of how good (or bad) GloFo has been for them.

BTW, the "bundles" AMD has wouldn't actually stop any miners, since it seems you can just pay $100 more for the card and that is it. You aren't forced to buy the other stuff.
Navi is GCN5, but with small additions. Vega is the foundation for all future AMD GPUs, at least at the architectural level.
 

Malogeek

Golden Member
Mar 5, 2017
1,390
778
136
yaktribe.org
Are you expecting a performance per shader per clock regression? Because if not, it seems like a reasonable extrapolation. If you're looking for justification from a technical standpoint, it may be silly, as we know little or nothing about the details of Volta, but looking at the trend of NVIDIA's designs and generational updates, it seems that 40-70% performance increases at the same power are the order of the day.
It really depends on how big they decide to make the chip. GV100 has over 40% more CUDA cores and its die is also more than 30% larger than GP100's, so of course it's going to be faster. I fully expect amazing perf/W, even better than Pascal, but I don't think people should expect a 50% performance improvement from a same-size chip.
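A quick check of both ratios, assuming the published Tesla V100 / Tesla P100 core counts and die sizes:

```python
# Ratio check: CUDA cores and die area, Tesla V100 (GV100) vs Tesla P100 (GP100).
gv100 = {"cores": 5120, "die_mm2": 815}
gp100 = {"cores": 3584, "die_mm2": 610}

core_gain = gv100["cores"] / gp100["cores"] - 1      # ~+43%
die_gain = gv100["die_mm2"] / gp100["die_mm2"] - 1   # ~+34%

print(f"CUDA cores: +{core_gain:.0%}, die area: +{die_gain:.0%}")
```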
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
It'll be close enough to that. Their annual upgrade market then basically dictates how big they'll make the chip - GV104 ~10% faster than GP102, GV102 a significant chunk (~30%?) faster than GV104, etc.

The only thing that might make them think twice would be if they were threatening to blow the power budget, but GV100 shows that should be OK.
 

Muhammed

Senior member
Jul 8, 2009
453
199
116
All of the architectural improvements, together with higher clock speeds, should make Vega two times faster than the Fury X, and around 30% faster than the GTX 1080 Ti.
My god, you'd think that with all the official information from AMD, people would not resort to conjecture and empty hype, but here we are again! AMD says its product is 1080-level, but one dude says it's 1080 Ti-level with time! LOL!
 

Malogeek

Golden Member
Mar 5, 2017
1,390
778
136
yaktribe.org
My god, you'd think that with all the official information from AMD, people would not resort to conjecture and empty hype, but here we are again! AMD says its product is 1080-level, but one dude says it's 1080 Ti-level with time! LOL!
Fine Wine Technology. People don't buy GPUs according to what they could be in a couple of years, though.
 
Reactions: Kuosimodo

insertcarehere

Senior member
Jan 17, 2013
639
607
136
Can't see why anyone would pre-order anything, from either side, without full and comprehensive product reviews.

On the contrary, if one perceives that miners will gobble them up and raise prices along the way, why not pre-order as many as you can and flip them for inflated prices?
 
Reactions: tonyfreak215

Muhammed

Senior member
Jul 8, 2009
453
199
116
Fine Wine Technology. People don't buy GPUs according to what they could be in a couple of years, though.
That's the problem: there is no such thing as FineWine. People got excited about it in the Kepler era, when Kepler cards fell behind their AMD counterparts. This didn't happen with Fermi, Maxwell, Pascal or any other generation, however. NV remains faster overall; they might have their lead trimmed a bit, but they still remain faster.
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
My god, you'd think that with all the official information from AMD, people would not resort to conjecture and empty hype, but here we are again! AMD says its product is 1080-level, but one dude says it's 1080 Ti-level with time! LOL!
So you are saying that acknowledging that AMD's architecture has higher per-core throughput than consumer Pascal GPUs, and higher geometry throughput than Pascal GPUs, is creating hype?

The GCN architecture is built from four 16-wide SIMDs in each CU, accounting for 64 cores, tied to a 256 KB register file per 64 cores.

Each wavefront in GCN is 64 threads wide, and each warp in Nvidia is 32 threads wide.

Kepler architecture was bound by 256 KB RFS per 192 cores.
Maxwell was bound by 256 KB RFS per 128 cores
Consumer Pascal is bound by 256 KB RFS per 128 cores.
GP100 - 64 cores/ 256 KB Register File Size.
GV100 - 64 cores/256 KB RFS.

What this means is that 128 Maxwell cores had roughly the same throughput and performance as 192 Kepler cores.


Now compare notes: a 64-thread wavefront processed by an architecture with a 256 KB register file per 64 cores, versus a 32-thread warp processed by an architecture with a 256 KB register file per 128 cores. Which one will be quicker?
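Side by side, the register budgets listed above look like this; a sketch of only the one dimension being argued (register-file capacity per core), which says nothing about clocks, occupancy or scheduling:

```python
# Register-file budget per core implied by the layouts listed in the post.
RFS_KB = 256  # KB of register file per core group, as listed above

layouts = {
    "Kepler (192 cores per 256 KB)":               192,
    "Maxwell / consumer Pascal (128 per 256 KB)":  128,
    "GCN CU / GP100 / GV100 (64 per 256 KB)":       64,
}

for name, cores in layouts.items():
    print(f"{name}: {RFS_KB / cores:.2f} KB of registers per core")
# 1.33, 2.00 and 4.00 KB per core respectively
```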

Previous generations of GCN were bound by geometry performance: 1 triangle per clock, per shader engine.
The same goes for Nvidia: 1 triangle per clock, per SM. This is where the differences start to go Nvidia's way.
If Nvidia has 6 SMs it can register 6 triangles each clock, and it is able to achieve higher clocks, so the pipeline can be filled much more often than on GCN.

GCN can process more work each cycle, but cannot achieve clocks as high.

This is where GCN5 comes in. It changed everything at the architecture level to mitigate these downsides. Primitive Shaders and the Programmable Geometry Pipeline allow GCN to register 10 triangles each clock with 4 shader engines, and the architecture is able to clock itself up to 1.6 GHz. There is no reason to believe that a GPU architecture with higher overall throughput would, in properly optimized software, be behind the latest Nvidia architectures, including Volta.
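Turning those per-clock rates into per-second rates, using the figures given in this post and approximate boost clocks (real setup rates depend heavily on culling and the workload, so treat this as a sketch):

```python
# Peak setup rate = triangles/clock x clock, using the rates cited in the post.
configs = {
    "Fiji (4 SE x 1 tri/clk)":              (4, 1.05),   # GHz
    "GP102 (6 x 1 tri/clk, per the post)":  (6, 1.58),
    "Vega w/ primitive shaders (claimed)":  (10, 1.60),
}

for name, (tris, ghz) in configs.items():
    print(f"{name}: {tris * ghz:.1f} Gtri/s peak")
# ~4.2, ~9.5 and ~16.0 Gtri/s respectively
```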

A 4096-core GCN chip, with 64-thread wavefronts processed by 64 cores fed by a 256 KB register file, will be faster per clock than a 3840 CUDA core chip with 32-thread warps processed by 128 cores fed by the same 256 KB of register file. The only reasons it is not faster: software developers have not implemented Primitive Shaders or used the Programmable Geometry Pipeline, and the drivers for said GPU are not ready.

FineWine technology has nothing to do with it. The biggest features of the Vega architecture, the ones with the highest impact on its performance, are not yet implemented in the software.

To sum up this wall of text: in DX11 games, Vega will at best be around GTX 1080 performance. With properly optimized software in DX12 and Vulkan, it will be 30% faster than the GTX 1080 Ti.
 

Muhammed

Senior member
Jul 8, 2009
453
199
116
To sum up this wall of text: in DX11 games, Vega will at best be around GTX 1080 performance. With properly optimized software in DX12 and Vulkan, it will be 30% faster than the GTX 1080 Ti.
Nope, not gonna happen.
And theoretical numbers don't mean squat if the architecture is inefficient or has multiple bottlenecks that prevent it from achieving its maximum potential. And Vega appears to still have loads of these bottlenecks: memory compression is behind NV, polygon throughput is behind NV (primitive shaders need to be coded for by the driver and the developers, and their efficacy is still unknown), ALU utilization is still behind NV (it needs a lot of async to increase utilization), ROP and TMU throughput is still behind NV; nearly every hardware aspect is behind NV. No surprise it's barely GP104 level. AMD's GPUs need a lot of hand-holding and software intervention to be usable, which sucks, but is perfect for a console environment. NV's GPUs just work behind the scenes, transparently. This is the way to do it, and that's why they are superior out of the box.
 
Reactions: crisium and xpea

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
Nope, not gonna happen.
And theoretical numbers don't mean squat if the architecture is inefficient or has multiple bottlenecks that prevent it from achieving its maximum potential. And Vega appears to still have loads of these bottlenecks: memory compression is behind NV, polygon throughput is behind NV (primitive shaders need to be coded for by the driver and the developers, and their efficacy is still unknown), ALU utilization is still behind NV (it needs a lot of async to increase utilization), ROP and TMU throughput is still behind NV; nearly every hardware aspect is behind NV. No surprise it's barely GP104 level. AMD's GPUs need a lot of hand-holding and software intervention to be usable, which sucks, but is perfect for a console environment. NV's GPUs just work behind the scenes, transparently. This is the way to do it, and that's why they are superior out of the box.
Is it not gonna happen because of the hardware, or because of the software?

GCN5 does NOT HAVE hardware bottlenecks. The ONLY thing which can bottleneck it is software.

I have been silent for the past few weeks to get an understanding of the Vega architecture. The conclusion is very simple: from an architectural standpoint, Vega has all of the most important features of Nvidia's CUDA architecture, but adds on top of them an extremely clever culling system and a memory paging system.

About the culling system, here is a very informative video:

I do not see a reason why an architecture that has higher throughput, but requires software optimization, would be slower than a weaker architecture in optimized software.
 