[Videocardz rumour] Vega pushed forward to October


Vesku

Diamond Member
Aug 25, 2005
3,743
28
86

exar333

Diamond Member
Feb 7, 2004
8,518
8
91
2X the performance of an RX 480 would put it faster than an AIB 1080 and all that for $499 vs. $650-700 that NV charges? That's how hype gets started without any basis for it. You made your prediction sound disappointing but to me that would be a huge win if true. It would essentially mean a card 25% faster than the reference GTX1070 for only $50 more or similar performance to the GTX1080 for $150-200 less. I doubt Vega 10 will be that competitive. AMD is going to need to fill in $329-499 price levels with cards between RX 480 and the GTX1080. This is why AMD already confirmed they will have Vega 10 and Vega 11. The confusing part about these 2 chips is that AMD has Vega with HBM2 on the roadmap, which doesn't align well with some speculation that Vega 10 may be GDDR5X and Vega 11 (larger chip) would be HBM2.



At the same GPU clocks and on paper memory bandwidth, Polaris 10 (4th gen) is just 18% faster on average against Tahiti (7970) but is only 7% faster against Tonga (R9 380X). That means in nearly 5 years AMD has improved IPC by just about half of what NV did in 2 years moving from Kepler to Maxwell (35-40%). Given AMD's current track record in improving IPC for GCN over 5 years, I would not expect any major improvements in architecture with Vega (0-10%). Chances are AMD will use the wide core approach (close to double everything) because it's unlikely they will use the brute force (GPU clock speeds) approach of their competitor. Most 'reasonable' estimates for Vega 10 are 4096 SPs, 256 TMUs, 64 ROPs, HBM2.

Computerbase shows only a 4.3% advantage for the 2304 SP RX 480 against the 2048 SP RX 470 at the same GPU/memory clocks. This suggests a severe memory bandwidth bottleneck or some other bottleneck (ROPs?). Despite 12.5% more shaders and texture units, the RX 480 is unable to translate this on-paper advantage into real-world gaming performance. This means AMD is going to need the fastest GDDR5X they can get their hands on, or have no choice but to wait for HBM2. In order to improve perf/watt, as their roadmap suggests, they are going for HBM2 for maximum efficiency. This is where things get tricky for AMD. SK-Hynix shows only 2 options on their roadmap for Q3 2016:

H5VR32ESM4H-12C 4GB 4Hi = 204GB/sec 1.6Gbps
H5VR32ESM4H-20C 4GB 4Hi = 256GB/sec 2.0Gbps

If they choose the former, they will likely need 4x4Hi = 16GB 800GB/sec. 16GB seems massive overkill for this generation of gaming. You end up with wasted VRAM, higher cost, higher power usage.

If they choose the latter, I am not sure you can do 2x4Hi = 8GB 512GB/sec. 8GB is a good spot to be in but 512GB/sec memory bandwidth seems borderline low for a 4096 SP, 64 ROP GCN 4.0 because RX 480 is already memory bandwidth bottlenecked with 256GB/sec.
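The bandwidth figures for the two options can be checked with a quick sketch. It assumes a 1024-bit interface per HBM2 stack; the function and variable names are made up for illustration, and the stack counts are just the two configurations speculated about above.

```python
# Rough sketch: peak capacity and bandwidth of an HBM2 configuration.
# Assumption: each HBM2 stack exposes a 1024-bit interface.
# All names here are illustrative, not from any vendor tool.

HBM2_BUS_WIDTH_BITS = 1024  # interface width per stack (assumed)


def stack_bandwidth_gbs(pin_speed_gbps: float) -> float:
    """Peak bandwidth of one stack in GB/s for a given per-pin data rate."""
    return pin_speed_gbps * HBM2_BUS_WIDTH_BITS / 8


def config_summary(stacks: int, gb_per_stack: int, pin_speed_gbps: float):
    """Return (capacity in GB, peak bandwidth in GB/s) for a stack count."""
    capacity = stacks * gb_per_stack
    bandwidth = stacks * stack_bandwidth_gbs(pin_speed_gbps)
    return capacity, bandwidth


if __name__ == "__main__":
    # The two speculated layouts: 4 stacks at 1.6Gbps, 2 stacks at 2.0Gbps.
    for stacks, speed in [(4, 1.6), (2, 2.0)]:
        capacity, bandwidth = config_summary(stacks, 4, speed)
        print(f"{stacks} x 4Hi @ {speed}Gbps: {capacity}GB, {bandwidth:.0f}GB/sec")
```

Under that assumption the 1.6Gbps part works out to about 204.8GB/sec per stack, so four stacks land a little above the rounded 800GB/sec figure, while two 2.0Gbps stacks give 512GB/sec.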

Here is another issue: if Vega 10 is only a 4096 SP, 64 ROP, 256 TMU, 512GB/sec, 250W TDP design, then even with linear 2X scaling over the RX 470, the card would only end up slightly beating the GTX1070. Since Polaris 10 is only a 5.7B transistor chip, a Vega 10 with these specs would fit at ~11B, but the rumours have been whispering that flagship Vega would be ~15-18B transistors. Not adding up.

On the surface it seems we could see a 3500-4000 shader Vega 10 with 10-11B transistors, but there could also be a much larger 5000-6000 shader Vega 11 with 15-18B (or the rumours are just that - unsubstantiated dreams). What makes it so difficult to estimate is AMD's vagueness and the gargantuan gap that exists in performance between the RX 480 and GTX1080/Titan XP. It doesn't seem realistic for a 1.2-1.3GHz Vega 10 to be a competitor to both GP104 and GP102.

IMHO, AMD's best bet to improve performance will be going wider (more functional units) and maximizing perf/watt at the transistor level, just as NV mentioned they spent months trying to maximize clock speeds on the 16nm node. AMD's Polaris 10 clocks were a disappointment for a 14nm node. Whatever I typed is pretty much nothing new though.

It is disappointing. That product today would most certainly be solid and (as you say) would undercut the current 1070/1080 price structure. You very well could see some minor price cuts. That said, for these to come next year is just too little, too late. The 1080ti will be around and NV can afford to cut prices on the current 70/80. Also, Vega likely launches only one quarter before Volta. Buyers will likely have already purchased their 70/80s by then and will be waiting for the next big thing. For AMD to really make an impact here, they need 80-90% of 1080ti performance. Don't build to a price point; build for class-leading (or near class-leading) performance. If it uses HBM, that should also help with efficiency.
 

parvadomus

Senior member
Dec 11, 2012
685
14
81
Vega10 should easily win against GP104. I expect it to be somewhere between GP104 and GP102.
Die size at ~400mm2
Given GP102 performance, AMD might win only with at least a 500mm2 part with 96Rops/6144 SPs.
Also perf/watt will be much better than RX480 thanks to HBM2.
 

parvadomus

Senior member
Dec 11, 2012
685
14
81
IMHO, AMD's best bet to improve performance will be going wider (more functional units) and maximizing perf/watt at the transistor level, just as NV mentioned they spent months trying to maximize clock speeds on the 16nm node. AMD's Polaris 10 clocks were a disappointment for a 14nm node. Whatever I typed is pretty much nothing new though.

The magic behind NV's perf/watt is basically the delta color compression. If you think about it, power consumption directly correlates to the quantity of memory channels and memory speed. AMD's DCC is way behind, it already needs the jump that Nvidia did from Kepler to Maxwell.
 

poofyhairguy

Lifer
Nov 20, 2005
14,612
318
126
The magic behind NV's perf/watt is basically the delta color compression. If you think about it, power consumption directly correlates to the quantity of memory channels and memory speed. AMD's DCC is way behind, it already needs the jump that Nvidia did from Kepler to Maxwell.

I thought the magic was that "don't render what you can't see" trick that Power VR has pushed for over a decade?
 

parvadomus

Senior member
Dec 11, 2012
685
14
81
I thought the magic was that "don't render what you can't see" trick that Power VR has pushed for over a decade?

That might help. But I am 99% sure the savings in memory I/O circuitry are the main factor. If you think about it, the RX480 basically ties the 970/980/1070/1080 in power consumption, and all have 4x64-bit buses; the GTX 1060, being only 3x64-bit, consumes roughly 75% of the power of the RX480.
Hawaii was 8x64-bit, and how much did it consume?
Then we have Fiji with HBM, which basically matched Nvidia in terms of performance/watt; that's because the power consumed by memory I/O was drastically reduced.
Have a look at this:
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/8
For average textures, Nvidia's architecture has almost 2x the compression capability, and that means memory bandwidth requirements are much, much lower.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Vega10 should easily win against GP104. I expect it to be somewhere between GP104 and GP102.
Die size at ~400mm2
Given GP102 performance, AMD might win only with at least a 500mm2 part with 96Rops/6144 SPs.
Also perf/watt will be much better than RX480 thanks to HBM2.

One Vega could be close to 380mm2 with HBM2. This one could be close to 190-200W TDP.
 

IEC

Elite Member
Super Moderator
Jun 10, 2004
14,361
5,023
136
Seems plausible. There certainly could be significant power savings by going with HBM2, decreasing the efficiency gap between AMD and nV.

For the sake of competition (and consumer choice) I hope Vega 10 is awesome and produced in large quantities... and soon.
 

Azix

Golden Member
Apr 18, 2014
1,438
67
91
DX12 will be what saves Vega, I think. All you have to do right now is clock a Fury X higher and it would beat a 1080 in DX12/Vulkan. Maybe around 1200MHz would do. The problem is Fiji is massive. AMD's main problems will be die size and clock speed. I don't expect wonders in DX11, but by then we should have a decent selection of AAA DX12 games to shift the benchmarks away from DX11 suckery.

I think they will shoot for at most something close to 400mm^2. Best case for them I think is in the 300+ range. HBM should do wonders for their power consumption.

The magic behind NV's perf/watt is basically the delta color compression. If you think about it, power consumption directly correlates to the quantity of memory channels and memory speed. AMD's DCC is way behind, it already needs the jump that Nvidia did from Kepler to Maxwell.

Unlikely. The one thing we can always say for sure is that they removed their hardware scheduler and they cut double precision massively. AMD's hardware scheduler setup is not going anywhere, but they can probably butcher DP further.

Ultimately they really need to optimize for higher clocks. That is where they are falling behind. They can't keep beefing up the core counts etc. At some point they need higher clocks to be standard.
 

JustMe21

Senior member
Sep 8, 2011
324
49
91
I'd say that Nvidia's biggest perf/watt improvements came from going with tile based rendering. While the better color compression helps, it wouldn't provide a significant enough performance improvement, nor would it have reduced power usage as much as we see. Nvidia also needs to add more Async compute muscle into their chips.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
I'd say that Nvidia's biggest perf/watt improvements came from going with tile based rendering. While the better color compression helps, it wouldn't provide a significant enough performance improvement, nor would it have reduced power usage as much as we see. Nvidia also needs to add more Async compute muscle into their chips.

What are you talking about? NVidia does not use tile based deferred rendering. It essentially is an immediate mode renderer as always.
 

Mikeduffy

Member
Jun 5, 2016
27
18
46
HBM2 has the potential to be huge, but I'm concerned that the market doesn't care for smaller gpus - I mean look at the Nano, IMO last generation's most advanced card without question and the media hardly gave it the attention it deserved.

Anyways, AMD needs to market HBM2 a bit better in order to get enthusiasts more excited over the product.

Side note: is compression even important if they are using the massive bandwidth of HBM2?
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
The RX480 die draws what, 110 watts? So if AMD simply doubles all the resources of Polaris 10, the resulting chip should be roughly 410 sq mm. This is assuming they will be able to save 20 sq mm by switching to HBM2, and also save 10 sq mm by not having to double certain features such as their UVD/VCE. This is a very reasonable size die. If they maintain the same voltages we're looking at 220W for the die and probably 270 watts for the entire card, assuming minor efficiency improvements from HBM2 vs HBM1. This all seems very reasonable and straightforward. But what about performance? I don't see this chip beating a GTX1080 by much, if at all. And it will not be able to overclock. It will most likely be underclocked relative to RX480. It will also have to have at least one CU disabled for yield purposes. Since it will cost 2X RX480, and RX480 sits at $250 on a good day, I think we can expect a retail price of $500 for a product that just barely beats a GTX1080 on DX12 titles and falls behind on DX11 titles.

So, in summary, I expect small Vega to be 19CU, 4864 SP, 250W, 64 ROP, 8GB HBM2, 410 sq mm, 1188MHz boost clock, and roughly matching a GTX 1080 in DX12 titles for $499 MSRP $550 actual street price.
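The power and price arithmetic in this estimate can be written out as a small sketch. Every input is a guess taken from the post rather than a measurement, and the 50W board overhead is only implied by the gap between the 220W die and 270W card figures; the function name is made up for illustration.

```python
# Back-of-the-envelope sketch of the "double Polaris 10" estimate.
# All defaults are the post's guesses, not measured values.
# The 50W board overhead is inferred from 270W (card) - 220W (die).


def doubled_card_estimate(die_watts=110, board_overhead_watts=50,
                          base_price_usd=250, scale=2):
    """Return (die power W, card power W, price USD) for a scaled-up chip,
    assuming power and cost both scale linearly with chip resources."""
    die_power = die_watts * scale
    card_power = die_power + board_overhead_watts
    price = base_price_usd * scale
    return die_power, card_power, price


if __name__ == "__main__":
    die_power, card_power, price = doubled_card_estimate()
    print(f"die: {die_power}W, card: {card_power}W, price: ${price}")
```

Linear scaling is the big assumption here: it holds only if voltage and clocks stay the same as on the RX480, which the post itself treats as uncertain.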
 

Samwell

Senior member
May 10, 2015
225
47
101
Might be. But this has nothing to do with tile based rendering. It is just a hierarchically tiled rasterization order, as every GPU has been doing for several years.

You are mixing Tile based deferred rasterization with tile based immediate rasterization. AMD and Intel are using normal immediate mode rasterization and Nvidia till Kepler was using the same. Qualcomm Adreno, ARM Mali use Tile based immediate rasterization and it seems Maxwell and Pascal now too. What you are talking about is tile based deferred rasterization, but this is only used by PowerVR.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
The RX480 die draws what, 110 watts? So if AMD simply doubles all the resources of Polaris 10, the resulting chip should be roughly 410 sq mm. This is assuming they will be able to save 20 sq mm by switching to HBM2, and also save 10 sq mm by not having to double certain features such as their UVD/VCE. This is a very reasonable size die. If they maintain the same voltages we're looking at 220W for the die and probably 270 watts for the entire card, assuming minor efficiency improvements from HBM2 vs HBM1. This all seems very reasonable and straightforward. But what about performance? I don't see this chip beating a GTX1080 by much, if at all. And it will not be able to overclock. It will most likely be underclocked relative to RX480. It will also have to have at least one CU disabled for yield purposes. Since it will cost 2X RX480, and RX480 sits at $250 on a good day, I think we can expect a retail price of $500 for a product that just barely beats a GTX1080 on DX12 titles and falls behind on DX11 titles.

So, in summary, I expect small Vega to be 19CU, 4864 SP, 250W, 64 ROP, 8GB HBM2, 410 sq mm, 1188MHz boost clock, and roughly matching a GTX 1080 in DX12 titles for $499 MSRP $550 actual street price.


R9 380X
32 CU, 2048 Shaders, 128 TMUs, 32 ROPs, 256bit GDDR-5 memory, 190W TDP

Fury Nano
64 CU, 4096 Shaders, 256 TMUs, 64 ROPs, 4096bit HBM1, 175W TDP

RX 480
36 CU, 2304 Shaders, 144 TMUs, 32 ROPs, 256bit GDDR-5 memory, 150W TDP

(double Polaris 10)
72 CU, 4608 Shaders, 256 TMUs, 64 ROPs, 4096bit HBM2, 175W TDP ???
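
The CU and shader counts in the list above follow from GCN's 64 shaders per compute unit, which a short sketch can confirm. The dictionary below just restates the CU counts from the list; the names are illustrative.

```python
# Sketch: shader counts implied by CU counts on GCN parts.
# Assumption: 64 shaders (stream processors) per compute unit.

SHADERS_PER_CU = 64


def shader_count(compute_units: int) -> int:
    """Total shaders for a GCN chip with the given number of CUs."""
    return compute_units * SHADERS_PER_CU


# CU counts as listed in the post; the last entry is the hypothetical chip.
CU_COUNTS = {
    "R9 380X": 32,
    "Fury Nano": 64,
    "RX 480": 36,
    "double Polaris 10": 72,
}

if __name__ == "__main__":
    for name, cus in CU_COUNTS.items():
        print(f"{name}: {cus} CU -> {shader_count(cus)} shaders")
```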

If they managed to double Tonga and lower TDP from 190W to 175W at 28nm and be faster than GTX 980, why not do the same at 14nm and be faster than GTX 1080 ??
Sell it at $650 and have faster than GTX 1080 DX-12/Vulkan performance at 4K at lower price and still make a lot of profit. They could even have a cut down version at 150-160W TDP faster than GTX 1070 at $499. And one more cut down version later on at $399 to close the gap between RX 480 and GTX 1070.

Let's see.
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
(double Polaris 10)
72 CU, 4608 Shaders, 256 TMUs, 64 ROPs, 4096bit HBM2, 175W TDP ???

If they managed to double Tonga and lower TDP from 190W to 175W at 28nm and be faster than GTX 980, why not do the same at 14nm and be faster than GTX 1080 ??

Seems likely, but I think they will try to push the clocks to try to compete with the 1080. So it will end up being 250W. If they were not so far behind 190W @ around 1050MHz would be great. But I fear it is going to have to be pushed to 1200MHz simply out of dire need to compete.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
You are mixing Tile based deferred rasterization with tile based immediate rasterization. AMD and Intel are using normal immediate mode rasterization and Nvidia till Kepler was using the same. Qualcomm Adreno, ARM Mali use Tile based immediate rasterization and it seems Maxwell and Pascal now too. What you are talking about is tile based deferred rasterization, but this is only used by PowerVR.

I do not think I am mixing anything up at all. At least based on the linked article, the tile based rasterization order is nothing special and has been used in every GPU for years.
If you have different information, please give evidence.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
Vega10 should easily win against GP104. I expect it to be somewhere between GP104 and GP102.
Die size at ~400mm2
Given GP102 performance, AMD might win only with at least a 500mm2 part with 96Rops/6144 SPs.
Also perf/watt will be much better than RX480 thanks to HBM2.
Sources for any of this?
 

Glo.

Diamond Member
Apr 25, 2015
5,762
4,667
136
Just did some math. Fiji is 70% larger than Tonga.. and 2x perf. repeat the same with polaris10 vs vega10/11
Polaris could be a pipe cleaner for the WSA with GloFo. Vega, and all HBM chips, should be built on a TSMC process, because AMD has previously worked with TSMC and Amkor on the production supply line for HBM chips.

For example: 3072 GCN core GPU, with 64 ROPs and 2 stacks of 4GB HBM2 would be around 300-320mm2.

You would need to add 30% of this die size for 4096 GCN core GPU.
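
As a quick check of what that 30% implies, here is a minimal sketch. It assumes die area simply grows by the stated fraction; the 300-320mm2 starting range is the estimate from this post, and the function name is made up for illustration.

```python
# Sketch: die size of a 4096-core part, scaled up from the 3072-core guess.
# Assumption: area grows by a flat 30%, as the post suggests.


def scaled_die_area(base_mm2: float, extra_fraction: float = 0.30) -> float:
    """Die area after adding extra_fraction on top of base_mm2."""
    return base_mm2 * (1 + extra_fraction)


if __name__ == "__main__":
    low, high = scaled_die_area(300), scaled_die_area(320)
    print(f"4096-core estimate: {low:.0f}-{high:.0f}mm2")
```

That puts the larger chip at roughly 390-416mm2, in the same ballpark as the ~400mm2 guesses earlier in the thread.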

And yes. IMO Vega is built with 2 GPUs, and that 4096 GCN cores is the accurate number for bigger one.
 