Vega/Navi Rumors (Updated)

Status
Not open for further replies.

swilli89

Golden Member
Mar 23, 2010
1,558
1,181
136
Why would AMD need to send engineers to port code? If you don't know how to code you wouldn't be in a position to need porting in the first place. It's not like any of the code is a new language.
Because it costs a company time and money to do this. AMD could incentivize, or ease, the transition to Zen (for x86) or Vega-based cores by lending out some software engineers, and who better to help port code than AMD engineers?

Nvidia rules the datacenter not because of superior hardware, but because they poured far more R&D, promotion, and collaboration money into CUDA. AMD now needs to both deliver great hardware and help develop the ecosystem and tools around that hardware.
 

maddie

Diamond Member
Jul 18, 2010
4,881
4,951
136
I wondered about that as well when I read about the "infinity fabric". Seemed more about memory access and sharing than load balancing between GPUs, which is the main problem with multi GPU solutions.

Pretty surprised in this thread to see the fascination with similarity of naming schemes as well. Personally AMD can call the thing "FX5800 Ultra" and I wouldn't care, just want to see the parts.

I don't buy the "drivers not ready" theory, they've been working on this arch for years, big companies are not as inept as "Oops! We forgot to write the drivers!". AMD knows NV is selling a lot of Pascal these days and parts would be in stores if there was not something like "not enough HBM2 memory to launch" or "we aren't getting yields of functional chips per wafer on this revision to sell profitably".

Last, companies don't "troll" or "prank" in regard to product releases or press conferences. These are educated, middle to old aged people that are paid hundreds of thousands to millions of dollars. They are not 14 year old dumba$$es who "troll". Everything they do is aimed at presenting a professional, profit driven impression of the firm to investors. PR dept gets some latitude to do crap like "poor Volta" to pander to the fanboys, but you will NEVER see a CEO with a doctoral degree "trolling the industry".
Maybe not troll but have you never seen engineers being whimsical and prankish?

Remember this is a company in the process of being resurrected and laden with very competent engineers.

Not as you described, [These are educated, middle to old aged people].

R3 R5 R7 Epyc Threadripper Ryzen Ryzenmaster Vega FE
 
Reactions: w3rd

w3rd

Senior member
Mar 1, 2017
255
62
101
I feel that this is exactly what we are witnessing. The end of an age and beginning of another.

To digress a bit, we have a technological tour de force in V100, resulting in an 800mm^2 monster die on a custom process. Not something a small company can replicate, so design around it. Don't go head to head. That's AMD's future.


Exactly this^.
 

maddie

Diamond Member
Jul 18, 2010
4,881
4,951
136
I don't know. I think developing a multi chip solution that's transparent to the system and game developers is exactly where they're heading.

I'm just not sure Vega or even Navi will be capable of that however.
I am, about Navi.

The latest slides presented show Navi on 7nm, apparently soon after that process becomes available. We can be fairly certain that even a medium-sized die will be prohibitively expensive to fab, at least for consumer GPUs. They appear to be banking on the multi-die approach to overcome cost barriers. That is at least a one-year process advantage over the competition, even ignoring the other benefits of a seamless multi-die approach.
 
Reactions: Magic Hate Ball

Magic Hate Ball

Senior member
Feb 2, 2017
290
250
96
I am, about Navi.

The latest slides presented show Navi on 7nm, apparently soon after that process becomes available. We can be fairly certain that even a medium-sized die will be prohibitively expensive to fab, at least for consumer GPUs. They appear to be banking on the multi-die approach to overcome cost barriers. That is at least a one-year process advantage over the competition, even ignoring the other benefits of a seamless multi-die approach.

And GPU is the ideal location for parallel computing, so I agree entirely. They've hinted at this in interviews over the last two years at least.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Q: Many argue that vega is just a refined polaris gpu, how would you respond to this ?
A: My software team wishes this was true
Vega is both a new GPU architecture and also completely new SOC architecture. It's our first InfinityFabric GPU as well

The new geometry pipeline in Vega was designed for higher throughput per clock cycle, through a combination of better load balancing between the engines and new primitive shaders for faster culling. As a programmer you shouldn't need to do anything special to take advantage of these improvements, but you're most likely to see the effects when rendering geometrically complex scenes that can really push the capabilities of the hardware.

I'm going to combine these 2 a bit. Ok Raja, you're on.

1) Vega is a new architecture with higher throughput per clock than Polaris, AKA Vega better damn well have superior Performance-Per-Flop (PPF) compared to Polaris.

2) Developers don't have to do any work to harness this superior throughput (not counting HBCC).

3) AMD software driver teams WILL need to do work to make this true. But let me add another Raja quote.

RX will be fully optimized gaming drivers, as well as a few other goodies that I can't tell you about just yet

Ok, so let's revise that #3.

3.1) AMD software driver teams will have optimized for the "higher throughput per clock cycle".

BTW, I am being very generous in only wanting Vega to beat Polaris PPF. Overall, Hawaii still tends to be the PPF winner. As I discussed here, Vega (assuming 1525MHz) with Polaris PPF is 97% of an aftermarket 1080 Ti at QHD. If by the grace of god they can match Hawaii PPF, they will beat the 1080 Ti straight up. If they only mean to barely beat pathetic Fiji PPF, though, then forget about it.

Calling you out Raja. Drivers ready and better than Polaris PPF. If you over deliver (beat Hawaii PPF) I will buy your product...
(also because that performance means it will beat the Titan Xp)
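For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic behind the PPF idea. The 4096 shader count for Vega is an assumption not stated in this post, the 2 FLOPs per shader per clock figure is the usual fused multiply-add convention, and both function names are made up for illustration.

```python
def peak_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical FP32 peak, assuming 2 FLOPs per shader per clock (one FMA)."""
    return 2 * shaders * clock_mhz / 1e6


def perf_per_flop(relative_performance: float, tflops: float) -> float:
    """PPF as used in this thread: measured performance divided by peak TFLOPS.

    relative_performance is whatever benchmark index you trust (hypothetical input).
    """
    return relative_performance / tflops


# Assumed Vega configuration: 4096 shaders at the 1525MHz mentioned above.
vega_tflops = peak_tflops(4096, 1525)
print(f"Assumed Vega peak: {vega_tflops:.2f} TFLOPS")
```

Two cards with the same PPF scale in performance with their peak TFLOPS, which is the basis of the "Vega with Polaris PPF" estimate.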
 
Reactions: Headfoot

Topweasel

Diamond Member
Oct 19, 2000
5,437
1,659
136
@Valantar I'm an amateur so just arriving in this forum.
Titan Xp die size is less than 500mm2 using GDDR5X, suppose there are other reasons.
Not want to complain/flame/etc, just to inform as industry passionate.
You're missing a point there. It isn't that GDDR5X requires a certain-sized chip; it's that at Vega's current size, adding it would make the die that much larger than it already is, and 500mm2 seems to be the sweet spot for large chips these days. So AMD would need to remove HBM2, and probably other functionality and performance, to get it down to size while supporting GDDR5X.
 

nathanddrews

Graphics Cards, CPU Moderator
Aug 9, 2016
965
534
136
www.youtube.com
I think developing a multi chip solution that's transparent to the system and game developers is exactly where they're heading.
This has been a wish of mine for a long time, but yeah, I don't think we'll see anything like that until Navi at the earliest.

The primary issue that I can see with that as a business model comes down to bifurcation of the product stack. Right now, we get multiple cuts of different dies that result in ~5-9 SKUs. This aspect of lithography isn't going to go away anytime soon; there will always be binning of silicon. OK, so now move to a world where we have some sort of HBM/Infinity Fabric/bridge chip of some kind that allows AMD to snap multiple GPU dies together transparently into new configurations. Assuming they work out the engineering behind it and it all works perfectly, how far do you take this multi-chip process?

I would assume they would only do it at the highest level possible (like all the previous X2 ATI/AMD GPUs of olde). At the same time, they could just as easily (in this theoretical world) toss 8 weaker chips together to form a high-end multi-chip. Which process would allow for higher margin?

Wait, I just realized something. This is Cell.
 
Reactions: Paratus

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
I don't buy the "drivers not ready" theory, they've been working on this arch for years, big companies are not as inept as "Oops! We forgot to write the drivers!". AMD knows NV is selling a lot of Pascal these days and parts would be in stores if there was not something like "not enough HBM2 memory to launch" or "we aren't getting yields of functional chips per wafer on this revision to sell profitably".

I don't think you understand how complicated modern graphics drivers are. They include not only extensive optimizations for a slew of different games (nearly all of which have to be done without access to the original game source code), but even bugfixes - it's common for developers to have severe shader bugs that the driver developers have to patch. Drivers for a whole new architecture are not a trivial task, and getting them to a competitive level on a majority of AAA DX11 titles could easily take months even for a well-staffed team that is focused on release.

Remember that most sites never update their charts after the original review of a GPU. If AMD releases with half-baked drivers, as they have done too many times in the past, then the poor performance numbers will continue to haunt them throughout Vega's lifetime. Better to take a little longer and do it right, while still making sales to high-margin professional users (whose drivers need to be very stable, but don't need the kind of silly tweaking that DX11 gamer drivers do). The top consumer Vega SKU needs to beat GP102 in order to not be a miserable failure, and if that means another month or two getting the drivers up to par, then so be it.
 

JDG1980

Golden Member
Jul 18, 2013
1,663
570
136
This has been a wish of mine for a long time, but yeah, I don't think we'll see anything like that until Navi at the earliest.

The primary issue that I can see with that as a business model comes down to bifurcation of the product stack. Right now, we get multiple cuts of different dies that result in ~5-9 SKUs. This aspect of lithography isn't going to go away anytime soon; there will always be binning of silicon. OK, so now move to a world where we have some sort of HBM/Infinity Fabric/bridge chip of some kind that allows AMD to snap multiple GPU dies together transparently into new configurations. Assuming they work out the engineering behind it and it all works perfectly, how far do you take this multi-chip process?

I would assume they would only do it at the highest level possible (like all the previous X2 ATI/AMD GPUs of olde). At the same time, they could just as easily (in this theoretical world) toss 8 weaker chips together to form a high-end multi-chip. Which process would allow for higher margin?

Wait, I just realized something. This is Cell.

I agree, we will probably see full use of Infinity Fabric with multiple GPUs acting as one in Navi, not Vega. But it would be great if they surprised us; a dual Vega acting as one giant chip with full scaling in all titles could probably beat the best consumer Volta SKU without breaking a sweat.

Regarding how far the multi-chip process goes: well, we have AMD doing up to four chips on an MCM for Zen (Naples). So we could potentially see the same thing on Navi. Just as an example, imagine if the base chip was 32 CUs (2048 shaders). A dual-chip solution would have 64 CUs/4096 shaders, and a quad-chip solution would have 128 CUs/8192 shaders. That's a pretty wide range. And of course there would be cut versions in between, maybe with 1792/3584/7168 shaders. Anything below 2048 shaders could be handled with APUs, especially if they have HBM2-enabled versions by this time.
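The shader counts above are easy to check with a quick sketch. It assumes 64 shaders per CU (which is what 32 CUs = 2048 shaders implies) and that the cut versions disable 4 CUs per die, leaving 28; the 28-CU figure is my inference from 1792/64, not something AMD has stated.

```python
SHADERS_PER_CU = 64  # implied by 32 CUs = 2048 shaders


def shader_count(cus_per_die: int, dies: int) -> int:
    """Total shaders for a hypothetical multi-die package."""
    return cus_per_die * SHADERS_PER_CU * dies


DIE_COUNTS = (1, 2, 4)

# Full dies: 32 CUs each (hypothetical base chip from the example above).
full = [shader_count(32, n) for n in DIE_COUNTS]

# Cut dies: 28 CUs each (assumed, inferred from the 1792-shader figure).
cut = [shader_count(28, n) for n in DIE_COUNTS]

print("full:", full)  # [2048, 4096, 8192]
print("cut: ", cut)   # [1792, 3584, 7168]
```

So one die design plus one cut level would already give six distinct shader counts across single, dual, and quad packages.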

Scalability, if it really works and is transparent to the system, would be huge. AMD could cover all parts of the market on a fraction of the R&D budget, since they'd only need to make one chip. Winning the performance crown would be easy - think about it, RX 480 is about half the performance of GTX 1080 Ti, so if AMD had this kind of ability now with Polaris 10, a dual-chip unit would match 1080 Ti and a quad-chip unit would have DOUBLE the performance of Nvidia's best offering. This is despite AMD having an architecture that clearly needs some work to catch up (work that Vega will hopefully complete).

I'm sure the technical obstacles to pulling it off are formidable, but the rewards are worth it.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
I'm going to combine these 2 a bit. Ok Raja, you're on.

1) Vega is a new architecture with higher throughput per clock than Polaris, AKA Vega better damn well have superior Performance-Per-Flop (PPF) compared to Polaris.

2) Developers don't have to do any work to harness this superior throughput (not counting HBCC).

3) AMD software driver teams WILL need to do work to make this true. But let me add another Raja quote.



Ok, so let's revise that #3.

3.1) AMD software driver teams will have optimized for the "higher throughput per clock cycle".

BTW, I am being very generous in only wanting Vega to beat Polaris PPF. Overall, Hawaii still tends to be the PPF winner. As I discussed here, Vega (assuming 1525MHz) with Polaris PPF is 97% of an aftermarket 1080 Ti at QHD. If by the grace of god they can match Hawaii PPF, they will beat the 1080 Ti straight up. If they only mean to barely beat pathetic Fiji PPF, though, then forget about it.

Calling you out Raja. Drivers ready and better than Polaris PPF. If you over deliver (beat Hawaii PPF) I will buy your product...
(also because that performance means it will beat the Titan Xp)

Real technical speculation in a supposedly technical thread! I like the PPF metric since it doesn't have all the incorrect connotations of using IPC when the improvements result from things other than IPC...
 

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
I'm going to combine these 2 a bit. Ok Raja, you're on.

1) Vega is a new architecture with higher throughput per clock than Polaris, AKA Vega better damn well have superior Performance-Per-Flop (PPF) compared to Polaris.

2) Developers don't have to do any work to harness this superior throughput (not counting HBCC).

3) AMD software driver teams WILL need to do work to make this true. But let me add another Raja quote.



Ok, so let's revise that #3.

3.1) AMD software driver teams will have optimized for the "higher throughput per clock cycle".

BTW, I am being very generous in only wanting Vega to beat Polaris PPF. Overall, Hawaii still tends to be the PPF winner. As I discussed here, Vega (assuming 1525MHz) with Polaris PPF is 97% of an aftermarket 1080 Ti at QHD. If by the grace of god they can match Hawaii PPF, they will beat the 1080 Ti straight up. If they only mean to barely beat pathetic Fiji PPF, though, then forget about it.

Calling you out Raja. Drivers ready and better than Polaris PPF. If you over deliver (beat Hawaii PPF) I will buy your product...
(also because that performance means it will beat the Titan Xp)
While I like your theory, do note that Raja specifically mentions a higher geometry throughput. In other words no increase in shader throughput, texturing, and all the other non-geometry work done by the GPU (per clock, I assume that is). This should balance out the architecture compared to for example Fiji, though.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
While I like your theory, do note that Raja specifically mentions a higher geometry throughput. In other words no increase in shader throughput, texturing, and all the other non-geometry work done by the GPU (per clock, I assume that is). This should balance out the architecture compared to for example Fiji, though.
I find it very hard (impossible) to believe a new shader core CU (NCU) will result in zero increase in shader throughput per clock.
 
Reactions: raghu78

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Tahiti to Tonga to Polaris comparisons show that "IPC" has increased since Hawaii (which, outside of async compute, should be the same as GCN1). It's the 44CU:64ROP:384 GB/s (390) ratio that just brute-forces past Polaris with its less balanced 36:32:256 ratio. (Ok, bandwidth shouldn't be used in the ratio here since it is separate from clocks... so I'm being mathematically inaccurate, but hopefully you get my point.)

So IPC only applies where things are equal, and Hawaii wins PPF through a better balance. Obviously Fiji is a mess for other reasons. I hope Vega's improvements make up for a still less balanced ratio than Hawaii.
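To put rough numbers on the "balance" argument, here is a small sketch that normalizes the two ratios quoted above per CU. It only uses the CU/ROP/bandwidth figures from this post; the function and dictionary names are made up, and as noted, bandwidth per CU ignores clocks.

```python
# (CUs, ROPs, memory bandwidth in GB/s), exactly as quoted in the post
CONFIGS = {
    "Hawaii": (44, 64, 384),
    "Polaris": (36, 32, 256),
}


def per_cu(cus: int, rops: int, bandwidth_gbs: float) -> tuple:
    """Return (ROPs per CU, GB/s per CU) for one configuration."""
    return rops / cus, bandwidth_gbs / cus


for name, cfg in CONFIGS.items():
    rops_per_cu, bw_per_cu = per_cu(*cfg)
    print(f"{name}: {rops_per_cu:.2f} ROPs/CU, {bw_per_cu:.2f} GB/s per CU")
```

On these figures Hawaii has well over one ROP per CU while Polaris has under one, and Hawaii also has more bandwidth per CU, which is the imbalance being described.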

Valantar, good point.

And it's not really a theory, just me putting the heat on Raja. I still think Vega could go anywhere from slightly faster than 1080 to faster than TXp.
 

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
I find it very hard (impossible) to believe a new shader core CU (NCU) will result in zero increase in shader throughput per clock.
I don't see a problem with Raja/RTG saying "compared to everything else in our GPUs, we have enough shader throughput. Let's work on balancing out the architecture by strengthening its weakest points, and other than that focus on packed math." That makes perfect sense even in a non-resource limited scenario, and definitely does for someone as small as RTG taking on a Herculean task like modern high end GPU IC design (and driver development).
 

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
Tahiti to Tonga to Polaris comparisons show that "IPC" has increased since Hawaii (which, outside of async compute, should be the same as GCN1). It's the 44CU:64ROP:384 GB/s (390) ratio that just brute-forces past Polaris with its less balanced 36:32:256 ratio. (Ok, bandwidth shouldn't be used in the ratio here since it is separate from clocks... so I'm being mathematically inaccurate, but hopefully you get my point.)

So IPC only applies where things are equal, and Hawaii wins PPF through a better balance. Obviously Fiji is a mess for other reasons. I hope Vega's improvements make up for a still less balanced ratio than Hawaii.

Valantar, good point.

And it's not really a theory, just me putting the heat on Raja. I still think Vega could go anywhere from slightly faster than 1080 to faster than TXp.
The GPUs appear to be different enough that we may not see huge benefits from this architecture at first, so be prepared for a big initial disappointment. But the GPUs are designed to handle high-resolution gaming, thanks to incredible culling techniques. It is up to developers to use those culling features of the architecture, though. Exactly the same goes for Packed Math.
I don't see a problem with Raja/RTG saying "compared to everything else in our GPUs, we have enough shader throughput. Let's work on balancing out the architecture by strengthening its weakest points, and other than that focus on packed math." That makes perfect sense even in a non-resource limited scenario, and definitely does for someone as small as RTG taking on a Herculean task like modern high end GPU IC design (and driver development).
An increase in performance can come from next-generation scheduling (load balancing) and from the new pixel engine, which is detached from the memory controller and is instead a client of the L2 cache.

Throughput of the cores is enough for this generation. Volta will disrupt massively, but AMD is not in a hopeless position. We could see a situation where AMD is the go-to solution for the high end right now, and Nvidia later comes out with Volta and turns the tide again.

But this is a scenario where both teams can win, not lose.

AMD's biggest strength is the uniqueness of the platform they offer. Do not forget about this. They can afford disruptive hardware pricing relative to product performance, because they can earn money from other computing platforms as well.

AMD is very much back in the game, and will very shortly be a profitable company. The result will be increased R&D and better products with each generation.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
I find it very hard (impossible) to believe a new shader core CU (NCU) will result in zero increase in shader throughput per clock.

Very true. I think NCU is one of the major architectural improvements in Vega to increase shader efficiency and utilisation. GCN has very bad real-world gaming performance per FLOPS because of serious utilization issues. Hawaii with 6 TFLOPS barely matches GTX 980 with 5.1 TFLOPS. Fury X with 8.6 TFLOPS loses to GTX 980 Ti with 6.8 TFLOPS. RX 480 with 5.7 TFLOPS barely matches GTX 1060 with 4.6 TFLOPS. Vega will address this major weakness of GCN.
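As a quick sanity check on how big that gap is, here is a sketch that turns the TFLOPS pairs above into ratios. The figures are the ones quoted in this post, not independently measured, and the function name is made up.

```python
# (AMD card, AMD TFLOPS, Nvidia card, Nvidia TFLOPS), figures as quoted above
PAIRS = [
    ("Hawaii", 6.0, "GTX 980", 5.1),
    ("Fury X", 8.6, "GTX 980 Ti", 6.8),
    ("RX 480", 5.7, "GTX 1060", 4.6),
]


def flops_ratio(amd_tflops: float, nvidia_tflops: float) -> float:
    """How many times more theoretical FLOPS the AMD card has than its rival."""
    return amd_tflops / nvidia_tflops


for amd, amd_tf, nvidia, nv_tf in PAIRS:
    ratio = flops_ratio(amd_tf, nv_tf)
    print(f"{amd} vs {nvidia}: {ratio:.2f}x the theoretical FLOPS")
```

In each pairing the AMD card carries roughly a fifth to a quarter more theoretical FLOPS for similar or lower gaming performance.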

I don't see a problem with Raja/RTG saying "compared to everything else in our GPUs, we have enough shader throughput. Let's work on balancing out the architecture by strengthening its weakest points, and other than that focus on packed math." That makes perfect sense even in a non-resource limited scenario, and definitely does for someone as small as RTG taking on a Herculean task like modern high end GPU IC design (and driver development).

Theoretical FLOPS mean nothing if you cannot deliver real-world gaming performance. That's the major problem with GCN which Vega should address.
 
Reactions: crisium

Valantar

Golden Member
Aug 26, 2014
1,792
508
136
Theoretical FLOPS mean nothing if you cannot deliver real-world gaming performance. That's the major problem with GCN which Vega should address.
...which is where balancing out the architecture to ensure better utilization across all components comes in, no? I.e. exactly what I was saying.
The GPUs appear to be different enough that we may not see huge benefits from this architecture at first, so be prepared for a big initial disappointment. But the GPUs are designed to handle high-resolution gaming, thanks to incredible culling techniques. It is up to developers to use those culling features of the architecture, though.
Didn't Raja state explicitly in the AMA that there was no need for developer effort to make full use of Vega?
An increase in performance can come from next-generation scheduling (load balancing) and from the new pixel engine, which is detached from the memory controller and is instead a client of the L2 cache.

Throughput of the cores is enough for this generation. Volta will disrupt massively, but AMD is not in a hopeless position. We could see a situation where AMD is the go-to solution for the high end right now, and Nvidia later comes out with Volta and turns the tide again.
I can't tell if you're agreeing with me, disagreeing, or perhaps somewhere in the middle
 

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
Very true. I think NCU is one of the major architectural improvements in Vega to increase shader efficiency and utilisation. GCN has very bad real-world gaming performance per FLOPS because of serious utilization issues. Hawaii with 6 TFLOPS barely matches GTX 980 with 5.1 TFLOPS. Fury X with 8.6 TFLOPS loses to GTX 980 Ti with 6.8 TFLOPS. RX 480 with 5.7 TFLOPS barely matches GTX 1060 with 4.6 TFLOPS. Vega will address this major weakness of GCN.



Theoretical FLOPS mean nothing if you cannot deliver real-world gaming performance. That's the major problem with GCN which Vega should address.
In theory, Raja is actually correct, because the GCN architecture should by nature be on par, clock for clock, with the GP100/Volta architecture. But all of this is neutralized by the imbalanced design of the GPUs, previously weak scheduling, and weak software, not to mention the lack of culling, inefficient handling of register files, and the Pixel Engine being detached from the L2 cache and instead a client of the Memory Controller (which requires a lot of optimization and makes it impossible to fully optimize data delivery to the cores).

In theory Vega lifts all of these bottlenecks by adding new, next-generation hardware features, like next-generation memory tech, an advanced culling technique for geometry, and mobile-like tiled rasterization (at least that's what is suggested...).

To be honest, I am a bit disappointed that AMD did not increase the throughput of the cores. But overall I agree with Raja that this time a redesign of the architecture to be more modern and properly efficient was more important.
 

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
Ha, I just mean I get the same impression when you respond to me too. Such as telling me to prepare for disappointment and that AMD will rule the high end in the same post.
 
Reactions: Valantar

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
I can't tell if you're agreeing with me, disagreeing, or perhaps somewhere in the middle
Neither agreeing nor disagreeing with you. Just adding something.

Ha, I just mean I get the same impression when you respond to me too. Such as telling me to prepare for disappointment and that AMD will rule the high end in the same post.
I hope this post helps you understand my way of thinking
 

Malogeek

Golden Member
Mar 5, 2017
1,390
778
136
yaktribe.org
GCN has very bad real world gaming performance / FLOPS because of serious utilization issues.
Yes, it has in DX11. However, AMD has been designing GCN from the start to fill that utilization gap with compute, which is why we see very large increases with async compute and compute-heavy workloads in DX12 and Vulkan. Mantle was created to take advantage of their architecture and is now the basis of Vulkan and DX12.
 
Reactions: w3rd