Vega/Navi Rumors (Updated)

swilli89 · May 11, 2017

Guru3D said:
Seems real, pricing is more or less accurate, though the names are wrong!

It's all just a complete guess from a brand-new account, most likely made as a throwaway.

Krteq said:
Let's wait for May 16th; if there is some Vega teaser, that guy on reddit was right.

No, please do not think like this. The guy took publicly available information and made a few assumptions and guesses. May 16th is already a day AMD has publicly scheduled to talk about the future of its business. A Vega release that day would be purely coincidental. Also, the guy deleted his post and is nowhere to be found.

Crumpet · May 11, 2017

Gatecrasher3 said:
But really, let's say Vega is a flop (no faster than a 1080 for around the same cost), what should AMD do next in regards to their GPUs? Ask for a bit more R&D money from the tentative success of Ryzen?

Navi is already deep into development; if Vega flops, there's not much they can change for Navi. GPUs are in the development pipeline for 4 years or more.

That said, a 1080 competitor would be extremely welcome. Outside of true enthusiasts, the 1070 and 1080 price points are the go-to areas for the mid-to-high end, and AMD would do well to have a GPU in that spot. Worst case scenario, I still think it would do okay.

Glo. · May 11, 2017

According to you_know_which_site, there is already a dual Vega GPU in the Linux driver patches.

tamz_msc · May 11, 2017

Glo. said:
According to you_know_which_site, there is already a dual Vega GPU in the Linux driver patches.

Those are more likely to be remnants of Pro Duo, or related to Pro SSG, though dual-GPU Vega cannot be ruled out.

Bacon1 · May 11, 2017

Glo. said:
According to you_know_which_site, there is already a dual Vega GPU in the Linux driver patches.

That info was actually from reddit: https://www.reddit.com/r/Amd/comments/6ahbua/a_dualgpu_liquid_cooled_vega_card_may_be_on_the/

swilli89 · May 11, 2017

Bacon1 said:
That info was actually from reddit: https://www.reddit.com/r/Amd/comments/6ahbua/a_dualgpu_liquid_cooled_vega_card_may_be_on_the/

Damn it... no. I don't want to see AMD bring a dual GPU to a single card fight. It's always been a band-aid. The 3870 X2 started this trend, and it just means that their single chip isn't fast enough. I hope they can bring a <275 watt single GPU that can equal the 1080 Ti and come in a bit cheaper, maybe $600-$650. That is what they need to be "back" in the conversation for true enthusiasts. Even though Vega would be a few months after the 1080 Ti, it looks like we probably won't see Volta cards for a few quarters, so that would give them ample time to get sales.

Magic Hate Ball · May 11, 2017

swilli89 said:
Damn it... no. I don't want to see AMD bring a dual GPU to a single card fight. It's always been a band-aid. The 3870 X2 started this trend, and it just means that their single chip isn't fast enough. I hope they can bring a <275 watt single GPU that can equal the 1080 Ti and come in a bit cheaper, maybe $600-$650. That is what they need to be "back" in the conversation for true enthusiasts. Even though Vega would be a few months after the 1080 Ti, it looks like we probably won't see Volta cards for a few quarters, so that would give them ample time to get sales.

It depends on whether they can make them appear as a single logical device to the rest of the system, like two CCXs in Ryzen. After all, a GPU's workload is massively parallel.

Elixer · May 11, 2017

swilli89 said:
No, please do not think like this. The guy took publicly available information and made a few assumptions and guesses. May 16th is already a day AMD has publicly scheduled to talk about the future of its business. A Vega release that day would be purely coincidental. Also, the guy deleted his post and is nowhere to be found.

No, the mods deleted the post.

As for being correct or not, guess we will find out in a few weeks.

Snarf Snarf · May 11, 2017

Magic Hate Ball said:
It depends on whether they can make them appear as a single logical device to the rest of the system, like two CCXs in Ryzen. After all, a GPU's workload is massively parallel.

All signs point to that being a Navi architectural change, given the "scalability" bullet point under Navi on the roadmap. Vega is AMD's first shot at a GPU with Infinity Fabric, and the lessons learned here are definitely going to be big for that type of project. Raja has already said that monolithic dies are going to start hitting the wall soon and that interposer-fused GPUs are the future; we just haven't hit the wall on process technology yet. Also, first-gen Infinity Fabric might not be fast enough, or might have too much latency in its current state, to make multi-GPU viable. HBM is clocked very low, and if the IF has to run at the same clock, that would pose some serious issues.

DisEnchantment · May 11, 2017

Sooo... I cloned AMD's git repo

and I found this fix

Code:

for (i = 0; i < NUM_LINK_LEVELS - 1; i++)

Lol, same old mistake.

I was shocked: OMG, Huang committed this change. Then I saw it was Eric Huang. For a moment I thought, bloody hell, Jensen Huang.
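
For anyone wondering why a loop bound like the one above would end at NUM_LINK_LEVELS - 1: here is a minimal C sketch of the usual off-by-one pattern, assuming the loop body touches entry i + 1. The post doesn't show the loop body, so that is a guess. The array contents, the value of NUM_LINK_LEVELS, and the function name are all hypothetical and not taken from the AMD driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table size; the real NUM_LINK_LEVELS value is not shown in the post. */
#define NUM_LINK_LEVELS 4

/*
 * Counts how many adjacent pairs (levels[i], levels[i + 1]) are in
 * ascending order. Because the body reads levels[i + 1], the loop has to
 * stop at NUM_LINK_LEVELS - 1; running up to NUM_LINK_LEVELS would read
 * one element past the end of the array.
 */
static int count_ascending_pairs(const int levels[NUM_LINK_LEVELS])
{
    int i;
    int ascending = 0;

    assert(levels != NULL);

    for (i = 0; i < NUM_LINK_LEVELS - 1; i++) {
        if (levels[i] < levels[i + 1])
            ascending++;
    }
    return ascending;
}
```

With N entries there are only N - 1 adjacent pairs, which is why the bound is one short of the array size in this kind of loop.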

Samwell · May 11, 2017

Snarf Snarf said:
All signs point to that being a Navi architectural change due to the bullet point "scalability" on the road map under Navi. This is AMD's first shot at a GPU with Infinity Fabric, the lessons learned here are definitely going to be big for that type of project. Raja has already come out and said that monolithic dies are going to start hitting the wall soon and interposer fused GPU's are the future, we just haven't hit the wall on process technology yet. Also first gen Infinity Fabric might not be fast enough or have too much latency in its current state to make multi-GPU viable, HBM is very low clocked and if the IF has to run at the same clock that would pose some serious issues.

You need both chips on one interposer to let them behave like one. Otherwise it won't work out, I think, because the bandwidth between them has to be massive. But Vega is just too big for that; there are no interposers for 2x500mm² plus 4 stacks of HBM.

With Navi on 7nm, I'm sure about it. The problem is not so much a wall in process tech, as 5nm is still on track, but the design cost. You need about $400M for a chip on 7nm compared to $200M on 16nm. Now look at AMD's R&D budget and you can see that AMD can't make many 7nm chips per year. But making one chip and doubling it for the bigger version works out on interposers. Then maybe make a second, bigger die and double it again, as 4 dies might be a bit too risky for the first try of this technique. The same will happen with Nvidia, but maybe they will survive one more node with their big chips because of their higher R&D budget.
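
To put rough numbers on that, here is a small C sketch of how many chip designs a fixed budget covers at the two cost points quoted above. The function name is hypothetical, and the $1,000M budget used in the note after the code is a made-up round figure for illustration, not AMD's actual R&D spend.

```c
#include <assert.h>

/* Design cost per chip in millions of dollars, as quoted in the post. */
#define COST_16NM_MUSD 200
#define COST_7NM_MUSD  400

/*
 * Number of complete chip designs a budget (in $M) can fund at a given
 * cost per chip. Integer division is intentional: a partly funded
 * design does not count.
 */
static int designs_affordable(int budget_musd, int cost_per_chip_musd)
{
    assert(budget_musd >= 0 && cost_per_chip_musd > 0);
    return budget_musd / cost_per_chip_musd;
}
```

For a hypothetical $1,000M budget, designs_affordable(1000, COST_16NM_MUSD) gives 5 and designs_affordable(1000, COST_7NM_MUSD) gives 2, so the same money funds less than half as many distinct dies once you round down.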

Kenmitch · May 11, 2017

Samwell said:
You need both chips on one interposer to let them behave like one. Otherwise it won't work out, I think, because the bandwidth between them has to be massive. But Vega is just too big for that; there are no interposers for 2x500mm² plus 4 stacks of HBM.

With Navi on 7nm, I'm sure about it. The problem is not so much a wall in process tech, as 5nm is still on track, but the design cost. You need about $400M for a chip on 7nm compared to $200M on 16nm. Now look at AMD's R&D budget and you can see that AMD can't make many 7nm chips per year. But making one chip and doubling it for the bigger version works out on interposers. Then maybe make a second, bigger die and double it again, as 4 dies might be a bit too risky for the first try of this technique. The same will happen with Nvidia, but maybe they will survive one more node with their big chips because of their higher R&D budget.

Why design it that way if the intention was to use multiple dies on an interposer? It would make more sense to design it with a modular approach. I'm not sure about the technical hurdles one would need to jump. It seems like there would be redundant blocks that aren't needed on both/all dies on the interposer: one mother/master die that is the front end seen by the OS, with multiple other dies supplying the cores, shaders, etc.

Samwell · May 11, 2017

Kenmitch said:
Why design it that way if the intention was to use multiple dies on an interposer? It would make more sense to design it with a modular approach. I'm not sure about the technical hurdles one would need to jump. It seems like there would be redundant blocks that aren't needed on both/all dies on the interposer: one mother/master die that is the front end seen by the OS, with multiple other dies supplying the cores, shaders, etc.

That's the second step companies will take, but not before trying it the other way first, because of the risk. It's cutting-edge technology, and you need even more bandwidth if you don't double everything. It will be a really disruptive technology, but it won't help if you use it too aggressively and run into more problems than it's worth. Looking back at HBM, one can also wonder whether AMD maybe should've waited longer before using it.

I hope these rumours of not enough HBM2, and of HBM2 not hitting its clock speeds, are not true. It looks like at least Samsung is having problems there, as Volta was announced last year with 32GB and 1 TB/s. But Volta is only getting 16GB and 900 GB/s, so it seems that Samsung missed the clock target for HBM2 and isn't able to manufacture 8-high stacks. Let's see whether Hynix gets both of those targets right, as AMD announced it for the MI25 as well.
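
The gap between roughly 1 TB/s and 900 GB/s is consistent with a missed pin speed. Here is a rough C sketch of the arithmetic, assuming four HBM2 stacks with a 1024-bit interface each (the standard HBM2 stack width). The per-pin rates of 2.0 Gbps for the announced figure and 1.75 Gbps for the shipping one are my assumption, and the function name is made up for illustration.

```c
#include <assert.h>

/* Standard HBM2 interface width per stack, in bits. */
#define HBM2_BITS_PER_STACK 1024

/*
 * Peak bandwidth in GB/s for a given number of stacks and per-pin data
 * rate in Gbps: stacks * bus width * pin rate / 8 bits per byte.
 */
static double hbm2_peak_gbytes_per_sec(int stacks, double pin_gbps)
{
    assert(stacks > 0 && pin_gbps > 0.0);
    return stacks * HBM2_BITS_PER_STACK * pin_gbps / 8.0;
}
```

With these inputs, hbm2_peak_gbytes_per_sec(4, 2.0) works out to 1024 GB/s and hbm2_peak_gbytes_per_sec(4, 1.75) to 896 GB/s, which lines up with the two figures above. Capacity is a separate question: it depends on stack height, not pin speed.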

Snarf Snarf · May 11, 2017

The modular approach would be the way to go, similar to the CCX design in Zen. If I recall correctly, the blocks in AMD's GPUs are already somewhat modular; the PS4 Pro and Scorpio SoCs are proof of this. The only caveat would be making the masks for these dies: they would need to move a good bit of volume to make the masks worth it financially.

The problem with this approach is that ROPs, CUs, and cache don't always scale linearly into performance. Simply doubling things gets really inefficient, as the bottlenecks move to different parts of the design when you add more CUs, need more bandwidth, or, as Fury showed us, need to keep all the shaders fed. This design method would take an extreme amount of retooling, much like Keller had the CPU division do. Being able to find these bottlenecks quickly and adjust accordingly would allow AMD to optimize its own designs more quickly and would possibly help time to market.

xpea · May 12, 2017

Kenmitch said:
Why design it that way if the intention was to use multiple dies on an interposer? It would make more sense to design it with a modular approach. I'm not sure about the technical hurdles one would need to jump. It seems like there would be redundant blocks that aren't needed on both/all dies on the interposer: one mother/master die that is the front end seen by the OS, with multiple other dies supplying the cores, shaders, etc.

The master/slave approach with scaling in mind is a very bad concept. Let's take a reasonable example and say you want a master die that can work with up to 3 slaves to address the entry/mid/performance/enthusiast markets. So 2 die designs for 4 levels of performance.
1/ Your 4-die (1 master + 3 slaves) enthusiast setup won't be competitive at all in terms of efficiency. Never forget that what costs the most power is not calculation (primitive projection, ALU, fixed-function blocks) but moving data through buses. In this setup you will end up with maybe 50W of power lost to moving data from die to die, power that could be used for more FLOPs and/or a bigger front end.
2/ And on top of 1/, you will have a huge disadvantage on the entry-level model (1 master) because it will have an overkill front end designed to handle 3 times more FLOPs than it has. Bigger die, bigger price, not competitive at all...

That's why I personally believe that this scalable dream has no future if you have the money and the resources to design a proper, highly optimized die for each market. Nvidia knows it; that's why they never talk about it. But if AMD is really going to go this route with Navi, then I'm afraid it will be the end of the road for the red team...

tamz_msc · May 12, 2017

xpea said:
The master/slave approach with scaling in mind is a very bad concept. Let's take a reasonable example and say you want a master die that can work with up to 3 slaves to address the entry/mid/performance/enthusiast markets. So 2 die designs for 4 levels of performance.
1/ Your 4-die (1 master + 3 slaves) enthusiast setup won't be competitive at all in terms of efficiency. Never forget that what costs the most power is not calculation (primitive projection, ALU, fixed-function blocks) but moving data through buses. In this setup you will end up with maybe 50W of power lost to moving data from die to die, power that could be used for more FLOPs and/or a bigger front end.
2/ And on top of 1/, you will have a huge disadvantage on the entry-level model (1 master) because it will have an overkill front end designed to handle 3 times more FLOPs than it has. Bigger die, bigger price, not competitive at all...

That's why I personally believe that this scalable dream has no future if you have the money and the resources to design a proper, highly optimized die for each market. Nvidia knows it; that's why they never talk about it. But if AMD is really going to go this route with Navi, then I'm afraid it will be the end of the road for the red team...

Haha, what garbage. Just take a look at the PS4 Pro and Scorpio. Scorpio fits 44 CUs and 8 Jaguar cores in 360 mm² and still manages to keep the power supply within the housing.

You have three different versions of big Polaris - that is in essence what AMD means by scalability. On a high level, one design that can be customized accordingly.

Stick to the Nvidia thread.

xpea · May 12, 2017

tamz_msc said:
Haha, what garbage. Just take a look at the PS4 Pro and Scorpio. Scorpio fits 44 CUs and 8 Jaguar cores in 360 mm² and still manages to keep the power supply within the housing.

You have three different versions of big Polaris - that is in essence what AMD means by scalability. On a high level, one design that can be customized accordingly.

Stick to the Nvidia thread.

I talk about apples, you reply about oranges...
Scorpio and PS4 use single-die APUs. I was talking about a multi-die approach, as stated in my quote.

PS: you have been reported

tamz_msc · May 12, 2017

xpea said:
I talk about apples, you reply about oranges...
Scorpio and PS4 use single-die APUs. I was talking about a multi-die approach, as stated in my quote.

PS: you have been reported

You think scalability refers to multi-die. You don't know that.

I think scalability means the smallest functional block, defined according to a set of parameters, that can be put into any die size with modifications as required.

How can I claim any of this? Well this approach has already allowed FP16 packed math operations to do checkerboard rendering on the new consoles. Actual half-precision math that is used for graphics.

P.S. Some learned people believe that GV100 uses the tensor cores for FP16. There is no FP16-specific hardware in it.

P.P.S. I don't care if you report me.

T1beriu · May 12, 2017

tamz_msc said:
You think scalability refers to multi-die. You don't know that.

I'm pretty sure the scalability in Navi is about multi-die architecture. Raja Koduri gave clear hints in a video interview about a year ago and has kept bringing it up since then. Not to mention AMD's push on Infinity Fabric for CPUs and GPUs, and DirectX 12 with multi-GPU.

Found one of the videos. https://www.youtube.com/watch?v=4qJj1ViyyPY

sm625 · May 12, 2017

Gatecrasher3 said:
But really, let's say Vega is a flop (no faster than a 1080 for around the same cost), what should AMD do next in regards to their GPUs?

If this happens, then it means AMD will have gone more than 5 years without a single significant change in IPC. One Polaris CU is pretty much the same as a 5 year old Tahiti CU. It is really difficult to imagine even AMD being that incompetent.

tamz_msc · May 12, 2017

T1beriu said:
I'm pretty sure the scalability in Navi is about multi-die architecture. Raja Koduri gave clear hints in a video interview about a year ago and has kept bringing it up since then. Not to mention AMD's push on Infinity Fabric for CPUs and GPUs, and DirectX 12 with multi-GPU.

Found one of the videos. https://www.youtube.com/watch?v=4qJj1ViyyPY

Scalability *could be* about multi-die in addition to what I said.

Topweasel · May 12, 2017

Magic Hate Ball said:
It depends on whether they can make them appear as a single logical device to the rest of the system, like two CCXs in Ryzen. After all, a GPU's workload is massively parallel.

Yeah, IF is the magic bullet for AMD. It works on CPU and GPU modules, can work with both CPUs and GPUs at the same time, works across a multi-die package, and can even span sockets.

Two Vega chips would show up as a single 16GB setup with 2x whatever CU count they go with. On top of that, packaging with HBM should keep the card size reasonable.

w3rd · May 12, 2017

Vega x2 is a thing.
And AMD's infinity Fabric is their future.

I am a keen observer of the industry, and given Dr. Su's and Raja's comments over the last few months, AMD's trajectory isn't hard to place. For gamers and end users, AMD is going to take the crown.

Let's face it, AMD now has better support and drivers than Nvidia. They have a unified architecture (HSA) that AMD has been aggressively working towards for 7 years, and that research, development, and effort is now coming together and starting to pay off (APUs).

But it is not hard to figure out that AMD's "Infinity Fabric" plays a big role in all of this. And so does their partnership with Hynix, and thus HBM2 and the new "unified cache controller". AMD's development of HSA has given them a head start on things like Vega X2 (i.e. a $799 Titan Xp?), which allows them to showcase their fabric with an array of multi-GPU chips.

Not important? Ask yourself this... if baby Vega is nearly equal to a GTX 1080, then how much wattage would two 1080s in SLI draw? Then what would the GPU scaling of two 1080s in SLI be? And the cost for the end user to buy two 1080s?

Then realize that two baby Vegas sitting on fabric don't suffer from any of those problems. It gives AMD tremendous value. And that is something their competitors have typically never offered.

crisium · May 12, 2017

sm625 said:
If this happens, then it means AMD will have gone more than 5 years without a single significant change in IPC. One Polaris CU is pretty much the same as a 5 year old Tahiti CU. It is really difficult to imagine even AMD being that incompetent.

All else being equal, Polaris is ~18% faster than Tahiti: https://www.computerbase.de/2016-08/amd-radeon-polaris-architektur-performance/2/

You can have an opinion that it is not enough, but it's good to have factual context.

Saylick · May 12, 2017

w3rd said:
Vega x2 is a thing.
And AMD's infinity Fabric is their future.

I am a keen observer of the industry, and given Dr. Su's and Raja's comments over the last few months, AMD's trajectory isn't hard to place. For gamers and end users, AMD is going to take the crown.

Let's face it, AMD now has better support and drivers than Nvidia. They have a unified architecture (HSA) that AMD has been aggressively working towards for 7 years, and that research, development, and effort is now coming together and starting to pay off (APUs).

But it is not hard to figure out that AMD's "Infinity Fabric" plays a big role in all of this. And so does their partnership with Hynix, and thus HBM2 and the new "unified cache controller". AMD's development of HSA has given them a head start on things like Vega X2 (i.e. a $799 Titan Xp?), which allows them to showcase their fabric with an array of multi-GPU chips.

Not important? Ask yourself this... if baby Vega is nearly equal to a GTX 1080, then how much wattage would two 1080s in SLI draw? Then what would the GPU scaling of two 1080s in SLI be? And the cost for the end user to buy two 1080s?

Then realize that two baby Vegas sitting on fabric don't suffer from any of those problems. It gives AMD tremendous value. And that is something their competitors have typically never offered.

As much as we enthusiasts appreciate the technical side of things, even if AMD has a technological advantage, I think it is fair to say that history has shown that superior technology is not enough to take the overall crown. Consistency in execution, superior marketing, and cultivating mindshare have trumped technological superiority in the past. AMD loyalists and enthusiasts in general will buy AMD, but your average Joe may not. I don't mean to sound pessimistic, but I too have been watching the whole ATI/AMD vs Nvidia "war" play out for the last 10 years, and it was only in those moments where AMD had solid execution and a great product, combined with Nvidia's incompetence, that they were able to claw back meaningful market share (e.g. HD 5870). The odds of Nvidia being incompetent and fouling up their execution are slim to none nowadays, so AMD is going to need more than a solid product to succeed, i.e. in today's competitive landscape, the product cannot sell itself purely on its own merits. They will need to win the hearts of the masses for that to happen, but it is an uphill battle, as Nvidia is deeply entrenched in that position. Unfortunately, a lot of cash and a great marketing team are hard to come by for AMD.
