EXCellR8
Diamond Member
- Sep 1, 2010
if this architecture doesn't work out at least we'll be able to say... "what happens in Vega, stays in Vega"
... I'll show myself out
Pixel fillrate is definitely miles behind competitors' products. Even a GTX 1070 is quite a bit ahead.
Yet NVIDIA keeps adding more and more ROPs while using tile-based rendering, with good results.

If I am not mistaken:
With tile-based rendering, the idea is to split the screen into small tiles; all the data for a tile is loaded and stored locally on chip, preventing external memory accesses (to HBM2 or GDDR). Any reading, modifying, and writing of data can then be done locally, in (I suspect) the L2 cache. Only the initial load of the cache from external memory, and the store of the end result back to external memory, are needed. Generally speaking.
The ROP (Raster Operations) units now work closely with the GPU's L2 cache, so I suspect the L2 cache has grown quite a bit to help reduce external memory accesses.
The draw-stream binning rasterizer, which has the task of culling non-visible polygons, will also help with hidden surface removal.
The trick with culling non-visible polygons (hidden surface removal) is that you then have to perform fewer operations on pixels, so a lower pixel fillrate is less of an issue if the culling is successful. It all depends on the implementation, of course (and on driver support)...
At least, that is the theory.
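That theory can be sketched in a few lines. This is a toy illustration of my own (tile sizes, data structures, and the single-write-back behavior are all assumptions for clarity, not how NVIDIA's or AMD's hardware actually works): triangles are binned to the screen tiles they overlap, each tile is processed entirely in a small local buffer, and only the finished tile is written back to "external memory" once.

```python
# Toy sketch of tile-based rendering. Illustrative only: real hardware
# rasterizes, depth-tests, and shades per tile; here we just show the
# binning and the tile-local buffer with a single write-back per tile.

TILE = 16  # tile edge in pixels (hypothetical size)

def aabb(tri):
    """Axis-aligned bounding box of a 2D triangle [(x, y), ...]."""
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    return min(xs), min(ys), max(xs), max(ys)

def bin_triangles(tris, width, height):
    """Assign each triangle to every screen tile its bounding box overlaps."""
    bins = {}
    for tri in tris:
        x0, y0, x1, y1 = aabb(tri)
        for ty in range(max(0, y0 // TILE), min(height - 1, y1) // TILE + 1):
            for tx in range(max(0, x0 // TILE), min(width - 1, x1) // TILE + 1):
                bins.setdefault((tx, ty), []).append(tri)
    return bins

def render(tris, width, height):
    framebuffer = {}  # stands in for external memory (HBM2 / GDDR)
    for (tx, ty), tile_tris in bin_triangles(tris, width, height).items():
        tile_buf = {}  # stands in for on-chip storage (e.g. the L2 cache)
        for tri in tile_tris:
            # a real GPU would rasterize and depth-test every covered pixel
            # here; we only mark the tile-origin pixel for brevity
            tile_buf[(tx * TILE, ty * TILE)] = True
        framebuffer.update(tile_buf)  # single write-back per finished tile
    return framebuffer

# One triangle spanning 40x40 pixels touches a 3x3 block of 16-pixel tiles.
fb = render([[(0, 0), (40, 0), (0, 40)]], 64, 64)
```

The point of the sketch is the memory-traffic pattern: everything between the bin read and the final `framebuffer.update` stays in `tile_buf`, which is exactly where hidden-surface removal pays off, since culled triangles never touch the tile buffer at all.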
Yet NVIDIA keeps adding more and more ROPs while using tile based rendering, with good results.
Nvidia went from 2880:48 shaders:ROPs to 1280:48 just two generations later; they must be doing something right. In the same period (and in the same competing product stacks), AMD went from 2816:64 to 2304:32. The performance per flop and the relative performance per area don't make me think anything right is going on with AMD's decisions. Pixel fillrate must still be very important, but only one company is willing or able to chase higher amounts.
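For a rough sanity check on those shader:ROP configurations: peak pixel fillrate is just ROPs × core clock. The clocks below are approximate reference boost clocks (my assumption), so treat the results as ballpark figures only.

```python
# Rough peak pixel-fillrate arithmetic for the configurations mentioned
# above. Clocks are approximate reference boost clocks, not guaranteed.

def gpixels_per_s(rops, clock_mhz):
    """Peak fillrate in GPixels/s = ROPs * clock (MHz) / 1000."""
    return rops * clock_mhz / 1000.0

cards = {
    "GTX 780 Ti (2880:48)": (48, 928),   # ~45 GPix/s
    "GTX 1060  (1280:48)": (48, 1708),   # ~82 GPix/s
    "R9 290X   (2816:64)": (64, 1000),   # ~64 GPix/s
    "RX 480    (2304:32)": (32, 1266),   # ~41 GPix/s
}

for name, (rops, mhz) in cards.items():
    print(f"{name}: ~{gpixels_per_s(rops, mhz):.0f} GPix/s")
```

Even as ballpark numbers, they show why clock speed matters as much as ROP count: the 1060 keeps 48 ROPs but nearly doubles the clock, while the RX 480 halves the ROPs and only partially compensates with frequency.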
I really thought we would see 128 ROPs in Vega when it was announced, since it was supposed to be a major overhaul of GCN.
Does that mean the Vega architecture (I'm talking about the micro-architecture, not the RTL coding) was fully designed by an inexperienced team?

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
That's a fairly ridiculous thing to suggest. Given internal documentation and schematics, it's all down to their resources and ingenuity. There isn't anything hidden from them.

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
Yeah, Vega should have 6 SEs/96 ROPs, giving +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus for 768 GB/s of bandwidth, or a 4096-bit bus for 1 TB/s. The current Vega just has so many bottlenecks.

I think the current Chinese development team simply doesn't know how to fix the limitations of GCN (remember, they weren't the ones who originally designed it - a North American team did that, but was laid off in 2013). They don't know how to expand it to more than 4 shader engines, 64 CUs, or 64 ROPs. All they can do is some mild tinkering around the edges.
How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters come to the same conclusion?

Yeah, Vega should have 6 SEs/96 ROPs, giving +50% geometry power and +50% pixel fillrate. Vega should also have a 3072-bit bus for 768 GB/s of bandwidth, or a 4096-bit bus for 1 TB/s. The current Vega just has so many bottlenecks.
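The bandwidth figures being thrown around follow directly from the HBM bus-width arithmetic: bandwidth (GB/s) = bus width (bits) × per-pin data rate (Gbit/s) / 8. The 2.0 Gbps per-pin rate below is an assumption that matches the poster's numbers; shipping Vega 64 actually runs its HBM2 at about 1.89 Gbps on a 2048-bit bus.

```python
# HBM bandwidth arithmetic behind the 768 GB/s and 1 TB/s figures above.
# Each HBM stack contributes a 1024-bit-wide interface.

def bandwidth_gbs(bus_bits, gbps_per_pin):
    """Memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_bits * gbps_per_pin / 8

print(bandwidth_gbs(2048, 1.89))  # Vega 64 as shipped: ~484 GB/s
print(bandwidth_gbs(3072, 2.0))   # hypothetical 3-stack Vega: 768 GB/s
print(bandwidth_gbs(4096, 2.0))   # hypothetical 4-stack Vega: 1024 GB/s (~1 TB/s)
```

So a 3-stack or 4-stack Vega at 2.0 Gbps would indeed land on 768 GB/s and ~1 TB/s respectively, at the cost of a larger interposer and more HBM2 stacks.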
You should watch the video rather than making inaccurate assumptions.

I believe that this was a test that AMD set the rules for, so you'd be a fool to believe it had any chance of coming out in a way that doesn't make them look good.
Have you done these comparisons? Because if you haven't, you have no way of knowing, never mind verifying, the accuracy of your claims.

They could have used a 960 and had the same visual results.
Heck, just get a 1050, put it up against RX Vega with the exact same setup, and then NVIDIA could say: look, our GPU + monitor costs $500 less than RX Vega + monitor, and you can't tell the difference.
This is yet more smoke & mirrors.
Frankly, this is the exact same kind of crap Intel is doing with the "glue" comments on Epyc & Threadripper. This is beneath AMD.
I am a little more surprised that HardOCP repeated the kind of silly test that AMD was doing on its tour.
They weren't attempting to determine frame rates at all. And without actually performing the test with a 1070, there is no way to claim any accuracy for your statement; it's simply hyperbole.

Because he could just as well have used a GTX 1070 and had the same results. All this kind of testing proves is that people are poor at judging frame rate once it gets over a reasonable threshold, especially with some kind of variable sync on top.
Pretty big at this point. Fury X and Vega still have tons of bottlenecks because of the 4 shader engines and low memory bandwidth.

How likely do you think it is that AMD doesn't know this, when a bunch of semi-knowledgeable forum posters come to the same conclusion?
Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 6 SEs is fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.

Pretty big at this point. Fury X and Vega still have tons of bottlenecks because of the 4 shader engines and low memory bandwidth.
Maybe they know they have bottlenecks, but surely they don't know how to fix them (6x SE and 96 ROPs).
Yeah, AMD is so smart we could never know more than them.

Or maybe the bottlenecks aren't as obvious, which is why they aren't doing this. Maybe 6 SEs is fine, and maybe 64 ROPs are too. We don't know nearly as much as AMD does.
Just because they're lagging behind NVIDIA in a theoretical sense on some aspects (like max pixel fill rate), doesn't mean that's what's holding back their performance.
If it was as simple as more ROPs, they would have done it; Vega, unlike Fiji, had room to spare.
Yeah, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
1. Do yields mean anything to you? Because yes, AMD knows best about their spread of chips and what voltage they need to hit a certain clock speed on x% of chips.

Yeah, AMD is so smart we could never know more than them.
We should probably turn our voltage back up on Polaris. Just to be safe.
If it were completely alien technology to them, they wouldn't be able to create chips from it. It's not as if Polaris or Fiji have the same floorplan or configuration as previous designs.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.

Again, you're assuming they know how to do it. I think they don't - GCN, at this point, is basically alien technology to them, designed by a completely different team, and they can't figure out how to expand it. Keep in mind that these are low-wage engineers from a culture not known for its creativity.
They throw it to the best team they can afford... that's the point being made.

GCN is a high-level module; it's broken down into smaller modules, and engineers work according to the documentation. Expanding (modifying) the architecture is most probably handled by senior team members who know GCN well.
GCN is one of the industry's leading technologies; no company hands its important tech to a totally inexperienced team.