I don't think AMD GFX has 'arrived' at another point yet. They are paving the way - but NV is a moving target with a huge treasure chest. From what I've seen so far, TRG needs to be executing much better than they are. Honestly, what I hope for, at this point, is that Ryzen and the Ryzen-based APUs will take up all the production capacity at GF. Then TRG (the Radeon group) can move back to TSMC. I think this would be the best play for them (it doesn't seem like GF has done as well executing on large ASICs).
As far as the multi-die strategy goes, so long as NV is able to stay ahead of AMD with ~400mm2 GPUs, AMD will have a tough time grabbing back market share. The engineering for stitching together two dice with some fast and wide fabric will be very challenging, IMHO. Unless IF really scales up to that level, AMD will need some custom interconnect system, and even then, there will be latency issues with off-die communication of shared info.
By "arrive," I meant arriving at another strategy. I never meant they have finished the journey.
Yes, they appear to be delayed.
Ryzen appears more energy efficient than Intel, so the GloFo/Samsung process can't be that bad. Porting to TSMC is a non-starter long term. Nvidia will have first call on wafers.
Executing large ASICs doesn't matter with the small-die approach. That is one of its great values: early access to new nodes with good yields. Do you really think AMD stating quite clearly their early bold [foolhardy?] move to 7nm has nothing to do with yields? They must be fabbing small dice for both CPUs and GPUs.
There is no true reticle limit to composite ASIC size. Interposer limits become the barrier to max size. What about a 1000mm^2+ stitched-together top model, clocked for great energy efficiency?
IF can scale to 512 bits at least, according to AMD. As I said, Naples and Threadripper will tell us a lot more about its capabilities.
Last year in the multi-die thread, I linked a Xilinx paper showing that signal latency over a Si interposer is the same as on die. I am assuming the dice are all on a shared interposer, not connected through the PCB.
Why do you think multi-die means 2? I'm thinking a 1 to 8 ratio, top to bottom. Huge product stack, small development cost.
Small dice are a lot cheaper per mm^2. We might very well have a much larger AMD composite die competing with a smaller Nvidia one, while costing the same or even less to fab.
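To put a rough number on the cost-per-mm^2 point, here is a back-of-envelope sketch using the textbook Poisson yield model (yield = exp(-defect density x die area)). The defect density and the two die sizes are my own illustrative assumptions, not figures from AMD or any foundry, and the model ignores wafer-edge loss, interposer and packaging cost, and salvaging partly defective dice.

```python
import math

def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dice expected to have zero defects (simple Poisson model)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)

def silicon_per_good_mm2(area_mm2, defects_per_cm2):
    """Silicon fabbed per good mm^2, relative to perfect yield (1.0)."""
    return 1.0 / poisson_yield(area_mm2, defects_per_cm2)

# ASSUMPTION: 0.5 defects/cm^2, a made-up value standing in for an immature node.
D0 = 0.5

small = silicon_per_good_mm2(100, D0)  # hypothetical 100 mm^2 building-block die
large = silicon_per_good_mm2(400, D0)  # hypothetical 400 mm^2 monolithic die
ratio = large / small

print(f"100 mm^2 die yield: {poisson_yield(100, D0):.1%}")
print(f"400 mm^2 die yield: {poisson_yield(400, D0):.1%}")
print(f"Monolithic silicon cost per good mm^2 vs small die: {ratio:.2f}x")
```

With those assumed inputs the monolithic die burns roughly 4.5x the silicon per good mm^2, which is the kind of gap that could let four small dice undercut one big one. On a mature node with a much lower defect density the gap shrinks a lot, so treat this as a sketch of the argument rather than a cost estimate.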
Development cost certainly sees large savings. Similarly, fabbing a single die must be amazing for inventory control and product flexibility. [Points to Ryzen]
The amazing thing is that we had scalability as a stated goal for Navi, before the comments were removed [Raja spilled the beans too early?]. We have a very early use of 7nm, when new nodes are traditionally only used for small mobile ASICs due to the costs and difficulties associated with larger ones. We have the CPU side following a similar strategy. Is it really such a leap?