I'd wager that CF works questionably mostly in NV-sponsored games, and you can guess why. It's not as if a company has never asked a dev to remove features to disadvantage its competition.

You can at least try to be original with your comebacks. Curious: if you think SLI is broken "quite often", how would you describe CF, which is broken far more often? Or do you play the ignorance card and pretend those cases don't exist? It's really more of a rhetorical question than anything. I already know the answer. Pretty sure you do too.
https://youtu.be/KIiNFXf00_U?t=11m5s
AMD explains why 4GB of HBM is not comparable to 4GB of GDDR5.
Actually, as some of us expected, due to the overwhelming bandwidth (and lower latency) they can manage it as a dynamic cache: bring in the assets required for the scene and dump the rest into system RAM on the fly.
Very similar to what NV does with the 3.5GB + 0.5GB partitions on the 970 (as well as many earlier NV cards with segmented memory): put non-required assets into the slow partition.
It means they need to optimize drivers for each game that actually requires (not merely allocates, but actually requires) more than 4GB of VRAM at 4K.

Not exactly. It will be done by the new APIs, and every optimization will lie in the developers' hands.
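Either way, the "dynamic cache" idea itself is easy to picture. A minimal sketch (my own illustration, not AMD's actual driver logic, and the names are made up) of VRAM managed as an LRU cache that spills cold assets to system RAM:

```cpp
#include <cstdint>
#include <cstdio>
#include <list>
#include <string>
#include <unordered_map>

// Toy sketch of "VRAM as a dynamic cache": an LRU list of assets, with the
// coldest ones spilled to system RAM when a new asset doesn't fit.
class VramCache {
public:
    explicit VramCache(uint64_t capacityBytes) : capacity_(capacityBytes) {}

    // Called when the renderer needs an asset for the current scene.
    void touch(const std::string& id, uint64_t sizeBytes) {
        auto it = index_.find(id);
        if (it != index_.end()) {                  // already resident: mark hot
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        while (used_ + sizeBytes > capacity_ && !lru_.empty())
            evict();                               // spill coldest to system RAM
        lru_.push_front({id, sizeBytes});          // PCIe upload omitted
        index_[id] = lru_.begin();
        used_ += sizeBytes;
    }

private:
    struct Entry { std::string id; uint64_t size; };

    void evict() {
        const Entry& cold = lru_.back();           // least-recently-used asset
        std::printf("evict %s to system RAM\n", cold.id.c_str());
        used_ -= cold.size;
        index_.erase(cold.id);
        lru_.pop_back();
    }

    uint64_t capacity_;
    uint64_t used_ = 0;
    std::list<Entry> lru_;                         // front = hottest, back = coldest
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};

int main() {
    VramCache vram(4ULL << 30);                    // pretend 4GB of HBM
    vram.touch("city_block_7", 3ULL << 30);
    vram.touch("character_pack", 2ULL << 30);      // forces city_block_7 out
}
```

Whether that logic lives in the driver or, under the new APIs, in the engine itself is exactly the disagreement here.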
Yes, but they are trying to sell us the cards. So we should take that with a grain of salt until objective reviews are done.
Ofc, take nothing concrete from official company-speak; it's just that the way he describes it is exactly how I envisioned it in a post a while ago: 4GB of HBM as a dynamic cache for assets, since it's just so fast (bandwidth & latency).
Proof will be a Fury X CF vs 980 Ti SLI battle at 4K, maxing out settings while staying playable at >45 fps. If Fury X CF tanks, then we know it's hit a VRAM barrier. What will be useless is comparing 4K with 8x MSAA, where both the CF and SLI setups get 20 fps. I'm sure some shill sites will try it anyway.
Yep, but some think that is without Game'barely'Works features enabled. Still, no mean feat, that. They already have a slide showing a single Fury card averaging 53fps in FC4 at 4K, don't they?
Actually, it is not as simple as an inept developer, I'm afraid. On more than a few occasions, game code hasn't been available to them until the launch of said game. That of course skews the results in some reviews. That, and things like the excessive tessellation in Crysis 2, Metro etc., which was obviously designed to have a certain impact.
@garagisti It depends on the game developer. With Witcher 3 I am very happy with the performance: just a few small tweaks and CF works amazingly, with better scaling than NV, AND I get the option of running Hairworks with minimal performance loss by controlling tessellation usage.
Most of the GW titles that cause AMD problems have been from Ubisoft. To be fair, they also messed up on NV GPUs: FC4 had broken SLI on NV setups and its shadows were messed up for months; Watch Dogs didn't even have functional SLI at 4K and stuttered on NV SLI at every resolution; ACU was just bug-ridden in general.
Basically, if it's an Ubi title, just avoid buying it early on; wait a few months for patches and you're good to go.
@5150Joker Comparing a $649 Fury X with a Titan X (@ $999) is a comparison, but NOT of equal cards. Running three Titan X's is ~$3000 vs ~$1950 for three Fury X's.
I suspect AMD could not get an 8GB card ready, or decided to wait for HBM2 to release such a high-resolution card. Tough when you are releasing a new model that you want to run well. No doubt their highest-end consumer card, for now, is limited to 4GB of HBM1 VRAM.
If you have followed the releases from AMD, and especially from Joe Macri, he recently (in the last few weeks) acknowledged this limitation and said he addressed it by assigning engineers specifically to better utilize the VRAM. Only extensive testing will show whether his efforts bore fruit.
The real battle appears to be between the GTX 980 Ti and the Fury X. The price is comparable, though the GTX 980 Ti has more VRAM.
Only when reviewers get these cards and run them at 4K (and on multiple 4K monitors) with various games will we see the results.
In games like Witcher 3 and GTA V, moving through the open world is seamless and stutter-free, and that's due to asset streaming. That data is coming from system RAM. It's definitely possible with optimizations, or as they call them: heuristics.
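None of us know CDPR's or Rockstar's actual streaming logic, but a toy version of such a heuristic (all names hypothetical) could be as simple as ranking assets by camera distance and streaming the nearest non-resident ones under a per-frame budget, so the PCIe copies never cause a hitch:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Toy streaming heuristic (illustrative only): rank world assets by distance
// to the camera, then queue the nearest non-resident ones for a
// system-RAM -> VRAM copy, capped per frame.
struct Candidate {
    int   assetId;
    float distance;   // camera to the asset's bounding volume
    bool  resident;   // already in VRAM?
};

std::vector<int> pickAssetsToStream(std::vector<Candidate> c, size_t perFrameCap) {
    std::sort(c.begin(), c.end(),
              [](const Candidate& a, const Candidate& b) {
                  return a.distance < b.distance;   // nearest first
              });
    std::vector<int> queue;
    for (const auto& cand : c) {
        if (queue.size() == perFrameCap) break;     // per-frame budget exhausted
        if (!cand.resident) queue.push_back(cand.assetId);
    }
    return queue;
}

int main() {
    auto q = pickAssetsToStream({{1, 50.f, false}, {2, 10.f, false}, {3, 5.f, true}}, 1);
    std::printf("stream asset %d first\n", q[0]);   // nearest non-resident: 2
}
```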
Why multiple 4k monitors? Who here is building a setup to use multiple 4k monitors on today's cards?
From his sig, the Titan X SLI is driving a 1440p monitor, a pretty good matchup for maxing games. But maxing out 4K with recent titles definitely isn't possible for SLI; you'll need quad SLI for a single 4K "max".
Who knows, maybe future games won't be as demanding as the recent batch. But how likely is that?
Why didn't AMD do it before, then? (Or NV for that matter)
The memory bandwidth of Fury X is not all that much higher than the 290/390X's: 384GB/sec vs 512GB/sec, certainly not "overwhelming", just 33% higher.
In both cases the developers are needlessly wasting GPU memory.
If you are afraid of texture popping from HDD streaming, you can keep your assets in main RAM in a losslessly compressed format (LZMA or such, in addition to DXT). Decompress the data when a texture region becomes visible and stream it to GPU memory using tiled resources.
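To make that path concrete, here is a rough sketch of it (my illustration, not the quoted dev's code; the helper names are hypothetical, and the LZMA decoder and the DX11.2 tiled-resource upload are stubbed out):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct TileId { uint32_t mip, x, y; };

// Stand-in for a real LZMA decoder (e.g. liblzma); here it just copies, where
// a real implementation would inflate LZMA back into raw DXT blocks.
static std::vector<uint8_t> lzmaInflate(const std::vector<uint8_t>& packed) {
    return packed;
}

// Stand-in for the GPU upload; a real version would map the tile into a
// DX11.2 tiled resource and copy the DXT blocks across.
static void uploadTileToGpu(TileId t, const uint8_t* dxt, size_t n) {
    std::printf("upload mip %u, tile (%u,%u): %zu bytes of DXT\n",
                t.mip, t.x, t.y, n);
}

// Called by the visibility system when a texture region first becomes visible.
// The data stays DXT-compressed all the way into VRAM; GPUs sample DXT
// natively, so only the LZMA wrapper is removed on the CPU.
void onTileBecameVisible(TileId tile, const std::vector<uint8_t>& packedTile) {
    std::vector<uint8_t> dxt = lzmaInflate(packedTile);
    uploadTileToGpu(tile, dxt.data(), dxt.size());
}

int main() {
    std::vector<uint8_t> fakeTile(65536, 0);   // one 64KB tile, LZMA-packed
    onTileBecameVisible({0, 3, 7}, fakeTile);
}
```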
Uncompressed textures (no DXT compression in addition to ZIP/LZMA) are just a stupid idea in the huge majority of use cases: you waste a lot of bandwidth (performance) and memory for no visible gain. With normalization/renormalization the quality is very good for material properties and albedo/diffuse, and BC5 actually beats uncompressed R8G8 in quality for normal maps (the Crytek paper about this method is a good read). The BC7 format in DX11 gives you extra quality compared to BC3 at no extra runtime cost.
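To put rough numbers on that waste: RGBA8 is 32 bits per texel, while BC1 is 4 and BC3/BC5/BC7 are 8, so a single 4096x4096 texture (ignoring mips) drops from 64MB to 8-16MB. A quick sanity check:

```cpp
#include <cstdio>

int main() {
    const long long texels = 4096LL * 4096LL;
    // Bits per texel: 32 for RGBA8, 4 for BC1, 8 for BC3/BC5/BC7.
    const long long rgba8 = texels * 32 / 8;        // bytes
    const long long bc1   = texels * 4 / 8;
    const long long bc7   = texels * 8 / 8;
    std::printf("RGBA8: %lld MiB\n", rgba8 >> 20);  // 64
    std::printf("BC1:   %lld MiB\n", bc1 >> 20);    // 8
    std::printf("BC7:   %lld MiB\n", bc7 >> 20);    // 16
}
```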
Most games are not using BC7 on PC yet, because the developer also needs to support DX10 GPUs, and duplicate assets would double the download size. Tiled resources need DX11.2, and DX11.2 unfortunately requires Windows 8; that is not yet a broad enough audience. These problems will fix themselves in a few years. In addition, DX12 adds async copy queues and async compute, allowing faster streaming with less latency (much reduced texture popping).
Hopefully these new features will stop the brute force memory wasting seen in some PC games. Everything we have seen so far could have been easily implemented using less than 2GB of video memory (even at 4K), if the memory usage was tightly optimized.
33% raw, but with memory compression added it's effectively doubled. That's a ton of bandwidth available to move buffers and assets around, rotating their residency or keeping them "non-local" on the fly. I'm not a GPU engineer, so I can't say with authority that it's possible, but AMD's brief explanation seems plausible. No reason to doubt it until proven otherwise.
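Taking that claim at face value (the 2:1 figure for delta colour compression is an optimistic, marketing-grade assumption; real ratios vary per workload), the arithmetic works out like this:

```cpp
#include <cstdio>

// Sanity-checking the claim above; the dcc ratio is an assumed upper bound,
// not a measured figure.
int main() {
    const double r390x = 384.0;  // GB/s, 290/390X class (Hawaii, no DCC)
    const double furyX = 512.0;  // GB/s, raw HBM bandwidth
    const double dcc   = 2.0;    // assumed average compression ratio
    std::printf("raw uplift:       %+.0f%%\n", (furyX / r390x - 1.0) * 100.0);        // +33%
    std::printf("effective uplift: %+.0f%%\n", (furyX * dcc / r390x - 1.0) * 100.0);  // +167%
}
```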
https://forum.beyond3d.com/posts/1852564/
There we go from the mouth of a AAA render dev.