I'd wager that CF works questionably mostly in NV-sponsored games, and you can guess why. It's not as if a company has never asked a dev to remove features to disadvantage its competition.

You can at least try to be original with your comebacks. Curious: if you think SLI is broken "quite often", how would you describe CF, which is broken far more often? Or do you play the ignorance card and pretend those cases don't exist? It's really more of a rhetorical question than anything. I already know the answer. Pretty sure you do too.
https://youtu.be/KIiNFXf00_U?t=11m5s
AMD explains why 4GB of HBM is not comparable to 4GB of GDDR5.
Actually, as some of us expected, due to the overwhelming bandwidth (and lower latency) they can manage it as a dynamic cache: bring in the assets required for the scene and dump the rest into system RAM on the fly.
Very similar to what NV does with the 3.5GB + 0.5GB partitions on the 970 (as well as many earlier NV cards with segmented memory): put non-required assets into the slow partition.
It means they need to optimize drivers for each game that actually requires (not merely allocates, but actually requires) more than 4GB of VRAM at 4K.

Not exactly. It will be done by the new APIs, and every optimization will lie in the developers' hands.
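Either way, the "dynamic cache" idea itself is easy to picture. A minimal sketch (my own illustration, not AMD's actual driver logic, and the names are made up) of VRAM managed as an LRU cache that spills cold assets to system RAM:

```cpp
#include <cstdint>
#include <cstdio>
#include <list>
#include <string>
#include <unordered_map>

// Toy sketch of "VRAM as a dynamic cache": an LRU list of assets, with the
// coldest ones spilled to system RAM when a new asset doesn't fit.
class VramCache {
public:
    explicit VramCache(uint64_t capacityBytes) : capacity_(capacityBytes) {}

    // Called when the renderer needs an asset for the current scene.
    void touch(const std::string& id, uint64_t sizeBytes) {
        auto it = index_.find(id);
        if (it != index_.end()) {                  // already resident: mark hot
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        while (used_ + sizeBytes > capacity_ && !lru_.empty())
            evict();                               // spill coldest to system RAM
        lru_.push_front({id, sizeBytes});          // PCIe upload omitted
        index_[id] = lru_.begin();
        used_ += sizeBytes;
    }

private:
    struct Entry { std::string id; uint64_t size; };

    void evict() {
        const Entry& cold = lru_.back();           // least-recently-used asset
        std::printf("evict %s to system RAM\n", cold.id.c_str());
        used_ -= cold.size;
        index_.erase(cold.id);
        lru_.pop_back();
    }

    uint64_t capacity_;
    uint64_t used_ = 0;
    std::list<Entry> lru_;                         // front = hottest, back = coldest
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};

int main() {
    VramCache vram(4ULL << 30);                    // pretend 4GB of HBM
    vram.touch("city_block_7", 3ULL << 30);
    vram.touch("character_pack", 2ULL << 30);      // forces city_block_7 out
}
```

Whether that logic lives in the driver or, under the new APIs, in the engine itself is exactly the disagreement here.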
Yes, but they are trying to sell us the cards. So we should take that with a grain of salt until objective reviews are done.
Ofc, take nothing concrete from official company-speak; it's just that the way he describes it is exactly how I envisioned it in a post a while ago: 4GB of HBM as a dynamic cache for assets, since it's just so fast (bandwidth & latency).
Proof will be a Fury X CF vs 980 Ti SLI battle at 4K, maxing out settings while staying playable at >45 fps. If Fury X CF tanks, then we know it's hit a VRAM barrier. What will be useless is comparing 4K with 8x MSAA, where both the CF and SLI setups get 20 fps. I'm sure some shill sites will try it anyway.
Yep, but some think that is without Game'barely'Works features enabled. Still, no mean feat, that. They already have a slide showing a single Fury card averaging 53fps in FC4 at 4K, don't they?
Actually, it is not as simple as an inept developer, I'm afraid. On more than a few occasions, game code hasn't been available to them until the launch of said game. That of course skews the results in some reviews. That, and things like the excessive tessellation in Crysis 2, Metro etc., which was obviously designed to have a certain impact.
@garagisti It depends on the game developer. With Witcher 3 I am very happy with the performance: just a few small tweaks and CF works amazingly, with better scaling than NV, AND I get the option of running Hairworks with minimal performance loss by controlling tessellation usage.
Most of the GW titles that cause AMD problems have been from Ubisoft. To be fair, they also messed up on NV GPUs: FC4 had broken SLI on NV setups and its shadows were messed up for months; Watch Dogs didn't even have functional SLI at 4K and stuttered on NV SLI at every resolution; ACU was just bug-ridden in general.
Basically, if it's an Ubi title, just avoid buying it early on; wait a few months for patches and you're good to go.
@5150Joker Comparing a $649 Fury X with a Titan X (@ $999) is a comparison, but NOT of equal cards. Running three Titan X's is ~$3000 vs ~$1950 for three Fury X's.
I suspect AMD could not get an 8GB card ready, or decided to wait for HBM2 to release such a high-resolution card. Tough when you are releasing a new model that you want to run well. No doubt their highest-end consumer card, for now, is limited to 4GB of HBM1 VRAM.
If you have followed the releases from AMD, and especially from Joe Macri, he recently (in the last few weeks) acknowledged this limitation and said he addressed it by assigning engineers specifically to better utilize the VRAM. Only extensive testing will show whether his efforts bore fruit.
The real battle appears to be between the GTX 980 Ti and the Fury X. The price is comparable, though the GTX 980 Ti has more VRAM.
Only when reviewers get these cards and run them at 4K (and on multiple 4K monitors) with various games will we see the results.
In games like Witcher 3 and GTA V, moving through the open world is seamless and stutter-free, and that's due to asset streaming. That data is coming from system RAM. It's definitely possible with optimizations, or as they call them: heuristics.
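None of us know CDPR's or Rockstar's actual streaming logic, but a toy version of such a heuristic (all names hypothetical) could be as simple as ranking assets by camera distance and streaming the nearest non-resident ones under a per-frame budget, so the PCIe copies never cause a hitch:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Toy streaming heuristic (illustrative only): rank world assets by distance
// to the camera, then queue the nearest non-resident ones for a
// system-RAM -> VRAM copy, capped per frame.
struct Candidate {
    int   assetId;
    float distance;   // camera to the asset's bounding volume
    bool  resident;   // already in VRAM?
};

std::vector<int> pickAssetsToStream(std::vector<Candidate> c, size_t perFrameCap) {
    std::sort(c.begin(), c.end(),
              [](const Candidate& a, const Candidate& b) {
                  return a.distance < b.distance;   // nearest first
              });
    std::vector<int> queue;
    for (const auto& cand : c) {
        if (queue.size() == perFrameCap) break;     // per-frame budget exhausted
        if (!cand.resident) queue.push_back(cand.assetId);
    }
    return queue;
}

int main() {
    auto q = pickAssetsToStream({{1, 50.f, false}, {2, 10.f, false}, {3, 5.f, true}}, 1);
    std::printf("stream asset %d first\n", q[0]);   // nearest non-resident: 2
}
```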
Why multiple 4k monitors? Who here is building a setup to use multiple 4k monitors on today's cards?
From his sig, the Titan X SLI is driving a 1440p monitor, a pretty good matchup for maxing games. But maxing out 4K with recent titles definitely isn't possible for SLI; you'll need quad SLI for a single 4K "max".
Who knows, maybe future games won't be as demanding as the recent batch. But how likely is that?
Why didn't AMD do it before, then? (Or NV for that matter)
The memory bandwidth of Fury X is not all that much higher than the 290/390X's: 384GB/sec vs 512GB/sec, certainly not "overwhelming", just 33% higher.
In both cases the developers are needlessly wasting GPU memory.
If you are afraid of texture popping from HDD streaming, you can keep your assets in main RAM in a losslessly compressed format (LZMA or such, in addition to DXT). Decompress the data when a texture region becomes visible and stream it to GPU memory using tiled resources.
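To make that path concrete, here is a rough sketch of it (my illustration, not the quoted dev's code; the helper names are hypothetical, and the LZMA decoder and the DX11.2 tiled-resource upload are stubbed out):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct TileId { uint32_t mip, x, y; };

// Stand-in for a real LZMA decoder (e.g. liblzma); here it just copies, where
// a real implementation would inflate LZMA back into raw DXT blocks.
static std::vector<uint8_t> lzmaInflate(const std::vector<uint8_t>& packed) {
    return packed;
}

// Stand-in for the GPU upload; a real version would map the tile into a
// DX11.2 tiled resource and copy the DXT blocks across.
static void uploadTileToGpu(TileId t, const uint8_t* dxt, size_t n) {
    std::printf("upload mip %u, tile (%u,%u): %zu bytes of DXT\n",
                t.mip, t.x, t.y, n);
}

// Called by the visibility system when a texture region first becomes visible.
// The data stays DXT-compressed all the way into VRAM; GPUs sample DXT
// natively, so only the LZMA wrapper is removed on the CPU.
void onTileBecameVisible(TileId tile, const std::vector<uint8_t>& packedTile) {
    std::vector<uint8_t> dxt = lzmaInflate(packedTile);
    uploadTileToGpu(tile, dxt.data(), dxt.size());
}

int main() {
    std::vector<uint8_t> fakeTile(65536, 0);   // one 64KB tile, LZMA-packed
    onTileBecameVisible({0, 3, 7}, fakeTile);
}
```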
Uncompressed textures (no DXT compression in addition to ZIP/LZMA) are just a stupid idea in the huge majority of use cases: you waste a lot of bandwidth (performance) and memory for no visible gain. With normalization/renormalization the quality is very good for material properties and albedo/diffuse, and BC5 actually beats uncompressed R8G8 in quality for normal maps (the Crytek paper about this method is a good read). The BC7 format in DX11 gives you extra quality compared to BC3 at no extra runtime cost.
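To put rough numbers on that waste: RGBA8 is 32 bits per texel, while BC1 is 4 and BC3/BC5/BC7 are 8, so a single 4096x4096 texture (ignoring mips) drops from 64MB to 8-16MB. A quick sanity check:

```cpp
#include <cstdio>

int main() {
    const long long texels = 4096LL * 4096LL;
    // Bits per texel: 32 for RGBA8, 4 for BC1, 8 for BC3/BC5/BC7.
    const long long rgba8 = texels * 32 / 8;        // bytes
    const long long bc1   = texels * 4 / 8;
    const long long bc7   = texels * 8 / 8;
    std::printf("RGBA8: %lld MiB\n", rgba8 >> 20);  // 64
    std::printf("BC1:   %lld MiB\n", bc1 >> 20);    // 8
    std::printf("BC7:   %lld MiB\n", bc7 >> 20);    // 16
}
```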
Most games are not using BC7 on PC yet, because the developer also needs to support DX10 GPUs, and duplicate assets would double the download size. Tiled resources need DX11.2, and DX11.2 unfortunately requires Windows 8; that is not yet a broad enough audience. These problems will fix themselves in a few years. In addition, DX12 adds async copy queues and async compute, allowing faster streaming with less latency (much reduced texture popping).
Hopefully these new features will stop the brute force memory wasting seen in some PC games. Everything we have seen so far could have been easily implemented using less than 2GB of video memory (even at 4K), if the memory usage was tightly optimized.
33% raw, but with memory compression added it's effectively doubled. That's a ton of bandwidth available to move buffers and assets around, rotating their residency or keeping them "non-local" on the fly. I'm not a GPU engineer, so I can't say with authority that it's possible, but AMD's brief explanation seems plausible. No reason to doubt it until proven otherwise.
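Taking that claim at face value (the 2:1 figure for delta colour compression is an optimistic, marketing-grade assumption; real ratios vary per workload), the arithmetic works out like this:

```cpp
#include <cstdio>

// Sanity-checking the claim above; the dcc ratio is an assumed upper bound,
// not a measured figure.
int main() {
    const double r390x = 384.0;  // GB/s, 290/390X class (Hawaii, no DCC)
    const double furyX = 512.0;  // GB/s, raw HBM bandwidth
    const double dcc   = 2.0;    // assumed average compression ratio
    std::printf("raw uplift:       %+.0f%%\n", (furyX / r390x - 1.0) * 100.0);        // +33%
    std::printf("effective uplift: %+.0f%%\n", (furyX * dcc / r390x - 1.0) * 100.0);  // +167%
}
```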
https://forum.beyond3d.com/posts/1852564/
There we go from the mouth of a AAA render dev.