6950 vs GTX 460 768MB - Why does Nvidia beat the Radeon in Civ5???


A5

Diamond Member
Jun 9, 2000
4,902
5
81
I'll have to try that out. My issues with the game have been when my i7 @ 4.0 just wasn't enough. When that first happened I dropped it down to stock (2.67) and it was even worse, then got gradually better as I increased the clocks. The difference is that having the onscreen animations run at 15-20-25 fps doesn't bother you, at least in my case I'm usually zoomed out far enough that the animations are pretty tough to see, anyway. When the cpu gets overwhelmed it can seem to take ages to open up one of the myriad windows in the game and take 30-60 seconds or more IBT, to the point that on a large map with 12+ civs I have many times just rage quit right in the middle of (to me) the most exciting part of the game. I finally had to cap myself at standard map/epic speed or large map/standard speed if I wanted to finish the game without this happening. I haven't played in a month or so but I hear that the newest patch has helped a lot with these late-game issues (and possibly helped nvidia to improve their fps as well), maybe I'll fire it up in the next couple of days.

I've never had that long IBT, but I've also never had that many civs on the map either. I do agree that the day-to-day user experience in the game is affected more by CPU speed and RAM than the GPU, though.
 

Ryan Smith

The New Boss
Staff member
Oct 22, 2005
537
117
116
www.anandtech.com
As posted by Ryan Smith in another thread:

Hey all, your friendly neighborhood GPU editor here;

So I was reading the forums and this thread caught my eye. I always have an open door policy, so please feel free to email me; you don't need to yell on the forums to get my attention.

Anyhow, for Civilization V:
http://www.anandtech.com/show/4135/nvidias-geforce-gtx-560-ti-upsetting-the-250-market/9
It's basically this in a nutshell. I'm not too sure what I can say that wasn't already in the article, but I'll see what I can do.

If you look at the old results, you'll notice that for single-GPU results, AMD and NV results tended to cluster together. The AMD cards would do around 32-36fps at 1920, while the NV cards would do 38-42fps or so. Obviously if we were truly CPU limited by the game, then everything would be about the same. Instead even slower NVIDIA cards do a bit better here. At the same time if we were GPU limited (even if the difference came down to specific architectural quirks), then we would see at least some scaling with faster GPUs, which we haven't seen.

There is a 3rd option however, and it's something that doesn't come up too much: being CPU limited by the driver. If AMD and NVIDIA are doing setup in a different manner (and they are), then you could see different results when you're CPU bound in the driver setup process. Furthermore setup can be an expensive process due to a number of reasons, so being setup limited doesn't necessarily point towards any one factor right away. Pre-tessellation vertices are probably the textbook example here, but this is probably not the case. Whatever the case, if we are driver limited, then with previous drivers it looks like AMD had more CPU overhead than NVIDIA, explaining the higher results for NVIDIA cards.
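The three cases above (game-limited on the CPU, GPU limited, driver-limited on the CPU) can be written down as a rough rule of thumb. This is only a sketch of the reasoning, not how the results were actually analyzed: the function name, the 14% threshold and the sample numbers are all made up for illustration, with the fps values picked from the ranges quoted above.

```python
from statistics import mean


def classify_bottleneck(fps_by_vendor, tolerance=0.14):
    """Guess the limiting factor from single-GPU results.

    fps_by_vendor maps a vendor name to fps results for several cards of
    different speeds. tolerance is an arbitrary, hypothetical threshold for
    "about the same".
    """
    def flat(results):
        # Flat means faster cards are not meaningfully faster than slower ones.
        return (max(results) - min(results)) / mean(results) <= tolerance

    if not all(flat(results) for results in fps_by_vendor.values()):
        return "GPU limited"             # faster GPUs scale, so the GPU is the limit

    averages = [mean(results) for results in fps_by_vendor.values()]
    if (max(averages) - min(averages)) / mean(averages) <= tolerance:
        return "CPU limited (game)"      # every card hits the same ceiling
    return "CPU limited (driver)"        # each vendor hits its own ceiling


# Illustrative numbers only, taken from the 32-36 and 38-42 fps ranges above.
print(classify_bottleneck({"AMD": [32, 34, 36], "NVIDIA": [38, 40, 42]}))
# prints: CPU limited (driver)
```

The point is simply that two flat clusters at different heights fit neither the plain CPU-limited nor the GPU-limited picture.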

This all changed with Release 265 obviously, and now NVIDIA is much less CPU limited. Ultimately I am not sure what NVIDIA did to their drivers because they aren't willing to talk about it and let AMD see their hand. However from the data I have it's clear that something was going on with this game that created a driver bottleneck. In turn whatever that bottleneck is, NVIDIA has finally found it (keep in mind Civ 5 was released 4 months ago) and moved the bottleneck back to the GPU. If AMD has a similar bottleneck, then there's no reason to believe that they can't find it and pick up similar gains.

-Thanks
Ryan Smith

PS Keep in mind that virtually every DX11 title released up through the end of 2010 would have been developed solely against AMD cards, at least at first. AMD was sampling Juniper (5700 series) to developers in the early summer of 2009. Similar NVIDIA hardware wasn't available until a couple of months into 2010, around when the first consumer products launched.
Hopefully one day NVIDIA will allow me to explain what they did to improve Civ V so much. I found out what they did, however I'm not allowed to talk about it (and boy I'm dying to). It makes all the CPU limitations make sense though, and it somewhat reshaped my view on DX11. Honestly I'm surprised the eggheads over at Beyond 3D haven't already figured this one out; it seemed kind of obvious in retrospect.
 

Ryan Smith

The New Boss
Staff member
Oct 22, 2005
537
117
116
www.anandtech.com
After talking things over with NVIDIA, they've agreed to allow me to discuss the precise changes they made to boost their Civ V performance by so much. So gather around children, crazy uncle Ryan has a story to tell.

---

In our description of Civ V, I've mentioned that it "uses a slew of DirectX 11 technologies" but I've never gone into great detail on what those are. I'm not going to go into deep detail on that now - there's a good article over at PC Games Hardware that contains an interview with Firaxis about that - but I will quickly explain the ins and outs.

Often from a gamer standpoint it's natural to look at the immediate visual benefits of a new API. With DX11, the big feature is tessellation with a secondary feature of contact hardening shadows. However there's also a great deal of stuff going on in the backend for developers to make things faster - making things faster allows developers to use new graphical effects that may not have been practical before. So for DX11 on top of tessellation and contact hardening shadows there's also things like multithreaded rendering, compute shaders, support for larger textures, and the implementation of a pull model for certain attribute evaluation.

So why do I like Civ V? Because the LORE engine it's based on implements so many of these features. Sure, something like AvP will have tessellation added, or Bad Company 2 will implement contact hardening shadows, but most of the DX11 games today are adding one or two graphical features that improve the look of the game, but only begin to scratch the surface of the API. LORE goes much, much deeper. Firaxis uses multi-threaded rendering, they use compute shaders for texture compression, and they use tessellation. Today it's probably the most extensive AAA DX11 game that has been released so far. This makes it a great GPU benchmark, as it's a real game we can use to test features other games don't touch.

So what then is going on that made Civ V so much faster for NVIDIA? Admittedly I had to press NVIDIA for this - performance practically doubled on high-end GPUs, which is unheard of. Until they told me what exactly they did, I wasn't convinced it was real or if they had come up with a really sweet cheat. It definitely wasn't a cheat.

If you recall from our articles, I keep pointing to how we seem to be CPU limited at the time. Now if you go back to the list of DX11 features Civ V uses, a light bulb should light up: multithreaded rendering. Civ V uses multi-threaded rendering, in fact it uses it quite extensively. Now why do we have multi-threaded rendering in the first place? Half of this is to better mesh with multi-threaded games by enabling additional threads to directly contribute without having to go through a master thread first. But a second purpose is that multi-threaded rendering helps the GPU just as much as it helps the CPU.



Traditionally, rendering is a very serial process. The program needs to set up a bunch of objects and then pass that on to the video drivers and finally to the GPU. There's a high degree of submission overhead, meaning it's possible to choke the CPU while submitting a large number of objects to the GPU. In DirectX 11, multi-threaded rendering is achieved by turning the D3D pipeline into a 3-step process: the Device, the Immediate Context, and the Deferred Context. The important bit here is that the deferred context is full of things that have yet to be sent to the GPU, and that you can have a deferred context for each thread. When developers talk about multi-threaded rendering with DX11, this is what they're referring to. When you use DX11's multi-threaded rendering capabilities correctly, you can have several threads assemble their deferred contexts, and then combine them into a single command list once it comes time to render the scene.
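To make the record-then-submit pattern concrete, here is a toy simulation. It is not Direct3D code: the classes and `render_frame` are hypothetical stand-ins whose method names merely echo the real D3D11 calls (`FinishCommandList` on a deferred context, `ExecuteCommandList` on the immediate context). In the actual API each deferred context finishes its own command list and the immediate context then executes those lists in order, which is what the sketch mimics.

```python
import threading


class DeferredContext:
    """Toy per-thread context: it only records commands, nothing is submitted."""

    def __init__(self):
        self._commands = []

    def draw(self, obj):
        self._commands.append(("draw", obj))

    def finish_command_list(self):
        # Hand back everything recorded so far and start over with an empty list.
        commands, self._commands = self._commands, []
        return commands


class ImmediateContext:
    """Toy stand-in for the single context that actually submits work."""

    def __init__(self):
        self.submitted = []

    def execute_command_list(self, command_list):
        self.submitted.extend(command_list)


def render_frame(objects, num_threads):
    # Split the scene so that each worker thread records its own share.
    chunks = [objects[i::num_threads] for i in range(num_threads)]
    command_lists = [None] * num_threads

    def record(index):
        ctx = DeferredContext()              # one deferred context per thread
        for obj in chunks[index]:
            ctx.draw(obj)
        command_lists[index] = ctx.finish_command_list()

    workers = [threading.Thread(target=record, args=(i,)) for i in range(num_threads)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # Only this final step is serial: play the finished lists back in order.
    immediate = ImmediateContext()
    for command_list in command_lists:
        immediate.execute_command_list(command_list)
    return immediate.submitted
```

Calling `render_frame(list(range(10)), 3)` returns all ten draw commands, grouped by the thread that recorded them. Python threads won't show a real speedup here; the sketch is only about the structure, where the expensive recording work is spread across threads and the submission at the end is cheap.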

So Civ V uses proper multi-threaded rendering, that's great! So why isn't this the end of the story?

It turns out that you don't actually need to support all these nifty multi-threading features to be DX11 (or rather D3D11) compliant - those features are optional - and that's what happened. And this is what changed my perspective on DX11, as before now I had never realized that anything in the API/spec was optional. Previously we had all the pieces to understand what was going on, but without knowing that AMD and NVIDIA did not fully support multi-threaded rendering, it was never clear what the bottleneck was.

But let's be clear here: multi-threaded rendering is a massive undertaking on the driver and hardware side. You're doing the GPU equivalent of inventing the multi-tasking operating system. NVIDIA and AMD have not until this point supported multi-threaded rendering in their drivers, as they have needed time to implement this feature correctly in their drivers. If you have the DX SDK installed, in the DX Caps Viewer this is visible in the D3D11 section under the title "Driver Command Lists".



So in a nutshell, 4 months ago Civ V supported multi-threaded rendering. AMD and NVIDIA did not.

Firaxis @ PC Games Hardware said:
Civilization V, as far as we know, is the first fully threaded DX11 game.

Unfortunately, because no other games have used this feature yet, neither Nvidia nor AMD have publically released threaded drivers, so users may not experience all the benefits just yet. We decided to keep threading enabled for Civilization V, however, because we are continuing to work closely with Nvidia and AMD on their support for multi-threading. We expect publically available threaded drivers shortly.

The internal architecture of the Civilization V graphics engine, however, is heavily multi-threaded and users will see multi-processor benefits even with drivers that are not threaded (including DX9). We have developed a series of configurable benchmark modes that we use internally for measuring our threading ability. These are fully described in the readme file. After some discussion, we decided to expose these internal tests on the released version so, if the users view the readme file, they will see that there are detailed instructions of these benchmark modes.

Can you guess then what changed?

With the Release 265 series drivers, NVIDIA enabled partial support for DX11's multi-threaded rendering features. At the time this support was limited to just Civ V, and while it was beyond the experimental stage it was clearly limited to Civ V as that allowed NVIDIA to deploy it against a single known program while they collected feedback and finished the other aspects of multi-threaded rendering.

With NVIDIA's drivers now allowing Civ V to use multiple deferred contexts, Civ V's performance shot way up. With high-end GPUs performance damn near doubled at lower resolutions. Civ V was in fact CPU limited - it was CPU limited because it was only able to use a single thread to assemble its contexts, and that thread was maxing out the single CPU core it could use. This is why drivers played such a big part in Civ V's performance, because how drivers handled D3D11 contexts was the key to unlocking Civ V's performance.
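A back-of-the-envelope model shows why a single submission thread flattens the results across GPUs. Everything in it is an assumption for illustration: the function name is hypothetical, the millisecond figures are invented rather than measured, and it assumes the recording work splits perfectly across threads.

```python
def frames_per_second(cpu_submit_ms, gpu_render_ms, recording_threads=1):
    """Toy model: a frame takes as long as the slower of its two stages."""
    cpu_ms = cpu_submit_ms / recording_threads   # assumes perfect scaling
    return 1000.0 / max(cpu_ms, gpu_render_ms)


# Invented numbers: 25 ms of submission work per frame,
# a slower GPU needing 20 ms per frame and a faster one needing 10 ms.
for threads in (1, 4):
    slow = frames_per_second(25, 20, threads)
    fast = frames_per_second(25, 10, threads)
    print(threads, slow, fast)
# prints: 1 40.0 40.0
# prints: 4 50.0 100.0
```

With one thread both cards sit at the same CPU-imposed ceiling; once the submission work is spread out, the limit moves back to the GPU and the faster card pulls ahead.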

At this point in time we appear to be GPU limited, but we may also be CPU limited. Firaxis says Civ V can scale to 12 threads; this would be a hex-core CPU with hyperthreading. Our testbed is only a quad-core CPU with HT, meaning we probably aren't maxing out Civ V on the CPU side. And even with HT, it's likely that 12 real cores would improve on performance relative to 6 cores + HT. Firaxis believes they're GPU limited, but it's hard to definitively tell which it is.


Image from Firaxis GDC11 presentation

In any case, full support for multi-threaded rendering was finally enabled in NVIDIA's Release 270 drivers, which were released last week. At this point any game or application can take advantage of the feature, and not just Civ V. This is also why NVIDIA has finally allowed me to write about what they've previously told me, as they no longer consider it a secret. Finally, on a side note the fact that Civ V had this feature enabled in NVIDIA's drivers early is why performance does not appear to have changed between Release 265 and Release 270.

Anyhow, as far as I know, AMD does not currently offer full support for multi-threaded rendering (I don't have an AMD card plugged in right now to run the DX Caps Viewer against). I'm not sure where they are on it, though I doubt they're very far behind.

So in conclusion, the reason NVIDIA beats AMD in Civ V is that NVIDIA currently offers full support for multi-threaded rendering/deferred contexts/command lists, while AMD does not. Civ V uses massive amounts of objects and complex terrain, and because it's multi-threaded rendering capable the introduction of multi-threaded rendering support in NVIDIA's drivers means that NVIDIA's GPUs can now rip through the game.

This is the true power of DX11. When properly implemented in both drivers and games, DX11's multi-threaded rendering capabilities are going to allow developers to push a lot more stuff out to the GPU without immediately bottlenecking the CPU.

On a future note, while Civ V is the first game to use DX11 multi-threaded rendering, it is not going to be the last. Battlefield 3 will most likely use it - DICE was lamenting the lack of driver support last month at GDC. The Capcom team responsible for Lost Planet 2 also mentioned how they would have liked to have this feature working before LP2, though I can't find the article at this time.

Coincidentally, last month's interview with AMD's Richard Huddy at Bit-Tech also has a lot in common with this. AMD says DX11 multi-threaded rendering can double object/draw-call throughput, and they want to go well beyond that by bypassing the DX11 API.

Further Reading: AnandTech, Revealing The Power of DirectX 11
 

jimbo75

Senior member
Mar 29, 2011
223
0
0
Yeah no doubt that will be repaid in the next Geforce review.

The problem with Civ 5 is the benchmark is not indicative of ingame play and it should never have been used. If it had been AMD at the top it wouldn't have been, that's why there is no Dragon Age 2 either.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0
Thanks Ryan, good stuff to digest and give some life back into DX11 relevance. Now hopefully Crytek is working on some of this to implement in Crysis 2.
 

Ryan Smith

The New Boss
Staff member
Oct 22, 2005
537
117
116
www.anandtech.com
that's why there is no Dragon Age 2 either.
I'm assuming this is referring to our current test suite?

The GPU test suite is refreshed roughly every 6 months. The last time it was refreshed was in late October of 2010, and as such there are not any games newer than that in the suite. It will be refreshed here in the next month or so.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Nice work Ryan,

Of all the PC games, Civ V was the last one I would expect to see such features in, as it isn't an FPS, and yet it was the first. NICE WORK Firaxis

Edit: It really puts the big FPS boys to shame (Crytek??)
 

Chiropteran

Diamond Member
Nov 14, 2003
9,811
110
106
That was a very interesting read. IMO this is a very important technology, as it seems to be a big key for making games that properly scale to more than 2 CPU cores. I hope AMD gets their act together soon.
 

jimbo75

Senior member
Mar 29, 2011
223
0
0
I have to question a benchmark that puts the GTX480 being faster than the GTX580 by 10%.

Maybe you should check the older Civ V benchmarks? More often than not the gtx 460 came top ahead of the 480. That didn't matter because they were all ahead of the radeons. :whiste:
 

Martimus

Diamond Member
Apr 24, 2007
4,488
153
106
Civ 5 is the only game that uses a multithreaded graphics engine. I would guess that nVidia just released multithreaded DX11 rendering drivers first, and AMD has yet to release multithreaded DX11 rendering drivers. (http://www.pcgameshardware.com/aid,...h-Interview-What-DirectX-11-is-good-for/News/)

EDIT: I just went through and read the rest of the posts, and Ryan actually verified what I had thought all along. It is nice to see that this feature really does increase performance that much. We'll probably start seeing it more in games in the future. (Although I read that 4 other games canceled development on multithreaded rendering due to lack of support for it from current drivers).
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
Maybe you should check the older Civ V benchmarks? More often than not the gtx 460 came top ahead of the 480. That didn't matter because they were all ahead of the radeons. :whiste:

But isn't that the whole point of this entire thread?
NV has fixed its driver problems -> performance increases -> the odd capped situation where the GTX460 is faster than the 480 is solved -> performance should now reflect, well, performance -> GTX580 should be ahead of GTX480.


470 and 460 faster than 480 on old (not fully functional) driver.


580 > 570 > 480 > 560 > 460
Properly functional driver. Properly reflective scores because they aren't capped by driver limitations.

The benchmark quoted was suggesting that AMD outscores NV even with the fixed driver because the performance of the built in benchmark doesn't reflect game performance in terms of NV vs AMD difference.
The implication of the other graph is not that NV has a lead, but that the fixed driver is in fact not fixed, or the benchmark is wrong, because with a fixed driver it should be that the GTX580 is faster than the 480, especially when the 580 and 570 have different performance to each other and it scales with the GTX590, meaning performance isn't capped.

Therefore the high GTX480 is completely unusual and not what you would expect under any circumstance. If it was because the cards were capped, the GTX590 shouldn't scale and the 580 and 480 should be at about the same level, the 480 shouldn't have a 10% lead.

They also have some odd results in other Civ 5 benchmarks, like the 6790 being soundly beaten by the 5770 and sometimes almost the 5750. Since they use the same architecture, and have the same functional units, and the 6790 is improved in many ways, it's odd that it would ever be slower, especially when you crank up AA and res (1920x1200/8xAA and the 5770 is faster than the 6790, despite the massive bandwidth advantage of the 6790 and the improved tess performance and the equal in every other aspect specs, on paper at least).
 

jimbo75

Senior member
Mar 29, 2011
223
0
0
The benchmark quoted was suggesting that AMD outscores NV even with the fixed driver because the performance of the built in benchmark doesn't reflect game performance in terms of NV vs AMD difference.
The implication of the other graph is not that NV has a lead, but that the fixed driver is in fact not fixed, or the benchmark is wrong, because with a fixed driver it should be that the GTX580 is faster than the 480, especially when the 580 and 570 have different performance to each other and it scales with the GTX590, meaning performance isn't capped.

Therefore the high GTX480 is completely unusual and not what you would expect under any circumstance. If it was because the cards were capped, the GTX590 shouldn't scale and the 580 and 480 should be at about the same level, the 480 shouldn't have a 10% lead.

Or maybe it's because AMD fixed the game drivers and not the benchmark drivers?

Even SKYMTL doesn't use Civ V because "In addition, most sites use the Civ benchmarking tool which I find IS NOT representative of the in-game performance."

http://www.xtremesystems.org/forums/showpost.php?p=4791582&postcount=81
 

Lonyo

Lifer
Aug 10, 2002
21,939
6
81
Or maybe it's because AMD fixed the game drivers and not the benchmark drivers?

Even SKYMTL doesn't use Civ V because "In addition, most sites use the Civ benchmarking tool which I find IS NOT representative of the in-game performance."

http://www.xtremesystems.org/forums/showpost.php?p=4791582&postcount=81

I'm not talking about NV vs AMD, I'm talking about AMD vs AMD and NV vs NV.
Internally within product families the performance shown in the benchmarks from that Polish site doesn't add up with what would be expected based on specs and performance in everything else.
6790 > 5770 in specs and in every other game.
580 > 480 in specs and every other game.
In the Polish site's Civ 5 benchmarks, it's the other way round.
 

jimbo75

Senior member
Mar 29, 2011
223
0
0
Yes and they didn't match up for nVidia in the benchmark until a driver fix.

My point is, they match up for AMD ingame and they don't for nVidia. To me that points to nVidia optimising for a benchmark while AMD optimised the actual game.

Would you *really* be surprised to find out this is the case?
 

Genx87

Lifer
Apr 8, 2002
41,095
513
126
Ryan, from what it sounds like, is this multi-threaded rendering like out-of-order execution on the CPU side?
 

jimbo75

Senior member
Mar 29, 2011
223
0
0
That is it?

http://translate.googleusercontent....le.com&usg=ALkJrhiHa5XXx-4A_fr8JeQGsJ5L1baVZg

Click on any of the Civ 5 benchmarks. In fact, look at all of them - they are all exactly what you would expect to see.

The only problem is with Civ 5, and only because the 480 beats the 580 in that. To me that's a clear case of nVidia not bothering to fix the actual game issues when it was easier and more profitable to fix the benchmark instead.
 

notty22

Diamond Member
Jan 1, 2010
3,375
0
0
Yes and they didn't match up for nVidia in the benchmark until a driver fix.

My point is, they match up for AMD ingame and they don't for nVidia. To me that points to nVidia optimising for a benchmark while AMD optimised the actual game.

Would you *really* be surprised to find out this is the case?
IMO, that's not what is going on. Ryan's whole explanation was about multi-threading for better performance.
The benchmark workload is very high. It's probably very similar to running the Heaven Benchmark on extreme tessellation.
The resulting fps gap between AMD and Nvidia GPUs widens with that setting.
Optimizing for a benchmark hints at cheating, and would be done by lessening the workload on the GPU.
Some reviewers may feel the benchmark workload is never seen in the game. I don't know. Every reviewer is different. TPU used to only test Dirt2 in DX9, for his own reasons.
 