Ubisoft: AMD's Mantle API is a double-edged sword


Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Hopefully us nvidia users are not left out in the cold on DX 11.2.

Considering nvidia's Kepler cards don't even support DX 11.1 at the hardware level and only do some of the DX 11.1 features via software, will we even have support at all for DX 11.2? If they don't support DX 11.1 in hardware, there will be even less hardware support available for DX 11.2...

AMD may wind up having the API advantage not just with Mantle in Battlefield 4, which is a given, but also with DX 11.1 and 11.2.

Hopefully not!

Kepler supports every DX11.1 feature in hardware except two.

And those aren't game related..
 
Last edited:

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Are you ever going to stop spreading misinformation? Obviously that feature AMD chose not to implement, which only one game takes advantage of, doesn't really matter. This driver that is obviously broken is still performing better than nvidia's "working" driver. So show me an instance of AMD's broken driver performing poorly.

AC3 uses driver command lists as well. I'm going to laugh my ass off if the reason NVidia has such a huge lead over AMD in BF4 on Windows 8 is the driver command list function..


That must have been from an old review.. Here's a newer one..

 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
I'm playing Battlefield 4 right now on a GeForce 760, and while my fps have been consistently high, I am getting some hitching or choppiness that isn't reflected in the framerate. It is definitely there. But beta is beta; hopefully these issues get ironed out.
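That kind of hitching shows up in per-frame times rather than in an average fps counter; a minimal sketch of how one might log it (plain C++, hypothetical code, not from the game):

```cpp
#include <chrono>
#include <cstdio>

// Call once per frame: the average fps can stay high even when individual
// frames spike, which is exactly what "hitching" feels like.
void LogFrameTime() {
    using Clock = std::chrono::steady_clock;
    static Clock::time_point last = Clock::now();
    const Clock::time_point now = Clock::now();
    const double ms =
        std::chrono::duration<double, std::milli>(now - last).count();
    last = now;
    if (ms > 33.0)  // roughly 2x the 16.7 ms budget of a 60 fps target
        std::printf("hitch: frame took %.1f ms\n", ms);
}
```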

Are you using Windows 7 or Windows 8?
 

Skurge

Diamond Member
Aug 17, 2009
5,195
1
71
AC3 uses driver command lists as well. I'm going to laugh my ass off if the reason NVidia has such a huge lead over AMD in BF4 on Windows 8 is the driver command list function..



That must have been from an old review.. Here's a newer one..


Huge lead?
 

Carfax83

Diamond Member
Nov 1, 2010
6,841
1,536
136
Huge lead?

Driver command lists only help in CPU-bound situations. The GTX 780 has a huge 46% lead over the 7970Ghz because it is the most CPU-bound at that resolution, much more so than the GTX 770 and GTX 680.

At a lower resolution, you would see a much bigger gap between the GTX 770 and the 7970Ghz..
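For context, the "driver command lists" being argued about are D3D11 deferred contexts: worker threads record state changes and draw calls into command lists, and the immediate context replays them, which only pays off when the CPU is the bottleneck. A minimal sketch of the pattern in C++ (assuming `device` and `immediate` were created elsewhere; not code from any of the games discussed):

```cpp
#include <d3d11.h>

// Worker thread: record draw calls on a deferred context.
ID3D11CommandList* RecordWork(ID3D11Device* device) {
    ID3D11DeviceContext* deferred = nullptr;
    device->CreateDeferredContext(0, &deferred);
    // ... set pipeline state and issue Draw() calls on `deferred` here ...
    ID3D11CommandList* cmdList = nullptr;
    deferred->FinishCommandList(FALSE, &cmdList);  // bake the recording
    deferred->Release();
    return cmdList;
}

// Render thread: replay the pre-recorded commands cheaply.
void Replay(ID3D11DeviceContext* immediate, ID3D11CommandList* cmdList) {
    immediate->ExecuteCommandList(cmdList, TRUE);
    cmdList->Release();
}
```

Whether a driver executes these natively, rather than the runtime emulating them in software, is reported by CheckFeatureSupport(D3D11_FEATURE_THREADING, ...); that native support is what the NVIDIA-vs-AMD argument in this thread is about.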
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
Driver command lists only help in CPU-bound situations. The GTX 780 has a huge 46% lead over the 7970Ghz because it is the most CPU-bound at that resolution, much more so than the GTX 770 and GTX 680.

At a lower resolution, you would see a much bigger gap between the GTX 770 and the 7970Ghz..

Except no one can tell a difference in gameplay from this "huge" lead in a strategy game where a last-generation card is getting 70 fps. A strategy game is not like a racing game or an FPS, where you can actually tell 90 fps apart from 138 fps. Where are these other games where NV badly beats AMD in a DX11 title in a similar price bracket as a result of the multi-threaded DX11 driver?

What difference does it make that AC3 uses driver command lists? 770 cannot beat R9 280X/7970GE in that title.



With NV refusing to lower prices on the 770, the 4GB version has become a joke, priced $150 more than R9 280X cards while offering no performance advantage in DX9-11 games.

At least with the 780, one pays for bragging rights. With the 770, NV asks gamers to throw away $100-150 more for the brand name.

"The Bottom Line
It really does all come down to pricing. Though out of stock right now, the ASUS R9 280X DC2T is being sold for exactly the quoted price from ASUS at $309. This is good news. The GeForce GTX 770 still has an MSRP of $399, directly quoted from NVIDIA. That $90 price premium for the GTX 770 will not buy you an improved gaming experience. Even when you overclock the video cards, the gameplay experience is exactly the same.

The ASUS R9 280X DC2T is an incredible value."
~ HardOCP

NV should change its slogan: "NV, the way it's meant to throw your $ away."

770 vs. R9 280X/7970Ghz is the worst pricing mismatch since the $299 4870 vs. the $399 GTX 260. NV are straight up giving PC gamers the middle finger. How can any unbiased PC gamer support NV's current pricing structure with a straight face?

All this theoretical discussion about NV's "superior" DX drivers means nothing when their $450 770 4GB cannot beat a $279 AMD card. It's a joke and an insult to PC gamers who have followed hardware for a long time.

http://www.computerbase.de/artikel/grafikkarten/2013/nvidia-geforce-gtx-770-im-test/4/

Right now a PC gamer can buy GTX 770-level performance for only $280 and set aside $100-170 for the next GPU upgrade. Considering the 770 cannot win in Crysis 3, Metro LL or Tomb Raider, NV are absolutely mad if they think PC gamers are so brand-brainwashed/ignorant as to pay 42-61% more than an HD7970Ghz for the same average gaming performance.
 
Last edited:

zlatan

Senior member
Mar 15, 2011
580
291
136
Kepler supports every DX11.1 feature in hardware except two.

And those aren't game related..
Having 64 UAV slots is definitely the biggest gaming-related feature in DX11.1. I wonder why they don't mention this.
UAVOnlyRenderingForcedSampleCount and UAVs at all shader stages are also gaming-related features.
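For reference, all of these are caps a program can query at runtime; a minimal check in C++ (hypothetical sketch, with `device` assumed created elsewhere):

```cpp
#include <d3d11_1.h>
#include <cstdio>

// Report which of the optional DX11.1 features a device exposes.
void PrintDx111Caps(ID3D11Device* device) {
    D3D11_FEATURE_DATA_D3D11_OPTIONS opts = {};
    if (SUCCEEDED(device->CheckFeatureSupport(
            D3D11_FEATURE_D3D11_OPTIONS, &opts, sizeof(opts)))) {
        std::printf("UAV-only rendering with forced sample count: %d\n",
                    opts.UAVOnlyRenderingForcedSampleCount);
    }
    // The 64 UAV slots (D3D11_1_UAV_SLOT_COUNT) and UAVs at every shader
    // stage come with feature level 11_1 itself; an 11_0 device is limited
    // to 8 UAV slots, bindable only at the pixel and compute stages.
    std::printf("Feature level 11_1 or higher: %d\n",
                device->GetFeatureLevel() >= D3D_FEATURE_LEVEL_11_1);
}
```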
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
NVIDIA calls the stream processors in its architectures CUDA cores.




While NVIDIA constantly optimizes and beefs up its architecture, the CUDA core remains very similar.

And AMD/ATI has called them stream processors since the X1000 series. No bells ringing yet?

 
Last edited:

GaiaHunter

Diamond Member
Jul 13, 2008
3,650
218
106
And AMD/ATI has called them stream processors since the X1000 series. No bells ringing yet?

The difference being that AMD actually changed from VLIW5/4 to individual stream processors, while NVIDIA didn't.

http://cinwell.wordpress.com/2013/09/06/evolution-of-gpu-gt80-gt200-fermi-kepler/

G80: November 2006. The initial vision of what a unified graphics and computing parallel processor should look like. It was the first GPU to support C, the first to use a unified, scalar thread processor, and it introduced the SIMT execution model (multiple independent threads execute concurrently using a single instruction) along with shared memory and barrier synchronization for inter-thread communication.

GT200: June 2008. As a major revision to the G80 architecture, GT200 mainly extended the performance and functionality of G80. It increased the number of streaming processor cores from 128 to 240. Each processor register file was doubled in size, allowing a greater number of threads to execute on-chip at any given time. Hardware memory access coalescing was added to improve memory access efficiency. Double-precision floating point support was also added.

Fermi: 2010. The Fermi architecture is the most significant leap forward in GPU architectures since the original G80. The key areas for Fermi to improve were gathered from user feedback on GPU computing since the introduction of G80 and GT200: (1) Improve Double Precision Performance; (2) ECC Support; (3) True Cache Hierarchy: some parallel algorithms were unable to use the GPU’s shared memory, and users requested a true cache architecture to aid them; (4) More Shared Memory: many CUDA programmers requested more than 16 KB of SM shared memory to speed up their applications; (5) Faster Context Switching; (6) Faster Atomic Operations: users requested faster read-modify-write atomic operations for their parallel algorithms.

New features in Kepler GK110:

Dynamic Parallelism: adds the capability for the GPU to generate new work for itself, synchronize on results, and control the scheduling of that work via dedicated, accelerated hardware paths, all without involving the CPU.
Hyper-Q: enables multiple CPU cores to launch work on a single GPU simultaneously, thereby dramatically increasing GPU utilization and significantly reducing CPU idle times. Hyper-Q increases the total number of connections (work queues) between the host and the GK110 GPU by allowing 32 simultaneous, hardware-managed connections (compared to the single connection available with Fermi).
Grid Management Unit: enabling Dynamic Parallelism requires an advanced, flexible grid management and dispatch control system. The new GK110 Grid Management Unit (GMU) manages and prioritizes grids to be executed on the GPU. The GMU can pause the dispatch of new grids and queue pending and suspended grids until they are ready to execute, providing the flexibility to enable powerful runtimes, such as Dynamic Parallelism. The GMU ensures both CPU- and GPU-generated workloads are properly managed and dispatched.
GPUDirect: a capability that enables GPUs within a single computer, or GPUs in different servers located across a network, to directly exchange data without needing to go through CPU/system memory.
SMX Architecture:
--------------------------------------------------------------------------------------------

All the improvements are around the CUDA cores: how they are organized, how they are fed, and the caches available.

I'm not trying to minimize the evolution of the NVIDIA architecture or its performance and feature gains.

I'm just emphasizing how versatile and flexible the CUDA stream processors are.
We all know VLIW5/4 could have awesome performance if properly coded for, but if not, much of its power would be lost, and that is why AMD changed to GCN.
 
Last edited:

Slomo4shO

Senior member
Nov 17, 2008
586
0
71
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I'm not talking about CUDA or Stream.

I'm talking about the stream processors.

http://en.wikipedia.org/wiki/Stream_processing
http://en.wikipedia.org/wiki/Graphics_processing_unit

Let me refresh your obviously bad memory since you are completely gone:

2nd) AMD used VLIW for 6 years. NVIDIA's CUDA core has been with us since the 8800 GTX. As I said before, GPU architectures have evolved in a different manner in the last few years, and AMD and NVIDIA architectures actually share many more things in common now than they did 10 years ago.

AMD's first stream processor was the X1000 series. And the HD7000/HD8000/Rx series is AMD's 7th-generation stream processor.

The G80 was nVidia's first generation of CUDA cores, and Kepler today is nVidia's 3rd generation.

There is nothing on the uarch level that makes them CUDA cores or stream processors. They can be radically different as we have seen multiple times. The only thing they share is the API: Stream API or CUDA API.

You could just as well say the HD7970 has 2048 Mantle cores or 2048 OpenCL processors.

The X1000 didn't even call them stream processors yet, since it was still the time of pixel and vertex shaders.
 
Last edited:

GaiaHunter

Diamond Member
Jul 13, 2008
3,650
218
106
Let me refresh your obviously bad memory since you are completely gone:


AMD's first stream processor was the X1000 series. And the HD7000/HD8000/Rx series is AMD's 7th-generation stream processor.

The G80 was nVidia's first generation of CUDA cores, and Kepler today is nVidia's 3rd generation.

There is nothing on the uarch level that makes them CUDA cores or stream processors. They can be radically different as we have seen multiple times. The only thing they share is the API: Stream API or CUDA API.

You could just as well say the HD7970 has 2048 Mantle cores or 2048 OpenCL processors.

The X1000 didn't even call them stream processors yet, since it was still the time of pixel and vertex shaders.

No, the X1000 didn't have unified shaders.

Unified shaders are from the G80 (GeForce 8) and the R600 (HD 2000).

NVIDIA has been using SIMD shaders since the G80.
AMD used VLIW5/4 shaders and only changed to SIMD with GCN.

VLIW5 meant that the unified shaders were grouped in segments of 5, and one needed to code specifically to take advantage of all 5 of them.

This meant something like a 4870 effectively had a minimum of 160 usable shaders and a maximum of 800.
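A toy illustration of that constraint (plain C++ standing in for shader code, not anything AMD actually shipped): a VLIW5 compiler can pack the first case into one 5-wide bundle, while the second fills only one of the five slots per cycle.

```cpp
// Case 1: five independent multiply-adds. A VLIW5 compiler can issue all
// five in a single bundle, so all 800 lanes of a 4870 can stay busy.
void Independent(float* a, const float* x, const float* y, const float* z) {
    a[0] = x[0] * y[0] + z[0];
    a[1] = x[1] * y[1] + z[1];
    a[2] = x[2] * y[2] + z[2];
    a[3] = x[3] * y[3] + z[3];
    a[4] = x[4] * y[4] + z[4];
}

// Case 2: a dependent chain. Each op needs the previous result, so only
// one of the five slots is filled per cycle -- the "160 shaders" case.
float Dependent(float b, const float* y, const float* z) {
    for (int i = 0; i < 5; ++i)
        b = b * y[i] + z[i];
    return b;
}
```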






As long as the unified shaders in the AMD architecture remain SIMD, they will have no problems supporting Mantle.
 
Last edited:

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
AC3 uses driver command lists as well. I'm going to laugh my ass off if the reason NVidia has such a huge lead over AMD in BF4 on Windows 8 is the driver command list function..



That must have been from an old review.. Here's a newer one..


Uhm, your image is for a different resolution... so not really an apples-to-apples comparison there. Although it's obvious driver changes have been made on both sides.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
No, the X1000 didn't have unified shaders.

*snip

As long as the unified shaders in the AMD architecture remain SIMD, they will have no problems supporting Mantle.

You are really completely lost and have no idea what the different things mean.

 

GaiaHunter

Diamond Member
Jul 13, 2008
3,650
218
106
You are really completely lost and have no idea what the different things mean.


If you don't know what a stream processor or a unified shader is and believe it is the AMD Stream SDK, well, there's no point continuing this conversation.

You might, by the way, inform Ryan Smith to stop calling NVIDIA's cores stream processors.

Sorry, Ryan dude, you know [profanity not allowed] of what you're talking about.

Guess A|T lost all credibility.



 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Guess A|T lost all credibility.

Nope, only you.

You can pretty much call them what you want. But nVidia, for example, calls them CUDA cores. The X1000, while not unified, was still able to do stream processing. I know your knowledge is limited, so let me show you instead.


 

GaiaHunter

Diamond Member
Jul 13, 2008
3,650
218
106
So you think stream processing is physics?

Ok...

Of course, this discussion started with me saying that NVIDIA has been on SIMD since the G80 and AMD was on VLIW5/4 since the R600.
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Last edited:

SirPauly

Diamond Member
Apr 28, 2009
5,187
1
0
nVidia was also evangelizing GPGPU and HavokFX with the GeForce 6xxx and 7xxx families in 2006, before CUDA and unification!

NVIDIA and Havok Demonstrate World's First GPU-Powered Game Physics Solution at Game Developer's Conference

http://www.nvidia.com/object/IO_30478.html


The reason why was Shader Model 3.0:

Havok FX is designed for GPUs supporting Shader Model 3.0, including the NVIDIA GeForce 6 and 7 Series GPUs.

Shader Model 3.0 really started to get the ball rolling for ATI and nVidia, imho!
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
And what does that have to do with the current unified-shader stream processors?

The X1950XT had 48 pixel shaders and 8 vertex shaders.

The HD 2900 XT had 320 unified shaders.

I would say that is a big architecture difference.

Unified shaders as such have nothing to do with the stream capabilities seen on the R580.
 

GaiaHunter

Diamond Member
Jul 13, 2008
3,650
218
106
ATI had SIMD on the R580.

No, it started because you couldn't see the difference between an API and an uarch.
And it didn't with the R600.

A unified shader/stream processor is part of the uarch.

You just thought CUDA core and Stream processor meant CUDA and STREAM.
 