NVIDIA Pascal Thread


Genx87

Lifer
Apr 8, 2002
Emm, you mean Kepler? The GK210 was used in those markets, not the GM200.



Intel can afford to throw billions at the market - you only have to look at how they have managed to sneak into the top 10 list of the TOP500.

I thought they had a Maxwell-based Tesla and it wasn't that popular due to FP64 performance?
 

Det0x

Golden Member
Sep 11, 2014
Hmm. The new FirePro delivers a maximum single-precision performance of 13.9 teraflops, while Nvidia's next-gen P100 delivers 10.6 teraflops?

Am I missing something?
 

n0x1ous

Platinum Member
Sep 9, 2010
Hmm. The new FirePro delivers a maximum single-precision performance of 13.9 teraflops, while Nvidia's next-gen P100 delivers 10.6 teraflops?

Am I missing something?

Yes, you are. That is a dual-Fiji part, so two 596 mm2 GPUs that are single-precision focused, with minimal double-precision hardware.
 

Glo.

Diamond Member
Apr 25, 2015
Hmm. The new FirePro delivers a maximum single-precision performance of 13.9 teraflops, while Nvidia's next-gen P100 delivers 10.6 teraflops?

Am I missing something?

At the same power and thermal envelope. You are not missing anything, apart from the fact that the S9300 x2 has a 1/16 double-precision ratio while the P100 has a 1/2 DP ratio.

But in single precision AMD has the better solution, unless you run CUDA software.
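
For a rough sense of what those ratios mean, here is a back-of-the-envelope sketch (Python), using the peak figures quoted above and the 1/16 vs. 1/2 FP64 rates mentioned in this post:

# Back-of-the-envelope peak throughput from the figures quoted above.
sp_s9300x2 = 13.9              # TFLOPS single precision, dual-Fiji S9300 x2
sp_p100 = 10.6                 # TFLOPS single precision, Tesla P100

dp_s9300x2 = sp_s9300x2 / 16   # Fiji runs FP64 at 1/16 rate -> ~0.87 TFLOPS
dp_p100 = sp_p100 / 2          # GP100 runs FP64 at 1/2 rate -> ~5.3 TFLOPS

print(f"S9300 x2 DP: {dp_s9300x2:.2f} TFLOPS")
print(f"P100 DP:     {dp_p100:.2f} TFLOPS")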
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
GP100 will need much more parallelism compared to GM200 ...

49,152 threads vs. 114,688 threads

In that respect Pascal is closer to GCN than ever in terms of on-chip memory resources and occupancy, and it even has 64K of shared memory per SM (64-wide), just like a GCN CU (64-wide) has 64K of shared memory! (The same ALU/shared-memory ratio!)
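
A minimal sketch of where those occupancy numbers come from, assuming the usual limit of 2,048 resident threads per SM, 24 SMs on GM200 and 56 enabled SMs on the Tesla P100's GP100:

# Threads needed to fully occupy each chip, assuming 2048 resident
# threads per SM (24 SMs on GM200, 56 enabled SMs on Tesla P100).
threads_per_sm = 2048
gm200_threads = 24 * threads_per_sm    # 49,152
gp100_threads = 56 * threads_per_sm    # 114,688

print("GM200:", gm200_threads)
print("GP100:", gp100_threads)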
 

nvgpu

Senior member
Sep 12, 2014
Nvidia is probably waiting for 8-Hi HBM2 stacks and better yields before enabling all 3840 CUDA cores on GP100 for a 32GB Tesla P100/P110, if a customer asks them for 32GB of HBM2 memory.
 

Erenhardt

Diamond Member
Dec 1, 2012
Yes, you are. That is a dual-Fiji part, so two 596 mm2 GPUs that are single-precision focused, with minimal double-precision hardware.

If by minimal you mean 4 times more than the Titan X, then I need to study my English dictionary again.
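
Roughly, assuming ~6.1 TFLOPS single precision for the Titan X and the usual 1/32 (GM200) vs. 1/16 (Fiji) FP64 rates, that claim checks out:

# Rough FP64 comparison behind the "4x Titan X" point.
titan_x_sp = 6.1            # TFLOPS SP, GM200 Titan X (approximate boost figure)
s9300x2_sp = 13.9           # TFLOPS SP, dual-Fiji S9300 x2

titan_x_dp = titan_x_sp / 32    # GM200: 1/32 FP64 rate -> ~0.19 TFLOPS
s9300x2_dp = s9300x2_sp / 16    # Fiji:  1/16 FP64 rate -> ~0.87 TFLOPS

print(f"S9300 x2 has about {s9300x2_dp / titan_x_dp:.1f}x the DP of Titan X")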

Interestingly, the Titan lineup was introduced with the Kepler generation as a professional GPU on the cheap (hellishly expensive for a GeForce, but a third the price of a pro card).
Maxwell made the Titan lineup a glorified regular GeForce at an insane premium, and Tesla seems to continue that trend.
 

xpea

Senior member
Feb 14, 2014
Hmm. The new FirePro delivers a maximum single-precision performance of 13.9 teraflops, while Nvidia's next-gen P100 delivers 10.6 teraflops?

Am I missing something?
Yes: these "good on paper" AMD FLOPS are most of the time "useless", because AMD has no resources to provide adequate support.
For example, if you want to do deep learning (the trendiest HPC market right now), Nvidia provides lots of libraries and SDKs like cuDNN, whereas... you have nothing on AMD...
A good read:
http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/
So what kind of GPU should I get? NVIDIA or AMD?
NVIDIA’s standard libraries made it very easy to establish the first deep learning libraries in CUDA, while there were no such powerful standard libraries for AMD’s OpenCL. Right now, there are just no good deep learning libraries for AMD cards – so NVIDIA it is. Even if some OpenCL libraries would be available in the future I would stick with NVIDIA: The thing is that the GPU computing or GPGPU community is very large for CUDA and rather small for OpenCL. Thus in the CUDA community good open source solutions and solid advice for your programming is readily available.

some pages on nvidia website about deep learning:
http://www.nvidia.com/object/machine-learning.html
https://developer.nvidia.com/deep-learning
https://developer.nvidia.com/deep-learning-resources
and thousands of pages/studies/courses/research papers/source code available on Google.

And on AMD:
nothing, or close to nothing, with some hobby projects that are too rough to be seriously considered...
 

n0x1ous

Platinum Member
Sep 9, 2010
If by minimal you mean 4 times more than the Titan X, then I need to study my English dictionary again.

Interestingly, the Titan lineup was introduced with the Kepler generation as a professional GPU on the cheap (hellishly expensive for a GeForce, but a third the price of a pro card).
Maxwell made the Titan lineup a glorified regular GeForce at an insane premium, and Tesla seems to continue that trend.

Minimal compared to Hawaii. The point was that comparing a dual-GPU, SP-focused product against a single-GPU, HPC/DP-focused product, and then saying the latter is not impressive because the former has more SP TFLOPS, is a bad comparison to make.
 

Azix

Golden Member
Apr 18, 2014
Yes: these "good on paper" AMD FLOPS are most of the time "useless", because AMD has no resources to provide adequate support.
For example, if you want to do deep learning (the trendiest HPC market right now), Nvidia provides lots of libraries and SDKs like cuDNN, whereas... you have nothing on AMD...
A good read:
http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/


some pages on nvidia website about deep learning:
http://www.nvidia.com/object/machine-learning.html
https://developer.nvidia.com/deep-learning
https://developer.nvidia.com/deep-learning-resources
and thousands of pages/studies/courses/research papers/source code available on Google.

And on AMD:
nothing, or close to nothing, with some hobby projects that are too rough to be seriously considered...

OpenCL is not "AMD's". It should basically be what everybody else uses. And clearly it's more developed than you suggest, since serious projects are using it.
 

NTMBK

Lifer
Nov 14, 2011
If by minimal you mean 4 times more than the Titan X, then I need to study my English dictionary again.

Interestingly, the Titan lineup was introduced with the Kepler generation as a professional GPU on the cheap (hellishly expensive for a GeForce, but a third the price of a pro card).
Maxwell made the Titan lineup a glorified regular GeForce at an insane premium, and Tesla seems to continue that trend.

Not all professional workloads need double precision. Nvidia offered GK210 for DP-heavy workloads; GM200 was for SP-heavy workloads.
 

Det0x

Golden Member
Sep 11, 2014
Yes: these "good on paper" AMD FLOPS are most of the time "useless", because AMD has no resources to provide adequate support.
For example, if you want to do deep learning (the trendiest HPC market right now), Nvidia provides lots of libraries and SDKs like cuDNN, whereas... you have nothing on AMD...
A good read:
http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/


some pages on nvidia website about deep learning:
http://www.nvidia.com/object/machine-learning.html
https://developer.nvidia.com/deep-learning
https://developer.nvidia.com/deep-learning-resources
and thousands of pages/studies/courses/research papers/source code available on Google.

And on AMD:
nothing, or close to nothing, with some hobby projects that are too rough to be seriously considered...

Hmm, so let me see if I understand you right, then.

The only advantage Nvidia possibly has is on the software side?

It just seems really strange that Nvidia's next-gen 16nm FinFET chip (an HPC-only GPU) can't match AMD's "old" 28nm tech (mainstream Fiji) in single-precision compute?

I mean, never mind if they have to throw 2x the die space at it to be able to compete (or even crush it), as long as they operate within the same power and thermal envelope (and even at a much lower cost).

@n0x1ous

Yeah, you're right about the double precision, no comparison there :thumbsup:
 

Erenhardt

Diamond Member
Dec 1, 2012
OpenCL is not "AMD's". It should basically be what everybody else uses. And clearly it's more developed than you suggest, since serious projects are using it.

Not to mention that those he linked are in C, and AMD has tools to run C and even CUDA code.

Not all professional workloads need double precision. Nvidia offered GK210 for DP-heavy workloads; GM200 was for SP-heavy workloads.

Does that mean DP performance increases are purely process-related, and we should expect to see a single generation of DP products per transistor generation?
 

Samwell

Senior member
May 10, 2015
People here speculating about a new Titan had better consider whether we will see GP100 as a consumer chip at all. My bet is no; GP100 will never go to consumers. This chip is clearly a pure HPC chip. It seems like the big gamer chip will be GP102, which was seen in drivers along with the other Pascal chips. Just forget GP100 for gaming. It won't ever come.
 

Kris194

Member
Mar 16, 2016
People here speculating about a new Titan had better consider whether we will see GP100 as a consumer chip at all. My bet is no; GP100 will never go to consumers. This chip is clearly a pure HPC chip. It seems like the big gamer chip will be GP102, which was seen in drivers along with the other Pascal chips. Just forget GP100 for gaming. It won't ever come.

It won't, because gamers don't need features such as NVLink.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
Does the GP100 even feature many of the previous fixed-function units?

If not, then it's absolutely worthless for gaming, and in that case there will be a GP102, which will be a much less impressive gaming successor to the GM200 ...
 

Qwertilot

Golden Member
Nov 28, 2013
Why less impressive? Surely it's a good thing if they're doing a gaming-focused big die?

It means you'll get most/all of the (non-trivial) benefits that Maxwell got vs. Kepler from being purely gaming/SP focused. It might not get HBM2 so early and such, but whatever.

It's a scary amount of engineering if they have split their product stack, but I guess the main writers were speculating about that with Maxwell already.
 

ThatBuzzkiller

Golden Member
Nov 14, 2014
Why less impressive? Surely it's a good thing if they're doing a gaming-focused big die?

It means you'll get most/all of the (non-trivial) benefits that Maxwell got vs. Kepler from being purely gaming/SP focused. It might not get HBM2 so early and such, but whatever.

It's a scary amount of engineering if they have split their product stack, but I guess the main writers were speculating about that with Maxwell already.

It means Nvidia will have to trade away SMs in order to get back the fixed-function units, which means lower theoretical compute performance in the end, supposing that GP100 didn't have those fixed-function units to begin with ...
 