NVIDIA Pascal Thread

Page 110
Feb 19, 2009
10,457
10
76
Thanks for sharing, this was an informative read.

Well, he's got real programming background so it's more credible than us forum warriors.

But I've been saying the same thing for a long time, well before we all recently found out that NV's uarch cannot handle graphics + compute concurrently.

It's been quite obvious since late 2014, when NV talked about VR and async timewarp in their programming guide.

No driver update is going to add DX12 async compute; the uarch just cannot handle it.

For those who want to know more about "Async Compute" (it really is more than that) and why Pascal is going to be great for modern games and VR:

http://forums.anandtech.com/showpost.php?p=38164220&postcount=349

http://forums.anandtech.com/showpost.php?p=38203007&postcount=374

http://forums.anandtech.com/showpost.php?p=38203009&postcount=375

We should really call the feature DX12 MULTI-ENGINE rather than DX12 Async Compute, because people jump to the conclusion that it's only related to shaders (queues for the ALUs/CUDA cores). In truth, it allows the rasterizer and DMA engines to process their queues in parallel with the ALUs/CUDA cores within the CU/SM cluster.

These steps would have been forced to run in serial mode in DX11, but they are allowed to run in parallel with DX12, if the uarch is compliant.
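
To make the serial-versus-parallel point concrete, here's a toy timing model in Python. It's only a sketch: the three engine names loosely map to the ALU, rasterizer, and DMA work described above, the millisecond figures are made-up placeholders rather than measurements, and the "overlapped" case assumes perfect overlap with no contention for shared resources, which real hardware won't fully achieve.

```python
# Toy model: per-frame work for each engine, in milliseconds.
# These durations are hypothetical placeholders, not measurements.
work_ms = {"alu": 6.0, "rasterizer": 3.0, "dma": 1.0}

def serial_frame_time(work):
    """Engines run one after another, so their times add up."""
    return sum(work.values())

def overlapped_frame_time(work):
    """Idealized multi-engine case: all engines run at once,
    so the frame takes as long as the slowest engine."""
    return max(work.values())

print(f"serial:     {serial_frame_time(work_ms):.1f} ms")
print(f"overlapped: {overlapped_frame_time(work_ms):.1f} ms")
```

With these placeholder numbers the serial case takes 10.0 ms and the idealized overlapped case 6.0 ms. How much of that gap a real GPU recovers depends on the uarch.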
 

jpiniero

Lifer
Oct 1, 2010
15,176
5,717
136
If I were in charge of branding these cards, I would absolutely NOT call GP104-150 a "1060 Ti" or even a "1060" because, on some level, customers will feel that last gen they got an X70 card for $339 and now this generation they get only an X60 card. Extremely bad for business/image.

This card is rumored to be OEM only though. Calling it the 1060 Ti is fine, especially if it really is cut so far to only be in the 970-980 performance range.
 
Mar 10, 2006
11,715
2,012
126
Well, he's got real programming background so it's more credible than us forum warriors.

But I've been saying the same thing for a long time, well before we all recently found out that NV's uarch cannot handle graphics + compute concurrently.

It's been quite obvious since late 2014, when NV talked about VR and async timewarp in their programming guide.

No driver update is going to add DX12 async compute; the uarch just cannot handle it.

For those who want to know more about "Async Compute" (it really is more than that) and why Pascal is going to be great for modern games and VR:

http://forums.anandtech.com/showpost.php?p=38164220&postcount=349

http://forums.anandtech.com/showpost.php?p=38203007&postcount=374

http://forums.anandtech.com/showpost.php?p=38203009&postcount=375

We should really call the feature DX12 MULTI-ENGINE rather than DX12 Async Compute, because people jump to the conclusion that it's only related to shaders (queues for the ALUs/CUDA cores). In truth, it allows the rasterizer and DMA engines to process their queues in parallel with the ALUs/CUDA cores within the CU/SM cluster.

These steps would have been forced to run in serial mode in DX11, but they are allowed to run in parallel with DX12, if the uarch is compliant.

It will be interesting to see what graphics-focused updates NVIDIA makes to Pascal. It shouldn't be too long now, based on this "Order of 10" website.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
As long as they can make the additional price worth it, I agree. The GTX 980 was actually very poorly positioned against the GTX 970, with the 970 offering a staggeringly better value (for the price of one 980 + $150, you could get GTX 970 in SLI) at launch.

Here's my guess, on the product stack:

GP104-400 -> GTX 1080 Ti @ $649
GP104-200 -> GTX 1080 @ $499
GP104-150 -> GTX 1070 @ $339

Whenever GP102 drops, I expect that this will be priced at $799 for the cut down version, and $999 for a full-blown Titan version. I expect these to be branded GeForce GTX 1090 and GeForce GTX Titan [whatever]. I doubt that dual GPU flagships will be in vogue anytime soon, so the X90 can be reserved for very high end single GPUs.

If I were in charge of branding these cards, I would absolutely NOT call GP104-150 a "1060 Ti" or even a "1060" because, on some level, customers will feel that last gen they got an X70 card for $339 and now this generation they get only an X60 card. Extremely bad for business/image.

NVIDIA must make it clear to customers that they are getting a better value today than they did before Pascal launched, and the branding scheme I outlined above would do that.

I think GTX 1070 can sell at USD 399 - USD 449. I think USD 499 is unlikely. For GTX 1060 Ti, USD 299 - USD 329 is the best price range. If GTX 1060 Ti can be close to GTX 980 Ti (say 5-10% slower) then it will be a hot seller. The key here is how competitive Polaris 10 is, how many SKUs there are, and what their pricing is.

It looks like there will be 3-4 SKUs based on Polaris 10.

http://wccftech.com/amd-gcn-4-0-c99-flagship-polaris-rra-certification/

I think a GDDR5X flagship is on the cards, but launch timing could depend on how much GDDR5X supply there is and whether Nvidia gets preferential supply from Micron.

http://vrworld.com/2016/04/13/amd-polaris-10-gpu-beat-competing-pascal/
 
Mar 10, 2006
11,715
2,012
126
I think GTX 1070 can sell at USD 399 - USD 449. I think USD 499 is unlikely. For GTX 1060 Ti, USD 299 - USD 329 is the best price range. If GTX 1060 Ti can be close to GTX 980 Ti (say 5-10% slower) then it will be a hot seller. The key here is how competitive Polaris 10 is, how many SKUs there are, and what their pricing is.

It looks like there will be 3-4 SKUs based on Polaris 10.

http://wccftech.com/amd-gcn-4-0-c99-flagship-polaris-rra-certification/

I think a GDDR5X flagship is on the cards, but launch timing could depend on how much GDDR5X supply there is and whether Nvidia gets preferential supply from Micron.

http://vrworld.com/2016/04/13/amd-polaris-10-gpu-beat-competing-pascal/

An alternative pricing structure could be:

GTX 1080 @ $550
GTX 1070 Ti @ $449
GTX 1070 @ $339-$349

The problem here though is that NVIDIA essentially nukes the $650 offerings from its lineup until such time that GP102 drops, and unless that's going to be soon, NVIDIA and its AIB partners aren't going to be happy with what would very well wind up being a blended average selling price decline.

That being said, the volume/ASP trade-off is a tricky one, so NVIDIA and the AIB people may be OK with such a "new" pricing stack if they think it will ultimately net everybody greater sales/profits.

We'll see soon enough.
 
Feb 19, 2009
10,457
10
76
It will be interesting to see what graphics-focused updates NVIDIA makes to Pascal. It shouldn't be too long now, based on this "Order of 10" website.

Can tell you right now.

They've effectively doubled their cache and registers per CUDA core per SM cluster. This means fewer trips to memory, and therefore less reliance on raw bandwidth to keep the CUDA cores busy.

Their warp scheduler and CUDA core layout in the SM reach "peak utilization" much faster than Maxwell (and a lot faster than Kepler). It's also optimized for 64-wide wavefronts, which are commonly used in console ports (due to GCN's optimal wavefront/layout). Kepler suffers a major drop due to this, and Maxwell too, though not as much.

The change of the uarch to support instant graphics <-> compute workload switches is a huge deal for modern games that are going compute heavy with effects.

The inclusion of fine-grained preemption is a massive win for VR latency. Expect NV to make a huge deal about this feature in particular.

These changes are why I said Pascal will be a modest improvement in older games, but in new games and DX12 the improvements will be much higher. Watch out for it: the best example is Quantum Break, where the way the engine interweaves graphics + compute + copy queues utterly destroys Maxwell (and is worse on Kepler).
 
Mar 10, 2006
11,715
2,012
126
Can tell you right now.

They've effectively doubled their cache and registers per CUDA core per SM cluster. This means fewer trips to memory, and therefore less reliance on raw bandwidth to keep the CUDA cores busy.

Their warp scheduler and CUDA core layout in the SM reach "peak utilization" much faster than Maxwell (and a lot faster than Kepler). It's also optimized for 64-wide wavefronts, which are commonly used in console ports (due to GCN's optimal wavefront/layout). Kepler suffers a major drop due to this, and Maxwell too, though not as much.

The change of the uarch to support instant graphics <-> compute workload switches is a huge deal for modern games that are going compute heavy with effects.

The inclusion of fine-grained preemption is a massive win for VR latency. Expect NV to make a huge deal about this feature in particular.

These changes are why I said Pascal will be a modest improvement in older games, but in new games and DX12 the improvements will be much higher. Watch out for it: the best example is Quantum Break, where the way the engine interweaves graphics + compute + copy queues utterly destroys Maxwell (and is worse on Kepler).

Hope your prediction is correct. I don't want to have dumped my 980 Tis for nothing.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
An alternative pricing structure could be:

GTX 1080 @ $550
GTX 1070 Ti @ $449
GTX 1070 @ $339-$349

The problem here though is that NVIDIA essentially nukes the $650 offerings from its lineup until such time that GP102 drops, and unless that's going to be soon, NVIDIA and its AIB partners aren't going to be happy with what would very well wind up being a blended average selling price decline.

That being said, the volume/ASP trade-off is a tricky one, so NVIDIA and the AIB people may be OK with such a "new" pricing stack if they think it will ultimately net everybody greater sales/profits.

We'll see soon enough.

I don't see a USD 100 price gap between GTX 1080 and the slower GTX 1070 given the perf gap and the higher bandwidth due to GDDR5X on GTX 1080. I think USD 150 is the minimum and USD 200 is what I expect given the higher cost/lower volumes of GDDR5X wrt GDDR5.

So the price range could be

GTX 1080 - USD 599 - USD 649
GTX 1070 - USD 399 - USD 449
GTX 1060 Ti - USD 299 - USD 329.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
Oh, leave it. He must have been confused with GeForce drivers bricking cards.

http://wccftech.com/nvidias-latest-game-ready-driver-allegedly-killing-gpus-plagued-issues/

Anyway, on topic: the 1080 is going to be significantly faster due to GDDR5X. My guess is 20-25%. This means Nvidia can get more users to pick the 1080 over the 1070, because overclocking a 1070 might not be enough to catch the 1080.

The 1080 will definitely have a larger gap over the 1070 than the 980 did over the 970. I doubt the 1070 will be able to catch the 1080 by overclocking, either. Nvidia has been differentiating performance between its products more and more since the GTX 680 and 670 first came out. The 670 could overclock to be slightly faster than the 680, the 660 Ti could overclock to be slightly faster than the 670, and the GTX 660 could at least catch the 660 Ti with overclocking. Then when the 700 series hit, no performance or enthusiast card could overclock to catch the one above it EXCEPT for the 780 OCing to 780 Ti levels (but only because GK110 overclocked better than the rest of the Kepler lineup). The same story played out with the 900 series: the 980 can't catch the 980 Ti, the 970 can't quite catch the 980, and the 960 comes nowhere close to the 970.

Given all that info, I still think there is potential for the 1070 to be a decent overclocker, but it'll depend on how much gas 8 GHz GDDR5 has left in the tank. If it can hit 9 GHz (I am skeptical of this), then you're looking at 288 GB/s of bandwidth, a 28% increase over the current GTX 980. Nvidia was able to increase performance 70% at the same bandwidth moving from GK104 to GM204, so it stands to reason that Nvidia will continue to make performance gains at similar bandwidth levels, even if not as much as the jump between Kepler and Maxwell.
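
For anyone who wants to check the bandwidth math, here's a quick sketch. It assumes a 256-bit memory bus on both cards (that's the GTX 980's bus width, and it's the width implied by the 288 GB/s figure at 9 GHz) and treats the "GHz" numbers as effective per-pin data rates in Gbps. The function name is just illustrative.

```python
def bandwidth_gb_s(data_rate_gbps, bus_width_bits=256):
    """Peak memory bandwidth in GB/s:
    per-pin data rate (Gbps) times bus width (bits), divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8

gtx_980 = bandwidth_gb_s(7.0)   # GTX 980: 7 Gbps GDDR5 on a 256-bit bus
stock = bandwidth_gb_s(8.0)     # 8 Gbps GDDR5, assumed 256-bit bus
oc = bandwidth_gb_s(9.0)        # hypothetical 9 Gbps overclock

gain_pct = (oc / gtx_980 - 1) * 100
print(f"GTX 980: {gtx_980:.0f} GB/s, 8 Gbps: {stock:.0f} GB/s, 9 Gbps: {oc:.0f} GB/s")
print(f"9 Gbps vs GTX 980: +{gain_pct:.1f}%")
```

That gives 224, 256, and 288 GB/s, with the 9 Gbps case about 28.6% above the GTX 980, which matches the roughly 28% quoted above.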

I do not expect 1070 to reach 1080 levels in performance, but I am cautiously optimistic that 1070 still has decent OC potential dependent upon Pascal's headroom of course.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
Volta didn't exist on any roadmap until early 2015; it was Pascal in the place of Volta (2018 didn't even exist back then), it was 2016/17.

Volta has been on roadmaps since early 2013 (although as I said without any dates), then in 2014 Pascal was announced and Maxwell and Volta were shuffled around (still with no date for Volta), and then finally in 2015 Volta was given a date of 2018.

So as I said, ever since a date was announced for Volta, it has always been 2018.
 

airfathaaaaa

Senior member
Feb 12, 2016
692
12
81
Volta has been on roadmaps since early 2013 (although as I said without any dates), then in 2014 Pascal was announced and Maxwell and Volta were shuffled around (still with no date for Volta), and then finally in 2015 Volta was given a date of 2018.

So as I said, ever since a date was announced for Volta, it has always been 2018.
I don't know why you stated the very same thing I stated, but OK.
 

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
Can tell you right now.

They've effectively doubled their cache and registers per CUDA core per SM cluster. This means fewer trips to memory, and therefore less reliance on raw bandwidth to keep the CUDA cores busy.

Their warp scheduler and CUDA core layout in the SM reach "peak utilization" much faster than Maxwell (and a lot faster than Kepler). It's also optimized for 64-wide wavefronts, which are commonly used in console ports (due to GCN's optimal wavefront/layout). Kepler suffers a major drop due to this, and Maxwell too, though not as much.

The change of the uarch to support instant graphics <-> compute workload switches is a huge deal for modern games that are going compute heavy with effects.

The inclusion of fine-grained preemption is a massive win for VR latency. Expect NV to make a huge deal about this feature in particular.

These changes are why I said Pascal will be a modest improvement in older games, but in new games and DX12 the improvements will be much higher. Watch out for it: the best example is Quantum Break, where the way the engine interweaves graphics + compute + copy queues utterly destroys Maxwell (and is worse on Kepler).
Everything right now will depend on core count and core clocks. SIMD vs. SIMD performance between GCN and Pascal will be similar, but the differences will come from memory bandwidth (for GCN) and core clocks (for Pascal).
 

zentan

Member
Jan 23, 2015
177
5
36
Volta has been on roadmaps since early 2013 (although as I said without any dates), then in 2014 Pascal was announced and Maxwell and Volta were shuffled around (still with no date for Volta), and then finally in 2015 Volta was given a date of 2018.

So as I said, ever since a date was announced for Volta, it has always been 2018.
Quite clear and precise :thumbsup:
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
I also said that Pascal was in the place of Volta, but OK.

Pascal was not put in the place of Volta, since Volta didn't have a place (time wise).

When Pascal was introduced, it was given a date of 2016. At that point in time (2014) Volta hadn't been given a date, so there was no place for Pascal to take.

It is quite possible that Volta was originally planned for earlier than 2018, and perhaps even 2016 (where Pascal ended up), but we have no real way of knowing this without access to Nvidia's internal roadmaps.

Anyways this is getting a bit off topic, so I won't bother belabouring the point any further.
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,143
131
Some last minute rumours/leaks before the event tomorrow (thanks to user PolyMorph @ Chiphell):

Translated said:
Take the various Maxwell reference and non-reference cards, add 500 MHz to the frequency, and you have Pascal. Current reference GM204 cards run at 1228~1291 MHz, which corresponds to 1734~1797 MHz. Non-reference cards actually run between 1316~1455 MHz, which corresponds to 1823~1962 MHz for Pascal. Frequencies are already close to the current upper limit, so allowing for some margin they had to go up.

...Pascal's purely architectural improvement is at most 10~15%; a same-clock comparison shows it, and it is certainly not as large as Maxwell's.

Original said:
&#25226;Maxwell&#21508;&#31181;&#20844;&#29256;&#12289;&#27597;&#29256;&#21345;&#65292;&#39057;&#29575;+500mhz&#65292;&#23601;&#26159;Pascal&#20102; &#29616;&#22312;GM204&#30340;&#20844;&#29256;&#22312;1228~1291mhz&#65292;&#23545;&#24212;&#23601;&#26159;1734~1797mhz &#27597;&#29256;&#23454;&#38469;&#39057;&#29575;&#22788;&#22312;1316~1455mhz&#38388;&#65292;&#23545;&#24212;Pascal&#23601;&#26159;1823~1962mhz &#39057;&#29575;&#24050;&#32463;&#25509;&#36817;&#29616;&#22312;&#19978;&#38480;&#65292;&#32771;&#34385;&#20887;&#20313;&#24517;&#39035;&#24448;&#19978;&#21152;&#20102;

...Pascal&#32431;&#26550;&#26500;&#25552;&#21319;&#39030;&#22810;10~15%&#65292;&#21516;&#39057;&#19968;&#27604;&#23601;&#20986;&#26469;&#65292;&#32943;&#23450;&#27809;&#26377;Maxwell&#37027;&#20040;&#22823;

www.chiphell.com/thread-1579388-1-1.html

500 MHz clock boost compared to Maxwell. 10-15% better IPC.

Reference GeForce GTX 1080 supposedly costs 3999 CNY. Probably US$599?
Custom models between 4500-5000+ CNY.
 
Feb 19, 2009
10,457
10
76
If they can hit 1.8 GHz, that's going to lead to some nice gains even compared to an OC'd 980 Ti.

10-15% IPC gains with a real 20% clock speed advantage (vs. 1.5 GHz Maxwell) make for a nice upgrade!
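
Rough math on that, as a sketch: it assumes performance scales linearly with both IPC and clock speed at equal core counts, which is an idealization (memory bandwidth and other bottlenecks are ignored). The 1.8 GHz and 1.5 GHz clocks and the 10-15% IPC range are the rumoured figures; the function name is just illustrative.

```python
def combined_speedup(ipc_gain, new_clock_ghz, old_clock_ghz):
    """Idealized speedup: IPC gain multiplied by the clock ratio.
    Assumes equal core counts and perfectly linear scaling."""
    return (1 + ipc_gain) * (new_clock_ghz / old_clock_ghz)

for ipc_gain in (0.10, 0.15):
    speedup = combined_speedup(ipc_gain, new_clock_ghz=1.8, old_clock_ghz=1.5)
    print(f"IPC +{ipc_gain:.0%}: {(speedup - 1) * 100:.0f}% faster than a 1.5 GHz Maxwell")
```

Under those assumptions the two gains multiply rather than add, landing at roughly 32-38% over a 1.5 GHz Maxwell card with the same core count.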
 
Mar 10, 2006
11,715
2,012
126
Some last minute rumours/leaks before the event tomorrow (thanks to user PolyMorph @ Chiphell):





www.chiphell.com/thread-1579388-1-1.html

500 MHz clock boost compared to Maxwell. 10-15% better IPC.

Reference GeForce GTX 1080 supposedly costs 3999 CNY. Probably US$599?
Custom models between 4500-5000+ CNY.

Exactly what one would expect from a move from 28nm to 16FF+: a big uplift in frequency. Combine that with a reasonable perf/clock boost from the architecture, and you have a recipe for a much faster, much more efficient GPU.
 

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
Some last minute rumours/leaks before the event tomorrow (thanks to user PolyMorph @ Chiphell):





www.chiphell.com/thread-1579388-1-1.html

500 MHz clock boost compared to Maxwell. 10-15% better IPC.

Reference GeForce GTX 1080 supposedly costs 3999 CNY. Probably US$599?
Custom models between 4500-5000+ CNY.
What it actually means is that a 1300 MHz Pascal GPU will have the performance of a Maxwell GPU with a similar core count clocked at 1800 MHz.
 
Mar 10, 2006
11,715
2,012
126
If they can hit 1.8 GHz, that's going to lead to some nice gains even compared to an OC'd 980 Ti.

10-15% IPC gains with a real 20% clock speed advantage (vs. 1.5 GHz Maxwell) make for a nice upgrade!

Fingers crossed. Hope these pack a solid 8GB of GDDR5X, too. Don't want to be VRAM limited anytime soon.
 

Glo.

Diamond Member
Apr 25, 2015
5,802
4,776
136
2560 CUDA core GPU clocked at 1.48 GHz could be 40% faster than GTX 980 Ti.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
What it actually means is that a 1300 MHz Pascal GPU will have the performance of a Maxwell GPU with a similar core count clocked at 1800 MHz.

That doesn't really seem to be the case. If that was what it meant, then it would imply a ~38% increase in IPC, but they say that the IPC increase is 10-15% at most.

Edit: On second thought you might be right, if they are comparing a 2560 core GP104 to a 2048 core GM204, then combined with a 10-15% increase in IPC, it should be 37.5-43.75% faster at iso-clocks, which lines up almost perfectly with the quoted numbers.
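
The two readings can be compared with a few lines of arithmetic. This is a sketch that assumes performance scales linearly with core count, IPC, and clock speed; the core counts (2560 for GP104, 2048 for GM204) and the 10-15% IPC range are the figures discussed above, and the variable names are just illustrative.

```python
# Reading 1: same core count, so a 1300 MHz Pascal matching an
# 1800 MHz Maxwell would need all of the gain to come from IPC.
required_ipc_gain = 1800 / 1300 - 1

# Reading 2: same clocks, but more cores plus the rumoured IPC gain.
core_ratio = 2560 / 2048
iso_clock_low = core_ratio * 1.10 - 1
iso_clock_high = core_ratio * 1.15 - 1

print(f"IPC gain needed at equal core count: {required_ipc_gain:.1%}")
print(f"Iso-clock gain with 25% more cores: {iso_clock_low:.2%} to {iso_clock_high:.2%}")
```

The required gain of about 38.5% sits inside the 37.5-43.75% iso-clock range, which is why the second reading lines up with the quoted numbers.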
 