Sapphire AMD HD 9970 News

kapulek · Jul 14, 2013

Boxleitnerb, top Hawaii has more than 2560sp and less than 2944sp according to sushiwarrior.

boxleitnerb · Jul 14, 2013

kapulek said:
Boxleitnerb, top Hawaii has more than 2560sp and less than 2944sp according to sushiwarrior.

How do you arrive at the 2944 number?
I wonder how AMD did it if this is indeed true. Either it really is 20nm and we're in for a hell of a surprise, or they squeezed everything out of the process and architecture and made the GPU quite large. Like AMD's GK110.

@Russian:

Tahiti is not pixel-fillrate limited, at least that is no bullet-proof assumption. Pitcairn and Tahiti LE have more pixel fillrate than a full Tahiti, yet the full Tahiti is way faster all the time. The 7950 (800 MHz) only has 11% more GFLOPs than the 7870. No wonder they can be close. The GPU clock influences all fillrates and computational power by the same factor, so there simply is no way to tell which part is bottlenecking.
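
As a rough check on the figures in this argument, the theoretical peaks can be worked out from reference specs. This is only a sketch: the spec values are assumed from public spec sheets (32 ROPs on every card; 1280 SPs at 1000 MHz for the 7870, 1792 SPs at 800 MHz for the 7950, 2048 SPs at 925 MHz for the 7970), and the helper names are made up for illustration.

```python
# Reference specs (assumed from public spec sheets): stream processors, ROPs, core MHz.
CARDS = {
    "HD 7870": {"sp": 1280, "rops": 32, "mhz": 1000},
    "HD 7950": {"sp": 1792, "rops": 32, "mhz": 800},
    "HD 7970": {"sp": 2048, "rops": 32, "mhz": 925},
}


def gflops(card):
    """Peak single-precision GFLOPs: 2 FLOPs (one FMA) per SP per clock."""
    return card["sp"] * 2 * card["mhz"] / 1000


def gpixels(card):
    """Peak pixel fillrate in GPixel/s: one pixel per ROP per clock."""
    return card["rops"] * card["mhz"] / 1000


def overclocked(card, factor):
    """Return a copy of the card with the core clock scaled by `factor`."""
    return {**card, "mhz": card["mhz"] * factor}


flops_ratio = gflops(CARDS["HD 7950"]) / gflops(CARDS["HD 7870"])
print(f"7950 vs 7870 GFLOPs: {flops_ratio:.2f}x")

for name, card in CARDS.items():
    print(f"{name}: {gflops(card):.0f} GFLOPs, {gpixels(card):.1f} GPixel/s")

# A core overclock scales both peaks by the same factor, so on its own it
# cannot separate a shader limit from a fillrate limit.
oc = overclocked(CARDS["HD 7970"], 1.10)
flops_gain = gflops(oc) / gflops(CARDS["HD 7970"])
pixel_gain = gpixels(oc) / gpixels(CARDS["HD 7970"])
print(f"10% overclock: GFLOPs x{flops_gain:.2f}, fillrate x{pixel_gain:.2f}")
```

With these assumed clocks the 7950 comes out at roughly 12% more peak GFLOPs than the 7870, in line with the figure quoted, while the 7870 has the higher peak pixel fillrate.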

The increased number of ACEs may help performance by better utilizing the ALUs in the CUs, but how that affects gaming remains to be seen. Bitcoin mining is irrelevant here.

kapulek · Jul 14, 2013

boxleitnerb said:
How do you arrive at the 2944 number?
I wonder how AMD did it if this is indeed true. Either it really is 20nm and we're in for a hell of a surprise, or they squeezed everything out of the process and architecture and made the GPU quite large. Like AMD's GK110.

I just picked 40 CUs and 46 CUs to narrow it down. I could have picked 48 CUs. That's all.
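
For reference, the shader counts follow directly from the CU counts, since each GCN compute unit holds 64 stream processors. A tiny sketch (the function name is just for illustration):

```python
SP_PER_CU = 64  # each GCN compute unit holds 64 stream processors


def stream_processors(cus):
    """Total stream processors for a given number of compute units."""
    return cus * SP_PER_CU


for cus in (40, 46, 48):
    print(f"{cus} CUs -> {stream_processors(cus)} SP")
```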

I wonder if AMD could really price it at ~$1000 if it's faster than Titan.

As for Pitcairn's successor, I think it would be ~26-34CUs (I don't have any clue here).

boxleitnerb · Jul 14, 2013

If that is still on 28nm, they have finally caught up with Nvidia's large GPUs, or even surpassed them, in both perf and perf/W at the same time. If this is the case, it can only be good for us consumers. Healthy competition.
I doubt they would charge $1000. AMD needs to regain market and mind share; the ratio is still 2:1 in favor of Nvidia. I doubt AMD would be willing to miss that opportunity by milking us, too.

RussianSensation · Jul 14, 2013

boxleitnerb said:
Tahiti is not pixel-fillrate limited, at least that is no bullet-proof assumption. The GPU clock influences all fillrates and computational power by the same factor, so there simply is no way to tell which part is bottlenecking.

Yes it is. You can check it by looking at how overclocking affects performance in certain games. I am not going to go through this again. I've done it before. GCN is pixel fill-rate limited. The entire front-end of Tahiti is a bottleneck. AMD's cards in general have been pixel fill-rate deficient for many generations now. AMD managed to increase ROP efficiency 50% which is why they kept it at 32 since 6970 but it's still not enough. NV is already on 48 ROPs. They have a better balanced design. The 580 had a major texture fill-rate limitation, which was corrected in 680.

If you look at various games and analyze benchmarks and resolution, you can usually find a bottleneck in a particular architecture. We knew 5870/6970 were tessellation bottlenecked, 580 was texture fill-rate bottlenecked, and Tahiti needs more ROPs/faster ROPs. I called it a long time ago that 580 was texture fill-rate limited and NV nearly tripled it with 680.

Here is another hint: HD7870 and HD7970GE have nearly identical pixel fill-rate, but HD7970GE has 60% more texture performance, 87% more memory bandwidth, and 68% more shader/GFLOPs performance.
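
These percentages can be roughly reproduced from reference specs. The sketch below assumes public spec-sheet values (HD 7870: 1280 SPs, 80 TMUs, 32 ROPs, 1000 MHz, 256-bit at 4.8 Gbps; HD 7970 GHz Edition: 2048 SPs, 128 TMUs, 32 ROPs, 1000 MHz base and 1050 MHz boost, 384-bit at 6.0 Gbps). The helper names are hypothetical.

```python
# Assumed reference specs; the 7970 GHz Edition is listed at base and boost clock.
HD7870 = {"sp": 1280, "tmus": 80, "rops": 32, "mhz": 1000, "bus_bits": 256, "gbps": 4.8}
HD7970GE_BASE = {"sp": 2048, "tmus": 128, "rops": 32, "mhz": 1000, "bus_bits": 384, "gbps": 6.0}
HD7970GE_BOOST = {**HD7970GE_BASE, "mhz": 1050}


def peaks(card):
    """Theoretical peak rates: GPixel/s, GTexel/s, GFLOPs and GB/s."""
    ghz = card["mhz"] / 1000
    return {
        "pixel": card["rops"] * ghz,
        "texture": card["tmus"] * ghz,
        "shader": card["sp"] * 2 * ghz,
        "bandwidth": card["bus_bits"] / 8 * card["gbps"],
    }


def advantage_pct(card, baseline):
    """Percentage advantage of `card` over `baseline` for each peak rate."""
    a, b = peaks(card), peaks(baseline)
    return {k: round((a[k] / b[k] - 1) * 100, 1) for k in a}


base = advantage_pct(HD7970GE_BASE, HD7870)
boost = advantage_pct(HD7970GE_BOOST, HD7870)
print("at base clock: ", base)
print("at boost clock:", boost)
```

Under these assumptions the 60% figure matches the base clock and the 68% figure matches the boost clock, while memory bandwidth comes out at 87.5% either way and pixel fill-rate differs by 0-5%.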

In Crysis 3, which stresses everything, the HD7970GE is less than 50% faster than the HD7870, but it should be much faster.

Looking at high-resolution gaming that stresses textures, we can note that the HD7970GE is only 48% faster than the 7870 at 1600p:
http://www.computerbase.de/artikel/grafikkarten/2013/nvidia-geforce-gtx-760-im-test/4/

The common denominator is that HD7970GE has a pixel fill-rate deficiency plus ACE and geometry engine bottlenecks, since those are the least improved over the 7870. If HD7970GE were not pixel fill-rate limited, it would be beating the 7870 by a much larger margin, since GCN scales well with shaders & textures. That means pixel fill-rate is a key component in many games.

boxleitnerb said:
The increased number of ACEs may help performance by better utilizing the ALUs in the CUs, but how that affects gaming remains to be seen. Bitcoin mining is irrelevant here.

Again, bitcoin mining benefits from efficient ALUs. With more ACEs, the ALUs are better utilized. If you read on GCN architecture, ACEs are critical for scheduling compute workloads.

"Meanwhile on the compute side, AMD’s new Asynchronous Compute Engines serve as the command processors for compute operations on GCN. The principal purpose of ACEs will be to accept work and to dispatch it off to the CUs for processing. As GCN is designed to concurrently work on several tasks, there can be multiple ACEs on a GPU, with the ACEs deciding on resource allocation, context switching, and task priority."

Since games use compute, more ACEs will allow GCN to perform compute calculations faster because 2 ACEs won't be bottlenecking as much.

Tessellation is another factor where GCN lags Kepler. Adding a 3rd geometry engine will help here too.

The card you just listed addresses 3 of the 4 weak areas in Tahiti: ROPs, ACEs and geometry engines. There is no way it would only be 25% faster on average based on the specs you listed. It is actually mathematically impossible, since I have already shown you that if texture & shader performance alone improve 26% on Tahiti, performance improves 19-21%. The problem is that Tahiti doesn't scale linearly after a while, because shader & texture performance increase much more than pixel fill-rate: you are applying higher GPU clock speeds to more shaders & texture units, but there are still only 32 ROPs.

You seem to have added almost no performance increases related to:

1) 3rd geometry engine
2) doubling of ACEs
3) 50% increase in ROPs.

You are assigning just a 5-6% increase in performance from those 3 factors. Does not compute. Your specs are either too high or your estimate for increase in speed is too low.

The other thing AMD needs to do is up the texture performance. Once they fix the pixel fill-rate, they'll be quickly running into a TMU problem because that will be the next bottleneck.

For FP16 textures, Titan is 88% faster than Tahiti:
http://techreport.com/review/24381/nvidia-geforce-gtx-titan-reviewed/6

GFLOPs/shader performance & memory bandwidth are the least important areas to address in Tahiti. It has shader performance by the truckload. AMD needs to focus on ROPs, ACEs, geometry and textures. It's a little more complicated though, since while Tahiti already has stronger shader performance than even the Titan, with compute features in games you still want more ALUs for global illumination, SSAA passes, etc. Strictly from a shader performance standpoint, though, GCN has more than enough.

boxleitnerb · Jul 14, 2013

ROPs are the backend and responsible for pixel fillrate, not the frontend...
And as I said, you cannot tell by overclocking what part of the GPU is mainly responsible for the performance gain. Clock speed influences everything. Some synthetic benchmarks are not enough to come to a certain conclusion. Bitcoin mining certainly does not have the same kind of workload as games.

The fact that Tahiti is not as fast as its GFLOPs, bandwidth, etc. suggest doesn't necessarily imply a pixel-fillrate bottleneck. Relatively poor usage of the ALUs due to a weak frontend may also be the reason. Pitcairn has a dual frontend, as does Tahiti. That right there is the point that you keep ignoring.

Also you cannot know what games stress what parts of a GPU. Stop pretending to be an expert there, you cannot possibly know that.

I agree on the ACEs though, they should help with utilization. I disagree about compute, at least in that context. The increased ACE count should help everywhere, even with older games. What do you think the ALUs do in games from, let's say, 2006 - nothing? They run shader programs, too, and thus depend on throughput.

As for the specs I mentioned and the performance gain - hint: clock speed.

champion-7891 · Jul 14, 2013

kapulek said:
Boxleitnerb, top Hawaii has more than 2560sp and less than 2944sp according to sushiwarrior.

2688 SP seems more likely. 7790 x 3... that also makes it consistent with 48 ROPs, a 384-bit bus width and 168 TMUs.
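
The "7790 x 3" arithmetic can be sketched as follows, assuming the public Bonaire (HD 7790) configuration of 896 SPs, 56 TMUs, 16 ROPs and a 128-bit bus. The scaling is naive unit multiplication for illustration, not a claim about how a real die would be laid out.

```python
# Assumed Bonaire (HD 7790) configuration from public spec sheets.
BONAIRE = {"sp": 896, "tmus": 56, "rops": 16, "bus_bits": 128}


def scaled(chip, n):
    """Multiply every unit count by n (naive scaling, not a real die plan)."""
    return {unit: count * n for unit, count in chip.items()}


print(scaled(BONAIRE, 3))
```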

champion-7891 · Jul 14, 2013

kapulek said:
I just picked 40 CUs and 46 CUs to narrow it down. I could have picked 48 CUs. That's all.

I wonder if AMD could really price it at ~$1000 if it's faster than Titan.

As for Pitcairn's successor, I think it would be ~26-34CUs (I don't have any clue here).

A 1792-shader part (28 CUs, I think)... 7790 x 2... since that's basically a 7950, give it higher clock speeds (1000-1100 MHz) and 6-7 GHz memory on a 256-bit bus, and it will come quite close to the 7970GE.

Disclaimer: that's a guesstimate.

tweakboy · Jul 14, 2013

9970 @ 500 to 600 dollars. At that time I think they'll update. gl

rgallant · Jul 14, 2013

I want it big and fast so NV doesn't drop a small-die 20nm Maxwell with +30% [2.5 GB VRAM] over a GTX 770, like they did with the GTX 680 over the 580, suck up all the high-end sales [this is what we have for you @ $550, GTX 680 4 GB], then release the big guys the following year at outrageous prices [with no release plans].

Jacky60 · Jul 15, 2013

Are you getting 90% scaling with 2 6990s though?

Any reason you want to wait for HD9970 for this title? NV runs this game very well too. I'd consider dual 780s for you instead of waiting until October-November.

Side-note: HD7970GE is 100% faster in Arma 3 over HD6970 and easily hangs with GTX590/HD6990 - last gen's dual-GPU flagships! So much for people saying GCN was a marginal upgrade over 6970. In many new titles coming out, 7970GE is crushing the 6970 by 60-90%. GTX780 is ~ 2.8x faster than HD5870. That means after-market 780 will be > 3x faster. :awe:

RS, thanks for that. I can never find this stuff on GPURU (Cyrillic reading skills lacking!). I want to see what Ivy Bridge-E brings, as I really 'need' 6 cores for upcoming games, and I'm also saving money for a GPU upgrade at the same time. My 4 GHz 920 is holding me back big time, I'm pretty sure. Prices will also have to drop if AMD brings some hurt, but I may well go Nvidia as I fancy a change. Pretty sure I'm not getting 90% scaling, but I never bothered to check.

boxleitnerb · Jul 15, 2013

Okay then, let's speculate again (hopefully more accurately this time).
I won't state clocks, just performance. So if performance seems too high/low, it's because of higher/lower clocks.

9970 (3x Bonaire):

2688 SP
168 TMUs
48 ROPs
4 raster engines and 4 ACEs
384-bit, 7 Gbps, 6 GB GDDR5
Power consumption around 7970 GHz
Performance 7970 GHz +40%

9950:

2304 SP
144 TMUs
44-48 ROPs
4 raster engines and 4 ACEs
384-bit, 6 Gbps, 3 GB GDDR5
Power consumption around 7970
Performance 7970 GHz +20%

Any takers?
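
A quick consistency check on the two guesses above, as a sketch. It assumes the usual GCN layout of 64 stream processors and 4 texture units per compute unit, and the standard peak-bandwidth formula (bus width in bytes times per-pin data rate); the names are illustrative only.

```python
SP_PER_CU = 64   # GCN: 64 stream processors per compute unit
TMU_PER_CU = 4   # GCN: 4 texture units per compute unit

# The two speculative configurations from the post.
GUESSES = {
    "9970": {"sp": 2688, "tmus": 168, "bus_bits": 384, "gbps": 7.0},
    "9950": {"sp": 2304, "tmus": 144, "bus_bits": 384, "gbps": 6.0},
}


def bandwidth_gbs(bus_bits, gbps):
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_bits / 8 * gbps


def compute_units(sp):
    """Number of CUs for a shader count; rejects counts that are not whole CUs."""
    cus, rest = divmod(sp, SP_PER_CU)
    if rest:
        raise ValueError(f"{sp} SP is not a whole number of CUs")
    return cus


for name, g in GUESSES.items():
    cus = compute_units(g["sp"])
    tmus_ok = cus * TMU_PER_CU == g["tmus"]
    bw = bandwidth_gbs(g["bus_bits"], g["gbps"])
    print(f"{name}: {cus} CUs, TMUs consistent: {tmus_ok}, {bw:.0f} GB/s")
```

Both guesses work out to whole CU counts (42 and 36) with matching TMU counts, and to 336 GB/s and 288 GB/s of peak bandwidth respectively.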

Erazor51 · Jul 15, 2013

Elfear · Jul 15, 2013

Erazor51 said:

Someone want to fly to Taiwan and bring us back some good info? :biggrin:

Edit: Nevermind. Just noticed the date of the class.

Borealis7 · Jul 15, 2013

boxleitnerb said:
Okay then, let's speculate again (hopefully more accurately this time).
I won't state clocks, just performance. So if performance seems too high/low, it's because of higher/lower clocks.

9970 (3x Bonaire):

2688 SP
168 TMUs
48 ROPs
4 raster engines and 4 ACEs
384-bit, 7 Gbps, 6 GB GDDR5
Power consumption around 7970 GHz
Performance 7970 GHz +40%

9950:

2304 SP
144 TMUs
44-48 ROPs
4 raster engines and 4 ACEs
384-bit, 6 Gbps, 3 GB GDDR5
Power consumption around 7970
Performance 7970 GHz +20%

Any takers?

40% improvement at same TDP without even shrinking the silicon? impossibru!

railven · Jul 15, 2013

Just read some of the posts in this thread and I'd like to share my opinion early:

I'll draw the line at $600 (since prior to the HD 4870 I was used to dropping $400-600 on GPUs), but anything above that and AMD has lost me as a customer.

boxleitnerb · Jul 15, 2013

Borealis7 said:
40% improvement at same TDP without even shrinking the silicon? impossibru!

Pitcairn and Bonaire have a significantly better perf/W ratio than Tahiti, like 20-25%:
7970 non-GHz: 1.85 perf/W
7850: 2.29 perf/W
7790: 2.28 perf/W
http://www.3dcenter.org/artikel/grafikkarten-marktueberblick-juli-2013

If AMD uses one of those two as the basis and aggressively bins its chips, it could be in the realm of possibility.
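
To put numbers on that, here is a small sketch using the perf/W index values quoted above. It treats the non-GHz 7970 value as the baseline, which is an approximation since the +40% target was stated relative to the 7970 GHz Edition; the variable and function names are made up for illustration.

```python
# Perf/W index values quoted in the post (3DCenter overview).
PERF_PER_WATT = {"7970": 1.85, "7850": 2.29, "7790": 2.28}

TARGET_GAIN = 1.40  # +40% performance at the same power means +40% perf/W


def efficiency_gain(chip, baseline="7970"):
    """Perf/W of `chip` relative to `baseline`."""
    return PERF_PER_WATT[chip] / PERF_PER_WATT[baseline]


for chip in ("7850", "7790"):
    gain = efficiency_gain(chip)
    remaining = TARGET_GAIN / gain
    print(f"{chip}: {gain - 1:.1%} better perf/W than 7970, "
          f"{remaining - 1:.1%} still needed from binning, clocks or voltage")
```

On these figures the architecture-level difference covers roughly 23-24%, leaving something like 13-14% to come from elsewhere.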

BUnit1701 · Jul 15, 2013

Here's a little tidbit for all of you screaming that Apple is going to eat all of the 20nm production.

http://www.dailytech.com/Report+Sam...2014+Make+14+nm+iPhone+7+SoC/article31963.htm

According to that, TSMC is only under contract to produce the older A4 SoC, nothing new and fancy that would take away all the 20nm production.

Nachmanowicz · Jul 15, 2013

Rumour has it they're already shipping samples to partners.

I'm going to the US late October and leaving November the 11th. Any chance it will be already out? Srsly hoping I can take one back to Braziland. Oh, and I'm drawing a line at $550. I mean, come on, it's already more than the overpriced XBox One!!!

RussianSensation · Jul 15, 2013

Nachmanowicz said:
Rumour has it they're already shipping samples to partners.

I'm going to the US late October and leaving November the 11th. Any chance it will be already out? Srsly hoping I can take one back to Braziland. Oh, and I'm drawing a line at $550. I mean, come on, it's already more than the overpriced XBox One!!!

You should take 3, sell 2 and walk away with a nearly free one. :biggrin: If AMD takes until November 2013 to launch this card, it would have to be faster or cheaper than the 780 or it would defeat the purpose being that late.

sontin · Jul 15, 2013

boxleitnerb said:
Pitcairn and Bonaire have a significantly better perf/W ratio than Tahiti, like 20-25%:
7970 non-GHz: 1.85 perf/W
7850: 2.29 perf/W
7790: 2.28 perf/W
http://www.3dcenter.org/artikel/grafikkarten-marktueberblick-juli-2013

If AMD uses one of those two as the basis and aggressively bins its chips, it could be in the realm of possibility.

The 7970 offers a 4:1 SP : DP ratio. So you think they will only offer 24:1 or so?!

boxleitnerb · Jul 15, 2013

sontin said:
The 7970 offers a 4:1 SP : DP ratio. So you think they will only offer 24:1 or so?!

Does a higher DP/SP ratio cost only transistors or also power consumption when DP is not in use? Aside from that, if they bring the voltage down by 0.1V or so, they could save 15-20% power already.
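
The 15-20% figure is roughly what the first-order dynamic power relation P ~ C * V^2 * f gives at a fixed clock. A minimal sketch, assuming hypothetical starting voltages (real cards vary by bin and board) and ignoring static leakage:

```python
def dynamic_power_ratio(v_old, v_new, f_old=1.0, f_new=1.0):
    """Ratio of dynamic power after/before, using P ~ C * V^2 * f."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Hypothetical starting voltages, chosen only for illustration.
for v_old in (1.10, 1.175, 1.25):
    saving = 1 - dynamic_power_ratio(v_old, v_old - 0.1)
    print(f"{v_old:.3f} V -> {v_old - 0.1:.3f} V: {saving:.1%} less dynamic power")
```

For starting voltages in this range, a 0.1 V drop at the same clock works out to about 15-17% less dynamic power.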

sushiwarrior · Jul 15, 2013

boxleitnerb said:
if they bring the voltage down by 0.1V or so, they could save 15-20% power already.

That is unlikely :whiste:

boxleitnerb · Jul 15, 2013

You damn tease

john88 · Jul 15, 2013

is it out yet or not?
