- Tahiti had added redundancy (which incurred a bigger die area, on top of the area penalty for a 1.5x wider bus and higher DP rate) in order to achieve better yields.
The problem with that statement is that AMD explicitly denied it. The initial allegation was that Tahiti has 8 purely redundant CUs (for a total of 40, with only 32 active). That's an incredible amount of redundancy that can't possibly be exploited profitably over the lifetime of a product. It makes much more sense to sell partially disabled products initially and then deploy fully enabled ones later on, which is exactly what Nvidia did for GK110. GK104 was small enough for it not to really matter. A 1-out-of-5 redundancy ratio is a staggering amount of real estate wasted, and it becomes a bigger liability with time. AMD engineers are not stupid. If you look at typical redundancy ratios, as used in RAMs etc., you get ratios of 1 out of 128 and such. The reason for this is that redundancy math is one of extremely rapid diminishing returns.
This is easy to see as follows:
- You need a perfect die.
- That means: zero defects in all non-RAM structures. (Assuming larger RAMs already have redundancy.)
- If 1 defect can kill a die, and yields are nonetheless workable, then defects must in practice be quite rare, generally speaking.
- This means that 1 redundant group will already recover the vast majority of otherwise failing dies. It has the power to increase a yield of, say, 70% to, say, 95%. (Once again: RAM redundancy ratios support this claim.)
- Adding another level of redundancy is plain stupid, because, at best, it can only recover an additional 5%.
- And let's not forget: as time goes by, that 70% will go up, as will the 95%, so the payoff gets even smaller, and the cost even higher. (The toy calculation below illustrates how fast the returns diminish.)
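Here's a back-of-the-envelope version of that math, assuming a simple Poisson defect model. Every number in it (defect density, CU size, die layout) is invented purely for illustration, not an actual Tahiti figure:

```python
# Toy yield model under a Poisson defect assumption. All numbers are
# illustrative placeholders, not real 28nm / Tahiti data.
from math import comb, exp

def sellable_yield(defect_density, rest_area, cu_area, n_cus, n_spares):
    """Fraction of dies sellable when up to n_spares defective CUs can be fused off.

    defect_density : defects per mm^2
    rest_area      : die area outside the CU array (mm^2), must be defect-free
    cu_area        : area of a single CU (mm^2)
    """
    p_cu_bad = 1.0 - exp(-defect_density * cu_area)   # P(a given CU has >= 1 defect)
    p_rest_ok = exp(-defect_density * rest_area)      # P(non-CU logic is clean)
    # Sellable if the rest of the die is clean and at most n_spares CUs are bad.
    p_cus_ok = sum(comb(n_cus, k) * p_cu_bad**k * (1.0 - p_cu_bad)**(n_cus - k)
                   for k in range(n_spares + 1))
    return p_rest_ok * p_cus_ok

# Invented numbers: 0.001 defects/mm^2, 32 CUs of 5 mm^2 each, 205 mm^2 of other logic.
for spares in range(4):
    y = sellable_yield(0.001, 205.0, 5.0, 32, spares)
    print(f"{spares} spare CU(s): {100 * y:.1f}% yield")
```

With these made-up inputs the first spare CU lifts yield from roughly 70% to roughly 81%, the second adds less than a point, and the third adds essentially nothing, because the remaining losses come from defects outside the CU array. That's the diminishing-returns curve in a nutshell.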
- This allowed AMD not only to have a few months' head start compared to NV, but also to have a better yield progression curve than NV, because their design was mostly die-area optimized at the expense of yields.
Yes, because correlation always means causation. /s
Here's the trouble with that statement: defect density changes very gradually over time. The difference in in-store availability between the 7970 and the GTX680 was all of 2 months. Defect density doesn't change much in such a short amount of time.
- Having a better yield percentage with fewer total dies (functional + nonfunctional ones) per wafer might result in roughly the same cost per functional die as having more dies per wafer but a lower yield rate.
It might, but it's unlikely. GTX680 had a die that was small enough (~300 mm²) for it not to matter all that much. And whatever wasn't perfect could be used anyway for the GTX670. That's a crucial point you forgot: the vast majority of the not-fully-functional dies that you tally up as a negative for the GTX680 could be reused for the GTX670, which arrived on the scene just a couple of months later. So they were never a loss in the first place.
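Just to put some (entirely made-up) numbers on that cost argument, here's a crude cost-per-good-die sketch; wafer price, die areas and yields are all assumptions, and it deliberately ignores salvage parts, which is exactly what tilts the real-world math:

```python
# Crude cost-per-good-die comparison. Wafer price, die areas and yields are
# assumptions for illustration only, not real 28nm figures. Salvage (selling
# imperfect dies as a cut-down SKU) is deliberately ignored here.
from math import pi

WAFER_COST = 5000.0              # hypothetical processed-wafer cost, USD
WAFER_AREA = pi * 150.0 ** 2     # rough usable area of a 300 mm wafer, mm^2

def cost_per_good_die(die_area_mm2, yield_fraction):
    gross_dies = WAFER_AREA // die_area_mm2   # ignores edge loss and scribe lines
    return WAFER_COST / (gross_dies * yield_fraction)

# Smaller die with lower yield vs. bigger die with higher yield (invented numbers):
print(f"300 mm^2 @ 60%: ${cost_per_good_die(300, 0.60):.2f} per good die")
print(f"365 mm^2 @ 75%: ${cost_per_good_die(365, 0.75):.2f} per good die")
```

Depending on the numbers you plug in, the two cases can indeed land close to each other; but as soon as most of the imperfect GK104 dies can be sold as a GTX670 instead of being scrapped, the effective yield of the smaller die goes way up and the comparison falls apart.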
- Overall, AMD's design decisions behind Tahiti were OK, considering they allowed AMD to get a head start on the 28nm node by shipping products earlier, and to reach better yield rates earlier than NV to compensate for the bigger die area, as a consequence of the point explained earlier.
You have a product that takes 3 years to develop. What do you think is more likely? That this extra redundancy made a difference in time-to-market of 2 months, or that 2 months in a large engineering project are in the noise?
...the rest of your statement...
Agreed.