[VR-Zone] AMD's upcoming Tonga GPU to be released in mid-August

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
Finally, a reliable site, VR-Zone, has mentioned when Tonga will be released.

In August a new card will be released, but it's not based on 20nm, nor is it high end.

Reliable sources have told us that a new card that will replace the R9 280 is expected to launch in August. In May we reported that AMD planned to launch a new card called Tonga, which would be positioned between the R9 270X and R9 280, but the card will actually replace Tahiti Pro (R9 280). The specs are unknown, but it will have 2GB of GDDR5 VRAM.

Source and read more here:
http://chinese.vr-zone.com/119702/amd-might-replace-raderon-r9-280-from-tahiti-pro-to-tonga-06302014/

Google translate:
http://translate.googleusercontent.com/translate_c?depth=1&nv=1&rurl=translate.google.com&sl=auto&tl=en&u=http://chinese.vr-zone.com/119702/amd-might-replace-raderon-r9-280-from-tahiti-pro-to-tonga-06302014/&usg=ALkJrhhZ6ME4JV8lcQghgDKtW0cZ9yPUbQ
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
Tonga is the highest-end mobile part and the mid-end for Volcanic Islands. It will have 4 GB GDDR5 for mobile and 4/8 GB GDDR5 for desktop/professional.

Hawaii Pro ~=~ Tonga XT.

If designed well a 20-nm GPU can have ~1.9x increase in clock rates, ~0.6x decrease in power consumption, with a 2x density increase.

352 mm² * ~0.5 => ~176 mm², and there are plenty of architectural enhancements for Volcanic Islands. While I don't know exactly what is meant for VI or PI, Tonga is definitely not a filler between Pitcairn XT and Tahiti Pro.
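
As a sanity check on the area arithmetic above, here is a minimal Python sketch. It assumes the whole 352 mm² die shrinks uniformly with logic density, which is an idealisation (I/O and analog blocks usually shrink less). `shrunk_area` is just an illustrative helper name.

```python
def shrunk_area(area_mm2, density_gain):
    """Ideal die area after a shrink, assuming every block scales with density."""
    return area_mm2 / density_gain

# 2x density, as used in the estimate above.
print(shrunk_area(352, 2.0))            # 176.0
# 1.9x density, the figure TSMC quotes for 20nm vs 28nm.
print(round(shrunk_area(352, 1.9), 1))  # 185.3
```

With the 1.9x figure instead of a flat 2x, the same die lands closer to 185 mm² than 176 mm².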
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81
No. A 1.9x increase in clock rate has not happened for a long, long time. Go read TSMC's predictions for their process. And all those increases do not happen at once; they are mutually exclusive. You gain either more performance OR lower power consumption.
Finally, I highly doubt that Tonga will be made in 20 nm.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
Tonga is the highest-end mobile part and the mid-end for Volcanic Islands. It will have 4 GB GDDR5 for mobile and 4/8 GB GDDR5 for desktop/professional.

Hawaii Pro ~=~ Tonga XT.

If designed well a 20-nm GPU can have ~1.9x increase in clock rates, ~0.6x decrease in power consumption, with a 2x density increase.

352 mm² * ~0.5 => ~176 mm², and there are plenty of architectural enhancements for Volcanic Islands. While I don't know exactly what is meant for VI or PI, Tonga is definitely not a filler between Pitcairn XT and Tahiti Pro.

What makes you think that Tonga will be 20nm? Nvidia had great results with 28nm just by doing a new architecture. They cut power consumption in half and almost matched GK106 in performance. On 28nm. So why can't AMD do the same and stick with 28nm?
Tonga for mobile, i.e. R9 M295X, does have 4GB VRAM, yes, so chances are the desktop Tonga will too.

The Ozone3d leak shows that the R9 M295X has 32 CUs, but I do wonder if it has some CUs disabled that desktop Tonga will have enabled.

Also, is there any chance that they changed the shader count per CU, away from 64 shaders/CU?

Is Tonga GCN 2.0, or is it the old GCN 1.0 (1.1)?

So many questions, yet so few answers. Can't wait to see the reveal.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
No. A 1.9x increase in clock rate has not happened for a long, long time. Go read TSMC's predictions for their process. And all those increases do not happen at once; they are mutually exclusive. You gain either more performance OR lower power consumption.
Finally, I highly doubt that Tonga will be made in 20 nm.


Xilinx spends about $550 mil per year on R&D.
AMD spends about $1.2 bil per year on R&D.

As nodes shrink, FPGAs become more difficult to build than ASICs.

The scenarios are with 2x density;
~4x performance increase for same power
~4x power consumption decrease for same perf
~2x perf + ~2x power

These numbers are dependent on the intelligence of the designers of the device.
What makes you think that Tonga will be 20nm?
67a1 - Hawaii PRO GL [FirePro W8100]
692b - Tonga XT GL [FirePro W8100]
Nvidia had great results with 28nm just by doing a new architecture. They cut power consumption in half and almost matched GK106 in performance. On 28nm. So why can't AMD do the same and stick with 28nm?
Nvidia cancelled the GM1xx series except for the GM107 and GM108 series. GM200, GM204, and GM206 are pretty much 20-nm GPUs.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
You can't use products with low clock frequencies and extrapolate that when trying to predict what more complex, higher clock frequency chips are going to run at. Boxleitnerb is correct. TSMC has said on multiple occasions that 20nm can yield up to 1.9x density over 28nm, but only 30% more performance at the same power consumption.
 

f1sherman

Platinum Member
Apr 5, 2011
2,243
1
0
You can't use products with low clock frequencies and extrapolate that when trying to predict what more complex, higher clock frequency chips are going to run at. Boxleitnerb is correct. TSMC has said on multiple occasions that 20nm can yield up to 1.9x density over 28nm, but only 30% more performance at the same power consumption.

Pretty much.

TSMC's 20nm process technology can provide 30 percent higher speed, 1.9 times the density, or 25 percent less power than its 28nm technology.

http://www.tsmc.com/english/dedicatedFoundry/technology/20nm.htm
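
To put TSMC's figures next to the ~1.9x clock claim from earlier in the thread, here is a rough Python sketch. The 1000 MHz and 250 W baselines are made-up illustration values, not specs of any real card, and the gains are treated as alternatives rather than things you get all at once.

```python
# TSMC's quoted 20nm-vs-28nm figures, as cited above.
SPEED_GAIN = 1.30          # 30 percent higher speed at the same power
POWER_FACTOR = 0.75        # 25 percent less power at the same speed
CLAIMED_CLOCK_GAIN = 1.9   # the clock increase claimed earlier in the thread

def faster_at_same_power(clock_mhz):
    """Clock after the shrink if all the gain is spent on speed."""
    return round(clock_mhz * SPEED_GAIN, 1)

def cooler_at_same_speed(power_w):
    """Power after the shrink if all the gain is spent on efficiency."""
    return round(power_w * POWER_FACTOR, 1)

base_clock_mhz = 1000.0  # hypothetical baseline
base_power_w = 250.0     # hypothetical baseline

print(faster_at_same_power(base_clock_mhz))            # 1300.0
print(cooler_at_same_speed(base_power_w))              # 187.5
print(round(base_clock_mhz * CLAIMED_CLOCK_GAIN, 1))   # 1900.0
```

So going by the foundry's own numbers, a 1000 MHz part becomes a 1300 MHz part at the same power, nowhere near 1900 MHz.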
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Pretty much.

TSMC's 20nm process technology can provide 30 percent higher speed, 1.9 times the density, or 25 percent less power than its 28nm technology.

http://www.tsmc.com/english/dedicatedFoundry/technology/20nm.htm

TSMC's own description is, as usual, nonsense, and I prefer tviceman's. But TSMC's description goes to show that density has been prioritised over performance. That's typical for TSMC vs. e.g. Intel, so no news here. The next node will probably be more similar to Intel 14nm for perf and density.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
TSMC's own description is, as usual, nonsense, and I prefer tviceman's. But TSMC's description goes to show that density has been prioritised over performance. That's typical for TSMC vs. e.g. Intel, so no news here. The next node will probably be more similar to Intel 14nm for perf and density.

Density is quite useful for GPUs, more important than frequency bumps. Graphics is a highly parallelizable task. Double a current GPU's logic units without increasing frequencies and you'll still get quite a jump in performance; you can see that in same-gen GPUs. The main difference between a mid-range card and a top-end card is the number of transistors. Unfortunately, due to increased costs, I doubt we will see the full 1.9x increase in transistor count for 20nm GPUs.
 

boxleitnerb

Platinum Member
Nov 1, 2011
2,601
2
81


Xilinx spends about $550 mil per year on R&D.
AMD spends about $1.2 bil per year on R&D.

As nodes shrink, FPGAs become more difficult to build than ASICs.

The scenarios are with 2x density;
~4x performance increase for same power
~4x power consumption decrease for same perf
~2x perf + ~2x power

These numbers are dependent on the intelligence of the designers of the device.
67a1 - Hawaii PRO GL [FirePro W8100]
692b - Tonga XT GL [FirePro W8100]
Nvidia cancelled the GM1xx series except for the GM107 and GM108 series. GM200, GM204, and GM206 are pretty much 20-nm GPUs.

No, just no. You have no idea what you are talking about. You have a history of posting made up nonsense.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
67a1 - Hawaii PRO GL [FirePro W8100]
692b - Tonga XT GL [FirePro W8100]

Come on, you've got to be more specific than that, Seronx.

Tonga - 1002:6920
R9 M295X - Amethyst XT Device 1002:6921
Hawaii PRO GL [FirePro W8100] - Device 1002:67a1
Tonga XT GL [FirePro W8100] - Device 1002:692b
The only thing I can make of this is that Tonga, Amethyst and Tonga XT GL (probably M8100) are based on the same chip (692X), while Hawaii PRO GL (W8100) is based on Hawaii (67aX & 67bX). R9 290X is Hawaii and has an ID of 67b1.

How on earth do you come to the conclusion that it's 20nm based on this?
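
The grouping of the device IDs listed above can be sketched in a few lines of Python. The IDs and names are the ones from this post; bucketing by the first three hex digits is only my reading of how the IDs cluster, not a documented AMD rule, and `group_by_prefix` is an illustrative helper name.

```python
from collections import defaultdict

# PCI device IDs (vendor 1002) as listed in this post.
DEVICES = {
    "6920": "Tonga",
    "6921": "Amethyst XT [R9 M295X]",
    "692b": "Tonga XT GL",
    "67a1": "Hawaii PRO GL [FirePro W8100]",
    "67b1": "Hawaii [R9 290X]",
}

def group_by_prefix(devices, prefix_len=3):
    """Bucket device names by the leading hex digits of their device ID."""
    groups = defaultdict(list)
    for dev_id, name in sorted(devices.items()):
        groups[dev_id[:prefix_len]].append(name)
    return dict(groups)

for prefix, names in group_by_prefix(DEVICES).items():
    print(prefix + "x:", ", ".join(names))
```

All this shows is which IDs share a family. Nothing in an ID range says anything about the process node.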
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Density is quite useful for GPUs, more important than frequency bumps. Graphics is a highly parallelizable task. Double a current GPU's logic units without increasing frequencies and you'll still get quite a jump in performance; you can see that in same-gen GPUs. The main difference between a mid-range card and a top-end card is the number of transistors. Unfortunately, due to increased costs, I doubt we will see the full 1.9x increase in transistor count for 20nm GPUs.

Yeah, it's a good point. I actually wondered why TSMC seems to be leaving this strategy after 20nm?

- The SoC has a huge GPU today, and it's probably going to be relatively even bigger going forward.

- The DSP part is also a more parallel task than the CPU, as I understand it?

Why go for more perf vs density?
 

buletaja

Member
Jul 1, 2013
80
0
66
You can guess why it is 256-bit with 32 CUs:

because AMD's leaked Chinese slide about Excavator
said they are chasing 256-bit FMA.
In fact they described 64 ALUs as 8 EX units of 256-bit FMA.
http://diybbs.zol.com.cn/11/11_106489.html
Google translate:
"CU 64 FMA's Scalar Unit re-constituted four support 256bit AVX complex vector or >> eight 256bit ADD / MUL << operation unit directly connected to a module"


So with 16 execution units per CU, as shown at MICRO-46:
1 EX unit = one 256-bit FMA
16 EX = 16 256-bit FMAs
1 CU = 16 256-bit FMAs
which means 1 CU = 128 ALUs
and only about 1.3 - 1.4 times the area per CU.

It also means the 32 CUs are actually 16 CUs (but with wider ALUs),
which is a perfect match for 256-bit
and for AMD's focus on compute shaders.
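
The ALU counting above, spelled out as a small Python sketch. It assumes each 256-bit FMA unit is split into 32-bit lanes; `alus_per_cu` is an illustrative helper name, and none of this is a confirmed Tonga spec.

```python
def alus_per_cu(exec_units, simd_width_bits=256, lane_bits=32):
    """32-bit ALU lanes in a CU built from `exec_units` FMA units of the given width."""
    lanes_per_unit = simd_width_bits // lane_bits  # 256 / 32 = 8 lanes
    return exec_units * lanes_per_unit

print(alus_per_cu(8))    # 64, today's CU described as 8 x 256-bit FMA
print(alus_per_cu(16))   # 128, the wider CU speculated above

# Same total ALU count either way: 32 narrow CUs or 16 wide CUs.
print(32 * alus_per_cu(8), 16 * alus_per_cu(16))  # 2048 2048
```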

There are also differences like the ones AMD showed for the TrueAudio block:
they will use local scratchpad memory (LSM) instead of the LDS.
I don't know if each EX also gets a 64KB scratchpad,
but based on AMD TrueAudio, it seems they will add scratchpad memory per EX too.

They showed this in Dec 2013 at MICRO-46.



They also counted it as 32 CUs,
but at MICRO-46 they showed the bandwidth as 700-800 GB/s,
so I think 32 CUs with 700-800 GB/s of bandwidth means
the actual CU count is 64, or, as above, 32 CUs with 128 ALUs each.

@SeronX, in fact, when you said it seems Kaveri's 8 CUs = 16 CUs,
it is because each CU has not 64 ALUs but 128 ALUs (they only enable it as GCN 1.0).

Why they need LSM is also related to this paper,
Mike Mantor's latest paper,
on a scalar unit for SIMT:
http://people.engr.ncsu.edu/hzhou/ipdps14.pdf

Basically they are trying to focus on energy efficiency.
They need to keep more of the computation local,
so instead of 1 scalar unit per 64 ALUs,
what about 1 scalar unit per 8 ALUs?
Then each execution unit also has its own LSM,
so each EX can also fetch instructions without the CPU,
which means the GPU becomes a first-class citizen too.

But VI is probably for the wider CU,
and then PI will also bring more scalar units per CU.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
@buletaja:
You are confusing me a bit here.

R9 M295X = 32 CUs. We know that based on the previous leak.

Are you saying Tonga will have 128 ALUs per CU? Or still 64 ALUs per CU?
Or 16 CUs and 128 ALUs per CU (which it isn't)?

If you are saying 128 ALUs per CU, which means a 30-40% bigger die size if it's 28nm, it should beat the R9 290X by a good margin. Which I highly doubt.
 

tviceman

Diamond Member
Mar 25, 2008
6,734
514
126
I have a feeling that Tonga will be AMD's 28nm version of Barts. Better performance per mm^2 and perf per watt, but it will ultimately end up a little bit slower than the R9 280 / HD 7950 Boost.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
20-nm Tonga;

1 Shader Engine;
64 KB(4x16 KB) Instruction Cache
128 KB(4x32 KB) Scalar Cache
256 KB(16x16 KB) L1 caches
256 KB(4x64 KB) L2 caches + 4(4x1) RBEs
8 MB L3 cache + DMA Engines + MCT/Heterogeneous DCTs
1024(16x64) 32-bit ALUs paired in 256 ALU arrays

(A) -> One 8 MB L3 interconnects to one HBM interface + one GDDR5 interface per shader engine;
8 MB -> 1024b HBM and 128b GDDR5

So, in total there would be 16 MBs of L3 and 2048 bits of HBM and 256 bits of GDDR5.

(B) -> One 8 MB L3 interconnects to one HBM interface + one GDDR5 interface to both shader engines;
8 MB -> 1024b HBM and 256b GDDR5

Scenario A for Amethyst XT;
224 GB/s from HBM
176 GB/s from GDDR5
Total => 400 GB/s

Scenario B for Amethyst XT;
112 GB/s from HBM
176 GB/s from GDDR5
Total => 288 GB/s

Scenario A for Tonga XT;
256 GB/s from HBM
192 GB/s from GDDR5
Total => 448 GB/s

Scenario B for Tonga XT;
128 GB/s from HBM
192 GB/s from GDDR5
Total => 320 GB/s

I'm uncertain on how this heterogeneous interface is laid out.

If we go by the totals;
Amethyst XT (A) would need a clock rate around 1.5625 GHz.
Amethyst XT (B) would need a clock rate around 1.125 GHz.
Tonga XT (A) would need a clock rate around 1.75 GHz.
Tonga XT (B) would need a clock rate around 1.25 GHz.

If we go by the HBMs assuming that the GDDR5s are now just a texture cache dump;
Amethyst XT (A) would need a clock rate around 0.875 GHz.
Amethyst XT (B) would need a clock rate around 0.4375 GHz.
Tonga XT (A) would need a clock rate around 1 GHz.
Tonga XT (B) would need a clock rate around 0.5 GHz.
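
The clock figures above all fall out of dividing bandwidth by 256 bytes per clock. That constant is inferred from the numbers themselves; it is not stated anywhere in the post, so treat it as an assumption. A minimal Python sketch that reproduces both lists (`required_clock_ghz` is an illustrative name):

```python
BYTES_PER_CLOCK = 256  # assumption: inferred from the figures, not a known spec

# (HBM GB/s, GDDR5 GB/s) per scenario, as listed above.
SCENARIOS = {
    "Amethyst XT (A)": (224, 176),
    "Amethyst XT (B)": (112, 176),
    "Tonga XT (A)": (256, 192),
    "Tonga XT (B)": (128, 192),
}

def required_clock_ghz(bandwidth_gbs, bytes_per_clock=BYTES_PER_CLOCK):
    """Clock (GHz) needed to move `bandwidth_gbs` GB/s at a fixed bytes-per-clock."""
    return bandwidth_gbs / bytes_per_clock

for name, (hbm, gddr5) in SCENARIOS.items():
    by_total = required_clock_ghz(hbm + gddr5)
    by_hbm_only = required_clock_ghz(hbm)
    print(f"{name}: total {by_total} GHz, HBM only {by_hbm_only} GHz")
```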
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
Nosta, it's not 20nm. The article and common sense say it's 28nm. Also stop with the silly clock speeds...
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
Yeah, it's a good point. I actually wondered why TSMC seems to be leaving this strategy after 20nm?

- The SoC has a huge GPU today, and it's probably going to be relatively even bigger going forward.

- The DSP part is also a more parallel task than the CPU, as I understand it?

Why go for more perf vs density?

They are probably targeting more efficient performance, i.e. better power efficiency for SoCs in the ~2-3GHz range. I can see SoC makers wanting that for both pure marketing and CPU performance reasons.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
Nosta, it's not 20nm. The article and common sense say it's 28nm. Also stop with the silly clock speeds...
The only new GPUs coming out are Iceland and Topaz.

Topaz XT/R7 M260;
Up to 980 MHz
Up to 4 GB DDR3
Up to 32 GB/s memory bandwidth
Up to 752 SP GFlops

Topaz Pro/R7 M255;
Up to 940 MHz
Up to 4 GB DDR3
Up to 16 GB/s memory bandwidth
Up to 601 SP GFlops

From what I can gather, both of these GPUs are near or below Cape Verde performance.
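
For reference, the shader counts implied by those GFlops figures can be backed out with the usual peak formula, SP GFlops = 2 x shaders x clock (an FMA counts as two operations). A minimal Python sketch; the function names are illustrative, and the shader counts are derived here, not taken from AMD's spec pages.

```python
def implied_shaders(sp_gflops, clock_mhz):
    """Shader count implied by a peak SP GFlops figure at a given clock."""
    return sp_gflops * 1000 / (2 * clock_mhz)

def peak_sp_gflops(shaders, clock_mhz):
    """Peak single-precision GFlops, counting one FMA as 2 operations per clock."""
    return 2 * shaders * clock_mhz / 1000

# Topaz XT / R7 M260: up to 980 MHz, up to 752 SP GFlops
print(round(implied_shaders(752, 980)))   # 384
# Topaz Pro / R7 M255: up to 940 MHz, up to 601 SP GFlops
print(round(implied_shaders(601, 940)))   # 320

# Cross-check the rounded counts against the quoted figures.
print(round(peak_sp_gflops(384, 980), 2))  # 752.64
print(round(peak_sp_gflops(320, 940), 2))  # 601.6
```

At 64 shaders per CU that works out to 6 CUs and 5 CUs respectively.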

Tonga and Maui are either Q4 2014 or 1H 2015.
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
The only new GPUs coming out are Iceland and Topaz.

Topaz XT/R7 M260;
Up to 980 MHz
Up to 4 GB DDR3
Up to 32 GB/s memory bandwidth
Up to 752 SP GFlops

Topaz Pro/R7 M255;
Up to 940 MHz
Up to 4 GB DDR3
Up to 16 GB/s memory bandwidth
Up to 601 SP GFlops

From what I can gather, both of these GPUs are near or below Cape Verde performance.

Tonga and Maui are either Q4 2014 or 1H 2015.
Your ridiculous clock speed estimates mean that I need to see a valid source before taking what you say at face value. The way that you state everything as fact implies that you have inside knowledge, but you clearly don't. I see zero reason to take your word over VR-Zone's, especially when your numbers make no sense.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,223
136
If you think I have insider info, then you too can have insider info:

site:linkedin.com
filetype:pdf
"DDR3/DDR4 HBM"
"GDDR5 6.5 GBps / HBM 256 GBps"

https://www.google.com/?tbm=pts <= Patents
etc

Everything I say is hinted at, while I have to guess the correct specifications before AMD launches. As we get closer to the launch, more information is displayed.

For giggles;
/ site:linkedin.com GDDR6 Tablets /

--
Topaz;
http://www.amd.com/en-us/products/graphics/notebook/r5-m200#2
http://www.amd.com/en-us/products/graphics/notebook/r7-m200#2

The only GPU I have yet to see launch is Emerald XTX R9 M280X (Bonaire XTX)
 

ocre

Golden Member
Dec 26, 2008
1,594
7
81
Density is quite useful for GPUs, more important than frequency bumps. Graphics is a highly parallelizable task. Double a current GPU's logic units without increasing frequencies and you'll still get quite a jump in performance; you can see that in same-gen GPUs. The main difference between a mid-range card and a top-end card is the number of transistors. Unfortunately, due to increased costs, I doubt we will see the full 1.9x increase in transistor count for 20nm GPUs.
But that's impossible.

You are forgetting that transistors consume and burn energy. Just because the node can pack in 1.9x the density doesn't mean we will see 14-billion-transistor chips. The huge limiting factor here is power consumption.
Without improving power consumption, increasing density will only make hotter and more power-hungry chips.

Both Nvidia and AMD are just about maxed out on practical TDP. I just can't see them going a lot higher. So this will be the boundary.
The 20nm node only brings down consumption slightly, hopefully by 30%.
So this makes it more complex.

There are not a lot of options. You could run a big chip at very low speeds. But having to turn the clocks down causes overall performance to suffer, not to mention the cost of silicon. 20nm chips are much more pricey than 28nm chips of the same size. So there is always a need to find the right balance: transistors, density, the target speed, power consumption. It's a complex situation. One affects the other, and the ultimate goal is to have the most powerful chip for the lowest cost. Even the most talented chip designers don't know how things will turn out until the first samples get back from the fab. Then they find out if they missed their mark and by how much. Sometimes they get rather close right off the bat.
There is always compromise.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,863
3,417
136
But at the same time you can drop clocks and lower voltage. It's all about picking the sweet spot of die size, power and clocks. On die size you can then further divide that between ALU/TMU/ROP/cache/memory controller etc. Given that power consumption doesn't scale linearly with clocks, if your goal is absolute performance within a TDP, going as big as you possibly can and then clocking to the TDP limit will likely give you the best result.
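
A toy Python model of that trade-off. The assumptions are crude and are mine, not measured data: dynamic power goes as units x V^2 x f, voltage scales roughly in step with clock (so power goes as units x f^3), performance goes as units x f, and leakage and the minimum-voltage floor are ignored. All values are normalised to a 1-unit chip at clock 1.0.

```python
def clock_at_budget(units, power_budget=1.0):
    """Clock that exactly spends the power budget, given power = units * f**3."""
    return (power_budget / units) ** (1.0 / 3.0)

def perf_at_budget(units, power_budget=1.0):
    """Throughput at that clock, with performance taken as units * f."""
    return units * clock_at_budget(units, power_budget)

for units in (1, 2, 4):
    clock = clock_at_budget(units)
    perf = perf_at_budget(units)
    print(f"{units}x units: clock {clock:.3f}, perf {perf:.3f}")
```

Under these assumptions, doubling the units at the same power costs about 20% of the clock but still nets roughly 1.6x the performance, which is the "go wide, then clock to the TDP limit" argument in numbers. It says nothing about what the bigger die costs.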
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
@NostaSeronx:
You still haven't explained why Amethyst XT is 20nm.
All I see is a guy who is throwing random numbers out there. The mobile chips you are linking to have been out for a while, so I'm not sure what you are trying to prove there.

What makes you believe that Tonga will get HBM? Why now and not the 300 series?
 