Zen APUs made by GloFo, 14nm FinFET node, and packaged by Amkor


Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Wouldn't an IVR (and HBM for that matter) reduce the number of pins needed?

~350 extra pins doesn't really seem necessary from what we've seen advertised so far. That's just about the right number for another dual channel MC if they want to make the socket extensible in the future, but that seems unlikely.
I think it depends on where that IVR sits. Maybe there is a design that works more efficiently (and doesn't need Intel's FIVR know-how) when given lots of different voltages. Although The Stilt already said that there won't be more voltage rails.

Two more channels might be worth a thought, as the gap between memory bandwidth and compute power continues to widen. An APU with Polaris-like GPU efficiency gains and more powerful CPU cores (possibly without that heavy throttling) might need either HBM or more memory channels. HBM is likely a bigger cost factor than a few more channels (or at least it moves the cost off the package).
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
Fottemberg teases that it's big at SA, and that makes no sense just for some extra ordinary memory lanes. The same goes for extra power.
Come on, guys!
 

Dresdenboy

Golden Member
Jul 28, 2003
1,730
554
136
citavia.blog.de
Yeah, I get that, but how many pins do you need for that?
It depends on the design and on which parts of the whole circuit move on or off the package.

Doesn't Carrizo have per-IP adaptive voltage?
That was still marked as "in development" on the Carrizo slides.

Fottemberg teases that it's big at SA, and that makes no sense just for some extra ordinary memory lanes. The same goes for extra power.
Come on, guys!
Yeah, being used to 8 channels and 32 cores per socket, who cares about two additional channels? Wait, wasn't there just a discussion about Carrizo being crippled with only one channel? ^^

-> dresdenboy -> krumme

Off-topic question of the day: does the PIT (Programmable Interval Timer) still factor into today's x86 designs/concepts, or is it called something else? How and where?

- or just ignore at will
Is that an important detail for deriving Zen's performance? Other than the A20 gate, this might still be around. But isn't that part of the motherboard electronics? I'm mostly using RDTSC/RDTSCP.
 
Last edited:

deasd

Senior member
Dec 31, 2013
560
890
136
Fottemberg teases that it's big at SA, and that makes no sense just for some extra ordinary memory lanes. The same goes for extra power.
Come on, guys!

I bet the extra pins are due to either more-than-triple-channel memory or HBM (the latter is much more likely).
 

MrTeal

Diamond Member
Dec 7, 2003
3,596
1,773
136
I bet the extra pins are due to either more-than-triple-channel memory or HBM (the latter is much more likely).

HBM wouldn't increase the pin count other than possibly adding another voltage rail. The whole point of it is that you get another 1,000 pins or so per stack, but they all stay on the interposer; they never hit the socket.
 

The Stilt

Golden Member
Dec 5, 2015
1,709
3,057
106
14nm LPP gives you both an increase in Fmax and a reduction of leakage.

Why does everyone assume that smaller transistor size will increase the Fmax, when all available examples beyond 32nm prove the opposite?

-- 32nm SHP SOI -> 28nm (GF28A / GF28HPP)
-- 32nm P1268 HKMG GL -> 22nm P1270 FinFET
-- 22nm P1270 FinFET -> 14nm P1272 FinFET

Also, unless you are designing a chipset for a pocket calculator, you can pretty much forget about getting the full (0.25x) scaling out of the 28nm to 14nm transition. For anything even remotely high performance you'll see something like 0.4 - 0.5x scaling in that transition. Up to a certain point it is up to the designer to decide how tight the design will be, depending on the design targets (area, cost, and power optimized at the expense of performance / purely performance).
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
With 14nm using 20nm BEOL that would be a 50% reduction in area compared to 28nm (which uses 28nm BEOL).

That could lead to some pretty good gains in iGPU size.

Someone correct me if I am wrong, but if each GCN 1.x CU (i.e., 64 SPs) uses 7.5mm2 on 28nm, that would come out to 3.75mm2 on 14nm.
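A minimal back-of-the-envelope sketch of that math, assuming the 0.5x area scale holds; the 60mm2 iGPU budget below is a purely hypothetical illustration:

# Rough CU area under an assumed 0.5x area shrink from 28nm to 14nm.
CU_AREA_28NM_MM2 = 7.5       # per-CU figure quoted above
AREA_SCALE = 0.5             # assumed from the 28nm -> 20nm-BEOL argument

cu_area_14nm = CU_AREA_28NM_MM2 * AREA_SCALE
print(cu_area_14nm)          # 3.75 (mm2 per CU)

# Hypothetical consequence: the same silicon budget holds twice the CUs.
igpu_budget_mm2 = 60.0       # illustrative iGPU area budget, not a real die figure
print(igpu_budget_mm2 / CU_AREA_28NM_MM2)   # 8.0 CUs at 28nm
print(igpu_budget_mm2 / cu_area_14nm)       # 16.0 CUs at 14nm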
 
Last edited:

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
It depends on the design and on which parts of the whole circuit move on or off the package.


That was still marked as "in development" on the Carrizo slides.


Yeah, being used to 8 channels and 32 cores per socket, who cares about two additional channels? Wait, wasn't there just a discussion about Carrizo being crippled with only one channel? ^^


Is that an important detail for deriving Zen's performance? Other than the A20 gate, this might still be around. But isn't that part of the motherboard electronics? I'm mostly using RDTSC/RDTSCP.
E.g., what parts move off the package?
How do you interpret "per IP"?
You are right that per-IP isn't in Carrizo.
Even those per-part features are still not fully implemented:
http://techreport.com/review/27853/amd-previews-carrizo-apu-offers-insights-into-power-savings/3
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
With 14nm using 20nm BEOL that would be a 50% reduction in area compared to 28nm (which uses 28nm BEOL).

That could lead to some pretty good gains in iGPU size.

Someone correct me if I am wrong, but if each GCN 1.x CU (i.e., 64 SPs) uses 7.5mm2 on 28nm, that would come out to 3.75mm2 on 14nm.

Even if we assume that, your 3.75mm2 part costs the same as or more than the 7.5mm2 part did.

So if you want bigger, you are going to pay up.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Also, unless you are designing a chipset for a pocket calculator, you can pretty much forget about getting the full (0.25x) scaling out of the 28nm to 14nm transition.

There's a better reason.

The decrease in pitch from 28nm to 14nm doesn't allow that. 0.25x best-case scaling can only happen if both the contacted gate pitch and the minimum metal pitch (M1) shrink to 0.5x, since cell area scales roughly with the product of the two.

It seems that the max scaling from 28nm GF to 16nm TSMC is only about 0.66x.
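To make that concrete, here is a minimal sketch of that pitch-product rule; the 0.8x ratios below are illustrative placeholders, not published pitch figures:

# Standard-cell area scales roughly with contacted gate pitch (CPP) x minimum metal pitch (MMP).
def area_scale(cpp_ratio, mmp_ratio):
    # Approximate area scaling from the two pitch ratios (new node / old node).
    return cpp_ratio * mmp_ratio

print(area_scale(0.5, 0.5))   # 0.25 -> the ideal "full node" shrink
print(area_scale(0.8, 0.8))   # 0.64 -> in the ballpark of the ~0.66x figure above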

You have a point about Fmax. The highest-clocked chip ever is in the 9GHz range, with insanely exotic setups. I don't think a regular chip will *ever* get to that range.

Fottemberg teases that it's big at SA, and that makes no sense just for some extra ordinary memory lanes. The same goes for extra power.
Come on, guys!

You don't think a 1600-2000 SP APU with HBM memory would be big news?
 
Last edited:

ItsAlive

Golden Member
Oct 7, 2005
1,147
9
81
I don't see why they couldn't just take an R9 Nano, cut the chip in half, and die-shrink it. A full-sized R9 Nano at 28nm, with all the extra power cost of the entire PCB, only uses around 200W, so adding just the chip to an APU shouldn't add more than 200W. Cutting the chip in half should drastically reduce the power consumption, and then the die shrink would cut it down even further.

R9 Nano stock: 200W
R9 chip only (-10%): 180W
Cut in half: 100W
Die shrunk: 66W

These are all just basic estimates, so please let me know if my numbers are off base. I tried to estimate toward the higher end of the spectrum. They could also lower clocks a bit to reduce power consumption, but with a lower-power CPU on a new node I really don't think they would have to.
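For reference, here is a minimal sketch of the estimate chain above as explicit scaling factors; all of them are rough guesses, not measurements:

# Rough power-estimate chain; every factor here is a guess from the list above.
estimates_w = {
    "R9 Nano board, stock":  200,
    "chip only (-10% PCB)":  200 * 0.9,          # 180W
    "chip cut in half":      200 * 0.9 * 0.5,    # 90W, rounded up to ~100W above
    "after the die shrink":  100 * (2 / 3),      # ~66W, assuming ~2/3 power post-shrink
}
for step, watts in estimates_w.items():
    print(f"{step}: ~{watts:.0f}W")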

This could allow them to essentially remove the PCI Express card slots on a custom motherboard. Integrate WiFi into the motherboard, integrate an mSATA SSD, and once HBM2 is available they can use 16GB on-chip: 8GB for video and 8GB for system RAM. They could then delete the RAM slots entirely from the motherboard.

Think of a Pico-ITX-sized system with a quad-core CPU and R9 graphics.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
These are all just basic estimates, so please let me know if my numbers are off base. I tried to estimate toward the higher end of the spectrum. They could also lower clocks a bit to reduce power consumption, but with a lower-power CPU on a new node I really don't think they would have to.

Things are nowhere near that simple.
- Certain circuits like I/O can't just be cut in half, meaning layout changes
- How do you communicate between the CPU and GPU?
- Power management between the two is different. For one thing, the GPU has far higher idle power than the CPU, and it is far slower to change states too
- The processes used for the two may be different. What's the impact of using one process variant?
- Companies exist to earn money. That means they'd want to make derivatives, and what makes sense at the high end makes no sense at the low end

Anything that requires layout changes takes a few years. Before APUs were released, AMD said it takes an additional year for their discrete GPU architecture to reach APUs. So that's another insight into whether it's that easy to make such changes.

and once HBM2 is available they can use 16GB on-chip: 8GB for video and 8GB for system RAM. They could then delete the RAM slots entirely from the motherboard.

HBM is only being used in the highest-end configurations because they concluded the cost is too high. One stack for 128GB/s of bandwidth is fine, but enough to reach 16GB of capacity? I guess they could price the APU at $600+. It might make sense, but it might not.
 
Last edited:

PPB

Golden Member
Jul 5, 2013
1,118
168
106
I don't see why they couldn't just take an R9 Nano, cut the chip in half, and die-shrink it. A full-sized R9 Nano at 28nm, with all the extra power cost of the entire PCB, only uses around 200W, so adding just the chip to an APU shouldn't add more than 200W. Cutting the chip in half should drastically reduce the power consumption, and then the die shrink would cut it down even further.

R9 Nano stock: 200W
R9 chip only (-10%): 180W
Cut in half: 100W
Die shrunk: 66W

These are all just basic estimates, so please let me know if my numbers are off base. I tried to estimate toward the higher end of the spectrum. They could also lower clocks a bit to reduce power consumption, but with a lower-power CPU on a new node I really don't think they would have to.

This could allow them to essentially remove the PCI Express card slots on a custom motherboard. Integrate WiFi into the motherboard, integrate an mSATA SSD, and once HBM2 is available they can use 16GB on-chip: 8GB for video and 8GB for system RAM. They could then delete the RAM slots entirely from the motherboard.

Think of a Pico-ITX-sized system with a quad-core CPU and R9 graphics.
Don't need to cut the Nano (full Fiji) in half. There is Tonga for that.

More pins in the same package size = smaller pitch. Hopefully this doesn't mean flimsier pins. The other day I broke a Socket 478 Pentium 4's already-bent pin almost just by looking at it, lol. On the other hand, I have never had bad experiences handling bent pins on AMx chips.

A max of 140W for AM4's package is good considering it will also host APUs. 24 CUs + 4c/8t Zen + HBM = we can only dream.
 

ItsAlive

Golden Member
Oct 7, 2005
1,147
9
81
Things are nowhere near that simple.
- Certain circuits like I/O can't just be cut in half, meaning layout changes
- How do you communicate between the CPU and GPU?
- Power management between the two is different. For one thing, the GPU has far higher idle power than the CPU, and it is far slower to change states too
- The processes used for the two may be different. What's the impact of using one process variant?
- Companies exist to earn money. That means they'd want to make derivatives, and what makes sense at the high end makes no sense at the low end

I get that it wouldn't be as simple as cutting the chip in half; as you said, some areas of the chip can't be cut in half, but power would be drastically reduced.

As far as designing the APU and how it communicates with the GPU, wouldn't they already be in the midst of designing this for Zen APUs?

I would expect it to take some time to design, of course, but the Nano was released in September last year, so the tech has been around for a while.

An extra year of design time would just give them time for the price of HBM2 to come down.
 
Last edited:

ItsAlive

Golden Member
Oct 7, 2005
1,147
9
81
Don't need to cut the Nano (full Fiji) in half. There is Tonga for that.

More pins in the same package size = smaller pitch. Hopefully this doesn't mean flimsier pins. The other day I broke a Socket 478 Pentium 4's already-bent pin almost just by looking at it, lol. On the other hand, I have never had bad experiences handling bent pins on AMx chips.

A max of 140W for AM4's package is good considering it will also host APUs. 24 CUs + 4c/8t Zen + HBM = we can only dream.

The only reason I said Nano (Fiji) is because it is already designed to work with HBM. I realize it will be an entirely new chip.
 

cbn

Lifer
Mar 27, 2009
12,968
221
106
With 14nm using 20nm BEOL that would be a 50% reduction in area compared to 28nm (which uses 28nm BEOL).

That could lead to some pretty good gains in iGPU size.

Someone correct me if I am wrong, but if each GCN 1.x CU (i.e., 64 SPs) uses 7.5mm2 on 28nm, that would come out to 3.75mm2 on 14nm.

Even if we assume that, your 3.75mm2 part costs the same as or more than the 7.5mm2 part did.

So if you want bigger, you are going to pay up.

Since these consumer APUs are designed laptop-first, I'm thinking the bigger problem will probably be power... and the iGPU size will have to be constrained because of that.

However, if Polaris does in fact deliver 2.5x the performance per watt of 28nm GCN 1.x, a 35W/42W AC-boost laptop would have the equivalent of 1280 SPs of 28nm GCN 1.x @ 750 MHz (with the CPU cores simultaneously throttling). (Re: Carrizo with 512 SPs runs at around 750 MHz at 35W/42W AC boost during CPU-intensive gameplay.)

Assuming that marketing number of "2.5x performance per watt" is really 2x performance per watt, that still nets the equivalent of 1024 SPs of 28nm GCN 1.x @ 750 MHz (with the CPU cores simultaneously throttling).

Looking at the performance per watt of Vega (below), it looks like AMD is claiming 3.5x to 4x compared to 28nm GCN 1.x GPUs. Based on the 35W/42W performance of Carrizo (512 SPs @ 750 MHz), that would equate to somewhere around 1792 to 2048 SPs of 28nm GCN 1.x @ 750 MHz (with the CPU cores simultaneously throttling) for a 35W/42W Zen/Vega APU. Assuming the true performance per watt is 3x, that would come out to 1536 SPs of 28nm GCN 1.x @ 750 MHz (with the CPU cores simultaneously throttling) for a 35W/42W Zen/Vega APU.
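A minimal sketch of the arithmetic behind those SP equivalents, taking Carrizo's 512 SPs @ ~750 MHz in the 35W/42W envelope as the baseline; the perf/W multipliers are the marketing/assumed figures discussed above:

# Equivalent "28nm GCN 1.x SPs @ 750 MHz" for a 35W/42W APU, scaled by the perf/W gain.
CARRIZO_BASELINE_SP = 512    # 512 SPs at ~750 MHz within the 35W/42W envelope

def equivalent_sp(perf_per_watt_gain):
    return int(CARRIZO_BASELINE_SP * perf_per_watt_gain)

for gain in (2.0, 2.5, 3.0, 3.5, 4.0):
    print(f"{gain}x perf/W -> ~{equivalent_sp(gain)} SP equivalent")
# 2.0x -> 1024, 2.5x -> 1280, 3.0x -> 1536, 3.5x -> 1792, 4.0x -> 2048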



P.S. As far as the final power rating of the Zen laptop APU goes, I'm thinking it will probably be 45W (just like Skylake GT4e)... mainly because of the CPU cores.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
I get that it wouldn't be as simple as cutting the chip in half; as you said, some areas of the chip can't be cut in half, but power would be drastically reduced.

It's *much* more complicated than that.

Ahh, regardless, AMD's engineers and management know far more than either of us, or probably even everyone on the AT Forums. They may screw up, but they are working on a very complicated project.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Well, we heard the same expectations when Kaveri came out, around the time of the consoles: a projected 1024-shader part with GDDR5 that was going to take over the gaming market. We all know how that turned out.

Is this a joke? An 1152-shader GDDR5 part did take over the gaming market. It's called the PS4, and it sold millions of units. Most AAA gaming is done on consoles, though PC gaming is by no means dead or dying.
 
Aug 11, 2008
10,451
642
126
Is this a joke? An 1152-shader GDDR5 part did take over the gaming market. It's called the PS4, and it sold millions of units. Most AAA gaming is done on consoles, though PC gaming is by no means dead or dying.

Yeah, I am laughing hysterically inside. What I meant to say, and thought was clear since this is the PC CPU section and not the Console Gaming section, was that the rumors were about a super APU for the PC with GDDR5.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Yeah, I am laughing hysterically inside. What I meant to say, and thought was clear since this is the PC CPU section and not the Console Gaming section, was that the rumors were about a super APU for the PC with GDDR5.

'Gaming market' is subject to a lot of interpretation. Some include phones because of the astonishing amount of revenue there; some include Facebook browser games for the same reason. I assumed the 'gaming market' meant AAA gaming, which is done on consoles and gaming PCs, since the other markets are not performance-sensitive, so hardware doesn't matter much there. If you narrow the field of comparison to APUs, then the PS4 pulls even farther away from its competition, since gaming PCs using discrete graphics are the consoles' largest competitor.
 