[SiSoft] AMD monster APU?


DrMrLordX

Lifer
Apr 27, 2000
21,813
11,168
136
There is nothing there indicating any need to change chip designs for EMIB. The whole point of EMIB is that you can mix and match anything in an EMIB package, and it uses standard flip-chip bumps and microbumps to connect chips, as stated in your own link.

That surprises me. EMIB appears to be topologically incompatible with standard flip-chip bumps/microbumps since it connects dice at the edge rather than in the physical areas where you have those bumps/microbumps. It was my understanding that any die utilizing EMIB would have to be designed from the ground up to (a) work with EMIB at all and (b) work within the desired configuration.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
There's no problem.
It's 4/8 RR + Vega32/28 on the same package.
Vega32/28 is connected using a PCIe link, just like a usual GPU.
It's KBL-G, but with a better GPU and without the Intel tax.
Yeah. It's an understatement to write that they absolutely need this in their mobile lineup. Pascal killed the minuscule pieces that were left outside of Apple.
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
KBL-G and this are also signs that the mobile dGPU market is, at last, largely coming to an end. It took some time, and it demanded HBM2, an entirely new CPU arch, and some innovative packaging from Intel.
7nm will bring it to 99% iGPU coverage.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
That surprises me. EMIB appears to be topologically incompatible with standard flip-chip bumps/microbumps since it connects dice at the edge rather than in the physical areas where you have those bumps/microbumps. It was my understanding that any die utilizing EMIB would have to be designed from the ground up to (a) work with EMIB at all and (b) work within the desired configuration.

What gives you that idea?

EMIB would be almost unworkable if it were limited to connecting only the edges of the die rather than the bottom of it.

Watch the video at the Intel link under discussion. Eventually you will see them placing dies, using pads/bumps on the EMIB to connect to the bottom of the dies:
https://www.intel.com/content/www/us/en/foundry/emib.html

"We use micro-bumps for high density signals, and coarser pitch, standard flip chip bumps for direct power and ground connections from chip to package. "

From what I have seen, interposer designs look like EMIB designs, with the power pins separated from the data pins. I don't see any issue designing a chip that works with both a silicon interposer and EMIB.
 
Last edited:

DrMrLordX

Lifer
Apr 27, 2000
21,813
11,168
136
What gives you that idea?

Everything I've seen related to EMIB to date, including much of what was shown in the link you sent me.

Intel has made a big to-do about their tiny EMIB bridges being easier to fab than large silicon interposers. All the graphics I've seen depicting a hypothetical EMIB configuration show little strips of green representing physical die interconnects. It appeared that all signals transmitted from one die to another in an EMIB configuration would have to cross one or more bridges to reach their intended destination. The rest should be obvious from there.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
Btw, EMIB and 2.5D designs are different and actually incompatible: 2.5D uses uBumps only, with C4 on the bottom of the Si interposer; EMIB uses uBumps for the EMIB bridge plus the usual C4 for the rest of the die.
It appeared that all signals transmitted from one die to another in an EMIB configuration would have to cross one or more bridges to reach their intended destination.
Well, yeah. The bridges are embedded into substrate.
So?
 

FIVR

Diamond Member
Jun 1, 2016
3,753
911
106
There's no problem.
It's 4/8 RR + Vega32/28 on the same package.
Vega32/28 is connected using a PCIe link, just like a usual GPU.
It's KBL-G, but with a better GPU and without the Intel tax.

I think this is very different from KBL-G. This is 4C/8T Ryzen + 28 CU Vega on one die, with 2GB HBM2 on package (via interposer). It will use vastly less power than Polaris + 8GB HBM2 + EMIB, but it probably won't outperform KBL-G. Definitely not if it's clocked as low as 550 MHz. I think it will be clocked higher (probably 750-850 MHz), but not as high as the GPU in KBL-G.
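As a back-of-the-envelope check on that (a sketch of my own; the 850 MHz guess and the KBL-G Vega M GH figures are my assumptions, not from the SiSoft entry):

```python
# Rough FP32 throughput: GCN/Vega has 64 shaders per CU, 2 FLOPs per clock.
def vega_tflops(cus, mhz):
    return cus * 64 * 2 * mhz / 1e6

print(vega_tflops(28, 550))   # 28 CU @ 550 MHz (SiSoft entry)      -> ~1.97 TFLOPS
print(vega_tflops(28, 850))   # 28 CU @ 850 MHz (my guess)          -> ~3.05 TFLOPS
print(vega_tflops(24, 1190))  # KBL-G Vega M GH, 24 CU @ ~1190 MHz  -> ~3.66 TFLOPS
```

Even at 850 MHz, the raw throughput would still sit below the top KBL-G part.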
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
I think this is very different from KBL-G. This is 4C/8T Ryzen + 28 CU Vega on one die, with 2GB HBM2 on package (via interposer). It will use vastly less power than Polaris + 8GB HBM2 + EMIB, but it probably won't outperform KBL-G. Definitely not if it's clocked as low as 550 MHz. I think it will be clocked higher (probably 750-850 MHz), but not as high as the GPU in KBL-G.
I sincerely doubt it's an actual APU.
A highly integrated MCM package makes more sense, and you even have a Polaris replacement with it.
Besides, what stops AMD from using 4-Hi or 8-Hi stacks should they want to?
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
Everything I've seen related to EMIB to date, including much of what was shown in the link you sent me.

Intel has made a big to-do about their tiny EMIB bridges being easier to fab than large silicon interposers. All the graphics I've seen depicting a hypothetical EMIB configuration show little strips of green representing physical die interconnects. It appeared that all signals transmitted from one die to another in an EMIB configuration would have to cross one or more bridges to reach their intended destination. The rest should be obvious from there.

Look at the silicon interposer picture I included in the previous post. It is laid out exactly the same way as EMIB.

Regardless of which solution you are using, you want the data connections closest to the outside, near where they meet the other die, and the power plane away from the other dies, because either way you want to simplify routing and minimize path length. This does not change between EMIB and SI, and both use the same bumps/microbumps for connections, for the same reasons.

These are functionally equivalent from the chip perspective.

The difference is that Intel's solution is much less expensive, because it uses smaller pieces of silicon only where needed to run connections, while a silicon interposer uses one massive piece of silicon to cover the whole area.

Even if you need to be more picky with pin layout to minimize the EMIB silicon slivers.

There is NOTHING stopping an EMIB design from being used on a silicon interposer as-is.

EMIB is just a silicon interposer with the excess silicon eliminated. They are compatible solutions.

And in this case, since the Intel EMIB part exists, it could be used on a silicon interposer with ZERO need for change.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
Btw, EMIB and 2.5D designs are different and actually incompatible: 2.5D uses uBumps only, with C4 on the bottom of the Si interposer; EMIB uses uBumps for the EMIB bridge plus the usual C4 for the rest of the die.

Well, yeah. The bridges are embedded into substrate.
So?

Both EMIB and Silicon Interposer use a mix of microbumps and standard bumps.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
Chips intended for Si interposers do not use C4 bumps.
C4 is on the interposer.

C4 is on the bottom of the interposer. The chip surface for high-density connections uses microbumps, just like with EMIB. Note also that the high-density connections are near the edge, using microbumps just like EMIB.

EMIB is essentially a cost optimization of the silicon interposer. They are functionally equivalent from a chip perspective.

There is NO reason whatsoever that a chip designed for EMIB won't work on SI. SI will just cost more.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
This is going places.
There is NO reason whatsoever that a chip designed for EMIB won't work on SI. SI will just cost more.
A chip designed for an Si interposer has no C4 bumps.
A chip designed for EMIB/SLIM/SLIT/whatever other non-TSV interconnect has C4 bumps.
That's the main difference. An EMIB design has uBumps only on the edge of the die intended for EMIB linking.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I think this is very different from KBL-G. This is 4C/8T Ryzen + 28 CU Vega on one die, with 2GB HBM2 on package (via interposer). It will use vastly less power than Polaris + 8GB HBM2 + EMIB, but it probably won't outperform KBL-G. Definitely not if it's clocked as low as 550 MHz. I think it will be clocked higher (probably 750-850 MHz), but not as high as the GPU in KBL-G.

That sounds counterproductive. You'd go with a larger GPU (28 CUs) just to clock it lower. If you go with an on-die solution, now the interposer has to be quite large, because it's the size of the APU + HBM2. If it's on an MCM, the interposer could be smaller because the GPU is separate and you can do just that plus the HBM2. Kind of like KBL-G but using interposers instead.

If the performance is lower too, then you end up with a device with a significantly larger die and significantly higher packaging cost, but one being sold for less.

Being a server part makes more sense.

The 2GB size doesn't make sense either. I don't think SiSoft is misreporting the speed of the memory. If it's GDDR5X, then "2.4GHz" means 9.6 GT/s and you end up with 38 GB/s on a 32-bit interface.
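For reference, the arithmetic behind that figure (assuming GDDR5X's quad data rate; the interface width and clock are taken from the SiSoft entry):

```python
# 32-bit interface at 2.4 GHz with GDDR5X-style quad data rate
bus_bits = 32
gt_per_s = 2.4 * 4               # 9.6 GT/s
gb_per_s = bus_bits / 8 * gt_per_s
print(gb_per_s)                  # 38.4 GB/s
```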
 
Last edited:

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
This is going places.

A chip designed for an Si interposer has no C4 bumps.
A chip designed for EMIB/SLIM/SLIT/whatever other non-TSV interconnect has C4 bumps.
That's the main difference. An EMIB design has uBumps only on the edge of the die intended for EMIB linking.

I see your point now. It isn't so much the high-density connections, it's the low-density ones, though it seems like this could be mitigated, since these are often power connections and will be using multiple high-density microbumps that eventually map to the bigger, low-density ones.

I would love to see the changes to the HBM memory for Kaby-G vs. its use in video cards.
 

FIVR

Diamond Member
Jun 1, 2016
3,753
911
106
That sounds counterproductive. You'd go with a larger GPU (28 CUs) just to clock it lower. If you go with an on-die solution, now the interposer has to be quite large, because it's the size of the APU + HBM2. If it's on an MCM, the interposer could be smaller because the GPU is separate and you can do just that plus the HBM2. Kind of like KBL-G but using interposers instead.

If the performance is lower too, then you end up with a device with a significantly larger die and significantly higher packaging cost, but one being sold for less.

Being a server part makes more sense.

The 2GB size doesn't make sense either. I don't think SiSoft is misreporting the speed of the memory. If it's GDDR5X, then "2.4GHz" means 9.6 GT/s and you end up with 38 GB/s on a 32-bit interface.

2GB @ 200 GB/s gives an excellent high-speed video frame buffer, which I believe is the intention. 28 CUs clocked very low with 200 GB/s of memory bandwidth will give vastly superior performance to the Vega 10 in the Ryzen 2500 with its 10 CUs and ~40 GB/s of DDR4 memory bandwidth.
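A quick sketch of where that gap comes from (the pin speeds below are my assumptions, picked to roughly hit the quoted numbers):

```python
# Peak bandwidth = bus width in bytes * transfer rate
def bw_gbps(bus_bits, gt_per_s):
    return bus_bits / 8 * gt_per_s

print(bw_gbps(1024, 1.6))   # one HBM2 stack at 1.6 GT/s -> 204.8 GB/s
print(bw_gbps(128, 2.666))  # dual-channel DDR4-2666     -> ~42.7 GB/s
```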


I think AMD is targeting the <$200 dGPU space with this product. It will give RX 580 + Ryzen 1500 performance in a much smaller and better-integrated package. 2GB will be a somewhat limiting factor, but it will still provide far superior performance to any current APU and much lower power use than a solution like KBL-G.

I don't understand why people are having a hard time believing that AMD would create a single-die APU with 28 CUs. It is likely that SiSoft is having a hard time reading the specs of the APU, which is why it is confused about 32-bit/2400 MHz. I'm sure it's trying to read the DDR4 memory bandwidth separately from the HBM2 bandwidth.
 

jpiniero

Lifer
Oct 1, 2010
14,847
5,457
136
If it were a monolithic die, it might actually be more expensive.

Edit: Upon thinking about it, I actually think it's more likely to be Raven Ridge (with the GPU intact but perhaps not fully enabled) + Vega 12 in an MCM package. The main benefit would be to power down the extra GPU when not gaming. The cost benefit is questionable, but you would save space.
 
Last edited:
Reactions: Gideon

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
I would love to see the changes to the HBM memory for Kaby-G vs. its use in video cards.
I don't think there would be any; it's supposed to use standard uBumps.
It's sitting very close to the die, though, just as EMIB requires.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
I think AMD is targeting the <$200 dGPU space with this product. It will give RX 580 + Ryzen 1500 performance in a much smaller and better-integrated package. 2GB will be a somewhat limiting factor, but it will still provide far superior performance to any current APU and much lower power use than a solution like KBL-G.

I really don't see a low-clocked 28 CU solution matching a higher-clocked 36 CU solution. This will likely land under the RX 470.

I don't understand why people are having a hard time believing that AMD would create a single-die APU with 28 CUs.

A big, expensive monolithic die applied to a niche solution. When it gets that big, what is the advantage of integrating it?

Using some kind of MCM packaging delivers more flexibility, more reuse potential, higher yield rates, etc.

Really, the main "integration" part that creates a benefit is packaging the GPU together with the HBM, not so much the CPU with the GPU.
 

PeterScott

Platinum Member
Jul 7, 2017
2,605
1,540
136
I don't think there would be any; it's supposed to use standard uBumps.
It's sitting very close to the die, though, just as EMIB requires.

I thought you said that EMIB requires C4 on the chip? Saying you can use the same chip for EMIB and SI is what I have been arguing all along, and you have been arguing against.
 

Yotsugi

Golden Member
Oct 16, 2017
1,029
487
106
I thought you said that EMIB requires C4 on the chip? Saying you can use the same chip for EMIB and SI is what I have been arguing all along, and you have been arguing against.
Only part of an EMIB-designed chip will have uBumps (the part that latches to the buried bridge itself). The rest will be C4.
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
A big, expensive monolithic die applied to a niche solution. When it gets that big, what is the advantage of integrating it?

Using some kind of MCM packaging delivers more flexibility, more reuse potential, higher yield rates, etc.

Really, the main "integration" part that creates a benefit is packaging the GPU together with the HBM, not so much the CPU with the GPU.
How much bigger, in your opinion, would the die be? Because 17 CUs added to RR plus an HBM2 PHY would give a 270-280 mm2 die. Two shader engines with 14 CUs each. I suggest looking at the Raven Ridge die. If the RR die costs AMD little enough to sell it for $150 on desktop at best (Ryzen 5 2400G), then a 33-40% bigger die will only bring AMD one thing - bigger margins. That is the main reason why I believe this is a monolithic design: to save costs, ultimately on manufacturing.

A Zeppelin die costs AMD around $40 to manufacture. RR should not cost more than that. Knowing this, a $60-75 manufacturing cost for a 270-280 mm2 die with an interposer and an HBM2 stack does not look that far-fetched for something that can sell for somewhere between $299 and $400.
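As a rough illustration of how that area-to-cost scaling works (a minimal sketch assuming a simple Poisson yield model; the wafer cost and defect density are my assumptions, picked only so a Zeppelin-sized die lands near the ~$40 figure above):

```python
import math

WAFER_COST  = 8000.0                    # $ per wafer (assumption)
WAFER_AREA  = math.pi * (300 / 2) ** 2  # mm^2 of a 300 mm wafer, ignoring edge loss
DEFECTS_MM2 = 0.002                     # defects per mm^2 (assumption)

def die_cost(area_mm2):
    gross_dies = WAFER_AREA / area_mm2
    yield_frac = math.exp(-DEFECTS_MM2 * area_mm2)   # Poisson yield model
    return WAFER_COST / (gross_dies * yield_frac)

print(die_cost(213))  # Zeppelin-sized die   -> roughly $37
print(die_cost(275))  # ~270-280 mm2 APU die -> roughly $54
```

The interposer and HBM2 stack would then come on top of that die cost.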
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,595
136
How much bigger, in your opinion, would the die be? Because 17 CUs added to RR plus an HBM2 PHY would give a 270-280 mm2 die. Two shader engines with 14 CUs each. I suggest looking at the Raven Ridge die. If the RR die costs AMD little enough to sell it for $150 on desktop at best (Ryzen 5 2400G), then a 33-40% bigger die will only bring AMD one thing - bigger margins. That is the main reason why I believe this is a monolithic design: to save costs, ultimately on manufacturing.

A Zeppelin die costs AMD around $40 to manufacture. RR should not cost more than that. Knowing this, a $60-75 manufacturing cost for a 270-280 mm2 die with an interposer and an HBM2 stack does not look that far-fetched for something that can sell for somewhere between $299 and $400.
AMD has to find funds for the extra $50M startup cost plus the manpower for it. In reality it would probably have to be prioritized over 7nm Ryzen 2 TTM.
Is it worth it?
 

Glo.

Diamond Member
Apr 25, 2015
5,765
4,670
136
AMD has to find funds for the extra $50M startup cost plus the manpower for it. In reality it would probably have to be prioritized over 7nm Ryzen 2 TTM.
Is it worth it?
If the profit is high enough - yes it is worth it.
 