AMD Ryzen SKU and Price Information/Speculation.

JDG1980 · Feb 26, 2017

lolfail9001 said:
It is the HBM2 and interposer it involves that skyrockets the cost, don't you realize that? No, these are NOT getting cheaper.

Why do you think they won't get cheaper? Maturity and mass production almost always makes everything cheaper. Right now, interposers with HBM/HBM2 are being used only on a tiny handful of high-end GPUs, so they are a low-volume specialty item. If they're used on mid-range SoCs being cranked out by the millions, costs will almost inevitably go down, just like they do with all other technology when it enters the mainstream. Of course there will always be some added marginal costs by having an interposer, but those will get more and more minimal as time goes by.

Glo. · Feb 26, 2017

It won't be better than the RX 470. In theory it could be, if the clocks were high enough (let's say, in the 1.5-1.6 GHz range). But I'm pretty sure the APU GPU will not achieve that level of core clocks. 1.4 GHz is the maximum we are looking at, in the best case scenario.

lolfail9001 · Feb 26, 2017

JDG1980 said:
Maturity and mass production almost always makes everything cheaper.

We have neither with HBM and have talks about how to make it cheaper explicitly. And when people talk about making stuff explicitly cheaper, you know it is expensive otherwise. TSVs are hard, apparently.

JDG1980 said:
If they're used on mid-range SoCs being cranked out by the millions, costs will almost inevitably go down, just like they do with all other technology when it enters the mainstream.

They may *go down*. Or there may be yield issues instead, producing millions of useless silicon dies in the process. Once again, HBM is hard enough that AMD did not even bother putting a 3rd/4th stack of it on their incoming halo GPU, consider that for a minute. Not to mention, the inherent cost of the interposer is easily over 5 times larger than the inherent cost of the APU die itself, without any memory in consideration. If you claim that Zeppelin is ~$20 to produce, then the interposer alone will, from basic math, be ~$100 or more to produce. Then whatever it requires for the memory. And that's with the assumption of no yield issues, and interposers are a sensitive bunch. If there are yield issues, which i expect there to be, and since such a design, should it ever get a pass, would be relatively small in volume, these costs go up even more. It is not even unrealistic to end up landing at a $200 minimum cost for such an APU... That does not make it a compelling suggestion on cost.

Glo. · Feb 26, 2017

lolfail9001 said:
We have neither with HBM and have talks about how to make it cheaper explicitly. And when people talk about making stuff explicitly cheaper, you know it is expensive otherwise. TSVs are hard, apparently.

They may *go down*. Or there may be yield issues instead, producing millions of useless silicon dies in the process. Once again, HBM is hard enough that AMD did not even bother putting a 3rd/4th stack of it on their incoming halo GPU, consider that for a minute. Not to mention, the inherent cost of the interposer is easily over 5 times larger than the inherent cost of the APU die itself, without any memory in consideration. If you claim that Zeppelin is ~$20 to produce, then the interposer alone will, from basic math, be ~$100 or more to produce. Then whatever it requires for the memory. And that's with the assumption of no yield issues, and interposers are a sensitive bunch. If there are yield issues, which i expect there to be, and since such a design, should it ever get a pass, would be relatively small in volume, these costs go up even more. It is not even unrealistic to end up landing at a $200 minimum cost for such an APU... That does not make it a compelling suggestion on cost.

Want to know the reason why they decided to put only 2 stacks? Because it's cheap enough to make it not only into the highest end GPUs and APUs, but also slightly cheaper ones. Vega 11 will cost at best 399$. Mainstream APU will cost at best 299$. Those are the price markets they have targeted. HBM2 gives enough bandwidth with 2 stacks, and is cheap enough, to make it appear in a wider market.

lolfail9001 · Feb 26, 2017

Glo. said:
Want to know the reason why they decided to put only 2 stacks?

Sure. While you're at it, explain the reason why nVidia's salvage parts for Tesla P100 actually lose a stack of HBM2 instead of any more GPU parts.

Glo. said:
Because its cheap enough to make it not only to the highest end GPUs, and APUs, but also slightly cheaper ones. Vega 11 will cost at best 399$.

We had literally 0 information about Vega 11 so far, so hold your horses.

Glo. said:
Mainstream APU will cost at best 299$.

And here the issue arises: and where does it go? If it can't prove itself more efficient than dGPU+pure CPU/basic APU with hybrid graphics combo, it is waste of money. Your own description ensures it can't fit the bill.

Glo. said:
HBM2 gives enough bandwidth with 2 stacks, and is cheap enough, to make it appear in a wider market

Sure, find the cost of a single HBM2 stack and the cost of a... let's give it 600mm^2 because these stacks sure look big, interposer. For reference, a few years ago, a single GDDR5 512MB (i believe) chip cost $5 in bulk. That's probably the cost of a 1GB one but i won't claim it as a fact.

JDG1980 · Feb 26, 2017

lolfail9001 said:
They may *go down*. Or there may be yield issues instead, producing millions of useless silicon dies in the process.

Bad yields are a sign of an immature process. Yields will improve with time and experience, thus driving the cost per (working) part down.

lolfail9001 said:
Once again, HBM is hard enough that AMD did not even bother putting 3rd/4th stack of it on their incoming halo GPU, consider that for a minute.

If two stacks can provide 16GB with enough memory bandwidth for Vega, then why bother with four? Just to show off? I don't think we can infer technical limitations from this design decision. AMD did four stacks the first time around on Fiji, so they're clearly capable of doing so if they want.

lolfail9001 said:
Not to mention, the inherent cost of interposer is easily over 5 times larger than inherent cost of APU die itself, without any memory in consideration. If you claim that Zeppelin is ~$20 to produce, then interposer alone will, from basic math, be ~$100 or more to produce.

What is your basis for the above figures? Why do you assume that an interposer (produced on an older, fully amortized process) costs 5x as much as an APU (fabricated on a leading edge process)?

Glo. · Feb 26, 2017

lolfail9001 said:
Sure. While you're at it, explain the reason why nVidia's salvage parts for Tesla P100 actually lose a stack of HBM2 instead of any more GPU parts.

32 GB of HBM2 are not possible with 3 stacks of HBM2. Even 16 GB are very hard to do. Every Nvidia GPU has either 32 or 16 GB, with 4 stacks of HBM2.

lolfail9001 said:
And here the issue arises: and where does it go? If it can't prove itself more efficient than dGPU+pure CPU/basic APU with hybrid graphics combo, it is waste of money. Your own description ensures it can't fit the bill.

Wait, so for example. You get 199$ worth of CPU, with 150$ worth of GPU, at 50% power consumption, and 50$ lower cost for consumer, and with this it does not fit the bill? Are you sure about what you have read?

lolfail9001 said:
Sure, find the cost of a single HBM2 stack and the cost of a... let's give it 600mm^2 because these stacks sure look big, interposer. For reference, a few years ago, a single GDDR5 512MB (i believe) chip cost $5 in bulk. That's probably the cost of a 1GB one but i won't claim it as a fact.

HBM2 production is bigger than HBM was. You get: AMD with Vega 10, and 11, APUs, both server and consumer. Nvidia with GP100 chips.

And before you say anything about Vega 11: HBCC is an INHERENT part of the architecture. Every single GPU is designed to work with it. It does not mean they have to have HBM2 on package (mobile APUs will not have HBM2 on package). But they have the ability to work with HBM2 memory that is connected to the CPU, regardless of whether it is through PCIe (internally) or through Thunderbolt 3 (externally). That's the whole idea behind High Bandwidth Cache.
But dGPUs like Vega 11 and Vega 10 will have HBM2, because that is the only memory controller that Vega is designed to work with, as far as graphics memory goes.

lolfail9001 said:
If you claim that Zeppelin is ~$20 to produce, then interposer alone will, from basic math, be ~$100 or more to produce.

Provide numbers. MUCH bigger interposer for Fiji chip costs AMD 10$. Stop posting things based on what you believe is true, because it turns out - it is not.

Total cost for 230 mm2 die, with HBM2 on package is 40$.

IEC · Feb 26, 2017

Economies of scale should make HBM2 on APUs possible.

But the key question is: when

jpiniero · Feb 26, 2017

IEC said:
Economies of scale should make HBM2 on APUs possible.

But the key question is: when

Probably a year at least after you see it on AMD's midrange GPUs (ie: in the 480 price range).

Glo. · Feb 26, 2017

IEC said:
Economies of scale should make HBM2 on APUs possible.

But the key question is: when

Simple answer. If Vega 10 is Q2, and if Vega 11 is Q2-Q3/2017, then HBM2 APUs will be Q3-Q4/2017

lolfail9001 · Feb 26, 2017

JDG1980 said:
Bad yields are a sign of an immature process. Yields will improve with time and experience, thus driving the cost per (working) part down.

Yep, but we have no information how bad they are.

JDG1980 said:
If two stacks can provide 16GB with enough memory bandwidth for Vega, then why bother with four?

Except these 2 stacks are 4-hi from what we know, so only 8GB. Not to mention, 8-his are like unicorns.

JDG1980 said:
I don't think we can infer technical limitations from this design decision. AMD did four stacks the first time around on Fiji, so they're clearly capable of doing so if they want.

Sure, but Fiji is known to not be exactly profitable.

JDG1980 said:
What is your basis for the above figures?

Eyeballing the DPW difference. That may, or may not be an overestimation, but it is close enough from what i know of their properties.
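
For anyone who wants to put rough numbers on the DPW (dies-per-wafer) comparison, here is a minimal sketch. It assumes a 300 mm wafer, the common gross-die approximation (no yield, no scribe lines), a die of roughly 200 mm^2 and the 600 mm^2 interposer guess from earlier in the thread. All of these are illustrative assumptions, not measured figures.

```python
import math

def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Gross dies per wafer using the usual area-minus-edge-loss approximation.

    Ignores yield, scribe lines and die aspect ratio (all assumptions).
    """
    radius = wafer_diameter_mm / 2.0
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(gross - edge_loss)

# Hypothetical sizes: ~200 mm^2 die vs. a 600 mm^2 interposer.
apu_dpw = dies_per_wafer(200.0)
interposer_dpw = dies_per_wafer(600.0)
dpw_ratio = apu_dpw / interposer_dpw

print(apu_dpw, interposer_dpw, round(dpw_ratio, 2))
```

The ratio only says how many fewer candidates fit on a wafer. Turning it into a cost ratio also needs the wafer price for each process, which this sketch does not model.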

Glo. said:
32 GB of HBM2 are not possible with 3 stacks of HBM2. Even 16 GB are very hard to do. Every Nvidia GPU has either 32 or 16 GB, with 4 stacks of HBM2.

Nvidia sells 3 stack Tesla P100 and no 32 GB GPU i heard of. Next time, please, use something more than early wccf rumors as description of existing GPU.

Glo. said:
Wait, so for example. You get 199$ worth of CPU, with 150$ worth of GPU, at 50% power consumption, and 50$ lower cost for consumer, and with this it does not fit the bill? Are you sure about what you have read?

You are claiming that such hypothetical 14nm GPU will be 3 times more efficient than power throttled rx470, i.e. consume 40 watts for the same performance. Bold prediction, let's see if it pays off. Also, GPUs depreciate quickly, by the time such APU comes out rx470 may just be an $100 card. Are you sure you are not making up what you want to see and then claim it can happen? Because i sure want myself a new Regera if it works like that.

Glo. said:
MUCH bigger interposer for Fiji chip costs AMD 10$.

Source. I mean, i have found a paywalled report on it, but i am sure as hell not paying more than my own kidney's worth for it. So, source, please. You two have a point, though, older process counteracts the die size somewhat. Make it twice the cost of Zeppelin die instead.

Glo. said:
Total cost for 230 mm2 die, with HBM2 on package is 40$.

Cost breakdown, please.

Glo. said:
Simple answer. If Vega 10 is Q2, and if Vega 11 is Q2-Q3/2017, then HBM2 APUs will be Q3-Q4/2017

See ya on Jan 1st 2018

Glo. · Feb 26, 2017

lolfail9001 said:
You are claiming that such hypothetical 14nm GPU will be 3 times more efficient than power throttled rx470, i.e. consume 40 watts for the same performance. Bold prediction, let's see if it pays off. Also, GPUs depreciate quickly, by the time such APU comes out rx470 may just be an $100 card. Are you sure you are not making up what you want to see and then claim it can happen? Because i sure want myself a new Regera if it works like that.

Have you been paying attention to the Vega architecture breakdown we have seen? High-level changes to the architecture make the GPU at least 25% more powerful. Low-level changes, such as the fabled RFs, account for increased efficiency and increased core clocks at the same voltage and thermal envelope, and also result in higher IPC. Then we have ROPs connected directly to L2 cache, which results in fewer pipeline stalls and no trips to memory, which saves memory accesses and therefore power. AMD claims that the GPUs are up to 4 times more efficient than GPUs released in 2015. One of them was the R7 370, which was a 16 CU unit.
How does it make 4 times more efficient? 2 times higher performance with 50% lower power consumption.
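
The arithmetic behind that "4 times" reading can be written out as a tiny sketch. The function name is just for illustration, and the inputs are the claimed ratios, not measurements:

```python
def efficiency_gain(perf_ratio, power_ratio):
    """Perf-per-watt gain given relative performance and relative power draw."""
    return perf_ratio / power_ratio

# Claimed case: 2x the performance at 50% of the power.
claimed_gain = efficiency_gain(2.0, 0.5)
print(claimed_gain)
```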

lolfail9001 said:
Source. I mean, i have found a paywalled report on it, but i am sure as hell not paying more than my own kidney's worth for it. So, source, please. You two have a point, though, older process counteracts the die size somewhat. Make it twice the cost of Zeppelin die instead.

Well if the source is paywalled I cannot give you anything more.

lolfail9001 said:
Cost breakdown, please.

18-20$ for the die as of today, and that cost can go down by the end of the year. If the package requires a 350-400mm2 interposer, it will add 4$ to the price. So we are in the range of 25$. Do you think that HBM2 costs 7.5$ for each stack? How much do you think TSVs cost? 200$ apiece? Or rather 1$ for each memory stack?

40$ manufacturing cost is the worst case scenario. In the best case scenario, lower die costs and lower HBM2 costs make it extremely affordable to build each APU with HBM2.
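
Spelled out as a sketch, using only the figures claimed in this post (none of them are sourced, so treat every constant as an assumption):

```python
# All values are the poster's claimed figures in USD, not verified costs.
die_plus_interposer = 25.0   # ~18-20 for the die plus ~4 for the interposer
worst_case_total = 40.0      # claimed worst-case manufacturing cost
hbm2_stacks = 2

# Whatever is left of the 40 has to cover the memory stacks.
per_stack_budget = (worst_case_total - die_plus_interposer) / hbm2_stacks
print(per_stack_budget)
```

This only shows what each stack would have to cost for the 40$ total to hold. It says nothing about whether HBM2 can actually be bought at that price.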

JDG1980 · Feb 26, 2017

lolfail9001 said:
Yep, but we have no information how bad they are.

That argument goes both ways. You don't have any solid data for your pessimistic estimates either.

lolfail9001 said:
Except these 2 stacks are 4-hi from what we know, so only 8GB. Not to mention, 8-his are like unicorns.

Where are you getting this from? The official AMD architecture teaser boasts "8x Capacity/stack" for HBM2 vs HBM. That means 8GB per stack, so 2 stacks = 16GB. Backing this up, the leaked slides for Vega 10 specify 16GB of HBM2. (Look carefully near the blurred-out section on the first slide - it's hard to see, but it's there.)
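
To make the two readings concrete, here is a quick sketch. It assumes 8 Gb (1 GB) per DRAM layer in an HBM2 stack and takes Fiji's 1 GB per HBM1 stack as the baseline for the "8x" figure:

```python
GB_PER_HBM2_LAYER = 1      # assumes 8 Gb DRAM dies in the stack
FIJI_GB_PER_STACK = 1      # Fiji: 4 stacks of HBM1, 4 GB total

def total_capacity_gb(stacks, layers_per_stack):
    """Total HBM2 capacity for a given stack count and stack height."""
    return stacks * layers_per_stack * GB_PER_HBM2_LAYER

advertised_per_stack = 8 * FIJI_GB_PER_STACK   # "8x Capacity/stack"
two_8hi = total_capacity_gb(2, 8)              # 16 GB reading
two_4hi = total_capacity_gb(2, 4)              # 8 GB reading
print(advertised_per_stack, two_8hi, two_4hi)
```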

lolfail9001 said:
Sure, but Fiji is known to not be exactly profitable.

What do you mean by "profitable" in this context? If you're saying that the Fiji project as a whole lost money, then you're probably right. But if you're arguing that AMD is actually getting less for each new unit than the current marginal cost of manufacturing it, then that raises the question of why they would even bother to keep producing it rather than just discontinue it with the rest of the 28nm GPU lineup.

lolfail9001 said:
You are claiming that such hypothetical 14nm GPU will be 3 times more efficient than power throttled rx470, i.e. consume 40 watts for the same performance. Bold prediction, let's see if it pays off.

Look at the Radeon Pro WX 5100. It's a pro card so I'm not aware of any gaming benchmarks, but in terms of raw stats (shaders x clock) it's roughly equivalent to an R9 380. And it only consumes 75W - in fact, it's the strongest card in the no-PCIe-connector class currently available from either vendor. Now, consider the fact that, conservatively, at least 10W of that TDP is due to the memory controller. (You can verify this by looking at the Idle vs. Multi-Monitor power ratings for a Polaris card in TPU - the only difference is that Multi-Monitor runs the VRAM at full speed, and it's often a difference of 20W or more.) We know HBM is much more efficient in this regard - Fury X used only 1W more in Multi-Monitor. Add to that the fact that Vega will be adding tiled rendering, which was one of the biggest reasons why Maxwell gained so much in power efficiency, plus various other architectural tweaks and improvements.

Given all this, can something like a Vega version of the WX 5100 fit into an APU power envelope? Probably.

Mopetar · Feb 26, 2017

Glo. said:
Margins do not have to be as good, as they are with Ryzen 5 and 7 chips. The market for APUs is MUCH BIGGER, than just for the CPUs.

There's a limited amount of 14 nm production available. AMD would rather utilize it for high-margin Ryzen and GPU products than they would for an APU they can't sell for as much money and that is going to take up just as much wafer space as the high-margin Ryzen chips.

4C/8T with 16 CU's and 4 GB of HBM2, in two stacks, in 95W TDP package, if performance is right would presumably kill everything on the mainstream and low-end market, even if the APU would cost 300$.

It would cost far more than that when you include the HBM, and a 95W TDP means it's only good for desktops, which aren't as important as getting something into the notebook space. Also, who wants to buy that at $300? I can get a much better Ryzen CPU and pair it with a much better discrete GPU for not that much more and get significantly better performance. Get a $220 R5 and a $130 RX 470 and for $50 more I've got far better performance.

Instead it makes far more sense to try to make a smaller APU that can be sold for $100 - $200. AMD is already going to have a massive edge over Intel graphics, so just having Ryzen cores is going to make such an APU a powerhouse for that part of the market. AMD just needs to be better than Intel, which they will be with a CPU part that's within 10% of Intel's best and a GPU part that blows Intel away. It's also easier to fit that kind of chip into notebooks without intentionally gimping the hardware just to fit into a lower TDP budget.

Building a massive APU this early just isn't a good idea. You'd need to have better wafer capacity, better HBM availability, etc. It's a product that sounds really cool, but makes very little sense for the market right now. Small die APU that can compete in the market segment below Ryzen is what AMD needs.

CatMerc · Feb 26, 2017

Glo. said:
Each Ryzen CPU costs AMD 18-20$ depending on the yield. Fottemberg have said that before, it even is correct if you get the wafer calculator to hand, and try to estimate the yield by yourself alongside with the die size. BitsAndChips also have provided rounded die sizes about the designs for mobile and desktop parts. Mobile part is 170mm2 according to their sources, and desktop is 210mm2, so around Polaris 10, and Ryzen CPUs. Cost of Interposer is 1$ per 10mm2, IIRC. Fiji Interposer cost AMD around 10$, from what we knew about it. So how much HBM2 costs that it would stop AMD to sell it for 299$? 500$ per memory cell?

AMD can sell Polaris 10 GPU with such low markup because the market is bigger than CPU market. If you consider this, and look what APU is, and think about what markets it would be targeted, the market is much bigger, because it combines both: CPU and GPU.

Funniest part IMO. Considering packaging, board, memory, and package of the die itself, a 95W APU with 16 CUs can be cheaper to make than a Polaris 10 GPU. Because it's much less complex, if you think about it.

And you completely forgot the volume, you would produce, which would drive the costs down.

Once again, 16CU Vega GPU with HBM2 on package will be around 40% faster than similar core count and similar core clock Polaris GPU. If we have 1024 core Sapphire RX 460, and it is 10% faster than RX 460, you end with performance exactly between GTX 1050 Ti, and RX 470.

Actually no. CPU's are both higher in margins and volumes, mostly because competition was still alive in the GPU space while Intel was getting high on margins.

lolfail9001 · Feb 26, 2017

Glo. said:
Have you been paying attention to the Vega architecture breakdown we have seen? High-level changes to the architecture make the GPU at least 25% more powerful. Low-level changes, such as the fabled RFs, account for increased efficiency and increased core clocks at the same voltage and thermal envelope, and also result in higher IPC. Then we have ROPs connected directly to L2 cache, which results in fewer pipeline stalls and no trips to memory, which saves memory accesses and therefore power. AMD claims that the GPUs are up to 4 times more efficient than GPUs released in 2015. One of them was the R7 370, which was a 16 CU unit.
How does it make 4 times more efficient? 2 times higher performance with 50% lower power consumption.

Remember AMD's claim of 2.8x perf/watt? Or Polaris presentation at '15 CES? That went well, didn't it. So, until proven otherwise i consider Vega to be a ~50% jump in power efficiency over Fiji PRO. At very most. 120/1.5=80. Sorry man, at most you are having a 25% more power efficient PC with APU for unknown cost.

Glo. said:
Well if the source is paywalled I cannot give you anything more.

Oh, so YOU do not have a source either. Well, then why dream about it.

Glo. said:
18-20$ for the die as of today, and that cost can go down by the end of the year. If the package requires a 350-400mm2 interposer, it will add 4$ to the price. So we are in the range of 25$. Do you think that HBM2 costs 7.5$ for each stack? How much do you think TSVs cost? 200$ apiece? Or rather 1$ for each memory stack?

I don't think, i know that each GDDR5 chip costs about ~$5 from PS4 cost breakdown. You are claiming that each HBM stack is 5 times less in spite of being much much harder to manufacture with good yields. See why i call that a dream?

JDG1980 said:
That argument goes both ways. You don't have any solid data for your pessimistic estimates either.

Yes, and the first thing my parents taught is to be pessimistic when you can't know it all.

JDG1980 said:
Where are you getting this from? The official AMD architecture teaser boasts "8x Capacity/stack" for HBM2 vs HBM. That means 8GB per stack, so 2 stacks = 16GB. Backing this up, the leaked slides for Vega 10 specify 16GB of HBM2. (Look carefully near the blurred-out section on the first slide - it's hard to see, but it's there.)

Because i am yet to see a single evidence of 8-his being even produced, let alone 2Ghz 8-his. Needless to say, i assume the conservative position of these being produced for use in accelerators only.

JDG1980 said:
What do you mean by "profitable" in this context? If you're saying that the Fiji project as a whole lost money, then you're probably right. But if you're arguing that AMD is actually getting less for each new unit than the current marginal cost of manufacturing it, then that raises the question of why they would even bother to keep producing it rather than just discontinue it with the rest of the 28nm GPU lineup.

Who told you they are still producing it? Most of what i know about Fiji sales are basically inventory clearance because they did not really move units at all.

JDG1980 said:
It's a pro card so I'm not aware of any gaming benchmarks, but in terms of raw stats (shaders x clock) it's roughly equivalent to an R9 380.

It performs like rx460, actually, a far cry from "strongest card in PCI-e envelope". So yes, downclocking rx470D to 75W TDP just turns it into oversized rx460, not something more visibly efficient.

JDG1980 said:
Given all this, can something like a Vega version of the WX 5100 fit into an APU power envelope? Probably.

If you are going to go so far, then yes, it may, but then value proposition of dGPU or hybrid setup in laptop becomes even stronger.

JDG1980 · Feb 26, 2017

lolfail9001 said:
Remember AMD's claim of 2.8x perf/watt? Or Polaris presentation at '15 CES? That went well, didn't it. So, until proven otherwise i consider Vega to be a ~50% jump in power efficiency over Fiji PRO. At very most. 120/1.5=80. Sorry man, at most you are having a 25% more power efficient PC with APU for unknown cost.

Radeon Pro WX 5100 was indeed 2.8x as efficient as its direct predecessor (FirePro W5100).
WX 5100 (Polaris 10): 3.9 TFlops
W5100 (Bonaire): 1.4 TFlops
That's over 2.7x the performance at the same TDP (75W) in the exact same form factor with the same cooler.
I agree they should not have implied that this would apply to DX11 gaming on RX 480.
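
As a sketch, the ratio from the peak-throughput figures quoted above (spec-sheet TFLOPS at the same 75W TDP, not game benchmarks):

```python
# Peak FP32 figures and TDP as quoted in this post.
wx5100_tflops = 3.9
w5100_tflops = 1.4
tdp_watts = 75.0

perf_ratio = wx5100_tflops / w5100_tflops
# Same TDP on both cards, so the perf/W ratio equals the perf ratio.
perf_per_watt_ratio = (wx5100_tflops / tdp_watts) / (w5100_tflops / tdp_watts)
print(round(perf_ratio, 2), round(perf_per_watt_ratio, 2))
```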

lolfail9001 said:
I don't think, i know that each GDDR5 chip costs about ~$5 from PS4 cost breakdown.

Is that what it costs now or what it cost >3 years ago when the PS4 was first released?

lolfail9001 said:
You are claiming that each HBM stack is 5 times less in spite of being much much harder to manufacture with good yields. See why i call that a dream?

Nothing I said related to the cost of the HBM2 stack. I was questioning your estimates on interposer costs.
We also don't know how an APU would make use of HBM. It could be as little as one 2-hi stack of 2GB used as a high bandwidth cache and shared between CPU and GPU, similar to Intel's eDRAM. Infinity Fabric should enable this to work.

lolfail9001 said:
Because i am yet to see a single evidence of 8-his being even produced, let alone 2Ghz 8-his. Needless to say, i assume the conservative position of these being produced for use in accelerators only.

AMD explicitly advertised 8x capacity per stack compared to Fiji. That means 8GB a stack. The Vega 10 leaked slide also supports this. You have adduced absolutely no evidence to the contrary, just personal skepticism.

lolfail9001 said:
It performs like rx460, actually, a far cry from "strongest card in PCI-e envelope".

Produce some evidence for this assertion. RX 460 has 2.1 TFlops at most, which is only ~54% as much as WX 5100. And it has a memory bus only half as wide. Both use the same architecture; with such divergent resources, why would you expect these cards to produce comparable results?

Glo. · Feb 26, 2017

lolfail9001 said:
It performs like rx460, actually, a far cry from "strongest card in PCI-e envelope". So yes, downclocking rx470D to 75W TDP just turns it into oversized rx460, not something more visibly efficient.

Have you actually used the GPU yourself? I have my own company that designs and builds efficient computers, apart from PV installations, home battery installations, and home automation systems.

We have tested WX5100 as a gaming GPU, just for fun, it was too expensive to consider it viable. You know what? It outperformed the GTX 1050 Ti, while consuming the same amount of power.

We tested it in: DOTA 2, League of Legends, Heroes of the Storm, Overwatch, Doom. In ALL of those games it was faster.
Is the RX 460, by your admission, faster than the GTX 1050 Ti? If it is, it is for sure the better buy.

lolfail9001 · Feb 26, 2017

Glo. said:
Have you actually used the GPU, by yourself?

Nope, i have dug up time spy score for it, then went to 3dmark database and learned that rx460 posts similar scores down to 3dmark's variance. In fact, looking at another review with fire strike and unigine heaven, it scores exactly like wx4100/rx460. Oh, the irony of synthetics.

Glo. said:
It outperformed the GTX 1050 Ti, while consuming the same amount of power.

Outperformed 1911Mhz 1050 Ti (that's the clock of 75W TDP 1050 Ti)? Not bad, but too bad there are no numbers to run with.

Glo. said:
Is the RX 460, by your admission, faster than the GTX 1050 Ti? If it is, it is for sure the better buy.

Nope, but i would like to see harder numbers, if synthetics are now so drastically different from games even in the very ballpark of performance.

Glo. · Feb 26, 2017

lolfail9001 said:
Nope, i have dug up time spy score for it, then went to 3dmark database and learned that rx460 posts similar scores down to 3dmark's variance. In fact, looking at another review with fire strike and unigine heaven, it scores exactly like wx4100/rx460. Oh, the irony of synthetics.

Nope, but i would like to see harder numbers, if synthetics are now so drastically different from games even in the very ballpark of performance.

So let me get this straight. You truly believe that a 1792 GCN core chip with a 1.086 GHz core clock and 160 GB/s of bandwidth is only slightly faster than, or in the same ballpark as, an 896 GCN core, 1.216 GHz GPU with 112 GB/s of bandwidth, because it scored similarly in 3DMark?
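
For reference, the raw "shaders x clock" gap can be sketched like this. It assumes the usual 2 FLOPs per shader per clock (one fused multiply-add) and uses the core counts and clocks given above. It is theoretical peak throughput only and says nothing about actual frame rates:

```python
def peak_fp32_tflops(shaders, clock_ghz, flops_per_clock=2):
    """Theoretical peak FP32 throughput, assuming one FMA per shader per clock."""
    return shaders * clock_ghz * flops_per_clock / 1000.0

wx5100_peak = peak_fp32_tflops(1792, 1.086)
rx460_peak = peak_fp32_tflops(896, 1.216)
print(round(wx5100_peak, 2), round(rx460_peak, 2),
      round(wx5100_peak / rx460_peak, 2))
```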

I can bring the numbers for Overwatch 1080P Epic setting, because I can still remember them. Averages around 75 FPS, vs 63 for GTX 1050 Ti.

RX 460 averages in the same settings 40 FPS. And this I think concludes whole discussion.

SPBHM · Feb 26, 2017

it looks to me like the best efficiency for Polaris/14nm is at 900MHz, so I think some extra CUs and 900MHz would make sense with the faster DDR4... clocks higher than that, and a lot more CUs, not really. I'm thinking something like 640-768 SPs would make more sense!?

imported_jjj · Feb 27, 2017

Glo. said:
First, the aforementioned APU would cost 299$, not 500$.

Secondly, the die cost would be extremely similar to the cost of Ryzen 8C CPU, because of similar die size, for 4C+16CU design.
So pretty much around 20$ per die we are talking about wafer costs.

Thirdly, what APU we are talking about? 4C/8T+12CU is mobile design, with up to 35W TDP. It is called Raven Ridge.
4C/8T+16CU+ HBM2 is most likely Horned Owl APU with TDPs up to 95W, that is targeted for not only Mainstream market, but also embedded, professional, server, and Machine Learning Markets. Those are the markets where the margins can be higher.
Lastly we have Snowy Owl design: 16C/32T, 64CU's, 16 GB of HBM2. Note that it is exactly 4 times bigger design than Horned Owl. Coincidence? Or perfect scaling?

Next is the cost. Whole APU+2 stacks of HBM2 package will not cost more than 40$. Price it at 300$ and you still have huge margin, in a market that is much bigger than CPU only. You are targeting the market where you have BOTH CPU and GPU.

Lastly. Which of the markets would explode with such a product? Small-Form-Factor, efficient designs. NUC's. The markets in which the money lies, currently.

It is the funniest part, that this APU design can be a cure for dying mainstream market.

Im sure we will see this design this year, but it will be at least Q4 2017.

Seriously you want 2 HBM stacks for 16 CUs lol? They have 2 stacks for 64 CUs with Vega 10 at 12.5TFLOP and 200+W.
And how much does it cost, 1GB of DDR4 2133 is 6-8$ today. Even a single stack of 2GB HBM doubles the cost or more, considering the interposer and yields.
Vs a 11-12CU APU they get what?
Higher dev costs (at much lower volumes, sales wise), more than 2x the manufacturing costs (larger die, HBM, interposer) and what do they gain?
11-12 CUs vs 16 CUs at same overall TDP - the 16CU has the HBM adding significant heat and IF perf is higher, the CPU's TDP rises a bit too.
So they more than double the manufacturing costs, dev costs are much higher on a per unit basis and they gain almost nothing in perf at the same TDP for the entire SoC.
If the OS could use the HBM as a system memory to reduce the amount of DRAM required, costs become slightly better but still won't make any sense unless you have a firm commitment in laptop from some big client.
If you consider the basics, CPU, GPU, memory interface, you are better off finding the right balance between how wide the GPU is and clocks than going after costly solutions that get you nothing. The memory BW is just a limitation they need to accept and deal with. Just like folks in phones don't go with a 128-bit memory interface because it is too power hungry.

The 16 cores server SKU should be a more traditional MCP (not a monolithic die), the CPUs are not on the interposer so they just add a Vega 10 to the package ,more or less.
As i have mentioned, such a solution comes with minimal cost and could be doable: pairing a 4C die with a Vega 11 (let's say 32CU or so; note that Vega 11 might not use HBM and then this solution isn't viable anymore). It would be a niche market for folks that need a very compact machine, like laptops. This would make sense if Vega is good, as it would hurt Nvidia, but it would need to be close in perf, power and cost to Nvidia's discrete offering (+ a discrete CPU) and that might be tough. It wouldn't be an APU per se but close enough.

Anyway, the normal 11-12CU APU will be up to 300-350$, APU's are not cheaper, they just replace some cores with a GPU to serve a certain market, Kaby Lake is an APU.
Anyone expecting Raven Ridge with 4 cores and 11-12CUs at 150$, is delusional. And BTW, Raven Ridge is laptop first, we don't even know if it comes to desktop this year.

The point of using advanced packaging in APUs and CPUs is to lower costs (better yield and lower dev costs) and gain flexibility. To achieve that, they need much cheaper solutions like an organic interposer or Intel's silicon bridge. On the memory side, HBM in its current form is not ideal either, from a cost perspective. It's highly likely that we'll see them use chiplets and advanced packaging but it will take a bit more time.
There is a packaging conference in about a week and there could be some interesting things shown http://www.imaps.org/devicepackaging/

Glo. · Mar 1, 2017

After reading about and looking at Vega I have changed my belief about what the APU will look like.

4C/8T+16CU+2 GB's of HBM2 with 256 GB/s@ 65 and 95W.
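
The 256 GB/s figure lines up with a single HBM2 stack, as this sketch shows. It assumes the standard 1024-bit interface per stack and a 2.0 Gbps per-pin data rate, which is the assumption that makes the number work out:

```python
def hbm_stack_bandwidth_gbs(bus_width_bits=1024, pin_rate_gbps=2.0):
    """Peak bandwidth of one HBM stack in GB/s (bits per second / 8)."""
    return bus_width_bits * pin_rate_gbps / 8.0

one_stack = hbm_stack_bandwidth_gbs()
print(one_stack)
```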

thepaleobiker · Mar 1, 2017

Folks, wccftech has posted some leaked slides that explain the Ryzen SKU naming scheme, as well as how XFR works. It's interesting.

Regards,
Vishnu

Mockingbird · Mar 19, 2017

Did I not say this is going to happen?

I must be an oracle.
