That was awesome, and probably (definitely) over my head.
Glad you liked it. Bear in mind that there's a lot more for me to learn on the subject. One of these days I'll mess with aparapi and see what I can learn that way. All this waiting for Project Sumatra to finally show up in Java 9 is like . . . waiting for desktop Carrizo.
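For the curious, the Aparapi hello-world is about this simple. A minimal sketch, assuming the older com.amd.aparapi package; Aparapi translates the run() method's bytecode to OpenCL and falls back to a Java thread pool if no GPU is available:

```java
import com.amd.aparapi.Kernel;
import com.amd.aparapi.Range;

public class VectorAdd {
    public static void main(String[] args) {
        final int n = 1_000_000;
        final float[] a = new float[n];
        final float[] b = new float[n];
        final float[] sum = new float[n];
        for (int i = 0; i < n; i++) { a[i] = i; b[i] = 2 * i; }

        // Aparapi converts this run() method to an OpenCL kernel and
        // dispatches one work-item per array element on the GPU.
        Kernel kernel = new Kernel() {
            @Override public void run() {
                int i = getGlobalId();
                sum[i] = a[i] + b[i];
            }
        };
        kernel.execute(Range.create(n));
        kernel.dispose();
    }
}
```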
But!
So dGPU via PCI-e is latency-bound; in the past, that kind of bottleneck has just produced a new interface to alleviate the latency. VLB, PCI, AGP, PCI-X, PCI-e, whatever's next.
Compute-unit daughter cards? PCI-S(uper)? One of those seems more likely (cheaper) than trying to squeeze a foot-long GPU onto the CPU's real estate. Or is the entirety of today's dGPU not needed?
VESA Local Bus, ah, the memories. But I digress.
I was poking around on Wikipedia's PCI-e article today, and I found this line to be . . . interesting:
"PCIe sends all control messages, including interrupts, over the same links used for data. The serial protocol can never be blocked, so latency is still comparable to conventional PCI, which has dedicated interrupt lines."
Over the years, changes in expansion buses have mostly brought us higher bandwidth and more ability to run multiple devices without conflicts (anyone who was an early adopter of PCI sound cards may know what I'm talking about here: Ensoniq AudioPCI + 3dfx Voodoo Rush = fail). What they have not brought us is significantly lower latency, at least not in the jump from PCI to PCI-e (I'll qualify that by saying I'm uncertain whether PCI and PCI-e latencies are similar as a function of bus clocks or as a direct comparison in nanoseconds). And, for the most part, this makes sense: most expansion cards don't need latencies that low. Graphics cards get away with it by rendering the scene on-card and pushing frames out to the monitor without having to send much of anything back to the CPU for processing. With GPGPU, you can't do that. All the results have to go back to the CPU.
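If you want to see that round-trip cost for yourself, here's a rough sketch, again assuming Aparapi: it times a pile of tiny dispatches, and since each execute() is a full CPU-to-GPU-and-back trip, the average is dominated by bus/launch latency rather than actual math.

```java
import com.amd.aparapi.Kernel;
import com.amd.aparapi.Range;

public class RoundTripCost {
    public static void main(String[] args) {
        final float[] data = new float[1024];
        Kernel kernel = new Kernel() {
            @Override public void run() {
                int i = getGlobalId();
                data[i] = data[i] * 2f;   // trivially small work, on purpose
            }
        };
        Range range = Range.create(data.length);
        kernel.execute(range);            // warm-up: the OpenCL compile happens here

        final int iters = 1_000;
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) {
            kernel.execute(range);        // each call: CPU -> bus -> GPU -> bus -> CPU
        }
        long t1 = System.nanoTime();
        System.out.printf("avg dispatch round trip: %.1f us%n",
                (t1 - t0) / (iters * 1000.0));
        kernel.dispose();
    }
}
```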
Buuuuut if you look at certain niche networking applications, such as super-duper high-speed cluster interconnects, you will see some expansion slots that HAVE produced lower latencies over the years. Example: AMD's own HTX slot. HTX is a workstation/server standard that basically gives an expansion card a direct link into the HyperTransport fabric. It is undoubtedly expensive to implement, since the HT topology requires direct links between pretty much every other HT device local to the board. Or . . . something like that.
So if there's any expansion slot in AMD's stable today that's really suited to dGPU compute in latency-sensitive situations, it's HTX. Intel's QPI spec allows for QPI expansion devices, if I recall correctly, though I have not seen them market anything like that or offer QPI slots on their platforms.
NostaSeronX (where has that guy been lately? It's like he vanished after several of the AMD roadmap announcements/leaks . . .) was going on somewhere about a hypothetical "PCI-e over HT" hybrid slot, also dubbed HTX, which might find its way onto certain AMD systems in the near-ish future. The idea seemed to be that a small, short-headered HT packet could nest itself inside a larger PCI-e packet, basically letting the system extend an HT link to a device over the PCI-e bus with HT-like latency. That would be hella cool if AMD could make it work, especially since the cost to implement would (presumably) be low enough for it to reach consumer motherboards. It would also make for an acceptably low-latency expansion slot for dGPUs, making them more versatile in latency-sensitive situations. That being said, it was NostaSeronX, and . . . yeah.
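Just to make the nesting idea concrete, here's a toy sketch of what that encapsulation might look like. To be clear, every field name and size below is invented for illustration; neither HT nor PCI-e framing actually looks like this on the wire:

```java
import java.nio.ByteBuffer;

public class HtOverPcieSketch {
    // A short-headered HT packet: a couple of pretend header bytes plus payload.
    static byte[] htPacket(byte command, byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(2 + payload.length);
        buf.put(command);                  // pretend 1-byte HT command field
        buf.put((byte) payload.length);    // pretend 1-byte length field
        buf.put(payload);
        return buf.array();
    }

    // Nest the whole HT packet as the data payload of a larger PCI-e TLP.
    // The far end would strip the TLP framing and forward the inner HT
    // packet on, which is how the HT link gets extended over the PCI-e bus.
    static byte[] wrapInTlp(byte[] htPacket) {
        ByteBuffer buf = ByteBuffer.allocate(4 + htPacket.length);
        buf.putInt(htPacket.length);       // pretend 4-byte TLP header
        buf.put(htPacket);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] ht = htPacket((byte) 0x1, new byte[]{42, 43});
        byte[] tlp = wrapInTlp(ht);
        System.out.println("HT bytes: " + ht.length + ", tunneled TLP bytes: " + tlp.length);
    }
}
```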
All that aside, I am going to agree with greatnoob that the future of GPGPU is probably going to involve a lot of cooperation between iGPUs and dGPUs as they work asynchronously to knock out operations much more quickly than you could with traditional x86 cores. Until low-latency expansion slots become a reality for consumer-level machines, the low-hanging fruit (massive, highly-parallel workloads) will go to whatever kind of GPU is available while the oddball stuff (smaller, intermittent parallel workloads handled today via SIMD) will make iGPUs shine if the coders bother with it.
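Something like this hypothetical dispatch heuristic is what I'd expect that cooperation to boil down to. All the names and thresholds here are made up for illustration; a real runtime would calibrate them against measured launch and transfer latencies:

```java
public class GpgpuDispatchSketch {
    enum Target { CPU_SIMD, IGPU, DGPU }

    // Route a parallel job to whichever compute resource its size favors.
    static Target pickTarget(long workItems, boolean dGpuPresent) {
        if (workItems < 10_000) {
            return Target.CPU_SIMD;   // dispatch overhead would dominate
        }
        if (workItems < 5_000_000 || !dGpuPresent) {
            return Target.IGPU;       // shared memory, no bus round trip
        }
        return Target.DGPU;           // big enough to amortize the PCI-e hop
    }

    public static void main(String[] args) {
        System.out.println(pickTarget(1_000, true));       // CPU_SIMD
        System.out.println(pickTarget(100_000, true));     // IGPU
        System.out.println(pickTarget(50_000_000, true));  // DGPU
    }
}
```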
DrMrLordX,
As it stands now, I am not sure I believe even the (rumored) Bristol Ridge APU will be enough. (Re: only four "weak" AMD big cores in late 2016? And it still needs dual-channel memory, albeit DDR4, to keep the 512 SPs fed.)
Two things here (some of which I've said elsewhere . . .):
1). I think we all know that Bristol Ridge is a stopgap until AMD can replace Excavator with Zen in their APUs (and move to 14nm on all their CPUs). Whether or not this is solely a WSA issue is an interesting question that may never receive a satisfactory answer.
2). Don't be so quick to dismiss AMD's 28nm planar Construction cores as "weak". We haven't seen Excavator at work yet, and it will be a while before we see it in desktop form; expect Carrizo to have the usual mobile-processor compromises. Regardless . . . Steamroller alone has some real performance surprises, as I have learned in recent weeks running code on the thing (and I'm not talking about aparapi, C++ AMP, or anything like that).
As an example, take a look at CRFX's 500m y-cruncher score with a 5 GHz FX-8350 (161.398s). Now compare that to my 500m run with a 4.7 GHz A10-7700K (212.737s). I got a little speed boost running Linux, interestingly enough, but still: I was only about 50s off from a Piledriver with twice the modules and a 300 MHz clockspeed advantage. Double up my modules (which is technically impossible on FM2+) so I have 4M, and my time would probably land somewhere in the 110-120s range (perfect scaling would put it around 106s, so that leaves a little room for overhead). Still not world-beating, but it would raise a few eyebrows.