AMD Carrizo APU Details Leaked


Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
It's quite obvious that big.LITTLE isn't optimal, when the two biggest ARM companies have rejected it.
Which two companies are you talking about - Samsung and Apple? Samsung uses it in the Galaxy S5 and S4 too. Apple doesn't use it for the moment, but may do so in the future. big.LITTLE is a quite new concept, you know.

Also, do you know why some companies are not using it at the moment? Perhaps it's because Qualcomm does not have any uArch (i.e. Qualcomm Krait core variants) that suits the big.LITTLE concept yet. We don't know.
And PR slides don't change that fact. While the 3rd largest mainly uses it to sell the notion of moar cores.
State what is technically wrong with them instead if you can find something. Are you saying the graphs are incorrect?
They will suffer from increased production and software costs instead.
And how much die area does a Cortex A7 core occupy again? Nearly nothing, so production cost will not increase much.

And OS support for big.LITTLE is free and is being developed for the Linux kernel, which is used e.g. in Android, where these CPUs will primarily be used. As for app SW, it doesn't need to be changed; the OS makes big.LITTLE transparent to the app SW.
 
Last edited:

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
big.LITTLE is simply for companies that can't afford the R&D needed.
Then please show us the single uArch that scales perfectly across the complete performance spectrum, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.

Also, I think the reason Intel does not yet have a big.LITTLE style solution available is that they have no cores suitable for that yet. They basically only have the Atom and the normal big x86 core range (the rest are slight variants based on those). What they would need is a uArch that sits slightly below or slightly above the current Atom cores performance-wise. Then they could pair an Atom core with one of those.
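The two-core-types argument running through this thread can be illustrated with a toy calculation. This is purely a hypothetical sketch (every coefficient below is invented, not a measurement of any real core): each core is modeled as power-optimal over a different performance range, and a big.LITTLE-style envelope simply picks whichever core does the job for less power at each level.

```python
# Toy model: power draw (W) as a function of required performance level,
# for a hypothetical "little" core and a hypothetical "big" core.
# All coefficients are made up purely for illustration.

def little_power(perf):
    # Efficient at low performance; power rises steeply when pushed hard.
    return 0.05 + 0.1 * perf + 0.4 * perf ** 3

def big_power(perf):
    # Higher base cost, but scales much better at high performance.
    return 0.5 + 0.25 * perf + 0.02 * perf ** 3

def biglittle_power(perf):
    # big.LITTLE envelope: use whichever core does the job for less power.
    return min(little_power(perf), big_power(perf))

for perf in [0.5, 1.0, 2.0, 4.0]:
    print(perf, round(little_power(perf), 2),
          round(big_power(perf), 2), round(biglittle_power(perf), 2))
```

By construction the envelope is never worse than either core alone at any point on the curve, which is the shape of Fjodor2001's "two core types always beat one" claim; whether the win justifies the extra die area is the rest of the thread's argument.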
 
Last edited:

CHADBOGA

Platinum Member
Mar 31, 2009
2,135
832
136
Also, I think the reason Intel does not yet have a big.LITTLE style solution available is that they have no cores suitable for that yet.

Or as Intel stated, "big.LITTLE" is an abomination.

What is the leading device that is equipped with "big.LITTLE"?
 

DrMrLordX

Lifer
Apr 27, 2000
22,035
11,620
136
Okay, so when is Carrizo going to launch again? I've seen guesses posted. Anyone got something concrete?
 

podspi

Golden Member
Jan 11, 2011
1,982
102
106
Or as Intel stated, "big.LITTLE" is an abomination.

I wouldn't go that far. Sometimes there is more than one solution to a problem, and that's OK. Even if we just go by what everyone in this thread is claiming, it is clear there are advantages and disadvantages to each approach:

1) 1-core: High R&D costs, higher probability of failing to hit design targets (?), but cheaper to manufacture

2) big.LITTLE: Cheap(er) to design (as long as you have two cores that can hit the TDP targets at the desired performance levels, you are good to go), at the expense of die area.


This reminds me of the old bulk-vs-SOI debate (which appears to be over now that SOI isn't quite as useful anymore). SOI had higher manufacturing costs, but you saved on R&D. Depending on what you were doing, you went one way or the other.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
I mean the number of flops/core is about the same if I take your 960 Xeon cores number.

FLOPS is one of the worst metrics to use for any cross-comparison. X might need 1 flop to do the operation, Y may need 3, and Z needs 8. So even if Z had 7 times the FLOPS of X, it would still be slower at that operation.
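The point can be made concrete with one line of arithmetic (the chip names and numbers here are hypothetical, just mirroring the X/Z example): normalize peak FLOPS by the number of flops each chip needs per operation.

```python
# Hypothetical chips: Z advertises 7x the peak FLOPS of X, but the
# operation in question costs 8 flops on Z and only 1 flop on X.
peak_flops = {"X": 1e9, "Z": 7e9}
flops_per_op = {"X": 1, "Z": 8}

# Operations per second is what actually matters for that workload.
ops_per_sec = {c: peak_flops[c] / flops_per_op[c] for c in peak_flops}
print(ops_per_sec)  # Z lands at 0.875e9 ops/s vs X's 1e9: slower despite 7x FLOPS
```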
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
1) 1-core: High R&D costs, higher probability of failing to hit design targets (?), but cheaper to manufacture

The R&D bill is always smaller than the COGS bill, so it seems to be a good idea to spend a little more on R&D in order to get lower manufacturing costs.

big.LITTLE means duplicated logic between the small and the big cores, and a lot of overhead to manage the transition between them. This will hurt exactly your COGS bill while reducing R&D. It's easy to see why Nvidia is pursuing that route, as they are neither high volume nor a leader in resource availability, and it might make sense from a TTM POV. But if you are one of the big guys (Intel, Qualcomm, Apple), with adequate resource levels and designing a high-volume product, why bother with it? These companies will put a lot of energy into ensuring optimum usage of the die layout (the smallest possible that reaches the established performance targets). big.LITTLE is a non-starter here; they would rather focus on better power management (a higher R&D bill).
 
Last edited:

podspi

Golden Member
Jan 11, 2011
1,982
102
106
The R&D bill is always smaller than the COGS bill, so it seems to be a good idea to spend a little more on R&D in order to get lower manufacturing costs.

Maybe, but clearly not always (unless we really think all these companies are incompetent).

The real question is: will the savings in R&D be larger than the increase in COGS? I don't have a lot of intimate knowledge of the semiconductor market, but ARM seems to think so.

Of course, ARM doesn't manufacture anything... hrm

Edit: And yes, I agree it makes no sense if you are big, which is why INTC, QCOM, and AAPL don't use it.

What I'm saying is that if somebody can use big.LITTLE to reduce their R&D costs more than their COGS, and the power savings are the same, the consumer doesn't care.
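That trade can be framed as a simple break-even calculation (a sketch with invented placeholder figures, not real industry numbers): the lower-R&D design wins as long as shipped volume stays below the point where the per-unit COGS penalty eats the R&D savings.

```python
def total_cost(rnd, unit_cost, units):
    # Total program cost: fixed R&D plus per-unit manufacturing cost (COGS).
    return rnd + unit_cost * units

def breakeven_units(rnd_savings, cogs_penalty_per_unit):
    # Volume at which the extra per-unit COGS cancels out the R&D savings.
    return rnd_savings / cogs_penalty_per_unit

# Hypothetical example: save $50M in R&D, pay an extra $0.50 per die.
print(breakeven_units(50e6, 0.50))  # 100 million units
# Below ~100M units the cheaper-R&D design is the better deal; above it,
# the higher-R&D, leaner-die design wins -- the "big guys" regime.
```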
 
Last edited:

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
What I'm saying is that if somebody can use big.LITTLE to reduce their R&D costs more than their COGS, and the power savings are the same, the consumer doesn't care.

Big.LITTLE is in essence, this:

Instead of spending a lot of time and resources optimizing a big core, the company eats some of its gross margins, bolts together a small and a big core, both simpler designs than the custom cores of the market leaders, and calls it a day. The company is essentially trading less R&D for more COGS, and less R&D means either fewer engineering resources devoted to the development of the chip or a shorter TTM.

big.LITTLE will have some inherent weaknesses, because neither core will be as optimized as it should be, meaning it will never be the performance king, and once cost-optimized designs arrive it will not be able to compete on cost either.

That is not to imply that companies using big.LITTLE are incompetent, but that they are bound by a different set of constraints than Apple, Mediatek, Intel or Qualcomm, and because of that they must look for other engineering solutions. big.LITTLE is just one of those solutions.

If you are a small player that won't sell that many chips and need something out of the door as soon as possible, then big.LITTLE might make sense, because your COGS bill won't be that high and you can make more money by launching a product faster than your stronger competitor instead of going against him head on with an inferior design.

This is exactly what Nvidia did with Tegra. They launched dual cores and quad-cores faster than their competitors, and their sales boomed, but then when Qualcomm, Mediatek and others launched their next generation designs, Nvidia sales busted big time.

If you are Qualcomm, Intel, or Apple, meaning you ship over a hundred million processors of the same core, then it doesn't make sense to go big.LITTLE, because your COGS bill will be HUGE, so that's where you need to focus your energy.
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
It's not a matter of how much R&D money you have. No matter how much R&D money you have you can never create a single uArch core type that scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.

So you can always do better with two separate core types and a big.LITTLE style handover at suitable performance levels. Intel could too. The problem is that they do not have any cores suitable for that at the moment, i.e. they do not have any small power-efficient cores that sit just below the Atom core in performance.

ARM has many more core types to choose from at various performance levels, and they are ISA compatible so the OS can move processes seamlessly between core types depending on workload.
 
Last edited:

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
It's not a matter of how much R&D money you have. No matter how much R&D money you have you can never create a single uArch core type that scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.

It's not a matter of R&D only; it's a matter of R&D, COGS, TTM constraints and projected TAM. Change one, and big.LITTLE's feasibility also changes.

Optimizing a single core like Qualcomm or Apple do is also a compromise: they need to spend a lot of R&D resources to bridge the power management gap between the standard ARM big core and the standard ARM small core, meaning that they need to compromise on time or spend *far* more R&D resources than the competition to get an equivalent-generation chip out of the door. The benefit is that we'll see higher performance than standard parts *and* smaller die size than big.LITTLE parts.

Once the ARM market becomes more mature, I expect big.LITTLE to die, because there will be fewer competitors and those will be focused on costs or performance, not on TTM.
 
Last edited:

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
Optimizing a single core like Qualcomm or Apple do is also a compromise: they need to spend a lot of R&D resources to bridge the power management gap between the standard ARM big core and the standard ARM small core, meaning that they need to compromise on time or spend *far* more R&D resources than the competition to get an equivalent-generation chip out of the door. The benefit is that we'll see higher performance than standard parts *and* smaller die size than big.LITTLE parts.

Once the ARM market becomes more mature, I expect big.LITTLE to die, because there will be fewer competitors and those will be focused on costs or performance, not on TTM.

As I said before:

"It's not a matter of how much R&D money you have. No matter how much R&D money you have you can never create a single uArch core type that scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads."

So even if Qualcomm spends all this R&D money on their CPU arch, they can always do better perf/watt-wise over a larger performance range if they develop an additional uArch core suited to a lower or higher performance point than what they already have, and pair it with that.

Sure, they'll have to pay with some additional die area. But a Cortex A7 core occupies so little die area that the cost is negligible.
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
Also, note that big.LITTLE does not necessarily mean you'll have to pay with additional unused die area, as if only the big or only the little cores could ever be active at one time. There is an execution mode where all cores are active, both the big and the little ones; it is used when max performance is needed. So there is no wasted silicon in that case.
 

podspi

Golden Member
Jan 11, 2011
1,982
102
106
Big.LITTLE is in essence, this:
...

How much longer are we going to argue while agreeing with each other :hmm:. I don't disagree with anything you said, and as far as I can tell, nothing you've said contradicts what I said.


My entire point is that Big.LITTLE has legitimate reasons for being used, and isn't inherently worse if the company isn't large enough to warrant the extra R&D. Which is what you just said! :biggrin:
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Also, note that big.LITTLE does not necessarily mean you'll have to pay with additional unused die area, as if only the big or only the little cores could ever be active at one time. There is an execution mode where all cores are active, both the big and the little ones; it is used when max performance is needed. So there is no wasted silicon in that case.

When all cores are being used, the small cores will be operating outside their optimum frequency/power range, and this will eat into the power/thermal budget of the big cores while delivering a correspondingly lower perf/watt at that level. When just one set of cores is being used, the other will be leaking current, eating into the power budget of the processor while not processing anything, on top of the overhead of managing two kinds of cores if you have to switch from one set to the other. In both cases we have wasted silicon; in both cases we have wasted power/thermal budget.

Sure, they'll have to pay with some additional die area. But a Cortex A7 core occupies so little die area that the cost is negligible.

There is no such thing as negligible cost when we are talking about 100+ million units. Save 10 cents on the die and you are putting $10+ million straight into your pocket. Now if you are Nvidia, shipping 10-15 million of each die you design, the math changes, because R&D will be a bigger ratio of your COGS.
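The volume argument above can be sketched in two lines (all figures invented for illustration): the same R&D bill weighs very differently per chip at Nvidia-scale versus 100M+-unit volumes.

```python
# Hypothetical: a $100M design bill amortized over different shipment volumes.
rnd = 100e6
for units in (15e6, 100e6):     # roughly Nvidia-scale vs. big-player scale
    print(units, rnd / units)   # R&D dollars baked into each chip shipped
# At 15M units, R&D adds ~$6.67 per chip; at 100M units, only $1.00 --
# which is why per-die COGS dominates the big players' thinking while
# R&D savings dominate the small players'.
```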
 
Last edited:

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
My entire point is that Big.LITTLE has legitimate reasons for being used, and isn't inherently worse if the company isn't large enough to warrant the extra R&D. Which is what you just said! :biggrin:

Yup!
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
When all cores are being used, the small cores will be operating outside their optimum frequency/power range, and this will eat into the power/thermal budget of the big cores while delivering a correspondingly lower perf/watt at that level. When just one set of cores is being used, the other will be leaking current, eating into the power budget of the processor while not processing anything, on top of the overhead of managing two kinds of cores if you have to switch from one set to the other. In both cases we have wasted silicon; in both cases we have wasted power/thermal budget.
No, they are not used outside their optimal frequency/performance range. Neither the Cortex A7 nor the A15 will go beyond the max end of their respective optimal range. They will both be at their Highest Operating Point.



Regarding leaking current, it's minimal when the complete core is power gated and shut off.

There is no such thing as negligible cost when we are talking about 100+ million units. Save 10 cents on the die and you are putting $10+ million straight into your pocket. Now if you are Nvidia, shipping 10-15 million of each die you design, the math changes, because R&D will be a bigger ratio of your COGS.
A single uArch core type will not necessarily be smaller. If you intend to design it to be suitable for a wide performance range, you might have to pay for that with additional die area too. An Atom core might even be larger than an ARM Cortex A15 + A7 combined.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
No, they are not used outside their optimal frequency/performance range. Neither the Cortex A7 nor the A15 will go beyond the max end of their respective optimal range. They will both be at their Highest Operating Point.

If you have a workload that requires the bigger core, then instead of using ALL the available power budget for the big core, a share of the budget will be used by the small core. For that higher performance level, it's obvious that the bigger core is more efficient than the smaller core, because if it weren't, the entire point of the bigger core would be rendered moot. And yet a big.LITTLE design is still sacrificing a bit of the thermal budget on the small core alongside the big core.

Minimal is not the argument here. Minimal > 0, and since we are talking about extremely constrained power environments here, minimal is a lot worse than 0.

And you are assuming a perfect workload for running on the hybrid cores all at the same time; on some workloads you might end up with a performance regression. If two threads are dependent, and one is running on the big core and the other on the small core, then once the big core finishes its job it will have to wait for the small core to deal with its thread, or pay the penalty of moving the thread from the small core to the big core, while a design with a single type of core will load the other thread right away and process it. That's a programmer's nightmare, much worse than Hyper-Threading.

A single uArch core type will not necessarily be smaller. If you intend to design it so it's suitable for a wide performance range, you might have to pay for that with designs requiring additional die area too. An Atom core might even be larger than an ARM Cortex A15 + A7.

You are comparing the core, but the core is just a small part of the SoC. Think about the core, the uncore, cache, etc. big.LITTLE adds a lot of complexity to the design. It's not for nothing that Atom and Krait have very lean die sizes despite being the top performers on the market.

Anyway, I rest my case on the subject.
 
Last edited:

Fjodor2001

Diamond Member
Feb 6, 2010
3,989
440
126
@mrmt:

To make it simple:

1. Please let us know what Intel core is more power efficient than an ARM Cortex A7 at low workloads.

2. Please let us know what CPU core scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.

If your theory is correct, i.e. that Intel with all its billions of R&D budget should be able to develop a single uArch core that matches this, then the questions should be easy for you to answer. If not, your theory is likely not correct.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
1. Please let us know what Intel core is more power efficient than an ARM Cortex A7 at low workloads.
I think the A7 is sort of an exception, because it's an in-order design. I don't know its exact performance/watt, but I'm pretty sure Silvermont can compete with it quite easily. And not only does Silvermont have great performance/watt like the A7 does, it also scales up nicely to the high end of tablets, while the A7 is only slow (but efficient).


2. Please let us know what CPU core scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.
Core is the microarchitecture that best fits your description. It is both the fastest and most efficient at high-performance workloads >130W, and it also scales down to just a few watts (and possibly even lower) for tablets.

But you've got to be realistic. If you're covering almost 2 entire orders of magnitude, you're already doing extremely well with your architecture.

Core is a very interesting microarchitecture: it's made for high-end desktops (highest single-thread performance), but made to be very efficient too: they use a rule where a new feature that costs X% more power must deliver at least 2X% more performance. As a result, Core stays the best-performing microarchitecture while needing less energy, and thus becomes better suited for low power with every die shrink. So you could keep waiting, and odds are Core will at some point become suited for everything from "micro-controllers to high performance server workloads."
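The 2:1 rule described above can be written as a one-line screen. This is a hypothetical helper sketching witeken's paraphrase of the rule, not Intel's actual design process:

```python
def passes_power_rule(perf_gain_pct, power_cost_pct):
    # Paraphrased rule: a feature costing X% more power must deliver
    # at least 2X% more performance to make it into the design.
    return perf_gain_pct >= 2 * power_cost_pct

print(passes_power_rule(3.0, 1.0))   # True: 3% performance for 1% power
print(passes_power_rule(1.5, 1.0))   # False: not worth the power cost
```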

But like I said, if you want 3 orders of magnitude of dynamic range, you're without doubt going to have to make sacrifices and become less optimal at some points along the curve.

So at some point, it's better to just develop a new architecture, which if you're doing it right like Intel does, has a substantial dynamic range.

ARM on the other hand has a whopping ~3 different architectures (A7, A9/A12/A17, A15) built to cover TDPs from 0.5W to 5W, which Intel does, and does better, with just one.

If your theory is correct, i.e. that Intel with all its billions of R&D budget should be able to develop a single uArch core that matches this, then the questions should be easy for you to answer. If not, your theory is likely not correct.
Intel might not fulfill your ideological view of a perfect microarchitecture, but I think they do a great job. If my theory is correct, we should see Core in smartphones within the next few generations. I'm quite sure that could happen, but by that time Intel will already have the much more cost-optimized Atom architecture.
 

NTMBK

Lifer
Nov 14, 2011
10,322
5,352
136
Witeken, don't fool yourself about "three orders of magnitude". I have a Haswell tablet and it's a very nice device, but it runs pretty hot sometimes; this isn't a 7W part, despite the "SDP" marketing. As for the 130W parts, they have considerably higher core counts to spread that TDP between: 8 for the Haswell-E parts, up to 14 for their workstation counterparts. A single Haswell core doesn't go above ~30W, I expect, even with Turbo enabled, unless you start heavily overclocking it. That's roughly an order of magnitude that it scales over, same as any other processor core.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
@mrmt:

To make it simple:

1. Please let us know what Intel core is more power efficient than an ARM Cortex A7 at low workloads.

2. Please let us know what CPU core scales perfectly across the complete performance range, i.e. is performance vs power consumption optimal on every point on the curve, from micro-controller to high performance server workloads.

No problem making things simple, but I do have problems with red herrings like the one you posted here.

I never said that the A7 isn't efficient at low-power workloads, but that it is not more efficient than the A15 or Krait in high-power workloads. But I think we need to enlarge the scope here.

You should not evaluate the A7 at low-power workloads only, but *the entire SoC* against *the entire set of workloads* it is supposed to run. So the A7 should not compete against the A15 or Krait in low-power-only workloads, but in high-power workloads too. It's obvious that the A7 will not be more efficient there, because if it were, SoC makers would add more A7 cores, not A15 cores, to deal with those workloads. And here Krait, Silvermont, Cyclone and others are shining, because despite having to run different levels of workloads, they show comparable or better battery life than big.LITTLE devices *AND* have more performance to offer when the need arises.

But think about it: what are those companies getting with big.LITTLE? Do big.LITTLE devices have comparatively better battery life than Krait or A6 devices? If what you were saying were true, then big.LITTLE devices should have been showing much improved battery life, which is clearly not the case. And are big.LITTLE devices showing comparatively higher performance? No, they aren't. So if you can't extract top performance or top battery life from big.LITTLE devices, what's the point of it from a performance POV?

It's you who is defending big.LITTLE as the theoretical performance/efficiency leader, so it's up to you to show real cases of big.LITTLE devices being more efficient/faster than custom-core SoCs.
 

ams23

Senior member
Feb 18, 2013
907
0
0
That is not to imply that companies using big.LITTLE are incompetent, but that they are bound by a different set of constraints than Apple, Mediatek, Intel or Qualcomm, and because of that they must look for other engineering solutions. big.LITTLE is just one of those solutions.

If you are a small player that won't sell that many chips and need something out of the door as soon as possible, then big.LITTLE might make sense, because your COGS bill won't be that high and you can make more money by launching a product faster than your stronger competitor instead of going against him head on with an inferior design.

This is exactly what Nvidia did with Tegra. They launched dual cores and quad-cores faster than their competitors, and their sales boomed, but then when Qualcomm, Mediatek and others launched their next generation designs, Nvidia sales busted big time.

You aren't making much sense here. First of all, Mediatek has shown off various big.LITTLE designs. Second of all, NVIDIA doesn't actually use big.LITTLE per se, but rather a 4+1 architecture. NVIDIA implemented a low-power companion CPU core in Tegra 3 and Tegra 4 (and the soon-to-come Tegra K1 v1), but not in Tegra 2 nor in the upcoming Tegra K1 v2. The Tegra sales boom in the Tegra 3 generation had nothing to do with the use of a low-power companion core. Rather, they were first to market with a quad-core CPU in the ultra-mobile space, and they also secured the original Nexus 7 design win. Tegra 4 was delayed by quite a few months, which led to the sales dip, but the quad-core Cortex A15 was and is very competitive with the quad-core Krait 400.
 
Last edited: