Intel Broadwell Thread


AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
You have a misunderstanding. PL3 is only reachable for 10ms. That won't affect much longer running benchmarks like Cinebench, although it might help for response rates.

Even for Haswell the datasheet mentions that bit, and the duration is the same at 10ms.

PL2 is what is reachable above TDP levels. It can be set to tens of seconds, and it's 25% above TDP. That's the new one that came with Sandy Bridge chips. Which is what you are saying here:

PL3 may be limited to 10ms continuously, but it could run multiple times per second or per minute if we don't care about battery life or heat. That means PL3 may run for 10ms, then drop to PL2 for a few ms, then run at PL3 again for 10ms and drop to PL2 again, and so on.

If the battery can sustain the load and the device has an adequate heat-sink to sustain that heat load, then there is no limit on how many times PL3 may work per second or per minute.
 

Abwx

Lifer
Apr 2, 2011
11,167
3,862
136
Don't mind his nonsense. That was the rear metal chassis of the tablet. Obviously they don't really cost thousands of dollars, as many retail tablets use the same setup.

ROFL...

Do you actually understand anything about cooling? It was easy to spot the square surface that is in contact with the chip at the center of the back cover...

Anyway, I like the nonsense branding...
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
PL3 may be limited to 10ms continuously, but it could run multiple times per second or per minute if we don't care about battery life or heat. That means PL3 may run for 10ms, then drop to PL2 for a few ms, then run at PL3 again for 10ms and drop to PL2 again, and so on.

If the battery can sustain the load and the device has an adequate heat-sink to sustain that heat load, then there is no limit on how many times PL3 may work per second or per minute.

Straight from the Core M Datasheet:
"When operating in turbo mode, the processor monitors its own power and adjusts the turbo frequencies to maintain the average power within limits over a thermally significant time period."
"Turbo Time Parameter: An averaging constant used for PL1 exponential weighted moving average power calculation"

If PL3 runs for a significant amount of time, enough to affect longer-running benchmark times, it would be reflected in power use, and thus thermals. Regardless of whether it's running at PL2 or PL3, the average still must not exceed TDP, or PL1. Therefore it's inaccurate to say it's a "15W CPU" when the TDP is 4.5W, since it'll have to go back down there anyway. Now Intel might be lying through their teeth and it really goes over the set TDP limits, but if they act according to their own datasheets, what you described can't happen and it's irrelevant.

then there is no limit on how many times PL3 may work per second or per minute.
Sure there is!

Even at PL2 there's a maximum duration it can stay at that level before it has to come BACK DOWN. You can cool it using LN2 and connect it to a 5000W power supply, but after some time it'll have to go back down to PL1.
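
To put rough numbers on that, here is a minimal sketch. It assumes the limit behaves like a plain long-run average (the datasheet describes an exponentially weighted moving average, so this is a simplification). The 4.5W PL1 and 15W PL3 are the figures used in this thread and PL2 is taken as 1.25 x TDP; the 3.0W "resting" level and the function name are hypothetical. It computes the largest fraction of time the chip could spend at PL3 if it spends the rest of the time at some lower level.

```python
def max_pl3_duty_cycle(pl1_w, pl3_w, low_w):
    """Largest fraction of time at pl3_w that keeps the long-run average
    power at or below pl1_w, when the rest of the time is spent at low_w.

    Assumption: a plain time-weighted average stands in for the EWMA that
    the datasheet describes.
    """
    if pl3_w <= pl1_w:
        return 1.0  # PL3 is not above PL1, so it never pushes the average over
    if low_w >= pl1_w:
        return 0.0  # the "low" level already uses up the whole PL1 budget
    # Solve d * pl3_w + (1 - d) * low_w = pl1_w for the duty cycle d.
    return (pl1_w - low_w) / (pl3_w - low_w)


PL1 = 4.5          # W, Core M TDP quoted in this thread
PL3 = 15.0         # W, PL3 figure quoted in this thread
PL2 = 1.25 * PL1   # W, "25% above TDP"

# Alternating PL3 <-> PL2: both levels are above PL1, so no duty cycle works.
print(max_pl3_duty_cycle(PL1, PL3, PL2))   # 0.0

# Alternating PL3 <-> a hypothetical 3.0 W level.
print(max_pl3_duty_cycle(PL1, PL3, 3.0))   # 0.125
```

Under these assumptions each 10ms PL3 burst would have to be followed by about 70ms at 3W, and bouncing between PL3 and PL2 indefinitely would push the average above PL1.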

And again, Haswell has PL3 too.
In that case, with Core M, it has TDP of 4.5/6W and PL3 of 15W. That means Haswell Y is at 11.5W TDP and 40-60W PL3. So what's your point?
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
The first (p)reviews are being published, this one is from a prerelease Transformer T300F. Core M is slightly slower in Cinebench multi than 11.5W (6W SDP) i3-4020Y. There seems to be some discussion about Core M turbo above, but the site reports power consumption <4.5W (although only slightly less than 4.5W). The sustained frequency is 1.3-1.4GHz.

In Cinebench single, Core M (5Y10) is 17% faster than Core i3-4030U (1679MHz average).

Its OpenGL score is 11.6 vs 7.5 for the Z3775 and 15.5 for the i3-4020Y (the site wonders if the drivers are final).

In 3DMark Cloud Gate, Core M is 2X as fast as Bay Trail.

(Source: http://tweakers.net/reviews/3751/6/...et-met-core-m-processor-getest-conclusie.html)
 

SAAA

Senior member
May 14, 2014
541
126
116
Nice chip: it's not even the top bin, but it still manages to double the efficiency or more in most tests, "real world" ones too.
Just look at how close it is to the 4020Y, or even better, how it pummels the Atom models... twice the score in Cinebench single while running 1GHz lower!
 
Mar 10, 2006
11,715
2,012
126
Is anybody here buying a Core M system? If so, would love to hear people's "real world" experience with it.
 

Nothingness

Platinum Member
Jul 3, 2013
2,757
1,405
136
Nice chip: it's not even the top bin, but it still manages to double the efficiency or more in most tests, "real world" ones too.
Just look at how close it is to the 4020Y, or even better, how it pummels the Atom models... twice the score in Cinebench single while running 1GHz lower!
That's interesting, I don't only read it like you do: MT score in Cinebench is only slightly better than Bay Trail and with a similar power consumption. Of course this is due to having half the number of cores, but it's also somewhat disappointing.

PS - To moderate my point: I had missed that it was the 5Y10. I hope the other models will fare better.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Is it similar power consumption? I don't know what 2 Broadwell cores consume, but 4 Silvermont ones peak at 2.5W, maybe that's the price of a much better desktop-class feature set (MT, 2X IPC, AVX2, TSX). In any case, all this performance is within a fanless package. I can't wait for reviews comparing it against Android ARM SoCs. The army's best time seems to have passed, while Intel was absent.
 
Mar 10, 2006
11,715
2,012
126
Intel really needs to ditch that 32nm PCH and integrate it onto the same die as the CPU/GPU complex if it wants to take Core M to the next level and have it be a true "premium" alternative to Atom.

Why tout a process lead when a big portion of your chip is on an n-2 process?
 

Qwertilot

Golden Member
Nov 28, 2013
1,604
257
126
Well presuming they've got some reasonable designers to produce Atom, it has (well will have!) the same process so it won't crush the things

Different sets of compromises/targets of course, with M hitting single thread performance and you'd imagine some much faster turbos/brief burst performance.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Straight from the Core M Datasheet:
"When operating in turbo mode, the processor monitors its own power and adjusts the turbo frequencies to maintain the average power within limits over a thermally significant time period."
"Turbo Time Parameter: An averaging constant used for PL1 exponential weighted moving average power calculation"

If PL3 runs for a significant amount of time, enough to affect longer-running benchmark times, it would be reflected in power use, and thus thermals. Regardless of whether it's running at PL2 or PL3, the average still must not exceed TDP, or PL1. Therefore it's inaccurate to say it's a "15W CPU" when the TDP is 4.5W, since it'll have to go back down there anyway. Now Intel might be lying through their teeth and it really goes over the set TDP limits, but if they act according to their own datasheets, what you described can't happen and it's irrelevant.

Sure there is!

Even at PL2 there's a maximum duration it can stay at that level before it has to come BACK DOWN. You can cool it using LN2 and connect it to a 5000W power supply, but after some time it'll have to go back down to PL1.

And again, Haswell has PL3 too.
In that case, with Core M, it has TDP of 4.5/6W and PL3 of 15W. That means Haswell Y is at 11.5W TDP and 40-60W PL3. So what's your point?

First of all, let's understand what TDP is, and that TDP is configurable in Core-M.

From Intel Core-M 14nm Datasheet

http://www.intel.com/content/www/us/en/processors/core/core-m-processor-family-datasheet-vol-1.html
5.0 Thermal Management
The thermal solution provides both component-level and system-level thermal
management. To allow for the optimal operation and long-term reliability of Intel
processor-based systems, the system/processor thermal solution should be designed
so that the processor:
• Remains below the maximum junction temperature (TjMax) specification at the
maximum thermal design power (TDP).

• Conforms to system constraints, such as system acoustics, system skin-temperatures,
and exhaust-temperature requirements.
Caution: Thermal specifications given in this chapter are on the component and
package level and apply specifically to the processor. Operating the processor outside
the specified limits may result in permanent damage to the processor and potentially
other components in the system.

5.1 Thermal Considerations
The processor TDP is the maximum sustained power that should be used for design of
the processor thermal solution. TDP is a power dissipation and junction temperature
operating condition limit, specified in this document, that is validated during
manufacturing for the base configuration when executing a near worst case
commercially available workload as specified by Intel for the SKU segment. TDP may
be exceeded for short periods of time or if running a "power virus" workload.

The processor integrates multiple processing and graphics cores and PCH on a single
package. This may result in differences in the power distribution across the die and
must be considered when designing the thermal solution.

Intel® Turbo Boost Technology 2.0 allows processor cores and processor graphics
cores to run faster than the guaranteed frequency. It is invoked opportunistically and
automatically as long as the processor is conforming to its temperature, power
delivery, and current specification limits. When Intel Turbo Boost Technology 2.0 is
enabled:
• Applications are expected to run closer to TDP more often as the processor will
attempt to maximize performance by taking advantage of available TDP headroom
in the processor package.

• The processor may exceed the TDP for short durations to use any available
thermal capacitance within the thermal solution. The duration and time of such
operation can be limited by platform runtime configurable registers within the
processor.

• Thermal solutions and platform cooling that are designed to less than thermal
design guidance may experience thermal and performance issues since more
applications will tend to run at or near TDP for significant periods of time.
Note: Intel Turbo Boost Technology 2.0 availability may vary between the different SKUs.


1. TDP is the maximum sustained power that should be used for design of
the processor thermal solution. TDP is a power dissipation and junction temperature operating condition limit.

2. The processor may exceed the TDP for short durations to use any available
thermal capacitance within the thermal solution. The duration and time of such
operation can be limited by platform (a) runtime configurable registers within the
processor.

2(a) refers to PLs

Now let's have a look at the PLs

5.3.1 Package Power Control

The package power control settings of PL1, PL2, and PL3 Tau allow the designer to
configure Intel Turbo Boost Technology 2.0 to match the platform power delivery and
package thermal solution limitations.


3. PL1, PL2 and PL3 can be adjusted by the designer to match platform power delivery and thermal solution limitations.

5.3.1 (continued)
• Power Limit 1 (PL1): A threshold for average power that will not exceed -
recommend to set to equal TDP power. PL1 should not be set higher than thermal
solution cooling limits.
• Power Limit 2 (PL2): A threshold that if exceeded, the PL2 rapid power limiting
algorithms will attempt to limit the spike above PL2.
• Power Limit 3 (PL3): A threshold that if exceeded, the PL3 rapid power limiting
algorithms will attempt to limit the duty cycle of spikes above PL3 by reactively
limiting frequency. This is an optional setting.

• Turbo Time Parameter (Tau): An averaging constant used for PL1 exponential
weighted moving average (EWMA) power calculation.
Notes:
• Implementation of Intel® Turbo Boost Technology 2.0 only requires configuring
PL1, PL1 Tau and PL2.
• See the Turbo Implementation guide and BIOS Writers Guide (BWG) for additional
details on use in your system (see related documents section).
• PL3 is disabled by default.


4. PL3: A threshold that, if exceeded, the PL3 algorithm will attempt to limit the duty cycle of spikes above PL3 by reactively limiting frequency (throttle down).



5.3.2 Turbo Time Parameter
Turbo Time Parameter is a mathematical parameter (units in seconds) that controls
the Intel Turbo Boost Technology 2.0 algorithm using moving average of energy
usage. During a maximum power turbo event of about 1.25 x TDP, the processor
could sustain PL2 for up to approximately 1.5 times the Turbo Time Parameter. If the
power value and/or Turbo Time Parameter is changed during runtime, it may take
approximately 3 to 5 times the Turbo Time Parameter for the algorithm to settle at the
new control limits. The time varies depending on the magnitude of the change and
other factors. There is an individual Turbo Time Parameter associated with Package
Power Control.
----------------
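
The "approximately 1.5 times the Turbo Time Parameter" figure in 5.3.2 can be roughly reproduced. This is a minimal sketch, assuming the PL1 control is a simple exponentially weighted moving average with time constant Tau, that the average starts from 0W (idle), and that the chip draws exactly PL2 = 1.25 x TDP during the burst. The datasheet text quoted here does not spell out the actual algorithm, and the 28s Tau and the function name are hypothetical.

```python
import math


def seconds_at_pl2(pl1_w, pl2_w, tau_s, start_avg_w=0.0, dt_s=0.001):
    """Simulate how long power can sit at pl2_w before an EWMA of power
    (time constant tau_s) climbs to pl1_w. Assumed model, not Intel's code."""
    if pl2_w <= pl1_w:
        return math.inf  # the average can never reach PL1
    if start_avg_w >= pl1_w:
        return 0.0       # no headroom left to begin with
    alpha = 1.0 - math.exp(-dt_s / tau_s)  # per-step EWMA weight
    avg, t = start_avg_w, 0.0
    while avg < pl1_w:
        avg += alpha * (pl2_w - avg)  # move the average toward current power
        t += dt_s
    return t


TDP = 4.5    # W, Core M TDP quoted in this thread
TAU = 28.0   # s, hypothetical Turbo Time Parameter
t = seconds_at_pl2(TDP, 1.25 * TDP, TAU)

print(round(t / TAU, 2))         # simulated burst length in units of Tau
print(round(math.log(5.0), 2))   # closed form: ln(PL2 / (PL2 - PL1)) = ln 5
```

Both values come out at about 1.6 Tau, which is in line with the datasheet's "approximately 1.5 times". If the average starts higher than idle, the burst is shorter.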

Now, let's see the back plate of that tablet again. Remember that the back plate acts as a heat-sink.





Quote from the tt article
This device is actually cooled through its back plate.


The backplate is high grade aluminum and acts as a giant heat sink.
-------------





Now, from 1,2,3,4

The processor will be able to exceed TDP and even spike OVER PL3, then it will throttle down.
It can operate at PL3 for up to 10ms if the thermal and current limits are not breached.
Then it will revert to the lower PL2 until temperature or current drops back within limits. Then it can again reach PL3 for up to another 10ms, if thermal and current limits allow it.
Then it will revert to PL2 until temperature or current drops again, so it can return to PL3.
How many times it can cycle like this (PL3, then PL2, then PL3, etc.) depends on the heat-sink and the ambient temperature.
With that aluminum backplate, the lower the ambient temperature, the longer it can keep operating at PL3. Also, that aluminum backplate has a higher thermal capacity, allowing the processor to reach the PL3 state for extended times.
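
The heat-sink part of this can be illustrated with a first-order lumped thermal model. This is only a sketch: the thermal resistance, heat capacity, temperature limit, and function name are hypothetical values chosen for illustration, not measurements of this tablet. It also covers the thermal limit only, not the PL1/Tau power averaging, which applies separately.

```python
import math


def seconds_until_limit(power_w, r_c_per_w, c_j_per_c, t_amb_c, t_limit_c):
    """Time for a single thermal mass, starting at ambient, to heat up to
    t_limit_c under constant power:
        T(t) = T_amb + P * R * (1 - exp(-t / (R * C)))
    Returns math.inf if the steady-state temperature never exceeds the limit.
    """
    if t_limit_c <= t_amb_c:
        return 0.0
    rise_needed = t_limit_c - t_amb_c
    rise_steady = power_w * r_c_per_w  # temperature rise at steady state
    if rise_steady <= rise_needed:
        return math.inf
    return -r_c_per_w * c_j_per_c * math.log(1.0 - rise_needed / rise_steady)


# All values below are hypothetical.
R = 10.0       # degC per W, heat-sink to ambient
LIMIT = 95.0   # degC

base = seconds_until_limit(15.0, R, 20.0, 25.0, LIMIT)         # 20 J/degC plate
bigger_mass = seconds_until_limit(15.0, R, 40.0, 25.0, LIMIT)  # twice the heat capacity
cooler_room = seconds_until_limit(15.0, R, 20.0, 15.0, LIMIT)  # 10 degC cooler ambient
at_tdp = seconds_until_limit(4.5, R, 20.0, 25.0, LIMIT)        # sustained 4.5 W

print(round(base, 1), round(bigger_mass, 1), round(cooler_room, 1), at_tdp)
```

In this model, doubling the heat capacity doubles the time before the limit is hit, and a cooler room also extends it, while at the 4.5W TDP the limit is never reached.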

PS: I never said the CPU is 15W TDP. But that heat-sink will allow it to sustain long periods of PL3 and PL2.
 

Nothingness

Platinum Member
Jul 3, 2013
2,757
1,405
136
Is it similar power consumption? I don't know what 2 Broadwell cores consume, but 4 Silvermont ones peak at 2.5W, maybe that's the price of a much better desktop-class feature set (MT, 2X IPC, AVX2, TSX). In any case, all this performance is within a fanless package.
Oh I thought it was supposed to consume 4.5W as you wrote. Did I misunderstand? And I agree on the cost of the added features, I just wanted to balance the claim of the post I answered to

I can't wait for reviews comparing it against Android ARM SoCs. The army's best time seems to have passed, while Intel was absent.
You mean something like this?
 

Khato

Golden Member
Jul 15, 2001
1,225
280
136
Intel really needs to ditch that 32nm PCH and integrate it onto the same die as the CPU/GPU complex if it wants to take Core M to the next level and have it be a true "premium" alternative to Atom.

Why tout a process lead when a big portion of your chip is on an n-2 process?
My understanding is that there's not as much gain to be had from moving the PCH to the leading edge, due to it being primarily I/O limited. Now if they started making it a true SoC with all the other miscellaneous logic that accompanies such, then it would start making more sense.

You mean something like this?
If only Geekbench results actually meant something on comparisons between ARM and x86.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
AtenRa said:
Way too busy and I can't follow without getting a headache

Come on AtenRa, why are you so cruel with the formatting? It is way too busy. I count

Normal Font, Bolded Font, Italic Font, URL Underline Font,
Black, Bolded Black, Magenta, Bolded Magenta, Red, Bolded Red, URL Blue,
Dashed Lines, Quote Boxes, Images, Different Fonts.




That is way too much. I normally like some breaks and changes in formatting, for it helps my dyslexic mind, but what you wrote was unreadable and gave me a headache. Having 15 style changes does not help; please try to keep it more limited, like around 5 or so. If you had previewed your post you would probably have noticed how chaotic your formatting was.

Whatever, it is your post and you can do what you want, but understand that sometimes less is more and there is such a thing as going overboard.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Come on AtenRa, why are you so cruel with the formatting? It is way too busy. I count

Normal Font, Bolded Font, Italic Font, URL Underline Font,
Black, Bolded Black, Magenta, Bolded Magenta, Red, Bolded Red, URL Blue,
Dashed Lines, Quote Boxes, Images, Different Fonts.




That is way too much. I normally like some breaks and changes in formatting, for it helps my dyslexic mind, but what you wrote was unreadable and gave me a headache. Having 15 style changes does not help; please try to keep it more limited, like around 5 or so. If you had previewed your post you would probably have noticed how chaotic your formatting was.

Whatever, it is your post and you can do what you want, but understand that sometimes less is more and there is such a thing as going overboard.

I'm sorry, but that was actually as small as I could make it.

To help you, just read 1,2,3,4 and the final part.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
I'm sorry, but that was actually as small as I could make it.

To help you, just read 1,2,3,4 and the final part.

Okay (trying again). I knew all that already, but it is important for everyone to know it. Stuff like this is the reason I am not judging Broadwell until we have actual devices in hand. You can't make accurate predictions on its performance, because you can't take the info and extrapolate from it: Intel is trying to make it variable in power/performance on purpose (and doing so is a smart decision; in this case dynamic power is all upside and no downside).

You mean something like this?

One problem with that Geekbench comparison is that Project Denver is 4x faster in SHA, which is cryptography. Is this a real-world task comparing real-world problems?
 

Nothingness

Platinum Member
Jul 3, 2013
2,757
1,405
136
If only Geekbench results actually meant something on comparisons between ARM and x86.
Apart from [SD]GEMM (something I would dispute, but that's not the place), what makes the comparison invalid according to you?
 

Khato

Golden Member
Jul 15, 2001
1,225
280
136
One problem with that Geekbench comparison is that Project Denver is 4x faster in SHA, which is cryptography. Is this a real-world task comparing real-world problems?

The larger problem is that x86 isn't receiving anywhere near the level of available optimization/acceleration as ARM. In fact, judging by the numbers sisoft is reporting - http://www.sisoftware.co.uk/?d=qa&f=cpu_sha_mb - I wouldn't be surprised if Geekbench is running SHA1 code that's in the 'single buffer ALU' category. So most programs which actually use a SHA1 library that supports AVX2 would see over 6x the SHA1 performance that Geekbench reports.

Apart from [SD]GEMM (something I would dispute, but that's not the place), what makes the comparison invalid according to you?

As per above, the SHA results appear to be using equally non-optimized code for x86.

Edit: To be fair, I have no evidence that Geekbench is fully optimized for ARM as that level of detail on the various ARM architectures is more difficult to come by. But there's plenty of information regarding optimization of these algorithms (specifically SHA and S/DGEMM) on x86, and from such it's quite evident that Geekbench code is only making use of a fraction of the available performance of modern x86 processors.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Oh I thought it was supposed to consume 4.5W as you wrote. Did I misunderstand? And I agree on the cost of the added features, I just wanted to balance the claim of the post I answered to
Not sure what you're referring to, but comparing fanless SoCs is quite straightforward. You just have to wait until they can't use turbo anymore, so they use as much as the devices can handle, and then you'll see their performance at that state. To me it's clear that Silvermont had a lot more potential because tablets can consume more than 2.5W.

You mean something like this?
Real reviews, like AnandTech.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
You can't make accurate predictions on its performance

I didn't make any predictions about performance. I made an analysis of how PL3 could work using Intel's available data, and how that aluminum backplate can play a role in the device's performance.
 

Nothingness

Platinum Member
Jul 3, 2013
2,757
1,405
136
Edit: To be fair, I have no evidence that Geekbench is fully optimized for ARM as that level of detail on the various ARM architectures is more difficult to come by. But there's plenty of information regarding optimization of these algorithms (specifically SHA and S/DGEMM) on x86, and from such it's quite evident that Geekbench code is only making use of a fraction of the available performance of modern x86 processors.
I can guarantee the code for DGEMM and SGEMM on ARM is just plain horrible and could also be made much faster. But the aim of Geekbench is not to get the absolute fastest code for a given subtest, so IMHO it wouldn't make sense to overtune one of the benchmarks.

As far as optimization goes, jfpoole has provided a list of compiler flags he uses here and that looks fair to me.
 

Nothingness

Platinum Member
Jul 3, 2013
2,757
1,405
136
Not sure what you're referring to, but comparing fanless SoCs is quite straightforward.
You were writing this:
Is it similar power consumption? I don't know what 2 Broadwell cores consume, but 4 Silvermont ones peak at 2.5W
But you have also written this:
There seems to be some discussion about Core M turbo above, but the site reports power consumption <4.5W (although only slightly less than 4.5W).
So to me it meant that Core M is less power efficient than BT on multi-threaded Cinebench.
 