Discussion Intel current and future Lakes & Rapids thread

Page 394

RTX

Member
Nov 5, 2020
90
40
61
Is there no Xeon matching the 11900K in this generation, the way the W-1290P matched the 10900K with identical specs?
 

DrMrLordX

Lifer
Apr 27, 2000
21,797
11,144
136
It will be a success based on how much effort they put into it. Intel's 14nm would be a massive improvement for many designs. The majority of TSMC revenue comes from 16nm and older processes. I think Intel's soon-to-be-spare 14nm could compete well in that space if they help potential customers get their designs working on the process.

Intel is building two entirely new fabs in AZ:


The curious thing here is that the announced fabs are not geared towards Intel's IDM model. It's unclear what nodes will be produced here, or for which customers. Intel's previous foundry efforts have mostly met with failure.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
You want a bet or something? Tremont is already at Ivy Bridge levels. Another 30% gets us to Skylake.

Do you know what makes that possible? Because ARM cores can do it.

Nope. Let's not even get that far. Their own "little" core team is owning them..

You are delusional if you honestly believe you can get the same efficiency from a core design using the x86/64 ISA compared to the ARM ISA - everything else being the same. Just look at Lakefield to understand where Tremont stands compared to generic synthesizable ARM cores from even a few years back.
My expectation is that Gracemont might roughly match Cortex A76 IPC, if lucky - at worse power and size.
 
Reactions: spursindonesia

Thala

Golden Member
Nov 12, 2014
1,355
653
136
Lakefield had problems, yes. But complaining about rendering performance on a 7 W device is probably the worst possible argument you could make against Lakefield. Honestly who buys a laptop with low performance and long battery life with the intention of fast image rendering? That would be like a high-end restaurant buying their ingredients from McDonald's down the street, doing poorly, then complaining that therefore the McDonald's food could not possibly be successful.

That would be an argument if something like the Qualcomm 8cx (which uses core designs roughly 2 years older), also at 7 W TDP, did not outperform Lakefield by a substantial amount. On top of this, its battery lasts significantly longer even while outperforming Lakefield (e.g. under load), which points to much higher power efficiency.
 

dullard

Elite Member
May 21, 2001
25,203
3,617
126
battery lasting significantly longer even while outperforming Lakefield (e.g. under load), which points to a much higher power efficiency.
See, now that is a great argument about Lakefield. Lakefield was supposed to be used in laptops that can run a full 24-hour day. It can't; it only gets 17 hours max.

My point was that to be taken seriously, arguments need to be based in actual performance numbers that matter to the buyers of the device. Power efficiency is an important performance metric that matters to Lakefield buyers.
 

uzzi38

Platinum Member
Oct 16, 2019
2,702
6,405
146
You are delusional if you honestly believe you can get the same efficiency from a core design using the x86/64 ISA compared to the ARM ISA - everything else being the same. Just look at Lakefield to understand where Tremont stands compared to generic synthesizable ARM cores from even a few years back.
My expectation is that Gracemont might roughly match Cortex A76 IPC, if lucky - at worse power and size.
Lakefield is a terrible comparison point, given that the node it's on completely blows, among other issues with the design.

Let me just link a research paper on the topic, which you should read through yourself: hpca13-isa-power-struggles.pdf (wisc.edu). Ultimately the TL;DR is that ISA does not have a significant effect on efficiency. The uArch is what has by far the largest effect on energy efficiency (when normalising the designs to the same node).
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Lakefield had problems, yes. But complaining about rendering performance on a 7 W device is probably the worst possible argument you could make against Lakefield. Honestly who buys a laptop with low performance and long battery life with the intention of fast image rendering?

It was compared in tests to other low-wattage devices and found to be lacking in Cinebench performance.
I specifically singled out Cinebench because it shows whether the Atom core is fit to act as a multicore performance booster alongside Golden Cove in a desktop setting. Tremont is simply not ready for those tasks. What else is there for an 8+8 desktop CPU if we remove rendering? Encoding videos? Compression?

I am almost certain that the best first step for all desktop Alder Lake buyers will be going into the BIOS and disabling the Atom cores. Performance will be way more consistent and not at the mercy of the Windows scheduler.
 
Reactions: Tlh97 and coercitiv

dmens

Platinum Member
Mar 18, 2005
2,271
917
136
Lakefield is a terrible comparison point, given that the node it's on completely blows, among other issues with the design.

Let me just link a research paper on the topic, which you should read through yourself: hpca13-isa-power-struggles.pdf (wisc.edu). Ultimately the TL;DR is that ISA does not have a significant effect on efficiency. The uArch is what has by far the largest effect on energy efficiency (when normalising the designs to the same node).

You drew the wrong conclusion. The paper says, our models show uarch has a massive effect on power/perf, and, x86 can be low(er) power. True enough. What it cannot refute is that the x86 handling is a significant design burden across the whole product spectrum, from low power to high performance. Academic papers using abstract performance models cannot see that.
 
Reactions: Tlh97 and Viknet

Hulk

Diamond Member
Oct 9, 1999
4,372
2,246
136
I am almost certain that the best first step for all desktop Alder Lake buyers will be going into the BIOS and disabling the Atom cores. Performance will be way more consistent and not at the mercy of the Windows scheduler.

I'm not following? How would reducing the amount of available compute in a CPU result in better performance? It seems as though this would be a massive mistake for Intel if they spent tens of millions of dollars on engineering, design, and production, trying to optimize every square millimeter of die space and then produce a product that would perform better if some of that die area was turned off.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
With the Lakefield comparison, in addition to the borked process, don't forget about uncore power.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
I'm not following? How would reducing the amount of available compute in a CPU result in better performance? It seems as though this would be a massive mistake for Intel if they spent tens of millions of dollars on engineering, design, and production, trying to optimize every square millimeter of die space and then produce a product that would perform better if some of that die area was turned off.

The same way disabling power saving features like downclocking improves performance and performance consistency. If it takes the CPU and OS 10-15 ms to realize the load is heavy enough to ramp from 800 MHz to 5 GHz, that is billions of clock cycles missed.

The same applies to the OS wrongly scheduling a task on the weaker core(s). It has to move threads from one CPU to another, which means a different L2 and cache misses. And in a stock configuration, the big core also needs to wake from power save modes, ramp clocks and so on. Some non-deterministic things can also happen, like a critical GPU driver thread being stuck on a small core and the OS deciding to keep it there because it has a history of being idle. Too bad a GPU-heavy game is running now, your FPS is somehow halved, and you take off to Reddit and the forums to blame AMD.

All that is avoided by not having to choose at all.
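
For anyone who wants to experiment without a trip to the BIOS: a less drastic way to take the choice away from the scheduler is to restrict a process to a fixed set of logical CPUs. Below is a minimal sketch in Python, assuming Linux (`os.sched_setaffinity` is not available on Windows or macOS) and assuming you already know which logical CPU numbers belong to the big cores. The helper name `pin_current_process` is made up for illustration; nothing here reflects how Alder Lake or its scheduler support will actually work.

```python
import os


def pin_current_process(preferred_cpus):
    """Restrict the calling process to the given logical CPUs (Linux only).

    Returns the set of CPUs the process is allowed to run on afterwards.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available on this platform")

    allowed = os.sched_getaffinity(0)       # CPUs we may currently run on
    target = set(preferred_cpus) & allowed  # drop CPUs we cannot use anyway
    if not target:
        raise ValueError("none of the requested CPUs are available to this process")

    os.sched_setaffinity(0, target)         # pid 0 means the calling process
    return os.sched_getaffinity(0)
```

Calling `pin_current_process(range(0, 16))` would keep the process on logical CPUs 0-15, which only helps if those happen to be the big cores on a given system (an assumption; check the real topology first). Unlike disabling cores in the BIOS, this leaves the small cores available to everything else.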
 
Reactions: Tlh97 and scineram

dr1337

Senior member
May 25, 2020
385
639
136
I'm 100% confident Intel can do a working big.LITTLE design. There's no reason they should be limited by Windows (or any OS) in any meaningful sense.

I'd like to think we're well past the days of half-baked implementations that only sorta work from billion-dollar companies. 8+8 Alder Lake could be slower in MT than a 12-core Zen 4 chip and I'd believe it. But I don't think it's going to have any major functional issues from the small cores holding the big ones back, outside of power budget. IMO the extra MT performance is better than the slight clock bump you might get if you did turn the little cores off. And that's all assuming Intel even lets people turn them off.
 

dullard

Elite Member
May 21, 2001
25,203
3,617
126

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
I could probably list dozens of more ways. I don't know if any of those are the route that will be taken, but there are plenty of options available to eliminate any possibility of penalty.

To each his own. I prefer predictable performance over some imaginary extension in MT power on desktop that I did not ask for (no need to ask, 34 cores of Intel/AMD sit within 5 m). To that end I lock core/uncore frequency and put Windows on the 100% performance plan. On 10C CML I even go as far as disabling HT.
Sure, MS and Intel can design a power plan and scheduler mode called "Ignore small cores until big cores are 100% full". That would be best for desktop users, but most likely we will get "Balanced" instead.

P.S. I don't destroy power efficiency, unlike those who disable C1E or deeper package states: my CPU is still quite efficient, despite having a static voltage OC:


My CPU just takes hundreds of microseconds instead of a dozen milliseconds to spring into action.
 
Reactions: Tlh97

lobz

Platinum Member
Feb 10, 2017
2,057
2,856
136
Well.

Intel decided to put a slide into the Ponte Vecchio presentation with the phrase: 47 Magical Tiles in it.

So I'm starting to think we'll see 7nm EUV Intel products maybe in 2025? Maybe.

What's it gonna be in the Sapphire Rapids presentation? Breakthrough Server Solution With Forty-Four Phantasmal Cores?
 

coercitiv

Diamond Member
Jan 24, 2014
6,390
12,814
136
I'm not following? How would reducing the amount of available compute in a CPU result in better performance? It seems as though this would be a massive mistake for Intel if they spent tens of millions of dollars on engineering, design, and production, trying to optimize every square millimeter of die space and then produce a product that would perform better if some of that die area was turned off.
@JoeRambo is more concerned about performance consistency than he is about the extra MT performance brought by the little cores. His example with frequency and sleep states control shows how the same system can be configured to get the same results in classic throughput benchmarks and yet feel more or less responsive.

Many Intel Skylake DIY systems aren't properly configured for the best blend of responsiveness and efficiency:
  1. stock behavior usually ends up with most sleep states disabled, and frequency controlled by the OS (Balanced Profile). The problem here is that sleep states save more power than lower clocks, and the OS takes a while to ramp up the cores to optimal frequency - think tens of milliseconds. Some users learn to move to the more aggressive High Performance profile, which kills idle efficiency instead.
  2. enabling sleep states in BIOS usually puts the system in a more favorable position, since idle power consumption is way lower... and this also potentially enables SpeedShift - which is a quicker, hardware-based CPU frequency control mechanism from Intel. Frequency is still variable, but ramping the CPU up is faster by an order of magnitude, to around 1 ms, as the OS no longer (fully) controls P-States.
  3. further customization can be done by keeping Sleep States but making sure SpeedShift is disabled and the OS does not scale CPU clocks. This ensures the CPU is as responsive as possible while still enjoying most of the energy saving benefits.
I went with option 2, while JoeRambo probably went with option 3. Regular users may get served with option 1, meaning less responsiveness and less efficiency combined (but more stability for auto-overclocks by means of enabling MCE).
 

Hulk

Diamond Member
Oct 9, 1999
4,372
2,246
136
I understand that frequency and sleep state controls are designed to reduce performance when the system deems it not needed in order to save energy. I'm not sure I totally follow the analogy to the small cores also actually being a performance detriment when active. It has been my experience that unless you get really aggressive with all of the power saving settings, most of the time they operate behind the scenes and are not felt by the user.

But then again there are a lot of questions that won't be answered until Alder Lake is officially released and this is just one to add to the list.
 

dullard

Elite Member
May 21, 2001
25,203
3,617
126
I understand that frequency and sleep state controls are designed to reduce performance when the system deems it not needed in order to save energy. I'm not sure I totally follow the analogy to the small cores also actually being a performance detriment when active.
It is up to Intel to put up or shut up on this topic when Alder Lake launches. Intel claims: "Alder Lake will involve Intel’s next generation hardware scheduler, which we are told will be able to leverage all cores for performance and make it seamless to any software package." However, many people on AnandTech feel that claim is not possible, or at least will not be met.

 
Reactions: Elfear

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I understand that on desktop Alder Lake will struggle because it still keeps the same core configs as the high-end laptop parts. I'd have liked to see 8+16. I see Alder Lake as an analogue to Presler - a decent advancement, but still a lot of work to do. I do not expect competitive desktop chips from Intel until 2023 or later.

But laptops are different. My XPS 12 sometimes runs games better in Balanced mode than it does in High Performance. Leaving the device alone to do its thing works pretty well on power-constrained systems.

In the mobile area I expect Alderlake to be a pretty nice chip.

According to this graph, Lakefield did horribly with regard to power management, and the goals for Alder Lake-M are much better: https://adoredtv.com/exclusive-alder-lake-m-and-p-power-consumption-figures/

So internally Lakefield might have missed targets. Even Tigerlake does better.

You can see how efficient Amberlake is in comparison:

  1. enabling sleep states in BIOS usually puts the system in a more favorable position, since idle power consumption is way lower... and this also potentially enables SpeedShift - which is a quicker, hardware based CPU frequency control mechanic from Intel. Frequency is still variable, but ramping the CPU up is faster by an order of magnitude to around 1ms as the OS no longer (fully) controls P-States.
Just to add: SpeedShift is adjustable. You can set it to 1 for the fastest response or 255 for the slowest. You can disable SpeedStep but keep SpeedShift for optimal battery life on laptops, with a number around the middle (80-100) for a balance of performance and power usage.
 
Apr 30, 2020
68
170
76
P.S. I don't destroy power efficiency, unlike those who disable C1E or deeper package states: my CPU is still quite efficient, despite having a static voltage OC:
View attachment 41743

My CPU just takes hundreds of microseconds instead of a dozen milliseconds to spring into action.
Your "power efficient" CPU is using 1.5x the entire idle desktop power budget of modern laptops. A single extra watt of power on a mobile CPU can tank battery life by 2-3 hours.
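
To put rough numbers on the one-watt claim, here is a small sketch of the arithmetic. The 50 Wh battery and the 4 W average system draw are assumed figures picked for illustration, not measurements of any particular laptop, and `battery_hours` is just a helper name for this example.

```python
def battery_hours(capacity_wh, avg_power_w):
    """Runtime in hours for a battery of capacity_wh at a constant average draw."""
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")
    return capacity_wh / avg_power_w


CAPACITY_WH = 50.0   # assumed battery capacity
BASELINE_W = 4.0     # assumed average system draw in light use

baseline = battery_hours(CAPACITY_WH, BASELINE_W)           # 12.5 hours
with_extra = battery_hours(CAPACITY_WH, BASELINE_W + 1.0)   # 10.0 hours
hours_lost = baseline - with_extra                          # 2.5 hours

print(f"{baseline:.1f} h -> {with_extra:.1f} h, {hours_lost:.1f} h lost to one extra watt")
```

The lower the baseline draw, the more a fixed extra watt hurts, which is why the same watt that is a rounding error on a desktop matters on a thin laptop.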
The same way disabling power saving features like downclocking improves performance and performance consistency. If it takes the CPU and OS 10-15 ms to realize the load is heavy enough to ramp from 800 MHz to 5 GHz, that is billions of clock cycles missed.

The same applies to the OS wrongly scheduling a task on the weaker core(s). It has to move threads from one CPU to another, which means a different L2 and cache misses. And in a stock configuration, the big core also needs to wake from power save modes, ramp clocks and so on. Some non-deterministic things can also happen, like a critical GPU driver thread being stuck on a small core and the OS deciding to keep it there because it has a history of being idle. Too bad a GPU-heavy game is running now, your FPS is somehow halved, and you take off to Reddit and the forums to blame AMD.
10-15 ms is about 1 frame at 60 FPS. A 10-15 ms delay going from 800 MHz to 5 GHz is also millions of clock cycles, not billions.
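
The arithmetic behind that correction, as a quick sketch. It makes the simplifying assumption that the core sits at 800 MHz for the entire delay and then jumps straight to 5 GHz, which real ramps do not do; the function name is made up for this example.

```python
LOW_HZ = 800_000_000      # idle clock from the post above
HIGH_HZ = 5_000_000_000   # boost clock from the post above


def cycles_missed(delay_ms, low_hz=LOW_HZ, high_hz=HIGH_HZ):
    """Cycles not executed because the core stayed at low_hz for delay_ms."""
    return (high_hz - low_hz) * delay_ms // 1000


for delay_ms in (10, 15):
    print(f"{delay_ms} ms at 800 MHz instead of 5 GHz: "
          f"{cycles_missed(delay_ms):,} cycles missed")

frame_ms = 1000 / 60      # frame time at 60 FPS, roughly 16.7 ms
print(f"one frame at 60 FPS: {frame_ms:.1f} ms")
```

That works out to 42 million cycles for a 10 ms delay and 63 million for 15 ms, per core and per ramp event. Reaching a billion would take on the order of twenty such events.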
 
Reactions: scineram

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Your "power efficient" CPU is using 1.5x the entire idle desktop power budget of modern laptops. A single extra watt of power on a mobile CPU can tank battery life by 2-3 hours.

By the way, since he has a 10900K, he'll be hard pressed to get it under 7.5W for CPU package power. 7.5W idle and 200W max load is an awesome dynamic range.

For that level of performance it's quite efficient.

Many -H series laptops are not much below the power figures he's getting.

Also, if you are idling on desktop, the power supply efficiency goes down and you'll waste a few W just because it falls outside the power supply's optimal efficiency curve.

If you aren't on the battery it's really not a huge deal.
 
Reactions: Tlh97

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
By the way, since he has a 10900K, he'll be hard pressed to get it under 7.5W for CPU package power. 7.5W idle and 200W max load is an awesome dynamic range.

Closing messenger programs and SSH tunnels resulted in even better usage:


I've seen down to 3.1W actually, so dynamic range is even more awesome - 3W to 250W under stress tests.
But if I limit C-States in BIOS down to package C6 (from C10), it won't go below ~30 W. It is 5.1 GHz at ~1.33 V with a 4.7 GHz uncore, and SA/IO voltages are also raised to support DDR4 3900.
I think it shows just how efficient these CPUs are when idling and properly configured. What matters is how many clocks are "unhalted" and actually executing instructions.
(talking about the desktop context here, I am aware of laptop CPUs that are sipping hundreds of milliwatts)

10-15 ms is about 1 frame at 60 FPS. A 10-15 ms delay going from 800 MHz to 5 GHz is also millions of clock cycles, not billions.

It involves a wake-up on every core. Still not billions, because not all of them will be loaded. But I meant that such wake-ups keep wasting cycles all the time, not just during the first one.
And it is not gaming I'd be worried about, since it pegs all cores at max clocks, but rather everyday use.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Closing messenger programs and SSH tunnels resulted in even better usage:
I've seen down to 3.1W actually, so dynamic range is even more awesome - 3W to 250W under stress tests.

Yea that's awesome.

The -U and -Y chips use lower-leakage parts but actually use more power at the same frequency in the higher end of the clock range. It's a trade-off that's worth it for laptops, as the majority of the time it's idle and leakage matters a lot. Plus it has more advanced C-states.

C6/C7 on 4th Gen core and later laptops are good enough for <1W.

As I said though, in laptops the Balanced mode might allow it to perform even better. High Performance turns off all power saver modes and sometimes ends up detrimental to performance since it has less headroom.
 

coercitiv

Diamond Member
Jan 24, 2014
6,390
12,814
136
I'm not sure I totally follow the analogy to the small cores also actually being a performance detriment when active.
It's quite possible the small cores will be a performance detriment for some workloads and a performance enhancement for others:
  • detriment for latency sensitive workloads and/or workloads that do not scale properly past 8 threads (the latency sensitive part is speculation at this point)
  • enhancement for throughput oriented workloads that scale well with more than 8 threads
Of course we need to see actual implementation to see which way the ratio goes, but you can rest assured hybrid will be a trade-off.

It has been my experience that unless you get really aggressive with all of the power saving settings, most of the time they operate behind the scenes and are not felt by the user.
They are felt by the user, that's why both Intel and AMD invested R&D into making frequency and sleep state transition faster by migrating the decision process from software to hardware. Humans can sense a 10-30ms delay, especially as it adds up on top of the actual computation time to get the expected result on the screen, and especially if this is linked to motion in the UI.
 