Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
686
576
106






With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new generation of platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship MTL mobile SoCs in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap tells us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

Model | Code Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores
Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4
? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8
? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12



Comparison of the die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

 | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake
Platform | Mobile H/U Only | Desktop Only | Desktop & Mobile H & HX | Desktop Only | Mobile U Only | Mobile H
Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A
Date | Q4 2023 | Q1 2025 ? | Desktop Q4 2024, H & HX Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ?
Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E
LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ?
tCPU (mm²) | 66.48 | | | | |
tGPU (mm²) | 44.45 | | | | |
SoC (mm²) | 96.77 | | | | |
IOE (mm²) | 44.45 | | | | |
Total (mm²) | 252.15 | | | | |
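
As a quick sanity check on the Meteor Lake column, the listed tile areas do add up to the quoted package total. A minimal sketch in Python:

```python
# Meteor Lake tile areas in mm^2, as listed in the table above.
tiles = {"tCPU": 66.48, "tGPU": 44.45, "SoC": 96.77, "IOE": 44.45}

total = sum(tiles.values())
print(f"{total:.2f} mm^2")  # 252.15, matching the quoted total
```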



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

 

Attachments

  • PantherLake.png (283.5 KB)
  • LNL.png (881.8 KB)

Ghostsonplanets

Senior member
Mar 1, 2024
537
943
96
I assume the 14% includes the latency. I may be wrong though.
The quoted IPC is for Lion Cove on Lunar Lake when compared to Redwood Cove on Meteor Lake.

How Arrow Lake DT will fare is anyone's guess. There will be implementation changes and higher clocks, but ARL also inherits MTL's tile structure. We'll need to wait until Innovation Days in September.
 
Reactions: SiliconFly

Hulk

Diamond Member
Oct 9, 1999
4,367
2,233
136
The quoted IPC is for Lion Cove on Lunar Lake when compared to Redwood Cove on Meteor Lake.

How Arrow Lake DT will fare is anyone's guess. There will be implementation changes and higher clocks, but ARL also inherits MTL's tile structure. We'll need to wait until Innovation Days in September.
ARL will do much better than MTL, even though it will probably be on a similar tile layout, due to the much faster memory subsystem of desktop vs. mobile. We've seen this before; the numbers for the same cores on different platforms aren't really comparable unless the entire workload fits in cache.

In addition, I have a feeling that the "L1.5" cache added to Lion Cove is probably there in part to offset the tile latency. This thread is like science: something is unknown, then you know it, and then because you know it there are new unknowns.
 
Jun 4, 2024
116
146
71
ARL will do much better than MTL, even though it will probably be on a similar tile layout, due to the much faster memory subsystem of desktop vs. mobile. We've seen this before; the numbers for the same cores on different platforms aren't really comparable unless the entire workload fits in cache.

In addition, I have a feeling that the "L1.5" cache added to Lion Cove is probably there in part to offset the tile latency. This thread is like science: something is unknown, then you know it, and then because you know it there are new unknowns.
Yeah, it would be disappointing if we had an IPC regression with Arrow Lake, and since they've known about the Meteor Lake issue for at least two years, they should have had time to find a solution. Also, given that the Skymont vs. Raptor Cove IPC figures given at Computex likely came from Arrow Lake hardware, it seems that if there is a latency hit, whatever they're doing is hiding it. We'll see about the Lion Cove implementation in Arrow Lake.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,323
2,929
106
ARL will do much better than MTL, even though it will probably be on a similar tile layout, due to the much faster memory subsystem of desktop vs. mobile. We've seen this before; the numbers for the same cores on different platforms aren't really comparable unless the entire workload fits in cache.

In addition, I have a feeling that the "L1.5" cache added to Lion Cove is probably there in part to offset the tile latency. This thread is like science: something is unknown, then you know it, and then because you know it there are new unknowns.

Is ARL getting a new SoC tile? I thought the SoC tile that houses the memory controllers was carried over from MTL.
 

lightisgood

Senior member
May 27, 2022
205
89
71
We don't know the exact configuration of the ARL package, but from what Intel reps have implied it won't be like LNL but more like MTL.

MSI displayed an ARL-S AI PC at Computex 2024.
It looks like ARL-S will have a new SoC tile.

> A built-in microphone and speaker will let you issue voice commands either to MSI's app or to Copilot.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,420
2,910
136
50% is big. Though I remember hearing that Arrow Lake's Lion Cove cores would have a clock speed regression, topping out at 5.5GHz, and that their IPC was only 10-15% better. If it clocks at 5.8GHz, that exceeds expectations and really is not a clock speed regression from 13th and 14th Gen. Yes, I know 14th Gen can technically go to 6GHz, or in some cases 6.2GHz, but it is so unstable at those settings that it does not count. 5.8GHz is more like the stable maximum for 13th and 14th Gen, and if Arrow Lake reaches that, there is no clock speed regression after all compared to Raptor Cove.
I don't know what the boost clock of Lion Cove is. It could be more or less than 5.8GHz. My point was that there is still a big performance difference between P- and E-cores when you have enough power.
 

SiliconFly

Golden Member
Mar 10, 2023
1,192
612
96
I don't know what the boost clock of Lion Cove is. It could be more or less than 5.8GHz. My point was that there is still a big performance difference between P- and E-cores when you have enough power.
In Arrow Lake, LNC might be up to 50% faster than SKT due to clock & µarch differences.
 

The Hardcard

Member
Oct 19, 2021
124
177
86
I scanned the thread and didn’t see this. Sorry if I missed it and it is a repost.

An attendee posted the actual full presentations given at Computex by a Lion Cove architect and the head architect of Skymont. He also includes the actual slides in the video, rather than phone footage of the slides on a screen in the room.

I didn’t hear anything that wasn’t in the articles; however, it is interesting, to me at least, to hear it straight from the designers’ mouths.

Lion Cove

Skymont
 

inf64

Diamond Member
Mar 11, 2011
3,759
4,213
136
I scanned the thread and didn’t see this. Sorry if I missed it and it is a repost.

An attendee posted the actual full presentations given at Computex by a Lion Cove architect and the head architect of Skymont. He also includes the actual slides in the video, rather than phone footage of the slides on a screen in the room.

I didn’t hear anything that wasn’t in the articles; however, it is interesting, to me at least, to hear it straight from the designers’ mouths.

Lion Cove

Skymont
It really seems that there is a chance ARL-S has HT enabled on its P-cores, based on that video. Interesting decisions for the P-core in LNL.
 

mikk

Diamond Member
May 15, 2012
4,171
2,209
136
It really seems that there is a chance ARL-S has HT enabled on its P-cores, based on that video. Interesting decisions for the P-core in LNL.

From all the sources I have seen, there is no HT on ARL-S. Every source says this, and some of them come from Intel too. HT is very, very unlikely. From what we know, only some server variants might ship with HT enabled.
 
Jun 4, 2024
116
146
71
From all the sources I have seen, there is no HT on ARL-S. Every source says this, and some of them come from Intel too. HT is very, very unlikely. From what we know, only some server variants might ship with HT enabled.
Yeah, that seems likely. We’ll see; Intel needs a blowout product ASAP. I’m sure they know that releasing a small incremental improvement with no AVX-512 and no obvious replacement (e.g. NPU offload, AVX10) is a non-starter.
 

DrMrLordX

Lifer
Apr 27, 2000
21,794
11,143
136
At least they figured out they weren't serving their intended purpose and are axing them in favor of a much more potent e-core.

The problem is that with Lunar Lake, it looks like they have to power up the entire e-core "island" at a minimum. They don't have the ability to put two or three e-cores to sleep. You win some, you lose some.

Arrow Lake-U will probably still have the LP-e cores, running (hopefully) at higher clocks thanks to being on Intel 3 without burning any additional power. But Arrow Lake-U will be significantly worse than Lunar Lake in any performance-intensive application. It's going to be a really awkward situation.

We know that the wafer orders Intel has with TSMC are significant

Intel spent billions years ago to secure TSMC N3. They'll get whatever wafers they want, especially when you consider how little wafer area is required per compute tile on Arrow Lake and Lunar Lake. And it doesn't seem that any of Intel's enterprise CPUs will be sharing the same wafers so it's consumer all the way down.
 

Thunder 57

Platinum Member
Aug 19, 2007
2,808
4,089
136
The problem is that with Lunar Lake, it looks like they have to power up the entire e-core "island" at a minimum. They don't have the ability to put two or three e-cores to sleep. You win some, you lose some.

Arrow Lake-U will probably still have the LP-e cores, running (hopefully) at higher clocks thanks to being on Intel 3 without burning any additional power. But Arrow Lake-U will be significantly worse than Lunar Lake in any performance-intensive application. It's going to be a really awkward situation.



Intel spent billions years ago to secure TSMC N3. They'll get whatever wafers they want, especially when you consider how little wafer area is required per compute tile on Arrow Lake and Lunar Lake. And it doesn't seem that any of Intel's enterprise CPUs will be sharing the same wafers so it's consumer all the way down.

Right, but if two LP-e cores aren't enough for anything other than idle, it hardly matters. I thought LP-e cores were gone in all future designs? Could be wrong.
 

Hulk

Diamond Member
Oct 9, 1999
4,367
2,233
136
I found a really great talk about the Lion Cove core changes and had a go at transcribing it, well, kind of. Some transcription, some notes. There are a couple of spots marked (?) that I couldn't follow. Let me know if you can figure those parts out. Good info here that I didn't see published.


Hyperthreading

HT can add 30% MT performance for only a 20% increase in power and a 10% increase in die area.

Compared to a core that includes HT, the same core without HT can use 15% less power for the same ST performance and be 10% smaller in area.

20 years ago, when software threads often outnumbered cores, and prior to the advent of E-cores and high-core-count processors, HT was a good way to increase MT performance. But now, with P-cores and E-cores each optimized in area and power to excel at ST and MT performance respectively, HT isn’t the best way to increase overall performance on the desktop. In addition, in most desktop applications there are sufficient P- and E-cores to handle all of the threads created by the software.

HT is still useful in server situations where the software thread count is extremely high and all available compute cores can be utilized. But for Lunar Lake the idea was to remove any transistors that don’t increase the goodness of the CPU, and MT is better handled by E-cores than by HT.
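
To put those percentages side by side, here is a rough sketch of the tradeoff (my arithmetic on the quoted figures, not Intel data):

```python
# Quoted figures, taken at face value (relative units, not measurements):
# HT with both threads busy: +30% MT perf for +20% power and +10% area.
mt_perf, power, area = 1.30, 1.20, 1.10

print(f"perf/W:    {mt_perf / power:.2f}x")  # ~1.08x vs. the HT-less core
print(f"perf/area: {mt_perf / area:.2f}x")   # ~1.18x in pure MT work
# The talk's argument is that E-cores beat both ratios, so on desktop/mobile
# those HT transistors are better spent elsewhere.
```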

Smarter Thermal Management and 16.67MHz Bins

While die shrinks have allowed more and more transistors to be packed into smaller areas, with this advancement comes an increase in thermal density that must be dealt with. Historically, thermal issues in the core were dealt with by frequency throttling. Since these “guard band” settings were static and determined in the lab pre-launch, they had to be set conservatively to protect the core under the harshest environmental conditions and heaviest compute loads. This invariably leaves some performance on the table under most “normal” operating conditions.

Intel is now taking a novel approach to thermal management, using a network-based self-tuning controller that adapts to the real-time operating conditions of the actual workload being run and takes into account the environmental conditions and the thermal solution in use. The controller is thereby able to exploit all of the thermal headroom, with tighter frequency control, to maximize performance. Efficiency is also increased, as the frequency bins have been reduced from 100MHz steps to 16.67MHz, again extracting maximum performance and efficiency from the core.

For example, if conditions permit operation at 3.05GHz but not 3.1GHz, the old system would have to hold the frequency at 3GHz, while the new system can operate at close to 3.05GHz. The overall performance benefit is approximately 2%.
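
A toy model of the binning change (my code; the 3.05GHz scenario comes from the example above):

```python
from fractions import Fraction

def granted(limit_mhz, bin_mhz):
    """Round the thermally permitted frequency down to a legal bin."""
    b = Fraction(bin_mhz)
    return float(Fraction(limit_mhz) // b * b)

limit = 3050                           # conditions allow 3.05 GHz, not 3.1 GHz
old = granted(limit, 100)              # 100 MHz bins   -> 3000.0 MHz
new = granted(limit, Fraction(50, 3))  # 16.67 MHz bins -> 3050.0 MHz
print(old, new, f"{new / old - 1:.1%}")  # ~1.7%, consistent with the ~2% claim
```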

Front End Improvements

The front end of a CPU is responsible for fetching x86 instructions and decoding them into micro-operations. To adequately supply the out-of-order engine with instructions, the front end must be able to accurately determine the correct code blocks from which to generate instructions. Lion Cove fundamentally changes the branch prediction scheme, widening the prediction block up to 8x over the previous generation without sacrificing prediction accuracy. This has two important benefits. First, it allows the BPU (Branch Prediction Unit) to run ahead and prefetch code lines into the instruction cache, alleviating possible instruction cache misses. In this context, cache request bandwidth towards the L2 was increased 3x in Lion Cove to capitalize on the BPU running ahead.

Second, wider prediction blocks allow an increase in instruction fetch bandwidth, and indeed the instruction fetch bandwidth was doubled from 64 bytes per cycle to 128 bytes per cycle, while the decode bandwidth was increased from 6 to 8 instructions per cycle. These instructions are steered towards the micro-op queue and are also installed into the micro-op cache. Since code lines are often reused, the micro-op cache allows an efficient, low-latency, high-bandwidth supply of previously decoded micro-ops to the OoO engine without having to power up the fetch and decode pipeline.

The micro-op cache grew from 4,000 micro-ops in Redwood Cove to 5,250 in Lion Cove, and its read bandwidth was increased to supply 12 micro-ops per cycle versus 8 for Redwood Cove. Finally, the micro-op queue grew from 144 to 192 entries, facilitating the service of longer code loops in a power-efficient manner. The OoO engine is responsible for scheduling micro-instructions for execution in a manner that maximizes parallelism, thus increasing IPC.

Prior generation P cores employed a monolithic scheduling scheme where a single scheduler was tasked with determining the data readiness of all micro-op types and scheduling and dispatching them across all execution ports. This scheme was exceedingly hard to scale and incurred significant hardware overhead.

Lion Cove solves this by splitting the OoO engine into two domains: integer, which also holds the address generation units for memory operations, and vector. These two domains now have independent renaming structures and independent schedulers, each sized to its own bandwidth and port requirements. This allows future expansion of each domain independently of the other and provides opportunities for savings on workloads that exercise only one of the domains.

Lion Cove increases the allocation/rename bandwidth from 6 to 8 micro-ops per cycle, and the OoO depth, or instruction window, was increased from 512 to 576 micro-ops. In addition, the physical register files were enlarged appropriately versus the prior generation. Lion Cove retires 12 micro-ops per cycle versus 8 previously.
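
One way to read the 512 → 576 change (my framing, not from the talk): by a simple Little's-law bound, the window size divided by the allocation width approximates how many cycles of stall the OoO engine can hide while the front end keeps allocating at full rate.

```python
# Rough Little's-law bound: cycles of latency the OoO window can cover
# while allocation proceeds at full width.
def covered_latency(window_uops, alloc_width):
    return window_uops / alloc_width

print(covered_latency(512, 6))  # Redwood Cove: ~85 cycles at 6/cycle
print(covered_latency(576, 8))  # Lion Cove: 72 cycles at the wider 8/cycle
```

Note that by this crude measure the wider allocation actually shrinks the covered latency, which is part of why the lower-latency, larger caches described below matter.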

Execution

Lion Cove increases the total number of execution ports from 12 to 18. On the integer side, 6 ALUs are complemented by 3 shift units and 3 64-bit multipliers operating at 3-cycle latency and 1-cycle throughput, and 3 branches can be resolved in parallel per cycle. On the vector side, Lion Cove has four 256-bit ALUs plus two 256-bit FMAs operating at 4-cycle latency, and two 256-bit floating-point dividers with significantly improved latency and throughput for both single- and double-precision operations versus the prior generation.
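
For scale, the quoted vector resources imply the following peak FP32 rate per core per cycle (my arithmetic, counting an FMA as two FLOPs):

```python
fma_units = 2                 # two 256-bit FMA units
lanes = 256 // 32             # 8 FP32 lanes per 256-bit unit
flops_per_cycle = fma_units * lanes * 2  # multiply + add per lane
print(flops_per_cycle)        # 32 FP32 FLOPs per cycle per core
```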

Crypto acceleration hardware for AES, SHA, and SM3/SM4 resides in the vector stack.

Memory Subsystem

The memory subsystem is a key part of a performant microarchitecture, and at its heart are the data caches. Caches are all about striking the right balance between bandwidth, latency, and capacity within a given area and power budget. Lion Cove significantly re-architects the core’s memory subsystem to allow sustained high bandwidth with low average latency, while keeping built-in scalability and flexibility to increase cache capacity.

The first-level data cache was completely redesigned to operate at 4 cycles of latency vs. 5 cycles in the previous generation. Lion Cove also introduces a new three-level in-core cache hierarchy by inserting an intermediate 192KB cache between the 1st- and 2nd-level caches.

This has two key benefits. First and foremost, it decreases the average load-to-use latency seen by the core, which increases IPC. Second, it allows the L2 cache capacity to grow, keeping a larger portion of the data set closer to the core without paying the IPC penalty of added L2 latency; indeed, the L2 grows to 2.5MB on Lunar Lake and 3MB on Arrow Lake. Along with several other L2 controller optimizations, as well as an increase in L1 fill buffers to 24 and L2 miss queues (?) to 80, Lion Cove shows a significant improvement in its capacity to consume external bandwidth, which is key to running performant AI workloads.
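
A minimal average-latency model illustrates why the inserted 192KB level helps. Only the 4- vs. 5-cycle first-level figures come from the talk; every hit rate and the L2/miss latencies below are hypothetical placeholders:

```python
def avg_load_latency(levels, miss_penalty=60):
    """levels = [(hit_rate, latency_cycles), ...]; misses fall through."""
    total, remaining = 0.0, 1.0
    for hit_rate, latency in levels:
        total += remaining * hit_rate * latency
        remaining *= 1.0 - hit_rate
    return total + remaining * miss_penalty  # whatever misses every level

# Two-level core hierarchy, Redwood Cove style (hit rates assumed)
old = avg_load_latency([(0.90, 5), (0.70, 15)])
# Lion Cove style: 4-cycle first level, new 192KB mid-level, bigger L2
new = avg_load_latency([(0.90, 4), (0.70, 9), (0.70, 17)])
print(f"{old:.1f} vs {new:.1f} cycles")  # the extra level lowers the average
```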

Among other memory subsystem enhancements, the first-level DTLB (Data Translation Lookaside Buffer) was enlarged to cover 128 pages vs. 96 previously, and, to improve load execution in the shadow of older stores, Lion Cove adds a 3rd store address generation unit. It employs a new fine-grained memory disambiguation algorithm to safely (audio dropout) conflicts, and enhances the store-to-load forwarding scheme to allow a young load to collect and stitch data from any number of older pending resolved stores as well as from the data cache.
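
The "stitching" idea can be pictured with a toy model (entirely my sketch, not Intel's mechanism): a load assembles its bytes from every older resolved store that overlaps it, with newer stores winning per byte and the data cache filling the rest:

```python
def forwarded_load(addr, size, resolved_stores, cache):
    """resolved_stores: list of (store_addr, bytes), oldest first."""
    data = bytearray(cache[addr:addr + size])  # start from cached contents
    for s_addr, s_data in resolved_stores:     # later stores overwrite earlier
        for i in range(size):
            j = addr + i - s_addr
            if 0 <= j < len(s_data):
                data[i] = s_data[j]
    return bytes(data)

cache = bytearray(16)
stores = [(0, b"\xaa\xaa"), (2, b"\xbb")]      # two older pending stores
print(forwarded_load(0, 4, stores, cache).hex())  # 'aaaabb00'
```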

Lion Cove delivers a significant double-digit IPC improvement over a wide spectrum of workloads (?). Having been optimized for the lower TDPs (Thermal Design Power) of Lunar Lake, Lion Cove delivers more than 18% PnP (performance at power) at that low TDP.
 
Jun 4, 2024
116
146
71
I found a really great talk about the Lion Cove core changes and had a go at transcribing it... [full transcription quoted in the post above; snipped]
Thanks. The more I think about it, the more I think Arrow Lake, even with 24 cores, is going to be a great deal for most users.

I want more, so I’m looking forward to the refresh, hopefully with 40 cores.

But yeah, I’d rather have 15% more ST on every workload than 20% more MT, given how workload-dependent that 20% is, and how it means that some threads are dramatically less useful than others.
 
Reactions: Elfear

Hulk

Diamond Member
Oct 9, 1999
4,367
2,233
136
Thanks. The more I think about it, the more I think Arrow Lake, even with 24 cores, is going to be a great deal for most users.

I want more, so I’m looking forward to the refresh, hopefully with 40 cores.

But yeah, I’d rather have 15% more ST on every workload than 20% more MT, given how workload-dependent that 20% is, and how it means that some threads are dramatically less useful than others.
The HT discussion in my post made a lot of sense to me. I have a 14900K, and it's rare, really rare, that I'm hitting all 24 cores really hard, and I have HT turned off. In anything outside of CB, HT is useless for me. ARL would be a great upgrade with stronger P-cores and much stronger E-cores. Of course, I'm going to wait and see what Zen 5 brings once reviews are actually published for both.
 
Reactions: invisible_city

DrMrLordX

Lifer
Apr 27, 2000
21,794
11,143
136
Compared to a core that includes HT, the same core without HT can use 15% less power for the same ST performance and be 10% smaller in area.

~10% of die area is committed to HT? That's hard to believe. Also, the extra power consumption only occurs when the HT core is forced to handle two threads, no?
 

Thunder 57

Platinum Member
Aug 19, 2007
2,808
4,089
136
~10% of die area is committed to HT? That's hard to believe. Also, the extra power consumption only occurs when the HT core is forced to handle two threads, no?

10% sounded weird to me too. Intel said it was 5% back in the P4 days. I figured it would have shrunk since then relative to other structures.
 