Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
686
576
106






With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile based on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

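| Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4 |
| ? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8 |
| ? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12 |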



Comparison of the die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

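| | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake |
|---|---|---|---|---|---|---|
| Platform | Mobile H/U Only | Desktop Only | Desktop & Mobile H&HX | Desktop Only | Mobile U Only | Mobile H |
| Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A |
| Date | Q4 2023 | Q1 2025 ? | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ? |
| Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E |
| LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ? |
| tCPU | 66.48 | | | | | |
| tGPU | 44.45 | | | | | |
| SoC | 96.77 | | | | | |
| IOE | 44.45 | | | | | |
| Total | 252.15 | | | | | |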
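
As a quick sanity check on the Meteor Lake column, the four tile figures can be summed and compared with the listed total. This is only a minimal sketch: the dictionary name and the rounding to two decimals are assumptions made for illustration, and the values are copied from the table above.

```python
# Tile figures for Meteor Lake, copied from the table above.
# The name `mtl_tiles` is illustrative only.
mtl_tiles = {
    "tCPU": 66.48,
    "tGPU": 44.45,
    "SoC": 96.77,
    "IOE": 44.45,
}

# Round to two decimals so floating-point noise does not leak into the sum.
total = round(sum(mtl_tiles.values()), 2)
print(f"Sum of tiles: {total}")
```

The sum matches the 252.15 listed in the Total row.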



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

 


lightisgood

Senior member
May 27, 2022
206
89
71
Lunar Lake E Cores are now able to talk to each other through their L1 cache, which should dramatically improve core - core latency: https://hothardware.com/reviews/intel-lunar-lake-deep-dive?page=3

Meaning we avoid this in Arrow Lake: https://www.anandtech.com/show/1704...hybrid-performance-brings-hybrid-complexity/6

Previously, core-to-core communication required a trip through the ring bus or, in the case of the LP cores, Meteor Lake's LP Scalable Fabric. See also https://chipsandcheese.com/2024/05/13/meteor-lakes-e-cores-crestmont-makes-incremental-progress/

Damned good design changes

I remember that this L1$-to-L1$ link was adopted for C2D (Merom) in 2006...
I had been thinking that Alder Lake, the 1st gen x86 hybrid, is a very primitive design.
So, I was correct.
 
Reactions: del42sa

ondma

Platinum Member
Mar 18, 2018
2,787
1,356
136
That isn't the goal with the P cores, and likely never will be. The P cores are to have a single task done ASAP at the cost of high power. Move to a new architecture, or process and the goal is still the same: complete a single task ASAP at the cost of high power. The P cores are for when you want something very responsive and fluid. But, you can't have large numbers of cores all doing tasks at the cost of high power. There is no free lunch. With 8 P cores, running at 125 W, each gets ~15.6 W. Those P cores can clock a lot faster than 16 P cores each with only ~7.8 W. No matter the architecture or process, when you split your power budget up amongst more and more cores, each core gets less and less to work with.
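
The per-core figures quoted above are easy to reproduce. A minimal sketch, assuming the whole 125 W package budget is split evenly across the cores (this ignores uncore, fabric and memory power, so real per-core numbers would be lower); the function name is hypothetical.

```python
def watts_per_core(package_watts: float, cores: int) -> float:
    """Evenly split a package power budget across cores (simplifying assumption)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return package_watts / cores


# The two cases from the post: 8 and 16 P cores sharing 125 W.
for cores in (8, 16):
    print(f"{cores} cores at 125 W: {watts_per_core(125, cores):.1f} W each")
```

Doubling the core count at a fixed budget halves the power available to each core, which is the trade-off the post is describing.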

The E cores are designed to be the workhorses that you can spam in large numbers to do grunt work. The real issue was when the P/E core was first released, the E cores were clocked too high and there were too few of them. The result was that the first E cores were neither that efficient nor that good at grunt work. So, people got the whole idea of P and E cores backwards in their mind thinking that P cores were for the grunt work. You have to switch your mindset. You want more E cores for more work done.
I mean, you just gave a textbook justification of hybrid architecture. You didn't really address the point of my post, though. Sorry to keep bringing up AMD in an Intel thread, but they are able to put 16 big cores into a chip and still have excellent performance and power consumption. I guess what I am trying to say is that Lion Cove still seems behind in performance and/or power consumption, or they would not have to bother with the E cores. It is also disappointing that Lunar Lake and the most performant Arrow Lake are on a TSMC node. What happened to process leadership? I thought 20A was supposed to bring leadership. Are we depending on 18A now? And if it is simply a matter of supply, I don't consider a process leading edge if it can't provide sufficient wafers with adequate yields to satisfy production demands.
 
Reactions: Lodix and H433x0n

poke01

Golden Member
Mar 8, 2022
1,455
1,683
106

Talking about Lunar Lake while wearing an Apple shirt, love the irony.
Can't wait for the deep dive from them.

Intel has implemented similar power management to M1; these are the best chips to come out of Intel in a long time.
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,430
2,915
136
Because customers don't run Cinebench on an ultrabook.
Thanks, you didn't disappoint with your "useful" reply as always.

So once more, why did they choose 4+4 config instead of 2+8 for example, which would be comparable in size If not a bit smaller.

Lion Cove is for max ST performance and responsiveness, so It's understandable, to use them, but why 4, when this is intended for ultrabooks with a limited TDP?
Skymont cluster offers better perf/W than a Lion Cove cluster and is also a lot smaller, 2 of them would provide significantly higher performance than a single Lion Cove cluster.

That was made to achieve better efficiency. In Lunar Lake, E-cores are always active. Having more E cores will increase the idle power consumption. Intel is planning to turn off the whole P-cluster when it's not used.
Why can't there be 3 clusters? One with 2 P-cores and 2 clusters with 4 E-cores each?
And Intel could keep active only a single E-core cluster.
 

Joe NYC

Platinum Member
Jun 26, 2021
2,333
2,947
106
I am expecting a good 15-20% total single core uplift (ipc + clocks) over Raptor Lake. Multicore is going to come down to what process gets used due to power limits. The more power efficient the process is, the better the performance. We could see Intel lead AMD by a substantial amount here, but they are also (allegedly) pulling back power limits to be similar to AMD’s limits, so who knows?

Raptor Lake goes up to 6.2 GHz (or 6.0 GHz). Do you expect +1% to +6% clock speed increase?

If 5.7 GHz is the clock speed of Arrow Lake, then it is a -5% to -9% clock speed regression.
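
The regression range can be checked with a short calculation. A minimal sketch, assuming the 5.7 GHz figure floated in the post (it is hypothetical, not a confirmed Arrow Lake clock) and measuring the change relative to the Raptor Lake clock; the function name is made up for illustration.

```python
def pct_change(old_ghz: float, new_ghz: float) -> float:
    """Percentage change going from old_ghz to new_ghz, relative to old_ghz."""
    return (new_ghz - old_ghz) / old_ghz * 100


# Raptor Lake peak clocks mentioned in the post vs. the hypothetical 5.7 GHz.
for raptor_lake_ghz in (6.0, 6.2):
    change = pct_change(raptor_lake_ghz, 5.7)
    print(f"{raptor_lake_ghz} GHz -> 5.7 GHz: {change:+.1f}%")
```

Measured against the Raptor Lake clock, the 6.2 GHz case comes out nearer -8%; the gap is closer to 9% only when it is expressed relative to 5.7 GHz.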
 

TESKATLIPOKA

Platinum Member
May 1, 2020
2,430
2,915
136
You are excited about lower clock speeds and lower IPC?

Seems like you are excited about Zen 3 in era of Zen 5...
Both types of cores have higher IPC than the predecessors and LNL is a low TDP SoC, so boost clockspeed doesn't necessarily need to be much lower than MTL-U(5GHz/3.8GHz) or RPL-U(5.2GHz/3.9GHz) for either core.
Not sure about sustained clocks during full load, we will have to wait for reviews.

It will be very interesting to limit MTL-U, LNL, PHX(2) and Strix to 15W-30W and see how It performs in CB.
 

DavidC1

Senior member
Dec 29, 2023
413
593
96
Also, Intel stated that at iso power, Lion Cove in Lunar Lake delivers up to an 18% performance uplift, not 14%. It just depends on where you sit on the power curve.

Something most are missing is that they're describing a 14% uplift in the Lunar Lake iteration, not in all implementations.
That has nothing to do with perf/clock.

The curve has shifted likely due to design/process change which benefits lower power.
Another tidbit from Chips n' Cheese:


So we're looking at possibilities of, on Arrow Lake DT:
- Bigger cache
- Return of HT
- L1 to L2 bandwidth to 110B per cycle

Intel's new modern sea-of-cells design really allows for finer-grained changes that fit different markets. Quite interesting.
Granite Rapids according to Pat: ten-plus % changes in the core
 

TwistedAndy

Member
May 23, 2024
139
104
71
Why can't there be 3 clusters? One with 2 P-cores and 2 clusters with 4 E-cores each?
And Intel could keep active only a single E-core cluster.

Intel probably decided that there was no sense in having three independent clusters because of the power and memory latency issues. Intel had to introduce a separate 8MB side cache to make the current approach with two independent clusters work.
 

DavidC1

Senior member
Dec 29, 2023
413
593
96
Thanks, you didn't disappoint with your "useful" reply as always.

So once more, why did they choose 4+4 config instead of 2+8 for example, which would be comparable in size If not a bit smaller.
4+4 might be the better fit given that Lunar Lake is focused on low power (I mean battery life, not TDP).

Skymont, even at lower clocks, is fast enough to cover most performance needs, and two cores is a little bit small nowadays, so they bumped it up to 4.

2 P cores would again fall short of the core-count requirements, so for lightly threaded applications that need higher responsiveness, 4 is a good number.

This is just a guess, there might be technical reasons to do so, but Apple also does something similar.
 

DavidC1

Senior member
Dec 29, 2023
413
593
96
He's basically saying what @Exist50 has said.

P core design is in shambles, in addition to the E core team being excellent.

Third: @adroc_thurston doesn't really have sources. I was waiting and waiting to see what he says is true.
CWF is 18A so that's even better, possibly.
Either way, the thing is basically Z4c with worse SIMD.
Cope. Again, and again. Can you at least admit you are wrong once in a while? Or at least don't be like AI and pretend everything you say is written in stone?
 

FlameTail

Diamond Member
Dec 15, 2021
3,209
1,847
106
Intel's P-cores are clearly excessively bloated.

Where's the Lunar Lake die shot? I want to compare Lion Cove and Apple M3-P core die area.

@poke01 you said you would make a Lunar Lake vs M3 thread sometime?
 

coercitiv

Diamond Member
Jan 24, 2014
6,403
12,864
136
Thanks, you didn't disappoint with your "useful" reply as always.

So once more, why did they choose 4+4 config instead of 2+8 for example, which would be comparable in size If not a bit smaller.

Lion Cove is for max ST performance and responsiveness, so It's understandable, to use them, but why 4, when this is intended for ultrabooks with a limited TDP?
Skymont cluster offers better perf/W than a Lion Cove cluster and is also a lot smaller, 2 of them would provide significantly higher performance than a single Lion Cove cluster.
His reply may have seemed cryptic because you're less focused on the needs of the users who will be buying this product. Workloads will be relatively lightly threaded and rather latency sensitive, 4P cores will make the device look snappy, more cores overall will only help in isolated cases. (in fact most of them see a "real" MT workload when they boot or when they make OS updates)

Browsing and apps built on chromium will probably make up quite a good chunk of the user scenarios. Modern browsers can scale to 6+ cores, but what is more important for browser speed is ST performance of the cores being used. This is in stark contrast with Cinebench, where available throughput is all that matters, because software scaling is... well... embarrassing

For the upper range of TDP covered by LNL it would be nice if it came with something like 4+8 (my favorite would still be 6+4, with a better P core), but the NPU stole the rest of the pizza, sorry.

 

DavidC1

Senior member
Dec 29, 2023
413
593
96
His reply may have seemed cryptic because you're less focused on the needs of the users who will be buying this product. Workloads will be relatively lightly threaded and rather latency sensitive, 4P cores will make the device look snappy, more cores overall will only help in isolated cases. (in fact most of them see a "real" MT workload when they boot or when they make OS updates)

Browsing and apps built on chromium will probably make up quite a good chunk of the user scenarios. Modern browsers can scale to 6+ cores, but what is more important for browser speed is ST performance of the cores being used. This is in stark contrast with Cinebench, where available throughput is all that matters, because software scaling is... well... embarrassing

For the upper range of TDP covered by LNL it would be nice if it came with something like 4+8 (my favorite would still be 6+4, with a better P core), but the NPU stole the rest of the pizza, sorry.

View attachment 100557
Let's think of that die shot.
-Take out the P cores
-Take out the NPUs

There's probably enough room left to put a 20 Xe core monster in there. So much for "AI revolution". 20 Xe cores is 320 EUs in old Intel terminology. Skymont is more than fast enough to feed such a GPU.
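
For reference, the Xe core to EU conversion used above can be written out. A minimal sketch, assuming 16 EUs per Xe core, which is the ratio implied by the post (320 / 20); the constant and function names are hypothetical, and the other Xe-core counts are taken from the U-series comparison table in the first post.

```python
# Assumption: 16 EUs per Xe core, the ratio implied by "20 Xe cores, 320 EUs".
EUS_PER_XE_CORE = 16


def xe_cores_to_eus(xe_cores: int) -> int:
    """Convert an Xe-core count to the older EU count under the 16:1 assumption."""
    return xe_cores * EUS_PER_XE_CORE


# 4, 8 and 12 come from the comparison table; 20 is the hypothetical config above.
for xe_cores in (4, 8, 12, 20):
    print(f"{xe_cores} Xe cores = {xe_cores_to_eus(xe_cores)} EUs")
```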
 