Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
686
576
106






With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores
Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4
? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 8 MB | Arc | 8
? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12



Comparison of die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

 | Meteor Lake | Arrow Lake (20A) | Arrow Lake (N3B) | Arrow Lake Refresh (N3B) | Lunar Lake | Panther Lake
Platform | Mobile H/U Only | Desktop Only | Desktop & Mobile H&HX | Desktop Only | Mobile U Only | Mobile H
Process Node | Intel 4 | Intel 20A | TSMC N3B | TSMC N3B | TSMC N3B | Intel 18A
Date | Q4 2023 | Q1 2025 ? | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2025 ? | Q4 2024 | Q1 2026 ?
Full Die | 6P + 8E | 6P + 8E ? | 8P + 16E | 8P + 32E | 4P + 4E | 4P + 8E
LLC | 24 MB | 24 MB ? | 36 MB ? | ? | 8 MB | ?
tCPU | 66.48 | | | | |
tGPU | 44.45 | | | | |
SoC | 96.77 | | | | |
IOE | 44.45 | | | | |
Total | 252.15 | | | | |



Intel Core Ultra 100 - Meteor Lake



As reported by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

 


DavidC1

Senior member
Dec 29, 2023
359
560
96
Assuming Hyperthreading gives 1.3x benefit, and Lion Cove is 20% faster:

Raptorlake
-1x P core x 1.3 x 8
-1x E cores x 16
Total: 26.4

Arrowlake
-1.2x P core x 8
-1.4 x E cores x 16
Total: 32

21% advantage for Arrowlake assuming a well multi-threaded application. For the P cores it's 9.6 vs 10.4, so the losses aren't big. Likely, the faster single thread portion can make up for loss of multi-threaded performance in many applications that aren't perfectly threaded, and it'll be many.

It may also be more correct to assume a 1.5x modifier for Skymont because client workloads use quite a bit of FP it seems. The faster E core relative to the P core will further reduce cases where there are sudden performance drops, and the lack of HT will likely help too. It looks like games use a fair bit of SIMD code too, this is where the doubled vector units will be a boon.
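
The estimate above can be reproduced in a few lines of Python. This is only a sketch of the same arithmetic: the function name `mt_score` and its parameters are made up for illustration, and every multiplier (1.3x for Hyperthreading, 1.2x for Lion Cove, 1.4x or 1.5x for Skymont) is an assumption from this post, not a measurement.

```python
def mt_score(p_cores, p_perf, e_cores, e_perf, smt_gain=1.0):
    """Relative multi-thread throughput: per-core performance summed over
    all cores, with an optional SMT multiplier applied to the P cores only."""
    return p_cores * p_perf * smt_gain + e_cores * e_perf

# Raptor Lake is the baseline: P and E cores both count as 1.0x, HT adds 1.3x.
raptor = mt_score(8, 1.0, 16, 1.0, smt_gain=1.3)

# Arrow Lake: Lion Cove at 1.2x without HT, Skymont at 1.4x (or 1.5x).
arrow_14 = mt_score(8, 1.2, 16, 1.4)
arrow_15 = mt_score(8, 1.2, 16, 1.5)

print(f"Raptor Lake:         {raptor:.1f}")
print(f"Arrow Lake (1.4x E): {arrow_14:.1f} ({arrow_14 / raptor - 1:+.0%})")
print(f"Arrow Lake (1.5x E): {arrow_15:.1f} ({arrow_15 / raptor - 1:+.0%})")
```

With the 1.5x Skymont modifier the same formula gives 33.6 instead of 32, or roughly 27% over Raptorlake instead of 21%.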

Bionic_Squash also says Lion Cove goes to 4x256-bit SIMD. Its FP performance should improve significantly.
 

Wolverine2349

Senior member
Oct 9, 2022
244
90
61
Skymont may well be a good architecture but being on tile might be what doesn't let it perform to its fullest.

Yes, that's possibly true.

Just like how Sapphire Rapids Golden Cove has gimped IPC compared to client Golden Cove, apparently due to the mesh architecture instead of a ring bus.

It sucks, because Xeon Workstation is not a good substitute for getting more than 8 P cores on Golden Cove or above, unlike Broadwell E and prior. Not only is it more expensive, requiring ECC RAM, but the platform is also so different and has too many enterprise features detrimental to things like gaming, unlike HEDT Broadwell E and prior back in the day, which were more like mainstream desktop on steroids than an enterprise workstation.

And HEDT was less expensive too. But even besides cost, Xeon SPR Workstation is not a good option for gaming, as it has gimped IPC and latency, unlike Broadwell E and prior, and it requires ECC RAM.

With Xeon Workstation these days, you are paying the premium for things like many more PCIe lanes, ECC RAM and an advanced platform. Better single-thread performance it is not, unlike Broadwell E and before, which were basically just quad-channel RAM, more cores and a little more PCIe, and still had excellent gaming, single-thread and multi-thread performance.
 
Reactions: igor_kavinski
Jul 27, 2020
17,824
11,615
116
With Xeon Workstation these days, you are paying the premium for things like many more PCIe lanes, ECC RAM and an advanced platform. Better single-thread performance it is not, unlike Broadwell E and before, which were basically just quad-channel RAM, more cores and a little more PCIe, and still had excellent gaming, single-thread and multi-thread performance.
Yeah there are users on this forum who used Xeons for gaming and many even now may buy them used for gaming coz they are dirt cheap. I was certainly considering one to replace my 65W i7-5775C coz the Xeon eDRAM part is 95W so it would definitely perform a bit better.
 

Wolverine2349

Senior member
Oct 9, 2022
244
90
61
Yeah there are users on this forum who used Xeons for gaming and many even now may buy them used for gaming coz they are dirt cheap. I was certainly considering one to replace my 65W i7-5775C coz the Xeon eDRAM part is 95W so it would definitely perform a bit better.

Yes, some old ones, because they are so cheap, but performant they are not, unlike prior Intel HEDT.

Well, older-arch Xeons had more in common with their desktop counterparts than today's Xeons. They were locked, but at least latency and IPC were not gimped in favor of all these glamorous, more expensive enterprise features that gimp the IPC of modern Xeons. For gaming you want fewer, faster cores rather than a bunch of slower, lower-IPC cores that are also clocked lower. But still, a little above 8 cores would be nice. You had those options, and for less money too, on X99 and before, unlike with today's Xeons.
 
Reactions: igor_kavinski

Kepler_L2

Senior member
Sep 6, 2020
459
1,892
106

inf64

Diamond Member
Mar 11, 2011
3,761
4,214
136
Assuming Hyperthreading gives 1.3x benefit, and Lion Cove is 20% faster:

Raptorlake
-1x P core x 1.3 x 8
-1x E cores x 16
Total: 26.4

Arrowlake
-1.2x P core x 8
-1.4 x E cores x 16
Total: 32

21% advantage for Arrowlake assuming a well multi-threaded application. For the P cores it's 9.6 vs 10.4, so the losses aren't big. Likely, the faster single thread portion can make up for loss of multi-threaded performance in many applications that aren't perfectly threaded, and it'll be many.

It may also be more correct to assume a 1.5x modifier for Skymont because client workloads use quite a bit of FP it seems. The faster E core relative to the P core will further reduce cases where there are sudden performance drops, and the lack of HT will likely help too. It looks like games use a fair bit of SIMD code too, this is where the doubled vector units will be a boon.

Bionic_Squash also says Lion Cove goes to 4x256-bit SIMD. Its FP performance should improve significantly.
One problem: Lion Cove and Raptor Cove P core sustained clockspeeds will likely be very different (read: Raptor Cove will likely run at a noticeably higher clock in any nT mode). So not only will Lion Cove P cores have to deal with the lack of SMT, they will also have to match a 5.5 GHz all-core turbo for your calculation to be close to reality.
 

DavidC1

Senior member
Dec 29, 2023
359
560
96
One problem: Lion Cove and Raptor Cove P core sustained clockspeeds will likely be very different (read: Raptor Cove will likely run at a noticeably higher clock in any nT mode). So not only will Lion Cove P cores have to deal with the lack of SMT, they will also have to match a 5.5 GHz all-core turbo for your calculation to be close to reality.
It's true. Arrowlake and Lunarlake are a strange combination of cores from two overlapping eras, as if you had Netburst and Core together.

The MT gains will still be decent, because most of the throughput comes from the E cores. The P core would need to be less than half as fast before you start losing performance.
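
To show where "less than half" comes from, here is a quick sketch that reuses the assumed multipliers from the earlier estimate (1.3x for HT on Raptorlake, 1.4x for Skymont). The variable names are only for illustration and none of the inputs are measurements.

```python
# Assumed figures from the earlier estimate, not measurements.
P_CORES, E_CORES = 8, 16
raptor_total = P_CORES * 1.0 * 1.3 + E_CORES * 1.0  # 8 P cores with HT + 16 E cores
arrow_e_total = E_CORES * 1.4                       # 16 Skymont cores at 1.4x

# Per-core P performance at which Arrow Lake only ties Raptor Lake:
#   P_CORES * p + arrow_e_total = raptor_total
break_even_p = (raptor_total - arrow_e_total) / P_CORES
print(f"Break-even P core performance: {break_even_p:.2f}x a Raptor Lake P core")
```

Under those assumptions the break-even sits at about half the single-thread performance of a Raptor Lake P core, so even a large sustained clock deficit leaves some MT gain.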

The question is how much faster is Lion Cove over Skymont? 20% Int and 30-35% FP means only a clockspeed advantage in Int and 25% faster in FP.
 

Hulk

Diamond Member
Oct 9, 1999
4,367
2,234
136
As the SpecCPU results show, Golden Cove is 23%/66% faster in Int and FP respectively.

If you take 5% for Crestmont and add 38% for Int and 68% for FP, you get my figure.

Of course for real world applications it will vary, and Spec is also a composite benchmark so some tests will do better than others. I'm talking about overall. "Outright" of course depends on the context. If they are clocked the same it should be in the ballpark I said.

Think of Nehalem vs Penryn. Nehalem is basically Penryn with a vastly improved memory and communications system. Nearly the same core, but greatly enhanced uncore.

So as a core Skymont is very competent just like Gracemont was. Darkmont-based Clearwater Forest on 18A with Foveros Direct sounds good to me.
Okay these figures, not your post, are nuts!
If Skymont is on par with Golden Cove from an IPC point-of-view then the performance between Skymont and Lion Cove will amount to perhaps 20% IPC and frequency advantage for LC. If this pans out then ARL will be a beast both in terms of performance and economics for Intel.
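
A rough sketch of how the quoted percentages stack up. Two things here are assumptions and not stated in the quote: that the Golden Cove figures are relative to Gracemont (taken as 1.0), and that the Crestmont and Skymont gains compound multiplicatively rather than add.

```python
# Inputs are the percentages quoted above; the Gracemont = 1.0 baseline and
# multiplicative compounding are assumptions made for this sketch.
golden_cove = {"int": 1.23, "fp": 1.66}   # Golden Cove vs. assumed baseline
crestmont_gain = 1.05                     # "5% for Crestmont"
skymont_gain = {"int": 1.38, "fp": 1.68}  # Skymont gain on top of Crestmont

# Skymont relative to the assumed baseline, then relative to Golden Cove.
skymont = {k: crestmont_gain * g for k, g in skymont_gain.items()}
ratio = {k: skymont[k] / golden_cove[k] for k in skymont}

for kind in ("int", "fp"):
    print(f"{kind}: Skymont {skymont[kind]:.2f}x baseline, "
          f"{ratio[kind]:.2f}x Golden Cove")
```

If those assumptions hold, Skymont lands at or above Golden Cove in both Int and FP, which is what makes the gap to Lion Cove look so small.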
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
Think of Nehalem vs Penryn. Nehalem is basically Penryn with a vastly improved memory and communications system. Nearly the same core, but greatly enhanced uncore.
I don't agree with that. Nehalem is a new core architecture that has been redesigned and expanded. The scale of changes and expansion may seem small nowadays, but it is certainly not nearly Penryn.
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
Okay these figures, not your post, are nuts!
If Skymont is on par with Golden Cove from an IPC point-of-view then the performance between Skymont and Lion Cove will amount to perhaps 20% IPC and frequency advantage for LC. If this pans out then ARL will be a beast both in terms of performance and economics for Intel.
I believe Skymont is just an appetizer and LionCove is the main course. I find LionCove much faster than Skymont. Computex is just a few days away and I expect the LionCove slides to be released by weekend.

Very big changes do not only concern Skymont. There are huge changes taking place, especially at LionCove, as you will see for yourself.

The LunarLake graphic doesn't tell everything about LionCove, just like it doesn't tell everything about Skymont. Based on the Skymont diagram, I thought there were even fewer execution units than in Gracemont. Only the Intel slide leak shows huge changes.

The same will happen with LionCove.

If anyone has any doubts about this, they will find out for themselves within maybe not even days, but hours.

EDIT:
LionCove is not only about huge changes or reconstruction, but also Intel's new approach to architecture with the highest possible IPC. LionCove, among others, implements new techniques.
 
Reactions: Henry swagger

Henry swagger

Senior member
Feb 9, 2022
439
280
106
I believe Skymont is just an appetizer and LionCove is the main course. I find LionCove much faster than Skymont. Computex is just a few days away and I expect the LionCove slides to be released by weekend.

Very big changes do not only concern Skymont. There are huge changes taking place, especially at LionCove, as you will see for yourself.

The LunarLake graphic doesn't tell everything about LionCove, just like it doesn't tell everything about Skymont. Based on the Skymont diagram, I thought there were even fewer execution units than in Gracemont. Only the Intel slide leak shows huge changes.

The same will happen with LionCove.

If anyone has any doubts about this, they will find out for themselves within maybe not even days, but hours.

EDIT:
LionCove is not only about huge changes or reconstruction, but also Intel's new approach to architecture with the highest possible IPC. LionCove, among others, implements new techniques.
interesting
 

DavidC1

Senior member
Dec 29, 2023
359
560
96
I don't agree with that. Nehalem is a new core architecture that has been redesigned and expanded. The scale of changes and expansion may seem small nowadays, but it is certainly not nearly Penryn.
You can disagree but it's true. Nehalem's main focus is everything other than the core. Which was badly needed for sure.

When you isolate the uarch, the gains are like a Tick: 5-10%. It's not wider in any way, nor did it bring any noticeable improvements.
 

adroc_thurston

Diamond Member
Jul 2, 2023
3,314
4,782
96
I don't agree with that. Nehalem is a new core architecture that has been redesigned and expanded. The scale of changes and expansion may seem small nowadays, but it is certainly not nearly Penryn.
The core was pretty much Penryn with like 3.5 server-focused changes.
The new stuff was everything not core (new $ hierarchy, IMC, new not-Hypertransport etc).
Beckton (-EX) went even further and became the first Intel chip with a ringbus.
 

DavidC1

Senior member
Dec 29, 2023
359
560
96
@Exist50 has said that the E core team was subject to being shuttered if they did not execute and also they adopted the new better approach of designing processors years ago.

The P core team has been wasting a lot of resources and time exactly because they were on top and became complacent. At this point it's more than complacency - it's delusion.
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
You can disagree but it's true. Nehalem's main focus is everything other than the core. Which was badly needed for sure.

When you isolate the uarch, the gains are like a Tick: 5-10%. It's not wider in any way, nor did it bring any noticeable improvements.
Following this line of reasoning, SandyBridge is almost a Nehalem core because it is not wider at all (the decoding is the same width and the number of execution units remains the same).
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
@Exist50 has said that the E core team was subject to being shuttered if they did not execute and also they adopted the new better approach of designing processors years ago.

The P core team has been wasting a lot of resources and time exactly because they were on top and became complacent. At this point it's more than complacency - it's delusion.
He also believed that LionCove belongs to "RoyalCore" and what does that mean?
 

AMDK11

Senior member
Jul 15, 2019
341
235
116
You can disagree but it's true. Nehalem's main focus is everything other than the core. Which was badly needed for sure.

When you isolate the uarch, the gains are like a Tick: 5-10%. It's not wider in any way, nor did it bring any noticeable improvements.
NO. The single core of Nehalem, apart from the width, i.e. the number of streams, does not resemble and is not close to Penryn.
 

DavidC1

Senior member
Dec 29, 2023
359
560
96
Following this line of reasoning, SandyBridge is almost a Nehalem core because it is not wider at all (the decoding is the same width and the number of execution units remains the same).
Yes but Sandy Bridge brings innovative new ideas and actually improves performance significantly.

-Physical Register Files
-uOP Cache, which is a hugely improved Trace Cache
-Rebuilt branch predictor for efficiency taking up same space but with twice the targets
-Load/Store units that also improve efficiency
-256-bit vector by reusing ports, saving space
-Finally to top it off, a ring bus that allows high bandwidth communication and ultra low latency access for the L3 while allowing modularity in design
-Turbo 2.0 that is actually innovative taking advantage of thermal headroom for bursty applications rather than simple core count based as with Nehalem.

While Nehalem does improve performance, it's mostly in memory-bandwidth-limited and highly threaded applications. Back when Xbitlabs was active they did a really good test comparing Nehalem versus Penryn at equal clocks with Turbo on/off and SMT on/off. It showed that in real single-threaded applications it was a glorified Tick with a 5-10% gain, and the rest came from much-needed changes unrelated to the core, such as the integrated memory controller.

The guys at RWT did a test too and noted that with Nehalem there are even circumstances where performance degrades with single thread. It was everything BUT single thread.

The two teams complemented each other very well with one enhancing the core(IDC) and the other enhancing the uncore(Oregon).
 
Reactions: Ghostsonplanets

AMDK11

Senior member
Jul 15, 2019
341
235
116
Yes but Sandy Bridge brings innovative new ideas and actually improves performance significantly.

-Physical Register Files
-uOP Cache, which is a hugely improved Trace Cache
-Rebuilt branch predictor for efficiency taking up same space but with twice the targets
-Load/Store units that also improve efficiency
-256-bit vector by reusing ports, saving space
-Finally to top it off, a ring bus that allows high bandwidth communication and ultra low latency access for the L3 while allowing modularity in design
-Turbo 2.0 that is actually innovative taking advantage of thermal headroom for bursty applications rather than simple core count based as with Nehalem.

While Nehalem does improve performance, it's mostly in memory-bandwidth-limited and highly threaded applications. Back when Xbitlabs was active they did a really good test comparing Nehalem versus Penryn at equal clocks with Turbo on/off and SMT on/off. It showed that in real single-threaded applications it was a glorified Tick with a 5-10% gain, and the rest came from much-needed changes unrelated to the core, such as the integrated memory controller.

The guys at RWT did a test too and noted that with Nehalem there are even circumstances where performance degrades with single thread. It was everything BUT single thread.

The two teams complemented each other very well with one enhancing the core(IDC) and the other enhancing the uncore(Oregon).
I never wrote that Nehalem (x86 core) is a breakthrough. I wrote that it's definitely not Conroe/Penryn. There is the SMT implementation; all architecture modes (macro and micro) have been adapted for x64, which was not available in Conroe/Penryn. Some functions have even been moved before the decoding stage and into extended buffers, there is a new predictor, etc. Writing that Nehalem is roughly Penryn is an oversimplification.

The IPC drop in some cases was caused by changing the cache subsystem from a very large L2 to a very small L2 and a large unified L3.

Edit:

It features a 33% wider instruction window and overall new control logic. All this, plus changes throughout the system, has made Nehalem a big generational leap.
 