We already have accelerators. GPUs are accelerators, and sound cards pretty much were too. As for low-latency sound mixing specifically, I honestly think that ship sailed ages ago. We had excellent hardware mixing of sound already, but then Microsoft came along and, instead of standardizing hardware mixing, made all the driver models incompatible with the new Windows OS release. Since then, cheap onboard sound and software mixing have been considered good enough for like 99.99% of all use cases.
I used to have a Creative Soundblaster Platinum, which had an external logic box to keep the noise level down. It was outstanding. I now use a firefly red, but honestly, A/B-ing the two (when I still had the Platinum) through a high-quality PA, the old Platinum had better sound. So yeah, I think we have definitely gone backwards there.
Still, we have P cores, E cores, GPUs, and AI processors today. I remember reading something about cores designed to run JVM code ultra fast (although any interpreted language and "fast execution" should never be used in the same sentence). I suspect that, moving forward, dedicated hardware will be used more and more to improve performance by orders of magnitude. This will be especially true as process technology advances grind to a screeching halt and the cost of new nodes continues on its exponential curve.
Or maybe software developers will be able to target specific cores in the CPU for specific routines in their code.
This takes me back to the early '90s, when I was using the CardD and Software Audio Workshop (SAW), written by Bob Lentini in assembly. It was absolutely miraculous that you could multitrack record and mix on a Pentium-class computer. I was at the Javits Center in NY way back when and saw the demo of the CardD and SAW. I bought them within a week, convincing the band we HAD to have them. We also had a small recording studio, where I used it instead of a DAT recorder to digitize the final mixdown and for mastering. It was fabulous for the time and even holds up today. SAW, coded in assembly, was a few MB and ran straight from the exe file. It was stable, tiny, fast, and brilliant. I remember Bob had studied the Windows API for a bit and decided he couldn't do this through Windows; it was going to have to happen in assembly, and he just made it happen.
Good software can be the way "around" insufficient hardware and it is the more elegant approach. We've already seen what good game drivers can do.
I have written plenty of code where you can specify core affinity in Windows. If you are careful, you can also get a time-critical thread devoted to a core. I suspect that in the future, you will be able to dedicate tasks to a specific kind of core in addition to a specific core (if you can't already).
I have found that good C code (and even C++) will run as quickly as assembly. In fact, in many cases (nearly all cases for me) the compiler knows cool tricks you don't know about for different processors and produces a more efficient binary than you can get writing assembly by hand. I actually tried this in the '90s. Once I found the compiler wrote better assembly than me (using the option to output the asm file for the C code), I never wrote another asm program again, but instead used inline assembly in C to do the low-level dirty work.
Nowadays, software engineers don't have the first clue how low-level instructions are created or carried out. Ask them what a linker does, or God forbid, how to set one up for an embedded system (when the OS doesn't do all the work for you), and they are lost.
Shoot, most of them only program in Python .... which only barely ranks in my book as a real programming language, and only then because of the crazy extensive data analysis libraries it has.
Including a larger front end: an 8x wider predictor, 8-wide decode, 64KB L1-I, a 5250-entry uOP cache (L0), a 192-entry queue, a 576-entry ROB, a 180-entry branch order buffer, an FP scheduler with 114 entries and a ~400-entry PRF, an INT scheduler with 97 entries and a 290-entry PRF, execution units on 10 ports (4x FP + 6x ALU) instead of 5 (3x FP/ALU + 2x ALU), a non-scheduling-queue buffer, 48KB L0-D, 192KB L1-D, and 3MB L2, plus all the resource-control logic.
Also, splitting the unified FP/ALU scheduler (97 entries for 5 execution ports) into a separate scheduler for the 6 ALU units (97 entries) and another for the 4 FP units (114 entries) required significant resources.
I recall many a post where each of those individual elements was estimated to provide some % of IPC uplift. When all of them were combined, numbers like 30% were being thrown about.
Sooooo. There are really only a couple of explanations I can see: 1) Intel has some awful flaw in the Lion Cove architecture that needs fixing, or 2) all of those core improvements are implemented horribly .... I mean really horribly.
I personally am going with #1. I think Intel is going to seriously surprise some people with the NEXT improvement to the Lion Cove core. I'm not sure I'm buying 30% IPC, but 20% might easily be the case.