Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

Tigerick · Aug 22, 2022

With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024; that is what Intel's roadmap is telling us. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.

Comparison of upcoming Intel's U-series CPU: Core Ultra 100U, Lunar Lake and Panther Lake

Model	Code-Name	Date	TDP	Node	Tiles	Main Tile	CPU	LP E-Core	LLC	GPU	Xe-cores
Core Ultra 100U	Meteor Lake	Q4 2023	15 - 57 W	Intel 4 + N5 + N6	4	tCPU	2P + 8E	2	12 MB	Intel Graphics	4
?	Lunar Lake	Q4 2024	17 - 30 W	N3B + N6	2	CPU + GPU & IMC	4P + 4E	0	12 MB	Arc	8
?	Panther Lake	Q1 2026 ?	?	Intel 18A + N3E	3	CPU + MC	4P + 8E	4	?	Arc	12

Comparison of die size of Each Tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

	Meteor Lake	Arrow Lake (N3B)	Lunar Lake	Panther Lake
Platform	Mobile H/U Only	Desktop & Mobile H&HX	Mobile U Only	Mobile H
Process Node	Intel 4	TSMC N3B	TSMC N3B	Intel 18A
Date	Q4 2023	Desktop-Q4-2024 H&HX-Q1-2025	Q4 2024	Q1 2026 ?
Full Die	6P + 8E	8P + 16E	4P + 4E	4P + 8E
LLC	24 MB	36 MB ?	12 MB	?
tCPU	66.48
tGPU	44.45
SoC	96.77
IOE	44.45
Total	252.15

Intel Core Ultra 100 - Meteor Lake

As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)
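As a quick sanity check on the Meteor Lake column of the die-size table, here is a small Python sketch that sums the four tile areas and works out how much of that silicon comes from Intel versus TSMC under the split described above. The unit (mm²) is an assumption, since the table does not state one, and the Foveros base tile is left out because the table gives no figure for it.

```python
# Tile areas copied from the Meteor Lake column of the table above.
# Unit is assumed to be mm^2 (the table does not state it).
tile_areas = {"tCPU": 66.48, "tGPU": 44.45, "SoC": 96.77, "IOE": 44.45}

# Manufacturing split as described in the post: Intel makes the CPU tile,
# TSMC makes the GPU, SoC and I/O tiles. The Foveros base tile is not
# included because no area is listed for it.
made_by = {"tCPU": "Intel", "tGPU": "TSMC", "SoC": "TSMC", "IOE": "TSMC"}

total = round(sum(tile_areas.values()), 2)

area_by_fab = {}
for tile, area in tile_areas.items():
    fab = made_by[tile]
    area_by_fab[fab] = area_by_fab.get(fab, 0.0) + area

# Share of the listed tile area per manufacturer, in percent.
share_by_fab = {fab: 100 * area / total for fab, area in area_by_fab.items()}

print(f"total listed tile area: {total}")
for fab, share in share_by_fab.items():
    print(f"{fab}: {area_by_fab[fab]:.2f} ({share:.1f}% of listed area)")
```

The sum matches the 252.15 total in the table, and under these assumptions the Intel-made compute tile is roughly a quarter of the listed area.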

DavidC1 · Oct 15, 2024

Hulk said:
I wonder if Intel considered something like 2+10 for Lunar Lake.

2+10 sounds good but that means creating another configuration as Skymont is only quad cluster. So... 2+8.

Hulk said:
What are the advantages and disadvantages of Skymonts 3x3 decoder vs. Lion Cove's 8 simple decoders?

The x86 variable instruction length makes it hugely advantageous to go with the cluster decode configuration.

Here's the page talking about variable length decode:

Page 2 - Discussion - Zen 5 Architecture & Technical discussion

forums.anandtech.com

Whatever advantage a full decoder has, it's not worth it at all after a certain width, especially when you consider that wide decode is for increasing the peaks, which already diminishes whatever perceived advantage there might be. So yes, back in the P3, P4, and Core days the decoders were small. Now we're talking double and triple that width, which dramatically increases the complexity.

Here's a great thread talking about Skymont's clustered decode: https://news.ycombinator.com/item?id=40711835

And because it's a brand new idea, you potentially have other opportunities to further improve it, compared to the traditional superscalar approach, which has existed for 30+ years. Kinda hard to see it as anything other than a win-win.
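To make the variable-length point concrete, here is a toy Python sketch. It is not x86 and not how Skymont is actually built: the encoding (first byte of each instruction stores its length) and the helper names `encode` and `walk` are made up for illustration. The idea it shows is that finding instruction boundaries is a serial chain, because each start depends on the previous instruction's length, while several decoders that each begin at an already known entry point (assumed here to come from something like predicted branch targets) can walk their own shorter chains independently.

```python
from typing import List


def encode(lengths: List[int]) -> bytes:
    """Build a toy byte stream: each instruction's first byte stores its length."""
    out = bytearray()
    for n in lengths:
        out.append(n)                   # length byte (hypothetical encoding)
        out.extend(b"\x00" * (n - 1))   # padding standing in for the rest
    return bytes(out)


def walk(code: bytes, start: int, stop: int) -> List[int]:
    """Return instruction start offsets in [start, stop), found one after another."""
    offsets = []
    pc = start
    while pc < stop:
        offsets.append(pc)
        pc += code[pc]  # the next start is unknown until this length is read
    return offsets


lengths = [1, 3, 2, 5, 1, 4, 2, 2, 6, 1, 3, 2]
code = encode(lengths)

# One decoder walking the whole stream: a dependency chain of 12 steps.
serial = walk(code, 0, len(code))

# Three clusters, each handed a known-good entry point (assumption: these
# would be supplied by branch prediction, here we just pick every 4th start).
entry_points = [serial[0], serial[4], serial[8]]
segment_ends = entry_points[1:] + [len(code)]
clusters = [walk(code, s, e) for s, e in zip(entry_points, segment_ends)]

print("serial chain length:", len(serial))
print("longest cluster chain:", max(len(c) for c in clusters))
```

The clusters together recover exactly the same boundaries as the single serial walk, but the longest dependency chain any one of them has to resolve is a third as long. A real front end has many more constraints (fetch bandwidth, how often taken branches occur, reordering the decoded output), so treat this only as an illustration of the boundary-finding argument.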

coercitiv · Oct 15, 2024

Hulk said:
I wonder if Intel considered something like 2+10 for Lunar Lake. Given the huge increase in IPC for Skymont coupled with the fact that the clockspeed advantage of Lion Cove is quite small in this segment, 2+10 seems like it could have been a nice configuration and probably even a bit less area than 4+4.

Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:

Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3
P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads

The reality of the situation is we would not be having this conversation if LC was more efficient and a bit faster. I would argue 4P is a very good foundation for the 12-17W envelope. From here Intel could add 2-4 more E cores to keep scaling the design within the same TDP range (assuming the quad cluster is a limiting factor then it would be 4), or alternatively add +2P and maybe another +4E for higher TDP designs. Obviously it ain't happening this way, as we have no direct iteration from LNL, so we'll need to see how they follow up.

Personally I'm getting increasingly convinced the high performance NPUs are wasted in all the CPUs of this gen (in the Win ecosystem). I think we would have been fine with 12-20 TOPS as hardware support for the first generation of software. By the time we get proper AI feature implementation on Windows, the current "high performance" NPUs will likely be obsolete in terms of performance or hardware support. In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.

DavidC1 · Oct 15, 2024

coercitiv said:
Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:

Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3

P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads

The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.

We wondered exactly how it would perform, considering it was an improvement over LPE in Meteorlake, but the SLC is quite slow and the LPE moniker is appropriate in Lunarlake.

In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.

Yea. It's really only useful if you have a discrete GPU, like Nvidia's top end parts. What matters if you go from 0.01x of the performance to 0.05x?

They could have doubled the SLC cache to 16MB too, reduced power further and improved performance. Or 12 Xe2 cores. Heck, 2 extra Lion Cove cores! Anything really other than a honking NPU.

poke01 · Oct 15, 2024

DavidC1 said:
Anything really other than a honking NPU.

Gen4 NPU is too big, Gen5 in PTL should be much smaller.

IMO, NPUs right now are useless in the Windows ecosystem. Especially AMD's, which is half-baked in terms of support.

511 · Oct 15, 2024

My point what is the point of NPU nearly 6X performance for an NPU on Arrow Lake H iGPU on LNL it is 48TOPS vs 67TOPs 40% more TOPs

coercitiv · Oct 15, 2024

511 said:
My point what is the point of NPU nearly 6X performance for an NPU on Arrow Lake H iGPU on LNL it is 48TOPS vs 67TOPs 40% more TOPs

The more time you save by writing down stream of consciousness instead of purposely structured sentences, the less people will care to read your replies.

moinmoin · Oct 15, 2024

Jan Olšan said:
The design company (Centaur Tech) architecting their cores was a relatively small team.

That reminds me, Intel bought that design staff from VIA for $125M back in 2021. Anybody happen to know what happened to them? I guess they strengthened the Atom/e core team in Austin?

511 · Oct 15, 2024

coercitiv said:
The more time you save by writing down stream of consciousness instead of purposely structured sentences, the less people will care to read your replies.

I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper but yes i will keep it in mind😅

Thanks Microsoft for another useless feature

They should have just slashed NPU by 1/3 and made it more cheaper for us to buy i am pretty sure people would have bought it

cannedlake240 · Oct 15, 2024

511 said:
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper but yes i will keep it in mind😅

Thanks Microsoft for another useless feature

They should have just slashed NPU by 1/3 and made it more cheaper for us to buy i am pretty sure people would have bought it

Intel's npus are too big, look at apple, it's 5-7mm2 at most on A18. Lnl npu4 isn't much smaller than the entire 4C CPU cluster. How is apple so good at designing all these IPs man... This is why Intel didn't stand a chance in the mobile market

coercitiv · Oct 15, 2024

511 said:
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper

Here's a simple comparison based on the LNL floor plan. Two of those NPU NCEs would almost cover an entire Skymont cluster.

Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway. Today's chips need to have NPUs so that devs can count on the functionality, but whether they need this much NPU is debatable in my opinion (as a simple consumer, the one who's supposed to pay for this as a product). There's a high chance this hardware will become obsolete before the really good AI features come online.

I'll stop here as this was supposed to be just an observation to @Hulk 's commentary about P and E core configs in Lunar Lake... and it's borderline a rant now.

poke01 · Oct 15, 2024

cannedlake240 said:
How is apple so good at designing all these IPs man

Apple is more of a hardware company than a software company; their hardware teams are among the best in the industry. Their software is okay but not industry leading. It comes down to planning, setting achievable goals and iterating on IP every year until you have mastered it.

You also have to hire the right people and manage them properly. Apple ain't perfect either; like Intel, they have made plenty of mistakes, such as their car project, which was the perfect case study in how not to handle a long term project. You just have to remember not to repeat those failures.

511 · Oct 15, 2024

cannedlake240 said:
Intel's npus are too big, look at apple, it's 5-7mm2 at most on A18. Lnl npu4 isn't much smaller than the entire 4C CPU cluster. How is apple so good at designing all these IPs man... This is why Intel didn't stand a chance in the mobile market

They didn't cause they never bothered and when they bothered it was too late also

Apple and Intel's use of Libraries makes the difference as well one uses HP other uses HD apple is moving to HP Now since they have started chasing clocks

There is the point of stagnation of design team and foundry as well both were interdependent Meanwhile Apple Multi sourced and their execution ability which Intel lacked

Combining these with Apples amazing Design we get M series SOC

FlameTail · Oct 15, 2024

Lunar Lake Compute Tile = 140 mm²

Strix Point die = 232 mm²

Apple M4 die = 165 mm²

X Elite die = 169 mm²

It does seem like Intel/AMD have bigger NPUs than Apple/Qualcomm.

FlameTail · Oct 15, 2024

511 said:
Apple and Intel's use of Libraries makes the difference as well one uses HP other uses HD apple is moving to HP Now since they have started chasing clocks

I doubt Intel is using HP libraries for the NPU.

Intel's NPU does support more data types than Apple's does, iirc. Though I don't think that alone explains the much bigger NPU size.

Edit: Apple NPU is 35 TOPS of INT8, Lunar Lake's NPU is 48 TOPS of INT8. So Intel does have more "TOPS".

511 · Oct 15, 2024

FlameTail said:
I doubt Intel is using HP libraries for the NPU.

HP and Intel are two things that are interlinked they used HP Library for GPU (Xe-HPG) as well they will use it for almost everything

dullard · Oct 15, 2024

coercitiv said:
Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway.

I'd have to disagree. The previews of AI computers coming out the next 3 months are finally getting good reviews of the AI features. Some of this is brand focused (HP's AI print seems to make people quite happy using natural language to reformat documents for exactly the right margins, correct printer problems, etc.) This includes reviewers who previously panned AI as being useless or not ready.

But some things from Microsoft will work on all Windows computers. I'm most looking forward to the new Windows search--no longer do you have to search for file names or specific text in documents. Just type "BBQ Party" in any search bar and all photos from your 2017 barbeque party appear (regardless of their file names or any tags that you might have used). But, voice clarity for conference calls and paint getting features like erase or fill will be quite useful to me. Click to Do might be good -- depends on implementation and I haven't tried it yet.

jpiniero · Oct 15, 2024

The NPU is only because of Wall Street's obsession with AI. Just going to have to live with it until it ends.

Wolverine2349 · Oct 15, 2024

jpiniero said:
The NPU is only because of Wall Street's obsession with AI. Just going to have to live with it until it ends.

BINGO Well said.

SO sick of AI and I hope it blows up and crashes so much worse than dot com bubble ever did.

The irony of the dot com bubble is that the Internet had a positive and revolutionary impact on our lives at the time. There was just irrational exuberance in the stock market, with bidding on anything with so much as a dot com name, even companies with 0 assets or bad debt, which caused the severe crash.

AI is so much more useless, and there is starting to be irrational exuberance for something that is dangerous and should never revolutionize and enslave our lives. Cannot wait until it's wiped out of Wall Street.

jur · Oct 15, 2024

AI is not going away; if anything its presence will increase. MS will probably push it to all Office products. I bet it will also become a core part of Windows, so that we'll be able to actually talk to the OS. Google will integrate it into its services. Photo / video editing software will have it (or already has). All text editing software... Essentially, it will be everywhere and, from my experience, in some cases it can be a big time saver. Accelerators are the future, unless there's a big jump in CPU performance, but judging from the latest Intel / AMD releases, the outlook is pessimistic.

CakeMonster · Oct 15, 2024

My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine. Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.

dullard · Oct 15, 2024

CakeMonster said:
My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine.

That is the whole reason for NPUs: to do the AI work locally without sharing data.

CakeMonster said:
Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.

You can choose not to use many of the AI features. But, so much of it will be built in. For example, I doubt that you'll be able to turn off natural language file searching (nor once you use it will you want to turn it off).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.

Jan Olšan · Oct 15, 2024

poke01 said:
The fact that Pat killed Royal core was something. That team was the only team at Intel who could have matched Apple's P core in one generation.

But nooo kill that team. Now, that group founded a RISC-V start up.

Do we even know the project was any good?

Actually, where do we even know about "Royal Core" from (and Royal Cove supposedly being the next best thing after sliced bread), isn't it just from some MLID video? Remember that Arrow Lake was supposed to have +40% ST performance according to the same source?

I mean, if Royal Core or Beast Lake or whatever was supposed to have the rentable units (also from MLID?), those things sounded awfully like the VISC concept Intel bought in a startup. And VISC sounded like a design trying to implement inverse HT (which was an April Fools' joke at one point), likely through some software-translation scheme a la Nvidia ARM cores and Transmeta.

I think that was more likely to be the next colossal trainwreck rather than being the next CPU revolution. If Gelsinger killed these crazy projects and ordered the teams to focus on core that gets good in a conventional way instead of trying to be the next Bulldozer or Itanium, that is probably a good thing, not a bad one.

"Conventional" may sound bad but that is the success path so far - Apple, AMD, ARM, even Intel post Netburst (and Itanium).

DrMrLordX · Oct 15, 2024

moinmoin said:
That reminds me, Intel bought that design staff from VIA for $125M back in 2021. Anybody happen to know what happened to them? I guess they strengthened the Atom/e core team in Austin?

It would be interesting to know what happened to that team. Somehow I doubt they're contributing to Atom design, but I could be wrong.

OneEng2 · Oct 15, 2024

DavidC1 said:
The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.

This, and lots of other limitations, will hurt Skymont in many situations and applications. That's OK though, that is exactly why a P core cluster is needed.

I believe that the future of processing is more stratification of workloads, not unification of the CPU Core designs.

Already we have:

High Performance Core
Efficiency Core
Graphics Core
AI Core

I think that as time moves on, the number of specific core types will increase and products will be defined mostly by the mix of core types and the number of those core types. In fact, musical instruments already employ a DSP core for sound processing algorithms, as an example. They are able to do the kind of work a PC workstation takes many seconds to do in microseconds, to the point where all channel processing within a mixer is done and converted to analog signals in under 1 ms. In other words, the specific hardware is THOUSANDS of times faster than a very fast PC algorithm can be.

I do not imagine a world where a single compute design rules them all.

You disparage Lion Cove and praise Skymont; however, I think that while there may be some justification for your Lion Cove animosity, the design is undoubtedly more efficient than the previous generation. That efficiency is where things will pay off in the future IMO.

Josh128 · Oct 15, 2024

dullard said:
That is the whole reason for NPUs: to do the AI work locally without sharing data.

You can choose not to use many of the AI features. But, so much of it will be built in. For example, I doubt that you'll be able to turn off natural language file searching (nor once you use it will you want to turn it off).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.

What precisely do you use it for? Other than image generation, which often requires a lot of manual tweaking once you obtain the output, web search summary is the only thing I use it for (because it's the default for most browsers now), and that's not particularly a must-have feature to me.
