Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
702
632
106






With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, called RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

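| Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4 |
| ? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 12 MB | Arc | 8 |
| ? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12 |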



Comparison of the die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

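| | Meteor Lake | Arrow Lake (N3B) | Lunar Lake | Panther Lake |
|---|---|---|---|---|
| Platform | Mobile H/U Only | Desktop & Mobile H&HX | Mobile U Only | Mobile H |
| Process Node | Intel 4 | TSMC N3B | TSMC N3B | Intel 18A |
| Date | Q4 2023 | Desktop-Q4-2024; H&HX-Q1-2025 | Q4 2024 | Q1 2026 ? |
| Full Die | 6P + 8E | 8P + 16E | 4P + 4E | 4P + 8E |
| LLC | 24 MB | 36 MB ? | 12 MB | ? |
| tCPU (mm²) | 66.48 | | | |
| tGPU (mm²) | 44.45 | | | |
| SoC (mm²) | 96.77 | | | |
| IOE (mm²) | 44.45 | | | |
| Total (mm²) | 252.15 | | | |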
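
As a quick sanity check on the tile figures listed above, here is a minimal Python sketch that sums them and shows each tile's share of the total. It assumes the four values belong to Meteor Lake and are in mm²; the names `tile_areas_mm2`, `total_area` and `area_share` are made up for illustration.

```python
# Tile areas as listed in the table above.
# Assumption: these are the Meteor Lake tiles, in mm^2.
tile_areas_mm2 = {
    "tCPU": 66.48,
    "tGPU": 44.45,
    "SoC": 96.77,
    "IOE": 44.45,
}


def total_area(areas):
    """Sum the per-tile areas, rounded to two decimals like the table."""
    return round(sum(areas.values()), 2)


def area_share(areas):
    """Return each tile's share of the summed area, in percent."""
    total = sum(areas.values())
    return {name: round(100 * area / total, 1) for name, area in areas.items()}


total = total_area(tile_areas_mm2)
shares = area_share(tile_areas_mm2)
print(total)
print(shares)
```

The sum matches the "Total" row, and under these numbers the SoC tile is the largest single piece of silicon, ahead of the compute tile.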



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)



 

Attachments

  • PantherLake.png
    283.5 KB · Views: 24,014
  • LNL.png
    881.8 KB · Views: 25,501

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
I wonder if Intel considered something like 2+10 for Lunar Lake.
2+10 sounds good but that means creating another configuration as Skymont is only quad cluster. So... 2+8.
What are the advantages and disadvantages of Skymont's 3x3 decoder vs. Lion Cove's 8 simple decoders?
The x86 variable instruction length makes it hugely advantageous to go with the cluster decode configuration.

Here's the page talking about variable length decode:
Whatever advantage a full decoder has, it's not worth it at all after a certain width, especially when you consider that wide decode is for increasing the peaks, so it already diminishes whatever perceived advantage there might be. So yes, back in the P3, P4, and Core days the decoders were small. Now we're talking double and triple that width, which dramatically increases the complexity.

Here's a great thread talking about Skymont's clustered decode: https://news.ycombinator.com/item?id=40711835

And because it's a brand new idea, you have potentially other opportunities to further improve it, compared to traditional superscalar approach which existed for 30+ years. Kinda hard to see it as anything other than a win-win.
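
To make the variable-length point concrete, here is a toy Python sketch. It is not real x86 and not a description of how Skymont is actually built: it assumes a made-up encoding where the first byte of each instruction is its length, and it treats the cluster entry points as offsets that are already known to be instruction boundaries (think predicted branch targets). The function names are hypothetical.

```python
# Toy variable-length ISA (hypothetical, NOT real x86): the first byte of each
# instruction is its total length in bytes; the remaining bytes are payload.


def instruction_starts(code, start, end):
    """Walk code[start:end] serially and return the offset of each instruction."""
    starts = []
    pos = start
    while pos < end:
        length = code[pos]
        if length <= 0:
            raise ValueError(f"invalid length byte at offset {pos}")
        starts.append(pos)
        # The next boundary is only known after this length has been read,
        # which is what makes a single wide decoder expensive.
        pos += length
    return starts


def clustered_starts(code, entry_points):
    """Decode each block on its own, as separate decode clusters could.

    entry_points are offsets assumed to be known instruction boundaries.
    """
    bounds = list(entry_points) + [len(code)]
    starts = []
    for block_start, block_end in zip(bounds, bounds[1:]):
        # Each block depends only on its own bytes, so in hardware the
        # blocks could be handled in parallel by different clusters.
        starts.extend(instruction_starts(code, block_start, block_end))
    return starts


# Six instructions with lengths 2, 1, 3, 1, 4, 2 (13 bytes in total).
code = [2, 0, 1, 3, 0, 0, 1, 4, 0, 0, 0, 2, 0]

serial = instruction_starts(code, 0, len(code))
clustered = clustered_starts(code, [0, 3, 7])
print(serial)
print(clustered)
```

Both walks find the same boundaries, but the clustered version never needs one chain of length lookups longer than a single block, which is the gist of the argument for narrow clusters over one very wide decoder.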
 

coercitiv

Diamond Member
Jan 24, 2014
6,761
14,684
136
I wonder if Intel considered something like 2+10 for Lunar Lake. Given the huge increase in IPC for Skymont coupled with the fact that the clockspeed advantage of Lion Cove is quite small in this segment, 2+10 seems like it could have been a nice configuration and probably even a bit less area than 4+4.
Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:
  • Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3
  • P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads
The reality of the situation is we would not be having this conversation if LC was more efficient and a bit faster. I would argue 4P is a very good foundation for the 12-17W envelope. From here Intel could add 2-4 more E cores to keep scaling the design within the same TDP range (assuming the quad cluster is a limiting factor then it would be 4), or alternatively add +2P and maybe another +4E for higher TDP designs. Obviously it ain't happening this way, as we have no direct iteration from LNL, so we'll need to see how they follow up.

Personally I'm getting increasingly convinced the high performance NPUs are wasted in all the CPUs of this gen (in the Win ecosystem). I think we would have been fine with 12-20 TOPS as hardware support for the first generation of software. By the time we get proper AI feature implementation on Windows, the current "high performance" NPUs will likely be obsolete in terms of performance or hardware support. In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.
 

DavidC1

Golden Member
Dec 29, 2023
1,211
1,932
96
Like @gdansk already wrote, given their previous config they sure did consider it. However, 4P config is much more suitable for the kind of workloads this chip will see, from browser engines to low power gaming. And again, the IPC delta between Skymont and Lion Cove does not tell you the full story:
  • Skymont in LNL runs at much lower clocks than Lion, and would also have trouble scaling without access to a fast L3
  • P and E cores no longer share the ring bus, so it's a good idea to have enough of them on each side to tackle most common workloads
The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.

We wondered exactly how it would perform, considering it was an improvement over LPE in Meteorlake, but the SLC is quite slow and the LPE moniker is appropriate in Lunarlake.
In the case of LNL we could have gotten another 4E cluster if the NPU was half the size, and that would have improved performance scaling up to 28W.
Yea. It's really only useful if you have a discrete GPU, like Nvidia's top end parts. What matters if you go from 0.01x of the performance to 0.05x?

They could have doubled the SLC cache to 16MB too, reduced power further and improved performance. Or 12 Xe2 cores. Heck, 2 extra Lion Cove cores! Anything really other than a honking NPU.
 

511

Golden Member
Jul 12, 2024
1,038
896
106
My point is: what is the point of the NPU? On Arrow Lake H the iGPU has nearly 6X the performance of the NPU, and on LNL it is 48 TOPS vs 67 TOPS, 40% more TOPS.
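
The 40% figure can be checked from the two numbers in the post. A minimal Python sketch, assuming 48 TOPS is the Lunar Lake NPU and 67 TOPS is the Lunar Lake iGPU (the post does not label them explicitly); `percent_more` is a made-up helper name.

```python
def percent_more(a, b):
    """Return how much larger b is than a, in percent."""
    return (b - a) / a * 100


lnl_npu_tops = 48   # figure quoted in the post, assumed to be the NPU
lnl_igpu_tops = 67  # figure quoted in the post, assumed to be the iGPU

gap = percent_more(lnl_npu_tops, lnl_igpu_tops)
print(round(gap, 1))
```

That comes out a little under 40%, so the rounded number in the post holds up.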
 

Attachments

  • 2024-10-10_2-03-31-1456x819.png
    863.7 KB · Views: 15

511

Golden Member
Jul 12, 2024
1,038
896
106
The more time you save by writing down stream of consciousness instead of purposely structured sentences, the less people will care to read your replies.
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper but yes i will keep it in mind😅

Thanks Microsoft for another useless feature

They should have just slashed the NPU by 1/3 and made it cheaper for us to buy. I am pretty sure people would have bought it.
 

cannedlake240

Senior member
Jul 4, 2024
207
111
76
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper but yes i will keep it in mind😅

Thanks Microsoft for another useless feature

They should have just slashed the NPU by 1/3 and made it cheaper for us to buy. I am pretty sure people would have bought it.
Intel's NPUs are too big; look at Apple, it's 5-7 mm² at most on A18. LNL's NPU4 isn't much smaller than the entire 4C CPU cluster. How is Apple so good at designing all these IPs, man... This is why Intel didn't stand a chance in the mobile market.
 

coercitiv

Diamond Member
Jan 24, 2014
6,761
14,684
136
I can't see the Die Space wasted that should have been better utilized or removed to make it cheaper
Here's a simple comparison based on the LNL floor plan. Two of those NPU NCEs would almost cover an entire Skymont cluster.



Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway. Today's chips need to have NPUs so that devs can count on the functionality, but whether they need this much NPU is debatable in my opinion (as a simple consumer, the one who's supposed to pay for this as a product). There's a high chance this hardware will become obsolete before the really good AI features come online.

I'll stop here as this was supposed to be just an observation to @Hulk 's commentary about P and E core configs in Lunar Lake... and it's borderline a rant now.
 

poke01

Platinum Member
Mar 8, 2022
2,581
3,409
106
How is apple so good at designing all these IPs man
Apple is more of a hardware company than a software company; their hardware teams are among the best in the industry. Their software is okay but not industry leading. It comes down to planning, setting achievable goals and iterating on IP every year until you have mastered it.

You also have to hire the right people and manage them properly. Apple also ain't perfect; like Intel, they made plenty of mistakes, such as their car project, which was the perfect case study in how not to handle a long-term project. You just have to remember not to repeat those failures.
 
Reactions: Tlh97 and 511

511

Golden Member
Jul 12, 2024
1,038
896
106
Intel's NPUs are too big; look at Apple, it's 5-7 mm² at most on A18. LNL's NPU4 isn't much smaller than the entire 4C CPU cluster. How is Apple so good at designing all these IPs, man... This is why Intel didn't stand a chance in the mobile market.
They didn't because they never bothered, and when they did bother it was too late.

Apple's and Intel's use of libraries makes a difference as well: one uses HP, the other uses HD. Apple is moving to HP now since they have started chasing clocks.

There is also the stagnation of the design team and the foundry, which were interdependent. Meanwhile, Apple multi-sourced and had the execution ability that Intel lacked.

Combine these with Apple's amazing design and we get the M-series SoCs.
 

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,594
106
Apple's and Intel's use of libraries makes a difference as well: one uses HP, the other uses HD. Apple is moving to HP now since they have started chasing clocks.
I doubt Intel is using HP libraries for the NPU.

Intel's NPU does support more data types than Apple's does, iirc. Though I don't think that alone explains the much bigger NPU size.

Edit: Apple's NPU is 35 TOPS of INT8, Lunar Lake's NPU is 48 TOPS of INT8. So Intel does have more "TOPS".
 

dullard

Elite Member
May 21, 2001
25,554
4,050
126
Today, from a consumer point of view, the NPU just adds to the weight of the chip. MS failed to deliver anything of substance this year, and by the time they do come up with cool features we'll probably have a new generation of chips anyway.
I'd have to disagree. The previews of AI computers coming out the next 3 months are finally getting good reviews of the AI features. Some of this is brand focused (HP's AI print seems to make people quite happy using natural language to reformat documents for exactly the right margins, correct printer problems, etc.) This includes reviewers who previously panned AI as being useless or not ready.

But some things from Microsoft will work on all Windows computers. I'm most looking forward to the new Windows search--no longer do you have to search for file names or specific text in documents. Just type "BBQ Party" in any search bar and all photos from your 2017 barbeque party appear (regardless of their file names or any tags that you might have used). But, voice clarity for conference calls and paint getting features like erase or fill will be quite useful to me. Click to Do might be good -- depends on implementation and I haven't tried it yet.
 
Reactions: Elfear

Wolverine2349

Senior member
Oct 9, 2022
438
143
86
The NPU is only because of Wall Street's obsession with AI. Just going to have to live with it until it ends.

BINGO Well said.

SO sick of AI and I hope it blows up and crashes so much worse than dot com bubble ever did.

The irony of the dot-com bubble is that the Internet had a positive and revolutionary impact on our lives at the time. There was just irrational exuberance in the stock market, bidding up anything with so much as a dot-com name, even companies with 0 assets or bad debt, which caused the severe crash.

AI is so much more useless, and irrational exuberance is starting to build for something that is dangerous and should never revolutionize and enslave our lives. Cannot wait until it's wiped out of Wall Street.
 
Reactions: sgs_x86

jur

Member
Nov 23, 2016
31
11
81
AI is not going away; if anything its presence will increase. MS will probably push it to all Office products. I bet it will also become a core part of Windows, so that we'll be able to actually talk to the OS. Google will integrate it into its services. Photo / video editing software will have it (or already has it). All text editing software... Essentially, it will be everywhere and, from my experience, in some cases it can be a big time saver. Accelerators are the future, unless there's a big jump in CPU performance, but judging from the latest Intel / AMD releases, the outlook is pessimistic.
 

CakeMonster

Golden Member
Nov 22, 2012
1,522
690
136
My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine. Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.
 

dullard

Elite Member
May 21, 2001
25,554
4,050
126
My problem is more with the sharing of data, if they want to introduce local AI accelerated by CPU/GPU that can run offline with local accounts, then fine.
That is the whole reason for NPUs: to do the AI work locally without sharing data.
Just let me enable/disable it at will in Edge/Office/Desktop etc and don't make any other Windows features dependent on it.
You can choose not to use many of the AI features. But so much of it will be built in. For example, I doubt that you'll be able to turn off natural language file searching (nor, once you use it, will you want to turn it off).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.
 

Jan Olšan

Senior member
Jan 12, 2017
427
776
136
The fact that Pat killed Royal core was something. That team was the only team at Intel who could have matched Apple's P core in one generation.

But nooo kill that team. Now, that group founded a RISC-V start up.
Do we even know the project was any good?

Actually, where do we even know about "Royal Core" from (and Royal Cove supposedly being the next best thing after sliced bread), isn't it just from some MLID video? Remember that Arrow Lake was supposed to have +40% ST performance according to the same source?

I mean, if Royal Core or Beast Lake or whatever was supposed to have the rentable units (also from MLID?), those things sounded awfully like the VISC concept Intel bought in a startup. And VISC sounded like a design trying to implement inverse HT (which was an April Fools' joke at one point), likely through some software-translation scheme a la Nvidia's ARM cores and Transmeta.

I think that was more likely to be the next colossal trainwreck rather than the next CPU revolution. If Gelsinger killed these crazy projects and ordered the teams to focus on a core that gets good in a conventional way instead of trying to be the next Bulldozer or Itanium, that is probably a good thing, not a bad one.

"Conventional" may sound bad but that is the success path so far - Apple, AMD, ARM, even Intel post Netburst (and Itanium).
 
Reactions: techjunkie123

DrMrLordX

Lifer
Apr 27, 2000
22,184
11,890
136
That reminds me, Intel bought that design staff from VIA for $125M back in 2021. Anybody happen to know what happened to them? I guess they strengthened the Atom/e core team in Austin?
It would be interesting to know what happened to that team. Somehow I doubt they're contributing to Atom design, but I could be wrong.
 
Reactions: moinmoin

OneEng2

Senior member
Sep 19, 2022
259
358
106
The lack of a fast L3 cache hurts Skymont in LNL quite a bit, especially in FP, where Crestmont is quite weak. It seems it loses almost all of the gains from the doubled FP units.
This, and lots of other limitations, will hurt Skymont in many situations and applications. That's OK though, that is exactly why a P core cluster is needed.

I believe that the future of processing is more stratification of workloads, not unification of the CPU Core designs.

Already we have:
  • High Performance Core
  • Efficiency Core
  • Graphics Core
  • AI Core
I think that as time moves on, the number of specific core types will increase and products will be defined mostly by the mix of core types and the number of cores of each type. For example, musical instruments already employ a DSP core for sound processing algorithms. They are able to do the kind of work a PC workstation takes many seconds to do in microseconds, to the point where all channel processing within a mixer is done and converted to analog signals in under 1 ms. In other words, the specific hardware is THOUSANDS of times faster than a very fast PC algorithm can be.

I do not imagine a world where a single compute design rules them all.

You disparage Lion Cove and praise Skymont; however, I think that while there may be some justification to your Lion Cove animosity, the design is undoubtedly more efficient than the previous generation. That efficiency is where things will pay off in the future IMO.
 

Josh128

Senior member
Oct 14, 2022
511
865
106
That is the whole reason for NPUs: to do the AI work locally without sharing data.

You can choose not to use many of the AI features. But so much of it will be built in. For example, I doubt that you'll be able to turn off natural language file searching (nor, once you use it, will you want to turn it off).

I realize that I'm just about the only AI cheerleader here. But, damn is it useful when I use it. It is like night and day better.
What precisely do you use it for? Other than image generation, which often requires a lot of manual tweaking once you obtain the output, web search summary is the only thing I use it for (because it's the default for most browsers now), and that's not particularly a must-have feature to me.
 