Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads


Tigerick

Senior member
Apr 1, 2022
712
657
106






With Hot Chips 34 starting this week, Intel will unveil technical information on the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the new-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.



Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

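| Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores |
| Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4 |
| ? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 12 MB | Arc | 8 |
| ? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12 |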



Comparison of the die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

| | Meteor Lake | Arrow Lake (N3B) | Lunar Lake | Panther Lake |
| Platform | Mobile H/U Only | Desktop & Mobile H&HX | Mobile U Only | Mobile H |
| Process Node | Intel 4 | TSMC N3B | TSMC N3B | Intel 18A |
| Date | Q4 2023 | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2024 | Q1 2026 ? |
| Full Die | 6P + 8E | 8P + 16E | 4P + 4E | 4P + 8E |
| LLC | 24 MB | 36 MB ? | 12 MB | ? |
| tCPU | 66.48 | | | |
| tGPU | 44.45 | | | |
| SoC | 96.77 | | | |
| IOE | 44.45 | | | |
| Total | 252.15 | | | |



Intel Core Ultra 100 - Meteor Lake



As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)



 


Doug S

Diamond Member
Feb 8, 2020
3,005
5,167
136
There's nothing easy anymore. NOTHING. It's obvious to anyone who isn't lying to themselves and has played with simple GPU voltage/frequency scaling. At about 0.6V it can barely reach 300MHz.

From that point it scales superlinearly with regard to voltage: 5-10% more voltage may double frequency, or 1.05 x 1.05 x 2 = 2.2x power for 2x frequency, assuming we ignore static leakage and fixed block power.
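A rough sketch of that arithmetic, assuming the textbook dynamic-power relation P ~ C * V^2 * f and, as the post does, ignoring static leakage and fixed block power. The function name and its parameters are hypothetical, chosen only for illustration.

```python
def dynamic_power_ratio(voltage_scale: float, frequency_scale: float) -> float:
    """Relative dynamic power under the textbook P ~ C * V^2 * f model.

    Assumption: capacitance is unchanged, and static leakage and fixed
    block power are ignored, as in the post above.
    """
    return voltage_scale ** 2 * frequency_scale


# 5% more voltage with frequency doubled (low end of the 5-10% range).
low = dynamic_power_ratio(1.05, 2.0)
# 10% more voltage with frequency doubled (high end of the range).
high = dynamic_power_ratio(1.10, 2.0)

print(f"+5% V, 2x f:  {low:.3f}x power")
print(f"+10% V, 2x f: {high:.3f}x power")
```

Under this simplified model, doubling frequency in that region costs somewhere between roughly 2.2x and 2.4x the dynamic power.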

The V/F curve has superlinear, linear, and sublinear regions depending on where on the curve you are. It's not that simple.

It used to be as simple as @Abwx claims back in the 0.35u days, when the voltages were in the 2-2.5V range. Not anymore. The transistors in modern CPUs have been stuck not far above threshold voltage for a long, long time.


Isn't that supralinear scaling because saturating current for silicon is about 1V so once you get below that domain things start to get squirrely? CPUs used to not operate at all much below that range, but modern techniques have pushed down that minimum voltage bit by bit over time. It isn't so much that ADDING voltage from 0.6V is supralinear, but the design tradeoffs required for reducing voltage below saturating current range exact a bigger cost in operational frequency.

I remember reading a decade ago or so that Intel was doing research into near threshold operation (IIRC about 0.25V, the minimum where silicon transistors can begin to switch 'on') because that was the most power efficient operation range. There were so many tradeoffs designing for operation at such low voltages that they couldn't also operate at "normal" voltages, so they'd only be suitable for very low power roles. Since you have to choose your process for the whole chip it wouldn't be applicable to higher power designs. But in the age of chiplets, who knows. Maybe you can have a chiplet full of near threshold super-E cores that max out well under 1 GHz but possessing 10x-50x the energy efficiency of a regular E core.

Dunno if anything ever came of that research. They may have canceled it once they were forced to admit to the reality that x86 was never going to compete with ARM for very low power embedded roles.
 

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
285K media engine fully supports HEVC (H.265) hardware-accelerated encode/decode and should easily beat the competition. Which encoder are you using? Maybe the one you're using doesn't support/use Intel HEVC media engine. Also, Lunar Lake supports VVC (H.266).
Hardware encode is not even close to CPU encode considering quality at a given bandwidth. I only hardware encode.
 

AcrosTinus

Member
Jun 23, 2024
168
169
76
NVL is doing 16+32
This sounds a bit too utopian.
16 P-cores and 32 E-cores on Intel 18A or N2: that is quite a die size increase on a much more expensive node.

Furthermore, if Intel does not have some kind of caching layer for games, the typical reviewers will negate the workstation performance by simply saying AMD XXXX-x3D is faster and more power efficient, without adding the constraint "in gaming".

I am done with mainstream anyway for my next build. After the Intel 5820K, which was just a dream machine that left nothing to be desired, everything that followed was cut down, restrained, buggy, or plagued with security issues à la Spectre or Meltdown.

HEDT it is now and into the future due to trying to get more into local LLMs as a hobby, the next build is already in motion.
 
Reactions: SteinFG

511

Golden Member
Jul 12, 2024
1,495
1,332
106
This sounds a bit too utopic.
16P cores and 32e on Intel 18A or N2, that is some die size increase on a much more expensive node.
No, they are going to join two 8+16 dies, so there won't be a monolithic 16+32 die.
If it's 18A, it's not that expensive for Intel, and in the end for us, compared with N2.
Furthermore if Intel does not have some kind of caching layer for games, the typical reviewers will negate the workstation performance by simply saying AMD XXXX-x3D is faster and more power efficient without adding the constraint "in gaming".

I am done with mainstream anyways for my next build, after the Intel 5820K which was just a dream machine that left no wishes open, everything else that followed was cut down restrained, bugged or plastered with security issues a la Spectre or Meltdown.

HEDT it is now and into the future due to trying to get more into local LLMs as a hobby, the next build is already in motion.
Yeah
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,277
96
This sounds a bit too utopic.
16P cores and 32e on Intel 18A or N2, that is some die size increase on a much more expensive node.

Furthermore if Intel does not have some kind of caching layer for games, the typical reviewers will negate the workstation performance by simply saying AMD XXXX-x3D is faster and more power efficient without adding the constraint "in gaming".

I am done with mainstream anyways for my next build, after the Intel 5820K which was just a dream machine that left no wishes open, everything else that followed was cut down restrained, bugged or plastered with security issues a la Spectre or Meltdown.

HEDT it is now and into the future due to trying to get more into local LLMs as a hobby, the next build is already in motion.
It's 18A-P and N2. And it's 2 8+16 dies stitched together. And each one shouldn't be that different from previous gen 8+16 dies.

Also NVL is expected to introduce LLC in the base tile. Should be 144MB.
 
Reactions: AcrosTinus
Jul 27, 2020
22,299
15,557
146
And it's 2 8+16 dies stitched together.
Any source for that stitching comment? They may be able to "stitch" for server but for consumer, I think it will be very hard to produce something like that in volume. Most likely it's going to be two separate tiles and there will be a latency penalty for workloads that need both tiles.
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,277
96
Any source for that stitching comment? They may be able to "stitch" for server but for consumer, I think it will be very hard to produce something like that in volume. Most likely it's going to be two separate tiles and there will be a latency penalty for workloads that need both tiles.
Sorry about the confusion. It is separate tiles. I used "stitch" for Foveros. Need a different word.
 
Reactions: igor_kavinski

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
Based on how close Skymont has come to Lion Cove, coupled with the fact that most software will do better with x P-cores and 2x or 3x as many E-cores, it stands to reason that 8 P's and the remaining die space for E's is the way to go.

For something with 12P to do better than, say, 8P + 24E, you would need an application that uses exactly 9-12 cores, no more and no less. Outside possibly of some games, I don't think that exists.
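To make the 9-12 core window concrete, here is a toy throughput model. Everything in it is an assumption for illustration: one thread per core, threads land on the fastest cores first, a P-core counts as 1.0, an E-core as a made-up 0.8 of a P-core, and SMT, clocks, and memory effects are ignored. It is a sketch of the argument, not a benchmark.

```python
def throughput(threads: int, p_cores: int, e_cores: int, e_ratio: float) -> float:
    """Toy aggregate throughput with one thread per core, fastest cores first.

    Assumptions (illustrative only): a P-core counts as 1.0, an E-core as
    `e_ratio` of a P-core, and SMT, clocks, and memory effects are ignored.
    """
    on_p = min(threads, p_cores)
    on_e = min(max(threads - p_cores, 0), e_cores)
    return on_p * 1.0 + on_e * e_ratio


E_RATIO = 0.8  # hypothetical E-core performance relative to a P-core

for n in (8, 10, 12, 14, 16, 32):
    twelve_p = throughput(n, p_cores=12, e_cores=0, e_ratio=E_RATIO)
    hybrid = throughput(n, p_cores=8, e_cores=24, e_ratio=E_RATIO)
    print(f"{n:2d} threads: 12P={twelve_p:5.1f}  8P+24E={hybrid:5.1f}")
```

With these made-up numbers the all-P part only wins in the 9-12 thread window; at 8 threads or fewer the two tie, and from 14 threads up the hybrid pulls ahead.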

Furthermore, the E-cores not being used by an application that requires exactly 9-12 cores can still offload housekeeping or other running applications, thus increasing the performance of that application running on the 8 P's (or on the better chiplet of a 9950X).

8 P's running at high frequencies (kind of inefficiently) and 30 or 40 E's running in the sweet spot of the v/f graph is what I want.

AMD is already kind of doing that with the 9950X by using one higher binned compute chiplet (higher frequency) and one lower binned one (lower frequency). I realize this is an economic decision to save the better binned chiplets for other processors in the stack, but even though the 9950X is the flagship product, AMD realizes that most of the gains of a high clocked chiplet come from the first one being loaded.

All of these decisions are driven by economics coupled with real world performance metrics. It's called optimization.

People here constantly want "pie in the sky" products that defy real world economic decision and optimization.

Both AMD and Intel have "told" us the future is a number of high-performance cores plus a number of lower-performing, more area- and power-efficient ones.

Whether it's P's and E's or C's and c's it's the same philosophy.
 

SiliconFly

Golden Member
Mar 10, 2023
1,925
1,277
96
Based on how close Skymont has come to Lion Cove coupled with the fact that most software will do better with xPcores and 2 or 3x xE cores, there 8 P's and the remaining die space for E's is the way to go.

You would need an application that needs exactly 9-12 cores and not more or less, for example, if you have 8P + 24E for something with 12P to do better. Outside possibly of some games I don't think that exists.

Furthermore, the E cores not being used for the application that requires exactly 9-12 cores will still be using the E's to off-load housekeeping or other applications that are running, thus increasing the performance of that application running on the 8P's or better chiplet on a 9950X.

8 P's running at high frequencies (kind of inefficiently) and 30 or 40 E's running in the sweet spot of the v/f graph is what I want.

AMD is already kind of doing that with the 9950X by using one higher binned compute chiplet (higher frequency) and one lower binned one (lower frequency). I realize this is an economic decision to save the better binned chiplets for other processors in the stack, but even though the 9950X is the flagship product, AMD realizes that most of the gains of a high clocked chiplet come from the first one being loaded.

All of these decisions are driven by economics coupled with real world performance metrics. It's called optimization.

People here constantly want "pie in the sky" products that defy real world economic decision and optimization.

Both AMD and Intel have "told" us the future is a high performance number of cores and a number of lower performing more area efficient and power efficient ones.

Whether it's P's and E's or C's and c's it's the same philosophy.
Then there's LPE cores. Maybe they should come up with HPP cores. Higher Performance P cores.
 

Dave3000

Golden Member
Jan 10, 2011
1,421
99
91
At this site they know, with Windows updates comparisons for both CPUs.

I noticed in that review that the X-Plane 12 chart shows the 285K with slightly better 1% lows than the 9800X3D, but the 9800X3D with a much higher average. Would this be because single-core IPC/clock speed is more important than a huge L3 cache in X-Plane 12 when it hits those 1% lows? Also, does a CPU-intensive third-party aircraft in X-Plane 12 benefit more from high single-core performance than from a large L3 cache if there is a big difference in single-core performance between two CPUs?
 

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
I noticed in that review the X-Plane 12 chart it is showing that the 285k has slightly better 1% lows than the 9800X3D but the 9800X3D has a much higher average. Would this be due to single core IPC/clock speed is more important than a huge L3 cache in that scenario in X-Plane 12 when it hits those 1% lows? Also, does a CPU intensive 3rd party aircraft in X-Plane 12 benefit more from high single-core performance than a large L3 cache if there is a big difference in single-core performance between two CPU's?
When the information has to come from main memory then it comes down to which processor has the faster main memory and compute.
 
Reactions: DavidC1

OneEng2

Senior member
Sep 19, 2022
385
590
106
Panther Lake sounds great .... at least on paper.

I do wonder about the workload in a laptop computer though. For MOST consumers, 8P along with 10 or so E cores would work just fine.

For professionals doing workstation stuff, forget the power and die size and go full in with something crazy like Threadripper.

Still, my biggest concern for Panther Lake isn't performance .... or power ..... but rather profitability. Intel keeps making the foundry out to be the big bad bogeyman and booking all their losses in that one area of the company ..... as if it would have been possible to build any of Intel's past CPUs without the foundry!

I think they are still shuffling around the shells in the financial bucket in inventive ways. They scream to the hills that TSMC is too expensive ..... and say how much BETTER profit will be when they get their chips back on 18A internally ..... but HOW? If the foundry is losing all that money, what it means TO ME is that the design area isn't being properly charged for the services they get from the Internal foundry.

I hate company politics though. My life was so much simpler when all I had to do was be an outstanding engineer.
 

511

Golden Member
Jul 12, 2024
1,495
1,332
106
They have a tax credit on investment in fabs, and by their own admission an Intel 3 wafer is more profitable for the foundry than an Intel 7 wafer, so all they need is to significantly increase the volume of Intel 3/18A wafers to reach profitability.

The foundry drives their overall gross margin. They say break-even by 2027; if they achieve it, this will be a big thing.
 

DavidC1

Golden Member
Dec 29, 2023
1,362
2,222
96
Not what I asked.

Same cores. 16+8 or 8+40?

For me 16+8 is good, 8+40 is better, much better.
The P core in Nova Lake must be a nice improvement to justify being called a "P core" against Arctic Wolf, or there's something missing where the differences are much closer than even today.

Imagine a CPU that consisted of a Prescott "P" core and a Yonah "E" core.

That's what I meant by 16+8 with 16 being Arctic Wolf. Arrow Lake shows that the differences are already so small that 8P is slower than 1P+16E in gaming, where it should amplify the differences.

Skymont being so close to Lion Cove is because the P core team is sucking so badly relative to the E team. If they were more equal it should be similar to how Gracemont and Golden Cove were. At least there, the core size differences were justified by the performance difference.

@OneEng2 Don't you mean Nova Lake, since Panther Lake is mobile only?
 
Reactions: SiliconFly

Hulk

Diamond Member
Oct 9, 1999
4,934
3,367
136
The P core in Nova Lake must be a nice improvement to justify being called a "P core" against Arctic Wolf, or there's something missing where the differences are much closer than even today.

Imagine a CPU where it consisted of a Prescott "P" core and Yonah "E" core.

That's what I meant by 16+8 with 16 being Arctic Wolf. Arrowlake shows that the differences are already so small that 8P is slower than 1P+16E in gaming where it should amplify the differences.

Skymont being so close to Lion Cove is because the P core team is so badly sucking relative to the E. If they were more equal it should be similar to how Gracemont and Golden Cove was. At least the core size differences were justified from the performance difference.

@OneEng2 Don't you mean Novalake, since Pantherlake is mobile only?
I realize that your knowledge is a lot deeper than mine here. Just want to let you know I appreciate your very informative posts and am only challenging them to learn and to further the discussion.

But as a novice enthusiast in the "CPU Wars" it seems like there would be a lot more low hanging fruit for the monts as they are newer and just a few generations ago had very poor performance relative to the P cores. I mean before Gracemont, we're really talking about very low performing Atom cores, right?

In addition, the P core has been the focus of Intel's drive to create the most performant core possible for almost 50 years.

This gets to the point I've never really understood. It seems like there is only so much instruction level parallelism that can be extracted from the code? Like walking to a wall and decreasing your distance to the wall by half each time you move. Yes, you are continually "getting there" but at a certain point you're really not making headway.

I understand with SMT you can keep increasing "single core" performance as structures get wider, but for ST I just don't see how much improvement can be had with current code?
 

DavidC1

Golden Member
Dec 29, 2023
1,362
2,222
96
I realize that your knowledge is a lot deeper than mine here. Just want to let you know I appreciate your very informative posts and am only challenging to learn and further discussion.

But as a novice enthusiast in the "CPU Wars" it seems like there would be a lot more low hanging fruit for the monts as they are newer and just a few generations ago had very poor performance relative to the P cores. I mean before Gracemont, we're really talking about very low performing Atom cores, right?
It's simply a human factor, nothing else. Remember the Core uarch revival? Well, they basically got gutted during the Krzanich days. Mooly Eden and Dadi Perlmutter both left or got kicked out. And we don't know how many lower-level guys left either.

But I have a feeling even those much praised guys weren't the best either. So I'm saying the P core team got even worse.

You read about things on reddit and it's shocking. Like one said Microsoft made a CPU team that was mostly based on former Intel engineers. So much so that the code names they used were identical to the one they used back at Intel. Like the whole division left. Or how Exist keeps saying the entirety of Sapphire Rapids validation team got canned. Who cares about tech when there's no-one developing them?

I think Atom's saving grace during the Krzanich days was that they were pushed to be as efficient in both power and cost as possible. So they were forced to innovate or die, while the rest of the company was involved in the petty politics that plagued the company for decades.
This gets to the point I've never really understood. It seems like there is only so much instruction level parallelism that can be extracted from the code? Like walking to a wall and decreasing your distance to the wall by half each time you move. Yes, you are continually "getting there" but at a certain point you're really not making headway.
Yes, but things can always be done better. And we know the floor is higher, much higher as demonstrated by the ARM camp.

You say there's a limit right? Well some thought the original Athlon was a marvel and not much more could be done. If you look at the development of Moore's Law, it was around 0.5um days when they thought the end was nigh. Few years ago they pointed out how the pads used in modern processors were bigger than the transistors themselves.

It's clear from the DeepSeek vs OpenAI news that innovation is what keeps things going and it comes from none other than a single source - people working on it. It allows you to beat conventional beliefs and entrenched paradigms. It allows you to "circumvent" laws like the inverse square law.

Even the internal combustion engine is still getting better and better, even though you'd think it's just simply burning a fuel source. There's always room to fine-tune things, it seems.

Patent trolls, buying skeletal companies, lawsuits over new inventions, it's all silly nonsense. It's all about people. The AI folks are deluded into thinking that they'll replace the brain and even surpass it one day (peak delusion by guys like Ray Kurzweil), when they don't even understand 1% of 1% of how our brain really works.
 
Reactions: pcp7 and Hulk

DavidC1

Golden Member
Dec 29, 2023
1,362
2,222
96
But as a novice enthusiast in the "CPU Wars" it seems like there would be a lot more low hanging fruit for the monts as they are newer and just a few generations ago had very poor performance relative to the P cores. I mean before Gracemont, we're really talking about very low performing Atom cores, right?
It wasn't difficult to see they were playing at a different level ever since Goldmont. You could see from the performance improvement relative to die area and power increase. At the same process the Goldmont Plus offered 30% improvement per clock, clocked 10% higher, while consuming the same amount of power.
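As a quick check on what those figures combine to, here is the arithmetic using only the numbers quoted above (30% per clock, 10% higher clock, same power). The variable names are just for illustration.

```python
ipc_gain = 0.30    # per-clock improvement quoted above
clock_gain = 0.10  # clock speed increase quoted above

# Performance scales with (work per clock) * (clocks per second).
perf_ratio = (1 + ipc_gain) * (1 + clock_gain)

# The post says power stayed the same, so perf/W improves by the same factor.
power_ratio = 1.0
perf_per_watt_ratio = perf_ratio / power_ratio

print(f"performance: {perf_ratio:.2f}x, perf/W: {perf_per_watt_ratio:.2f}x")
```

That works out to about 1.43x the performance, and the same gain in performance per watt, on an unchanged process.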

And you could see from the architectural innovations that they were far more capable than the P core team, that they were willing to take risks on new ideas, but capable enough to execute on them to make it work.

Simply go and compare the new uarch features of both lines. For Core the big things were Pentium M, Core 2, and Sandy Bridge. Everything since then and in between was very minor and/or just expansions. What is new in Haswell, Skylake, Sunny/Golden/Lion Cove? That's a span of 22 years.

Goldmont brought new ideas, and so did Tremont, Gracemont, and Skymont. Things like clustered decode address a long-standing x86 bottleneck, the decoders, which is even more significant than the PM, C2, and SNB changes.
This gets to the point I've never really understood. It seems like there is only so much instruction level parallelism that can be extracted from the code? Like walking to a wall and decreasing your distance to the wall by half each time you move. Yes, you are continually "getting there" but at a certain point you're really not making headway.
Yet the wider processors are faster than the narrower ones. Engineers at Google have said that it can be wider, and Jim Keller said this too. Keller made a comment about processors with "Kilo-size ROBs". You tweak things here, and you tweak things there, and it all adds up. A bunch of 0.5% improvements add up to a cumulative 10, 20, or even 30%.
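To put numbers on how small gains accumulate, here is a short sketch that counts how many 0.5% tweaks it takes to reach each of those totals. It assumes the gains are independent and multiply together, which real design changes need not do; `steps_needed` is a hypothetical helper name.

```python
import math


def steps_needed(step_gain: float, target_gain: float) -> int:
    """Number of compounding gains of `step_gain` needed to reach `target_gain`.

    Gains are fractions, e.g. 0.005 for 0.5%. Assumes the gains multiply
    and are independent, which real design tweaks need not be.
    """
    return math.ceil(math.log(1 + target_gain) / math.log(1 + step_gain))


for target in (0.10, 0.20, 0.30):
    n = steps_needed(0.005, target)
    total = (1 + 0.005) ** n - 1
    print(f"{target:.0%} needs {n} tweaks of 0.5% (compounds to {total:.1%})")
```

Under that assumption it takes 20, 37, and 53 such tweaks to reach 10%, 20%, and 30% respectively.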
 
Reactions: Hulk

OneEng2

Senior member
Sep 19, 2022
385
590
106
@OneEng2 Don't you mean Novalake, since Pantherlake is mobile only?
Well, I meant mobile, as Nova Lake is for desktop, correct? I was thinking that MOST laptop applications don't require heavy MT performance and are more influenced by battery life and ST performance. Applications that need lots of power are more likely candidates for high end desktop or even workstation setups IMO.
I understand with SMT you can keep increasing "single core" performance as structures get wider, but for ST I just don't see how much improvement can be had with current code?
I think you get diminishing returns as you double the SMT. AMD currently enjoys a 1.4 uplift using SMT 2 way .... which is really great value for the added transistors needed (~10% as I understand it). Moving to 4 way would require a great deal more execution units and scheduling .... and I think you don't get nearly as good of scaling (someone correct me if I am off base here). I don't think it is so much code dependent as it is how many threads are running since SMT is done in hardware and scheduled by the OS.
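A back-of-the-envelope version of that value argument, using only the figures quoted above (a 1.4x uplift for roughly 10% more transistors). The second-core comparison is an idealized assumption added for contrast, not a measured number.

```python
smt_throughput = 1.40   # 2-way SMT throughput vs. one thread, as quoted above
smt_area = 1.10         # ~10% extra transistors, as quoted above

# Throughput gained per unit of transistor budget spent on SMT.
throughput_per_area = smt_throughput / smt_area

# For comparison: a second full core, under the idealized (hypothetical)
# assumption of perfect 2x scaling for 2x the transistors.
second_core_per_area = 2.0 / 2.0

print(f"SMT2:        {throughput_per_area:.2f}x throughput per transistor")
print(f"Second core: {second_core_per_area:.2f}x throughput per transistor")
```

On those figures 2-way SMT returns about 1.27x throughput per transistor spent, which is why it looks like good value; the open question in the post is how far that ratio falls at 4-way.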
I think Atom's saving grace during the Krzanich days was that they were pushed to be as efficient in both power and cost as possible. So they were forced to innovate or die, while the rest of the company was involved in the petty politics that plagued the company for decades.
One could argue that this is the path that a good strategist would take in the current environment where die shrinks are becoming further and further apart ..... and you get less and less from each one. Seems like Intel still believes that they can fund this insanity forever when in fact, this will all come down to efficiency as you mentioned. It isn't just about who can make the fastest processor, it is about who can make the fastest processor and make money doing it!
 

DavidC1

Golden Member
Dec 29, 2023
1,362
2,222
96
Seems like Intel still believes that they can fund this insanity forever when in fact, this will all come down to efficiency as you mentioned. It isn't just about who can make the fastest processor, it is about who can make the fastest processor and make money doing it!
They are suffering financially in a market that is trending up. If they do not reverse this quickly, when the downtrend happens, they will be gone.
 
Reactions: 511