Discussion Intel Meteor, Arrow, Lunar & Panther Lakes Discussion Threads

Tigerick · Aug 22, 2022

With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the next-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.

MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.

ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.

Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

Model	Code-Name	Date	TDP	Node	Tiles	Main Tile	CPU	LP E-Core	LLC	GPU	Xe-cores
Core Ultra 100U	Meteor Lake	Q4 2023	15 - 57 W	Intel 4 + N5 + N6	4	tCPU	2P + 8E	2	12 MB	Intel Graphics	4
?	Lunar Lake	Q4 2024	17 - 30 W	N3B + N6	2	CPU + GPU & IMC	4P + 4E	0	12 MB	Arc	8
?	Panther Lake	Q1 2026 ?	?	Intel 18A + N3E	3	CPU + MC	4P + 8E	4	?	Arc	12

Comparison of the die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

	Meteor Lake	Arrow Lake (N3B)	Lunar Lake	Panther Lake
Platform	Mobile H/U Only	Desktop & Mobile H&HX	Mobile U Only	Mobile H
Process Node	Intel 4	TSMC N3B	TSMC N3B	Intel 18A
Date	Q4 2023	Desktop-Q4-2024 H&HX-Q1-2025	Q4 2024	Q1 2026 ?
Full Die	6P + 8E	8P + 16E	4P + 4E	4P + 8E
LLC	24 MB	36 MB ?	12 MB	?
tCPU	66.48
tGPU	44.45
SoC	96.77
IOE	44.45
Total	252.15
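
As a quick sanity check on the Meteor Lake column, the four tile figures can be summed and compared against the listed total. This is only a sketch: the table does not state units (mm² is assumed here), and the names `mtl_tiles`, `total` and `shares` are made up for illustration.

```python
# Tile areas for Meteor Lake as listed in the table above.
# Units are not stated in the table; mm^2 is an assumption.
mtl_tiles = {
    "tCPU": 66.48,
    "tGPU": 44.45,
    "SoC": 96.77,
    "IOE": 44.45,
}

# Sum of all tiles, rounded to the same precision the table uses.
total = round(sum(mtl_tiles.values()), 2)

# Share of the summed tile area taken by each tile, in percent.
shares = {name: round(100 * area / total, 1) for name, area in mtl_tiles.items()}

print(f"Total: {total}")
for name, pct in shares.items():
    print(f"{name}: {pct}% of summed tile area")
```

The sum only covers the four listed tiles; it says nothing about the base (Foveros) tile or the package size.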

Intel Core Ultra 100 - Meteor Lake

As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)

poke01 · Jun 12, 2024

AMDK11 said:
"Intel also switched from using proprietary design tools to industry-standard tools optimized for its use. Intel’s old architectures were designed with “Fubs” (functional blocks) of tens of thousands of cells consisting of manually drawn circuits, but it has now moved to using big, synthesized partitions of hundreds of thousands to millions of cells. The removal of the artificial boundaries improves design time, increases utilization, and reduces area.

This also allowed for the addition of more configuration knobs into the design to spin off customized SoC-specific designs faster, with the lead architect saying this allows for more customization between the cores used for Lunar Lake and Arrow Lake. This design methodology also makes 99% of the design transferable to other process nodes, a key advance that prevents the stumbles we’ve seen in the past where Intel’s new architectures were delayed by massive process node delays (as with 10nm, for instance)."

It appears that Jim Keller contributed to the Lion Cove project.

EDIT:
"Intel says it widened the prediction block by 8X over the previous architecture while maintaining accuracy. Intel also tripled the request bandwidth from the instruction cache to the L2 and doubled the instruction fetch bandwidth from 64 to 128 bytes per second. Additionally, decode bandwidth was bumped up from 6 to 8 instructions per cycle while the micro-op cache was increased along with its read bandwidth. The micro-op queue was also increased from 144 entries to 192 entries."
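
To put the quoted front-end changes on a common scale, here is a small sketch that converts each before/after pair into a relative increase. The numbers are taken from the quote above; the helper name `relative_gain` and the dictionary layout are purely illustrative, and the fetch figure is treated as a plain byte count because only the ratio matters here.

```python
# Before/after figures as quoted above (previous architecture -> Lion Cove).
changes = {
    "instruction fetch bandwidth (bytes)": (64, 128),
    "decode width (instructions per cycle)": (6, 8),
    "micro-op queue (entries)": (144, 192),
}

def relative_gain(before, after):
    """Return the fractional increase going from `before` to `after`."""
    return (after - before) / before

for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after} (+{relative_gain(before, after):.0%})")
```

A wider structure does not translate one-to-one into IPC, so these ratios describe resources, not performance.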

EDIT:
It appears that Lion Cove is a completely redesigned core, built from the ground up with a new approach.

This is Intel basically catching up to ARM and co. Nice to see it.

lightisgood · Jun 13, 2024

Rumor has it that ARL-H is fabbed on Intel 20A.
However, that seems to be wrong.

Intel Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" Laptop CPUs Leak Out

Intel's next-generation Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" CPUs for laptops have been spotted in multiple leaks.

wccftech.com

FlameTail · Jun 13, 2024

Intel Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" Laptop CPUs Leak Out

Intel's next-generation Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" CPUs for laptops have been spotted in multiple leaks.

wccftech.com

Furthermore, it looks like Intel is going to incorporate LPDDR5X on-package memory again with Panther Lake-U low-power SKUs for thin and light platforms. Intel has already confirmed that its Panther Lake lineup will scale up what Lunar Lake had to offer and will offer more flexible DRAM configurations so you won't be limited to just 16 GB or 32 GB LPDDR5X capacities. A more recent Panther Lake leak also confirmed the "H" SKUs with 12 Xe cores based on the Celestial graphics IP.

Panther Lake will continue to use on-package memory.

poke01 · Jun 13, 2024

FlameTail said:
Intel Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" Laptop CPUs Leak Out

Intel's next-generation Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" CPUs for laptops have been spotted in multiple leaks.

wccftech.com

Panther Lake will continue to use on-package memory.

Great. I hope to see 64GB and 128GB.

FlameTail · Jun 13, 2024

The leak says LPDDR5X, not LPDDR6 for PTL.

poke01 · Jun 13, 2024

FlameTail said:
The leak says LPDDR5X, not LPDDR6 for PTL.

LPDDR6 won't arrive till Nova Lake

Henry swagger · Jun 13, 2024

lightisgood said:
Rumor has it that ARL-H is fabbed on Intel 20A.
However, that seems to be wrong.

Intel Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" Laptop CPUs Leak Out

Intel's next-generation Arrow Lake "Core Ultra 200" & Panther Lake "Core Ultra 300" CPUs for laptops have been spotted in multiple leaks.

wccftech.com

Maybe 8+16 is 20A and mobile is TSMC N3B only? 🤔

lightisgood · Jun 13, 2024

Henry swagger said:
Maybe 8+16 is 20A and mobile is TSMC N3B only? 🤔

I guess it is 2+8.
If Intel manages to manufacture ARL-S, with its large compute tile, on 20A, that would clearly be superb.
Of course Intel's desktop line needs an advanced process node to replace RPL-R; however, N3B is enough (cf. Zen 5) and probably provides higher wafer capacity.

dttprofessor · Jun 13, 2024

lightisgood said:
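I guess it is 2+8.
If Intel manages to manufacture ARL-S, with its large compute tile, on 20A, that would clearly be superb.
Of course Intel's desktop line needs an advanced process node to replace RPL-R; however, N3B is enough (cf. Zen 5) and probably provides higher wafer capacity.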

It's said that 20A is for the K SKUs.

DavidC1 · Jun 13, 2024

lightisgood said:
I guess it is 2+8.
If Intel manages to manufacture ARL-S, with its large compute tile, on 20A, that would clearly be superb.

From the Sierra Forest review, we can see that the Intel 3 process is good.

While I believe a lot of Meteor Lake's problems are due to it being delayed, they may also be due to Intel 4 being a pipecleaner version of Intel 3.

In this case, we should not expect anything fancy from 20A. The good one will be 18A, which may be the reason why parts using 20A are limited.

Wolverine2349 · Jun 13, 2024

DavidC1 said:
From the Sierra Forest review, we can see that the Intel 3 process is good.

While I believe a lot of Meteor Lake's problems are due to it being delayed, they may also be due to Intel 4 being a pipecleaner version of Intel 3.

In this case, we should not expect anything fancy from 20A. The good one will be 18A, which may be the reason why parts using 20A are limited.

Why don't they just make everything on Intel 18A or the Intel 3 process for Arrow Lake if it is the really good one? If they are doing Sierra Forest with it, why not Arrow Lake?

Hulk · Jun 13, 2024

DavidC1 said:
From the Sierra Forest review, we can see that the Intel 3 process is good.

While I believe a lot of Meteor Lake's problems are due to it being delayed, they may also be due to Intel 4 being a pipecleaner version of Intel 3.

In this case, we should not expect anything fancy from 20A. The good one will be 18A, which may be the reason why parts using 20A are limited.

What Intel node do you think will be comparable to N3B? I realize this is kind of a wild guess since actual data is virtually non existent.

DavidC1 · Jun 13, 2024

Wolverine2349 said:
Why don't they just make everything on Intel 18A or the Intel 3 process for Arrow Lake if it is the really good one? If they are doing Sierra Forest with it, why not Arrow Lake?

Because, like I said, Intel 4 and 20A are "pipecleaner" processes. You need real-world data and experience before you can move on to the next one, because the next process is undoubtedly more complex in every way, so the experience from the previous generation is pretty much a requirement.

This is cutting-edge work, where there is little to no data about what is needed. So the engineers themselves are learning as they go along. Obviously you can't skip this process (pun not intended).

DavidC1 · Jun 13, 2024

Hulk said:
What Intel node do you think will be comparable to N3B? I realize this is kind of a wild guess since actual data is virtually non existent.

Intel is typically known for making processes that are a generation or more ahead in transistor performance, but half a generation behind in density.

Intel 3 is likely to beat even N2 on performance but be N4/N5 level for density.

Wolverine2349 · Jun 13, 2024

DavidC1 said:
Because, like I said, Intel 4 and 20A are "pipecleaner" processes. You need real-world data and experience before you can move on to the next one, because the next process is undoubtedly more complex in every way, so the experience from the previous generation is pretty much a requirement.

This is cutting-edge work, where there is little to no data about what is needed. So the engineers themselves are learning as they go along. Obviously you can't skip this process (pun not intended).

How will Intel 20A compare to the TSMC process they are going to put the Core Ultra 275 and 285, or a hypothetical rumored Core Ultra 295 with 12 Lion Cove P-cores, on?

Which is the better process node if yields were not a thing?

lightisgood · Jun 13, 2024

DavidC1 said:
Intel is typically known for making processes that are a generation or more ahead in transistor performance, but half a generation behind in density.

Intel 3 is likely to beat even N2 on performance but be N4/N5 level for density.

We should pay attention to the decline of N3E's density in comparison to original N3...

DavidC1 · Jun 13, 2024

Wolverine2349 said:
How will Intel 20A compare to the TSMC process they are going to put the Core Ultra 275 and 285, or a hypothetical rumored Core Ultra 295 with 12 Lion Cove P-cores, on?

Which is the better process node if yields were not a thing?

20A might also have limited libraries, just like Intel 4, and thus not be suited for all chips.

Look to Intel 10nm as an example. Cannon Lake existed, but with horrible performance figures, and they couldn't activate the iGPU. They said the problem was not defect density so much as parametric yields. You could have all functionality working, but that doesn't matter if it doesn't clock high enough, for example. But they still needed to get it out. Once it was out, we got Ice Lake very quickly, a much improved chip, even against 14nm.

ShimmerBlade · Jun 13, 2024

Hulk said:
What Intel node do you think will be comparable to N3B? I realize this is kind of a wild guess since actual data is virtually non existent.

An Intel engineer weighed in on a similar question under High Yield's YouTube video - "Why Next Generation Chips Separate Data and Power". Quote:
20A is a 3nm competitor, and 18A is an improved version, so it can be called a 2nm or 3nm+ competitor. However, I don't control how they sell this stuff and that's all it really is.

Hulk · Jun 13, 2024

So I'm examining/transcribing the presentation about Skymont from the lead architect and came across this interesting paragraph on L1-to-L1 cache transfers within a Skymont cluster. I think it's very interesting and wanted to share it here for further discussion/clarification. I have reached a new level of geekdom in that I recently started transcribing these architecture presentations in my downtime, and it's actually enjoyable. Somehow it clears my head of the day-to-day stuff that clutters it. Anyway...

"The other thing was, this is kind of fun, some of you out there, tech press, you benchmark our cores, you run micros, and you notice things, and some of you noticed something. When we have data where multiple cores in the same module want to use it at the same time, and specifically, when one core has modified data, and it's still in the first level cache, and another core wants to access it, we do a funny thing.

We don't say, here's the data, we can execute it, and this is Gracemont and Crestmont. What we do is we pretend like it misses the L2, we send it to the fabric, the fabric comes back and asks us for the data, we provide the data to the fabric, the fabric gives this back to us. So, suddenly people were surprised, hey, the data is near and the latency is high. In fact, the latency is a little bit longer than a normal cache hit and that's because of this sort of roundtrip behavior.

So it was nice of you guys to notice that, but the good news here is we went ahead and fixed it. In Skymont we have what we call L1 to L1 transfers. What this means is that when one core asks the L2 for the data, we see that it is resident in another core, and we don't go to the fabric anymore, the L2 goes and says, hey please give me the data, it grabs the data, provides it to the core locally, the fabric isn't involved anymore. This is more reliable performance for the cases where people have really tight pipelines and they're sharing data within a module in local time and space. So that's a more reliable latency for cooperative workloads."
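
To make the two paths from the transcript concrete, here is a toy latency model. It is only a sketch of the sequence of steps as described by the architect: every cycle count below is a hypothetical placeholder (Intel gave no numbers in this passage), and the function and constant names are invented for illustration.

```python
# Toy model of the two data paths described in the transcript.
# All cycle counts are hypothetical placeholders, not measured values.
L1_LOOKUP = 4     # requesting core checks its own L1
L2_LOOKUP = 15    # lookup in the module's shared L2
FABRIC_HOP = 40   # one trip between the module and the fabric
L1_SNOOP = 10     # pulling the modified line out of the sibling core's L1

def gracemont_style_latency():
    """Pre-Skymont path (Gracemont/Crestmont) for a line that is modified
    in a sibling core's L1: the request is treated like an L2 miss and
    makes a full round trip through the fabric."""
    return (
        L1_LOOKUP
        + L2_LOOKUP
        + FABRIC_HOP   # request is sent to the fabric
        + FABRIC_HOP   # fabric comes back and asks the module for the data
        + L1_SNOOP     # module reads the line from the owning core's L1
        + FABRIC_HOP   # module provides the data to the fabric
        + FABRIC_HOP   # fabric hands the data back to the requester
    )

def skymont_style_latency():
    """Skymont L1-to-L1 transfer: the L2 sees the line is resident in a
    sibling core, grabs it directly, and the fabric is not involved."""
    return L1_LOOKUP + L2_LOOKUP + L1_SNOOP

print("pre-Skymont path:", gracemont_style_latency(), "cycles (hypothetical)")
print("Skymont path:    ", skymont_style_latency(), "cycles (hypothetical)")
```

Whatever the real numbers are, the structural point is that the older path pays for four fabric traversals that the Skymont path avoids, which is why a line that is physically close could still show high latency in microbenchmarks.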

H433x0n · Jun 13, 2024

Hulk said:
What Intel node do you think will be comparable to N3B? I realize this is kind of a wild guess since actual data is virtually non existent.

Intel 3 should be pretty close in perf/watt based on the results from SRF. The amount of compute provided at <=250W would suggest it is a true “3nm” tier node. It’s not going to be anywhere as dense since it doesn’t provide a 2-1 or 2-2 fin configuration. If you compare 3-3 fin cells, it’s got competitive density but that’s only half of the portfolio a foundry should be offering.

All reports suggest that the die size for N3B & 20A 6+8 compute tiles are of similar dimensions. So I don’t think it’s a stretch to say 18A will be functionally as dense as N3P while offering better perf/watt for HPC applications.

Hulk · Jun 13, 2024

Here is another interesting tidbit from the E core presentation.

When comparing power and performance between Skymont on the ring and Raptor Cove he says "Raptor Cove on the Raptor process" and "Skymont on the Lunar Lake process." Now, all of the Skymont vs. Raptor comparisons are on the ring as indicated by the note on the slides.

"On the Raptor process" obviously means Intel 7, and I would assume "on the Lunar Lake process" implies TSMC N3B. This seems to imply that they have been testing Skymont on the ring on the N3B process. How could they have such parts unless the ARL compute tile is being manufactured on the N3B process?

"Again this is the same workload that I've shown you for all of these this is GCC spec 17 estimated compiled O2 out of the box. 2% higher on INT, 2% higher on FP. Now, before I said when comparing against E core that it was all positive there were no negatives, here it's a trade-off. Here's an S curve, you can see there are a few below the line, you can see there are a few above the line, so I don't mean to tell you that all workloads are 2% faster in IPC on this IP. It's a little bit of trade, but fundamentally the geometric mean SPEC INT and SPEC FP is 2% so let's map that to power and performance. This is the full peak power and performance curve between the two. This is Skymont in the Lunar Lake process, that's Raptor Cove in Raptor Cove process you can see the peak performance is higher, again you can scale to 6+ GHz, so let's zoom in to power envelopes that are more likely for an E core in a low power island.

(Skymont consumes) 0.66x the power (of Raptor Cove) at the same performance level in that middle of the curve, or 20% higher performance at the same power level. This is what you're getting out of an E core, this is what you're getting out of Skymont, this is what we think is key to driving hybrid PC efficiency, to providing long battery life, and providing a great user experience for Lunar Lake."
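
The two quoted figures describe two different operating points on the curve, so they give two different perf/watt ratios. A quick sketch of the arithmetic, using only the 0.66 and 20% numbers from the transcript (the constant and function names are made up for illustration):

```python
# Figures quoted in the transcript: Skymont relative to Raptor Cove,
# in the middle of the power/performance curve.
POWER_RATIO_AT_ISO_PERF = 0.66  # Skymont power / Raptor Cove power, same performance
PERF_RATIO_AT_ISO_POWER = 1.20  # Skymont perf / Raptor Cove perf, same power

def perf_per_watt_gain(perf_ratio, power_ratio):
    """Relative perf/watt of Skymont versus Raptor Cove at one operating point."""
    return perf_ratio / power_ratio

# Same performance, less power.
iso_perf_gain = perf_per_watt_gain(1.0, POWER_RATIO_AT_ISO_PERF)

# Same power, more performance.
iso_power_gain = perf_per_watt_gain(PERF_RATIO_AT_ISO_POWER, 1.0)

print(f"perf/watt at iso-performance: {iso_perf_gain:.2f}x")
print(f"perf/watt at iso-power:       {iso_power_gain:.2f}x")
```

The iso-performance ratio comes out larger than the iso-power one, which is what you would expect if power rises faster than performance along the curve. Keep in mind these are Intel's own estimates on different process nodes, not third-party measurements.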

ondma · Jun 14, 2024

Hulk said:
Here is another interesting tidbit from the E core presentation.

When comparing power and performance between Skymont on the ring and Raptor Cove he says "Raptor Cove on the Raptor process" and "Skymont on the Lunar Lake process." Now, all of the Skymont vs. Raptor comparisons are on the ring as indicated by the note on the slides.

"On the Raptor process" obviously means Intel 7, and I would assume "on the Lunar Lake process" implies TSMC N3B. This seems to imply that they have been testing Skymont on the ring on the N3B process. How could they have such parts unless the ARL compute tile is being manufactured on the N3B process?

"Again this is the same workload that I've shown you for all of these this is GCC spec 17 estimated compiled O2 out of the box. 2% higher on INT, 2% higher on FP. Now, before I said when comparing against E core that it was all positive there were no negatives, here it's a trade-off. Here's an S curve, you can see there are a few below the line, you can see there are a few above the line, so I don't mean to tell you that all workloads are 2% faster in IPC on this IP. It's a little bit of trade, but fundamentally the geometric mean SPEC INT and SPEC FP is 2% so let's map that to power and performance. This is the full peak power and performance curve between the two. This is Skymont in the Lunar Lake process, that's Raptor Cove in Raptor Cove process you can see the peak performance is higher, again you can scale to 6+ GHz, so let's zoom in to power envelopes that are more likely for an E core in a low power island.

(Skymont consumes) 0.66x the power (of Raptor Cove) at the same performance level in that middle of the curve, or 20% higher performance at the same power level. This is what you're getting out of an E core, this is what you're getting out of Skymont, this is what we think is key to driving hybrid PC efficiency, to providing long battery life, and providing a great user experience for Lunar Lake."

That is great and all, but just give me more/better P cores and a great desktop gaming chip. Please Intel???? All the hype about Lunar Lake makes me very apprehensive that ARL is going to be mediocre.

TESKATLIPOKA · Jun 14, 2024

ondma said:
That is great and all, but just give me more/better P cores and a great desktop gaming chip. Please Intel???? All the hype about Lunar Lake makes me very apprehensive that ARL is going to be mediocre.

Mediocre in what?
MT looks like it will be better if you can use every core; single-thread looks similar. For gaming, Zen 5 with 3D cache.

BTW, which game can use more than 8 cores?

TwistedAndy · Jun 14, 2024

ondma said:
That is great and all, but just give me more/better P cores and a great desktop gaming chip. Please Intel???? All the hype about Lunar Lake makes me very apprehensive that ARL is going to be mediocre.

It looks like Skymont's successor will be the next P-core.

yuri69 · Jun 14, 2024

TwistedAndy said:
It looks like Skymont's successor will be the next P-core.

The hype is cool and such, but we haven't seen reliable 3rd party benchmarks. E-cores cut corners by scaling down the FP side so replacing P-cores is not really likely.
