With Hot Chips 34 starting this week, Intel will unveil technical details of the upcoming Meteor Lake (MTL) and Arrow Lake (ARL), the next-generation platforms after Raptor Lake. Both MTL and ARL represent a new direction in which Intel moves to multiple chiplets combined into one SoC platform.
MTL also introduces a new compute tile built on the Intel 4 process, which uses EUV lithography, a first for Intel. Intel expects to ship the MTL mobile SoC in 2023.
ARL will come after MTL, so Intel should be shipping it in 2024, according to Intel's roadmap. The ARL compute tile will be manufactured on the Intel 20A process, Intel's first to use GAA transistors, which it calls RibbonFET.
Comparison of Intel's upcoming U-series CPUs: Core Ultra 100U, Lunar Lake and Panther Lake

| Model | Code-Name | Date | TDP | Node | Tiles | Main Tile | CPU | LP E-Core | LLC | GPU | Xe-cores |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Core Ultra 100U | Meteor Lake | Q4 2023 | 15 - 57 W | Intel 4 + N5 + N6 | 4 | tCPU | 2P + 8E | 2 | 12 MB | Intel Graphics | 4 |
| ? | Lunar Lake | Q4 2024 | 17 - 30 W | N3B + N6 | 2 | CPU + GPU & IMC | 4P + 4E | 0 | 12 MB | Arc | 8 |
| ? | Panther Lake | Q1 2026 ? | ? | Intel 18A + N3E | 3 | CPU + MC | 4P + 8E | 4 | ? | Arc | 12 |
Comparison of die size of each tile of Meteor Lake, Arrow Lake, Lunar Lake and Panther Lake

| | Meteor Lake | Arrow Lake (N3B) | Lunar Lake | Panther Lake |
| --- | --- | --- | --- | --- |
| Platform | Mobile H/U Only | Desktop & Mobile H&HX | Mobile U Only | Mobile H |
| Process Node | Intel 4 | TSMC N3B | TSMC N3B | Intel 18A |
| Date | Q4 2023 | Desktop: Q4 2024; H&HX: Q1 2025 | Q4 2024 | Q1 2026 ? |
| Full Die | 6P + 8E | 8P + 16E | 4P + 4E | 4P + 8E |
| LLC | 24 MB | 36 MB ? | 12 MB | ? |
| tCPU (mm²) | 66.48 | | | |
| tGPU (mm²) | 44.45 | | | |
| SoC (mm²) | 96.77 | | | |
| IOE (mm²) | 44.45 | | | |
| Total (mm²) | 252.15 | | | |
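As a quick sanity check on the tile figures listed above, the four per-tile values can be summed and compared against the listed total. This is only a sketch: the unit (mm²) and the variable names are assumptions, not something the original post states.

```python
# Per-tile die sizes as listed above (unit assumed to be mm^2).
tile_area_mm2 = {"tCPU": 66.48, "tGPU": 44.45, "SoC": 96.77, "IOE": 44.45}
listed_total_mm2 = 252.15

# Sum the tiles and round to the precision used in the table.
computed_total_mm2 = round(sum(tile_area_mm2.values()), 2)

# Share of the total silicon area taken by each tile, in percent.
share_pct = {
    name: round(100 * area / computed_total_mm2, 1)
    for name, area in tile_area_mm2.items()
}

print("computed total:", computed_total_mm2, "listed total:", listed_total_mm2)
for name, pct in share_pct.items():
    print(f"{name}: {pct}% of total tile area")
```

The computed sum matches the listed total, so the "Total" row appears to be a plain sum of the four tiles and does not include the Foveros base tile.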
Intel Core Ultra 100 - Meteor Lake
As mentioned by Tom's Hardware, TSMC will manufacture the I/O, SoC, and GPU tiles. That means Intel will manufacture only the CPU and Foveros tiles. (Notably, Intel calls the I/O tile an 'I/O Expander,' hence the IOE moniker.)
Arrow Lake is kinda like the Bulldozer of memory performance 🤷‍♀️
High clock speeds/numbers in AIDA, but low results in apps that are actually latency/bandwidth bound? 🤣
I guess a fair comparison point for Arrow Lake would be Meteor Lake then, not Raptor Lake. But I can't imagine that Arrow Lake doesn't have any improvements in its tile-based design. What made Intel call off Meteor Lake's launch on desktop, but then not sufficiently improve the successor for a desktop launch either?
Skymont is good enough that they wanted to launch it, and at least outside of gaming, Lion Cove is generally better than Golden Cove, unlike Redwood Cove. Still, Arrow Lake is poorly received due to poor gaming performance and a bunch of regressions here and there, some of which seem latency-related and others which might be scheduling problems. One can only imagine how poorly Meteor Lake S would have been received as a desktop processor with Redwood Cove and possibly even worse scheduling (presumably they fixed some bugs in the last year). If they had launched it, it would have had to be OEM-only to avoid that.
If Intel doesn't have an Arrow Lake Refresh, LGA-1851 is probably a one-and-done platform, because Meteor Lake was supposed to be the first but we never got it. Really, really turns me off from even considering the platform after what AMD said about AM5 through 2027.
I presume there's little to no scheduling issue; it's more of a memory/latency issue.
If you look at the Microsoft Office and Adobe rankings from TPU, the diagnosis is easy: the MS/Adobe test suites are memory-sensitive benchmarks, and a memory/latency issue is what leads to the poor performance of Core Ultra 200 in them. Tellingly, the 285K is roughly on par with the 14600K in gaming, which is memory-sensitive as well.
The latency issue has been there since Meteor Lake, and the tile-based design is to blame. I guess there is nothing Microsoft or Intel can do here, and these issues cannot be solved by using better memory.
In which memory/latency tests do you see ARL specifically looking bad, in a way that would cause the low scores we are seeing in these memory-sensitive applications?
Actually blows my mind how poor these CPUs are. They must have done internal testing, and they somehow saw fit to release them in this state? AMD managed to tread water this year with their CPUs and STILL managed to beat Intel. That's just sad.
When I first heard there would be no HT in ARL, I was quite sure MT was going to be a problem. Also, based on 6GHz clocks, I thought ST benchmarks, which actually isolate 1 core and are able to attain that 6GHz frequency, would be a problem for ARL.
But I was under the belief that improvements in Lion Cove would mitigate the latency/memory penalty of tiles, and that the minimal decrease in all-core frequency, coupled with the fact that quite a lot of software still only uses ~8 threads, meant that ARL would do very well in test suites like MS Office. What I'm trying to communicate here is that 1-2 core Turbo clocks are generally irrelevant outside of benchmarks.
What we get is the exact opposite of what I was expecting. Intel hits it out of the park with Skymont, showing us IPC gains we haven't seen since Conroe. Any latency, tiles, or whatever are overcome and then a whole bunch more. ARL is competitive in MT and better than Raptor Lake.
ST benchmark scores are pretty good as well, which I didn't expect. But when the rubber hits the road, on actual applications, there are weak spots, and Lion Cove is currently underperforming and, I think, letting down the ARL design. I am hoping this is a fixable issue that brings Lion Cove up a bit, for Intel's sake, but as it stands right now I think it's hard to make a case for the 285K as top dog when it comes to overall performance. It's still only a day from NDA, so things could change.
To summarize my take on ARL at this time:
- Very good MT performance due primarily to Skymont
- Much improved efficiency compared to Raptor Lake but still a bit behind Zen 5
- Much improved temps, I don't see issues extracting full stock performance on air
- Confusing highs and lows on certain benchmarks (MS Office for example) that seem to indicate a bottleneck associated with Lion Cove. It may be fixable with microcode or the Thread Director, but realistically I think it will take an architectural change to address.
Intel will still get these in millions of systems and as Anand always said, "there are no bad products, only bad price points." Without Skymont this would have been a really bad showing for Intel.
Finally, I have to double-check, but I don't think Lunar Lake was doing badly in MS Office type stuff; in fact, I think it excelled.
It's (generally) Intel's Thread Director putting work on Skymont and not figuring out it should be moved to Lion Cove. Not Lion Cove's fault. Intel needs to fix this if they're keeping two different core types, regardless of what the cores are.
But did anyone verify that, or is this still an assumption based on weird results? I thought Thread Director was supposed to have an easier job with only P and E cores and no HT...
It is as unverified as blaming Lion Cove.
But there's a reason I avoided buying any heterogeneous processors for machines that will run Windows. It takes them so long to fix things.
Also with all the layoffs and project cancellations at Intel, it makes me shudder to think the next socket after Arrow Lake might be even worse relative to next-gen AMD offerings. Scary to think Alder Lake might be the high point of Intel for a very long time.
I mean, I don't expect a hypothetical 9950X 3nm edition to be much more efficient except at low TDPs. Check the chart above where the 285K outperforms the 9950X (under 150W). There, it is the IOD that is wasting power, and here Intel has an advantage due to its interconnect.
Seems like a questionable approach. In the desktop and desktop replacement laptop markets, the higher performance within the power envelope will be the most important to consumers. I am not buying into this concept that this market cares about how much electricity the computer is using. Thin and light? Sure.
More importantly, I see potential for Skymont and Lion Cove in the DC products of the future where the designs are likely to be power envelope bound ..... then your argument vector starts to make sense IMO.
TSMC themselves gave these approximate figures for their respective processes:
N3(B): +15% perf (aka clocks at iso power/transistor count) or -30% power vs. N5
N4P: +11% perf or -22% power vs. N5
So the only area where N3B is a notable uplift over N4P is transistor density.
N3E is slightly better than N3B in terms of perf/efficiency (+18% or -32% vs. N5), but loses some density.
Additionally, the technical differences between N5 class and N3 class processes could be significant enough that a quick-n-dirty shrink doesn't necessarily give you the maximum clockspeed benefit "just like that".
In any case, a 3nm Zen5 would've likely used N3E.
But 10-15% higher clocks would mean 6.3-6.6 GHz, and I highly doubt that Zen5 would've hit that so easily.
Consumer-Zen6 with the FPU scaled back to 256bit might hit that with N3P, if we're lucky.
By these TSMC figures, a die shrink alone should be 8% lower power, but you are correct, the higher clocks would be more on the order of 5%.
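To make the node arithmetic in this exchange easier to follow, here is a small sketch that takes the TSMC figures quoted above (all relative to N5) and composes them as ratios instead of subtracting percentage points. The function name, the dictionary layout, and the 5.7 GHz boost clock are assumptions for illustration; the boost clock is only inferred from the "6.3-6.6 GHz" range mentioned above.

```python
# TSMC's approximate figures as quoted above, all relative to N5.
# "perf" = clock gain at iso power, "power" = power change at iso clocks.
nodes = {
    "N3B": {"perf": 0.15, "power": -0.30},
    "N4P": {"perf": 0.11, "power": -0.22},
    "N3E": {"perf": 0.18, "power": -0.32},
}


def relative_gain(target, base):
    """Return (clock change, power change) of `target` relative to `base`."""
    clock = (1 + nodes[target]["perf"]) / (1 + nodes[base]["perf"]) - 1
    power = (1 + nodes[target]["power"]) / (1 + nodes[base]["power"]) - 1
    return clock, power


for target in ("N3B", "N3E"):
    clock, power = relative_gain(target, "N4P")
    print(f"{target} vs N4P: {clock:+.1%} clocks or {power:+.1%} power")

# Assumed current boost clock in GHz (hypothetical input, inferred from the
# 6.3-6.6 GHz range quoted above).
boost_ghz = 5.7
for uplift in (0.10, 0.15):
    print(f"+{uplift:.0%} clocks -> {boost_ghz * (1 + uplift):.2f} GHz")
```

Composed this way, N3B comes out at roughly 4% higher clocks or about 10% lower power than N4P, in the same ballpark as the 5% and 8% estimates above, which subtract the percentages directly.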
... and because you are also correct that the tools are different for the different processes (libraries are not compatible), it would not be an easy shrink. I was only posing a hypothetical.
I believe I have seen that Zen 6 is targeting N3P in mid 2025. That shows that AMD has no desire to do a "shrink only" design (which makes sense).
Extrapolating from the earlier IPC comparison and comparing clock rates to the OC results posted above:
A non-SMT Zen 5 would yield 1588 * (57/40) * 16 = 36,206
Arrow Lake would yield 1459 * (54/40) * 16 + 1702 * (58/40) * 8 = 51,257, very close to the observed score
So apparently SMT has insane yield here. Over 40%...
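The extrapolation above can be written out as a short script. This is a sketch under an assumed reading of the numbers: each per-core score is taken to be a result at a reference clock of 40 (i.e. 4.0 GHz), scaled linearly to the overclocked frequency and multiplied by the core count. The function and variable names are hypothetical, and no observed score is filled in because the post does not quote one.

```python
def scaled_score(score_at_ref, clock, ref_clock, cores):
    """Scale a per-core score linearly with clock, then multiply by core count."""
    return score_at_ref * (clock / ref_clock) * cores


# Non-SMT Zen 5 estimate: 16 cores, 5.7 GHz vs. a 4.0 GHz reference.
zen5_no_smt = scaled_score(1588, 57, 40, 16)

# Arrow Lake estimate: 16 cores at 5.4 GHz plus 8 cores at 5.8 GHz.
arrow_lake = scaled_score(1459, 54, 40, 16) + scaled_score(1702, 58, 40, 8)


def smt_yield(observed, no_smt_estimate):
    """Fractional gain of an observed (SMT-enabled) score over the estimate."""
    return observed / no_smt_estimate - 1


# Score an SMT-enabled Zen 5 would need for the yield to be exactly 40%.
score_for_40_pct = 1.40 * zen5_no_smt

print(f"Zen 5 without SMT: {zen5_no_smt:,.1f}")
print(f"Arrow Lake:        {arrow_lake:,.1f}")
print(f"Score implying 40% SMT yield: {score_for_40_pct:,.1f}")
```

The two estimates reproduce the figures in the post to within rounding. Note that linear clock scaling ignores memory and power limits, so the resulting SMT yield is best read as an upper bound.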
It is my understanding that CB24 is much more bandwidth hungry than CB23 .... which is making me think that Arrow Lake's good showing in CB24 is more about its bandwidth than its processing prowess while CB23 seems to be more processor constrained.
Definitely. As I have said, I think the party is over for us enthusiasts. The days of Conroe-level improvements are over, as the exponential cost of process equipment and the limits of physics have brought a halt to that performance progression.
I think the future big increases will be more targeted to specific tasks utilizing either specific new instruction sets, or specific cores designed for more specific workloads thus eliminating the need for new process equipment in order to make big gains.
E-core L1 latency is 3 cycles. Expect all future high-IPC designs to have that 3-cycle L1 load latency. For Lion Cove, removal of HT probably saved that one cycle, from 5 to 4, but they don't yet have the backend to take full advantage of that lower load latency. They need to redo the whole backend as single-thread optimized to really gain an advantage from removing HT.
If Lion Cove IPC is so far above Zen 5, why doesn't it show in application performance? I am a bit baffled. I do understand that there are WAY more operations in the processor than this little chart shows. Also, FWIW, XOR is most frequently used for flipping specific bits in a register, IME.
True; however, what makes it worse is that Intel essentially had a double die shrink to mostly catch up with ZEN5 which remained on the same process node (mostly).
I kind of wonder if AMD gambled on this decision as the more certain path to performance superiority over an unreleased Arrow Lake would have been to produce Zen 5 on N3E and take the increased transistor budget to enhance performance in a few key areas.