Discussion Intel current and future Lakes & Rapids thread

Page 761

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
Was that to be taken at face value or that some part of the processor would be fabbed @ TSMC? Based on what limited info he probably had at the time I'm willing to let it slide. IDK who he is or if he's been correct in the past, but I wouldn't expect someone to know intricate details about the part of a multi chip processor months in advance of others.
Raichu has been one of the better leakers. Not perfect, perhaps, but definitely a good enough track record to be worth listening to.

And he's referring to the CPU tile in particular. Certainly any 2024 products would be on Intel's roadmap as of October, so it's not implausible.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
Consolidated? I don't remember Lion Cove taping but I also don't pay attention to online discussion forums much.
This slide leaked about a year ago, and as you can see, the dates indicate it was created around early 2021.


Current rumors indicate that plans have changed, but if nothing else, it indicates that Intel did seriously plan to make a compute die on N3.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
Raichu has been one of the better leakers. Not perfect, perhaps, but definitely a good enough track record to be worth listening to.

And he's referring to the CPU tile in particular. Certainly any 2024 products would be on Intel's roadmap as of October, so it's not implausible.

Very interesting then. With the drama at Intel we need a few disgruntled employees to anonymously leak roadmaps. C'mon pissed off employees!
 
Reactions: igor_kavinski

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
This slide leaked about a year ago, and as you can see, the dates indicate it was created around early 2021.
View attachment 77119

Current rumors indicate that plans have changed, but if nothing else, it indicates that Intel did seriously plan to make a compute die on N3.
I don't doubt this, but I also don't doubt Intel had redundancy designed into their timeline schedule. I'm a little surprised Intel data sampling and release to preferred clients (prq) is so close together. Though they don't suffer the same issues as AMD when it comes to certain logic; it's more the drivers.

The RPLR should put Intel back into the lead now that we know the leaked scores for Zen 4 3d. The 7800X3D being severely delayed is either a marketing move, or it possibly performs much better than the higher-end 3D chips due to there only being 1 CCD. I'm reaching here because I'm spitting back out what I saw in IRC this morning. In any case, my point from October still stands. You can't go wrong with either Intel or AMD this generation. You potentially gain with AM5, but that remains to be seen.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
If you're talking about L1/L2/L3 then I think that would be a refresh. If you get into micro cache/reorder/branch order buffers, registers, loads/stores, and similar then I'd call it a "Tick."

An L1 cache change points to a significant architectural change. There was a point where changing L1 size would have been a surprise, so a change in L1 cache size comes with other changes. For example, Itanium 2's 0 cycle latency L1 is only possible with the architectural features it had. Nowadays, with better methodologies and design tools, it's happening more frequently.

Remember, L1 cache size only changed a handful of times.
-Pentium M: 2x16KB to 2x32KB
-Sunny Cove: 32KB to 48KB Data Cache

-Pentium 4 "Willamette": 8KB L1 Data Cache
-Pentium 4 "Prescott": 16KB L1 Data Cache

-Zen 1: 64KB 4-way I-Cache, 32KB 8-way D-Cache
-Zen 2: 2x32KB 8-way

-Bonnell (first Atom): 32KB 8-way I-Cache, 24KB 6-way D-Cache
-Tremont: 2x32KB 8-way
-Gracemont: 64KB I-Cache, 32KB D-Cache

L2 is somewhat in the middle. Ever since they moved to L3, changing L2 still needs some layout changes, though nowhere near as many as L1.
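
As a rough illustration of how the L1 configurations listed above relate size to associativity, they can be run through the usual sets = size / (ways × line size) arithmetic. This is only a sketch: the 64-byte line size is an assumption (it is not stated anywhere in the thread), and `num_sets`, `way_size_kb` and `describe` are hypothetical helper names.

```python
# Sketch: cache geometry for some of the L1 configurations listed above.
# ASSUMPTION: 64-byte cache lines (not stated in the thread).
LINE_BYTES = 64


def num_sets(size_kb, ways, line_bytes=LINE_BYTES):
    """Number of sets = total bytes / (ways * bytes per line)."""
    total = size_kb * 1024
    if total % (ways * line_bytes) != 0:
        raise ValueError("size must divide evenly into ways * line size")
    return total // (ways * line_bytes)


def way_size_kb(size_kb, ways):
    """Capacity covered by a single way, in KB."""
    return size_kb / ways


# (name, size in KB, associativity), taken from the list above.
L1_CONFIGS = [
    ("Zen 1 I-Cache", 64, 4),
    ("Zen 1 D-Cache", 32, 8),
    ("Zen 2 I-Cache", 32, 8),
    ("Bonnell I-Cache", 32, 8),
    ("Bonnell D-Cache", 24, 6),
    ("Tremont I/D-Cache", 32, 8),
]


def describe(configs=L1_CONFIGS):
    """Return (name, sets, KB per way) for each configuration."""
    return [
        (name, num_sets(kb, ways), way_size_kb(kb, ways))
        for name, kb, ways in configs
    ]


if __name__ == "__main__":
    for name, sets, way_kb in describe():
        print(f"{name}: {sets} sets, {way_kb:g} KB per way")
```

Under that line-size assumption, the 24KB 6-way and 32KB 8-way caches both work out to 64 sets with 4 KB per way, so they differ only in the number of ways, while Zen 1's 64KB 4-way I-Cache comes out at 256 sets with 16 KB per way.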
 

Hulk

Diamond Member
Oct 9, 1999
4,375
2,252
136
An L1 cache change points to a significant architectural change. There was a point where changing L1 size would have been a surprise, so a change in L1 cache size comes with other changes. For example, Itanium 2's 0 cycle latency L1 is only possible with the architectural features it had. Nowadays, with better methodologies and design tools, it's happening more frequently.

Remember, L1 cache size only changed a handful of times.
-Pentium M: 2x16KB to 2x32KB
-Sunny Cove: 32KB to 48KB Data Cache

L2 is somewhat in the middle. Ever since they moved to L3, changing L2 still needs some layout changes, though nowhere near as many as L1.

You are 100% right on this. I didn't think on it long enough before I started typing. L3 is easier to modify right? If I remember correctly this type of cache was off chip in early designs? L1 is intrinsically integrated into the CPU to achieve such low latency, and good point about L2 being in the middle as far as difficulty in making modifications.

Your point is perfectly supported by Intel ONLY increasing L2 and L3 in Raptor Cove over Golden Cove.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
You are 100% right on this. I didn't think on it long enough before I started typing. L3 is easier to modify right? If I remember correctly this type of cache was off chip in early designs? L1 is intrinsically integrated into the CPU to achieve such low latency, and good point about L2 being in the middle as far as difficulty in making modifications.

Your point is perfectly supported by Intel ONLY increasing L2 and L3 in Raptor Cove over Golden Cove.

I added AMD Zen and Intel's Mont designs too. It changed between Zen 1 and Zen 2, and has stayed the same since. Intel kept the same config from the first Atom in 2008 all the way until Tremont in 2020. Gracemont changed it again. With their main Core lineup, before Pentium M it had not changed since Pentium MMX/Pentium II. Pentium and Pentium Pro both had 2x8KB. I forgot the Pentium 4: Willamette had an 8KB L1 Data Cache, Prescott had a 16KB L1 Data Cache.

L1 cache is so critical that it's almost similar to uop cache and registers in the instruction stream. You can see when they call it a "core" it now includes L2 too, and the shapes are irregular, meaning they are doing a lot more work than they used to.

L2 was on-package with Pentium Pro, and part of the module in Pentium II and Athlon. The first part with on-die L3 was the second-generation IA64, the Itanium 2 "McKinley". McKinley and the follow-up Madison had irregular L3 caches, but otherwise L3 caches are relatively easy to change. That was one of the points of the Ring/Mesh design.
 
Reactions: lightmanek

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Zen 1: 64KB 4-way I-Cache, 32KB 8-way D-Cache
Zen 2: 2x32KB 8-way
The interesting part about the change in Zen 2 is that it appears to be more of a reorganization of existing resources, not an actual cache size change. AMD opted to halve the L1 I-$ but double the L0 µOP-$. So while that is an effective change, it's not really in line with your other examples, which are increases of resources that need fundamental changes to the cores. I agree the latter happen very, very rarely.
 

Hulk

Diamond Member
Oct 9, 1999
4,375
2,252
136
I added AMD Zen and Intel's Mont designs too. It changed between Zen 1 and Zen 2, and has stayed the same since. Intel kept the same config from the first Atom in 2008 all the way until Tremont in 2020. Gracemont changed it again. With their main Core lineup, before Pentium M it had not changed since Pentium MMX/Pentium II. Pentium and Pentium Pro both had 2x8KB. I forgot the Pentium 4: Willamette had an 8KB L1 Data Cache, Prescott had a 16KB L1 Data Cache.

L1 cache is so critical that it's almost similar to uop cache and registers in the instruction stream. You can see when they call it a "core" it now includes L2 too, and the shapes are irregular, meaning they are doing a lot more work than they used to.

L2 was on-package with Pentium Pro, and part of the module in Pentium II and Athlon. The first part with on-die L3 was the second-generation IA64, the Itanium 2 "McKinley". McKinley and the follow-up Madison had irregular L3 caches, but otherwise L3 caches are relatively easy to change. That was one of the points of the Ring/Mesh design.

Speaking of micro-op cache. I remember reading somewhere back when ADL came out that Gracemont doesn't have a micro-op cache but instead has dual decoders that can each handle 3 simple instructions with some parallel limitations I don't remember.

How hampered is Gracemont compared to Golden/Raptor Cove when it comes to decode? Or more specifically what are the trade offs of the 1 complex+5 simple decoders with micro-op cache in the Coves vs the dual decode with each being 3 wide but simple only with no micro-op cache?
Seems like Gracemont should get more bogged down with complex instructions but do better with code that is heavier with simple instructions?

And now that I think about it, why does Gracemont have 17 execution ports vs 12 on the coves? Seems front end/back end unbalanced.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
How hampered is Gracemont compared to Golden/Raptor Cove when it comes to decode?

Very little, if any. With Golden Cove they moved to all one type: 6 simple decoders. Gracemont's dual 3-wide issue does well enough that it consistently outputs 5, and that's limited by the back-end.

AFAIK Gracemont has kept the direct x86 instruction execution that was introduced with the original Bonnell Atom back in 2008. So not exactly comparable to Core.

Or more specifically what are the trade offs of the 1 complex+5 simple decoders with micro-op cache in the Coves vs the dual decode with each being 3 wide but simple only with no micro-op cache?

It is indeed a tradeoff. Gracemont's L1 cache has an On-Demand Instruction Length Decoder, which stores pre-decode information in the L1 I-Cache so that part can be skipped. Because it stores the length of the instructions, it makes them easier for the decoders to handle when they're fetched from the L1 cache.

The difference between Simple and Complex decoders has been discussed ad nauseam since the original Athlon days. Athlon, according to AMD, was said to have 3 Complex decoders (probably not as complex as what microcode can handle), while Pentium II/III could do 1 Complex + 2 Simple, which was pointed to as the reason Athlon was faster.

Banias kept the same setup and was just as fast as without it. Decoders are only one part of the equation.

You are talking about a 1+2 decoder setup scaling from Pentium Pro/Pentium II to Core Duo "Yonah".

"Simple" decoders are capable of handling all Vector instructions, so that includes SSE and AVX. That was back in the Core 2 days, and they are probably more capable now. The rest are done by microcode.

And now that I think about it, why does Gracemont have 17 execution ports vs 12 on the coves? Seems front end/back end unbalanced.

Gracemont has more ports, but they are simple and dedicated to specific tasks. Golden Cove has fewer, but each is capable of handling more.
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106
Find it hard to believe MTL-S would be cancelled (at least due to clock speed reasons) if ES-2 samples are hitting 5 GHz. Raichu thinks it would end up hitting 93% of the ST clock speeds of the 13900k, and add some ipc tweaks, better ring clock speeds, and faster memory support, I could find it easily hitting 95% of RPL gaming performance with a massive gain in efficiency. And in MT, for a 8+16 model, since the cores would be able to boost higher using the same power draw, we might see a small bump there as well. Overall seems like a side-grade from the 13900k, but Intel launched worse...
At this point it just sounds like MTL-S, whether that means only the 8+16 version or desktop altogether, got cancelled because of execution problems or because Intel didn't want to waste stock on desktop parts.
I'm still leaning towards MTL-S not launching at all from rumors, but that's just my opinion. Rumors are heavily conflicted on this.
I just think it's impressive that the clock speed on Intel 4 is able to remain nearly the same as on the heavily improved Intel 7 Ultra, which is itself just a massively improved version of the 10nm ESF node.
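
The back-of-the-envelope reasoning in this post can be sketched numerically. This assumes single-thread performance scales simply as IPC × clock, which is a simplification that ignores memory latency and ring clocks; the 93% clock figure and the 95% target are the rumored and guessed numbers from the post, and the function names are hypothetical.

```python
# Sketch: how much IPC gain offsets a clock deficit, assuming
# performance ~ IPC * clock (a simplification, not a measured model).


def relative_perf(clock_ratio, ipc_ratio):
    """Performance relative to the baseline part (1.0 = parity)."""
    return clock_ratio * ipc_ratio


def ipc_gain_for_target(clock_ratio, target_perf=1.0):
    """Fractional IPC gain needed to reach target_perf at clock_ratio."""
    return target_perf / clock_ratio - 1.0


if __name__ == "__main__":
    clock = 0.93  # rumored: 93% of the 13900K's ST clock
    print(f"IPC gain for parity: {ipc_gain_for_target(clock):.1%}")      # 7.5%
    print(f"IPC gain for 95%:   {ipc_gain_for_target(clock, 0.95):.1%}")  # 2.2%
```

Under those assumptions, roughly a 2% IPC gain would be enough to land at 95% of the 13900K in ST, while matching it outright would take about 7.5%.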
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
I don't doubt this, but I also don't doubt Intel had redundancy designed into their timeline schedule. I'm a little surprised Intel data sampling and release to preferred clients (prq) is so close together.
Which specific milestones are you referring to? Because they should be sampling to customers starting with ES1.
The RPLR should put Intel back into the lead now that we know the leaked scores for Zen 4 3d.
Eh, doubt it. Would need a couple hundred MHz more just to budge the needle, and I'm skeptical they can pull that off.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,101
136
Find it hard to believe MTL-S would be cancelled (at least due to clock speed reasons) if ES-2 samples are hitting 5 GHz. Raichu thinks it would end up hitting 93% of the ST clock speeds of the 13900k, and add some ipc tweaks, better ring clock speeds, and faster memory support, I could find it easily hitting 95% of RPL gaming performance with a massive gain in efficiency. And in MT, for a 8+16 model, since the cores would be able to boost higher using the same power draw, we might see a small bump there as well. Overall seems like a side-grade from the 13900k, but Intel launched worse...
Would a presumably more expensive 13900k side-grade really make for a compelling product though? Sounds like Rocket Lake, in some ways. And I'd be skeptical of Meteor Lake's SoC changes from a gaming standpoint. Will be very curious to see what memory latency ends up being.
 

Glo.

Diamond Member
Apr 25, 2015
5,763
4,667
136
If this tweet is correct, MTL-S may have 6+8, 6+16, and 8 core variants:


Interesting, if true. We will see.

Also, depending on IPC increases, a 6+16 part could be really competitive.
Nothing new, apart from PCIe configs.

It was rumored that this year, all we will see is RPL-R, and MTL-S will land on desktop but at the same time as Arrow Lake-S. And it's ARL-S which will have the fabled 8+16 config.
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106
Would a presumably more expensive 13900k side-grade really make for a compelling product though? Sounds like Rocket Lake, in some ways. And I'd be skeptical of Meteor Lake's SoC changes from a gaming standpoint. Will be very curious to see what memory latency ends up being.
Was Rocket Lake more energy efficient than Comet Lake though? Maybe when equaling core count but I doubt the 11900k is more efficient than the 10900k. I think consumers are willing to accept a side grade if MTL ends up having a good gain in efficiency, which I think it will have. As long as the ST is within 95ish percent of the 14900k, I think Intel has a good argument for launching it. Consumers with a 12900k-14900k prob won't have a reason to upgrade if ST doesn't increase, but they prob won't upgrade anyway since MTL-S is on a new mobo regardless.
 

Hulk

Diamond Member
Oct 9, 1999
4,375
2,252
136
As noted previously, this issue began with the transition from 22nm with Ivy Bridge to 14nm with Broadwell. Ivy was at 3.9GHz when Broadwell was released in Q4 2014 in mobile form. It was strange when suddenly we had a "Tick" with no desktop release. Looking back, it is obvious Intel was having problems ramping frequency with 14nm. In fact, they didn't even bring the very limited desktop parts to market until Q2 2015, and those clocks were still below Ivy Bridge (3.9GHz for the 3770K vs 3.7GHz for the Broadwell 5775C), though they had brought them up from the initial mobile-only release.

Amazingly the 14nm process that initially gave Intel so many problems ended up being their primary process for 7 years, through all of those Skylake revisions and to Rocket Lake, scaling from 3.7GHz with Broadwell to 5.3GHz with Rocket Lake.

The same thing happened with 10nm but probably to a greater extent because it took much longer to bring a viable mobile 10nm product to market with Ice Lake and even longer for desktop-level frequency with Alder Lake.

The point of all of this is that we can assume that Intel is learning from history. It takes them a while to ramp up frequency and yields as these process features reduce in size. They also need to keep producing processors, so it becomes an economic problem of when to transition from one node to the next.

Based on this historical information and rumors I think it is quite reasonable to assume that while the Redwood Cove cores are showing IPC gains over Raptor Cove, whatever those gains may be they are not going to be able to offset the very mature Intel 7 max frequency.

This leads Intel to make a decision. Assuming it is performing better than Raptor in mobile applications, that would be a go for launch. But with the high bar set by the 13900KS, is it wise to release a next generation desktop part that is a regression in performance? Now I realize that many of you are thinking: if this part produces nearly the same performance as Raptor but at much better efficiency, just release it!

But Intel is thinking they can refine the current Intel 7 process using the same fabs and produce a part not only faster than Meteor but also faster than current Raptor with a Refresh part. And they can continue to "work" the Intel 7 fabs, which are assuredly big money makers as they have taken the place of the 14nm process.

So, from the economic Intel perspective they get a "better mobile" part with Meteor and use these parts as the initial transition to Intel 4 while keeping Intel 7 for a bit longer with the Raptor Refresh. What do they lose? People are going to scream and yell about RLR efficiency. In the end they are assuming that one drawback doesn't offset the positives I've described above.

So for Intel, 14th generation continues "on schedule" on paper because Meteor on Intel 4 is released and they have a better desktop part to release alongside it. This additional year of time *should* allow the Intel 4 process to mature to the point in terms of yields and frequency where Gen 15 Arrow Lake can also be released basically on schedule or close to it in both desktop and mobile forms with performance and efficiency beating generation 14.

Finally, we heard rumors quite far back that Meteor would be 6+16, which was basically informing us that this would be a mobile based release. Why push die sizes to 8+16 for a mobile release? Intel has already determined that even 8+16 with Meteor would have trouble competing with a Raptor Lake Refresh in terms of maximum performance.

Also, if Intel did move desktop to Meteor, besides having to deal with the fallout of having a new part that is less performant than the previous one, they would have to make sure yields were good enough for demand and begin to transition from the highly lucrative Intel 7 fab to the less lucrative Intel 4 one.

Economically this all makes perfect sense even if you might want a more efficient Meteor Lake desktop part on Intel 4 with nearly the same ST performance as Raptor but with better efficiency. While this might seem like a "dumb" business move (quoting Igor's recent post) from the perspective of your needs, in the overall plan it's the best option. This is why I stated only the "dumb" companies survive.

Anyway that's my theory and reasoning with the current Meteor Lake situation. Now I'll let you all take it apart!

Then the whole process starts again with Intel 20A but I am assuming Intel is getting ahead of this better each time. We shall see.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
The point of all of this is that we can assume that Intel is learning from history.
Generally agree, but I'm not sure I'm seeing results of Intel learning from history so far. Rather it looks to me more like "fake it till you make it" with no real alternate plans beyond delays and refreshes. With 14nm and Skylake refreshes the delays actually turned out very well financially for Intel. With 10nm and Alder Lake refreshes Intel is not in such a good position currently though.
 
Reactions: Tlh97 and coercitiv

Hulk

Diamond Member
Oct 9, 1999
4,375
2,252
136
Generally agree, but I'm not sure I'm seeing results of Intel learning from history so far. Rather it looks to me more like "fake it till you make it" with no real alternate plans beyond delays and refreshes. With 14nm and Skylake refreshes the delays actually turned out very well financially for Intel. With 10nm and Alder Lake refreshes Intel is not in such a good position currently though.

Intel is currently competitive with AMD due to Alder Lake and Raptor Lake. It's pretty much an even ballgame. I'm not seeing them in a terrible position in the marketplace with their current product stack. Now back with Rocket Lake vs. Zen 3 they didn't have anything compelling on the desktop, which is why I waited for ADL to upgrade from Haswell.
 

moinmoin

Diamond Member
Jun 1, 2017
4,994
7,765
136
Intel is currently competitive with AMD due to Alder Lake and Raptor Lake. It's pretty much an even ballgame. I'm not seeing them in a terrible position in the marketplace with their current product stack. Now back with Rocket Lake vs. Zen 3 they didn't have anything compelling on the desktop, which is why I waited for ADL to upgrade from Haswell.
I'm not talking about Intel's competitiveness versus AMD in the consumer market, but its financial state. Skylake refreshes allowed for some of Intel's most profitable quarters. The recent quarters were close to the opposite end, and it's not really clear that Intel knows how to get through these ebbs, already shelving projects just started a year ago.
 
Reactions: Tlh97 and Hulk

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106
I'm not talking about Intel's competitiveness versus AMD in the consumer market, but its financial state. Skylake refreshes allowed for some of Intel's most profitable quarters. The recent quarters were close to the opposite end, and it's not really clear that Intel knows how to get through these ebbs, already shelving projects just started a year ago.
I think that's the exception to the rule rather than the norm. Skylake refreshes were only truly super profitable because AMD couldn't compete regardless, right? Now that competition from AMD has returned, moving to new nodes and new architectures is a necessity to compete.
 

Geddagod

Golden Member
Dec 28, 2021
1,205
1,172
106
Generally agree, but I'm not sure I'm seeing results of Intel learning from history so far. Rather it looks to me more like "fake it till you make it" with no real alternate plans beyond delays and refreshes. With 14nm and Skylake refreshes the delays actually turned out very well financially for Intel. With 10nm and Alder Lake refreshes Intel is not in such a good position currently though.
I mean I get RPL is technically a refreshed ADL, but I feel like it is leaving out some of the larger picture. I would be willing to bet the larger L2 cache and extra 8 E-cores were features that were pulled in from MTL-S original design goals...
Now that I think about it, this is my guess about how RPL development went:
MTL-S originally brought:
8 extra E-cores
Larger L2
RWC arch (with stuff like increased L1$)
Crestmont arch
Intel 4
Tiled design
Intel then realized MTL-S was not coming out in 2022. They needed a stopgap CPU, so they decided to just continue with GLC, add extra L2 to client, which was a benefit of MTL, and also add 8 extra E-cores to remain competitive. Not changing the architecture to RWC and Crestmont ensured there was no need for a messy backport, and the tiled design and Intel 4 were probably also not ready at that point.
 