In light of "that other company" messing around with cache stacking: why isn't Intel utilizing Foveros to do the same thing (more or less)? They have the packaging tech to do it.
Hehe, it doesn't work quite like that. Here's the big catch about heatsinks and coolers in general: the amount of heat they can dissipate increases roughly linearly with the temperature delta between the heatsink and the air passing through the fins. The hotter the heatsink is allowed to get, the more heat it is able to dissipate.
Let's start with a familiar example: moving from an Intel 8th gen CPU with paste thermal interface material, to 9th gen with solder, and finally to 10th gen with a thinner die and interface. Using the same cooler with the same fan and speed, simply improving thermal transfer between the CPU die and the heatsink makes the cooler perform better (the heatsink gets hotter, so more heat gets dissipated). The cooler is unchanged, yet heat dissipation improves because of the lower thermal resistance between the CPU die and the heatspreader.
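To make that concrete, here's a minimal sketch of the series-resistance picture. All resistance and temperature numbers are made-up illustrative values, not measurements from any real CPU or cooler:

```python
# Steady-state model: heat flows die -> TIM -> heatsink -> air through a
# chain of thermal resistances (K/W). Max sustained power at a fixed die
# temperature limit is P = (T_die_max - T_ambient) / R_total, so lowering
# the die-to-heatsink resistance lets the heatsink run hotter and the
# same cooler dissipate more watts.

T_AMBIENT = 25.0    # C
T_DIE_MAX = 95.0    # C, throttle point
R_HEATSINK = 0.50   # K/W, heatsink-to-air (same cooler in every case)

# Hypothetical die-to-heatsink resistances for the three TIM generations:
tim_options = {
    "8th gen (paste)":     0.35,
    "9th gen (solder)":    0.20,
    "10th gen (thin die)": 0.15,
}

for name, r_tim in tim_options.items():
    p_max = (T_DIE_MAX - T_AMBIENT) / (r_tim + R_HEATSINK)  # sustained watts
    t_heatsink = T_AMBIENT + p_max * R_HEATSINK             # heatsink temp at that load
    print(f"{name}: ~{p_max:.0f} W sustained, heatsink at ~{t_heatsink:.0f} C")
```

With those numbers the identical heatsink goes from roughly 82 W to roughly 108 W of sustained dissipation purely because it's allowed to run hotter.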
Back to the mobile scenario: when trying to keep the CPU from throttling, the laptop cooler uses only part of its potential. For example, some heatpipes may not make (direct) contact with the CPU, only with the GPU; the same may apply to fan assemblies. On top of that, when both the GPU and CPU are loaded, the heatsink captures more heat and can rise higher in temperature (in all areas). This can easily result in more heat being dissipated during combined workloads, as long as we're not talking about 65 W of package power for the CPU + xy W of package power for the GPU, since adding GPU power obviously introduces some additional limit on max CPU power. You may find that 45 W + 45 W is possible, or maybe something like 35 W + 60 W.
It all depends on the heatpipe configuration and where the overall bottlenecks of the cooler lie: thermal resistance close to the die (i.e. bad paste), heatpipes, radiator, airflow, noise profile.
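A toy illustration of that shared-budget behavior, again with invented resistance values; it only checks the CPU die against its limit, but it shows why combined dissipation can exceed the CPU-only figure:

```python
# Both dies dump heat into one shared heatsink, which sheds it to the air.
T_AMBIENT, T_DIE_MAX = 25.0, 95.0
R_HS  = 0.45   # K/W, shared heatsink/fins to air
R_CPU = 0.25   # K/W, CPU die-to-heatsink contact

def max_cpu_power(p_gpu):
    """Max sustained CPU watts before the CPU die hits T_DIE_MAX,
    given the GPU is already dumping p_gpu watts into the shared heatsink."""
    # T_die = T_AMBIENT + (p_cpu + p_gpu) * R_HS + p_cpu * R_CPU, solved for p_cpu
    return (T_DIE_MAX - T_AMBIENT - p_gpu * R_HS) / (R_HS + R_CPU)

for p_gpu in (0, 30, 50):
    p_cpu = max_cpu_power(p_gpu)
    print(f"GPU at {p_gpu:>2} W -> CPU budget ~{p_cpu:.0f} W, combined ~{p_cpu + p_gpu:.0f} W")
```

The CPU budget shrinks as GPU power rises, yet the total heat moved keeps climbing because the whole heatsink runs hotter, which is exactly the combined-workload effect described above.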
PS: one of my laptops had a 45W CPU, a 50W dGPU and a dual-fan config with 2 shared heatpipes. The laptop was able to keep the CPU from throttling while running Prime95 in long sessions, but temps were close to the limit (PL2 was ~56W, boost window was 1 minute or less). The same system was able to run combined gaming workloads which could easily have reached 70W+ of power use.
Intel had packaging leadership back in the Haswell era with eDRAM.
It isn't just that: Intel already released Lakefield. Sure, it's not an amazing product, due to... reasons, but it's out there, and it uses Foveros. Intel has demonstrated that they can stack dies. There's really nothing stopping them from doing it now. Makes you wonder (as you did) why they didn't roll out die stacking for products like Tiger Lake or even Rocket Lake. Someone else beat them to the punch.
We had a very accurate Intel leaker until Intel nabbed him, possibly to stop the leaks. That was Ashraf.
So when he was talking about Tigerlake having 1/3rd the idle power or something, I believed there was a genuine reason that would be the case, like Foveros with the PCH as the active interposer, just like Lakefield.
Because Intel is behind AMD in integrating the PCH. I mean, come on, the PCH in AMD chips isn't as fully featured as the PCH in Intel chips, but it works, because APUs go into integrated systems that require little expansion, and the PCH-lite worked perfectly. The CPUs that needed the extra functionality and bandwidth got a separate PCH.
I'm disappointed that Jasper Lake regresses in that respect. AMD goes monolithic in mobile because on-die PCH functionality not only cuts peak power by a few watts, but also lets power management work much faster and better. AMD actually beats Intel in battery life now.
Of course, plans can always change. Just because the original Nehalem design never shipped doesn't mean Intel never planned it.
What seems like an advantage on paper doesn't always materialize, I get that. Lakefield, using expensive Foveros technology, should have resulted in low idle power; Intel failed to deliver that. But on paper, with good execution and good low-level design, it would have resulted in a very efficient CPU that bridged the battery-life gap against ARM parts. Based on that Adored leak, they are trying to get idle power down with the Alderlake successor to Lakefield. But without getting the PCH on-die, or stacking it somehow, it'll always be behind.
I believe the failures boil down to a few key reasons:
-Brain drain over the years due to massive mismanagement, followed by further losses of people who should have kept the train going as Intel lost its leadership position. Why stay at a company with a dark future?
-Lack of an on-die PCH is because Intel wants to keep older fabs operational for maximum margins. If you integrate the PCH, you no longer have massive volumes of dies keeping those fabs open. With an on-die PCH, the ~350 million 60mm2 dies produced annually are no longer needed (a rough back-of-the-envelope on that volume is sketched after this list).
-In the case of Lakefield, it was basically an experimental project, so they probably put a small and/or inexperienced team on it. Tigerlake would sell literally 100x the volume, so if any problems came up they would throw every resource and their best people at it.
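For a sense of scale on that chipset-volume point, a rough back-of-the-envelope (the edge-loss factor is a guess, not an Intel figure):

```python
import math

WAFER_DIAMETER_MM = 300.0
DIE_AREA_MM2 = 60.0
DIES_PER_YEAR = 350e6

wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2   # ~70,686 mm^2
gross_dies = wafer_area / DIE_AREA_MM2 * 0.88         # crude edge-loss correction

wafers_per_year = DIES_PER_YEAR / gross_dies
print(f"~{gross_dies:.0f} dies/wafer -> ~{wafers_per_year:,.0f} wafers/year "
      f"(~{wafers_per_year / 12 / 1000:.0f}k wafers/month)")
```

Call it roughly 28k wafer starts a month: a meaningful chunk of an older fab's capacity, which is exactly the "keeping those fabs open" point.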
On another note, Intel kind of reminds me of GM in the '70s. Like GM back then, it wants to optimize its success with marketing and finance.
Intel’s front-side bus has a long history that dates back to 1995 with the release of the Pentium Pro (P6). The P6 was the first processor to offer cheap and effective multiprocessing support; up to four CPUs could be connected to a single shared bus with very little additional effort for an OEM. The importance of cheap and effective cannot be overstated.
Intel has a brisk chipset business on both the desktop and notebook side that keeps older fabs effectively utilized – an essential element of Intel’s capital strategy. If Intel were to integrate the northbridge in all MPUs, it would force the company to find other products which can use older fabs, or shutter some of the facilities.
The success of the Pentium Pro and its lineage captured the multi-billion dollar RISC workstation and low-end server market, but that success also created inertia around the bus interface. Politics within the company and with existing partners, OEMs and customers conspired to keep Intel content with the status quo.
Intel's latest foundry initiative could allow them to sell that fab capacity.
Not quite. As someone who has owned Zen 1 through Zen 3, I have some rather interesting observations to share, among the most important: Zen 2 is the best of the bunch when it comes to perf/watt.
Where Zen 2 has an advantage is in two areas:
1. Clocks at a given power. Usually only an extra ~200 MHz or so, but it's there. Less pronounced the higher up in power you go, but on the flip side, it can be larger than that in extremely low-power situations (8 cores in
2. SMT yield. Zen 2 sees a larger gain in performance on average via SMT than Zen 3.
Ultimately the IPC boost brought by Zen 3 cores makes up for both of those in almost every scenario. (A quick sketch of what I mean by SMT yield is below.)
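The scores here are placeholder numbers, not results from my chips; they just show how the yield is computed:

```python
def smt_yield(score_smt_off, score_smt_on):
    """Fractional multithreaded throughput gain from enabling SMT
    at the same core count."""
    return score_smt_on / score_smt_off - 1.0

# Hypothetical multithreaded scores for the same test:
zen2 = smt_yield(score_smt_off=10_000, score_smt_on=13_000)  # ~30% gain
zen3 = smt_yield(score_smt_off=12_500, score_smt_on=15_500)  # ~24% gain
print(f"Zen 2 SMT yield: {zen2:.0%}, Zen 3 SMT yield: {zen3:.0%}")
```

Note that even with the lower yield, Zen 3's SMT-on score is still higher in absolute terms in this toy example, which is the "IPC makes up for it" point.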
I would imagine Intel's 22nm and ultra-refined 14nm processes could be put to great use in other applications.