In light of "that other company" messing around with cache stacking: why isn't Intel utilizing Foveros to do the same thing (more or less)? They have the packaging tech to do it.
Hehe, it doesn't work quite like that. Here's the big catch about heatsinks and coolers in general: the amount of heat they can dissipate increases roughly linearly with the temperature delta between the heatsink and the air passing through the fins. The hotter the heatsink is allowed to get, the more heat it is able to dissipate.
Let's start with a familiar example: moving from an Intel 8th gen CPU with paste thermal interface material, to 9th gen with solder, and finally to 10th gen with a thinner die and interface. Using the same cooler with the same fan and speed, simply improving thermal transfer between the CPU die and the heatsink makes the cooler perform better (the heatsink gets hotter, so more heat gets dissipated). The cooler is unchanged, yet heat dissipation improves because of the lower thermal resistance between the CPU die and the heatspreader.
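To make that concrete, here's a minimal sketch of the series-resistance picture. All resistance and temperature numbers are made-up illustrative values, not measurements from any real CPU or cooler:

```python
# Steady-state model: heat flows die -> TIM -> heatsink -> air through a
# chain of thermal resistances (K/W). Max sustained power at a fixed die
# temperature limit is P = (T_die_max - T_ambient) / R_total, so lowering
# the die-to-heatsink resistance lets the heatsink run hotter and the
# same cooler dissipate more watts.

T_AMBIENT = 25.0    # C
T_DIE_MAX = 95.0    # C, throttle point
R_HEATSINK = 0.50   # K/W, heatsink-to-air (same cooler in every case)

# Hypothetical die-to-heatsink resistances for the three TIM generations:
tim_options = {
    "8th gen (paste)":     0.35,
    "9th gen (solder)":    0.20,
    "10th gen (thin die)": 0.15,
}

for name, r_tim in tim_options.items():
    p_max = (T_DIE_MAX - T_AMBIENT) / (r_tim + R_HEATSINK)  # sustained watts
    t_heatsink = T_AMBIENT + p_max * R_HEATSINK             # heatsink temp at that load
    print(f"{name}: ~{p_max:.0f} W sustained, heatsink at ~{t_heatsink:.0f} C")
```

With those numbers the identical heatsink goes from roughly 82 W to roughly 108 W of sustained dissipation purely because it's allowed to run hotter.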
Back to the mobile scenario: when trying to keep the CPU from throttling, the laptop cooler uses only part of its potential. For example, some heatpipes may not make (direct) contact with the CPU, only with the GPU; the same may apply to fan assemblies. On top of that, when both the GPU and CPU are loaded, the heatsink captures more heat and can rise higher in temperature (in all areas). This can easily result in more heat being dissipated during combined workloads, as long as we're not talking about 65 W of package power for the CPU + xy W of package power for the GPU, since adding GPU power obviously introduces some additional limit on max CPU power. You may find that 45 W + 45 W is possible, or maybe something like 35 W + 60 W.
It all depends on the heatpipe configuration and where the overall bottlenecks of the cooler lie: thermal resistance close to the die (i.e. bad paste), heatpipes, radiator, airflow, noise profile.
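A toy illustration of that shared-budget behavior, again with invented resistance values; it only checks the CPU die against its limit, but it shows why combined dissipation can exceed the CPU-only figure:

```python
# Both dies dump heat into one shared heatsink, which sheds it to the air.
T_AMBIENT, T_DIE_MAX = 25.0, 95.0
R_HS  = 0.45   # K/W, shared heatsink/fins to air
R_CPU = 0.25   # K/W, CPU die-to-heatsink contact

def max_cpu_power(p_gpu):
    """Max sustained CPU watts before the CPU die hits T_DIE_MAX,
    given the GPU is already dumping p_gpu watts into the shared heatsink."""
    # T_die = T_AMBIENT + (p_cpu + p_gpu) * R_HS + p_cpu * R_CPU, solved for p_cpu
    return (T_DIE_MAX - T_AMBIENT - p_gpu * R_HS) / (R_HS + R_CPU)

for p_gpu in (0, 30, 50):
    p_cpu = max_cpu_power(p_gpu)
    print(f"GPU at {p_gpu:>2} W -> CPU budget ~{p_cpu:.0f} W, combined ~{p_cpu + p_gpu:.0f} W")
```

The CPU budget shrinks as GPU power rises, yet the total heat moved keeps climbing because the whole heatsink runs hotter, which is exactly the combined-workload effect described above.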
PS: one of my laptops had a 45W CPU, a 50W dGPU and a dual-fan config with 2 shared heatpipes. The laptop was able to keep the CPU from throttling while running Prime95 in long sessions, but temps were close to the limit (PL2 was ~56W, boost window was 1 minute or less). The same system was able to run combined gaming workloads which could easily have reached 70W+ of power use.
Intel had packaging leadership back in the Haswell era with eDRAM.
It isn't just that: Intel already released Lakefield. Sure, it's not an amazing product, due to... reasons, but it's out there, and it uses Foveros. Intel has demonstrated that they can stack dies. There's really nothing stopping them from doing it now. Makes you wonder (as you did) why they didn't roll out die stacking for products like Tiger Lake or even Rocket Lake. Someone else beat them to the punch.
We had a very accurate Intel leaker until Intel nabbed him, possibly to stop the leaks. That was Ashraf.
So when he was talking about Tigerlake having 1/3rd the idle power or something, I believed there was a genuine reason that would be the case, like Foveros with the PCH as the active interposer, just like Lakefield.
Because Intel is behind AMD in integrating the PCH. I mean, come on, the PCH in AMD chips isn't as fully featured as the PCH in Intel chips, but it works, because APUs go into integrated systems that require little expansion, and the PCH-lite worked perfectly. The CPUs that needed the extra functionality and bandwidth got a separate PCH.
I'm disappointed that Jasper Lake regresses in that respect. AMD goes monolithic in mobile because on-die PCH functionality not only cuts peak power by a few watts, but also lets power management work much faster and better. AMD actually beats Intel in battery life now.
Of course, plans can always change. Just because the original Nehalem design never shipped doesn't mean Intel never planned it.
What seems like an advantage on paper doesn't always materialize, I get that. Lakefield, using expensive Foveros technology, should have resulted in low idle power; Intel failed to deliver that. But on paper, with good execution and good low-level design, it would have resulted in a very efficient CPU that bridged the battery-life gap against ARM parts. Based on that Adored leak, they are trying to get idle power down with the Alderlake successor to Lakefield. But without getting the PCH on-die, or stacking it somehow, it'll always be behind.
I believe the failures boil down to a few key reasons:
-Brain drain over the years due to massive mismanagement, followed by further losses of people who should have kept the train going as Intel lost its leadership position. Why stay at a company with a dark future?
-Lack of an on-die PCH is because Intel wants to keep older fabs operational for maximum margins. If you integrate the PCH, you no longer have massive volumes of dies keeping those fabs open. With an on-die PCH, the ~350 million 60mm2 dies produced annually are no longer needed (a rough back-of-the-envelope on that volume is sketched after this list).
-In the case of Lakefield, it was basically an experimental project, so they probably put a small and/or inexperienced team on it. Tigerlake would sell literally 100x the volume, so if any problems came up they would throw every resource and their best people at it.
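For a sense of scale on that chipset-volume point, a rough back-of-the-envelope (the edge-loss factor is a guess, not an Intel figure):

```python
import math

WAFER_DIAMETER_MM = 300.0
DIE_AREA_MM2 = 60.0
DIES_PER_YEAR = 350e6

wafer_area = math.pi * (WAFER_DIAMETER_MM / 2) ** 2   # ~70,686 mm^2
gross_dies = wafer_area / DIE_AREA_MM2 * 0.88         # crude edge-loss correction

wafers_per_year = DIES_PER_YEAR / gross_dies
print(f"~{gross_dies:.0f} dies/wafer -> ~{wafers_per_year:,.0f} wafers/year "
      f"(~{wafers_per_year / 12 / 1000:.0f}k wafers/month)")
```

Call it roughly 28k wafer starts a month: a meaningful chunk of an older fab's capacity, which is exactly the "keeping those fabs open" point.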
On another note, Intel kind of reminds me of GM in the '70s. Like GM back then, it wants to optimize its success with marketing and finance.
Intel’s front-side bus has a long history that dates back to 1995 with the release of the Pentium Pro (P6). The P6 was the first processor to offer cheap and effective multiprocessing support; up to four CPUs could be connected to a single shared bus with very little additional effort for an OEM. The importance of cheap and effective cannot be overstated.
Intel has a brisk chipset business on both the desktop and notebook side that keeps older fabs effectively utilized – an essential element of Intel’s capital strategy. If Intel were to integrate the northbridge in all MPUs, it would force the company to find other products which can use older fabs, or shutter some of the facilities.
The success of the Pentium Pro and its lineage captured the multi-billion dollar RISC workstation and low-end server market, but that success also created inertia around the bus interface. Politics within the company and with existing partners, OEMs and customers conspired to keep Intel content with the status quo.
Intel's latest foundry initiative could allow them to sell that fab capacity.
Not quite. As someone who has owned Zen 1 through Zen 3, I have some rather interesting observations to share, among the most important: Zen 2 is the best of the bunch when it comes to perf/watt.
Where Zen 2 has an advantage is in two areas:
1. Clocks at a given power. Usually only an extra ~200 MHz or so, but it's there. Less pronounced the higher up in power you go, but on the flip side, it can be larger than that in extremely low-power situations (8 cores in
2. SMT yield. Zen 2 sees a larger gain in performance on average via SMT than Zen 3.
Ultimately the IPC boost brought by Zen 3 cores makes up for both of those in almost every scenario. (A quick sketch of what I mean by SMT yield is below.)
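The scores here are placeholder numbers, not results from my chips; they just show how the yield is computed:

```python
def smt_yield(score_smt_off, score_smt_on):
    """Fractional multithreaded throughput gain from enabling SMT
    at the same core count."""
    return score_smt_on / score_smt_off - 1.0

# Hypothetical multithreaded scores for the same test:
zen2 = smt_yield(score_smt_off=10_000, score_smt_on=13_000)  # ~30% gain
zen3 = smt_yield(score_smt_off=12_500, score_smt_on=15_500)  # ~24% gain
print(f"Zen 2 SMT yield: {zen2:.0%}, Zen 3 SMT yield: {zen3:.0%}")
```

Note that even with the lower yield, Zen 3's SMT-on score is still higher in absolute terms in this toy example, which is the "IPC makes up for it" point.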
I would imagine Intel's 22nm and ultra-refined 14nm processes could be put to great use in other applications.