The idea of putting the CPU on top of the memory for cooling sounds great, but it likely doesn’t work. I think the problem is getting ~100 W through TSVs to the CPU. That seems difficult to do, but I don’t know what the limit is. CPUs usually have a huge number of power and ground pins to handle the current draw, which is massive at such low voltages.
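To put rough numbers on the current draw, here is a back-of-the-envelope sketch using I = P / V. The ~1 V core voltage is my assumption (the exact figure varies by chip and load), and the per-TSV current limit is a made-up placeholder, not a real spec, since I don't know the actual limit. The function names are just for illustration.

```python
import math


def supply_current_amps(power_watts: float, core_voltage: float) -> float:
    """Current the power delivery network must carry: I = P / V."""
    return power_watts / core_voltage


def min_power_ground_tsvs(power_watts: float, core_voltage: float,
                          amps_per_tsv: float) -> int:
    """Lower bound on TSV count for a given per-TSV current limit.

    amps_per_tsv is a HYPOTHETICAL placeholder; real limits depend on the
    process and via geometry. The count is doubled because the same current
    has to come back through ground TSVs.
    """
    current = supply_current_amps(power_watts, core_voltage)
    per_rail = math.ceil(current / amps_per_tsv)
    return 2 * per_rail


if __name__ == "__main__":
    # Assumed values: 100 W from the post, ~1 V core voltage (assumption),
    # 0.25 A per TSV (placeholder only).
    print(supply_current_amps(100.0, 1.0))
    print(min_power_ground_tsvs(100.0, 1.0, 0.25))
```

Whatever the real per-TSV limit is, the point is that at around 1 V, 100 W means on the order of 100 A, and all of it would have to pass through the die underneath.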
Edit: I just watched a video on the Moore’s Law Is Dead YouTube channel about stacked GPUs. They seem to think that the compute die is stacked completely on top of the cache / IO die, and they made a bunch of renderings showing that. That seems unlikely to me, although the power consumption of the compute die may not be that high if the whole thing is 150 W (2 GPU chiplets, base die or bridge die, and HBM). I would expect it to use EFB with an embedded cache / bridge die under the two GPU chiplets rather than a base die. Then they would also have bridge dies (more likely just passive bridges) to connect the HBM. I have thought that they may use the same Infinity Cache die with Bergamo. It would likely be L4, since it has lower connectivity than the SoIC used for the V-Cache die.