That is a lot less than you think it is.
No, it isn’t. They need more capacity. Yields aren’t the issue.
Based off? You have to remember, ICL-SP isn't quite using the same process as TGL.
Update: Just some back-of-the-napkin math here. Intel's middle die-size HCC on Skylake Xeon Scalable was ~480 mm², with a production rate of 108 dies per wafer, assuming perfect yield. Let's work on the assumption that this would be a good die size for Ice Lake Xeon (more cores, denser 10nm process). If Intel's defect rate for 10nm was as good as TSMC's N7, which we know to be 0.09 defects per cm², the actual yield would be 71 dies per wafer, which equates to 66%. If Intel was extracting 71 dies per wafer, then 115,000 dies would be around 1620 wafers. We don't know at this point if Intel is absorbing some of those defects by having more physical cores on the die than will be offered, and this doesn't take into account reduced core count configurations (e.g. a 20-core part from a 28-core die). But it's an interesting number.
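For anyone who wants to replay the napkin math above, here is a rough sketch. The quoted update doesn't say which yield model was used; Murphy's model is my assumption, picked because it lands on roughly the same 66% for a 480 mm² die at 0.09 defects per cm². The variable names are made up for illustration.

```python
import math

def murphy_yield(die_area_cm2, defects_per_cm2):
    """Murphy's yield model (an assumed choice, not confirmed by the source)."""
    ad = die_area_cm2 * defects_per_cm2
    return ((1 - math.exp(-ad)) / ad) ** 2

gross_dies_per_wafer = 108   # from the post: 480 mm² die, perfect yield
die_area_cm2 = 4.80          # 480 mm² expressed in cm²
defect_density = 0.09        # defects per cm², the TSMC N7 figure from the post
units_shipped = 115_000      # die count used in the post

yield_fraction = murphy_yield(die_area_cm2, defect_density)
good_dies = gross_dies_per_wafer * yield_fraction
wafers_needed = units_shipped / round(good_dies)

print(f"yield: {yield_fraction:.1%}")
print(f"good dies per wafer: {good_dies:.1f}")
print(f"wafers for {units_shipped} dies: {wafers_needed:.0f}")
```

A simple Poisson model, exp(-area x defect density), comes out about a point lower, so the wafer count moves only slightly either way. Like the post, this ignores partially defective dies sold as lower core count parts.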
What is not serious about it? Ice Lake SP contains up to 32 cores. They've shipped more than 100,000 units according to AT. If they were having yield issues they wouldn't be shipping large dies at all.
ICL-SP is up to 40c now.
How many of the sold units were based on Intel's popular mainstream platform LGA1150 or LGA1151, and how many were based on Haswell-EP or Skylake-EP?
Looks legit. Kinda crappy that the 15 W i7 only gets 2 Big even if it gets 8 small. Guess they are pushing the 28 W.
That might not even be able to challenge Renoir-U in MT workloads, if I'm not missing anything, certainly not Cezanne-U. Those guys are absolute beasts for 15W laptop chips when it comes to performance in a lot of actual work-related software.
In ideal conditions they will likely surpass Cezanne, but in a strange way: either workloads with strong emphasis on ST perf, or workloads with strong emphasis on MT perf (10+ threads, great MT scaling). Anything in between will likely run better or more consistently on an 8+0 chip.
That's the problem with the hybrid setup: you only need 4 small cores to get the power saving benefits. Once you aim for the MT performance benefit you need lots of these little buggers, and they eat into the big core budget. To make things even harder: area limitation gets compounded with the iGPU focus, since Intel wants to keep graphics emphasis and allocate same relative real estate for the CPU as they did on TGL 4+0.
Gracemont needs to be fast and devilishly efficient to get such premium real-estate on the die.
Umm... you know that the little cores do *not* have HT, right? What I mean with that question: I just can't see 2 really strong cores together with 8 'maybe Skylake' cores at God knows what freq, even with perfect scaling and scheduling (don't forget, we're still talking about Microsoft here), touching 8 full fledged Zen 3 cores in an MT heavy workload.
The question is: is that even necessary? Games and most other software don't need that; you are only going to see a difference in benchmarks and heavy productivity apps. At the end of the day, what matters for mobile is heat and power vs. overall perf. So we really need to see that first.
I can compare this to Bay Trail vs. Kabini in mobile: Kabini had the CPU+IGP performance, but IGP perf was cut short due to bad decisions like ST, and in the end Bay Trail's low power and TDP were great and allowed x86 into tablets for the first time with an actually good product. To get rid of Kabini, AMD had to repurpose it as cheap desktop CPUs.
So AMD will likely keep the CPU lead in mobile vs. hybrid CPUs; power usage and TDP are the question, and so is the iGPU, because Vega has overstayed its welcome.
Don't ever let the fact that I was specifically talking about ADL vs RNR or CZN in heavy productivity apps bother you even for a second.
I know they don't have HT. As I said, you need the opposite ends of the spectrum for the hybrid to win. It's weaker between 4 and 8 threads, but picks up steam between 8 and 12 threads.
Time for a "napkin graph". For the sake of convention, consider ADL = 1.4x SKL, Zen3 = 1.25x SKL, Gracemont = 1x SKL, HT = +20% no matter the architecture. AFAIK ADL will have HT enabled on the big cores. Here's how performance would look assuming clocks are the same on all cores.
View attachment 41101
After this you need to consider:
- max clocks on Gracemont, lower max clocks may further exacerbate the loss between 4-8 threads
- likely efficiency advantage from the small cores that may allow higher clocks on the hybrid ADL, pushing for a win in the entire 10-16 thread spectrum
This napkin math requires not just actual clock parity as you mentioned, but also 100% efficient and perfectly managed Windows scheduling between cores and threads and in-process tasks and such. Good luck with that! To Intel, I mean.
Would not the graph be different for each AL configuration? If AL in fact has higher IPC than Zen, and similar clocks, then 8+0 (or 8+X) should be faster up to a certain number of threads (8+x, depending on hyperthreading efficiency), then drop off rapidly as Zen increases in core count while AL adds only small cores.
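The "napkin graph" model from the post (ADL = 1.4x SKL, Zen3 = 1.25x SKL, Gracemont = 1x SKL, HT = +20%, equal clocks) can be sketched with the core configuration as a parameter. This is only a rough sketch: the function name and the "fill the best thread slots first" rule are my assumptions, i.e. a perfect scheduler and perfect MT scaling, which is exactly what the thread doubts.

```python
def throughput(n_threads, big, small, big_ipc, small_ipc=1.0, ht_gain=0.20):
    """Relative throughput (1.0 = one SKL core), same clocks on all cores.

    Assumes an ideal scheduler that always fills the most productive
    hardware thread slot first (a hypothetical best case).
    """
    slots = (
        [big_ipc] * big               # first thread on each big core
        + [small_ipc] * small         # small cores, no HT
        + [big_ipc * ht_gain] * big   # second (HT) thread on each big core
    )
    slots.sort(reverse=True)
    return sum(slots[:n_threads])

# Configurations discussed in the thread: ADL 2+8 vs. Cezanne (Zen3) 8+0
for n in range(1, 17):
    adl = throughput(n, big=2, small=8, big_ipc=1.4)
    zen3 = throughput(n, big=8, small=0, big_ipc=1.25)
    print(f"{n:2d} threads  ADL 2+8: {adl:5.2f}  Zen3 8+0: {zen3:5.2f}")
```

Under these assumptions the 2+8 part is ahead at 1 to 3 threads, behind from 4 to 9, and ahead again from 10 to 12, where it runs out of hardware threads. Changing `big` and `small` gives the curve for any other configuration, and none of this accounts for max clocks or power limits.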
What, if any, is the effect on overall performance for an application should some of the threads require a lesser amount of compute than other threads? Let me try to communicate my question clearly in a hypothetical simplified case.
Imagine an application that spawns three threads. Two are compute "light" and one is compute "heavy." They are dependent so they must be run more or less simultaneously.
Now imagine a CPU with two Big cores running these threads. One Big core executes the compute heavy thread while the other Big core executes the two compute light threads. Of course this Big core would be constantly switching context.
Now imagine a CPU with one Big core and two Little. You see where I'm going. The Big core is assigned to the heavy compute thread while the two Little cores are each assigned one light compute thread, and they have a happy Big/Little family running this application.
So now for my question, worded as precisely as I can manage:
I realize that my hypothetical example may be quite far-fetched when it comes to reality, but could a Big/Little strategy be a better "fit" for some applications? Meaning, if an application has varied compute loads across threads, could the Big/Little cores be assigned optimally to reduce context switching and equal or beat the performance of a number of Big cores with greater theoretical total compute?
This again brings up the problem of scheduling. Apple has an easy time getting away with home runs despite the extreme complexity of scheduling, as they control their OS and their whole ecosystem. Ironically, this fact also excludes me as a potential customer, because in the past 2 decades, every single attempt I've made to 'like' or even 'get accustomed to' using Apple products has made me hate myself very quickly.
Yes, and as 32C ICL seems to be actually competitive with 32C Rome (finally), 40C ICL should have no problem competing with 64C Rome. Or Milan. Well, I mean...
Maybe one thing that Windows 10 could do is limit scheduling of "apps" (the new "Store" apps) to the "little" cores, and leave the "big" cores completely free for system interrupts or Win32/64 native applications that might be compute-heavy (games).
Or maybe they can do things the way "Optimus" does on laptops with GPU allocation, except this time they would allocate either "big" cores or "little" cores according to application listings / profiles.
Hah, not bad, not bad at all! I wonder if MS would be one of the only BIG-BIG companies where brilliantly simple ideas coming from way under (sorry for describing you as way under, but you don't strike me as a prime executive preoccupied with proving his worth at ANY cost) can convince decision makers to act on them! OK I admit, I don't actually wonder! 🤣
We were talking specifically ADL 2+8 vs. Cezanne 8+0 in the 15W TDP range (power limited, clocks going down fast in MT loads).
I did mention earlier that it would surpass Cezanne's throughput in ideal conditions only. Running 10+ concurrent threads with high MT scaling is rather hard to do in typical consumer loads, and that's before we get to max clocks and the Win scheduler.