Discussion Intel current and future Lakes & Rapids thread


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I think it's obvious at least in desktops not all will be receptive to it. The overwhelmingly most important factor in desktops is performance.

They are feeling intense competitive pressure, and Alderlake happens to be the way to get their desktop chips onto 10nm. Anything will be better than Rocketlake/Cometlake anyway. It feels like a rush to get it onto desktops.

In laptops, where thermals play a big role and the design is a delicate balance between responsiveness, battery life, and sustained performance, Alderlake's configuration sounds pretty good.

Maybe with Meteor Lake the hybrid configuration will make sense even with desktops.

Humans can sense a 10-30ms delay, especially as it adds up on top of the actual computation time to get the expected result on the screen, and especially if this is linked to motion in the UI.

We can be supremely sensitive.

Also, our senses happen to be relative, which makes us more sensitive. We all used to play at 1024x768 and 60Hz. But after a few years of that, you notice 75Hz is even smoother, and so on. We're at 240Hz for some monitors now.

People with extreme sensitivity to PWM-induced flicker on computer screens can suffer adverse effects, without consciously perceiving the flicker, at frequencies in the high hundreds or even a few thousand Hz.
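To put the refresh rates above next to the 10-30ms figure, here is a quick sketch that converts Hz into time per frame. The function name is just for illustration.

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Time between two consecutive screen refreshes, in milliseconds."""
    return 1000.0 / refresh_hz

# Refresh rates mentioned in the post
for hz in (60, 75, 240):
    print(f"{hz:>3} Hz -> {frame_time_ms(hz):.2f} ms per frame")
```

At 60Hz a single frame is already about 16.7ms, so a 10-30ms delay is on the order of one or two frames.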
 
Reactions: Tlh97

Hulk

Diamond Member
Oct 9, 1999
4,367
2,234
136
It's quite possible the small cores will be a performance detriment for some workloads and a performance enhancement for others:
  • detriment for latency sensitive workloads and/or workloads that do not scale properly past 8 threads (the latency sensitive part is speculation at this point)
  • enhancement for throughput oriented workloads that scale well with more than 8 threads
Of course we need to see actual implementation to see which way the ratio goes, but you can rest assured hybrid will be a trade-off.


They are felt by the user; that's why both Intel and AMD invested R&D into making frequency and sleep-state transitions faster by migrating the decision process from software to hardware. Humans can sense a 10-30ms delay, especially as it adds up on top of the actual computation time to get the expected result on the screen, and especially if this is linked to motion in the UI.

Once the computer is "awake" from the sleep state what is the time frame for the processors to be ramping up and down? Is it still on the order of milliseconds? For example, when I open an application what is the time required to ramp from 800MHz or whatever to full speed?
 

coercitiv

Diamond Member
Jan 24, 2014
6,384
12,803
136
Once the computer is "awake" from the sleep state what is the time frame for the processors to be ramping up and down? Is it still on the order of milliseconds? For example, when I open an application what is the time required to ramp from 800MHz or whatever to full speed?
Read this.
 

Hulk

Diamond Member
Oct 9, 1999
4,367
2,234
136

Thank you for the link. Seems as though the ramp up time is pretty darn fast at about 35ms to max frequency. Everybody of course has their own subjective opinion on what makes a snappy system but I'm more concerned about the 3-4 second delay it takes for Thunderbird to open!
 

repoman27

Senior member
Dec 17, 2018
378
535
136
Looks like the wafer images Intel used to accompany their "Engineering the Future" presentation on Tuesday were 40-core Ice Lake-SP XCC. I did some quick calculations based on a count of the visible dies, and they're probably around 625 mm², which is fairly ambitious for 10+.



There's also a less cropped version available in the press kit.
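For anyone who wants to sanity-check a die size against a wafer shot, a common first-order approximation for gross dies per wafer is sketched below. This is not necessarily how the estimate above was done; the 300 mm wafer diameter is an assumption, and the formula ignores scribe lines, edge exclusion, and die aspect ratio.

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> float:
    """First-order estimate of whole dies on a round wafer.

    First term: wafer area divided by die area.
    Second term: partial dies lost along the wafer edge.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return wafer_area / die_area_mm2 - edge_loss

# Die sizes discussed in the thread
for area in (625, 640):
    print(f"{area} mm^2 -> ~{gross_dies_per_wafer(area):.0f} candidate dies")
```

Under those assumptions a die in the 625-640 mm² range lands in the mid-80s of candidate dies per wafer, before any yield loss.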
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Yea and 40 core Icelake-SP is a SKU, so no cores disabled.

So really the only reason we have Rocketlake-S rather than Tigerlake-S is because 2021 is the year of 10nm, it has only just started ramping, and they had to prioritize mobile and server?

Tigerlake-S might have been only a few % faster per clock, but it would have ended up being far lower power. So Rocketlake offers at most 10% more performance at much higher power.
 

eek2121

Diamond Member
Aug 2, 2005
3,043
4,265
136
Looks like the wafer images Intel used to accompany their "Engineering the Future" presentation on Tuesday were 40-core Ice Lake-SP XCC. I did some quick calculations based on a count of the visible dies, and they're probably around 625 mm², which is fairly ambitious for 10+.



There's also a less cropped version available in the press kit.

If you believe certain people here, Intel apparently only gets 1-2 chips per wafer, because they can’t possibly have good yields. 🤣
 

Hulk

Diamond Member
Oct 9, 1999
4,367
2,234
136
So really the only reason we have Rocketlake-S rather than Tigerlake-S is because 2021 is the year of 10nm, it has only just started ramping, and they had to prioritize mobile and server?

Exactly. Rocketlake-S on 14++++++++ isn't going to bother Dell, HP, and the others one bit. The main thing is they can supply processors and they don't need discrete GPUs. Intel has made a few questionable decisions over the past few years, but considering their choices I think this one was the "least bad."

It is interesting how just before the Rocket Lake official launch the 5800X achieved widespread availability. Smart marketing by AMD: prioritize the part that is in direct competition with Rocket Lake. Seeing how the 5800X is seemingly well stocked, I'm thinking the 5900X and 5950X can't be far behind?
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,146
136
Looks like the wafer images Intel used to accompany their "Engineering the Future" presentation on Tuesday were 40-core Ice Lake-SP XCC. I did some quick calculations based on a count of the visible dies, and they're probably around 625 mm², which is fairly ambitious for 10+.



There's also a less cropped version available in the press kit.
According to this website, the largest die variant (XCC) is 640 mm2. It's got 42 cores in total, but with 2 disabled for now at the highest SKU.
Regardless, I agree that it is incredibly ambitious for 10nm+.

 

Mopetar

Diamond Member
Jan 31, 2011
8,004
6,446
136
I don't know if it's necessarily ambitious because we don't really know much about the process and I don't know if it's reasonable to think it has yield issues now because of some problems from years prior when Intel first started with the new node. Yields may be good, but it could just be that it doesn't produce chips that can clock very high, which there is some evidence to support.

In that case, server and mobile make sense because they're the chips that you don't want to push beyond the point of efficiency because power and heat constraints are more important than squeezing out every last MHz like we see in their desktop line. Server chips also sell at a much higher price and even with monolithic dies that are quite large, you can still find plenty of bins that will generate more revenue than you could expect from consumer parts.

Also, we see companies like Nvidia fab similarly sized chips aimed at the corporate or professional market. GP100 was similarly sized and on the (at the time) new TSMC 16 nm process that had only been used for mobile SoCs prior to that. When you can sell each of those chips for several thousand or even upwards of ten thousand dollars, even if the yields aren't great on a new node, it can still make financial sense to go with a big chip like that.
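To illustrate why a big die can still make sense on a so-so process, here is a minimal sketch of the textbook Poisson yield model. The defect densities in the loop are hypothetical placeholders, not real Intel or TSMC figures, and the model only counts fully defect-free dies.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero killer defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect densities, for illustration only
for d0 in (0.1, 0.3, 0.5):
    print(f"D0={d0}/cm^2: 640 mm^2 die -> {poisson_yield(640, d0):.0%} defect-free")
```

Dies with a defect in a core can often still be sold with that core fused off, which is where the lower bins come from, so the sellable fraction is higher than this number suggests.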
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
According to this website, the largest die variant (XCC) is 640 mm2. It's got 42 cores in total, but with 2 disabled for now at the highest SKU.
Regardless, I agree that it is incredibly ambitious for 10nm+.


I disagree that the 600mm2 die size necessitates the move to chiplets. Their problem was that they had issues with 14nm and 10nm, and chiplets wouldn't have done much because they don't provide the power-related benefits that new processes bring. In fact Nvidia pushes it even further at 800mm2!

Chiplets aren't a panacea, but merely a recipe that could result in success.

Another issue is that their "Sunny Cove" cores are way too large. AMD achieves a better result at 2/3rd the size. If they used Zen 3 cores, they could have fit 55-60 cores in the same area those 40 Sunny Cove cores take up.

Also, the Cypress Cove cores in Rocketlake are at 14mm2 with the L2 cache. This is 60% larger than their Skylake core. It also means 10nm only offers a ~50% reduction (in line with historical trends for Intel large cores), because Sunny Cove is at 6.9mm2.

This is what I mean when I say optimization will afford much, much better results, and Tremont/Gracemont will be a fantastic example of that.

22nm Silvermont --> 14nm Airmont = 64% reduction(36% size)
Xe GPU = 2x the performance in the same area and power envelope
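The arithmetic behind the core-count and shrink figures above can be sketched as follows. The helper names are made up for illustration; the inputs are the numbers quoted in the post.

```python
def cores_in_same_area(cores: int, relative_core_size: float) -> float:
    """How many smaller cores fit in the area taken by `cores` larger ones."""
    return cores / relative_core_size

def area_reduction_pct(old_mm2: float, new_mm2: float) -> float:
    """Percentage by which the area shrinks going from old to new."""
    return (1 - new_mm2 / old_mm2) * 100

# 40 Sunny Cove cores replaced by cores at 2/3 the size
print(round(cores_in_same_area(40, 2 / 3)))      # 60
# Cypress Cove on 14nm (14mm2) vs Sunny Cove on 10nm (6.9mm2)
print(round(area_reduction_pct(14.0, 6.9), 1))   # 50.7
```

The 2/3 ratio gives the upper end of the 55-60 range.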
 

Saylick

Diamond Member
Sep 10, 2012
3,385
7,146
136
I disagree that the 600mm2 die size necessitates the move to chiplets. Their problem was that they had issues with 14nm and 10nm, and chiplets wouldn't have done much because they don't provide the power-related benefits that new processes bring. In fact Nvidia pushes it even further at 800mm2!

Chiplets aren't a panacea, but merely a recipe that could result in success.

Another issue is that their "Sunny Cove" cores are way too large. AMD achieves a better result at 2/3rd the size. If they used Zen 3 cores, they could have fit 55-60 cores in the same area those 40 Sunny Cove cores take up.

Also, the Cypress Cove cores in Rocketlake are at 14mm2 with the L2 cache. This is 60% larger than their Skylake core. It also means 10nm only offers a ~50% reduction (in line with historical trends for Intel large cores), because Sunny Cove is at 6.9mm2.

This is what I mean when I say optimization will afford much, much better results, and Tremont/Gracemont will be a fantastic example of that.

22nm Silvermont --> 14nm Airmont = 64% reduction(36% size)
Xe GPU = 2x the performance in the same area and power envelope
I don't know if it's fair to compare Sunny Cove cores directly with Zen 3 cores, at least on an area efficiency basis, because there's so much L3 cache that goes with the Zen core which fudges the comparison a bit. On Intel die shots, the cores take up significantly more real estate than the LLC, while on AMD die shots the core area is pretty much equal to the LLC area. With that said, I do believe that Intel's cores are larger than AMD's cores if you only include L1 and L2 cache, but not by much. Perhaps a better comparison would be transistors instead of area, so that at least the comparison can be node agnostic.




Anyways, enough AMD talk in an Intel thread...
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
With that said, I do believe that Intel's cores are larger than AMD's cores if you only include L1 and L2 cache, but not by much. Perhaps a better comparison would be transistors instead of area, so that at least the comparison can be node agnostic.

No it's actually larger by quite a bit.

I did a simple analysis back in post #9819. It's 3.2mm2 (Zen 3) vs 4.4mm2 (Sunny/Willow Cove). Intel's core is 37.5% larger, and arguably Zen 3 is the better core. Zen 2 is 2.87mm2, so for the 15-20% perf/clock gain Zen 3 is only 11.5% larger.
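A quick check of those percentages, reading 3.2mm2 as Zen 3 and 4.4mm2 as Sunny/Willow Cove (which is what the Zen 2 comparison implies). The names are only for illustration.

```python
def pct_larger(a_mm2: float, b_mm2: float) -> float:
    """How much larger a is than b, in percent."""
    return (a_mm2 / b_mm2 - 1) * 100

# Core areas in mm2 as quoted in the post
ZEN2, ZEN3, SUNNY_COVE = 2.87, 3.2, 4.4

print(round(pct_larger(SUNNY_COVE, ZEN3), 1))  # 37.5
print(round(pct_larger(ZEN3, ZEN2), 1))        # 11.5
# Zen 3 sized cores fitting in the area of 40 Sunny Cove cores
print(round(40 * SUNNY_COVE / ZEN3))           # 55
```

The last line is where the low end of the earlier 55-60 core estimate comes from.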


Tremont cores are 0.85mm2, while the AVX2/FP units alone are 0.62mm2 in Sunny Cove.

The AVX-512/FP block on the 14nm Knights Landing chip is 1.2mm2. Meaning if it only shrinks by half on 10nm, we get to 0.6mm2, which is the same size as the one in Sunny Cove, but with twice the width.

14nm Goldmont - 1.1mm2
14nm Goldmont Plus - ?? - 30% per clock gain over Goldmont
10nm Tremont - 0.85mm2 - 30% per clock gain over Goldmont Plus

So Tremont is 70% faster per clock while being 20% smaller than Goldmont. I bet the core size increase is >70%, meaning 10nm worked to achieve greater than 2x density, just not on Core chips. Since Tremont already has full SSE4 support with 128-bit width, you could get an AVX2 version of Tremont at just 1mm2. If we add the architectural additions needed to get the rumored increase in performance, we might get Skylake-level performance at <1.5mm2 with Gracemont.
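The compounding behind the "70% faster, 20% smaller" claim, as a small sketch using only the figures listed above. The last line is per-clock performance per unit area, which is not the same thing as the transistor density speculated about in the post.

```python
per_clock_gain = 1.30 * 1.30   # Goldmont -> Goldmont Plus -> Tremont, 30% each step
area_ratio = 0.85 / 1.1        # Tremont area relative to Goldmont

print(f"per-clock: {per_clock_gain:.2f}x")
print(f"area: {(1 - area_ratio) * 100:.1f}% smaller")
print(f"per-clock performance per mm2: {per_clock_gain / area_ratio:.2f}x")
```

So the rounded figures in the post are slightly conservative on area: 0.85mm2 vs 1.1mm2 is closer to 23% smaller.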
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
Alder Lake will play AMD's "large caches" card as well. If the leaks are correct, it will have 30MB of L3 (10 LLC slices of 3MB each) and 12.5MB of L2 (10x 1.25MB MLC, though this leaves the 4C Atom clusters with a nasty performance deficit). If true, I think this indicates that the Atoms are for power saving and not actually meant to process anything heavier.
So more cache per core than AMD has, this alone should improve performance quite a bit, unless they regress latencies even more.
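The cache totals from the leak add up as follows. This is only a sketch of the leaked configuration described above, and the constant names are made up for illustration.

```python
RING_STOPS = 10              # per the leak: 10 L3 slices and 10 L2 blocks
L3_SLICE_MB = 3.0
L2_BLOCK_MB = 1.25
ATOM_CORES_PER_CLUSTER = 4   # one 1.25MB L2 block shared by a 4C Atom cluster

total_l3 = RING_STOPS * L3_SLICE_MB
total_l2 = RING_STOPS * L2_BLOCK_MB
l2_per_atom_core = L2_BLOCK_MB / ATOM_CORES_PER_CLUSTER

print(total_l3, total_l2, l2_per_atom_core)  # 30.0 12.5 0.3125
```

The last number is the deficit mentioned above: a big core would get 1.25MB of L2 to itself, while each Atom core would get roughly a quarter of that.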
 

dullard

Elite Member
May 21, 2001
25,203
3,617
126

Gideon

Golden Member
Nov 27, 2007
1,709
3,927
136
I don't know if it's fair to compare Sunny Cove cores directly with Zen 3 cores, at least on an area efficiency basis, because there's so much L3 cache that goes with the Zen core which fudges the comparison a bit. On Intel die shots, the cores take up significantly more real estate than the LLC, while on AMD die shots the core area is pretty much equal to the LLC area. With that said, I do believe that Intel's cores are larger than AMD's cores if you only include L1 and L2 cache, but not by much. Perhaps a better comparison would be transistors instead of area, so that at least the comparison can be node agnostic.

I agree that the comparison is somewhat apples to oranges, but the nodes are still similar enough that it's far from meaningless.

From Dr. Ian Cutress himself:
So the difference in density is roughly 10% (in favor of TSMC's 7nm). In a reply he also mentions that for critical paths the density goes down to 35-60%, but the relative difference remains the same.

And here is a rough comparison of a 4-core "cluster" of each with L3 included:


Bottom line:
  1. Willow Cove is about 26% bigger even when you include all of the L3 from both
  2. Willow Cove is 55% bigger without L3
  3. Willow Cove is 32% bigger without L2 and L3
Overall there are no two ways about it: even when you subtract TSMC's claimed density advantage, Intel's core is noticeably bigger. I'm sure AVX-512 is a considerable portion of it, but still.

And the L3 on Intel's chip looks so small compared to the cores because the cores really are just that much larger. The L3 on Intel's part isn't that much smaller than Ryzen's L3 either (considering it's only 3/4 the size) once you factor in the ring bus logic, which is effectively included in Ryzen as well (the area between the L3 slices serves the same function).

Oh, and if the large L3 is such a dealbreaker for Vermeer, let's not forget that AMD also has Cezanne, which is smaller still (it has only half the L3) but still holds its own against Tiger Lake.
 

eek2121

Diamond Member
Aug 2, 2005
3,043
4,265
136
I agree that the comparison is somewhat apples to oranges, but the nodes are still similar enough that it's far from meaningless.

From Dr. Ian Cutress himself:
So the difference in density is roughly 10% (in favor of TSMC's 7nm). In a reply he also mentions that for critical paths the density goes down to 35-60%, but the relative difference remains the same.

And here is a rough comparison of a 4-core "cluster" of each with L3 included:

Bottom line:
  1. Willow Cove is about 26% bigger even when you include all of the L3 from both
  2. Willow Cove is 55% bigger without L3
  3. Willow Cove is 32% bigger without L2 and L3
Overall there are no two ways about it: even when you subtract TSMC's claimed density advantage, Intel's core is noticeably bigger. I'm sure AVX-512 is a considerable portion of it, but still.

And the L3 on Intel's chip looks so small compared to the cores because the cores really are just that much larger. The L3 on Intel's part isn't that much smaller than Ryzen's L3 either (considering it's only 3/4 the size) once you factor in the ring bus logic, which is effectively included in Ryzen as well (the area between the L3 slices serves the same function).

Oh, and if the large L3 is such a dealbreaker for Vermeer, let's not forget that AMD also has Cezanne, which is smaller still (it has only half the L3) but still holds its own against Tiger Lake.

8 core Tiger Lake is a pretty solid chip. I think it is a great try for Intel’s first real 10nm performance-oriented chip, and I will be comparing it and Cezanne very closely when I buy a new laptop later this year.
 
Reactions: lightmanek

gdansk

Platinum Member
Feb 8, 2011
2,488
3,377
136
8 core Tiger Lake is a pretty solid chip. I think it is a great try for Intel’s first real 10nm performance-oriented chip, and I will be comparing it and Cezanne very closely when I buy a new laptop later this year.
It does look good (compared to the 5900H, for example). But I wonder about that base clock rate. Geekbench doesn't take that long to run. When both CPUs are thermally limited it seems (on paper, anyway) the 5900H would have a large lead. I look forward to someone investigating this.
 
Reactions: lightmanek

uzzi38

Platinum Member
Oct 16, 2019
2,702
6,404
146


Meteor Lake M/P platform... no Meteor Lake-S listed there. MTL could be mobile exclusive, and even if there is a desktop version it will come later, unlike with ADL where we get desktop first.
Yeah, I think MTL-S coming later than -P/-M was expected for the most part. More important, though, is that the Intel Evo specification for Meteor Lake is listed for 1H 23. IMO that's probably when we can expect them (-P/-M) to launch.
 

JoeRambo

Golden Member
Jun 13, 2013
1,814
2,105
136
That's the same config as Tiger Lake so the latency shouldn't need to be changed at all.

L2 is the same, but L3 will have more stops on the ring (10 cache slices instead of 4), so latency will be somewhat worse. And there is obviously an unknown involved: hopefully the Atom L3 slices are not "special" and don't introduce extra latency penalties.
But 30MB of L3 is going to go a long way toward increasing performance.
 