Discussion Intel current and future Lakes & Rapids thread


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Raichu says Granite Rapids is a 128-core part with 3 tiles of 48 cores each. If the need arises I can see them coming out with a halo chip with close to 140 cores, just like how Sapphire Rapids is now a 60-core product.
 

Saylick

Diamond Member
Sep 10, 2012
3,417
7,212
136
Raichu says Granite Rapids is a 128-core part with 3 tiles of 48 cores each. If the need arises I can see them coming out with a halo chip with close to 140 cores, just like how Sapphire Rapids is now a 60-core product.
Jeez, 48 cores in likely a 6x8 grid is still going to be a rather sizeable tile, like >300mm2 I reckon.
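A quick sanity check on that guess - a minimal sketch, where the per-core area, L3 slice, and overhead figures are my own assumptions for illustration, not published Intel numbers:

```python
# Back-of-the-envelope tile area for a 48-core (6x8) compute tile.
# Per-core and overhead figures are assumptions, not published numbers.
core_area_mm2 = 5.0    # core + private L2 (assumed)
l3_slice_mm2 = 1.5     # shared L3 slice per core (assumed)
overhead = 0.15        # mesh fabric, die-to-die PHYs, spares (assumed)

cores = 6 * 8
tile_mm2 = cores * (core_area_mm2 + l3_slice_mm2) * (1 + overhead)
print(f"~{tile_mm2:.0f} mm^2")  # ~359 mm^2, consistent with ">300mm2"
```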
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Jeez, 48 cores in likely a 6x8 grid is still going to be a rather sizeable tile, like >300mm2 I reckon.

If they disaggregate the I/O functions out to separate tiles like Meteor Lake, then it should be manageable.

Sapphire Rapids, with all the I/O and memory controllers mirrored in each tile, doesn't sound like the hyped future of chiplets - just an intermediate step.
 

BorisTheBlade82

Senior member
May 1, 2020
669
1,022
136
@IntelUser2000
Exactly - and this is what has been reported for Granite Rapids. Although 3 tiles seems strange - I wonder what kind of arrangement that would be. Maybe like an "E" where the vertical line is the IOD and the horizontal lines are the tiles.
 

A///

Diamond Member
Feb 24, 2017
4,352
3,155
136
@IntelUser2000
Exactly - and this is what has been reported for Granite Rapids. Although 3 tiles seems strange - I wonder what kind of arrangement that would be. Maybe like an "E" where the vertical line is the IOD and the horizontal lines are the tiles.
Offer Intel to shoot it with your Deagle and it'll smash into a dozen pieces. You'll have done their work for them.
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
@IntelUser2000
Exactly - and this is what has been reported for Granite Rapids. Although 3 tiles seems strange - I wonder what kind of arrangement that would be. Maybe like an "E" where the vertical line is the IOD and the horizontal lines are the tiles.
I think Intel's basically shown us. If you combine this older image:

With this:


Seems to be a pretty straightforward arrangement. They have two IO dies as end caps, and a variable number of compute dies in the middle. Though assuming the grey tiles represent memory the same way Intel's shown for SPR:

...then that means they're putting the memory controllers on the compute tiles. Would certainly be an interesting choice.
 

ashFTW

Senior member
Sep 21, 2020
312
235
96
It could also be 4 tiles in a square (a la the Falcon Shores-type design), with 3 CPU tiles and 1 accelerator tile, surrounded by I/O and memory tiles. From the slide below, it seems Falcon Shores comes next after Emerald Rapids.


What would you call an x86-only Falcon Shores? I have called it both Granite Rapids and Sierra Forest in the past, because it would make sense to have a common CPU/GPU/XPU platform. It seems obvious to me that that's where Intel is headed.


I believe the Falcon Shores common platform will be around for a while, maybe even a decade. It will be where optical integration happens over time as well. Granite Rapids and Sierra Forest will be the start of this platform. Future CPU/GPU/XPU products will be built on revisions of the platform. The challenge for Intel is to isolate the XPU chiplets from the memory and I/O in such a way that the chiplets can be mixed and matched with evolving standards. The chiplet interfaces have to be standardized and use scalable, high-performance, energy-efficient interconnects.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
I believe the Falcon Shores common platform will be around for a while, maybe even a decade. It will be where optical integration happens over time as well. Granite Rapids and Sierra Forest will be the start of this platform.

Okay, but in the real world if you make things common, you either have to compromise on both platforms, or one more than the other.

Falcon Shores would be for maximizing compute performance; the tradeoff is memory capacity and I/O connectivity. Granite Rapids is a general-purpose server platform. Enterprise and cloud, for example, need expansive I/O and memory capabilities, and the former especially can sacrifice a bit of memory performance for extra RAS and capacity features. That's a waste on Falcon Shores. You don't care about having 6TB of memory capacity, nor the ability to connect a dozen different drives and expansion slots.

I'd even say that unifying the E5 and E7 chips from Skylake-SP onwards was a mistake: because the E7 was aimed primarily at Enterprise, with extra-long validation times and reliability requirements, you sacrifice what was previously the E5 as well.

The E5 was very efficient at serving the rest of the market, while Enterprise customers were perfectly served by the E7, a platform whose spirit goes back to the Xeon MP days!

HPC (which is an ideal fit for Falcon Shores) also goes for low socket counts (even single sockets sometimes), contrary to other datacenter segments that can use 2, 4, 8 or more sockets.

It would be even less suited to the markets Sierra Forest will serve. Cloud doesn't care about memory bandwidth, and that product will likely be made cheaper and lower power.

For all the hoopla about HBM Sapphire Rapids and even the -X versions of EPYC, they serve relatively niche markets, something I doubt Falcon Shores will change drastically. There's no advantage to having a common socket between them (all 3 actually!) beyond the "oh, that's cool" factor.

You are just wasting the extra pins and the features that'll be useless in certain markets. It's just common sense: it increases complexity and cost for novelty's sake, when instead you could tailor-optimize for the target segment. This is essentially Xeon Phi revived. It'll probably do well in capturing Top500 wins and some future exaflop-scale systems.
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
...then that means they're putting the memory controllers on the compute tiles. Would certainly be an interesting choice.

Essentially they are jumping from Ivy Bridge-EP straight to the Tick Broadwell-EP. That's why the core is a relatively small gain and separate from Lion Cove in client. The server and client cores do seem to be diverging more and more. In Skylake-SP it was just extra L2 and an AVX-512 unit tacked on. With Sapphire Rapids it adds AMX and accelerators.

After looking at how many accelerators Sapphire Rapids has added, I am starting to feel it wasn't just the firing of the validation team that was the problem. Maybe the issue was that they tried to be too radical, too ambitious. Even after the many-year delay, it is still a very impressive achievement.

@Saylick After thinking about it a bit, >300mm2 is not that big. Sapphire Rapids has FOUR tiles and each of them is a little over 400mm2.

Chiplets/tiles/process: all they do is increase the amount of performance and features added. They rarely ever use them to save on costs. Rather than thinking "we'll split a 700mm2 die into 10x80mm2 dies (assuming some overhead)", they use it to get 4x400mm2, for example.

I am pretty sure the dream of some of the design teams is to get multiple reticle-limited dies in one package.
 
Last edited:

ashFTW

Senior member
Sep 21, 2020
312
235
96
Okay, but in the real world if you make things common, you either have to compromise on both platforms, or one more than the other.

Falcon Shores would be for maximizing compute performance; the tradeoff is memory capacity and I/O connectivity. Granite Rapids is a general-purpose server platform. Enterprise and cloud, for example, need expansive I/O and memory capabilities, and the former especially can sacrifice a bit of memory performance for extra RAS and capacity features. That's a waste on Falcon Shores. You don't care about having 6TB of memory capacity, nor the ability to connect a dozen different drives and expansion slots.
The Aurora blades, with 2 CPUs and 6 GPUs, currently support 6TB of RAM. Why would that not be needed on a more unified platform that aims to increase performance per watt by 1000x in the next 5 years or so? Such blades in the future might be composed of 4-6 XPUs, or 2 XPUs and 4 GPUs, etc.
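For scale, that 1000x-in-5-years target implies roughly a 4x improvement every year - simple compounding, my own arithmetic:

```python
# Compound annual improvement implied by "1000x perf/watt in 5 years".
target, years = 1000, 5
annual = target ** (1 / years)
print(f"~{annual:.2f}x per year")  # ~3.98x/yr, far above historical norms
```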

The chiplets need to be reusable across designs. Do you think it makes sense to have a separate x86 chiplet on a general-purpose CPU-only compute chip, and another one on the XPU? They need to be composed out of the same chiplets. The I/O (PCIe 5/6, UPI/Xe-Link, etc.) and memory (HBM or not, DDR5/DDR6, number of memory channels, etc.) chiplets can be different, and that could result in different sockets. But there needs to be a fundamental architecture in place that promotes reuse and composability. They have way too many delays as is. The products have to be built with a minimum number of chiplets and minimum validation, and yet meet the requirements of the various product segments. Basic engineering!

And I ask again, what do you think the x86-only chips shown by Intel in the Falcon Shores slide are? The slide even says "Next Gen Flexible Architecture", "Compute density in x86 socket", "Memory Capacity & Bandwidth". How much more explicit does it need to be to sink in??
 
Last edited:

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
The Aurora blades, with 2 CPUs and 6 GPUs, currently support 6TB of RAM. Why would that not be needed on a more unified platform that aims to increase performance per watt by 1000x in the next 5 years or so? Such blades in the future might be composed of 4-6 XPUs, or 2 XPUs and 4 GPUs, etc.

Again, Aurora is a blip on the radar in volume terms. Same deal with Xeon Phi. You don't sacrifice your bread-and-butter product to serve niche markets. Look at Xeon Phi and Cooper Lake-AP systems: low socket, I/O, and memory counts. Some even skip DIMM slots altogether! Yes, certain areas can use the extra memory, but you can use a regular server paired with conventional GPUs. You get the flexibility of pairing almost any CPU:GPU ratio, which is impossible with Falcon Shores.

-Identical Sockets: Unnecessary cost due to large packaging and pin count
-Extra I/O: Unnecessary for most markets Falcon Shores will serve
-Sierra Forest: Many features in both aren't needed at all for high-core-count, efficient, cheap cloud. Motherboards will be far simpler.

Basic engineering!

I don't get what you mean by that. Engineering needs to take into account the market it serves, not the other way around. Making one platform socket-compatible across 3 vastly different market segments will unnecessarily complicate things and add cost, which isn't basic engineering.

How much more explicit does it need to be to sink in??

I think you are putting too much faith in what's basically technical marketing. Plus, right now these are niche, niche markets. Actually, for Falcon Shores the market right now is zero.
 
Last edited:

ashFTW

Senior member
Sep 21, 2020
312
235
96
You get the flexibility of pairing almost any CPU:GPU ratio, which is impossible with Falcon Shores.
That is a ridiculous statement. You need to look at the Falcon Shores slide and understand the various flexible configurations: all CPU, all GPU, or 1-3 CPU tiles with the rest GPU. They could also be packaged with additional chiplets instead, like the Extreme Bandwidth Memory that Intel has briefly talked about. Today's flexibility of all-CPU and all-GPU is a proper subset of what Falcon Shores offers.
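To make the subset point concrete, here's a toy enumeration, assuming a four-slot package as the slide suggests; the slot count and tile names are my assumptions, not Intel's:

```python
# Enumerate CPU/GPU mixes for a hypothetical four-slot Falcon Shores package.
from itertools import combinations_with_replacement

SLOTS = 4
for combo in combinations_with_replacement(("CPU", "GPU"), SLOTS):
    cpus = combo.count("CPU")
    print(f"{cpus} CPU + {SLOTS - cpus} GPU tiles")
# Prints all five mixes: 4+0 (the "x86-only" case), 3+1, 2+2, 1+3, 0+4.
# Today's all-CPU and all-GPU products are just the two endpoints.
```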

-Identical Sockets: Unnecessary cost due to large packaging and pin count
-Extra I/O: Unnecessary for most markets Falcon Shores will serve
-Sierra Forest: Many features in both aren't needed at all for high-core-count, efficient, cheap cloud. Motherboards will be far simpler.
Intel has already stated that Sierra Forest and Granite Rapids will share the same platform (I believe Birch Stream). There may be standard and AP packages. Intel has also stated that Falcon Shores will share an x86 socket.

I don't get what you mean by that. Engineering needs to take into account the market it serves, not the other way around. Making one platform socket-compatible across 3 vastly different market segments will unnecessarily complicate things and add cost, which isn't basic engineering.
Please reread what I said above. I'll paraphrase: "The products have to be built with a minimum number of chiplets and minimum engineering effort, and yet meet the requirements of the various product segments." This requires smart designs that absolutely decrease cost and shorten time to market, not only by building and validating fewer parts, but also by decoupling development of the different parts of the system. This is what I have called parsimonious design in the past; AMD did (and is still doing) a great job of this because they were ultra financially constrained and had no other choice. Intel, on the other hand, was just throwing money around by working in product-segment silos. And even within these silos they build way too many non-reusable chips to address the market. Just look at Sapphire Rapids…

I think you are putting too much faith in what's basically technical marketing. Plus, right now these are niche, niche markets. Actually, for Falcon Shores the market right now is zero.
Another ridiculous statement. It's the same GPU/AI/ML compute market that everyone is fighting for, with NVIDIA currently the dominant player at well over $11B in revenue last year and growing at a rapid pace. The total TAM in the next few years is conservatively upwards of $100B. NVIDIA (with Grace), AMD (with MI300) and Intel (with Falcon Shores) are all moving towards bringing CPU and GPU closer together and smashing reticle limits with chiplets to make more powerful and efficient designs. I assume you know that a lot of power is lost just moving data around; that power could be better used to do the actual work.
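On the data-movement point, rough order-of-magnitude energy-per-bit figures (assumed, literature-style values, not measurements) show why pulling memory onto the package matters:

```python
# Energy cost of moving 1 TB of data over different paths.
# pJ/bit values are rough assumptions for illustration, not measured numbers.
pj_per_bit = {
    "on-die SRAM":      0.1,
    "on-package HBM":   3.5,
    "off-package DDR": 20.0,
}

terabyte_bits = 1e12 * 8
for path, pj in pj_per_bit.items():
    joules = terabyte_bits * pj * 1e-12
    print(f"{path:>15}: {joules:6.1f} J/TB")
# Off-package DDR costs roughly 5-10x more energy per bit than on-package HBM.
```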
 
Last edited:

Timmah!

Golden Member
Jul 24, 2010
1,468
743
136
So all of the researchers, scholars, and all of the documentation (even from Intel) that say Knights Landing had HBM/MCDRAM are wrong? Doing the simplest Google search will attest to this.

mcdram? made by mcdonalds? :-D

MLID is already "leaking" a 44C Granite Rapids HEDT/WS part, when the 34C one is yet to be officially revealed. It's tiring.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
mcdram? made by mcdonalds? :-D




MLID is already "leaking" a 44C Granite Rapids HEDT/WS part, when the 34C one is yet to be officially revealed. It's tiring.

That huge 34C monolithic die might not see the light of day. For starters, the smaller tiles in SPR-SP are reportedly yielding around 50% (so one can expect even lower on a bigger die). And due to market segmentation, the monolithic dies meant for workstations may not have the full set of accelerators, so in general tasks they will be at a disadvantage vs the Genoa-based TR PRO.
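Taking that 50% figure at face value, a simple Poisson defect model shows how fast yield falls off with die size. The model and the ~770mm2 monolithic-die area are my assumptions, for illustration only:

```python
# Back out defect density from a ~400 mm^2 tile yielding ~50% (Poisson model:
# yield = exp(-area * D0)), then project a larger monolithic die.
import math

tile_cm2 = 4.0                     # ~400 mm^2 SPR tile
d0 = -math.log(0.50) / tile_cm2    # implied ~0.17 defects/cm^2

mono_cm2 = 7.7                     # ~770 mm^2 monolithic 34C die (assumed)
mono_yield = math.exp(-mono_cm2 * d0)
print(f"D0 ~ {d0:.2f}/cm^2 -> monolithic yield ~ {mono_yield:.0%}")  # ~26%
```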
 
Last edited:
Reactions: Tlh97 and Timmah!

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
I hope these "researchers" are no longer working there. What's with the really bad use of rainbow colors in that document? Hurts the eyes!
It looks like the document might have been converted from another format and something broke along the way. Some other artifacts throughout.
 

nicalandia

Diamond Member
Jan 10, 2019
3,331
5,282
136
What's with people doubting past researchers when they mention High Bandwidth Memory... Xeon Max is not the first and only x86 CPU with High Bandwidth Memory.


Sure, it's the first x86 CPU to use on-package HBM2e, but Xeon Phi is the very first of its kind to use non-trademark High Bandwidth Memory.
 
Last edited:
Jul 27, 2020
18,247
11,972
116
Sure, it's the first x86 CPU to use on-package HBM2e, but Xeon Phi is the very first of its kind to use non-trademark High Bandwidth Memory.
You know Intel. When a product isn't hugely successful, they pretend it never existed.

Pretty soon they will be doing that with Optane.

Pat: Optane? What's that? Never heard of it.

I remember my company's managing director doing that with a once-favored employee who left the company for greener pastures. In a discussion, I told him that the ex-employee used to do something. He was like, 'employee's name'? Who's that???
 

Exist50

Platinum Member
Aug 18, 2016
2,452
3,102
136
What's with people doubting past researchers when they mention High Bandwidth Memory...
Because of the distinction I mentioned above, about "high bandwidth memory" being both the name of a category and of a specific implementation. And beyond that, I really don't think anyone cares about that technicality either way...
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Another ridiculous statement. It's the same GPU/AI/ML compute market that everyone is fighting for, with NVIDIA currently the dominant player at well over $11B in revenue last year and growing at a rapid pace.

GPUs address this. Right now it's zero.
 

ashFTW

Senior member
Sep 21, 2020
312
235
96
GPUs address this. Right now it's zero.
With that thinking you wouldn't do anything new… because… wait for it… "right now it's zero" 🤣 I'm done discussing this because it's no use talking to someone who has no vision or understanding of the competitive landscape.

I guess you're the "where the puck has been" kinda guy, not the "where the puck is going" one, especially when it takes 3+ years to build a complex chip. Good luck with that!! 😝
 
Last edited:

DrMrLordX

Lifer
Apr 27, 2000
21,842
11,199
136
GPUs address this. Right now it's zero.

To be fair, if you look at NV's future ambitions, they want to sell you enterprise dGPUs in a package with their own custom ARM server CPUs so that they will no longer be platform-dependent on Intel and AMD.
 
