Discussion Qualcomm Snapdragon Thread

Page 146

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
But even in the bottom graph, the results for each aren’t anywhere near max clocks. The X Elite can hit 4.3GHz and yet it’s only being measured at 3.8GHz for both int and fp.

The only reason I can think of is that it’s showing the peak point of IPC. For example, for FP the M4 above scores 15.58 at 3.87GHz for 4.03pts/GHz, but its max of 16.11 at 4.04GHz is 3.99pts/GHz.

But nonetheless it’s probably best to show the results at max clocks so that overall performance is shown. Or, like I said before, delineate between different datasets to show max IPC vs performance at max clocks.
The 3rd slide that mvprod posted is specifically for comparing IPC, so Geekerwan limited the clocks to about 4 GHz. The other slides are measuring at max frequency, I believe.
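As a quick sanity check on the points-per-GHz figures quoted above, here is a minimal Python sketch. The function name `points_per_ghz` is just for illustration, and the scores and clocks are the ones read off the chart in the quote, not independent measurements.

```python
def points_per_ghz(score: float, clock_ghz: float) -> float:
    """Score normalised by clock, used here as a rough stand-in for IPC."""
    return score / clock_ghz


# M4 FP figures as quoted above (read off the chart, so approximate).
m4_fp_points = [(15.58, 3.87), (16.11, 4.04)]

for score, clock in m4_fp_points:
    ratio = points_per_ghz(score, clock)
    print(f"{score:.2f} pts at {clock:.2f} GHz -> {ratio:.2f} pts/GHz")
```

The ratio is slightly higher at the lower clock, which fits the idea that a chart built for IPC comparison would not be sampled at peak frequency.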
 

jdubs03

Golden Member
Oct 1, 2013
1,079
746
136
The 3rd slide that mvprod posted is specifically for comparing IPC, so Geekerwan limited the clocks to about 4 GHz. The other slides are measuring at max frequency, I believe.
Yeah I guess that’s true (about the 4GHz comparison).
But the two charts above don’t really show that because like I noted in my first post about this, the X Elite only goes up to around 9 points in fp in the second slide whereas we know it can go higher than 13.58 in the bottom slide.

What would explain the disconnect there? Maybe he’s running a different compiler?
 

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
Yeah I guess that’s true (about the 4GHz comparison).
But the two charts above don’t really show that because like I noted in my first post about this, the X Elite only goes up to around 9 points in fp in the second slide whereas we know it can go higher than 13.58 in the bottom slide.

What would explain the disconnect there? Maybe he’s running a different compiler?
Well spotted. There seem to be some issues with the data/curves.
 
Reactions: jdubs03

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
Oryon V2 is rumoured to hit 5 GHz. Do you think it can be done without adding more pipeline stages?

Also Apple M5 will be around 5 GHz too, I think (M1 = 3.2, M2 = 3.5, M3 = 4.05, M4 = 4.5).
and then Qualcomm needs to make a low power design, which next gen X Elite isn’t going to be.
Mahua (X2 Elite?) is 6L + 6M CPU. Is that not low power?


I don’t think it will be Qualcomm that Intel should be worried about, although if that 5GHz core happens and it’s efficient, then it’s different.
I do think 5 GHz is probable for Snapdragon X2, though I find it unbelievable that 8 Gen 5 will also hit 5 GHz.

The question is can they do it without blowing up the power consumption to the moon, regressing cache latencies, etc...

PS:

SoC      | Node | Peak Clock | ST power
X Elite  | N4P  | 4.32 GHz   | 18 W
8 Gen 4  | N3E  | 4.3 GHz    | ???

It will be very interesting to see how much power 8G4 consumes for ST.

It will show what degree of improvements Qualcomm has made 1 year later to Oryon, and confirm/deny theories that there is some kind of defect/bug in X Elite silicon that's causing the high power consumption.
 

MarkizSchnitzel

Senior member
Nov 10, 2013
452
74
91
That's just a guy selling flagships.
Realistically, it's getting beyond ridiculous because of how little all that power is utilized by the average consumer.

SOCs get much more powerful, but also more power hungry, but batteries are more dense, so we get heavier devices which still have 1 day battery life.
All for the sake of bragging rights about a metric that nobody REALLY benefits from substantially in comparison to better battery life.
It's the same with using glass only and being obsessed with thinness and design, only to put devices in thick rubber condoms.
 

Raqia

Member
Nov 19, 2008
63
32
91
Mahua (X2 Elite?) is 6L + 6M CPU. Is that not low power?


I do think 5 GHz is probable for Snapdragon X2, though I find it unbelievable that 8 Gen 5 will also hit 5 GHz.

The question is can they do it without blowing up the power consumption to the moon, regressing cache latencies, etc...

PS:

SoC      | Node | Peak Clock | ST power
X Elite  | N4P  | 4.32 GHz   | 18 W
8 Gen 4  | N3E  | 4.3 GHz    | ???

It will be very interesting to see how much power 8G4 consumes for ST.

It will show what degree of improvements Qualcomm has made 1 year later to Oryon, and confirm/deny theories that there is some kind of defect/bug in X Elite silicon that's causing the high power consumption.
There is a rumored automotive SoC (SA8797) with 18 CPU cores, matching the rumored core count of the Glymur SoC. It has up to 800 GB/s memory bandwidth (16x 16-bit LPDDR controllers), 320 TOP/s of inference performance and 8.1 TFLOP/s GPU performance (even 1.4 TOP/s of INT8 on audio). It's possible that the PC oriented Glymur is based on this SoC with a different memory config. to spread NRE costs. (Not sure they can stretch their legs budget-wise at Q as much as Apple did with separate designs for A18/A18 Pro rather than die harvesting...)


There is also a rumored BIG automotive SoC (SA8799) with 32 CPU cores alluded to in the links above, which would be very interesting if repurposed as a desktop PC SoC. (Presumably auto SoCs for both infotainment, gauge clusters, and ADAS will require a GPU and the specs of other blocks will scale as well. Adreno 8 is rumored to have a slice architecture after all...)
 
Reactions: FlameTail

The Hardcard

Senior member
Oct 19, 2021
271
353
106
TSMC really is expensive. But also speaks about how QCOM has been increasing die size gen over gen.
It’s not just TSMC. No one can make cheap chips in $100B + foundries using EUV machines. Cost reduction as a part of node advances ended at about the 22 nm - 14 nm timeframe. Since the (virtual) 7 nm node, the costs are astronomical.

How people think budget laptops will stay below $700 and budget phones will stay below $400 is beyond me. A 2 nm laptop launching below $700 will require the involvement of nonprofits.
 

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
There is a rumored automotive SoC (SA8797) with 18 CPU cores, matching the rumored core count of the Glymur SoC. It has up to 800 GB/s memory bandwidth (16x 16-bit LPDDR controllers), 320 TOP/s of inference performance and 8.1 TFLOP/s GPU performance (even 1.4 TOP/s of INT8 on audio). It's possible that the PC oriented Glymur is based on this SoC with a different memory config.
That sounds beastly. I didn't know that automotive SoCs had to be so powerful. One question though: 16 × 16 is 256 bits... How do you get 800 GB/s with a 256-bit LPDDR controller? It's impossible even with the latest LPDDR5X-10667 standard.
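Back-of-envelope for why that doesn't add up, as a small Python sketch. It assumes the usual peak formula (bus width in bytes times transfer rate) and ignores real-world efficiency losses; the function name is made up for illustration.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mt_s * 1e6 / 1e9


# 16 controllers x 16 bits each, at the LPDDR5X-10667 rate mentioned above.
print(f"{peak_bandwidth_gb_s(16 * 16, 10667):.1f} GB/s")
```

That comes out at roughly 341 GB/s, well short of the rumored 800 GB/s.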
 

Raqia

Member
Nov 19, 2008
63
32
91
That sounds beastly. I didn't know that automotive SoCs had to be so powerful. One question though: 16 × 16 is 256 bits... How do you get 800 GB/s with a 256-bit LPDDR controller? It's impossible even with the latest LPDDR5X-10667 standard.
You're right, that detail sounds incorrect. It's more likely a 16 x (4 x 16) bit interface. Q already has products that likely have this memory controller config. in their latest AI 100 Ultra:




Details are lacking on the specific memory channel config. for that SKU but given the 4x memory capacity and 4x bandwidth vs. the AI 100 Pro:


It's likely they are using a 16 x 64 bit (each a stack of 4 x 16 bit or 2 x 32 bit channels per package?) config. for the AI 100 Ultra. I think it's not out of the question that the automotive part will be able to hit 800 GB/s on LPDDR5X.
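To put a number on that, here is a rough Python sketch of the transfer rate needed to reach 800 GB/s at a given bus width. It assumes ideal peak bandwidth with no overhead, and `required_data_rate_mt_s` is a hypothetical helper, not anything from a vendor tool.

```python
def required_data_rate_mt_s(target_gb_s: float, bus_width_bits: int) -> float:
    """Transfer rate (MT/s) needed to hit a target peak bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8
    return target_gb_s * 1e9 / bytes_per_transfer / 1e6


# Compare the 16 x 16-bit reading with the 16 x 64-bit guess.
for label, width in (("16 x 16-bit", 16 * 16), ("16 x 64-bit", 16 * 64)):
    rate = required_data_rate_mt_s(800, width)
    print(f"{label}: {width} bits needs about {rate:.0f} MT/s")
```

A 1024-bit interface would only need about 6250 MT/s, whereas a 256-bit one would need about 25000 MT/s, which is why the wider config looks like the more plausible reading.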
 
Reactions: FlameTail

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
@naukkis

Chips and Cheese suggests that the L2 cache in X Elite may not be a unified one, but a sliced one;
Qualcomm considers Oryon’s 12 MB L2 a very large capacity integrated cache. Latency is 15-20 cycles depending on which part of the cache data is in. That suggests the L2 is built from multiple slices, much like the L3 cache in Intel and AMD’s architectures.
 
Reactions: Gideon

Gideon

Golden Member
Nov 27, 2007
1,842
4,379
136
@naukkis

Chips and Cheese suggests that the L2 cache in X Elite may not be a unified one, but a sliced one;

I'd really like to see such sliced L2 on consumer (especially laptop) chips on the x86 side as well.

IMO such a "fast and large" L2 would play particularly well with large and high-bandwidth (but more relaxed latency) 3D-cache for L3 and/or SLC.
 

naukkis

Senior member
Jun 5, 2002
962
829
136
@naukkis

Chips and Cheese suggests that the L2 cache in X Elite may not be a unified one, but a sliced one;


Sliced L3 like in x86 CPUs means that every core has a slice of cache and an interconnect to the other slices. The L2s in X Elite, like those of Intel E-cores and AMD Dozers, aren't sliced like that; a unified L2 serves multiple cores. That unified L2 can still have slices with different access times to different cores, likely prioritizing the part of the cache closest to the CPU. That kind of unified cache can power gate the parts it doesn't need when executing lighter jobs, unlike an x86-style sliced cache where a working CPU needs to power up all cache slices (because memory addresses are interleaved between slices) and the interconnect network between them.
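A toy Python sketch of the interleaving point, to show why even a small working set keeps every slice awake. The 64-byte line size and the plain modulo mapping are assumptions for illustration only; real CPUs use more complex (and mostly undocumented) slice hash functions.

```python
LINE_BYTES = 64  # assumed cache line size


def slice_for_address(addr: int, num_slices: int) -> int:
    """Toy interleave: consecutive cache lines rotate across slices."""
    return (addr // LINE_BYTES) % num_slices


def slices_touched(start: int, length: int, num_slices: int) -> set[int]:
    """Set of slices hit when reading `length` bytes starting at `start`."""
    first_line = start // LINE_BYTES
    last_line = (start + length - 1) // LINE_BYTES
    return {
        slice_for_address(line * LINE_BYTES, num_slices)
        for line in range(first_line, last_line + 1)
    }


# A single 4 KiB buffer already spreads over all 8 slices in this model.
print(sorted(slices_touched(0x1000, 4096, 8)))
```

With this kind of mapping there is no subset of slices that a lightly loaded core could leave powered down, which is the contrast being drawn with a unified L2.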
 
Reactions: Hitman928

Raqia

Member
Nov 19, 2008
63
32
91
Sliced L3 like in x86 CPUs means that every core has a slice of cache and an interconnect to the other slices. The L2s in X Elite, like those of Intel E-cores and AMD Dozers, aren't sliced like that; a unified L2 serves multiple cores. That unified L2 can still have slices with different access times to different cores, likely prioritizing the part of the cache closest to the CPU. That kind of unified cache can power gate the parts it doesn't need when executing lighter jobs, unlike an x86-style sliced cache where a working CPU needs to power up all cache slices (because memory addresses are interleaved between slices) and the interconnect network between them.
Probably something in between is true here: the X-Elite SoC has 3x 4-CPU clusters which are each powered on in an all or nothing fashion. Each CPU can use the entire cluster's L2, but is best matched to a particular subset of that cluster's rather large L2.

Perhaps the L2s per cluster are more like IBM Telum's, with preferred physical slices for each CPU's L2, but also use a heuristic for each CPU core to acquire other CPUs' L2 slices to act as a virtual "L3". Latency for the X-Elite L2 is excellent and in the same league as the Telum II, though likely lower given the more intense power gating requirements for consumer SoCs that are not generally always run at full load 24/7. Given the puny 6MB higher level cache in the X-Elite which also addresses the needs of the entire rest of the SoC, this would make some sense.

The 1st gen X-Elite design allegedly used a rushed redesign of a server targeted CPU (also plagued by IP protection related redesigns related to ARM's lawsuit against Nuvia per Qualcomm's court filings). As such, it does not sport more granular power gating features like DVFS, E-cores or per core power gating which we may see starting with the 8G4.
 

gdansk

Diamond Member
Feb 8, 2011
3,276
5,186
136
I wonder how much actual profit that is? They were never going to justify their lofty valuation just by continuing to dominate smartphones, or even if Qualcomm could carve off a small slice of PCs. Maybe they think the auto market can.
Qualcomm has one of the lower PE ratios in the industry. 22 vs AMD's 205, for example.

But yeah I'm not sure how much growth that market will really have. The expectations seem overstated.
 

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
Linus is still using the Snapdragon laptop he reviewed.


Timestamp = 1:45

He says he loves the battery life.
 

FlameTail

Diamond Member
Dec 15, 2021
4,238
2,593
106
The 1st gen X-Elite design allegedly used a rushed redesign of a server targeted CPU (also plagued by IP protection related redesigns related to ARM's lawsuit against Nuvia per Qualcomm's court filings). As such, it does not sport more granular power gating features like DVFS, E-cores or per core power gating which we may see starting with the 8G4.
I wonder if it would be prudent for Qualcomm to do some kind of refresh of the X Elite, fixing those deficiencies and porting it to 3nm, for a release in 2025H1. X Elite 2nd gen is tipped for 2026H1, which is a long time away. Ming Chi Kuo suggested something like that:
The X Elite and X Plus chips, used for Windows on ARM (WOA), will reach about 2 million unit shipments in 2024, with expected year-on-year growth of at least 100–200% in 2025. The X Elite and X Plus will have modified versions in 2025, with a reduction in end product prices.
 
Reactions: ikjadoon

jdubs03

Golden Member
Oct 1, 2013
1,079
746
136
I wonder if it would be prudent for Qualcomm to do some kind of refresh of the X Elite, fixing those deficiencies and porting it to 3nm, for a release in 2025H1. X Elite 2nd gen is tipped for 2026H1, which is a long time away. Ming Chi Kuo suggested something like that:

That’s kind of what I was alluding to in the Mediatek thread. We can already tell that S8G4 looks better than the X Elite. A quick refresh to align the core characteristics at a minimum doesn’t seem out of the question. And Oryon v2 could perhaps be released earlier, as it was less prone to delay than v1.
 

Doug S

Platinum Member
Feb 8, 2020
2,888
4,912
136
Qualcomm has one of the lower PE ratios in the industry. 22 vs AMD's 205, for example.

But yeah I'm not sure how much growth that market will really have. The expectations seem overstated.

Sorry, must have had too much IPA earlier. I totally misread that tweet; for some reason I read it as ARM talking about automotive as a source of growth lol!

Yeah it won't make much of a difference in Qualcomm's overall financial picture, it would have a bigger potential impact on ARM if a lot more ARM designed cores (especially higher end ones with bigger licensing $$) end up in future autos.
 

Raqia

Member
Nov 19, 2008
63
32
91
I wonder if it would be prudent for Qualcomm to do some kind of refresh of the X Elite, fixing those deficiencies and porting it to 3nm, for a release in 2025H1. X Elite 2nd gen is tipped for 2026H1, which is a long time away. Ming Chi Kuo suggested something like that:

Die harvested 8G4's (e.g. ones w/ a busted modem block) could also be very potent laptop chips with better efficiency characteristics than the X-Elite/Plus at 3nm with a major design refresh, but it's difficult to see where they would slot them in with the released X-Elite and X-Plus laptop parts. An 8G4 is likely to have a stronger GPU, but a weaker CPU in multi-threaded situations than the laptop parts. Maybe budget gaming laptops and Steam deck competitors...
 