Discussion Qualcomm Snapdragon Thread


soresu

Diamond Member
Dec 19, 2014
3,489
2,782
136
Was reading this blogpost:


Qualcomm plans to support Mesh Nodes and Work Graphs in a future Adreno architecture.
TL;DR: Mesh nodes are an extension of both Work Graph and Mesh Shader functionality.

It's barely five and a half months old, so don't expect to see it in any big game engine for quite some time.

Even work graphs are still very recent.
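
As a rough mental model in Python-style pseudocode (all names here are made up; this is not the actual D3D12 API): a work graph lets one GPU node spawn records for downstream nodes without a CPU round trip, and a mesh node lets those records terminate in mesh-shader rasterization instead of being written to a buffer for a later ExecuteIndirect.

```python
# Conceptual sketch only (hypothetical names, not the real D3D12 API):
# a work graph node culls scene objects on-GPU and feeds survivors
# straight to a mesh node, which ends in rasterization rather than a
# buffer + CPU-driven ExecuteIndirect round trip.

def classify_node(objects, emit):
    """Root compute node: cull, then emit records to a downstream node."""
    for obj in objects:
        if obj["visible"]:
            emit("draw_meshlets", {"mesh": obj["mesh"], "lod": obj["lod"]})

def draw_meshlets_node(record):
    """Mesh node: the record terminates in rasterized triangles."""
    print(f"rasterizing {record['mesh']} at LOD {record['lod']}")

NODES = {"draw_meshlets": draw_meshlets_node}

def dispatch_graph(objects):
    pending = []
    classify_node(objects, lambda node, rec: pending.append((node, rec)))
    for node, rec in pending:  # on real hardware this scheduling is on-GPU
        NODES[node](rec)

dispatch_graph([{"visible": True, "mesh": "rock", "lod": 2},
                {"visible": False, "mesh": "tree", "lod": 0}])
```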
 

soresu

Diamond Member
Dec 19, 2014
3,489
2,782
136
How long will this trial take? Will it be done in a few days or drag out over years?
QC will settle it sooner rather than later - they can't adopt the newer v9-A ISA/features until they have put this to bed, so if v9 is already on their roadmap they have no choice in the matter.

Now it's just a matter of negotiating a deal.
 

soresu

Diamond Member
Dec 19, 2014
3,489
2,782
136
Over the past 4 years, flagship smartphone GPU performance has increased by 500%
View attachment 113279
Completely nuts!

Image from this r/hardware post
Process tech advancement has a lot to do with that.

Finally cracking EUV lithography was a huge win for semiconductor manufacturing, much like how the long-delayed PCIe v4 was quickly followed by v5, v6, and now v7, which is well into draft development.

Notice how much more modest the increase in memory bandwidth has been in that same timeframe though, and memory density is no better.

I feel like once they finally get 3D DRAM into production under 10nm we will get many years of density increases as they leverage the last decade of experience working with 3D NAND flash memories.
 
Reactions: FlameTail

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
Can the NPU be used for DLSS-like upscaling?

Or should tensor cores necessarily be inside the GPU itself to do it?
Well, now we know it can.


But it's not the best solution.

Qualcomm will eventually have to add Tensor cores to their Adreno GPUs to keep up with the competition. Neural graphics are the future.

A thread by Sebastian Aaltonen:

"This is the reason why you want tensor units inside GPU compute units instead of NPU for graphics processing.

Upscaling runs in the middle of the frame, so you'd have a GPU->NPU->GPU dependency, which would create a bubble. Also, the NPU would be unused for most of the frame duration.
NPU would need to:
1. Support low latency fence sync with GPU
2. Have shared memory and shared LLC with GPU
3. Be able to read+write GPU swizzled texture layouts
4. Be able to read+write DCC compressed GPU textures

And you still have the above-mentioned problem: if your GPU is waiting for the NPU in the middle of the frame, you might as well put the tensor units in the GPU and get 1, 2, 3, and 4 all for free. That's why you put tensor units inside GPU compute units. It's simpler and it reuses all the GPU units and register files.
Sony or Nvidia (GPU tensor) vs Microsoft/Qualcomm AutoSR (NPU) comparison:

1. The NPU runs after the GPU has finished the frame, not in the middle of it. It upscales the UI too, which is not nice.
2. It adds one frame of latency; the NPU processes a frame in parallel while the GPU runs frame+1.
TV sets also use NPUs for upscaling, as the added latency is not an issue there. GPU tensor cores are better when latency matters (gaming). Also, games want to compose low-res 3D content + native UI; an NPU is not good for this kind of workload."
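
The bubble is easy to see with a toy frame-time model (every number below is invented purely for illustration):

```python
# Toy frame-time model of the bubble (every number is invented).
pre_ms, upscale_ms, post_ms = 8.0, 2.0, 3.0   # GPU work around the upscale
sync_ms = 0.5                                  # assumed GPU<->NPU fence cost

# In-GPU tensor units: one queue, no cross-device sync.
gpu_tensor = pre_ms + upscale_ms + post_ms

# Mid-frame NPU upscale: the GPU waits out a bubble on each handoff,
# and the NPU sits idle for the rest of the frame.
npu_midframe = pre_ms + sync_ms + upscale_ms + sync_ms + post_ms
npu_busy = upscale_ms / npu_midframe

print(f"GPU tensor:    {gpu_tensor:.1f} ms/frame")
print(f"NPU mid-frame: {npu_midframe:.1f} ms/frame, NPU busy {npu_busy:.0%}")
```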
 

hemedans

Senior member
Jan 31, 2015
254
143
116
Process tech advancement has a lot to do with that.

Finally cracking EUV lithography was a huge win for semiconductor manufacturing, much like how the long-delayed PCIe v4 was quickly followed by v5, v6, and now v7, which is well into draft development.

Notice how much more modest the increase in memory bandwidth has been in that same timeframe though, and memory density is no better.

I feel like once they finally get 3D DRAM into production under 10nm we will get many years of density increases as they leverage the last decade of experience working with 3D NAND flash memories.
Power consumption too: in 2020 the SD 865's GPU used less than 5 W, but now power consumption has tripled.
 

eek2121

Diamond Member
Aug 2, 2005
3,202
4,635
136
@FlameTail I saw you called out my posts earlier. Between life with kids, a wife, disabilities, holidays, and more, it took me a while, but I want to set the record straight. I considered messaging you privately, but I think my post may help others. Mods, if this violates the rules, please remove it. I tried to avoid callouts; this is meant to reach out to @FlameTail while helping others, not to call anyone out or insult anyone.

From what I can tell, you are much younger than I and many others here. I'm in my 40s, and many others are the same age or older. I grew up with computers. I've been programming since I was old enough to read. I learned BASIC and ASM (the only two languages available early in my life; ask my elders what they dealt with 🤣). I've worked extensively with both hardware and software (I never designed silicon, just boards and systems, but I've written emulators), and I've watched a gazillion different CPU architectures fight to the death, ARM included. ARM has made several attempts to get into the PC market, and they've failed; Qualcomm is only the latest attempt. When we older folks say stuff you view as negative, it isn't actually negative, but rather objective. I'm not against ARM overtaking Intel. That is fine. I'm against improper hype of something that has been tried dozens of times before, especially by a company like QUALCOMM. I actually own more ARM devices than x86 devices, and that EXCLUDES mobile, unlike a lot of people's counts. But when you watch the same scenario play out over and over, yeah, it is going to take a bit more than words and hype to impress.

Respectfully, my advice is to stop downplaying Intel and AMD. Doing so isn't objective. Compare the merits, including compatibility, and acknowledge the strengths/weaknesses of both platforms. Also understand that some of us have some very real industry experience.

Oh, an apt comparison would be someone born between, say, '97-'98 and now challenging someone who actually saw 9/11 on the accuracy of events they personally witnessed; we old farts aren't going to go prancing around digging up proof for someone much younger than us. We just don't have the patience.

I've never stated ARM is bad, just that it is being over-hyped and over-compared to x86. All platforms have their strengths/weaknesses. Raw power is only one of several. So is efficiency.

...oh, and yes, we all can be dismissive. We've seen the claims a gazillion times and have also seen Microsoft fumble everything, including Xbox, in the past few years. I will apologize for all of us; it is age and impatience after the 100th time someone comes along hyping. Oh, and Apple has always been there, even in the 90s on Motorola 68k or PowerPC, just threatening to dominate. Don't get me started on IBM. Yet here we are today.

That is all. Sorry, I wasn't trying to rant or be mean. When you become an old fart, being mean usually takes effort unless you were born that way, and I wasn't.

Cheers.

P.S. I do have a favorite: It is RISC-V. 😎
 

Magio

Member
May 13, 2024
108
113
76
So, with the court case (for now) falling Qualcomm's way, I expect they'll have the confidence to expand their use of Oryon even more starting now.

But one thing I'm curious about: the court case seems to affirm that not only is Qualcomm's ARM v8 license still valid, but their ARM v9 license is equally safe for the foreseeable future. So does anyone know if Oryon v3 (i.e. 8 Elite Gen 2 and X Elite Gen 2) is expected to leverage ARM v9, or will they stick with v8?
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
So, with the court case (for now) falling Qualcomm's way, I expect they'll have the confidence to expand their use of Oryon even more starting now.
For this generation, it seems Oryon is limited to the flagship Snapdragon 8 mobile SoC. All mobile SoCs below that tier are rumoured to still stick with ARM Cortex cores.
But one thing I'm curious about: the court case seems to affirm that not only is Qualcomm's ARM v8 license still valid, but their ARM v9 license is equally safe for the foreseeable future.
I don't think there are two separate ALAs, one for ARMv8 and one for ARMv9.

It's a single ALA with provisions for both ARMv8 and ARMv9.
So does anyone know if Oryon v3 (i.e. 8 Elite Gen 2 and X Elite Gen 2) is expected to leverage ARM v9, or will they stick with v8?
Rumours said Oryon Gen 3 will be ARMv9 and have SME.

Court documents confirm that Qualcomm is working on an ARMv9 core.
 

Magio

Member
May 13, 2024
108
113
76
For this generation, it seems Oryon is limited to the flagship Snapdragon 8 mobile SoC. All mobile SoCs below that tier are rumoured to still stick with ARM Cortex cores.

Yeah, for this generation that's expected, but if no further legal issues arise I could see them expanding the use of Oryon to more tiers next year (if they have such designs prepped just in case, at least).

Rumours said Oryon Gen 3 will be ARMv9 and have SME.
View attachment 113580

N3P + latest ARM ISA + GPU improvements + another set of Oryon improvements (I assume). If everything pans out, X Elite/8 Elite Gen 2 could stun the industry.
 

jdubs03

Golden Member
Oct 1, 2013
1,155
799
136
A clock frequency of 5 GHz alone would bump the GB6 score up to ~3700. SME will help push it towards 4000. Multicore will be impressive for sure. For single core, my guess is it'll be a little behind to on par with the A19 in overall performance, but still considerably behind in IPC. This is assuming Apple goes for its normal 15% bump. I suspect the E cores will be improved upon; not sure about the P cores, other than clocks.
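
The back-of-envelope behind that 3700, for what it's worth (the baseline score and clock are my assumptions, roughly the 8 Elite's numbers):

```python
# Naive clock scaling (assumed baseline: 8 Elite ~3200 GB6 single-core
# at a ~4.32 GHz prime-core clock).
base_score, base_clock, target_clock = 3200, 4.32, 5.0
print(round(base_score * target_clock / base_clock))  # ~3704, before SME
```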
 

Raqia

Member
Nov 19, 2008
78
46
91
A clock frequency of 5 GHz alone would bump the GB6 score up to ~3700. SME will help push it towards 4000. Multicore will be impressive for sure. For single core, my guess is it'll be a little behind to on par with the A19 in overall performance, but still considerably behind in IPC. This is assuming Apple goes for its normal 15% bump. I suspect the E cores will be improved upon; not sure about the P cores, other than clocks.
>Behind on IPC

You might equally well say the Apple A19 will be behind on clock... The overall PPA is excellent and the high clocks are a part of the design of this line of CPUs.
 

jdubs03

Golden Member
Oct 1, 2013
1,155
799
136
>Behind on IPC

You might equally well say the Apple A19 will be behind on clock... The overall PPA is excellent and the high clocks are a part of the design of this line of CPUs.
Definitely room for improvement. It'll be closer for sure, barring a big upgrade on the Apple side.
 

Meteor Late

Senior member
Dec 15, 2023
266
291
96
A clock frequency of 5 GHz alone would bump the GB6 score up to ~3700. SME will help push it towards 4000. Multicore will be impressive for sure. For single core, my guess is it'll be a little behind to on par with the A19 in overall performance, but still considerably behind in IPC. This is assuming Apple goes for its normal 15% bump. I suspect the E cores will be improved upon; not sure about the P cores, other than clocks.

It doesn't say it will happen on mobile. If the core used is almost exactly the same core as in X Elite v2, then most likely the 5 GHz target is for the laptop part.
 

jdubs03

Golden Member
Oct 1, 2013
1,155
799
136
It doesn't say it will happen on mobile. If the core used is almost exactly the same core as in X Elite v2, then most likely the 5 GHz target is for the laptop part.
Well, I wasn't the one who brought up 5 GHz; I just took that and applied the calculation.
But applying the N3E -> N3P increase in clock/performance, an additional ~5%, would take the next 8 Elite to 4.5-4.7 GHz. That seems more reasonable, I'd agree; 5 GHz does seem more appropriate for laptops.
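
Quick check on that (assuming the 8 Elite's ~4.32 GHz prime clock on N3E and the quoted ~5% node speed gain, both of which are my assumptions):

```python
# ~5% N3E -> N3P speed gain applied to an assumed 4.32 GHz baseline.
print(f"{4.32 * 1.05:.2f} GHz")  # ~4.54 GHz, the low end of 4.5-4.7
```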
 
Last edited:

Magio

Member
May 13, 2024
108
113
76
>Behind on IPC

You might equally well say the Apple A19 will be behind on clock... The overall PPA is excellent and the high clocks are a part of the design of this line of CPUs.

To be fair it makes sense to pick out the "flaws" of the design that is currently trailing. Oryon, especially in 8 Elite, is impressive but in single core performance and efficiency it's still trailing Apple's P-core.

There's also always the question of how high you can push clocks before it sends your power consumption skyrocketing and it's unclear if they can go past 5GHz without losing the efficiency.

Oryon's exceptional perf per area is its greatest strength IMO as it gives the Nuvia team room to improve its IPC by making the L core a bit bigger while the M core stays in its sweet spot, meaning they don't *have* to become entirely reliant on clock speed to get generational improvements. But as always with chip design, theory is nice but it's the ability to execute which makes the difference.
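
On the clock question, a crude model shows why the last few hundred MHz get so expensive (assuming dynamic power ~ C·V²·f and that voltage scales roughly linearly with frequency near the top of the curve; the numbers are illustrative only):

```python
# Crude clocks-vs-power model: with V scaling linearly with f near the
# top of the V/f curve, dynamic power grows roughly as f^3.
def rel_power(f_ghz, base_f=4.32):
    return (f_ghz / base_f) ** 3

for f in (4.32, 4.6, 5.0, 5.3):
    print(f"{f:.2f} GHz -> {rel_power(f):.2f}x power")
# 5.0 GHz costs ~1.55x the power of 4.32 GHz for ~16% more performance.
```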
 
Reactions: jdubs03

Doug S

Diamond Member
Feb 8, 2020
3,005
5,167
136
To be fair it makes sense to pick out the "flaws" of the design that is currently trailing. Oryon, especially in 8 Elite, is impressive but in single core performance and efficiency it's still trailing Apple's P-core.

There's also always the question of how high you can push clocks before it sends your power consumption skyrocketing and it's unclear if they can go past 5GHz without losing the efficiency.

Oryon's exceptional perf per area is its greatest strength IMO as it gives the Nuvia team room to improve its IPC by making the L core a bit bigger while the M core stays in its sweet spot, meaning they don't *have* to become entirely reliant on clock speed to get generational improvements. But as always with chip design, theory is nice but it's the ability to execute which makes the difference.


I think Qualcomm's bigger problem WRT mobile is that they keep ramping up the price they charge OEMs. If they take too much of the profit from the high end of the Android market it looks more and more attractive for OEMs to look elsewhere - either designing their own SoC using ARM cores (which are in the same performance ballpark) or buying from someone else who has.

The only real missing piece is a modem, but if Google does OK going with Mediatek's modem I imagine they will start getting a lot of calls and Qualcomm will be forced to either rein in their price increases or accept a smaller market share.
 

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
An interesting talk about Ray Tracing on Adreno GPUs:


PDF of presentation slides:


8 Gen 2 (Adreno 740) and X Elite (Adreno 741) have 1st generation Adreno RT.

8 Gen 3 (Adreno 750) has 2nd generation Adreno RT. It has a hardware block called AQE that does invocation repacking, which sounds similar to Nvidia's Shader Execution Re-ordering (SER) or Intel's Thread Sorting Unit (TSU).
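
Invocation repacking, roughly: between traversal and shading, hits get regrouped by which hit shader they need, so the lanes in a wave all run the same shader instead of diverging. A toy sketch of the idea (not Adreno's actual AQE mechanism, and the wave size is shrunk for readability):

```python
# Toy illustration of invocation repacking: group ray hits by hit-shader
# ID so each wave of lanes executes one shader instead of diverging.
from collections import defaultdict

WAVE = 4  # lanes per wave, tiny for readability

def repack(hits):
    buckets = defaultdict(list)
    for ray_id, shader_id in hits:
        buckets[shader_id].append(ray_id)
    for shader_id, rays in buckets.items():
        for i in range(0, len(rays), WAVE):
            yield shader_id, rays[i:i + WAVE]  # one coherent wave

hits = [(0, "metal"), (1, "glass"), (2, "metal"), (3, "skin"),
        (4, "metal"), (5, "glass"), (6, "metal"), (7, "metal")]
for shader, wave in repack(hits):
    print(shader, wave)
```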

8 Elite has 3rd generation Adreno RT, I guess, but the details are unknown.

8 Elite G2 (Adreno 840) will have 4th generation Adreno RT, and it might implement BVH traversal acceleration in hardware. Apple already introduced this in A17 Pro last year, so it's long overdue for Qualcomm to do the same.

Snapdragon X2 might have a decent GPU uarch.
Qualcomm has added a heavy hitter to its GPU ranks. AMD’s former ray tracing expert, Paritosh Kulkarni, announced he’s joined Qualcomm’s team to work on DirectX 12.2 support for its Adreno GPUs. He’ll leverage his expertise in ray tracing to help develop DX12.2 features like DXR, mesh shaders, work graphs, and driver optimizations.
 
Last edited:

soresu

Diamond Member
Dec 19, 2014
3,489
2,782
136
8 Elite G2 (Adreno 840) will have 4th generation Adreno RT, and it might implement BVH traversal acceleration in hardware
I was under the impression that this feature was pretty fundamental to any HW RT implementation?
 

soresu

Diamond Member
Dec 19, 2014
3,489
2,782
136
SoC          Adreno   CUs   Node   Size
8+ Gen 1     730       4    N4     18 mm²
8 Gen 2      740       6    N4     26 mm²
8 Gen 3      750       6    N4P    28 mm²
8 Elite      830      12    N3E    22 mm²
8 Elite G2   840      18    N3P    ?

Adreno 840 might be the biggest GPU ever in Snapdragon 8 history. 30+ mm².
The power draw increase will be pretty gnarly too unless they squeeze a lot more efficiency out of it, because N3E -> N3P is nowhere near worth a 50% power reduction.
 
Reactions: Tlh97

FlameTail

Diamond Member
Dec 15, 2021
4,384
2,754
106
I was under the impression that this feature was pretty fundamental to any HW RT implementation?
No; even AMD's RDNA 2/RDNA 3 don't have hardware BVH traversal (traversal runs in shader code, with only the ray/box and ray/triangle intersection tests accelerated), but they are adding it for RDNA 4, which is the reason for the hype.
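
To illustrate the distinction, a deliberately toy sketch (1D "boxes", Python standing in for shader code): the loop below is what currently runs as ordinary shader instructions, with only the intersection tests getting hardware help; dedicated traversal units move the whole loop into fixed function.

```python
# Toy BVH traversal, simplified to 1D intervals for brevity. On RDNA 2/3
# (and, apparently, current Adreno) this whole stack-driven loop runs as
# regular shader code; only the "box test" gets hardware assistance.
class Node:
    def __init__(self, lo, hi, children=None, prims=None):
        self.lo, self.hi = lo, hi       # bounding interval
        self.children = children or []
        self.prims = prims or []        # leaf primitives (points)

def trace(x, root):
    """Find primitives containing x, walking the BVH with a stack."""
    stack, hit = [root], None
    while stack:                        # per-ray loop, in "shader code"
        node = stack.pop()
        if not (node.lo <= x <= node.hi):   # the HW-assisted box test
            continue
        if node.prims:
            hit = [p for p in node.prims if p == x] or hit
        else:
            stack.extend(node.children)
    return hit

leaf_a = Node(0, 4, prims=[1, 3])
leaf_b = Node(5, 9, prims=[6, 8])
print(trace(6, Node(0, 9, children=[leaf_a, leaf_b])))  # [6]
```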

Refer to this:
The power draw increase will be pretty gnarly too unless they squeeze a lot more efficiency out of it, because N3E -> N3P is nowhere near worth a 50% power reduction.
Well, the last time Qualcomm did a 50% CU increase was 8 Gen 1 (730) -> 8 Gen 2 (740):

         8+ Gen 1     8 Gen 2
GPU      Adreno 730   Adreno 740
CUs      4            6
Clock    900 MHz      680 MHz

3DMark Steel Nomad Light (Socpk.com - Geekerwan)

Both are on the same node (N4); the 740 has a 25% lower clock speed, yet it's 35% faster at a similar power level.

From the 830 to the 840 there is also going to be a node improvement (N3E -> N3P), so I expect the clock regression will only be about 10-15%. So I estimate the 840 will bring at least 30% higher performance iso-power compared to the 830.
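
Napkin math on that estimate (assuming performance scales linearly with CU count and clock; the 740 precedent above suggests that's if anything conservative, since per-CU gains came on top):

```python
# Sanity check: 12 -> 18 CUs with a 10-15% clock regression, assuming
# performance scales linearly with CUs x clock (optimistic but in line
# with the 730 -> 740 precedent, before any per-CU uarch gains).
cu_scale = 18 / 12                     # rumoured Adreno 830 -> 840 CUs
for clock_drop in (0.10, 0.15):
    perf = cu_scale * (1 - clock_drop)
    print(f"{clock_drop:.0%} clock drop -> +{perf - 1:.0%} perf")
# ~ +28% to +35%, consistent with "at least 30%" iso-power
```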
 

Meteor Late

Senior member
Dec 15, 2023
266
291
96
The power draw increase will be pretty gnarly too unless they squeeze a lot more efficiency out of it, because N3E -> N3P is nowhere near worth a 50% power reduction.

No, because you don't necessarily increase or even maintain clocks. With N3P, a 10% frequency reduction from what was on N3E could get you close to equal power consumption, or maybe a bit more.
 