Discussion NV Re-Enter ARM PC market in 2025!


LightningZ71

Platinum Member
Mar 10, 2017
2,077
2,525
136
There isn't a lot left beyond maturing the tech that's already out there. There are hundreds of tiny compute platforms for all of the various items that could use computer control systems. AI is progressing towards a point where it either hits a solid brick wall or makes a huge leap by actually achieving true reasoning. Big data is still big data, but it's constrained more by locating and moving the data than by actual compute. The auto industry has many players now, and all are converging on being limited by both sensor cost and AI capability; given where silicon scaling seems to be stalling, that looks like a hard limit too. GPUs are nearing the limits of raster rates, and ray tracing is so resource intensive that it may never hit full, real-time image throughput natively, at least not in something that is affordable and able to be operated by mere mortals (power, thermal management, and even the size of the whole system will be constraints). It will continue to rely on AI to clean up low-res approximations.

All that seems to be left is the nano world of micro robots for health care and equipment repair and inspection. We really aren't making circuits drastically smaller anymore, and trying to get such tiny and densely packed circuits to switch more rapidly is hitting thermal limits. Where else do you go?
 
Jul 27, 2020
23,462
16,510
146
We really aren't making circuits drastically smaller anymore, and trying to get such tiny and densely packed circuits to switch more rapidly is hitting thermal limits. Where else do you go?
Bigger caches. More cache levels. Heftier AI involvement in caching strategies. Reverse HT or rentable units type of innovations to increase IPC.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,077
2,525
136
Those aren't much more than a few percent more performance each. Caches can only get so big or deep before access latencies kill all the benefits. Hit rates are already typically averaging well north of 90%. Cores are already quite wide, so adding more execution units only helps in edge cases most of the time.
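To put rough numbers on the latency-versus-hit-rate trade-off, here is a minimal sketch using the textbook average memory access time (AMAT) formula. The latencies and hit rates are made-up illustrative values, not measurements of any real core, and the function name `amat` is just a label for this example.

```python
def amat(levels, memory_latency):
    """Average memory access time, in cycles, for a cache hierarchy.

    levels: list of (hit_latency_cycles, hit_rate), ordered from L1 outward.
    memory_latency: cycles to reach DRAM once the last cache level misses.
    """
    total = 0.0
    reach = 1.0  # fraction of all accesses that get as far as this level
    for hit_latency, hit_rate in levels:
        total += reach * hit_latency  # every access reaching this level pays its latency
        reach *= 1.0 - hit_rate       # only the misses continue outward
    return total + reach * memory_latency


# Hypothetical hierarchy: L1, L2, L3, then DRAM.
baseline = amat([(4, 0.95), (14, 0.80), (50, 0.70)], memory_latency=300)

# Hypothetical bigger L2: better hit rate, but slower to access.
bigger_l2 = amat([(4, 0.95), (18, 0.85), (50, 0.70)], memory_latency=300)

print(f"baseline AMAT:  {baseline:.2f} cycles")
print(f"bigger L2 AMAT: {bigger_l2:.2f} cycles")
```

With these assumed inputs the larger, slower L2 moves AMAT from about 6.10 to about 5.95 cycles, a gain of only a couple of percent, because the extra hit rate is partly paid back in added latency.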

There are still a few things that can be innovated on, but, they're really just working to address stalls, predict branches better and make mispredicts less costly, and optimizing certain instruction latencies around predicted workloads.
 

LightningZ71

Platinum Member
Mar 10, 2017
2,077
2,525
136
This is the least surprising thing in the world. With AMD and Intel producing SoCs that have iGPUs good enough to have gutted the bottom end of Nvidia's laptop dGPU stack, they really had no choice if they didn't want to give up a chunk of their sales. I'm interested in seeing how much Nvidia contributes towards making Windows on Arm a better product, as Nvidia is known for its software work, and since they can't get an x86 license it's their only real option.
 

SpudLobby

Golden Member
May 18, 2022
1,039
700
106
This is the least surprising thing in the world. With AMD and Intel producing SoCs that have iGPUs good enough to have gutted the bottom end of Nvidia's laptop dGPU stack, they really had no choice if they didn't want to give up a chunk of their sales. I'm interested in seeing how much Nvidia contributes towards making Windows on Arm a better product, as Nvidia is known for its software work, and since they can't get an x86 license it's their only real option.
FWIW Nvidia today wouldn’t care about an X86 license. Originally what they wanted was to get around that anyway. Arm isn’t perfect but Nvidia increasing the software moat at a time they could invest to do the opposite toward a much more open playing field (even if very much imperfect with Arm’s antics) makes no sense, especially when AMD and Intel would retain the rights to that on a probably unfavorable term. As a bonus if their core development goes nowhere, and it might, they have generic Arm cores they can use since their real advantage is other IP anyway.
 

Tigerick

Senior member
Apr 1, 2022
718
670
106
170W vs 155W (mostly for Mac Mini M4 Pro with 256-bit memory interface)

I suspect GB10 would be used for upcoming high end ARM SoC with similar power consumption; also competing with Strix/Medusa Halo.

There was a rumor that NV is going to launch three ARM SoCs. If N1x is the low-end SoC and GB10 is the high-end one, the only one missing is a mainstream SoC with a 128-bit memory bus. I have created a table on the front page comparing different SoCs. It was fun; hope you guys have a good read...
 

DZero

Senior member
Jun 20, 2024
754
284
96
170W vs 155W (mostly for Mac Mini M4 Pro with 256-bit memory interface)

I suspect GB10 would be used for upcoming high end ARM SoC with similar power consumption; also competing with Strix/Medusa Halo.

There was a rumor that NV is going to launch three ARM SoCs. If N1x is the low-end SoC and GB10 is the high-end one, the only one missing is a mainstream SoC with a 128-bit memory bus. I have created a table on the front page comparing different SoCs. It was fun; hope you guys have a good read...
Seems that the alliance with MTK is working properly.
Now time to see the reaction of x86
 

Jan Olšan

Senior member
Jan 12, 2017
514
1,011
136
Apparently the GB10 SoC is actually two pieces of silicon (see photos on ServeTheHome) and, based on Nvidia's press release, the CPU and GPU communicate over "NVLink-C2C" with 5x the bandwidth of PCIe 5.0.
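For a sense of scale on that "5x PCIe 5.0" figure, here is a back-of-the-envelope sketch. PCIe 5.0 runs at 32 GT/s per lane with 128b/130b encoding; the x16 link width used as the baseline is my assumption, since the wording quoted here doesn't say which width the comparison refers to. The helper name `pcie5_gbytes_per_s` is just for this example.

```python
# PCIe 5.0 signals at 32 GT/s per lane and uses 128b/130b line encoding.
TRANSFERS_PER_S = 32e9
ENCODING_EFFICIENCY = 128 / 130


def pcie5_gbytes_per_s(lanes):
    """Raw bandwidth per direction in GB/s, ignoring packet/protocol overhead."""
    return TRANSFERS_PER_S * ENCODING_EFFICIENCY * lanes / 8 / 1e9


# ASSUMPTION: the "5x" claim is measured against a x16 link.
x16 = pcie5_gbytes_per_s(16)
print(f"PCIe 5.0 x16: ~{x16:.0f} GB/s per direction")
print(f"5x that:      ~{5 * x16:.0f} GB/s per direction")
```

Under that assumption the link would be in the low hundreds of GB/s per direction; a different baseline width scales the result proportionally.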

I wonder how the RAM sharing is handled. I guess it is similar to Strix Halo, but it's going to be interesting to see how well they managed to handle the CPU's RAM access.

Perhaps this is because Nvidia didn't want to give MediaTek any of their IP to integrate onto their design? Besides the NVLink IP.
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
Seeing as my general ARM thread has been locked I might as well add the posts from there to this thread as they are all nVidia related.
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
On the subject of nVidia's custom 'Olympus' CPU core here is all the currently released info on its instruction support:

+// REQUIRES: aarch64-registered-target
+// RUN: %clang --target=aarch64 --print-enabled-extensions -mcpu=olympus | FileCheck --strict-whitespace --implicit-check-not=FEAT_ %s
+
+// CHECK: Extensions enabled for the given AArch64 target
+// CHECK-EMPTY:
+// CHECK-NEXT: Architecture Feature(s) Description
+// CHECK-NEXT: FEAT_AES, FEAT_PMULL Enable AES support
+// CHECK-NEXT: FEAT_AMUv1 Enable Armv8.4-A Activity Monitors extension
+// CHECK-NEXT: FEAT_AMUv1p1 Enable Armv8.6-A Activity Monitors Virtualization support
+// CHECK-NEXT: FEAT_AdvSIMD Enable Advanced SIMD instructions
+// CHECK-NEXT: FEAT_BF16 Enable BFloat16 Extension
+// CHECK-NEXT: FEAT_BRBE Enable Branch Record Buffer Extension
+// CHECK-NEXT: FEAT_BTI Enable Branch Target Identification
+// CHECK-NEXT: FEAT_CCIDX Enable Armv8.3-A Extend of the CCSIDR number of sets
+// CHECK-NEXT: FEAT_CHK Enable Armv8.0-A Check Feature Status Extension
+// CHECK-NEXT: FEAT_CRC32 Enable Armv8.0-A CRC-32 checksum instructions
+// CHECK-NEXT: FEAT_CSV2_2 Enable architectural speculation restriction
+// CHECK-NEXT: FEAT_Crypto Enable cryptographic instructions
+// CHECK-NEXT: FEAT_DIT Enable Armv8.4-A Data Independent Timing instructions
+// CHECK-NEXT: FEAT_DPB Enable Armv8.2-A data Cache Clean to Point of Persistence
+// CHECK-NEXT: FEAT_DPB2 Enable Armv8.5-A Cache Clean to Point of Deep Persistence
+// CHECK-NEXT: FEAT_DotProd Enable dot product support
+// CHECK-NEXT: FEAT_ECV Enable enhanced counter virtualization extension
+// CHECK-NEXT: FEAT_ETE Enable Embedded Trace Extension
+// CHECK-NEXT: FEAT_FAMINMAX Enable FAMIN and FAMAX instructions
+// CHECK-NEXT: FEAT_FCMA Enable Armv8.3-A Floating-point complex number support
+// CHECK-NEXT: FEAT_FGT Enable fine grained virtualization traps extension
+// CHECK-NEXT: FEAT_FHM Enable FP16 FML instructions
+// CHECK-NEXT: FEAT_FP Enable Armv8.0-A Floating Point Extensions
+// CHECK-NEXT: FEAT_FP16 Enable half-precision floating-point data processing
+// CHECK-NEXT: FEAT_FP8 Enable FP8 instructions
+// CHECK-NEXT: FEAT_FP8DOT2 Enable FP8 2-way dot instructions
+// CHECK-NEXT: FEAT_FP8DOT4 Enable FP8 4-way dot instructions
+// CHECK-NEXT: FEAT_FP8FMA Enable Armv9.5-A FP8 multiply-add instructions
+// CHECK-NEXT: FEAT_FPAC Enable Armv8.3-A Pointer Authentication Faulting enhancement
+// CHECK-NEXT: FEAT_FRINTTS Enable FRInt[32|64][Z|X] instructions that round a floating-point number to an integer (in FP format) forcing it to fit into a 32- or 64-bit int
+// CHECK-NEXT: FEAT_FlagM Enable Armv8.4-A Flag Manipulation instructions
+// CHECK-NEXT: FEAT_FlagM2 Enable alternative NZCV format for floating point comparisons
+// CHECK-NEXT: FEAT_HCX Enable Armv8.7-A HCRX_EL2 system register
+// CHECK-NEXT: FEAT_I8MM Enable Matrix Multiply Int8 Extension
+// CHECK-NEXT: FEAT_JSCVT Enable Armv8.3-A JavaScript FP conversion instructions
+// CHECK-NEXT: FEAT_LOR Enable Armv8.1-A Limited Ordering Regions extension
+// CHECK-NEXT: FEAT_LRCPC Enable support for RCPC extension
+// CHECK-NEXT: FEAT_LRCPC2 Enable Armv8.4-A RCPC instructions with Immediate Offsets
+// CHECK-NEXT: FEAT_LS64, FEAT_LS64_V, FEAT_LS64_ACCDATA Enable Armv8.7-A LD64B/ST64B Accelerator Extension
+// CHECK-NEXT: FEAT_LSE Enable Armv8.1-A Large System Extension (LSE) atomic instructions
+// CHECK-NEXT: FEAT_LSE2 Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules
+// CHECK-NEXT: FEAT_LUT Enable Lookup Table instructions
+// CHECK-NEXT: FEAT_MEC Enable Memory Encryption Contexts Extension
+// CHECK-NEXT: FEAT_MPAM Enable Armv8.4-A Memory system Partitioning and Monitoring extension
+// CHECK-NEXT: FEAT_MTE, FEAT_MTE2 Enable Memory Tagging Extension
+// CHECK-NEXT: FEAT_NV, FEAT_NV2 Enable Armv8.4-A Nested Virtualization Enchancement
+// CHECK-NEXT: FEAT_PAN Enable Armv8.1-A Privileged Access-Never extension
+// CHECK-NEXT: FEAT_PAN2 Enable Armv8.2-A PAN s1e1R and s1e1W Variants
+// CHECK-NEXT: FEAT_PAuth Enable Armv8.3-A Pointer Authentication extension
+// CHECK-NEXT: FEAT_PMUv3 Enable Armv8.0-A PMUv3 Performance Monitors extension
+// CHECK-NEXT: FEAT_RAS, FEAT_RASv1p1 Enable Armv8.0-A Reliability, Availability and Serviceability Extensions
+// CHECK-NEXT: FEAT_RDM Enable Armv8.1-A Rounding Double Multiply Add/Subtract instructions
+// CHECK-NEXT: FEAT_RME Enable Realm Management Extension
+// CHECK-NEXT: FEAT_RNG Enable Random Number generation instructions
+// CHECK-NEXT: FEAT_SB Enable Armv8.5-A Speculation Barrier
+// CHECK-NEXT: FEAT_SEL2 Enable Armv8.4-A Secure Exception Level 2 extension
+// CHECK-NEXT: FEAT_SHA1, FEAT_SHA256 Enable SHA1 and SHA256 support
+// CHECK-NEXT: FEAT_SHA3, FEAT_SHA512 Enable SHA512 and SHA3 support
+// CHECK-NEXT: FEAT_SM4, FEAT_SM3 Enable SM3 and SM4 support
+// CHECK-NEXT: FEAT_SPE Enable Statistical Profiling extension
+// CHECK-NEXT: FEAT_SPECRES Enable Armv8.5-A execution and data prediction invalidation instructions
+// CHECK-NEXT: FEAT_SPEv1p2 Enable extra register in the Statistical Profiling Extension
+// CHECK-NEXT: FEAT_SSBS, FEAT_SSBS2 Enable Speculative Store Bypass Safe bit
+// CHECK-NEXT: FEAT_SVE Enable Scalable Vector Extension (SVE) instructions
+// CHECK-NEXT: FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
+// CHECK-NEXT: FEAT_SVE_AES, FEAT_SVE_PMULL128 Enable SVE AES and quadword SVE polynomial multiply instructions
+// CHECK-NEXT: FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: FEAT_SVE_SHA3 Enable SHA3 SVE2 instructions
+// CHECK-NEXT: FEAT_SVE_SM4 Enable SM4 SVE2 instructions
+// CHECK-NEXT: FEAT_TLBIOS, FEAT_TLBIRANGE Enable Armv8.4-A TLB Range and Maintenance instructions
+// CHECK-NEXT: FEAT_TRBE Enable Trace Buffer Extension
+// CHECK-NEXT: FEAT_TRF Enable Armv8.4-A Trace extension
+// CHECK-NEXT: FEAT_UAO Enable Armv8.2-A UAO PState
+// CHECK-NEXT: FEAT_VHE Enable Armv8.1-A Virtual Host extension
+// CHECK-NEXT: FEAT_WFxT Enable Armv8.7-A WFET and WFIT instruction
+// CHECK-NEXT: FEAT_XS Enable Armv8.7-A limited-TLB-maintenance instruction
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
Olympus definitely has SVE2, but not the SVE2p1 or SVE2p2 extensions from later v9-A ISA releases - no info as yet on ALU bit length.
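If anyone wants to check a feature against the listing without eyeballing it, here is a minimal sketch that pulls the FEAT_ names out of CHECK-NEXT lines. The `LISTING` string is only a short excerpt of the lines posted above (paste in the full listing to check everything), and `enabled_features` is a hypothetical helper name, not part of LLVM.

```python
import re

# Short excerpt of the CHECK lines from the posted test file.
LISTING = """\
+// CHECK-NEXT: FEAT_SVE Enable Scalable Vector Extension (SVE) instructions
+// CHECK-NEXT: FEAT_SVE2 Enable Scalable Vector Extension 2 (SVE2) instructions
+// CHECK-NEXT: FEAT_SVE_AES, FEAT_SVE_PMULL128 Enable SVE AES and quadword SVE polynomial multiply instructions
+// CHECK-NEXT: FEAT_SVE_BitPerm Enable bit permutation SVE2 instructions
+// CHECK-NEXT: FEAT_LSE2 Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules
"""

# One or more comma-separated FEAT_ names at the start of the checked text.
LEADING_NAMES = re.compile(r"\s*(FEAT_\w+(?:,\s*FEAT_\w+)*)")


def enabled_features(listing):
    """Return the set of FEAT_ names that appear in CHECK-NEXT lines."""
    features = set()
    for line in listing.splitlines():
        _, marker, rest = line.partition("CHECK-NEXT:")
        if not marker:
            continue  # not a CHECK-NEXT line
        match = LEADING_NAMES.match(rest)
        if match:
            features.update(name.strip() for name in match.group(1).split(","))
    return features


features = enabled_features(LISTING)
for name in ("FEAT_SVE2", "FEAT_SVE2p1"):
    print(name, "listed" if name in features else "not listed")
```

Only the leading names on each line are collected, so a FEAT_ string that happened to appear inside a description would not be counted.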
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
Further investigation reveals the "HasV9_2aOps" flag which means v9.2-A architecture profile as targeted by LLVM.

(though many µArch features exceed the v9.2-A profile)
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
"Enable Armv8.4-A Large System Extension 2 (LSE2) atomicity rules" should make it better at binary translation for x86 -> ARM64 emulation a la Prism and FEXemu.
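To make the LSE2 point a bit more concrete: as I understand the rule, on normal write-back cacheable memory an unaligned load or store is still single-copy atomic as long as every byte it touches sits inside one 16-byte-aligned block. A translator can then keep such x86 accesses as plain loads and stores instead of falling back to something slower. Here is a minimal sketch of that window check; the function name is made up for illustration and this is not code from Prism or FEX.

```python
def within_lse2_atomic_window(addr, size):
    """True if [addr, addr + size) lies inside a single 16-byte-aligned block.

    Sketch of the FEAT_LSE2 condition for unaligned accesses; it ignores
    memory type and everything else the architecture also requires.
    """
    if not 0 < size <= 16:
        raise ValueError("size must be between 1 and 16 bytes")
    first_block = addr // 16
    last_block = (addr + size - 1) // 16
    return first_block == last_block


# An 8-byte access at offset 4 stays inside one 16-byte block.
print(within_lse2_atomic_window(0x1004, 8))
# An 8-byte access at offset 12 straddles two blocks.
print(within_lse2_atomic_window(0x100C, 8))
```

Accesses that fail the check are the ones an emulator still has to treat specially if the guest code expects them to be atomic.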

It also has instructions called Large System Float Extension (LSFE), not sure if that has any particular positive impact on emulation.

Edit: nope, LSFE doesn't have any effect on x86 emulation as x86 lacks atomic float ops, but it might have a positive effect on emulating other CPU (or GPU) ISAs on ARM.
 

poke01

Diamond Member
Mar 8, 2022
3,381
4,625
106
Apparently the GB10 SoC is actually two pieces of silicon (see photos on ServeTheHome) and, based on Nvidia's press release, the CPU and GPU communicate over "NVLink-C2C" with 5x the bandwidth of PCIe 5.0.

I wonder how the RAM sharing is handled. I guess it is similar to Strix Halo, but it's going to be interesting to see how well they managed to handle the CPU's RAM access.

Perhaps this is because Nvidia didn't want to give MediaTek any of their IP to integrate onto their design? Besides the NVLink IP.
I wonder if NV's upcoming SoCs will use SoIC packaging?
 

soresu

Diamond Member
Dec 19, 2014
3,688
3,025
136
Got to say I don't appreciate the locking of this thread without even leaving a post to explain, especially when I only created it with the intention of keeping this thread subject clean as the mods weren't making any effort to that effect as they do on the Intel/AMD/nV threads.

I don't know what the reason is, but whatever it was it should still have warranted a 30 second effort to type an explanation before locking it.

Do I have to use profanity for attention just to get a response from the mods on this?
 

Tup3x

Golden Member
Dec 31, 2016
1,210
1,290
136
Apparently the GB10 SoC is actually two pieces of silicon (see photos on ServeTheHome) and, based on Nvidia's press release, the CPU and GPU communicate over "NVLink-C2C" with 5x the bandwidth of PCIe 5.0.

I wonder how the RAM sharing is handled. I guess it is similar to Strix Halo, but it's going to be interesting to see how well they managed to handle the CPU's RAM access.

Perhaps this is because Nvidia didn't want to give MediaTek any of their IP to integrate onto their design? Besides the NVLink IP.
If they made a version without the ConnectX-7 NIC, the price should go down a lot.
 
Jul 27, 2020
23,462
16,510
146
Do I have to use profanity for attention just to get a response from the mods on this?
I can understand your frustration but publicly calling out moderators isn't allowed. Better if you can ask some mod directly via PM. They are very helpful and aren't rude unless you force them to be.

I would also advise you to edit your "venting" post to avoid getting points.
 