Question Intel and AMD form x86 Ecosystem Advisory Group


Raqia

Member
Nov 19, 2008
61
30
91
Today, x86 is on the defensive as ARM and RISC-V encroach from below into laptops, desktops, and servers: its traditional strongholds. The 2 biggest x86 implementers absolutely need more clarity on future investments so they can better concentrate capital on implementations at expensive new nodes.

Tepid and slow adoption of new extensions had been the norm for x86 in the past as even within x86, AMD and Intel went their separate ways (e.g. 3DNOW! vs. SSE, AVX2 vs. SSE5, AVX-512 support for consumer parts vs none...), never mind x86 vs Itanium... Chrome wasn't even ported to x86-64 for some 6 years after release! By comparison, phones have had much tighter software ecosystems thanks to the app-store model and browsers; CPU ISA is pretty uniform too.

Because of how expensive it is to design and implement new CPUs (never mind entire SoCs) at leading nodes, fragmentation would leave the x86 manufacturer with the un-adopted extension utterly doomed, and even the winning manufacturer would be hated by software developers for the effort lost to fragmentation. Having missed the boat on phones cost Intel dearly, not only in revenue but also in computing ecosystem clout.
 

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
I wonder does this mean that consumer hardware will converge on AVX10/256 only or will AMD keep AVX10/512 across the stack going forward (double pumped or native).
It's not the first time I've seen people use the term "double pumped". In CPU design it means you do twice the work per clock (for instance the P4 ALU). I don't think Intel or AMD are doing that. You probably mean the reverse: AMD splits its 512-bit operations into two halves.
 

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
Tepid and slow adoption of new extensions had been the norm for x86 in the past as even within x86, AMD and Intel went their separate ways (e.g. 3DNOW! vs. SSE, AVX2 vs. SSE5, AVX-512 support for consumer parts vs none...), never mind x86 vs Itanium... Chrome wasn't even ported to x86-64 for some 6 years after release! By comparison, phones have had much tighter software ecosystems thanks to the app-store model and browsers; CPU ISA is pretty uniform too.
IMHO SVE2 adoption is no better on Arm. Thanks to Qualcomm. And to Arm CPUs for not bringing enough benefits to switch from NEON 128-bit to SVE2 128-bit.

At least it can be argued that all Arm customers of recent CPUs have SVE2, but still.

OTOH I wonder if Arm isn't more advanced with security extensions than Intel/AMD. I'm not familiar enough with x86 extensions to know for sure, but Arm has many ISA features for security (BTI, MTE, etc.). This might be the main driver of the x86 alliance.
 

gdansk

Diamond Member
Feb 8, 2011
3,110
4,840
136
OTOH I wonder if Arm isn't more advanced with security extensions than Intel/AMD. I'm not familiar enough with x86 extensions to know for sure, but Arm has many ISA features for security (BTI, MTE, etc.). This might be the main driver of the x86 alliance.
Zen 4 has equivalents for both indirect branch tracking and upper address ignore for pointer tagging (albeit only 7 bits).

But it's an area to standardize because AMD notes some features are enumerated in a different way than "other vendors" (i.e. Intel).
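For anyone who hasn't seen pointer tagging, here is a rough Python sketch of what a 7-bit tag in the upper address bits looks like. It only models the bit arithmetic. The placement of the tag in the topmost 7 bits of a 64-bit pointer is an assumption made for illustration, and `tag_pointer`, `read_tag`, and `strip_tag` are made-up helper names, not any vendor API.

```python
TAG_BITS = 7                                  # tag width mentioned in the post
TAG_SHIFT = 64 - TAG_BITS                     # assumption: tag occupies the topmost bits
TAG_MASK = ((1 << TAG_BITS) - 1) << TAG_SHIFT
U64 = 0xFFFFFFFFFFFFFFFF                      # keep results within 64 bits


def tag_pointer(addr, tag):
    """Return addr with a 7-bit tag stored in its upper bits (hypothetical helper)."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 7 bits")
    return (addr & ~TAG_MASK & U64) | (tag << TAG_SHIFT)


def read_tag(ptr):
    """Extract the tag that software stored in the pointer."""
    return (ptr & TAG_MASK) >> TAG_SHIFT


def strip_tag(ptr):
    """Recover the plain address, i.e. what the hardware would use once it ignores the tag bits."""
    return ptr & ~TAG_MASK & U64


p = tag_pointer(0x00007FFF12345678, 0x55)
print(hex(p), hex(read_tag(p)), hex(strip_tag(p)))
```

With only 7 bits there are 128 distinct tag values, which is why the "albeit only 7 bits" caveat matters for schemes that want more tag entropy.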
 

Doug S

Platinum Member
Feb 8, 2020
2,836
4,823
136
IMHO SVE2 adoption is no better on Arm. Thanks to Qualcomm. And to Arm CPUs for not bringing enough benefits to switch from NEON 128-bit to SVE2 128-bit.

At least it can be argued that all Arm customers of recent CPUs have SVE2, but still.

I think a further problem is that they introduced streaming SVE which is yet another way to accomplish the same 128 bit math. That was introduced as part of SME, so I'd argue that maybe Apple thought SVE2 sucked and gave them "SVE done right" along with their AMX that became SME. Except that Apple's SSVE implementation on M4 doesn't perform all that well. It isn't clear whether that's an implementation bug, something that needs to be fixed in the spec (I don't recall the specifics, but @name99 can speak to this, something about the way the various registers can be used), or whether Apple just doesn't care about SSVE either and it performs poorly because they didn't try to make it perform better.

So anyone complaining about the state of SIMD on x86 shouldn't be looking over at ARM and thinking the grass is greener, because NEON is the only thing you can depend on having if you care about Android/Windows ARM and iPhone/Mac ARM compatibility. If Apple makes SSVE perform better, and ARM & Qualcomm implementations include SME and thus SSVE, maybe SVE2 will end up deprecated and removed by the time ARMv10 comes along, who knows...
 

MS_AT

Senior member
Jul 15, 2024
311
691
96
It's not the first time I've seen people use the term "double pumped". In CPU design it means you do twice the work per clock (for instance the P4 ALU). I don't think Intel or AMD are doing that. You probably mean the reverse: AMD splits its 512-bit operations into two halves.
Probably because it's the term AMD used during one of their PR presentations, and since then it has been repeated by all the media, who probably did not look into the technical accuracy of the term. But anyway, thanks for pointing this out.

And yes, they split 512b operations on Zen 4 and Zen 5 mobile. According to Mysticial (y-cruncher author):

When a 512-bit instruction is split, it is issued on two consecutive cycles likely to the same unit. (hence the "double-pumping") So one half will always be a cycle behind the other half. I have not tested which half has the 1 cycle delay, but I assume it's the upper half. Thus one can assume that all the datapaths on Zen4 remain 256-bit as 512-bit operands would take two consecutive cycles on each port.

As long as there are no dependencies which cross the halves, this 1 cycle delay is not observable. Therefore, the latencies of most 512-bit instructions remain the same as the 256-bit versions. 512-bit instructions which do have dependencies that cross between lower and upper 256-bit halves have a 1 cycle additional latency - presumably to wait for the slower half to be ready.
From https://www.mersenneforum.org/node/21615#post614191. This is from his Zen 4 teardown, but I would expect Zen 5 mobile to behave the same way. So if this description is accurate, this might be how they arrived at "double-pumping".
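To make the quoted latency argument concrete, here is a toy Python model of a dependent chain of split 512-bit ops. It is only a sketch of the description above, not a measurement: it assumes the upper half always issues at least one cycle after the lower half, and `chain_finish_cycle` is a made-up name.

```python
def chain_finish_cycle(n_ops, latency, crosses_halves):
    """Cycle at which the upper half of the last op in a dependent chain is ready.

    Each 512-bit op is modelled as two 256-bit halves; the upper half issues
    at least one cycle after the lower half (assumption taken from the quote).
    """
    lo_ready = hi_ready = 0          # both halves of the first input are ready at cycle 0
    for _ in range(n_ops):
        if crosses_halves:
            # A cross-half dependency must wait for the slower (upper) half.
            lo_start = max(lo_ready, hi_ready)
        else:
            # The lower half only depends on the previous lower half.
            lo_start = lo_ready
        hi_start = max(hi_ready, lo_start + 1)
        lo_ready = lo_start + latency
        hi_ready = hi_start + latency
    return hi_ready


for crossing in (False, True):
    per_op = chain_finish_cycle(5, 3, crossing) - chain_finish_cycle(4, 3, crossing)
    print("crosses halves:", crossing, "-> per-op latency in chain:", per_op)
```

In this model a chain of ops with a 3-cycle base latency advances by 3 cycles per op when the halves stay independent and by 4 cycles per op when each op depends on both halves, which is the "1 cycle additional latency" from the quote.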
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,494
2,059
136
I wonder does this mean that consumer hardware will converge on AVX10/256 only or will AMD keep AVX10/512 across the stack going forward (double pumped or native).

If Intel puts AVX10/256 on everything, I can see AMD moving back to the split implementation across their consumer lineup, to optimize more for the 256-bit case. But AMD's half-width implementation is really good, I can't see them doing away with it for really any reason.
 
Reactions: DarthKyrie

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
I think a further problem is that they introduced streaming SVE which is yet another way to accomplish the same 128 bit math.
Streaming SVE operates at the vector length of SME, so in the case of Apple that's 512-bit. But it's so slow it's useless (except for tasks between two matrix operations).

If Apple makes SSVE perform better, and ARM & Qualcomm implementations include SME and thus SSVE, maybe SVE2 will end up deprecated and removed by the time ARMv10 comes along, who knows...
The thing is that you have a single SME block per cluster (these matrix engines are large), so 4 CPUs with SVE might be faster than even a good implementation of SSVE. Of course that would depend on the number of SVE datapaths and the vector width in the CPUs.
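A back-of-the-envelope version of that comparison, as a Python sketch. The core count, pipe counts, and the 128-bit SVE width are made-up illustrative numbers, not specs of any real chip; only the 512-bit streaming vector length comes from the discussion above.

```python
def cluster_bits_per_cycle(units, pipes_per_unit, vector_bits):
    """Peak vector bits processed per cycle across a cluster (idealised, no stalls)."""
    return units * pipes_per_unit * vector_bits


# Hypothetical cluster: 4 cores, each with 2 x 128-bit SVE pipes.
per_core_sve = cluster_bits_per_cycle(units=4, pipes_per_unit=2, vector_bits=128)

# Hypothetical shared block: 1 SME unit per cluster running one 512-bit SSVE op per cycle.
shared_ssve = cluster_bits_per_cycle(units=1, pipes_per_unit=1, vector_bits=512)

print("per-core SVE:", per_core_sve, "bits/cycle")
print("shared SSVE: ", shared_ssve, "bits/cycle")
```

With these assumed numbers the four cores together come out ahead (1024 vs 512 bits per cycle), but the result flips as soon as you assume fewer SVE pipes per core or a wider shared unit, which is the dependency noted above.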
 
Reactions: soresu

branch_suggestion

Senior member
Aug 4, 2023
401
892
96
If nothing is beyond Jensen then he would have owned ARM by now
NV's flaw is their lack of goodwill building, they don't have the patience.
Qualcomm et al. had all the necessary connections in place to shut down the deal before it ever had a chance.
NV is an outsider with the US government, that hasn't changed, most top liaison roles between the semicaps and Congress are held by more traditional players.
 
Reactions: DarthKyrie

poke01

Platinum Member
Mar 8, 2022
2,352
3,081
106
NV's flaw is their lack of goodwill building, they don't have the patience.
Qualcomm et al. had all the necessary connections in place to shut down the deal before it ever had a chance.
NV is an outsider with the US government, that hasn't changed, most top liaison roles between the semicaps and Congress are held by more traditional players.
fair points.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,712
1,241
136
RISC-V is not a threat yet to x86, it’s more fragmented and that will take another decade or two to be feasible in HPC.
RISC-V isn't fragmented and it isn't going to be another decade. At best we are like only 12 months away from HPC~Cloud appearing.
 

soresu

Diamond Member
Dec 19, 2014
3,274
2,551
136
AVX2 vs. SSE5
SSE5 was announced but never implemented as detailed.

Most of it was covered by the AVX1 spec that Intel announced shortly after, and the rest was implemented as FMA4 and XOP.
 

Nothingness

Diamond Member
Jul 3, 2013
3,134
2,145
136
NV was always larger than ARM, nothing is beyond JHH as everyone should know by now.
It's quite likely JHH and Haas still have discussions and some common interests, just like Intel and AMD have, but thinking JHH has some form of control from behind the curtain over Arm through Haas (don't read this out loud at work) is certainly exaggerated.

Anyway enough of that, let's get back to Intel/AMD/x86
 
Reactions: FlameTail

NostaSeronx

Diamond Member
Sep 18, 2011
3,712
1,241
136
CPUs currently out there lack vector processing, that's not a trivial thing.
Are those CPUs actually release-safe, mass-produced units? Can I buy them in a large-volume, name-brand desktop, laptop, etc.? If not, those are prototype/limited-market products and trivial toys at best.

What is the defined RISC-V profile/spec for Android and Linux? Then you have your answer: there is no fragmentation for specific general-purpose operating systems.
 

soresu

Diamond Member
Dec 19, 2014
3,274
2,551
136
If Apple makes SSVE perform better, and ARM & Qualcomm implementations include SME and thus SSVE, maybe SVE2 will end up deprecated and removed by the time ARMv10 comes along, who knows...
SME is an optional feature of the ISA, as is SVE.

They will not be deprecating it.

Certainly not as early as the first major ISA revision after it was introduced.

Also read this from the top of ARM's own developer documentation of SME/SME2:

It introduces a new execution mode: Streaming SVE mode in which the new SME instructions and a subset of SVE2 instructions can be executed

So SSVE doesn't even implement the full spectrum of SVE2 instructions, thus it cannot completely replace it no matter how much it is sped up.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,712
1,241
136
I stopped reading here.
Akeana 5000 = RVA23
SiFive P870/Napa = RVA23
TT-Ascalon = RVA23
Veyron V2 = RVA23
Kunminghu = RVA23
C930 = RVA23
A66/A67/Cuzco = RVA23
RXU 3.2/3.5 = RVA23
etc.

It isn't fragmented for HPC/Cloud/Datacenter or whatever. There is no more RVA24/RVA25 going forward. It will only be RVA23 as the hardpoint, the freeze for it is pretty cool.

ARM and x86-64 operate as deeply mediocre ISAs with heavy fragmentation in active consumer products. None of this exists in RISC-V. Every single HPC-target core shares the same exact instruction set profile, RVA23.
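The "same profile" argument boils down to a set-inclusion check: a binary built for a profile runs on any core whose implemented extensions include everything the profile mandates. A minimal Python sketch of that idea, where the extension set is only a small illustrative subset of what RVA23 mandates (not the full list) and both example cores are hypothetical:

```python
# Illustrative subset of RVA23 mandatory extensions, NOT the complete profile.
RVA23_SUBSET = frozenset({"V", "Zba", "Zbb", "Zbs", "Zicond"})


def conforms(core_extensions, profile=RVA23_SUBSET):
    """True if the core implements every extension the profile mandates."""
    return profile <= set(core_extensions)


def missing(core_extensions, profile=RVA23_SUBSET):
    """Mandated extensions the core lacks, sorted for stable output."""
    return sorted(profile - set(core_extensions))


# Hypothetical cores, made up for the example.
new_core = {"V", "Zba", "Zbb", "Zbs", "Zicond", "Zfa"}   # extras beyond the profile are fine
old_core = {"Zba", "Zbb", "Zbs"}                          # no vector extension

print(conforms(new_core), missing(new_core))
print(conforms(old_core), missing(old_core))
```

Whether the fragmentation claim holds then depends on how many shipping cores fall in the second category, which is what the vector-processing objection earlier in the thread is about.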
 

branch_suggestion

Senior member
Aug 4, 2023
401
892
96
JHH: "We support the x86. The x86 is very important to us. We support it for PCs, workstations, data centers. And so the fact that the architecture was fragmenting isn’t good for the industry, so I love what they’re doing. Pulling it together and making sure that x86 remains x86. Otherwise, it’s not x86 anymore so I think it’s really terrific what they’re doing."
 

gdansk

Diamond Member
Feb 8, 2011
3,110
4,840
136
JHH: "We support the x86. The x86 is very important to us. We support it for PCs, workstations, data centers. And so the fact that the architecture was fragmenting isn’t good for the industry, so I love what they’re doing. Pulling it together and making sure that x86 remains x86. Otherwise, it’s not x86 anymore so I think it’s really terrific what they’re doing."
Aiming for one target is better for everyone. Even people who have to maintain x86 emulators and add future instruction set extensions.
 

soresu

Diamond Member
Dec 19, 2014
3,274
2,551
136
JHH: "We support the x86. The x86 is very important to us. We support it for PCs, workstations, data centers. And so the fact that the architecture was fragmenting isn’t good for the industry, so I love what they’re doing. Pulling it together and making sure that x86 remains x86. Otherwise, it’s not x86 anymore so I think it’s really terrific what they’re doing."
Wait, is that a real quote?

The last part sounds like he had a stroke, it's almost childlike.
 

soresu

Diamond Member
Dec 19, 2014
3,274
2,551
136
Aiming for one target is better for everyone. Even people who have to maintain x86 emulators and add future instruction set extensions.
True, even for nVidia where maintaining drivers is concerned, as they almost certainly have x86 specific optimisations.
 
Reactions: DarthKyrie