^^^ Dang! The odds of asking a fitness trainer chick out and becoming a dad one year later seem easier than what happened there!
That C&C article is very in-depth but... it repeatedly compares against Intel's first gen AVX-512 - Skylake-X. That means a 2024 core vs a 2017 core which is rather unfortunate given there are Cooper Lake, Ice Lake, Sapphire Rapids, Emerald Rapids, and Granite Rapids.
I have Cascade Lake, Ice Lake and Sapphire Rapids Xeons if there are any benchmarks that would be helpful to run. Those machines are on Linux though, so it would need to be a Linux-compatible benchmark.
Get in touch with Chips and Cheese, I'm sure they'd have some use for them!
I mean, at the end of the day the saying "the customer is always right" is pretty valid in the processor market as long as there is enough of a possible user base to take advantage of it.
Interesting though. A quick search shows that lerp instructions still don't exist in x86 15-16 years later.
You can implement a LERP in two instructions with an FMA and an FNMADD (negate the product, then add the third operand). And with AVX-512 you can do 16 single-precision ones simultaneously!
Yeah, that fits with what I saw on Intel's dev forum.
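For anyone who wants to see the two-instruction LERP spelled out, here is a minimal sketch. It uses scalar `std::fma` so it runs on any machine; the names `lerp_fma` and `lerp_array` are made up for illustration. The identity it relies on is a + t*(b - a) = t*b + (a - t*a), where the inner term is what an FNMADD computes. On AVX-512 hardware the same two steps would presumably map to `_mm512_fnmadd_ps` followed by `_mm512_fmadd_ps`, 16 floats at a time, but that part is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar lerp written as two fused operations:
//   a + t*(b - a) == t*b + (a - t*a)
inline float lerp_fma(float a, float b, float t) {
    float partial = std::fma(-t, a, a);  // a - t*a, the FNMADD step
    return std::fma(t, b, partial);      // t*b + partial, the plain FMA step
}

// Applies the same two steps element by element. A vectorizing compiler
// (or hand-written intrinsics) could process 16 floats per 512-bit register.
inline void lerp_array(const float* a, const float* b, float t,
                       float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = lerp_fma(a[i], b[i], t);
    }
}
```

Calling `lerp_fma(2.0f, 10.0f, 0.5f)` gives the midpoint of the two inputs. One nice property of this form is that t = 0 and t = 1 return the endpoints exactly.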
EDIT: Sadly, looks like the compiler cheated and didn't let the AVX-512 instructions go into the executable. Need to figure out how to stop the compiler from doing that...
I hadn't looked around in Visual Studio Project properties. Apparently, you can force it to compile only for AVX-512 and even AVX 10.1, and Schmide also helped with code that doesn't let the compiler optimize away the AVX-512 instructions, so it's now working in the AVX-512-only v0.02a executable of Rudi_Float_Bench.
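The thread doesn't show Schmide's actual code, so the following is only a generic sketch of one common way to stop a compiler from throwing away benchmark work: publish the result through a `volatile` store, which counts as an observable side effect. The helper names `keep_alive` and `bench_kernel` are hypothetical. Note that this only prevents the loop from being eliminated; which instructions the compiler emits for it still depends on the project's architecture settings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a store to a volatile object must be performed,
// so the value being stored has to be computed for real.
inline void keep_alive(float value) {
    static volatile float sink;
    sink = value;
}

// Hypothetical benchmark kernel: a multiply-accumulate loop whose result
// is published through keep_alive so the loop cannot be discarded.
inline float bench_kernel(const float* data, std::size_t n, float scale) {
    assert(data != nullptr);
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += data[i] * scale;
    }
    keep_alive(acc);
    return acc;
}
```

Checking the disassembly of the built executable is still the only sure way to confirm the expected instructions made it in.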
AVX-512F
I tested the code on Cascade Lake and Ice Lake Xeons. It only worked on the Ice Lake Xeon.
Because every AVX512 enabled CPU has to support AVX512F. F stands for foundation.
My complaint was regarding this: https://learn.microsoft.com/en-us/w...f-processthreadsapi-isprocessorfeaturepresent
That means MSVC generated code that requires some subset that Ice Lake supports but Cascade Lake does not. But this is a problem with MSVC.
'avx512bw',
'avx512cd',
'avx512dq',
'avx512f',
'avx512ifma',
'avx512vbmi',
'avx512vl',
This shows the fragmented mess that Intel created and why ultimately most developers decided not to mess with adding AVX-512 support to their applications.
Will Nova Lake bring it to par with AMD? Instruction wise
I'm not sure. They won't be able to resist making it incompatible with AMD's AVX512F by introducing a few new instructions in AVX10.2 which, if we are lucky, may appear in Zen 7. So maybe (and it's a big maybe) ISA extension parity is achieved between Intel and AMD CPUs with Zen 7 and Razer Lake.
Interestingly, maybe it's a typo on the wiki page, but what it says about AVX512F being available on Cascade Lake is wrong. Or Microsoft's function call is implemented wrong, because it wouldn't detect it on Cascade Lake when I tried it.
It is right. As I said, every AVX-512 CPU supports AVX512F, including Cascade Lake. The problem is that MSVC is using newer subsets but checks only for AVX512F and not the newer ones. In other words, the check is insufficient for the codegen, hence my recommendation to use clang.
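To make the "check is insufficient for the codegen" point concrete, here is a rough sketch of checking every subset a build needs rather than AVX512F alone. The bit positions are the CPUID leaf 7, sub-leaf 0 assignments for the flags listed earlier in the thread. The `Leaf7` struct and `supports_all` function are illustrative names, and the register values are passed in by hand so the sketch runs anywhere. Real code would fill them from `__cpuidex` (MSVC) or `__get_cpuid_count` (GCC/Clang), and would also need to confirm via XGETBV that the OS saves ZMM state, which is not covered here.

```cpp
#include <cassert>
#include <cstdint>

// CPUID leaf 7, sub-leaf 0 feature bits.
constexpr std::uint32_t kAvx512F    = 1u << 16;  // EBX
constexpr std::uint32_t kAvx512DQ   = 1u << 17;  // EBX
constexpr std::uint32_t kAvx512IFMA = 1u << 21;  // EBX
constexpr std::uint32_t kAvx512CD   = 1u << 28;  // EBX
constexpr std::uint32_t kAvx512BW   = 1u << 30;  // EBX
constexpr std::uint32_t kAvx512VL   = 1u << 31;  // EBX
constexpr std::uint32_t kAvx512VBMI = 1u << 1;   // ECX

// Illustrative container for the two registers we care about.
struct Leaf7 {
    std::uint32_t ebx;
    std::uint32_t ecx;
};

// True only if every required bit is set. Checking kAvx512F alone can
// pass on a CPU that lacks subsets the compiled code actually uses.
inline bool supports_all(Leaf7 cpu, Leaf7 required) {
    // Any AVX-512 requirement should include the foundation subset.
    assert((required.ebx & kAvx512F) != 0);
    return (cpu.ebx & required.ebx) == required.ebx &&
           (cpu.ecx & required.ecx) == required.ecx;
}
```

With a mask modeled on a CPU that has F, CD, BW, DQ and VL but not IFMA or VBMI, a foundation-only requirement passes while a requirement that adds IFMA and VBMI fails, which is the kind of mismatch described above.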
I hadn't looked around in Visual Studio Project properties. Apparently, you can force it to compile only for AVX-512 and even AVX 10.1 and Schmide also helped with code that doesn't let the compiler optimize away the AVX-512 instructions so it's now working in the AVX-512-only v0.02a executable of Rudi_Float_Bench.
Funny thing I found when searching for CPU feature detection in Microsoft docs. And I mean, I found it to be REALLY funny. Like LOL funny. Get this. Microsoft only provides AVX-512F feature detection support. That means, if you have anything less than an Ice Lake, the code will report that AVX-512 is not supported by your CPU.
How effing ironic that Microsoft thinks that all Intel Xeon CPUs with AVX-512 units before Ice Lake are worthless for AVX-512 execution so they didn't even provide API level support for detecting AVX-512 on them. I tested the code on Cascade Lake and Ice Lake Xeons. It only worked on the Ice Lake Xeon. And the worst thing was, the Cascade Lake Xeon is my company's $7000 server! Their fault for cheaping out and not going with Genoa like I asked them to.
One thing you're going to run into: if you enable AVX512F in your project, the compiler is going to use it. If you call a math library, it's going to crash on lesser processors. You're probably going to have to subdivide your project into subprojects/libraries/DLLs to maintain compatibility with lesser processors.
It is reporting it, apparently, but MS's detection routine is buggy, I guess.
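The "subdivide your project" advice usually ends up looking like runtime dispatch: keep a baseline routine built with the lowest common instruction set, put the AVX-512 routine in its own translation unit or DLL, and pick one through a function pointer at startup. Below is a minimal sketch of that shape. All names (`SumFn`, `sum_baseline`, `sum_wide`, `select_sum`) are hypothetical, and `sum_wide` is plain C++ standing in for a routine that would really be compiled separately with AVX-512 enabled.

```cpp
#include <cassert>
#include <cstddef>

using SumFn = float (*)(const float*, std::size_t);

// Baseline path: safe on every CPU the application supports.
inline float sum_baseline(const float* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += v[i];
    return acc;
}

// Stand-in for the wide path. It keeps 16 partial sums, mimicking the
// 16 float lanes of a 512-bit register, then reduces them at the end.
inline float sum_wide(const float* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    float lanes[16] = {};
    for (std::size_t i = 0; i < n; ++i) lanes[i % 16] += v[i];
    float acc = 0.0f;
    for (float lane : lanes) acc += lane;
    return acc;
}

// Chosen once at startup from the result of a CPU feature check.
inline SumFn select_sum(bool cpu_has_required_avx512) {
    return cpu_has_required_avx512 ? sum_wide : sum_baseline;
}
```

The important part is that nothing built with AVX-512 enabled gets called before the check, which is why the wide code has to live in a separately compiled unit rather than just behind an `if`.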