Question Intel Mont thread


poke01

Golden Member
Mar 8, 2022
1,336
1,505
106
The problem with IPC is that it heavily depends on the test suite. And Geekbench 6.3 is not the best one for comparing different platforms. Actually, in some cases, you can't compare results within one platform (SME).

The only viable metric is the actual performance of the apps you're using at a fixed power limit.
This is why SPEC exists, and we have SPEC results for the M4. It is the industry standard, which is why Intel and Arm use it.

Also, why do you always harp on SME? AVX-512 also bumps the score in every Geekbench version past 5.1, so one could just as well say you can't compare any Zen 4/5 Geekbench score to any Intel/Apple CPU.

Even AMD cherry-picked the Geekbench subtests. And guess what: Intel CPUs also support AVX-VNNI, which Geekbench 6 uses and which also bumps up the score.
With Lunar Lake supporting AVX-VNNI, its Geekbench score will also go up.
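
For what it's worth, here is a minimal sketch of the kind of INT8 dot-product kernel VNNI speeds up, assuming GCC or Clang with -mavxvnni (the kernel and the reduction at the end are illustrative, not Geekbench's actual code):

```c
#include <immintrin.h>
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Dot product of n u8*s8 pairs; n must be a multiple of 32.
   VPDPBUSD multiplies u8 by s8 and sums groups of 4 into i32 lanes. */
static int32_t dot_u8s8(const uint8_t *a, const int8_t *b, size_t n) {
    __m256i acc = _mm256_setzero_si256();
    for (size_t i = 0; i < n; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        acc = _mm256_dpbusd_avx_epi32(acc, va, vb);
    }
    /* Horizontal sum of the eight 32-bit lanes. */
    __m128i s = _mm_add_epi32(_mm256_castsi256_si128(acc),
                              _mm256_extracti128_si256(acc, 1));
    s = _mm_hadd_epi32(s, s);
    s = _mm_hadd_epi32(s, s);
    return _mm_cvtsi128_si32(s);
}

int main(void) {
    static uint8_t a[32];
    static int8_t b[32];
    for (int i = 0; i < 32; i++) { a[i] = 1; b[i] = 2; }
    printf("%d\n", dot_u8s8(a, b, 32)); /* expect 64 */
    return 0;
}
```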

It looks like you just want to discredit the lead Apple has in IPC, which is a big one.
 

TwistedAndy

Member
May 23, 2024
98
68
46
This is why SPEC exists, and we have SPEC results for the M4. It is the industry standard, which is why Intel and Arm use it.

Also, why do you always harp on SME? AVX-512 also bumps the score in every Geekbench version past 5.1, so one could just as well say you can't compare any Zen 4/5 Geekbench score to any Intel/Apple CPU.

Even AMD cherry-picked the Geekbench subtests. And guess what: Intel CPUs also support AVX-VNNI, which Geekbench 6 uses and which also bumps up the score.
With Lunar Lake supporting AVX-VNNI, its Geekbench score will also go up.

It looks like you just want to discredit the lead Apple has in IPC, which is a big one.
When you compare two platforms and want to be objective, you need to ensure that both platforms are under the same conditions. SME gives one platform a benefit over the other, and it is used in three tests.

AVX-512 VNNI, as well as AMX, is supported only on some Intel server CPUs and is used in two tests. The more generic AVX-512 set is available on some Intel server processors and recent AMD ones, but not on consumer Intel products (except Tiger Lake). That does not make the comparison more objective either.

All those cases make GB6 results incomparable not only across platforms (Arm and x86) but within each platform as well. You simply can't compare them objectively, and this is what the Geekbench team wrote in the release notes for 6.3.

You may argue that some applications can benefit from SME, AVX-512, etc., and you are right. But in that case, it makes sense to measure the performance of those apps directly. That will be far closer to real life than synthetic tests.
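
To make that concrete, a minimal sketch of "measuring the app directly": wall-clock time around the real task, compared at the same power limit. This assumes a POSIX system, and run_workload() is a hypothetical stand-in for whatever you actually run:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the real task you care about. */
static void run_workload(void) {
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 100000000UL; i++) x += i;
}

int main(void) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);  /* immune to wall-clock changes */
    run_workload();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("wall time: %.3f s\n", s);     /* compare at equal power caps */
    return 0;
}
```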

Another problem is how relevant those tests are to real work. Let's take SPEC CPU 2017 as an example.

How often do you use Perl in 2024? How frequently do you compile C? Also, I don't think you model explosions, atmosphere, fluids, weather, etc.

Another group of tests is related to AI, media processing, rendering, etc. Those tasks are, in most cases, handled by the GPU, media accelerators, and dedicated NPUs. There's no reason to use a CPU for that when a GPU is tens of times faster.

You may argue that those numbers represent real-life performance, but they don't. The only case in which it makes sense to check those benchmarks is comparing results within one platform (Intel, AMD, etc.). That may give you some estimate of the real-life performance boost if you decide to upgrade.

Let's take PHP as an example. It is heavily optimized for Intel CPUs. As a result, it runs two times faster on an Intel Core i5-12600 than on an Apple M2: PHPBench (a single-threaded PHP test).

Is it a fair comparison? No. But if you need to run a lot of PHP code, the answer is obvious.

If we take Python as another example, the situation is nearly the same: PyBench chart. The Apple M2 is nearly two times slower than the Intel CPUs in the single-threaded test.

Redis? Same story: Test Results

Obviously, there are many cases when you can't use an app or a game simply because it's not available on your platform. In that case, performance does not matter at all.

So, we end up with the conclusion that the only viable metric is the actual performance of the apps you're using at a fixed power limit.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
When you compare two platforms and want to be objective, you need to ensure that both platforms are under the same conditions. SME gives one platform a benefit over the other, and it is used in three tests.
And that test will show you how much SME can bring. You just have to be sure your application can benefit from it. And it helped show that SME seems to bring more performance than AMX (which matters to people who care about matrix computations).

AVX-512 VNNI, as well as AMX, is supported only on some Intel server CPUs and is used in two tests. The more generic AVX-512 set is available on some Intel server processors and recent AMD ones, but not on consumer Intel products (except Tiger Lake). That does not make the comparison more objective either.
Same here. I want a CPU with AVX-512 because I've seen what it brings in some benchmarks that I know have similarities to code I develop as a hobby. So my next CPU will be an AMD one. I thank benchmarks for showing me this.

All those cases make GB6 results incomparable not only across platforms (Arm and x86) but within each platform as well. You simply can't compare them objectively, and this is what the Geekbench team wrote in the release notes for 6.3.
Some subtests of GB are definitely useful and can be compared across OSes, ISAs, and platforms (Clang, for instance).

How often do you use Perl in 2024? How frequently do you compile C? Also, I don't think you model explosions, atmosphere, fluids, weather, etc.
SPEC Perl shows things about branch prediction that apply to most end-user code.
Quite likely the FP examples you list have some commonalities with code in games.
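
As a toy illustration of that branch-prediction point (a sketch, assuming a C compiler; at high optimization levels the compiler may emit branchless code and hide the effect):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Same work either way; only the predictability of the branch changes. */
static double timed_sum(const int *data, int n, int passes) {
    struct timespec t0, t1;
    long long sum = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int p = 0; p < passes; p++)
        for (int i = 0; i < n; i++)
            if (data[i] >= 128)   /* random data: ~50% mispredicts */
                sum += data[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("(sum=%lld) ", sum);   /* keep the work observable */
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void) {
    enum { N = 65536, PASSES = 2000 };
    static int data[N];
    for (int i = 0; i < N; i++) data[i] = rand() % 256;
    printf("unsorted: %.3f s\n", timed_sum(data, N, PASSES));
    qsort(data, N, sizeof data[0], cmp_int);
    printf("sorted:   %.3f s\n", timed_sum(data, N, PASSES));
    return 0;
}
```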

Another group of tests is related to AI, media processing, rendering, etc. Those tasks are, in most cases, handled by the GPU, media accelerators, and dedicated NPUs. There's no reason to use a CPU for that when a GPU is tens of times faster.
And when you end up with a hardware media accelerator that doesn't support the shiny new compression format, you'll end up on the CPU, where the profile of software encoders/decoders measured by benchmarks starts being relevant.

So, we end up with the conclusion that the only viable metric is the actual performance of the apps you're using at a fixed power limit.
Yes, agreed. And that's the most important remark. It applies perfectly well to GPU reviews: pick the game you're interested in; the rest most likely doesn't tell you much.

But I'll repeat it again: benchmarks are very useful as proxies for many things when designing CPUs, or for educated end-users looking for something in particular.
 

TwistedAndy

Member
May 23, 2024
98
68
46
Ultimately, it all comes down to the purpose of those benchmarks.

Let's take SME as an example. If you want to measure its performance, it makes sense to use a specialized benchmark or read some research articles. It allows achieving 16 TOPS on INT8. That's a pretty good result, but does it make sense to use a CPU for matrix operations?

This is a good question, considering that SME support is limited to the Apple M4. It already has a 38 TOPS NPU, which can do all those operations a few times faster while consuming less power than a CPU with all cores fully loaded. That's actually why other companies like AMD, Qualcomm, Arm, etc. do not add matrix extension support.
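
For context, the operation being argued about is just a quantized matrix multiply. A scalar reference sketch (nobody's actual kernel) looks like this; SME/AMX-class tile units exist to retire many of these multiply-accumulates per cycle, which is the gap they close over plain scalar code:

```c
#include <stdint.h>
#include <stdio.h>

#define M 2
#define N 2
#define K 4

/* C(i32) += A(i8) * B(i8): one multiply-accumulate per instruction
   in scalar code, many per cycle on a matrix/tile unit. */
static void matmul_s8(int32_t c[M][N], int8_t a[M][K], int8_t b[K][N]) {
    for (int i = 0; i < M; i++)
        for (int j = 0; j < N; j++)
            for (int p = 0; p < K; p++)
                c[i][j] += (int32_t)a[i][p] * (int32_t)b[p][j];
}

int main(void) {
    int8_t a[M][K] = {{1, 2, 3, 4}, {5, 6, 7, 8}};
    int8_t b[K][N] = {{1, 0}, {0, 1}, {1, 0}, {0, 1}};
    int32_t c[M][N] = {{0}};
    matmul_s8(c, a, b);
    printf("%d %d / %d %d\n", c[0][0], c[0][1], c[1][0], c[1][1]);
    return 0;
}
```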

As for Clang, it's a useful test if you're actually using Clang to compile your code. These results can't be applied to other languages like Java, PHP, Python, Ruby, JS, etc. From my perspective, the Node.js V8 Tooling test is much more useful here.

As for media encoding and unsupported media formats, it makes sense to test those unsupported formats directly.

So, in general, we end up with the same conclusion.
 

FlameTail

Diamond Member
Dec 15, 2021
3,122
1,786
106
If you listened to Arm's Computex keynote/Q&A, they explained SME/SVE/KleidiAI. In short, the virtue of adding SME to the CPU is that the CPU is the most general-purpose processor, and unless an application is specifically programmed to be accelerated by the NPU, it will run on the CPU. That's where the benefit of adding SME to the CPU lies.

There is also a benefit to doing matrix operations on the SME unit instead of the NPU: the SME unit has lower latency than going out to the NPU. This is great for matrix operations that are low-throughput but latency-sensitive.
 

poke01

Golden Member
Mar 8, 2022
1,336
1,505
106
This is a good question, considering that SME support is limited to the Apple M4. It already has a 38 TOPS NPU, which can do all those operations a few times faster while consuming less power than a CPU with all cores fully loaded. That's actually why other companies like AMD, Qualcomm, Arm, etc. do not add matrix extension support.
AMD added AVX-VNNI in Zen 5, and Intel extended AVX-VNNI to support INT8 in Lunar Lake. These extensions ultimately help with AI workloads.
 

TwistedAndy

Member
May 23, 2024
98
68
46
If you listened to Arm's Computex keynote/Q&A, they explained SME/SVE/KleidiAI. In short, the virtue of adding SME to the CPU is that the CPU is the most general-purpose processor, and unless an application is specifically programmed to be accelerated by the NPU, it will run on the CPU. That's where the benefit of adding SME to the CPU lies.

There is also a benefit to doing matrix operations on the SME unit instead of the NPU: the SME unit has lower latency than going out to the NPU. This is great for matrix operations that are low-throughput but latency-sensitive.

The Arm Cortex-X925 will not support SME, according to the technical specs.

As for Intel's Skymont, it will support AVX-VNNI. I don't think we will see extensive support for it in actual apps, because most developers will focus on NPUs and the appropriate APIs (like DirectML).

Also, it looks strange to add a lot of silicon for a powerful NPU and then promote running matrix operations on the CPU. I'm sure there will be some kind of legacy mode where simple models run on the CPU, but that will mostly happen when there is no NPU.
 
Reactions: Nothingness

poke01

Golden Member
Mar 8, 2022
1,336
1,505
106
The Arm Cortex-X925 will not support SME, according to the technical specs.
They don't, but they should. They seem to love writing articles and talking about it. Anyway, the SME stuff should go in the Arm thread.
Also, it looks strange to add a lot of silicon for a powerful NPU and then promote running matrix operations on the CPU. I'm sure there will be some kind of legacy mode where simple models run on the CPU, but that will mostly happen when there is no NPU.
But Intel added these extensions to Lunar Lake as well, where there will always be an NPU present. There might be some benefit to running AI workloads on a CPU.



 

TwistedAndy

Member
May 23, 2024
98
68
46
But Intel added these extensions to Lunar Lake as well, where there will always be an NPU present. There might be some benefit to running AI workloads on a CPU.

The case with Intel is pretty funny. It added those extensions mostly for the server market, where Intel wants to compete with Nvidia in the ML niche. Consumer SKUs got those extensions more as a bonus.
 

The Hardcard

Member
Oct 19, 2021
113
159
86
The Arm Cortex-X925 will not support SME, according to the technical specs.

As for Intel's Skymont, it will support AVX-VNNI. I don't think we will see extensive support for it in actual apps, because most developers will focus on NPUs and the appropriate APIs (like DirectML).

Also, it looks strange to add a lot of silicon for a powerful NPU and then promote running matrix operations on the CPU. I'm sure there will be some kind of legacy mode where simple models run on the CPU, but that will mostly happen when there is no NPU.
CPU matrix operations will still be used even with an NPU present. There are latency and power penalties for going off the CPU, so small, simple tasks will stay on the CPU. The NPU opens up the possibility of doing somewhat larger tasks for a little more power.

GPUs will still handle the biggest tasks, though with a much bigger power penalty.
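
A toy sketch of that split; every backend name and threshold below is hypothetical, and real cutoffs would come from measuring dispatch latency and perf/W on the actual hardware:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical backends; real ones would call a BLAS, an NPU runtime,
   or a GPU library. */
static void matmul_cpu(size_t m, size_t n, size_t k) { printf("CPU: %zux%zux%zu\n", m, n, k); }
static void matmul_npu(size_t m, size_t n, size_t k) { printf("NPU: %zux%zux%zu\n", m, n, k); }
static void matmul_gpu(size_t m, size_t n, size_t k) { printf("GPU: %zux%zux%zu\n", m, n, k); }

/* Small, latency-sensitive ops stay on the CPU; larger ones amortize
   the cost of going off-core. Thresholds here are made up. */
static void matmul_dispatch(size_t m, size_t n, size_t k) {
    size_t macs = m * n * k;   /* multiply-accumulates in this op */
    if (macs < (size_t)1 << 18)
        matmul_cpu(m, n, k);
    else if (macs < (size_t)1 << 28)
        matmul_npu(m, n, k);
    else
        matmul_gpu(m, n, k);
}

int main(void) {
    matmul_dispatch(16, 16, 64);        /* tiny: stays on the CPU  */
    matmul_dispatch(512, 512, 512);     /* mid: worth the NPU trip */
    matmul_dispatch(4096, 4096, 4096);  /* huge: GPU throughput    */
    return 0;
}
```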
 

TwistedAndy

Member
May 23, 2024
98
68
46
CPU matrix operations will still be used even with an NPU present. There are latency and power penalties for going off the CPU, so small, simple tasks will stay on the CPU. The NPU opens up the possibility of doing somewhat larger tasks for a little more power.

GPUs will still handle the biggest tasks, though with a much bigger power penalty.

This situation with matrix extensions looks pretty funny.

When an app rarely uses the matrix extensions, there will be no noticeable difference between CPUs with and without them.

When an app uses them a lot, it makes sense to offload those tasks to the NPU or GPU and get a significant improvement in speed and power consumption. Usually, the actual models are pretty big, and it makes sense to use the NPU or GPU anyway.

In Intel's case, we got AVX-VNNI mainly because it was added at the start of the AI hype, when Intel tried to jump on that train.
 

FlameTail

Diamond Member
Dec 15, 2021
3,122
1,786
106
When an app uses them a lot, it makes sense to offload those tasks to the NPU or GPU and get a significant improvement in speed and power consumption. Usually, the actual models are pretty big, and it makes sense to use the NPU or GPU anyway.

Not all matrix operations are LLMs.
 
Jul 27, 2020
17,712
11,499
106


As someone who has done database benchmarking, I don't know how Intel managed to pull a rabbit out of their dingy hat and deliver the lowest latency in a PGSQL workload AND beat out Genoa-X at the same time without busting their power budget. This CPU has NOT been designed by the usual suspects at Intel!

Yes, Turin will probably beat it, but it's not available to defend AMD's honor right NOW.
 