Question Geekbench 6 released and calibrated against Core i7-12700


TwistedAndy

Member
May 23, 2024
139
104
71
I've found interesting results illustrating the impact of SME on the Apple M4.

Here is a comparison of two runs on the same Apple M4 device by the same user:

View attachment 101848
The only difference is the test version: Geekbench 6.3 supports SME; Geekbench 6.2 does not.

Also, it's pretty obvious which tests are affected:

View attachment 101850
And there's no way to filter Geekbench results by the benchmark version.

Garbage benchmark.
 

Eug

Lifer
Mar 11, 2000
23,753
1,309
126
I've found interesting results illustrating the impact of SME on the Apple M4.

Here is a comparison of two runs on the same Apple M4 device by the same user:

View attachment 101848

The only difference is the test version: Geekbench 6.3 supports SME; Geekbench 6.2 does not.

Also, it's pretty obvious which tests are affected:

View attachment 101850

And there's no way to filter Geekbench results by the benchmark version.

Garbage benchmark.
We had this discussion weeks ago. We also calculated the impact of SME. In addition, Geekbench 5 results are available for M4.
 

TwistedAndy

Member
May 23, 2024
139
104
71
We had this discussion weeks ago. We also calculated the impact of SME.

Yep. Now we have actual results from the same device and OS version.

There are some other results run on iOS 18 with GB 6.2 (3476), but they are not much different.

Unfortunately, Geekbench does not allow us to filter results by the benchmark version.

Luckily, the results for Geekbench 5 are still available. They are way more accurate and closer to SPEC.
 
Jul 27, 2020
18,002
11,728
116
Geekbench does use many libraries that are used in both commercial and open source applications.
For compression, LZ4/ZSTD: Mostly used in Linux?

Navigation: How many self-taught developers, who mostly learned by shooting themselves in the foot and figuring out what NOT to do just to get software to compile, know about Dijkstra's algorithm? I bet most non-CS-background programmers think the problem is easy and would just brute-force such routes.
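For anyone unfamiliar with it, here is a minimal sketch of Dijkstra's algorithm with a binary heap, which is why route-finding doesn't need brute-force enumeration. This is purely illustrative; Geekbench's Navigation workload is closed source, so this is not its code.

```cpp
// Minimal sketch of Dijkstra's algorithm with a binary heap.
// Illustrative only -- not Geekbench's Navigation code.
#include <cstdint>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

using Edge = std::pair<int, uint64_t>;  // (neighbour, edge weight)

std::vector<uint64_t> dijkstra(const std::vector<std::vector<Edge>>& adj, int src) {
    const uint64_t INF = std::numeric_limits<uint64_t>::max();
    std::vector<uint64_t> dist(adj.size(), INF);
    // Min-heap ordered by tentative distance.
    using Item = std::pair<uint64_t, int>;  // (distance so far, node)
    std::priority_queue<Item, std::vector<Item>, std::greater<>> heap;
    dist[src] = 0;
    heap.push({0, src});
    while (!heap.empty()) {
        auto [d, u] = heap.top();
        heap.pop();
        if (d != dist[u]) continue;      // stale heap entry, skip it
        for (auto [v, w] : adj[u]) {     // relax each outgoing edge
            if (d + w < dist[v]) {
                dist[v] = d + w;
                heap.push({dist[v], v});
            }
        }
    }
    return dist;
}
```

With a heap this runs in roughly O((V + E) log V) time, versus the exponential blow-up of enumerating every possible route.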

HTML5 browser: Why doesn't GB just use a full open source browser for this workload instead of trying to mimic one?

PDF Render: Again, it's using only the PDF component in isolation and not simulating how a browser with other tabs open would generate a workload or how a browser would first launch, load its engine into RAM and then display the PDF.

Photo Library: Mimicry instead of using real downloadable software for such user tasks.

Clang: This is a more real world workload but is it testing compilation only or JIT compilation too? Not clear if it's doing both. Is it running some workload in Lua after compiling it?

Text processing: What kind of convoluted mess is this? Why is there an encrypted in-memory file system being used? Which real text processing software does this? And using Python that can never achieve true parallelism due to its GIL issue? How many normal non-developer folks use Python for multicore workloads in their daily lives?

Asset compression: This is a workload that is even more limited in scope as it targets a specific species of developer: the Game Developer, renowned for the ability to CRUNCH, CRUNCH, CRUNCH with minimum sleep and maximum caffeine. What is this workload going to tell the average user?

Object Detection: Why not use a real open source software for this purpose?

Background Blur: Not a bad workload but only 10 frames? How can those 10 frames capture the rapid head movement of a typical user on cam?

Object Remover: Again, why not use Gimp?

Horizon Detection: Why no Gimp?

Photo filter: Doesn't mention a library so maybe written from scratch?

HDR: This is also not using a library and written from scratch meaning it may not be reflective of HDR operations done in real applications.

Raytracer: No issues with this since it's using Embree.

Structure from Motion: No mention of library. Custom code.

And the remaining tests don't mention any libraries either.

This benchmark is mostly a mishmash of custom code with the occasional widely used library. How can it possibly reflect real world usage?

I pray that your much-awaited eekBench does not suffer from the above mentioned pitfalls.
 

Nothingness

Platinum Member
Jul 3, 2013
2,769
1,429
136
For compression, LZ4/ZSTD: Mostly used in Linux?

Navigation: How many self-taught developers, who mostly learned by shooting themselves in the foot and figuring out what NOT to do just to get software to compile, know about Dijkstra's algorithm? I bet most non-CS-background programmers think the problem is easy and would just brute-force such routes.

HTML5 browser: Why doesn't GB just use a full open source browser for this workload instead of trying to mimic one?

PDF Render: Again, it's using only the PDF component in isolation and not simulating how a browser with other tabs open would generate a workload or how a browser would first launch, load its engine into RAM and then display the PDF.

Photo Library: Mimicry instead of using real downloadable software for such user tasks.

Clang: This is a more real world workload but is it testing compilation only or JIT compilation too? Not clear if it's doing both. Is it running some workload in Lua after compiling it?

Text processing: What kind of convoluted mess is this? Why is there an encrypted in-memory file system being used? Which real text processing software does this? And using Python that can never achieve true parallelism due to its GIL issue? How many normal non-developer folks use Python for multicore workloads in their daily lives?

Asset compression: This is a workload that is even more limited in scope as it targets a specific species of developer: the Game Developer, renowned for the ability to CRUNCH, CRUNCH, CRUNCH with minimum sleep and maximum caffeine. What is this workload going to tell the average user?

Object Detection: Why not use a real open source software for this purpose?

Background Blur: Not a bad workload but only 10 frames? How can those 10 frames capture the rapid head movement of a typical user on cam?

Object Remover: Again, why not use Gimp?

Horizon Detection: Why no Gimp?

Photo filter: Doesn't mention a library so maybe written from scratch?

HDR: This is also not using a library and written from scratch meaning it may not be reflective of HDR operations done in real applications.

Raytracer: No issues with this since it's using Embree.

Structure from Motion: No mention of library. Custom code.

And the remaining tests don't mention any libraries either.

This benchmark is mostly a mishmash of custom code with the occasional widely used library. How can it possibly reflect real world usage?

I pray that your much-awaited eekBench does not suffer from the above mentioned pitfalls.
Can't wait to see Geekbench become a 1 GB bundle of real application code, and then to see people complain that it's not using the very latest version of their preferred app.

Benchmarks such as Geekbench are proxies for existing applications. If you want to measure the speed of GIMP, then ask the GIMP community to design a specific benchmark for GIMP. It's the same as people saying, "Oh look, that SPEC x264 is stupid, it doesn't use AVX-4096!!!!!!!1111111111".
 
Reactions: Orfosaurio

Doug S

Platinum Member
Feb 8, 2020
2,508
4,113
136
For compression, LZ4/ZSTD: Mostly used in Linux?

Navigation: How many self-taught developers, who mostly learned by shooting themselves in the foot and figuring out what NOT to do just to get software to compile, know about Dijkstra's algorithm? I bet most non-CS-background programmers think the problem is easy and would just brute-force such routes.

HTML5 browser: Why doesn't GB just use a full open source browser for this workload instead of trying to mimic one?

PDF Render: Again, it's using only the PDF component in isolation and not simulating how a browser with other tabs open would generate a workload or how a browser would first launch, load its engine into RAM and then display the PDF.

Photo Library: Mimicry instead of using real downloadable software for such user tasks.

Clang: This is a more real world workload but is it testing compilation only or JIT compilation too? Not clear if it's doing both. Is it running some workload in Lua after compiling it?

Text processing: What kind of convoluted mess is this? Why is there an encrypted in-memory file system being used? Which real text processing software does this? And using Python that can never achieve true parallelism due to its GIL issue? How many normal non-developer folks use Python for multicore workloads in their daily lives?

Asset compression: This is a workload that is even more limited in scope as it targets a specific species of developer: the Game Developer, renowned for the ability to CRUNCH, CRUNCH, CRUNCH with minimum sleep and maximum caffeine. What is this workload going to tell the average user?

Object Detection: Why not use a real open source software for this purpose?

Background Blur: Not a bad workload but only 10 frames? How can those 10 frames capture the rapid head movement of a typical user on cam?

Object Remover: Again, why not use Gimp?

Horizon Detection: Why no Gimp?

Photo filter: Doesn't mention a library so maybe written from scratch?

HDR: This is also not using a library and written from scratch meaning it may not be reflective of HDR operations done in real applications.

Raytracer: No issues with this since it's using Embree.

Structure from Motion: No mention of library. Custom code.

And the remaining tests don't mention any libraries either.

This benchmark is mostly a mishmash of custom code with the occasional widely used library. How can it possibly reflect real world usage?

I pray that your much-awaited eekBench does not suffer from the above mentioned pitfalls.


Maybe you should read the rather detailed PDF John Poole provides about Geekbench before compiling this list, since a lot of it is irrelevant. For example, he tells you exactly what the Clang benchmark does (it does not include JIT compilation or running Lua), the object detection does use open-source code, etc.

Regarding HTML5, I don't think you've considered how useless a cross-platform benchmark would be that actually RAN a browser. If it ran Firefox, for example, it would be running a ton of OS-specific code (and using the GPU) to display the GUI, which would invalidate its numbers between Windows and Mac, between Android and iPhone, etc. It wouldn't be an "HTML5 test"; it would be a test of how efficient the operating system's memory management was as the browser started up and how fast it could draw the UI, and any actual HTML5 interpretation would be completely drowned out.

The same applies for "why not use GIMP". GIMP probably supports an option to run without the UI, but that doesn't mean it isn't going to be doing a ton of various startup tasks just to get to the point where it can do "object remover" etc. All that startup is going to make the results much more dependent on the operating system and be a poor test of the CPU.

Hopefully eek understands the "pitfalls" of benchmarking far far better than you do! I imagine @SarahKerrigan could expand greatly on my list of objections to your list of "pitfalls".
 

whoshere

Junior Member
Feb 28, 2020
21
46
91
For compression, LZ4/ZSTD: Mostly used in Linux?

GB is closed source and proprietary for one very important reason: the company behind it doesn't want CPU OEMs to optimize specifically for its workloads and win unfairly, and if GB started using open-source components, that's exactly what would happen. That's why almost all popular benchmarks are closed source.
 
Jul 27, 2020
18,002
11,728
116
GB is closed source and proprietary for one very important reason: the company behind it doesn't want CPU OEMs to optimize specifically for its workloads and win unfairly, and if GB started using open-source components, that's exactly what would happen. That's why almost all popular benchmarks are closed source.
Then I guess every win on Phoronix Test Suite must be a cheat.
 

Nothingness

Platinum Member
Jul 3, 2013
2,769
1,429
136
GB is closed source and proprietary for one very important reason: the company behind it doesn't want CPU OEMs to optimize specifically for its workloads and win unfairly, and if GB started using open-source components, that's exactly what would happen. That's why almost all popular benchmarks are closed source.
You don't need sources to tune a CPU for a benchmark.
 

TwistedAndy

Member
May 23, 2024
139
104
71
GB is closed source and proprietary for one very important reason: the company behind it doesn't want CPU OEMs to optimize specifically for its workloads and win unfairly, and if GB started using open-source components, that's exactly what would happen. That's why almost all popular benchmarks are closed source.
We already have that with Apple SME support (+10% boost).
 
Jul 27, 2020
18,002
11,728
116
Regarding HTML5, I don't think you've considered how useless a cross-platform benchmark would be that actually RAN a browser. If it ran Firefox, for example, it would be running a ton of OS-specific code (and using the GPU) to display the GUI, which would invalidate its numbers between Windows and Mac, between Android and iPhone, etc. It wouldn't be an "HTML5 test"; it would be a test of how efficient the operating system's memory management was as the browser started up and how fast it could draw the UI, and any actual HTML5 interpretation would be completely drowned out.
It's not like current GB scores are OS agnostic.
 
Jul 27, 2020
18,002
11,728
116
But hey only Apple has additional help right....
What we don't know is how easy it is to write good Apple SME code vs. good AVX-512 code. If AVX-512 is just harder to write for by its very nature, x86 would be at a disadvantage in this benchmark because its AVX-512 code would end up nearly useless. I'm not even sure the speed-up offered by these enhanced image functions is worth engaging the AVX-512 units.

TRANSPARENCY is important and the GB developer doesn't reveal such important details.
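For a sense of scale on the "how hard is AVX-512 to write" question, here is a toy hand-written AVX-512 loop using the standard immintrin.h intrinsics. It is only a sketch of the style; Geekbench's actual image-processing code is not public, and nothing here is taken from it.

```cpp
// Toy AVX-512 loop: add a brightness offset to float pixels and clamp to 1.0,
// processing 16 lanes per iteration. Illustrative only -- not Geekbench code.
#include <immintrin.h>
#include <cstddef>

void brighten_avx512(float* px, std::size_t n, float delta) {
    const __m512 vdelta = _mm512_set1_ps(delta);
    const __m512 vone   = _mm512_set1_ps(1.0f);
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 v = _mm512_loadu_ps(px + i);   // load 16 floats (unaligned ok)
        v = _mm512_add_ps(v, vdelta);         // add brightness offset
        v = _mm512_min_ps(v, vone);           // clamp to 1.0
        _mm512_storeu_ps(px + i, v);          // store back
    }
    for (; i < n; ++i) {                      // scalar tail for the leftovers
        float v = px[i] + delta;
        px[i] = v > 1.0f ? 1.0f : v;
    }
}
```

Built with -mavx512f (GCC/Clang) this handles 16 floats per iteration; whether Geekbench's own AVX-512 paths look anything like this is exactly the transparency question being raised.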
 

Doug S

Platinum Member
Feb 8, 2020
2,508
4,113
136
What we don't know is how easy it is to write good Apple SME code vs. good AVX-512 code. If AVX-512 is just harder to write for by its very nature, x86 would be at a disadvantage in this benchmark because its AVX-512 code would end up nearly useless. I'm not even sure the speed-up offered by these enhanced image functions is worth engaging the AVX-512 units.

TRANSPARENCY is important and the GB developer doesn't reveal such important details.

The idea that a benchmark is biased against x86 is just laughably absurd. When provided proof that there are plenty of optimizations added for x86, you're now trying to hang your hat on an even more absurd argument: that it is "hard to write good AVX-512 code". Never mind that AVX-512 has been around for years and there are plenty of code examples on the net, so you don't even have to write it yourself for common functions.

If it actually was true that writing SME code was much easier than writing AVX-512 code, wouldn't that be an indication of strength for ARM - that real world code would be more likely to realize those benefits?
 