I always felt like Geekbench was trash. Especially gb5, that was insanely, absurdly memory bound. I remember just tuning my memory increased my gb5 score by 40%, lol.
Geekbench 6 is not that different in this respect. Let's take a closer look at the technical details of the
tests included in Geekbench 6:
1. File Compression
Almost useless. Instead of LZ4 and ZSTD, it would make more sense to use deflate (gzip/zip), which is used everywhere on the web and system-wide.
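To make the deflate point concrete, here is a rough sketch of what a deflate-based compression test could look like, using Python's standard zlib module (which implements deflate, just with a zlib wrapper instead of the gzip/zip container). The function name, the sample payload, and the repeat count are all my own assumptions for illustration; this is not how Geekbench measures anything.

```python
import time
import zlib


def deflate_throughput(data: bytes, level: int = 6, repeats: int = 5) -> dict:
    """Time zlib (deflate) compression and decompression of `data`.

    Hypothetical helper for illustration only. Reports the best
    (fastest) run out of `repeats` to reduce timing noise.
    """
    best_compress = float("inf")
    best_decompress = float("inf")
    compressed = b""
    for _ in range(repeats):
        t0 = time.perf_counter()
        compressed = zlib.compress(data, level)
        t1 = time.perf_counter()
        restored = zlib.decompress(compressed)
        t2 = time.perf_counter()
        # Sanity check: the round trip must be lossless.
        assert restored == data
        best_compress = min(best_compress, t1 - t0)
        best_decompress = min(best_decompress, t2 - t1)

    megabytes = len(data) / 1e6
    return {
        "ratio": len(data) / len(compressed),
        # max() guards against a zero interval on very small inputs.
        "compress_mb_s": megabytes / max(best_compress, 1e-9),
        "decompress_mb_s": megabytes / max(best_decompress, 1e-9),
    }


if __name__ == "__main__":
    # Synthetic, highly repetitive payload (an assumption); real files
    # such as HTML, JSON, or logs would give more meaningful numbers.
    sample = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 50_000
    print(deflate_throughput(sample))
```

The numbers depend heavily on the input data and the compression level, so this kind of test only means something if the payload resembles what you actually compress.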
2. Navigation
Useless. Both Google and Apple Maps usually do not perform those calculations on the device. Even in the case of offline navigation, that test is not representative because different apps are used.
3. HTML5 Browser
It's not representative. Instead of using an actual headless browser or Node.js, Geekbench uses some libraries to parse and render HTML/CSS. Surprisingly, the most compute-heavy part (JS) is not included.
4. PDF Render
The first useful test!
5. Photo Library
It's a very weird test. Instead of measuring some useful things like JPG/PNG/WebP scaling/compression/decompression, they added a lot of other useless steps like running an image classification model and storing tags in the SQL database.
6. Developer workloads
- Clang. It's not representative because Clang is used mostly on Apple platforms and some Unix systems. Technically, it's possible to use Clang on Windows or Linux, but it's not a common scenario: Windows uses MSVC by default, and Linux uses GCC.
In general, this benchmark represents compile performance in Xcode on Apple devices and is not relevant for other platforms.
- Text Processing. It looks to be a valid benchmark, but it would make more sense to process all those files using Node.js, the JVM, or PHP, which is a more common scenario.
- Asset Compression. I do not work with 3D assets, but it looks to be valid.
7. Machine Learning Workloads
Despite having ML in the name, it's a useless benchmark. The actual apps that detect objects, blur backgrounds, remove objects, etc., usually use the GPU or NPU for that. Also, these kinds of workloads are very sensitive to code optimization and to the libraries and frameworks used.
8. Horizon Detection, Photo Filter, and HDR
These tasks also heavily depend on code optimization and the libraries used. The performance in different apps may be very different from the results in this section.
9. Ray Tracer
Useless. In most cases, the GPU does the ray tracing calculations and scene rendering. There are some cases when it makes sense to do that on the CPU, but that's an exception.
10. Structure from Motion
It looks to be a valid benchmark, but I'm not sure how frequently it is used.
Summary
Geekbench 6 is not the best benchmark available. Many tests heavily depend on the actual implementation and platform optimization, not to mention SME, AVX-512, etc.
Instead of using open and widely used tools, libraries, and apps like Node.js, Electron, Blender, JVM, etc., it uses some custom implementations for very questionable tasks.
Cinebench R23 and 2024, on the other hand, measure only one specific task, but they are based on real commercial software.