So this is one of those challenges for which I ran several quick but fairly conclusive optimization tests upfront. Of course, it's the first time that I have Zen 4 in the game, so that one was worth testing a bit. I even tested Broadwell-EP once more, even though it is comparatively straightforward to configure at both the hardware platform level and the software level. Quite a bit of the testing concerned Zen 2, which is handicapped in llrESP by the small size of its CCXs.
After the fact, two items came to my mind, or were brought to my attention elsewhere, which I haven't explored yet:
- Especially on Zen 2, would it be worthwhile to configure logical CPU affinity not just on the level of a whole task, but for each single thread of a task?
- On Zen 4, what if AVX-512 is disabled in the BIOS? The way AMD implemented it, it theoretically offers a chance of good gains over AVX2, but also some potential for regressions. (link to a link)
The latter can be figured out almost as quickly as the computer can be rebooted (which isn't all that quick), so I might check this out some time after the challenge. The former needs a bit more work.
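For the per-thread affinity idea, here is a minimal sketch of what it could look like on Linux. It is untested against llrESP and rests on a few assumptions: the helper names (`list_thread_ids`, `plan_thread_affinity`, `pin_threads`) are made up for illustration, the threads of a task are taken from `/proc/<pid>/task`, and every thread found there is treated as a worker thread, which may not hold for a real task.

```python
import os
from pathlib import Path


def list_thread_ids(pid):
    """Return the sorted thread IDs of a process, read from /proc/<pid>/task (Linux only)."""
    return sorted(int(entry.name) for entry in Path(f"/proc/{pid}/task").iterdir())


def plan_thread_affinity(tids, cpus):
    """Assign each thread one logical CPU, cycling through cpus if threads outnumber them."""
    cpus = list(cpus)
    if not cpus:
        raise ValueError("need at least one logical CPU")
    return {tid: cpus[i % len(cpus)] for i, tid in enumerate(tids)}


def pin_threads(pid, cpus):
    """Pin every currently existing thread of pid to a single logical CPU from cpus."""
    plan = plan_thread_affinity(list_thread_ids(pid), cpus)
    for tid, cpu in plan.items():
        # On Linux the underlying syscall works per thread, so a thread ID is accepted here.
        os.sched_setaffinity(tid, {cpu})
    return plan
```

A call would look like `pin_threads(12345, [0, 1, 2, 3])`, where 12345 is a placeholder PID and logical CPUs 0 to 3 are assumed to sit in the same CCX (the actual numbering needs to be checked per host, e.g. with `lscpu -e`). Threads which the task spawns after the call are not covered, so this would have to be run once the task has started all of its threads.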