emoga
Member
- May 13, 2018
With Zen 4 supporting AVX-512, and a lot of our team running Zen 4, that alone dooms most competitors! And 2 (soon to be 3) 9554s and a 9654 in the mix just add to their defeat!
Over 18 days until the next competition and the team is already testing out SoB workunits to fine-tune their systems.
Times have sure changed... we used to struggle against the Noobs Of Kryta. [attachment 84879]
I just looked over there. I can't find it, can you link it for me please?
A while ago I posted a script for Linux in the private section of teamanandtech.org.
On single socket computers:
Edit the top of the script to define the desired config, as described in the inline comments of the script. Then keep the script running in the background (e.g. in an extra terminal window which you leave open or minimized).
On dual socket computers:
The same. For best results, though, one might combine the script with two boinc client instances: one bound to all logical CPUs of one socket, the other bound to all logical CPUs of the other socket, each starting only as many tasks at once as fit on a single socket, of course.
The script can also be run as a system service, but I haven't written up a copy+paste recipe for this yet.
My tests show that SMT is no help, but it doesn't hurt a whole lot if you don't want to turn it off in the BIOS or play with process affinity. Best result I got was 2 tasks of 8 threads each, pinned to physical cores. The 7950X is faster than my twin Xeon 2696v4s. 16 Zen 4 cores beat 44 Broadwell cores...
I ran a few tests already as well. The 7950X with the smaller cache runs about 14.5 hours with all 32 threads. 4x8 threads didn't do so well. I should try 2x16 threads, I suppose. Tasks score upwards of 107,000 points each.
Plug: My script for Linux is trivial to set up and run. (Just remember the basics, like setting the executable bit, which can be done in any graphical file manager.) You can safely ignore the size of the script, which came out somewhat large; this is due to small features which I wanted in there but which neither matter for, nor interfere with, basic operation on Ryzens and the like.
I've been setting mine up manually using taskset, but there are better ways to be learned, especially for high core count CPUs.
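For anyone going the manual taskset route, here is a minimal Python sketch of the same idea: expand a CPU list of the kind taskset accepts and apply it to a process with os.sched_setaffinity (Linux only). The names parse_cpu_list and pin_process are made up for this illustration and are not taken from the script mentioned above.

```python
import os


def parse_cpu_list(text):
    """Expand a Linux CPU list such as "0-3,8,10-11" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def pin_process(pid, cpu_list_text):
    """Restrict `pid` (0 = the calling process) to the given logical CPUs.

    Like `taskset -pc` without `-a`, this changes only the thread with that
    ID; threads which are created afterwards inherit the new mask.
    """
    os.sched_setaffinity(pid, parse_cpu_list(cpu_list_text))
    return sorted(os.sched_getaffinity(pid))
```

Usage would look like pin_process(12345, "0-7"), where 12345 is a hypothetical PID of a running task. Pinning someone else's process needs the matching permissions, same as with taskset.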
Setting processor affinities not only helps with optimum use of SMT; importantly, on Zen CPUs which have more than one CCX, it also reduces data transfers across cache boundaries. This traffic costs time and wastes energy which would be better spent in the FMA units.
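A rough sketch of how one could derive such affinity sets on Linux: group the logical CPUs by the L3 cache which they share (one group per CCX) and keep only the first SMT sibling of each core. It assumes the usual sysfs layout under /sys/devices/system/cpu; the helper names are hypothetical, and this is not the script from the private forum section.

```python
from collections import defaultdict
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_topology(root=SYSFS_CPU):
    """Return {cpu: (l3_shared_cpu_list, thread_siblings_list)} from sysfs."""
    topo = {}
    for cpu_dir in Path(root).glob("cpu[0-9]*"):
        siblings = cpu_dir / "topology" / "thread_siblings_list"
        if not siblings.exists():
            continue  # offline CPU or unexpected layout
        for index in (cpu_dir / "cache").glob("index*"):
            if (index / "level").read_text().strip() == "3":
                topo[int(cpu_dir.name[3:])] = (
                    (index / "shared_cpu_list").read_text().strip(),
                    siblings.read_text().strip(),
                )
    return topo


def first_sibling(thread_siblings_list):
    """Lowest logical CPU in a list such as "0,16" or "0-1"."""
    return int(thread_siblings_list.replace("-", ",").split(",")[0])


def physical_cores_per_l3(topo):
    """One logical CPU per physical core, grouped by shared L3 cache."""
    groups = defaultdict(list)
    for cpu in sorted(topo):
        l3_cpus, siblings = topo[cpu]
        if cpu == first_sibling(siblings):
            groups[l3_cpus].append(cpu)
    return sorted(groups.values())
```

Calling physical_cores_per_l3(read_topology()) gives one CPU list per L3 domain. Each list could then serve as the affinity set of one multithreaded task, so that a task never straddles two CCXs.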
login as: mark
I wonder what the output of lscpu -C looks like on a 7950X3D.
Seems like lscpu assumes that all caches at a given level have the same size. Which is of course no longer true for a few CPUs, among them AMD's dual-CCD Ryzen X3D. I had a quick look at the mainline code repository, https://github.com/util-linux/util-linux/tree/master/sys-utils, and I haven't seen any respective code update at first glance.
Does cat /proc/cpuinfo show the two differently sized L3 caches perhaps? (Which would depend on respectively extended kernel code. I have no idea if anybody cared to implement this, and if yes, in which kernel versions.)
Looking at "cache size", they all say 1024.
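Another place to look, besides lscpu and /proc/cpuinfo, would be the per-CPU cache entries in sysfs. The sketch below collects the size of every distinct L3 cache, keyed by the list of CPUs which share it. It assumes that the kernel populates /sys/devices/system/cpu/cpu*/cache/index*/ with level, size and shared_cpu_list files; the function names are made up for this example, and I have not verified the output on an X3D part.

```python
from pathlib import Path


def size_to_kib(text):
    """Convert a sysfs cache size such as "32768K" or "96M" to KiB."""
    text = text.strip().upper()
    factors = {"K": 1, "M": 1024}
    if not text or text[-1] not in factors:
        raise ValueError("unexpected cache size format: %r" % text)
    return int(text[:-1]) * factors[text[-1]]


def l3_sizes(root="/sys/devices/system/cpu"):
    """Return {shared_cpu_list: size_in_KiB} for every distinct L3 cache."""
    sizes = {}
    for index in Path(root).glob("cpu[0-9]*/cache/index*"):
        if (index / "level").read_text().strip() == "3":
            cpus = (index / "shared_cpu_list").read_text().strip()
            sizes[cpus] = size_to_kib((index / "size").read_text())
    return sizes
```

If these files are filled in per CPU, a dual-CCD X3D chip should show up as two entries with different sizes, while a regular dual-CCD chip should show two equal ones.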
Hmm, on second thought, this is only true if
When you went as far as pinning tasks to sets of logical CPUs on both operating systems, remaining influences of the OS on the performance of LLR should be negligible,
PS, once you have reported a result, you may have to wait some time until validation. But until then, you can already look up the pending credit of your completed tasks on your user account page at the PrimeGrid web site.
You will need to compute credits/time for such a test with random workunits; just a comparison of times will be unreliable due to size variations between workunits.
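The credits/time arithmetic is simple enough, but for completeness here is a small sketch. It assumes that you note the (pending or granted) credit and the elapsed wall-clock time of each completed task, plus how many tasks ran side by side in that configuration. The function name is hypothetical.

```python
def credits_per_day(tasks, concurrent_tasks=1):
    """Host throughput in credits per day for one configuration.

    tasks: list of (credit, elapsed_seconds) of completed tasks which all
    ran under the same configuration. Use elapsed time, not CPU time.
    concurrent_tasks: number of tasks which ran at the same time.
    """
    total_credit = sum(credit for credit, _ in tasks)
    total_seconds = sum(seconds for _, seconds in tasks)
    if total_seconds <= 0:
        raise ValueError("need at least one task with a positive run time")
    return total_credit / total_seconds * concurrent_tasks * 86400
```

For example, a single task worth 107,000 credits which took 14.5 hours alone on the machine would be credits_per_day([(107000, 14.5 * 3600)]). A 2x8 test would pass concurrent_tasks=2 along with the credits and times of its own tasks; whichever configuration returns the higher number wins, regardless of how the workunit sizes happened to vary.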