Has anyone run the CPU side of GFN21s?
mackerel said:
6700k, 24h, 68W
5930k, 25h, 117W
8086k, 26h, 76W
[Ryzen] 2600, 30h, 61W average, 75W peak
...
For comparison, my RTX 2070 does 3 units in just over 24h and draws a reported 150W constantly. That makes it a fair bit more power efficient than the CPUs.
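To make the efficiency comparison concrete, here is a rough sketch that turns the figures above into energy per completed task. It assumes the reported wattages are sustained averages, rounds the GPU run to exactly 24 h, and ignores the host CPU power needed to feed the GPU; the function name is only illustrative.

```python
# Energy per completed task in watt-hours: power (W) * wall time (h) / tasks finished.
def energy_per_task_wh(power_w, hours, tasks=1):
    return power_w * hours / tasks

# (power in W, run time in h, tasks finished in that time), as quoted above.
# Assumption: the RTX 2070's "just over 24h" is rounded down to 24 h.
systems = {
    "6700k": (68, 24, 1),
    "5930k": (117, 25, 1),
    "8086k": (76, 26, 1),
    "Ryzen 2600": (61, 30, 1),
    "RTX 2070": (150, 24, 3),
}

energy = {name: energy_per_task_wh(*figures) for name, figures in systems.items()}

# Print from most to least efficient.
for name, wh in sorted(energy.items(), key=lambda item: item[1]):
    print(f"{name:12s} {wh:6.0f} Wh per task")
```

With these inputs the GPU comes out at about 1200 Wh per task and the best CPU (6700k) at about 1632 Wh per task.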
Yves Gallot said:
Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz - 17 threads
27,416 sec = 7 h 36 mn 56 sec
Also,
message 123470:
Yves Gallot said:
There is a memory bottleneck with 9th generation i7s and i9s.
L3 cache size of i7-9700K is 12 MB and cpuGFN21 memory size is 16 MB R/W + 8 MB Ro.
8 threads run at about the same speed as 4.
In other words:
Desktop CPUs have less cache than needed to fit a single task, and should run at most one cpuGFN21 task at once.
Desktop CPUs with dual-channel memory and Skylake-based cores scale only up to 4 threads for a cpuGFN21 task.
The aforementioned i9-7980XE has 24.75 MB shared L3 cache (non-inclusive, together with 1 MB L2 cache for each of the 18 cores). Evidently this cache setup supports cpuGFN21's demands pretty well.
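As a crude rule of thumb, one can compare the task's memory footprint with the L3 size to guess how many tasks fit in cache at once. The sketch below assumes the footprint is simply 16 MB + 8 MB = 24 MB per task and that L3 capacity is the only constraint; it ignores L2 contents, memory bandwidth and other running software, so treat the result as a starting point for testing, not a prediction. The two Xeons are the ones described further down in this post.

```python
# cpuGFN21 footprint per task, from Yves Gallot's figures: 16 MB R/W + 8 MB read-only.
WORKING_SET_MB = 16 + 8

# L3 cache sizes in MB as quoted in this post.
l3_mb = {
    "i7-9700K": 12,
    "i9-7980XE": 24.75,
    "E5-2690 v4": 35,
    "E5-2696 v4": 55,
}

def tasks_fitting_in_l3(cache_mb, working_set_mb=WORKING_SET_MB):
    # Whole number of tasks whose footprint fits in L3 simultaneously (crude model).
    return int(cache_mb // working_set_mb)

fits = {cpu: tasks_fitting_in_l3(mb) for cpu, mb in l3_mb.items()}

for cpu, count in fits.items():
    print(f"{cpu:12s} L3 {l3_mb[cpu]:5.2f} MB -> {count} task(s) fit in cache")
```

Under this model the i7-9700K cannot hold even one task in L3, the i9-7980XE and E5-2690 v4 hold one, and the E5-2696 v4 holds two.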
Some data reported in
SETI.Germany's PG challenge thread:
Coffee Lake(?) 6-core i5-8600K @ 4.3 GHz running one cpuGFN21 task along with one GPU task:
6 cores: 32 h
5 cores: 30 h
4 cores: 29.5 h
3 cores: 32.5 h
2 cores: 42 h
If more than one CPU task is run at a time, overall throughput goes down.
Ryzen 2700X@4.1 GHz, 3466 RAM:
2 tasks at once, 4 threads per task: 70 h per task
1 task at once, 8 threads per task: 29 h per task
Ryzen 1300X and Core i5 4460 both take 38 h per task.
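The throughput loss is easier to see when the run times are converted into mean time between task completions. A small sketch using the Ryzen 2700X numbers above, assuming the concurrent tasks finish at evenly staggered times (the helper name is made up for illustration):

```python
def hours_between_completions(hours_per_task, concurrent_tasks):
    # Average wall time between finished tasks when several run side by side.
    return hours_per_task / concurrent_tasks

# Ryzen 2700X figures quoted above.
two_tasks_four_threads = hours_between_completions(70, 2)   # 35.0 h
one_task_eight_threads = hours_between_completions(29, 1)   # 29.0 h

for label, hours in [("2 tasks x 4 threads", two_tasks_four_threads),
                     ("1 task x 8 threads", one_task_eight_threads)]:
    print(f"{label}: {hours:.1f} h between completions, {24 / hours:.2f} tasks/day")
```

So running two tasks at once yields one result every 35 h on average, versus one every 29 h with a single 8-thread task.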
Based on Yves Gallot's post regarding the memory size of CPU tasks, I have now started 2 tasks per processor on
Xeon E5-2696 v4 with 55 MB cache and 22C/44T @ 2.6 GHz in dual-processor machines running Linux.
2 tasks per processor, 11 threads per task:
15...19 h estimated run time per task after >1.5 % completion
14.5...15 h estimated run time per task after 10 % completion
2 tasks per processor, 20 threads per task:
run time estimation hasn't stabilized yet, but "top" shows ~2000 % processor utilization per task, which indicates good scaling.
16.1...18 h estimated run time per task after 10 % completion
The TDP of these processors is 150 W. But since my particular mainboard and BIOS drive them at the all-core AVX turbo clock instead of the base clock, it is possible that they persistently consume more than the TDP. Still, with an estimated ~7.5 h mean time between task completions per processor, these 14 nm CPUs are both faster and more efficient than my 16 nm GPUs (1080Ti with ~8.9 h per task at 250 W GPU power, plus a small amount of power for the supporting CPU).
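Since the actual CPU power draw is uncertain, it may help to ask how far above TDP a Xeon could go before losing its efficiency lead. The following back-of-the-envelope sketch uses only the figures from this paragraph and assumes the GPU draws a constant 250 W with no allowance for its supporting CPU; the variable names are illustrative.

```python
# Figures from the paragraph above.
cpu_hours_between_tasks = 15 / 2   # ~15 h per task, 2 tasks at once per processor
cpu_tdp_w = 150
gpu_hours_per_task = 8.9
gpu_power_w = 250                  # assumption: GPU only, host CPU power ignored

# Energy per task in watt-hours.
cpu_energy_wh = cpu_tdp_w * cpu_hours_between_tasks    # 1125 Wh if the CPU stays at TDP
gpu_energy_wh = gpu_power_w * gpu_hours_per_task

# CPU power at which both would need the same energy per task.
break_even_w = gpu_energy_wh / cpu_hours_between_tasks

print(f"CPU at TDP: {cpu_energy_wh:.0f} Wh per task")
print(f"GPU:        {gpu_energy_wh:.0f} Wh per task")
print(f"CPU stays more efficient below about {break_even_w:.0f} W sustained")
```

With these inputs the break-even point is a little under 300 W per processor, so the conclusion holds even if the CPUs run well above their 150 W TDP.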
Xeon E5-2690 v4 with 35 MB cache and 14C/28T @ 2.9 GHz, ditto dual-processor machines running Linux:
1 task per processor, 14 threads per task:
10.1 h estimated run time per task after 10 % completion
These processors have a 135 W TDP, but again, since the particular motherboard drives them at their all-core AVX turbo clock instead of the base clock, actual sustained power consumption possibly exceeds the TDP.
Edit: run time estimation updated
Edit 2: E5-2690 v4 tested too
Edit 3: Background info and more CPU data from prerelease versions and lower "leading edge" can be found in the PG forum thread "genefer 3.3.4".