Memory latency is probably the main problem, but the CCX design adds another layer to it.
How is a thread moved between CCXs? Does it jump directly through the data fabric to the other CCX, or is it a DMA operation?
Look, Ryzen is a beast of a CPU. Its IPC must be better than Intel's, but it is losing many cycles per second. That is why it performs like the i7 6900K in heavily multithreaded tasks. In heavy multitasking the threads don't cycle between core complexes, or I should say "they don't cross the...
The interface is the Coherent Data Fabric. L3 bandwidth is several GB/s above DDR4 bandwidth, so if I don't want a bottleneck between the cores, I must consider a unified L3 cache across the 8 cores or an accelerated bus between the core complexes.
Here I am speculating: could AMD add an L4 cache shared across both CCXs? Or add a link like HyperTransport, independent of the data fabric?
I think most of the problems are related to the CCX design. Haswell-E and Broadwell-E have a large L3 cache shared by all the cores. This CCX configuration is effectively a dual CPU: the fabric clocks at the RAM speed (2400 MT/s = 1200 MHz), so we have to go at the speed of RAM in order to communicate between CCX...
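The 2400 MT/s = 1200 MHz figure comes from DDR memory doing two transfers per clock. A quick Python sketch of that arithmetic, taking the post's premise that the fabric runs at the memory clock (the function name is only illustrative):

```python
def fabric_clock_mhz(transfer_rate_mts: float) -> float:
    """Return the memory clock in MHz for a DDR transfer rate in MT/s.

    Assumption (from the post): the data fabric runs at this same clock.
    """
    # DDR = double data rate: two transfers per clock cycle.
    return transfer_rate_mts / 2


for rate in (2133, 2400, 2666, 3200):
    print(f"DDR4-{rate}: fabric at about {fabric_clock_mhz(rate):.0f} MHz")
```

If the premise holds, faster RAM raises the speed of CCX-to-CCX traffic as well as memory bandwidth.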
Has anyone run the test? Groupsize won't work because Windows will use one CCX or the other at random; for a four-thread load it will sometimes assign two threads to one CCX and two threads to the other. So the best solution until Microsoft releases an official patch will be to use core parking...
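One way to check the random-placement theory is to pin a test load to a single CCX by hand. A minimal Python sketch of the affinity mask that would take, assuming logical processors are numbered CCX by CCX (0-7 on the first, 8-15 on the second); that numbering is an assumption and should be verified on the actual system:

```python
def ccx_affinity_mask(ccx_index: int, logical_per_ccx: int = 8) -> int:
    """Bitmask with one bit set per logical processor of the given CCX.

    Assumption: logical processors are numbered CCX by CCX, so CCX 0 is
    0-7 and CCX 1 is 8-15 on an 8-core/16-thread part.
    """
    if ccx_index < 0 or logical_per_ccx <= 0:
        raise ValueError("ccx_index must be >= 0 and logical_per_ccx > 0")
    block = (1 << logical_per_ccx) - 1              # e.g. 0xff for 8 threads
    return block << (ccx_index * logical_per_ccx)   # shift to the chosen CCX


print(hex(ccx_affinity_mask(0)))  # 0xff
print(hex(ccx_affinity_mask(1)))  # 0xff00
```

The hex value is the kind of mask that `start /affinity` in cmd or Task Manager's affinity dialog works with, so a four-thread benchmark can be run on one CCX and then compared against an unpinned run.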
I think the best way is 50% min cores unparked and 50% parked (until the workload requires unparking more cores), with the core overutilization threshold at 100% in a modified High Performance power plan...
Group Size = 8 plus core parking won't work. Just test it: if you set groupsize = 8 and parking min cores to 50%, then half of the threads of each CCX will be stopped until some workload requires additional threads from each CCX. 4 threads from the first CCX will be unparked, and 4 threads...
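To make the numbers concrete, here is a toy Python model of the behaviour described above. It is not how Windows actually decides what to park; the even split across CCXs is the claim from the post, and rounding up is an assumption:

```python
import math


def unparked_per_ccx(total_logical: int = 16,
                     ccx_count: int = 2,
                     min_unparked_percent: int = 50) -> list[int]:
    """Toy model: how many logical processors stay unparked on each CCX."""
    # Floor on the number of unparked logical processors.
    # Rounding up is an assumption, not documented behaviour.
    unparked = math.ceil(total_logical * min_unparked_percent / 100)
    # Claim from the post: the unparked threads are spread evenly
    # over the CCXs instead of filling one CCX first.
    base, extra = divmod(unparked, ccx_count)
    return [base + (1 if i < extra else 0) for i in range(ccx_count)]


print(unparked_per_ccx())  # [4, 4]
```

With 16 logical processors and a 50% minimum this gives 4 unparked threads on each CCX, which is exactly the split that keeps the cross-CCX traffic alive.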
I forgot to mention that I have set core overutilization at 100%.
Edited "High Performance power profile" with:
943c8cb6-6f93-4227-ad87-e9a3feec08d1
Processor Performance Core Parking Overutilization Threshold
Set at 100%
ea062031-0e34-4ff1-9b6d-eb1059334028
Processor Performance Core Parking...
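For anyone who prefers the command line over editing the plan by hand, this Python sketch only builds and prints the `powercfg` commands for the overutilization GUID above, it does not run them. It assumes the edited plan is the active one (the `SCHEME_CURRENT` alias) and sets only the plugged-in (AC) value; the helper name is made up for illustration:

```python
# GUID taken from the post: Core Parking Overutilization Threshold.
OVERUTILIZATION_GUID = "943c8cb6-6f93-4227-ad87-e9a3feec08d1"


def powercfg_commands(setting_guid: str, percent: int) -> list[list[str]]:
    """Build (not run) powercfg commands that set a processor setting.

    Assumptions: the plan being edited is the active one, and only the
    AC (plugged-in) value is changed.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        # Write the new value into the active scheme.
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         "SUB_PROCESSOR", setting_guid, str(percent)],
        # Re-apply the scheme so the change takes effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


for cmd in powercfg_commands(OVERUTILIZATION_GUID, 100):
    print(" ".join(cmd))
```

The printed lines can be pasted into an elevated command prompt. The same helper works for the second GUID once you decide which percentage to give it.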
The minimum percentage of logical processors (in terms of all logical processors that are enabled on the system) that can be placed in the unparked state at any given time.
Did you read that?
Hi, has somebody tested this:
Processor Performance Core Parking Min Cores
The minimum percentage of logical processors (in terms of all logical processors that are enabled on the system) that can be placed in the unparked state at any given time. For example, on a system with 16 logical...
Everything points in the direction that AMD wants the most optimized platform possible for Zen; that is the reason for the engineering samples sent to the manufacturers. Zen will launch at 3.3 GHz, 3.6 GHz turbo. Remember my words.