AMD Ryzen (Summit Ridge) Benchmarks Thread (use new thread)

bjt2 · Sep 15, 2016

Let's do an example with reasonable numbers.

One A9X core at 2.26GHz has a TDP of at most 2.5W.
Let's suppose that this TDP is 0.5W for the ARM decoders and 2W for the rest.
Let's suppose a Zen CPU at 2.26GHz.
Let's suppose 2W for everything except the decoders. This seems reasonable.
Let's suppose 4W for the 4 x86 decoders: 8 times the 6 ARM decoders.
Let's suppose an 80% hit rate of the uop cache.
So the 4 decoders draw 0.8W on average.
Result? Zen at 2.26GHz draws 2.8W, about +12%. This is the kind of rough estimation I was aiming for.
Drop the frequency by about 10%, and a Zen core at 2GHz draws 2.5W, and 32 Zen cores at 2GHz draw 80W.
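
The arithmetic in this post can be reproduced with a minimal Python sketch. Every constant below is one of the post's own assumptions, not a measurement, and the linear power-versus-frequency scaling is a simplification implied by the last step. The function and constant names are made up for illustration.

```python
# All inputs are the post's assumed figures, not measured values.
A9X_CORE_W = 2.5          # assumed max TDP of one A9X core at 2.26 GHz
ARM_DECODERS_W = 0.5      # assumed share of that TDP spent in the ARM decoders
X86_DECODERS_W = 4.0      # assumed draw of the 4 x86 decoders when active (8x ARM)
UOP_CACHE_HIT_RATE = 0.8  # assumed fraction of time the x86 decoders are bypassed


def zen_core_watts() -> float:
    """Non-decoder power (same as the A9X) plus duty-cycled x86 decoder power."""
    rest_w = A9X_CORE_W - ARM_DECODERS_W
    decoders_w = X86_DECODERS_W * (1.0 - UOP_CACHE_HIT_RATE)
    return rest_w + decoders_w


def scale_with_frequency(watts: float, f_from_ghz: float, f_to_ghz: float) -> float:
    """Linear scaling with clock at fixed voltage (a deliberate simplification)."""
    return watts * f_to_ghz / f_from_ghz


core_w = zen_core_watts()
overhead = core_w / A9X_CORE_W - 1.0
core_2ghz_w = scale_with_frequency(core_w, 2.26, 2.0)

print(f"Zen core at 2.26 GHz: {core_w:.2f} W ({overhead:+.0%} vs A9X)")
print(f"Zen core at 2.00 GHz: {core_2ghz_w:.2f} W")
print(f"32 cores at 2.00 GHz: {32 * core_2ghz_w:.0f} W")
```

With exact linear scaling the 2GHz figure comes out slightly under 2.5W per core, so the 80W total in the post is a rounded-up value.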

cdimauro · Sep 15, 2016

Abwx said:
So much, but then what is the point of discussing those numbers in a forum only to eventually deny them? Wait for the AT analysis in that case..

In fact I never played with my crystal ball and gave numbers.

cdimauro · Sep 15, 2016

bjt2 said:
Let's do an example with reasonable numbers.

One A9X core at 2.26GHz has a TDP of at most 2.5W.
Let's suppose that this TDP is 0.5W for the ARM decoders and 2W for the rest.
Let's suppose a Zen CPU at 2.26GHz.
Let's suppose 2W for everything except the decoders. This seems reasonable.
Let's suppose 4W for the 4 x86 decoders: 8 times the 6 ARM decoders.
Let's suppose an 80% hit rate of the uop cache.
So the 4 decoders draw 0.8W on average.
Result? Zen at 2.26GHz draws 2.8W, about +12%. This is the kind of rough estimation I was aiming for.
Drop the frequency by about 10%, and a Zen core at 2GHz draws 2.5W, and 32 Zen cores at 2GHz draw 80W.

Good. Now tell me how many runtime checks are made for the execution of an x86/x64 instruction which are not needed by an ARM one.

bjt2 · Sep 15, 2016

cdimauro said:
It's not only a question of decoders. The ISA ALSO matters in several other aspects, which nobody talks about, unfortunately.

Take some INT or FPU instructions from the respective ISAs, and try to follow what happens in the pipeline AND in the respective unit during the whole execution cycle.

Just ONE hint: even on x64, segmentation is still active. Do you know what that means? I think not, from what you've said until now.

The real world isn't made of just FO4, silicon, and libraries. ISAs AND microarchitectures MATTER too.

Another useful exercise to understand this is writing an x86 and/or an ARM emulator. Then you'll see how much work is needed by an "ALU/FPU" to achieve exactly the same operation...

Segmentation in x64 mode is disabled. Windows sets CS, DS, ES, FS etc. to the same selector, with zero offset. I am sure Intel and AMD engineers are aware of that and gate down these circuits. I vaguely remember even reading it...
ARM, AFAIK, does have paging, TLBs and similar stuff... And we can include all the fancy x86 stuff in the 4W I provided above...

bjt2 · Sep 15, 2016

cdimauro said:
Good. Now tell me how many runtime checks are made for the execution of an x86/x64 instruction which are not needed by an ARM one.

In x64, segmentation is disabled. AFAIK some legacy instructions are even disabled, e.g. BOUND, if I remember correctly.

cdimauro · Sep 15, 2016

bjt2 said:
Segmentation in x64 mode is disabled. Windows sets CS, DS, ES, FS etc. to the same selector, with zero offset. I am sure Intel and AMD engineers are aware of that and gate down these circuits. I vaguely remember even reading it...
ARM, AFAIK, does have paging, TLBs and similar stuff... And we can include all the fancy x86 stuff in the 4W I provided above...

You see? You are wrong. No, the segmentation isn't disabled even on x64.

How do you implement Thread Local Storage? And how do you access kernel memory?

cdimauro · Sep 15, 2016

bjt2 said:
In x64, segmentation is disabled. AFAIK some legacy instructions are even disabled, e.g. BOUND, if I remember correctly.

BOUND doesn't even exist on x64, and its opcode is (re)used to implement the completely new AVX-512 SIMD instructions.

Some legacy instructions aren't disabled: they are not present anymore, and sometimes their opcodes are reused.

And, again, segmentation isn't disabled. Please, open an Intel or AMD architecture manual and verify it yourself.

bjt2 · Sep 15, 2016

cdimauro said:
You see? You are wrong. No, the segmentation isn't disabled even on x64.

How do you implement Thread Local Storage? And how do you access kernel memory?

The segment offset is ignored and no bounds checks are performed. And I am pretty sure that ARM CPUs must do similar operations to support thread local storage and kernel memory... Android is Linux-based. And Linux exists on x86 too...

cdimauro · Sep 15, 2016

I already suggested what to do.

BTW, check how the FS and GS segments are used on x64 (Long Mode).

And check how virtual machine software continues to use the segment limits even in Long Mode...

BTW2: ARM uses completely different methods.

bjt2 · Sep 15, 2016

cdimauro said:
BOUND doesn't even exist on x64, and its opcode is (re)used to implement the completely new AVX-512 SIMD instructions.

Some legacy instructions aren't disabled: they are not present anymore, and sometimes their opcodes are reused.

That is what I meant... Highlighting all my errors does not prove my main point wrong. The CPU executes kernel-mode instructions less than 1% of the time. Even if all this stuff requires 10W, we are talking about 0.1W...
OK, Zen at 2.26GHz draws 2.9W, OK?
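
The duty-cycle argument here is a one-line weighted average. A minimal Python sketch follows, assuming the post's upper-bound guesses (10W while active, 1% of the time) and the 2.8W core estimate from earlier in the thread; the function name is hypothetical.

```python
def average_watts(active_watts: float, duty_cycle: float) -> float:
    """Average power of a block that draws active_watts for a fraction of the time."""
    return active_watts * duty_cycle


# Both inputs are the post's assumed upper bounds, not measurements.
legacy_w = average_watts(active_watts=10.0, duty_cycle=0.01)

# 2.8 W is the earlier rough per-core estimate at 2.26 GHz.
zen_core_w = 2.8 + legacy_w

print(f"Legacy-logic average: {legacy_w:.2f} W")
print(f"Revised Zen core estimate: {zen_core_w:.2f} W")
```

The result only holds if the extra logic is idle (or power-gated) the rest of the time, which is exactly the point under dispute in the thread.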

bjt2 · Sep 15, 2016

cdimauro said:
I already suggested what to do.

BTW, check how the FS and GS segments are used on x64 (Long Mode).

And check how virtual machine software continues to use the segment limits even in Long Mode...

BTW2: ARM uses completely different methods.

OK, you are right. But even if all this stuff draws 10W at 2GHz, it's used less than 1% of the time! I am doing a ROUGH estimation! ROUGH!

cdimauro · Sep 15, 2016

http://www.vmware.com/pdf/desktop/vmware_player50.pdf

"An AMD CPU that has segment-limit support in long mode"

And now I give up, because it doesn't make sense to keep talking with someone who doesn't know how things really work and insists on a personal battle to defend his numbers...

Abwx · Sep 15, 2016

cdimauro said:
In fact I never played with my crystal ball and gave numbers.

Who needs a crystal ball when we have some basis to estimate power numbers accurately, within +-10%..?..

For instance, AMD said that Zen consumed a little less than BDW in the Blender demo, so take an i7 6900K, clock it at 3GHz at its stock voltage and run Blender or Cinebench, subtract 10% from the power consumed by the CPU and you'll have Zen's power drain at 3GHz..

The other methodology is the one I used in the comparison with XV; the results are the same within the margin of error I stated..

We could as well take the stated 95W TDP and divide it by 1.25, and we'll have the power in MT with regular software, on the basis that in the worst-case scenario an AMD CPU reaches its maximum TDP only with Prime95 (check the FX-8350...) and that power with this software is 1.25x higher than with usual MT software..
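
Two of the methods described in this post reduce to simple scaling rules, sketched below in Python. The 10% saving and the 1.25x Prime95 ratio are the post's own assumptions; the 100W Broadwell-E reading is a made-up placeholder, since the post gives no measured value, and the function names are hypothetical.

```python
def zen_from_broadwell(bdw_watts: float, zen_saving: float = 0.10) -> float:
    """Method 1: take measured i7 6900K power at 3 GHz and subtract the assumed saving."""
    return bdw_watts * (1.0 - zen_saving)


def typical_mt_from_tdp(tdp_watts: float, prime95_ratio: float = 1.25) -> float:
    """Method 3: assume TDP is only reached under Prime95, 1.25x above usual MT loads."""
    return tdp_watts / prime95_ratio


# Placeholder input: NOT a measurement, only there to exercise the function.
placeholder_bdw_w = 100.0
print(f"Method 1 with a {placeholder_bdw_w:.0f} W reading: "
      f"{zen_from_broadwell(placeholder_bdw_w):.0f} W")

# 95 W is the stated TDP quoted in the post.
print(f"Method 3 from a 95 W TDP: {typical_mt_from_tdp(95.0):.0f} W")
```

Method 3 gives 76W for typical multithreaded loads; method 1 needs a real wall or package reading before it says anything.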

bjt2 · Sep 15, 2016

cdimauro said:
http://www.vmware.com/pdf/desktop/vmware_player50.pdf

"An AMD CPU that has segment-limit support in long mode"

And now I give up, because it doesn't make sense to keep talking with someone who doesn't know how things really work and insists on a personal battle to defend his numbers...

I said I was wrong. I am not omniscient. No one is. Highlighting my ignorance without replying to me is not constructive. You didn't reply to my last points. What percentage of the time is a CPU in kernel mode AND using these instructions? 1%, 10%? How much power can these instructions draw? I think the power is negligible. This is enough for my goal: DO A ROUGH CALCULATION OF ZEN POWER DRAW.

bjt2 · Sep 15, 2016

Abwx said:
Who needs a crystal ball when we have some basis to estimate power numbers accurately, within +-10%..?..

For instance, AMD said that Zen consumed a little less than BDW in the Blender demo, so take an i7 6900K, clock it at 3GHz at its stock voltage and run Blender or Cinebench, subtract 10% from the power consumed by the CPU and you'll have Zen's power drain at 3GHz..

The other methodology is the one I used in the comparison with XV; the results are the same within the margin of error I stated..

We could as well take the stated 95W TDP and divide it by 1.25, and we'll have the power in MT with regular software, on the basis that in the worst-case scenario an AMD CPU reaches its maximum TDP only with Prime95 (check the FX-8350...) and that power with this software is 1.25x higher than with usual MT software..

No. You must calculate Zen power consumption to 3 decimal digits and accurately state each step of your calculation.

cdimauro · Sep 15, 2016

Abwx said:
Who needs a crystal ball when we have some basis to estimate power numbers accurately, within +-10%..?..

For instance, AMD said that Zen consumed a little less than BDW in the Blender demo, so take an i7 6900K, clock it at 3GHz at its stock voltage and run Blender or Cinebench, subtract 10% from the power consumed by the CPU and you'll have Zen's power drain at 3GHz..

The other methodology is the one I used in the comparison with XV; the results are the same within the margin of error I stated..

We could as well take the stated 95W TDP and divide it by 1.25, and we'll have the power in MT with regular software, on the basis that in the worst-case scenario an AMD CPU reaches its maximum TDP only with Prime95 (check the FX-8350...) and that power with this software is 1.25x higher than with usual MT software..

Can I respectfully avoid taking some declarations for granted, and wait for some third-party tests?

@bjt2: trolling doesn't help. And nobody talked about kernel mode. Again, you talk about things without any idea of how they work...

bjt2 · Sep 15, 2016

cdimauro said:
Can I respectfully avoid taking some declarations for granted, and wait for some third-party tests?

@bjt2: trolling doesn't help. And nobody talked about kernel mode. Again, you talk about things without any idea of how they work...

I don't need to know too much. I admit my ignorance. I am not a kernel programmer. Sometimes I program in assembler, but only in user mode. You stated that there are instructions on x86 that ARM does not have. Kernel mode or not, it's not important. I don't know why you highlight my errors without answering, as if telling me I am stupid and ignorant makes my point completely wrong.
These instructions, kernel mode or user mode, how often do they occur? How much power do they draw? Is this power negligible (namely under 5% of the TDP) for my goal, namely estimating Zen power draw within a 10-20% margin?
If they drew 10% of the TDP, I don't know how Intel didn't go bankrupt...

Abwx · Sep 15, 2016

cdimauro said:
Can I respectfully avoid taking some declarations for granted, and wait for some third-party tests?

You can, but then you have no right to contradict anybody if your argument is that we should wait for third-party numbers, because you negate other people's estimations on the basis of this fallacy: that we should accept a condition that you don't apply to your own opinion.

In the meantime, consider bjt2 and myself as third parties, since we are not part of AMD AFAIK, so our estimations are certainly more valid than your rebuttals, all the more because I stated three different methodologies and you didn't address a single one other than with general statements that are blank statements in disguise.

cdimauro · Sep 15, 2016

@bjt2: please, don't put words in my mouth that I haven't said. For me the discussion is over: I have nothing else to say (there's already plenty of stuff), except good night.

bjt2 · Sep 15, 2016

OK, good night. I know you have not said I am stupid. But highlighting someone's errors without answering is not pleasant, you should admit.

bjt2 · Sep 15, 2016

@Abwx Anyway, the XV power numbers are quite interesting. Where did you find them? Are they reliable? Are there other data for other frequencies?

Abwx · Sep 15, 2016

They can be extracted thanks to reviews like this one, where the idle and load power are measured:

http://www.portvapes.co.uk/?id=Latest-exam-1Z0-876-Dumps&exid=threads/amd-bristol-stoney-ridge-thread.2463487/page-16#post-38472229

If we take the PSU and VRM efficiencies at 90% each, this amounts to a 46W delta at the APU level, assuming that the whole delta is due to the APU. There's possibly a watt or two that goes elsewhere (probably in the RAM), but this is not significant and will compensate for any slight underestimation. Indeed, those numbers correlate with the numbers measured by TheStilt on his Carrizo platform..
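
The efficiency chain described above can be written as a short Python sketch. The 90% PSU and VRM efficiencies and the 46W result come from the post; the wall-socket delta is not quoted there, so the sketch only derives the value those figures imply. Function names are hypothetical.

```python
PSU_EFFICIENCY = 0.90  # assumed in the post
VRM_EFFICIENCY = 0.90  # assumed in the post


def apu_delta_watts(wall_idle_w: float, wall_load_w: float,
                    psu_eff: float = PSU_EFFICIENCY,
                    vrm_eff: float = VRM_EFFICIENCY) -> float:
    """Load-minus-idle power at the APU, derived from wall-socket readings."""
    return (wall_load_w - wall_idle_w) * psu_eff * vrm_eff


def wall_delta_watts(apu_delta_w: float,
                     psu_eff: float = PSU_EFFICIENCY,
                     vrm_eff: float = VRM_EFFICIENCY) -> float:
    """Inverse of the above: wall delta implied by a given APU-level delta."""
    return apu_delta_w / (psu_eff * vrm_eff)


# The post states 46 W at the APU; this is the wall delta that figure implies.
implied_wall_delta_w = wall_delta_watts(46.0)
print(f"Implied wall-socket delta: {implied_wall_delta_w:.1f} W")
```

Two 90% stages multiply to 81% overall, so roughly a fifth of the wall-socket delta never reaches the APU under these assumptions.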

bjt2 · Sep 15, 2016

Abwx said:
They can be extracted thanks to reviews like this one, where the idle and load power are measured:

http://www.portvapes.co.uk/?id=Latest-exam-1Z0-876-Dumps&exid=threads/amd-bristol-stoney-ridge-thread.2463487/page-16#post-38472229

If we take the PSU and VRM efficiencies at 90% each, this amounts to a 46W delta at the APU level, assuming that the whole delta is due to the APU. There's possibly a watt or two that goes elsewhere (probably in the RAM), but this is not significant and will compensate for any slight underestimation. Indeed, those numbers correlate with the numbers measured by TheStilt on his Carrizo platform..

Well, this is very interesting, taking into account that the 845s are waste APUs that are probably also bad on the CPU side, in terms of needed Vcore, leakage and power consumption, and that Bristol Ridge should be even better...

The OCed 845 is also impressive, I guess with the same Vcore... There seems to be a clock wall, and the problem is not the TDP... That confirms the poor silicon quality.
So a high-quality Bristol Ridge should draw less, or probably the 3.8-4.2 top model should draw the same as this 3.5 model: 8W a core. And so should Zen, according to AMD slides...

Abwx · Sep 15, 2016

bjt2 said:
Well, this is very interesting, taking into account that the 845s are waste APUs that are probably also bad on the CPU side, in terms of needed Vcore, leakage and power consumption, and that Bristol Ridge should be even better...

Yes, the fact that those Athlons are probably second-rate parts is a way to not underestimate anything. Besides, as you point out, it's quite possible that the comparison in the slide is with Bristol Ridge, which would definitely make sure that the estimations are a worst-case figure.

bjt2 said:
The OCed 845 is also impressive, I guess with the same Vcore... There seems to be a clock wall, and the problem is not the TDP... That confirms the poor silicon quality.
So a high-quality Bristol Ridge should draw less, or probably the 3.8-4.2 top model should draw the same as this 3.5 model: 8W a core. And so should Zen, according to AMD slides...

The Vcore is slightly increased for the overclocking and power increased by only 25%, which is the number for a process that would scale perfectly. This is of course not the case, and it's likely that they are not within the regular voltage margin; nevertheless, their results point to the limitation being due to the BCLOCK method of overclocking.

From the curves AMD published, Bristol Ridge power dissipation is significantly lower than Carrizo's, possibly 6.5W/core at the same 3.5GHz and with a more efficient uncore, but for the time being I prefer to be cautious and stick with the Athlon 845 as the basis for estimations.

cdimauro · Sep 16, 2016

Abwx said:
You can, but then you have no right to contradict anybody if your argument is that we should wait for third-party numbers, because you negate other people's estimations on the basis of this fallacy: that we should accept a condition that you don't apply to your own opinion.

I only expressed my opinion on why some comparisons don't make sense, also providing some facts.

Of course, you're free to continue reporting your estimations: nobody is stopping you.

In the meantime, consider bjt2 and myself as third parties, since we are not part of AMD AFAIK,

I clearly stated who I consider a third party: professional sites like AnandTech, which run real-world tests with ad hoc tools.

so our estimations are certainly more valid than your rebuttals, all the more because I stated three different methodologies and you didn't address a single one other than with general statements that are blank statements in disguise.

Your estimations are based on the vendor's declarations, and that's the problem: they have to be verified.
