ub4ty
Senior member
- Jun 21, 2017
IPC is a metric that refers to very specific micro-architectural details beyond the purview of the average person with a brain. It's why degrees are formulated around such talk... and I'm speaking of graduate degrees. Outside the context of discussing micro-architecture and pipelines, it is a nonsensical point of discussion. This is non-obvious to a person without a degree in computer engineering, which is why I tried to detail the complexity and nature of the measure. You can disagree. If you do, I ask you to enlighten me on what I'm missing...

[quote]Look, if you want to be exact, then yes, everyone with a brain (on forums like these) understands IPC is dependent on the software and instructions used. Different architectures have their strengths and weaknesses in different software; even the software has different optimisations for different manufacturers, depending on favouritism and/or funding.[/quote]
What does the mention of IPC mean in 100% of the mainstream reviews floating around the internet, besides a high-level measure of how long a benchmark took? As such, why mention IPC at all? Instead, K.I.S.S. and say: this processor took 10 minutes, the other took 15 minutes, so this processor is 50% faster than the other.
You have no clue what the actual IPC is by such a measure. Zero. So, why mention it?
[quote]But for getting a general idea of a processor's IPC in desktop workloads... to compare against previous or different processors... then you certainly can go read a decent review and get a solid idea. You can read an anandtech review, for instance, and know Skylake is between 6-12% faster per clock than Zen. To say all of it is complete bogus and worthless is in itself quite bogus.[/quote]

No, you actually don't have a general idea of what IPC is for a previous or different processor. All you have is a figure of how long a workload took on one processor vs. another, and that's all you need to make a decision. If you want to have an accurate discussion about IPC, you have to drill down into the micro-architecture and pipeline and get very specific about which blocks of instructions have what kind of flow rate and why. Is it due to memory stalls? The branch-prediction algorithm? Etc.

Faster per clock in executing a benchmark is not instructions per clock. Please appreciate the difference. You have no clue what the actual varied flow per varied instruction is. This is exactly where the misinformation sets in. Quantum physics is quantum physics. Referring to very strict and detailed micro-architectural measures in such a glossed-over, populist manner is grandiose disinformation.
The number of instructions executed per clock is not a constant for a given processor; it depends on how the particular software being run interacts with the processor, and indeed the entire machine, particularly the memory hierarchy.
https://en.wikipedia.org/wiki/Instructions_per_cycle
https://www.expobrain.net/2013/06/19/disassembly-c-code-for-fun-part-2/
Code:

    #include <stdio.h>

    int main()
    {
        printf("Hello world!");
        return 0;
    }
Compile and disassemble:

    $ cc -g main.c
    $ gdb a.out
    (gdb) disassemble main
    Dump of assembler code for function main:
    0x0000000100000f00 <main+0>:   push   %rbp
    0x0000000100000f01 <main+1>:   mov    %rsp,%rbp
    0x0000000100000f04 <main+4>:   sub    $0x10,%rsp
    0x0000000100000f08 <main+8>:   lea    0x51(%rip),%rdi   # 0x100000f60
    0x0000000100000f0f <main+15>:  movl   $0x0,-0x4(%rbp)
    0x0000000100000f16 <main+22>:  mov    $0x0,%al
    0x0000000100000f18 <main+24>:  callq  0x100000f34 <dyld_stub_printf>
    0x0000000100000f1d <main+29>:  mov    $0x0,%ecx
    0x0000000100000f22 <main+34>:  mov    %eax,-0x8(%rbp)
    0x0000000100000f25 <main+37>:  mov    %ecx,%eax
    0x0000000100000f27 <main+39>:  add    $0x10,%rsp
    0x0000000100000f2b <main+43>:  pop    %rbp
    0x0000000100000f2c <main+44>:  retq
    End of assembler dump.
These are the instructions for a simple printf program. I'm sure you don't need this lesson as you seem informed. Now extrapolate this to a benchmark application.
Analysis of an execution flow:
https://en.wikipedia.org/wiki/Cycles_per_instruction
Instruction-level/micro-architecture analysis is very complex and detailed, and it gets reflected in fixed, officially published numbers. It is not something you bake into a review on a per-benchmark basis to make your readers feel smarter than they are. That leads directly to a bunch of misinformed people talking far beyond their depth of understanding.
If you want an actual and proper discussion about instruction level performance, you do the following :
https://en.wikipedia.org/wiki/Cycles_per_instruction
For the multi-cycle MIPS, there are five types of instructions: loads (5 cycles), stores (4 cycles), R-type (4 cycles), branches (3 cycles), and jumps (3 cycles).
If a program has:
- 50% load instructions
- 25% store instructions
- 15% R-type instructions
- 8% branch instructions
- 2% jump instructions
then its effective CPI is 0.50×5 + 0.25×4 + 0.15×4 + 0.08×3 + 0.02×3 = 4.4.
This doesn't even get into advanced processor performance, but it is a basic, high-level treatment of instruction performance.
This is what a series of ISA/software guide books looks like at instruction-level granularity:
So, I'll restate: IPC measures in populist reviews are bogus. K.I.S.S.: Processor X completes benchmark Y this fast; it is K% faster than processors A, B, C, 1, 2, 3. That's all people care about and can interpret. If you want to start doing derived analysis, say so explicitly and specify how you arrived at your derivation.
If you're not going to measure something properly, don't refer to it at all. If it has no relevance, leave it out. If something can be summarized with something far simpler, go simple. Why further complicate things and be wrong about them when the answer was already given to you in simple form?
So, that's IPC.... I hope this unofficial and likely wrong uncore/Interconnect power utilization analysis doesn't also become a meme.