Over the last few years, there has been an undercurrent of uncertainty in the world of game benchmarking about exactly what it is that benchmarks are supposed to be measuring. Gone are the days when all you needed was a single FPS number and everyone knew what it meant. I have heard talk of minimums, frame times, variance, 0.1% lows, medians, and many other things. Then today, on seeing the Level 1 Techs Ryzen vs. Intel test, I realized that it was the first double-blind experiment I had ever seen applied to gaming.
I know the effects that double-blind experiments can have, turning conventional wisdom on its head by showing that beliefs people had held as obvious were in fact entirely imaginary. So what are current game benchmarks really telling us? If one system has a better average FPS but worse 1% frame times than another, does that make it worse? Can anyone even tell the difference? Is the perception of performance constant, or is it affected by things like monitor size or the player's age? I want to see some scientific method applied to this (and to everything else in the world too, while we're at it).
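To make the average-FPS-versus-1%-lows question concrete, here is a minimal sketch (my own illustration, not taken from any particular benchmarking tool) of how both figures can be derived from a log of per-frame render times; note that tools differ in exactly how they define the "1% low" metric, and the sample data below is invented.

```python
# Sketch of how average FPS and a "1% low" figure are typically derived
# from a capture of per-frame render times (milliseconds per frame).

def summarize(frame_times_ms):
    """Return (average FPS, 1% low FPS) for a run."""
    n = len(frame_times_ms)
    total_s = sum(frame_times_ms) / 1000.0
    avg_fps = n / total_s  # frames rendered divided by total run time

    # "1% low": take the slowest 1% of frames and report the FPS implied
    # by their average frame time (one common convention; definitions vary).
    worst = sorted(frame_times_ms, reverse=True)[:max(1, n // 100)]
    low_1pct_fps = 1000.0 / (sum(worst) / len(worst))
    return avg_fps, low_1pct_fps

# Hypothetical capture: mostly ~7 ms frames with occasional 40 ms stutters.
frame_times = [7.0] * 990 + [40.0] * 10
print(summarize(frame_times))  # high average FPS, but a much lower 1% figure
```

The point of the sketch is that a run can post an impressive average while the slowest 1% of frames tell a very different story, which is exactly why I am asking whether anyone can actually perceive that difference.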