antihelten
Golden Member
- Feb 2, 2012
Yes, I am not suggesting that it is the holy grail of game testing, but that's as far as I can go as a hobbyist benchmarker. I just believe it is better to use it than to rely solely on framerate averages/mins/maxes.
It does expose some shortcomings very nicely however. Like with what I was seeing in Shadow Warrior 2. I felt there was something wrong but I couldn't put my finger on it, until I saw the accumulated data. On the other hand in my ROTTR testing it also highlighted nicely the very good experience I was getting and so on.
Actually that's why I use gameplay video/msi graphs/fraps scores/flac analysis to show the performance. It's the combination of all those that gives an approximation of where things stand. It's not perfect, but it's ok from my point of view.
It's actually fraps that I am starting to fear. It's getting old. Development has stopped, the dev does not respond and game engines progress rapidly. It's a matter of time before it will not be able to collect usable data I'm afraid. I have sent Mirillis an email with some suggestions regarding their Action! program. I hope they will listen.
What kind of raw data do you mean? The CSVs?
I definitely agree with you that it's better than averages, and vastly better than mins/maxes. Personally, though, I still think 90th/95th/99th percentile scores are preferable, since, one, they are easier to interpret and, two, they are arguably the most important metric for experienced smoothness (i.e. experienced smoothness depends on the marginal frames; the question is simply at which percentile we define the margin as significant).
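To make the percentile idea concrete, here is a rough sketch of how such scores could be derived from a list of per-frame times in milliseconds. The function names and the nearest-rank method are my own choices for illustration, not something any particular tool is known to use, and the sample numbers are made up.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # pct * n is computed first so whole-number percentiles stay exact.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def percentile_report(frame_times_ms, pcts=(90, 95, 99)):
    """Map each percentile to (frame time in ms, equivalent FPS)."""
    report = {}
    for pct in pcts:
        frame_time = percentile(frame_times_ms, pct)
        report[pct] = (frame_time, 1000.0 / frame_time)
    return report


if __name__ == "__main__":
    # Made-up sample: mostly 10 ms frames with a handful of slow ones.
    sample = [10.0] * 90 + [20.0] * 9 + [50.0]
    average_fps = 1000.0 * len(sample) / sum(sample)
    print(f"average: {average_fps:.1f} FPS")
    for pct, (ms, fps) in percentile_report(sample).items():
        print(f"{pct}th percentile: {ms:.1f} ms ({fps:.1f} FPS)")
```

In that made-up sample the average hides the slow frames almost completely, while the 95th and 99th percentile frame times land on them, which is the point about the marginal frames.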
Regarding exposing shortcomings, I agree that it can do so quite nicely. Looking at those graphs, however, I would expect the 970 to feel roughly comparable to 60 FPS and the 7950 roughly comparable to 30 FPS, whereas the derived scores are 90 FPS and 35 FPS. As such, it only appears to be accurate in one of the two cases. I don't think this is necessarily a problem with the scoring scheme, though; more likely it's a problem with the graph and its granularity. It can often be difficult to determine exactly how common spikes actually are when you clump some 60,000 data points together.
Since graphs can often be a bit misleading in this manner, I personally think it's useful to sort the data a bit to get a better feel for it, and that is basically why I asked if you had the data in raw format. And yes, CSV would be fine.
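The kind of sorting I have in mind could look roughly like the sketch below. It assumes the CSV has a header row followed by rows of frame number and cumulative timestamp in milliseconds, which is how I understand the Fraps frametimes log to be laid out; the file name, function names, and the 16.7 ms / 33.3 ms thresholds (60 and 30 FPS) are just illustrative choices on my part.

```python
import csv


def frame_times_from_csv_lines(lines):
    """Convert cumulative timestamps (ms) into per-frame durations (ms).

    `lines` is any iterable of text lines, e.g. an open file.
    Assumed layout: one header row, then "frame number, timestamp" rows.
    """
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    stamps = [float(row[1]) for row in reader if len(row) >= 2]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


def spike_summary(frame_times_ms, thresholds_ms=(16.7, 33.3)):
    """Sort frame times slowest-first and count frames above each threshold.

    Returns (sorted frame times, {threshold: (count, percent of frames)}).
    """
    ordered = sorted(frame_times_ms, reverse=True)
    total = len(ordered)
    summary = {}
    for limit in thresholds_ms:
        slow = sum(1 for frame_time in ordered if frame_time > limit)
        summary[limit] = (slow, 100.0 * slow / total if total else 0.0)
    return ordered, summary


if __name__ == "__main__":
    # "frametimes.csv" is a placeholder path, not a file from this thread.
    with open("frametimes.csv", newline="") as handle:
        times = frame_times_from_csv_lines(handle)
    ordered, summary = spike_summary(times)
    print("slowest frames (ms):", ordered[:10])
    for limit, (count, share) in summary.items():
        print(f"frames slower than {limit} ms: {count} ({share:.2f}%)")
```

With the slowest frames listed first and a simple count per threshold, it becomes much easier to tell whether the spikes in a graph are a handful of outliers or a meaningful share of those 60,000 points.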