I find many reviewers adjust their benchmarks to favor the card that is the subject of the review. They tend to run game benchmarks at the resolutions and filtering settings that show the subject card in its best light. I can understand this, because the review sample was most likely supplied along with the supplier's preferences for running certain tests in specific ways.
Example: The supplier could say, "Make sure you run the tests with high AA. We've really made some major performance gains in our AA." Or the supplier might have said, "The higher the resolution, the stronger our card's performance is. Be sure to test it at 2560*1600." Or, "This card eats Crysis for breakfast. It runs 60 fps @ 1920*1200. Be sure to show that in your tests." Or whatever. I'm sure you get the idea.
In all fairness, I can understand the card (or any other hardware) supplier wanting their piece shown like this. The last thing you would want is for a reviewer to test your product in a way where it doesn't perform at its best. If you read more reviews, you'll see the strengths and weaknesses of the products. There are so many possible setups, settings, and scenarios that there's no way for just one review to tell the whole story, IMO.
Which means you can put forward support for either side and play the value-and-performance game until you're blue in the face, arguing both sides equally. But at the end of the day, what matters for an individual user is not that the GTX480 can be on average 25% faster if you use the right benchmark, or that the HD5850 can beat the GTX470 if you use the right benchmarks. What matters is which card performs best in the specific selection of games you are interested in, at your specific resolution, using the settings you are most likely to use (e.g. AA/AF levels).
That, more than anything, is what this shows (and although it's no truer now than it was before, it's worth repeating anyway).
It's also why most threads where people ask for a card recommendation break down: everyone has their own personal preference and can show benchmarks that make a certain card look like the better value.
"This puts both cards through the exact same paces and then we can compare."

So do in-game benchmarks, or demo versions, or third-party synthetic benchmarks. We're not really after making sure each card performs the exact same test; we can be pretty sure they do already.
"... but I really hate how much trust people place in 'one shot' (aka N=1) reviews."

I don't think people actually "trust one-shot reviews." The real problem is that the testing methodologies are largely secret, with the information about them restricted to the title of the game and the graphical settings, and no real explanation of the methodology.
I know many reviewers probably don't have the time or know-how, but a lot of this stuff could be cleared up with multiple runs and some simple statistical analysis (finding correlation coefficients, running regression analysis, etc.) to help answer questions like, "Do synthetics correlate with real-world performance?" or "Does GPU X have a higher impact on FPS than GPU Y?"
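To make the first question concrete, here's a minimal sketch of the correlation idea in Python, assuming scipy is available. The paired scores below are hypothetical numbers invented purely for illustration:

```python
# Minimal sketch: does a synthetic benchmark score track real-world FPS?
# All numbers below are hypothetical, for illustration only.
from scipy.stats import pearsonr

# One (synthetic score, average game FPS) pair per test configuration.
synthetic_scores = [4200, 5100, 5900, 6800, 7600, 8300, 9100, 9800]
game_fps = [41, 48, 55, 62, 66, 74, 79, 85]

r, p_value = pearsonr(synthetic_scores, game_fps)
print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")
# An r near 1 with a small p-value suggests the synthetic is a usable proxy
# for this particular game; a weak r suggests it isn't.
```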
For example, with the last question, you could take data from multiple runs of benchmarks (at least 30 benchmark runs per game) and include a dummy variable that takes the value 1 for GPU X and 0 for GPU Y. Your dependent variable would be FPS in a given game. A simple Ordinary Least Squares regression would then tell you which card was faster in terms of FPS and exactly how much faster: the intercept estimates GPU Y's average FPS, and the coefficient on the dummy estimates the difference.
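Here's a hedged sketch of that dummy-variable regression in Python, assuming numpy and statsmodels are available. The FPS data is simulated rather than measured, so the numbers are only illustrative:

```python
# Sketch of the dummy-variable OLS idea: 30 runs per card, simulated FPS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# gpu_x = 1 for runs on GPU X, 0 for runs on GPU Y (30 runs each).
gpu_x = np.repeat([1, 0], 30)

# Simulated data: GPU Y averages ~60 fps, GPU X ~8 fps more, plus
# run-to-run noise standing in for background tasks, thermals, etc.
fps = 60 + 8 * gpu_x + rng.normal(0, 2.5, size=60)

X = sm.add_constant(gpu_x)      # adds the intercept term
model = sm.OLS(fps, X).fit()

print(f"GPU Y mean FPS (intercept):   {model.params[0]:.1f}")
print(f"GPU X advantage (dummy coef): {model.params[1]:.1f} fps")
print(model.conf_int())  # 95% confidence intervals for both estimates
```

The confidence interval on the dummy's coefficient is what a single benchmark run can't give you: not just how much faster one card measured, but how sure you can be that the gap isn't a fluke.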
This, to me, is the only way to have any real confidence in benchmarks, and it's also the only way to reasonably protect yourself from 'fluke' results. This is the end of a stats-nerd rant, but I really hate how much trust people place in 'one shot' (aka N=1) reviews. Like Canucks said, there is a whole host of variables that can affect performance, and that fact deserves more sophisticated analysis.