There are lots of silly benchmarks out there, and I don't think all of them are native x86. Some people dismiss JavaScript benchmarks ("they cheated in the browser!!") and then turn to other so-called benchmarks.
Indeed, all benchmarks deserve to be looked at, but one has to dismiss outliers.
AnTuTu is the perfect example: it performed very well on Intel phones, while other benchmarks didn't show the same advantage. It turned out to be a compiler trick.
Another example is SunSpider: it performs very well on IE, but most other JavaScript benchmarks run poorly on IE.
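To make "dismiss outliers" a bit more concrete, here is a rough sketch of how one could do it over a panel of results. It is only an illustration: the benchmark names and ratios are made up, and the median/MAD rule with a threshold of 3 is my own assumption, not something any of these benchmarks or reviewers prescribe.

```python
import math
from statistics import median


def flag_outliers(ratios, threshold=3.0):
    """Return names of benchmarks whose log-ratio sits far from the panel median.

    ratios maps a benchmark name to a score ratio (device A / device B).
    The median absolute deviation (MAD) rule and the threshold are assumptions.
    """
    logs = {name: math.log(r) for name, r in ratios.items()}
    med = median(logs.values())
    mad = median(abs(v - med) for v in logs.values())
    if mad == 0:
        # Every benchmark agrees exactly, so nothing stands out.
        return []
    return [name for name, v in logs.items() if abs(v - med) / mad > threshold]


def geometric_mean(values):
    """Geometric mean, the usual way to average performance ratios."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical ratios, invented for illustration only (not measured results).
panel = {
    "browser_a": 1.05,
    "browser_b": 0.95,
    "native_a": 1.10,
    "native_b": 1.00,
    "java_a": 0.90,
    "suspicious": 2.50,
}

outliers = flag_outliers(panel)
kept = [r for name, r in panel.items() if name not in outliers]
summary = geometric_mean(kept)
print("outliers:", outliers)
print("geometric mean without outliers:", round(summary, 3))
```

The point is only that a result far from what the rest of the panel says gets a second look before it is counted in the overall picture.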
IMHO you need a very wide panel of benchmarks:
- browser: it's the most-used app
- native: you need to know how fast the processor is, and anyway there are very important native apps, such as Angry Birds
- Java: after all, most Android apps are written in Java.
This last point raises a concern: why is Dalvik so poor on Intel despite Intel having worked on Android for several years? There are traces of Intel commits in Dalvik starting in 2009. And as Exophase wrote, Silvermont should be much less sensitive to JIT tuning than previous Atom cores.