Anyone saying ARM will take over is living in a fantasy land and probably doesn't understand it very well.
x86 kills ARM in throughput, because ARM is a very stripped-down, low-power design. ARM has a reduced instruction set, so it's probably an order of magnitude slower than an x86 processor of five years ago. x86 is faster because it uses a lot of tricks to squeeze out more performance, such as deeper pipelines, branch prediction, large caches, wider decoders, out-of-order execution, etc., and all of that adds power consumption. Making ARM as powerful as x86 would pretty much kill the whole low-power ideology behind the design.
That said, for many tasks, and with a good GPGPU strategy, I could see ARM making some inroads in narrowly purposed devices. I think x86 will fix its power consumption, and battery technology will improve, long before ARM gets powerful enough to compete in the x86 space.
In the meantime, ARM is going to stay on tablets.
I disagree with all the points you seem to base your argument on... can you restate your argument or correct me where I'm wrong?
RISC doesn't mean reduced capability; it means you can use less hardware to get the same performance. Until Intel outspent all the RISC companies, the fastest processors in the world were RISC designs, e.g. Alpha, MIPS (SGI), Power/PowerPC, etc. Intel made its money by working from the bottom up, being willing to accept lower margins than the RISC vendors. ARM is now in the same position, raking in cash from cell phones, tablets, etc., which it can use to attack Intel's markets at lower cost. ARM is arguably in an even safer position: because of its large ecosystem, there's less risk of any one vendor's NetBurst-style disaster sinking the whole ecosystem.
AMD's K8 had a 12-stage pipeline, and I believe that remained through at least Phenom II. ARM's A8 has a 13-stage pipeline, A9 has an 8-stage pipeline, and A15 has a 15-stage pipeline. If there's opportunity in deeper pipelines, it doesn't look like ARM is missing out. Given that they've both shortened and lengthened pipelines across their products, they probably know what they're doing here.
A9 and A15 both have 32KB L1 instruction and data caches, like all Intel processors I can think of from Pentium II through Sandy Bridge, excluding the NetBurst designs. A9 and A15 support up to 4MB of L2 cache per 4 cores (that's more than many Athlon IIs, right? Maybe also more than some modern "Pentium" Core i#-derivatives?)
A15 is 3-wide, which means its decoders are at least as wide as K8's (i.e. through at least Phenom II). Granted, many x86 instructions perform two tasks (e.g. read memory and operate on the result), but the decoders are equally wide. If ARM chooses to build an even wider decoder, that's much easier for them, since their instructions can't be any arbitrary length from 1 to 15 bytes. An x86 decoder becomes unwieldy above 4-wide, but for RISC instruction sets it's easy to go wider.
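To make the decoder point concrete, here's a rough toy model in Python. It is only a sketch under made-up assumptions: the 4-byte fixed width and the "first byte holds the instruction length" encoding are hypothetical and are not real ARM or x86 encodings. The only thing it tries to show is that with fixed-width instructions every decoder slot knows where its instruction starts without looking at its neighbours, while with variable-length instructions each start depends on every earlier length.

```python
# Toy model only: NOT real ARM or x86 encodings.
FIXED_WIDTH = 4  # bytes per instruction in the hypothetical fixed-width ISA


def fixed_width_starts(code, n):
    """Start offsets of the first n instructions in a fixed-width stream.

    Each offset depends only on its index, so n decoders could each be
    handed their slot without inspecting any other instruction.
    """
    count = min(n, len(code) // FIXED_WIDTH)
    return [i * FIXED_WIDTH for i in range(count)]


def variable_length_starts(code, n):
    """Start offsets of the first n instructions in a variable-length stream.

    Hypothetical encoding: the first byte of each instruction holds its
    total length (1 to 15 bytes). Offset k is only known once the lengths
    of instructions 0..k-1 are known, so this loop is a serial chain.
    """
    starts = []
    pos = 0
    while len(starts) < n and pos < len(code):
        length = code[pos]
        if not 1 <= length <= 15:
            raise ValueError(f"bad length byte {length} at offset {pos}")
        starts.append(pos)
        pos += length  # the next start depends on this instruction's length
    return starts


if __name__ == "__main__":
    print(fixed_width_starts(bytes(16), 4))
    # prints [0, 4, 8, 12]
    print(variable_length_starts(bytes([1, 3, 0, 0, 2, 0, 5, 0, 0, 0, 0]), 4))
    # prints [0, 1, 4, 6]
```

Real x86 front ends use various tricks to break that serial chain, but those tricks cost hardware and power, and the cost grows as the decoder gets wider.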
A9 is out of order, so ARM has already been paying whatever power penalty comes with that for a long time now.