That wasn't clear from the review, but 12 fps matches up with where SGX 545 should land at the Z2760's 533 MHz, relative to the Z2460's SGX 540 at 400 MHz - see the GFXBench result for Atom with SGX 540 @ 400 MHz. But you're quite right that the number I linked was incorrect.
I agree, the CT numbers on Win 8 look lower than they should, and bad drivers could be to blame. But unless there's a universal penalty for GLBenchmark 2.5 running on Win 8 instead of Android, it doesn't really matter why the CT score is as bad as it is - only that it's the baseline for estimating the score of 28-32 or so that BT is going to get. That simply isn't a competitive score.
Oh, and as for power... yes, tablets have higher tolerances, but I wouldn't be at all surprised if Adreno 330 were drawing over 4 watts to achieve that level of performance - more likely somewhere in the 6-8 watt range. (Based on the GPU power figures in Anandtech's x86 vs ARM power article: on the Android side, Mali T-604 uses around 3.5 watts for 43 fps in GLBenchmark 2.5 and Adreno 225 uses around 1 watt for 14.5 fps; on the Windows side, Tegra 3 uses around 1.75 watts for 11.5 fps and SGX 545 uses 0.75 watts for 7 fps, which isn't as useful for this extrapolation.)
I don't know - if Adreno 225 used 1W and the performance is 4.7x better, then 6-8W means power rising 6-8x for 4.7x the performance, i.e. 28% to 70% worse perf/W at peak. Some of the peak performance improvement comes from a clock bump from 300MHz to 450MHz, which will take its toll on efficiency, but the rest comes from a wider uarch, which for GPUs shouldn't hurt peak perf/W that much. Given the move to HKMG and the uarch improvements, 8W seems on the overly pessimistic side, 6W not nearly as much. I do agree that well over 4W is pretty much a given.
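For anyone who wants to check the arithmetic, here's a quick sketch. The Adreno 330 inputs (4.7x Adreno 225's performance, 6-8 W draw) are the speculative numbers from this thread, not measurements:

```python
# Back-of-the-envelope perf/W check using the GLBenchmark 2.5 numbers
# quoted above. Adreno 330's speedup and wattage are assumptions from
# this discussion, not measured values.

ADRENO_225_FPS = 14.5   # fps at ~1 W (Anandtech x86 vs ARM article)
ADRENO_225_W = 1.0
SPEEDUP = 4.7           # assumed Adreno 330 peak perf vs Adreno 225

adreno_330_fps = ADRENO_225_FPS * SPEEDUP  # ~68 fps

for watts in (6.0, 8.0):
    power_ratio = watts / ADRENO_225_W
    # Power rises faster than performance, so energy per frame goes up:
    penalty = power_ratio / SPEEDUP - 1.0
    print(f"{watts:.0f} W: ~{adreno_330_fps:.0f} fps, "
          f"{penalty:.0%} worse perf/W than Adreno 225")
```

Running it gives roughly 28% worse perf/W at 6 W and 70% worse at 8 W, which is where the range above comes from.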
But everyone else is willing to allow this kind of power draw with the GPU running heavily on tablets. I don't think anyone would prefer being unable to play more demanding content over playing it while draining the battery. That extra performance should simply go unused when it's not needed. People just need an option to run games at artificially lower frame rates and/or with fewer features when they're comfortable trading performance for battery life.
Intel should have better perf/W thanks to their process advantage, but as far as the GPU goes they're really not capitalizing on it at all. Nothing new there, I suppose. They're getting a lot more serious about offering fatter GPUs with Haswell, but that could be heavily influenced by having Apple as a customer in that space.