Geekbench 3 Sandy Bridge vs. Apple Cyclone IPC comparison


FwFred

Member
Sep 8, 2011
149
7
81
Impressive IPC on Geekbench, but if we are going to start comparing vs. Core I think we need more than toy benchmarks. I'm not convinced even POWER8 would look that much better @ 1.3GHz on these.

Also, I think there is too much focus on IPC. We know much more about Core, Atom, Krait, and A15's overall design parameters. We know next to nothing about Cyclone. What frequency can it run at in an iPad, and with how many cores? If Cyclone is a 1.4-1.5GHz dual core in an iPad, I think we can stop the silly comparisons to Core and focus on Bay Trail, Snapdragon 800, and Tegra 4.

Another interesting point, if Cyclone is locked at 1.3GHz the entire run, why is such a capable design team leaving so much performance on the table? If you don't have a good turbo-like capability, you are behind the curve.
 

seitur

Senior member
Jul 12, 2013
383
1
81
I wouldn't mind if Apple steps in to fill the competitive void that AMD has created. Anything to keep the MPU business' feet to the fire.

I wonder how much of the A7 was Jim Keller's handiwork (in terms of project management), and if so then I wonder how much of it might bleed over into future AMD chips?
+1

Pretty much this.

PS. Couple this with those rumors about Apple sniffing around the fab business and we could have an interesting Intel vs. Apple battle in the future.
 

mikk

Diamond Member
May 15, 2012
4,173
2,211
136
That or you don't get that Apple traded frequency for IPC. Is that so hard to understand and admit?


Considering that Geekbench v2 gave us flawed scores it is not hard to understand that Geekbench produces some nonsense.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
That or you don't get that Apple traded frequency for IPC. Is that so hard to understand and admit?

Go look at the execution resources purportedly inside of Cyclone vs those inside of Haswell. It's literally impossible for it to have higher IPC than Haswell absent a complete misstep on Intel's part or poor software compilation/utilization of those resources. Now which of those seems more likely?
 

epidemis

Senior member
Jun 6, 2007
796
0
0
I agree, but it's not about how you'll never run SB at 1.6GHz, but about a near clock-for-clock comparison of the two architectures.

Cyclone is one sick chip, but we [all] also have to remember that SB is two generations old too.
He's got a point.

Intel is going to lose the MacBook Air business soon.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
It would be good if people used the correct term here, perf/MHz, not IPC. The two processors aren't even running the same instructions.

Go look at the execution resources purportedly inside of Cyclone vs those inside of Haswell. It's literally impossible for it to have higher IPC than Haswell absent a complete misstep on Intel's part or poor software compilation/utilization of those resources. Now which of those seems more likely?

Execution resources purportedly inside of Cyclone according to whom? Apple hasn't said anything, I don't believe compiler source is revealing anything, and we don't even have die shots.

There is also more to perf/MHz than execution resources, and this is what Nothingness is saying. The other side of it is latency and stalls. A processor designed for a lower maximum clock speed has latencies that correspond to a smaller number of clock cycles. The fastest critical path operations in a synchronous CPU design will almost always be designed to take a fixed number of cycles, regardless of clock speed.

Let's take an extreme look at this: a processor that runs at 1MHz. You could probably design it so that accesses straight to main memory took only one cycle, with no cache or prefetching needed. And forget about worrying about branch misprediction. Now take it even further and make this a 100kHz processor that's really the same 1MHz processor internally. Now it can seemingly execute 10 instructions simultaneously, regardless of dependencies. There should be little doubt that this processor would easily exceed the perf/MHz of Haswell.

That's a very extreme and unrealistic example to illustrate a point. But even going from a design target of ~4GHz to ~1.5GHz gives you a pretty big advantage in how you accommodate timings.
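A rough sketch of the timing argument above, in Python. The 60 ns memory latency is a made-up figure chosen only for illustration, and the function name is hypothetical; the point is just that a latency fixed in nanoseconds costs fewer cycles on a slower clock.

```python
def latency_in_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert a fixed wall-clock latency into core clock cycles."""
    # 1 GHz is one cycle per nanosecond, so cycles = ns * GHz.
    return latency_ns * clock_ghz


# Assumption for illustration only: 60 ns to reach main memory.
DRAM_LATENCY_NS = 60.0

# ~4 GHz design target, ~1.5 GHz design target, and the extreme 1 MHz case.
for clock_ghz in (4.0, 1.5, 0.001):
    cycles = latency_in_cycles(DRAM_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz:g} GHz: {cycles:g} cycles per memory access")
```

Under that assumed latency, the 1MHz case comes out well under a single cycle, which is why the thought experiment needs no cache at all.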
 

Magic Carpet

Diamond Member
Oct 2, 2011
3,477
232
106
While 'Cyclone' is impressive enough as a mobile part, these types of comparisons between such vastly different architectural designs do not really tell us anything.

SNB was not designed for 1.6 GHz. It was primarily designed to operate at twice that speed. Of course when you restrict it to lower speeds and TDPs a whole lot of lesser powered architectures become competitive.

But while a SNB can always clock down to 1.6 GHz, can a 'Cyclone' clock up to 3.2 GHz? Operating at lower frequencies is easy for a CPU that is designed for higher ones, but vice versa might not be possible and -even if possible- might not yield expected linear performance gains with frequency since there might be architectural bottlenecks fundamental to the design.

Aside, you were wise to not benchmark Cyclone against Haswell running software compiled with AVX2 support. Even at 1.6 GHz Haswell would utterly destroy the Cyclone.
This.

And thanks to Intel17 for an interesting thread, I've enjoyed reading it
 

meloz

Senior member
Jul 8, 2008
320
0
76
It would be good if people used the correct term here, perf/MHz, not IPC. The two processors aren't even running the same instructions.

Right you are.

And even performance/MHz is irrelevant in the bigger picture. Two processors can achieve similar performance, one at 1 GHz and other at 1.5 GHz.

Which is better?

The answer is whichever consumes less power and/or costs less, depending on how much weight you give to performance/watt and performance/$. One of the reasons Xeon continues to rule the server segment in spite of the absurd profit margin Intel enjoys on each CPU is that it gives strong performance/watt and also strong density. Sure, a single Xeon might consume 130 watts at peak load, but it also does a lot of work with that power.
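To make the trade-off concrete, here is a minimal sketch with two made-up chips that reach the same score at 1 GHz and 1.5 GHz. Every number and name below is hypothetical and chosen only to show that the perf/GHz winner need not win on perf/watt or perf/$.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    name: str
    score: float       # benchmark score, arbitrary units
    clock_ghz: float
    watts: float
    price_usd: float

    def perf_per_ghz(self) -> float:
        return self.score / self.clock_ghz

    def perf_per_watt(self) -> float:
        return self.score / self.watts

    def perf_per_dollar(self) -> float:
        return self.score / self.price_usd


# Hypothetical parts: identical score, different clock, power and price.
chip_a = Chip("A", score=3000, clock_ghz=1.0, watts=5.0, price_usd=50.0)
chip_b = Chip("B", score=3000, clock_ghz=1.5, watts=3.0, price_usd=30.0)

for chip in (chip_a, chip_b):
    print(chip.name, chip.perf_per_ghz(), chip.perf_per_watt(), chip.perf_per_dollar())
```

With these invented figures, chip A looks better per GHz while chip B is the better buy on both of the metrics that actually cost money.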

Aside, this is also what is so funny about Intel's Bay Trail: everyone is going on about whether or not Intel have done enough to match or beat the competition in performance/watt. Again, after a certain performance/watt point it does not matter: the new Atom is already looking like a failure thanks to Intel's pricing. Unless they change their attitude, Intel should not even bother with Airmont; that will be an even bigger failure.

I sound like an utter sourpuss, sorry Intel17, but you have made the typical moped versus bus comparison to calculate the cost per person per mile. And in this case the poor bus is artificially constrained to half its capacity. Sure, lots of data, and all of it utterly meaningless.
 

FwFred

Member
Sep 8, 2011
149
7
81
Right you are.
the new Atom is already looking like a failure thanks to Intel's pricing. Unless they change their attitude Intel should not even bother with Airmont, that will be an even bigger failure.

Where have you seen anything concrete about Bay Trail pricing? All I've seen is to expect Bay Trail in $199 netbooks. I doubt they are charging anything exorbitant.

Did Intel say what part will be in the $99 tablets? Regardless of the part, I fail to see any evidence Intel is more expensive than Qualcomm/Nvidia/Samsung. Mediatek/Rockchip/Allwinner may be another story.

edit: DailyTech says $99 Bay Trail tablets
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
It would be good if people used the correct term here, perf/MHz, not IPC. The two processors aren't even running the same instructions.



Execution resources purportedly inside of Cyclone according to whom? Apple hasn't said anything, I don't believe compiler source is revealing anything, and we don't even have die shots.

There is also more to perf/MHz than execution resources, and this is what Nothingness is saying. The other side of it is latency and stalls. A processor designed for a lower maximum clock speed has latencies that correspond to a smaller number of clock cycles. The fastest critical path operations in a synchronous CPU design will almost always be designed to take a fixed number of cycles, regardless of clock speed.

Let's take an extreme look at this: a processor that runs at 1MHz. You could probably design it so that accesses straight to main memory took only one cycle, with no cache or prefetching needed. And forget about worrying about branch misprediction. Now take it even further and make this a 100kHz processor that's really the same 1MHz processor internally. Now it can seemingly execute 10 instructions simultaneously, regardless of dependencies. There should be little doubt that this processor would easily exceed the perf/MHz of Haswell.

That's a very extreme and unrealistic example to illustrate a point. But even going from a design target of ~4GHz to ~1.5GHz gives you a pretty big advantage in how you accommodate timings.

So you honestly believe Cyclone performs better clock for clock than Haswell?
 

Blandge

Member
Jul 10, 2012
172
0
0
There is so much that goes into comparing microarchitectures that it's impossible to determine relative performance without knowing more details. Specifically, the workload and compiler can completely deform the functionality of any computation engine. Modern CPUs have so many gotchas that if a workload or compiler does something the wrong way it can completely tank performance. We can't say for sure that anything like that is happening here, but it is possible that Cyclone may perform better than Haswell in some specific use cases.

The important question is how they perform when using a mature software stack that does useful work. Anything else is purely academic, and nearly completely useless unless we have ALL of the important details in hand to analyze microarchitectural oddities. Is the code and/or disassembly of the sub-benchmarks used in this Geekbench version available for analysis? That would be interesting to see.

ARM likes to use benchmarks like Geekbench to show how close their performance is to Intel when doing something like simple math in a loop, or some highly optimized single-threaded encryption algorithm. This is something that their microarchitecture is designed to do, and indeed it does it very well (even compared to huge x86 cores), but it gives the false impression that you have a central processing unit that can do what has taken AMD and Intel 20 years to achieve. However, once you want to do something real that requires multiple cores running different software with less-than-perfect optimization (coded by some intern), swapping threads in and out every couple of microseconds, moving data from cache to cache and core to core, getting stuck at the back of some buffer in the sideband fabric that's being held up by an interrupt that is waiting for other interrupts, with data flying in from all sources of IO through USB, PCIe, SATA, and whatever other orifices make up the device, you finally understand that all of that purely academic research you have been doing counts for nothing, and you are left with a steaming pile of garbage that costs $15 and can do simple math really well.

This is the exact reason why until recently ARM has been considered a second-class citizen reserved for microcontrollers running extremely tight, highly optimized hand-written assembly (or maybe something fancy like C), and this is the exact reason why a lot of us have a hard time being impressed by someone posting some good benchmark scores. Let me take a snapshot of my desktop at 11pm on a Saturday night and see how well Cyclone handles Firefox with 20 tabs, Excel, Word, Battlefield 3, antivirus, Skype and 50 other processes running on top of Windows 8. But at least it can do SHA hashing really fast.

The impressive part about modern x86 CPUs (AMD and Intel) isn't that they have a "really high IPC". It's that they can handle the chaos that is the desktop OS with grace, and also service a whole range of workloads from handsets to servers and perform admirably. If all you care about is SHA256 hashing or Sobel edge detection then I can make you an ASIC that will blow your socks off.

Moral of the story: GPUs and various other co-processors on an SoC do all of these operations in Geekbench way better than a general-purpose CPU. The CPU is never going to do any of the things on this list for any significant amount of time, so why do we even care?

Important things: Multitasking, Locking, synchronization and coherency behavior, Interrupts, IO and Memory Bandwidth and Latency, Caching, Branch Prediction and stalls.

Not Important things: Mathematical calculations that a GPU or other co-processors will do better.

Obviously it all depends on the use case, and in many cases an ARM microprocessor is the best choice, but this post was a response to "ARM can replace Haswell. Source: These Geekbench scores"
 

Khato

Golden Member
Jul 15, 2001
1,225
281
136
Another interesting point, if Cyclone is locked at 1.3GHz the entire run, why is such a capable design team leaving so much performance on the table? If you don't have a good turbo-like capability, you are behind the curve.

This point has been bugging me as well, especially when comparing the Geekbench component scores for the iPhone 5 vs. iPhone 5s when both run 32-bit. If you've ever compared a processor against its previous generation at the same frequency, well, typically you'll see a few tests that see 30%+ gains, most in the 0-10% range, a few actually decreasing in performance, and then 50%+ gains on floating point from doubling the width/number of execution units. But comparing A6 against A7 has every single test except for the floating point Mandelbrot showing at least a 20% performance gain.

Now I could see how they'd get that kind of performance gain from one generation to the next if either the previous generation was extremely non-optimized or if it wasn't a refinement of the previous design. But neither of those applies in this case - Swift was already a pretty good design, and unless Apple has at least two separate CPU design teams working in parallel I don't see how they'd have had the time to do anything more than improve upon Swift.

But an interesting thing happens if you adjust the numbers for A7 to assume a 'turbo' in the range of 1.6-1.7 GHz - they suddenly look like what you'd expect to see. A few of the integer tests showing 30%+ gains, with most in the single digit range and the rest staying in the margin of error/showing slight reductions. Likewise, the floating point tests are then single digit gains/roughly even for those that aren't affected by whatever A7 doubled, a marked reduction for Mandelbrot implying that a shift in floating point resources created a contention, and 50%+ gains in the remaining tests thanks to taking advantage of doubled resources.
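The adjustment described above is simple arithmetic, sketched below. The observed gains (20%, 30%, 60%) are placeholders rather than actual Geekbench results, the 1.65 GHz figure is just the midpoint of the conjectured 1.6-1.7 GHz range, and the function name is hypothetical.

```python
def per_clock_gain(observed_gain: float, old_clock_ghz: float, new_clock_ghz: float) -> float:
    """Gain left over once the clock-speed difference is factored out."""
    speedup = 1.0 + observed_gain
    clock_ratio = new_clock_ghz / old_clock_ghz
    return speedup / clock_ratio - 1.0


A6_CLOCK_GHZ = 1.3           # nominal clock assumed for both chips in the thread
A7_TURBO_GUESS_GHZ = 1.65    # assumption: midpoint of the conjectured 1.6-1.7 GHz

# Placeholder observed gains, not measured data.
for observed in (0.20, 0.30, 0.60):
    adjusted = per_clock_gain(observed, A6_CLOCK_GHZ, A7_TURBO_GUESS_GHZ)
    print(f"observed {observed:+.0%} -> per-clock {adjusted:+.1%}")
```

Under the assumed turbo clock, a uniform 20% observed gain turns into a slight per-clock regression and a 30% gain shrinks to low single digits, which is the pattern the conjecture relies on.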

The above is, of course, just conjecture based on what fits the results. While it'd be quite odd, it's certainly possible that Apple somehow managed that level of performance increase without touching frequency... and then got even more performance on top of that when running the tests in the 64 bit mode. To put the magnitude of the gain that the A7 in 64 bit holds over the A6 in these tests (ignoring AES and SHA1 as they're clearly benefitting from acceleration) into perspective - it's comparable to the gains seen going from a 3 GHz Pentium 4 to a 3 GHz Core 2 Duo... and we all know that gain was only possible because of how poor of a Performance/MHz design the Pentium 4 was.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
This is wrong:
- Windows: VS
- Android: gcc
- Linux/iOS: clang

He believes that Visual Studio uses ICC for its code generation, which we both know is wrong (he states a lot of weird and false things as fact w/o source, readers should take caution)

This is the source for the compilers used for Geekbench: http://www.realworldtech.com/forum/?threadid=135540&curpostid=136174

The pricing he has no knowledge of, so has just made up.

Possibly also pricing Intel has published:

http://ark.intel.com/products/76760/Intel-Atom-Processor-Z3770-2M-Cache-up-to-2_39-GHz

$37 isn't that bad but it's still probably a decent bit more expensive than most competitors out right now. Actual big OEMs may be paying less.
 

mikk

Diamond Member
May 15, 2012
4,173
2,211
136
Possibly also pricing Intel has published:

http://ark.intel.com/products/76760/Intel-Atom-Processor-Z3770-2M-Cache-up-to-2_39-GHz

$37 isn't that bad but it's still probably a decent bit more expensive than most competitors out right now. Actual big OEMs may be paying less.


You can be sure that big OEMs like Acer, Asus etc. get nice discounts because they need many more than 1k CPUs. Bay Trail-T prices are lower than its predecessor's, and low enough that the CPU itself accounts for a small portion of the overall device price. Even when some competitors are $10 cheaper, it makes a small difference to the overall tablet price. As long as Intel has superior performance or superior perf/watt, a small extra charge is worth it. Furthermore, all announced Bay Trail-T tablets are Windows 8 devices, and a Win8 license is not free. Android tablet prices should be even lower than this:

Acer: Updated version of 8-inch "Bay Trail" W3-810. Battery: 8 hours. Price: $349
ASUS: 10.1-inch "Bay Trail" Transformer Book Trio T100TA. Battery: 12 hours. Price: $329. (This is a complement to the pricier 13.3-inch Transformer Book.)
Dell: 8-inch "Bay Trail" Venue. Battery: 10+ hours. Price: $299
Dell: 10.8-inch codenamed "Midland" running "BayTrail." Battery life: 9 hours (replaceable). Price: $399
Lenovo: 8-inch "Bay Trail" Miix 8. Battery: 8 hours. Price: $249
Lenovo: 10.1-inch "Bay Trail" Miix 2. Battery: 8 hours. Price: $449.
Toshiba: 8-inch "Bay Trail" Encore. Battery: 6-7 hours. Price: $329
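As a back-of-the-envelope check on the "small portion" claim, the sketch below divides the $37 list price quoted earlier by a few of the tablet prices above. It assumes, purely for illustration, that each tablet uses a part priced like the Z3770 at list; the actual SKUs and OEM discounts are not known from this thread.

```python
def share_of_price(component_usd: float, device_usd: float) -> float:
    """Fraction of the device price accounted for by one component."""
    return component_usd / device_usd


CPU_LIST_PRICE_USD = 37.0    # Z3770 list price from the ARK link; an assumption for these tablets
COMPETITOR_DELTA_USD = 10.0  # the "$10 cheaper" competitor case

for device_usd in (249.0, 329.0, 449.0):
    cpu = share_of_price(CPU_LIST_PRICE_USD, device_usd)
    delta = share_of_price(COMPETITOR_DELTA_USD, device_usd)
    print(f"${device_usd:.0f} tablet: CPU {cpu:.1%} of price, $10 delta {delta:.1%}")
```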
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
He believes that Visual Studio uses ICC for its code generation
It uses ICC with Visual Studio. Visual Studio by itself will give generic vectors for AMD/Intel/VIA. Both AMD and VIA only get scalar instructions, though, so it is pretty obvious which compiler is being used.
 

rgallant

Golden Member
Apr 14, 2007
1,361
11
81
Now let's see them scale it up to 5GHz SB :|

I agree.
Also, can it play CRYSIS @ 2560 x 1440 max settings, or maybe I missed that benchmark?
It would be like a Smart car passing you on the highway doing 220 mph on a windy day. Er, not going to happen.
 