Apple A9X: the new mobile SoC king


gdansk

Platinum Member
Feb 8, 2011
2,478
3,373
136
Geekbench, notably the SHA2 section, is total garbage. It doesn't use x64's dedicated SHA instructions.
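For anyone curious whether their own x86 chip even exposes those dedicated SHA instructions, here is a minimal sketch (Linux-only, reading /proc/cpuinfo; the sha_ni flag is what the kernel reports for the SHA extensions):

```python
# Quick check for the x86 SHA extensions (SHA-NI) on Linux.
# Looks for the "sha_ni" flag in /proc/cpuinfo.
def has_sha_ni(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return "sha_ni" in line.split()
    except OSError:
        pass
    return False

print("SHA-NI available:", has_sha_ni())
```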
 

386DX

Member
Feb 11, 2010
197
0
0
I'm not going to debate with you anymore on this. The numbers speak for themselves.

Max load on an iPad Air 2 is 11W vs 29W on the MacBook.

Are you seriously saying that a faster SSD, more RAM and a marginally bigger screen use 18W? That is ridiculous.

You only have to look at the idle and max figures to isolate the CPU/GPU components, as the max is measured when loading the CPU/GPU.

The delta on the iPad Air 2 is 6W and on the MacBook it is 23W.

Seriously, people need to use their brains a bit. Notebookcheck is measuring power from the wall, not the SoC; they are essentially just measuring the size of the power adapter the device comes with. There is no real easy way of measuring the SoC power. If the iPad Air 2 came with a 5W adapter you'd get a max reading of 5W (of course your battery would probably not charge while the device is on). Drawing more power from the wall doesn't mean the CPU is actually drawing all that power, as the extra power draw is used to charge the battery at a quicker pace.

Like you said, the max load on an iPad Air 2 (as measured at the wall) is 11W and on the MacBook 29W... I'll give you two guesses what the wattages of the power adapters that come with these devices are. 10W and 29W... Coincidence? I think not. Once you factor in adapter efficiency you get the 11W and 29W numbers.
 

thunng8

Member
Jan 8, 2013
153
61
101
Seriously, people need to use their brains a bit. Notebookcheck is measuring power from the wall, not the SoC; they are essentially just measuring the size of the power adapter the device comes with. There is no real easy way of measuring the SoC power. If the iPad Air 2 came with a 5W adapter you'd get a max reading of 5W (of course your battery would probably not charge while the device is on). Drawing more power from the wall doesn't mean the CPU is actually drawing all that power, as the extra power draw is used to charge the battery at a quicker pace.

Like you said, the max load on an iPad Air 2 (as measured at the wall) is 11W and on the MacBook 29W... I'll give you two guesses what the wattages of the power adapters that come with these devices are. 10W and 29W... Coincidence? I think not. Once you factor in adapter efficiency you get the 11W and 29W numbers.

I think you should get your facts straight. Apple designed their power adapters with just enough output to power the device at maximum load without losing battery charge. Hence maximum load is near the maximum output of the power adapter.

There are many examples of maximum power draw nowhere near the maximum power output of the power supply. eg.

http://www.notebookcheck.net/Asus-Zenbook-UX305-Subnotebook-Review.136543.0.html

A 45W power supply with a maximum load of 30.3W. In this case, even when the notebook is running at the maximum possible load, the notebook can still be charged.

So before telling people to use their brains, you should use yours.
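Putting the wall-power numbers from this exchange side by side (a minimal sketch; the idle figures are inferred from the max and delta values quoted above, and the adapter ratings are the ones claimed in these posts):

```python
# Wall-power figures quoted in this exchange (Notebookcheck-style measurements).
# idle = max - delta as claimed above; adapter ratings as stated by the posters.
devices = {
    "iPad Air 2":       {"idle_w": 5.0,  "max_w": 11.0, "adapter_w": 10.0},
    "MacBook (Core M)": {"idle_w": 6.0,  "max_w": 29.0, "adapter_w": 29.0},
    "Zenbook UX305":    {"idle_w": None, "max_w": 30.3, "adapter_w": 45.0},
}

for name, d in devices.items():
    delta = d["max_w"] - d["idle_w"] if d["idle_w"] is not None else None
    headroom = d["adapter_w"] - d["max_w"]   # > 0 means it can still charge at full load
    delta_str = f"{delta:.1f} W" if delta is not None else "n/a"
    print(f"{name:18s} load delta {delta_str:>7s}, adapter headroom {headroom:+.1f} W")
```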
 

knutinh

Member
Jan 13, 2006
61
3
66
...
I don't think it is a problem if a benchmark is not vectorized, if it is application logic that the average programmer would write. But the algorithms Geekbench uses don't belong in this category. The algorithms Geekbench uses are normally heavily vectorized and optimized.
...
The more I learn about software optimization, the more sceptical I am about cpu benchmarks.

Usually, you are testing a particular software implementation, a compiler and some piece of hardware jointly. Trying to compare two pieces of hardware this way is hard.

My experience is that software implementations (and compilers) tend to be quite suboptimal on a pure performance basis, but the degree of "optimalness" is highly non-linear and hard to predict. Thus, small changes in code (or compiler) can result in large changes in performance.

It is often hard to know if a piece of software is close to "optimal". So (often) you don't even know with any confidence whether you are close to hw limits. In simple cases, you might be able to determine that e.g. memory limits your application and that this memory traffic is unavoidable. But often the problem is so complex (and the hardware is so hard to fathom) that such estimates are crude.

Testing the "sw ecosystem" might be equally relevant. I.e. how many man-hours does it take to implement operation X at a running speed of N seconds? How many dollars worth of tools/licenses? How many prospective customers will the platform offer the developer to distribute costs on? If the answer is that "this platform allows the dev to run some given matrix multiplications at one million/second by using a free library", then that may (or may not) be more relevant than "this platform allows the dev to run those same matrix mults at four million/second by investing 6 months of development time writing assembly and understanding the quirks of cache implementations".
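To make the library-vs-hand-tuning point concrete, here is a minimal sketch (Python; the matrix size and timings are illustrative, not the million/second figures above) comparing a naive triple-loop matrix multiply with the same operation done through a free optimized library:

```python
import time
import numpy as np

N = 200
a = np.random.rand(N, N)
b = np.random.rand(N, N)

def naive_matmul(x, y):
    """Straightforward triple loop - what an 'average programmer' might write."""
    n = len(x)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += x[i][k] * y[k][j]
            c[i][j] = s
    return c

t0 = time.perf_counter()
naive_matmul(a.tolist(), b.tolist())
t_naive = time.perf_counter() - t0

t0 = time.perf_counter()
c = a @ b                     # NumPy dispatches to an optimized BLAS
t_blas = time.perf_counter() - t0

print(f"naive: {t_naive:.3f}s, library: {t_blas:.4f}s, speedup ~{t_naive / t_blas:.0f}x")
```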

One might expect that any platform/task combination will have some unique performance vs "cost" curve. Measuring absolute hardware limits only tells you the (expected) asymptote of that curve for those willing to put endless effort into the project, while details of that curve tells you more about what you can get for a more moderate effort.

That said, if your box is going to be devoted to one or a few tasks (calculate FFTs or run Quake or whatever), then measuring performance for the application of interest is going to tell you how fast that application is going to run on two or more hw platforms. Until the next recompile at least.

-k
 
Last edited:

Idontcare

Elite Member
Oct 10, 1999
21,118
58
91
The more I learn about software optimization, the more sceptical I am about cpu benchmarks.

Usually, you are testing a particular software implementation, a compiler and some piece of hardware jointly. Trying to compare two pieces of hardware this way is hard.

My experience is that software implementations (and compilers) tend to be quite suboptimal on a pure performance basis, but the degree of "optimalness" is highly non-linear and hard to predict. Thus, small changes in code (or compiler) can result in large changes in performance.

It is often hard to know if a piece of software is close to "optimal". So (often) you don't even know with any confidence whether you are close to hw limits. In simple cases, you might be able to determine that e.g. memory limits your application and that this memory traffic is unavoidable. But often the problem is so complex (and the hardware is so hard to fathom) that such estimates are crude.

Testing the "sw ecosystem" might be equally relevant. I.e. how many man-hours does it take to implement operation X at a running speed of N seconds? How many dollars worth of tools/licenses? How many prospective customers will the platform offer the developer to distribute costs on? If the answer is that "this platform allows the dev to run some given matrix multiplications at one million/second by using a free library", then that may (or may not) be more relevant than "this platform allows the dev to run those same matrix mults at four million/second by investing 6 months of development time writing assembly and understanding the quirks of cache implementations".

One might expect that any platform/task combination will have some unique performance vs "cost" curve. Measuring absolute hardware limits only tells you the (expected) asymptote of that curve for those willing to put endless effort into the project, while details of that curve tells you more about what you can get for a more moderate effort.

That said, if your box is going to be devoted to one or a few tasks (calculate FFTs or run Quake or whatever), then measuring performance for the application of interest is going to tell you how fast that application is going to run on two or more hw platforms. Until the next recompile at least.

-k

Comes down to why anyone wants the performance in the first place.

If you want the performance because you want to generate higher benchmark numbers, then making ever more optimal compiles for ever higher benchmark scores is going to be a priority.

If, however, you want the performance because you have some software suite or application in mind, then all you really care about when it comes to benchmarks are two things - (1) is the benchmark a reasonable enough proxy for my software suite or application of interest?, and (2) what is the relative performance difference between two or more sets of hardware configurations?

If 3DMark is not a meaningful proxy for gaming, then the performance derived from 3DMark benching does not matter when you are looking to make price/performance assessments of gaming hardware, regardless of how optimally or unoptimally 3DMark has been compiled.

Conversely, if Handbrake is an app of interest to you, it does not matter how well optimized or unoptimized it is for various hardware; all that matters to you is how well it performs on that hardware.

A real-life example for myself is TMPGEnc. Some say it is well optimized for Intel hardware and thus will always turn out faster encode times on an Intel CPU than on a comparably priced AMD CPU. For me this distinction is irrelevant; all I am interested in is price/performance for the software as it comes to me from the distributor. I cannot acquire a hypothetically better-optimized version of it for an AMD processor, so allegations of compiler optimization bias are irrelevant in this situation.
 

thunng8

Member
Jan 8, 2013
153
61
101
I predict that:
A9 dual core at 1.8GHz
A9X tri-core at 2GHz, GT6850 GPU

From Chinese information.

We will have to wait for benchmarks, but it does look like Apple's quoted performance increases are for single-threaded tasks.

I.e. in Apple's tech specs for the iPad it notes that:
- the A8 in the mini 4 is 1.3x faster than the A7
- the A8X in the Air 2 is 1.4x faster than the A7
- the A9X in the Pro is 2.5x faster than the A7

link: http://www.apple.com/au/ipad/compare/

The A8X and A8 difference in single thread (even when both are running at 1.5GHz) can be explained by the larger L2 cache (2MB vs 1MB) and the faster memory interface.

If they were quoting multithreaded perf they would have put the A8X much higher than the A8 because of the core counts.
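As a quick sanity check, the ratios implied by those A7-relative figures (numbers taken from the Apple compare page linked above):

```python
# Apple's A7-relative claims from the compare page linked above.
speedup_vs_a7 = {"A8": 1.3, "A8X": 1.4, "A9X": 2.5}

a8x_vs_a8  = speedup_vs_a7["A8X"] / speedup_vs_a7["A8"]   # ~1.08x despite the extra core
a9x_vs_a8x = speedup_vs_a7["A9X"] / speedup_vs_a7["A8X"]  # ~1.79x
a9x_vs_a8  = speedup_vs_a7["A9X"] / speedup_vs_a7["A8"]   # ~1.92x

print(f"A8X vs A8:  {a8x_vs_a8:.2f}x")
print(f"A9X vs A8X: {a9x_vs_a8x:.2f}x")
print(f"A9X vs A8:  {a9x_vs_a8:.2f}x")
```

The tiny A8X-over-A8 gap despite the extra core is what suggests these are single-threaded figures.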
 
Last edited:
Mar 10, 2006
11,715
2,012
126
We will have to wait for benchmarks, but it does look like Apple's quoted performance increases are for single-threaded tasks.

I.e. in Apple's tech specs for the iPad it notes that:
- the A8 in the mini 4 is 1.3x faster than the A7
- the A8X in the Air 2 is 1.4x faster than the A7
- the A9X in the Pro is 2.5x faster than the A7

The A8X and A8 difference in single thread (even when both are running at 1.5GHz) can be explained by the larger L2 cache (2MB vs 1MB) and the faster memory interface.

If they were quoting multithreaded perf they would have put the A8X much higher than the A8 because of the core counts.

Yeah, if Apple's performance increase claims are at iso core counts, then they're damn impressive.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
Yeah, if Apple's performance increase claims are at iso core counts, then they're damn impressive.
If the information quoted here from the Chinese ministry of industry is right and the A9 is 1.8 GHz with 2 cores, we are talking about ~30% better IPC, which would definitely be impressive. I'll wait for benchmarks...
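A minimal sketch of where a figure like that comes from, assuming an A8 clock of roughly 1.4 GHz (an assumption, not a figure from this thread) and taking Apple's ~70% CPU claim at face value:

```python
# Rough IPC estimate: claimed speedup divided by the clock-speed ratio.
a8_clock_ghz = 1.4          # assumed A8 clock
a9_clock_ghz = 1.8          # from the rumoured Chinese filing quoted above
claimed_speedup = 1.70      # Apple's "70% faster CPU" claim

clock_ratio = a9_clock_ghz / a8_clock_ghz
ipc_gain = claimed_speedup / clock_ratio - 1.0
print(f"clock ratio {clock_ratio:.2f}x -> implied IPC gain ~{ipc_gain:.0%}")
```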
 

cytg111

Lifer
Mar 17, 2008
23,494
13,077
136
A real-life example for myself is TMPGEnc. Some say it is well optimized for Intel hardware and thus will always turn out faster encode times on an Intel CPU than on a comparably priced AMD CPU. For me this distinction is irrelevant; all I am interested in is price/performance for the software as it comes to me from the distributor. I cannot acquire a hypothetically better-optimized version of it for an AMD processor, so allegations of compiler optimization bias are irrelevant in this situation.

- Exactly this. At the end of the day, that's the only thing that matters.
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
Yeah, if Apple's performance increase claims are at iso core counts, then they're damn impressive.

A9X is 80% faster than A9 according to Apple. Assuming that's a pure multithreaded statement, we still get only a 33% improvement from a 4th core. The remaining 30% or more improvement would have to come from increased clocks and higher IPC. There is no way IPC can be improved by 30% or more on an already impressive high-performance CPU core. A9X is definitely quad core and A9 is tri-core. You only need to look at Cyclone to Enhanced Cyclone, or Broadwell to Skylake, to see that a 10% IPC improvement is itself at the upper end of what we can expect from an existing high-performance CPU core. The only other way to drastically improve IPC is through new CPU instructions and extending the ISA.
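The arithmetic behind that argument, as a minimal sketch (the 80% figure is Apple's claim quoted above; the 3-to-4 core counts are the poster's assumption):

```python
# If A9 were tri-core and A9X quad-core, perfect scaling from the extra core
# would only explain part of an 80% multithreaded speedup.
claimed_speedup = 1.80       # "A9X is 80% faster than A9"
core_scaling = 4 / 3         # best case from going 3 -> 4 cores (~1.33x)

remaining = claimed_speedup / core_scaling
print(f"left over for clocks + IPC: ~{remaining - 1:.0%}")   # ~35%
```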
 
Mar 10, 2006
11,715
2,012
126
A9X is 80% faster than A9 according to Apple. Assuming that's a pure multithreaded statement, we still get only a 33% improvement from a 4th core. The remaining 30% or more improvement would have to come from increased clocks and higher IPC. There is no way IPC can be improved by 30% or more on an already impressive high-performance CPU core. A9X is definitely quad core and A9 is tri-core. You only need to look at Cyclone to Enhanced Cyclone, or Broadwell to Skylake, to see that a 10% IPC improvement is itself at the upper end of what we can expect from an existing high-performance CPU core. The only other way to drastically improve IPC is through new CPU instructions and extending the ISA.

The die shot of A9 shows a dual-core config, not a tri-core. I think Apple was able to deliver a very large single-thread performance boost with A9, and A9X may be a tri-core.

[annotated A9 die shot image]
The reason I am confident that this is not a tri-core design is that you can very clearly see L2$ in the block that I have outlined. The rectangle that comes off of this block on the bottom left might have been a CPU core, but it doesn't look anything like the two CPU cores that I see inside of my blue rectangle.

I will be very interested to see if Apple has been able to deliver on these pretty impressive claims. If the A9 CPU runs at 1.8GHz, then Apple is getting some serious perf/clock improvements.
 
Last edited:

thunng8

Member
Jan 8, 2013
153
61
101
Yes, I agree. From the die shot the A9 is definitely dual core.

So that claimed 70% increase, if true, is very impressive.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
It's "up to". And let's see how much comes from the bandwidth increase, and then have it tested with a benchmark that doesn't cripple certain uarchs on purpose.

The A10 will also be produced on TSMC's 16FF+ instead of Samsung's 14FF.
 
Last edited:

Mondozei

Golden Member
Jul 7, 2013
1,043
41
86
When should we be able to see the first benchmarks of the A9? I've read that even though the official release date is the 24th, there have been pushbacks on delivery estimates.
 

asendra

Member
Nov 4, 2012
156
12
81
Normally the embargo ends on the Wednesday before the release, though there have usually been some leaks on Geekbench or similar a few days earlier.

I'm actually kind of amazed there haven't been any yet, given that this year there has been one week more than in other years.
 

Nothingness

Platinum Member
Jul 3, 2013
2,717
1,347
136
Too bad we don't have the Lua and Dijkstra scores.

IPC is about 10% better on integer and 20% better on FP. Even though that's less than the ~30% implied by Apple's claim of 70% faster, it's still very good.
 
Mar 10, 2006
11,715
2,012
126
Too bad we don't have the Lua and Dijkstra scores.

IPC is about 10% better on integer and 20% better on FP. Even though that's less than the ~30% implied by Apple's claim of 70% faster, it's still very good.

Indeed. Even more impressive that they improved IPC nicely while also scaling up clocks (though I'm sure the move to FinFETs helped a lot there).

Anyway, wish I were lucky enough to get my iP6S today...the 25th cannot come soon enough.
 