The Kaveri Pre-Launch Thread (A10-7800 and A10-6800k @3,5 Ghz)

Status
Not open for further replies.

itsmydamnation

Platinum Member
Feb 6, 2011
2,864
3,417
136
,7 Ghz frequency. It is incorrect that they say 20% better IPC since Steamroller has a better core scaling. A10-6800k scores 8.2 fps in default state, means A10-7850k @3,7 Ghz is 4% faster here.

Seriously, where do people get their logic from? Steamroller gets better scaling because it can decode more instructions per clock than Piledriver.

You do know what the I in IPC is, right?

It's looking like AMD traded density for clock/power scaling, which is disappointing at the high-end TDP, but it will be very interesting to see the configurations in the 15-35 watt area. I think we are more likely to see clocks comparable to Richland there.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
I'm just glad that the speculation will all be over soon.
No, we have Toronto and Carrizo just around the corner.

The speculation is never over homeles. :biggrin:

--
Steamroller gets better scaling because it can decode more instructions per clock than Piledriver.
I like to think:
CPI is how fast one unit in a CPU can process one instruction.
IPC is how many units in a CPU can process the same instruction.

Decode rate != IPC rate.
 
Last edited:

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
No, it's not incorrect unless they specify that they mean single threaded IPC.

I thought IPC almost always implied ST performance.

Note also that the stock numbers are estimated. LOL.

You see why you must take AMD's claims with a grain of salt.

Heck they don't even know what they are comparing it to: i5-4670k with HD 4000.

How can you make this stupid a mistake on a slide like this?

 
Last edited:

itsmydamnation

Platinum Member
Feb 6, 2011
2,864
3,417
136
Decode rate != IPC rate.

I never said decode rate is IPC. But when you are decode limited, like Piledriver is, you are IPC limited as well. Remember, it's not just the absolute decode rate but the fact that it can only decode for one core per module per clock on Piledriver.

I thought IPC almost always implied ST performance

Only from people who don't know what IPC means: instructions per clock. IPC has nothing to do with performance across ISAs. It only becomes relevant when using the same ISA. I'm sure Intel could double its IPC tomorrow by going to a strict RISC ISA; that says nothing about performance.
 
Last edited:

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
I never said decode rate is IPC. But when you are decode limited, like Piledriver is, you are IPC limited as well. Remember, it's not just the absolute decode rate but the fact that it can only decode for one core per module per clock on Piledriver.
Piledriver isn't decode limited; it was actually decode perfect. The decode was neither too big nor too small, and it allowed for very high clocks. Steamroller's decode does nothing but hamper clocks. The increase implies more units, but the slides show no new units.

The bottleneck for Piledriver is the memory subsystem. Not the front-end, not the cores, not the FPU, but the memory portion. AGUs, L1D, L2, Load/Store, Etc.
 
Last edited:

itsmydamnation

Platinum Member
Feb 6, 2011
2,864
3,417
136
Except that the major changes were doubling instruction decode (yes, I'm sure they did that just for fun) and increasing L1I to remove the aliasing issue (again, for fun), and they made no major cache system changes. There is only one change to the cache system for performance, and that's not even in the caches themselves but in the WCC between the L1 and L2.

If you look at other x86 processors, it's very likely Piledriver is decode limited. It is more limited than Nehalem, SB, IB, Haswell, K8 and K10 per thread when both cores in a module are loaded.

Then, when you look at what most people call ST performance, it's integer ops or an int/FP mix, which require far less bandwidth than throughput SSE/AVX workloads, so it's not going to be cache bandwidth limited. L2 cache latency is debatable and almost completely unverified as an issue.

But the very simple test of 1c2m vs 2c1m shows that the latter has worse performance than the former, so it's not the memory system or central arbitrator/cache probing etc.

AMD increased instruction throughput 30% and we have multithreaded workloads with a 20% per-clock improvement. When it walks, talks and looks like a duck... it's a duck.
 
Last edited:

NostaSeronx

Diamond Member
Sep 18, 2011
3,689
1,224
136
AMD 10h/12h:
3-wide decode (3 macro-ops)
Core: 3 ALUs, 3 AGUs, 3 vALUs.

AMD 15h models 00h-2Fh:
4-wide decode(4 macro-ops)((2 macro-ops to each core))
Core Partition: 2 ALUs, 2 AGUs, 2 vALUs

AMD 14h/16h:
2-wide decode(2 macro-ops)
Core: 2 ALUs, 2 AGUs, 2 vALU

Hypothetical AMD 15h models 3Fh-4Fh:
8-wide decode(8 macro-ops)((4 macro-ops to each core)) implies;
Core Partition: 4 ALUs, 4 AGUs, 4 vALUs

I don't see the problem, it uses the same decode aligned to units rule from 10h. It also aligns with Bobcat and Jaguar.

So every AMD design is also decode bottlenecked?
 
Last edited:

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Well, the second pass in x264 measures performance (throughput), not IPC (as people use the term for single thread). Not to mention that IPC is application dependent.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,864
3,417
136
AMD 10h/12h:
3-wide decode (3 macro-ops)
Core: 3 ALUs, 3 AGUs, 3 vALUs.

AMD 15h models 00h-2Fh:
4-wide decode(4 macro-ops)((2 macro-ops to each core))
Core Partition: 2 ALUs, 2 AGUs, 2 vALUs

AMD 14h/16h:
2-wide decode(2 macro-ops)
Core: 2 ALUs, 2 AGUs, 2 vALU

Hypothetical AMD 15h models 3Fh-4Fh:
8-wide decode(8 macro-ops)((4 macro-ops to each core)) implies;
Core Partition: 4 ALUs, 4 AGUs, 4 vALUs

I don't see the problem, it uses the same decode aligned to units rule from 10h. It also aligns with Bobcat and Jaguar.

So every AMD design is also decode bottlenecked?

Except it isn't two ops to each core. It's up to 4 ops per core every other cycle. It's that lack of granularity that causes the problem. If you don't have 4 ops tied to a single core ready to go each and every cycle, you aren't getting your 2 ops per core.

It was likely easier to make two decode units (copy-paste, so to speak) than it was to rebuild the decode unit to decode both threads concurrently. That's why instruction throughput only improved 30%: a core doesn't need more than 2 ops on average per clock, but previously it wasn't getting 2 ops per clock. Both cores can now also have 4 ops decoded per clock.
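The granularity argument above can be sketched as a toy model. This is only an illustration of the claim in the post, not a description of the real hardware: the function names, the per-cycle "ready ops" input, and the absence of any buffering between fetch and decode are all assumptions made for the example.

```python
def shared_decoder_throughput(ready_ops, width=4):
    """Toy model: ONE decoder of the given width, serving core 0 on even
    cycles and core 1 on odd cycles (hypothetical alternation scheme).

    ready_ops is a list of (core0_ready, core1_ready) tuples, one per cycle.
    Returns the average ops decoded per cycle for each core.
    """
    decoded = [0, 0]
    for cycle, ready in enumerate(ready_ops):
        core = cycle % 2  # only this core gets the decoder this cycle
        decoded[core] += min(width, ready[core])
    return [d / len(ready_ops) for d in decoded]


def dual_decoder_throughput(ready_ops, width=4):
    """Toy model: each core has its own decoder and is served every cycle."""
    decoded = [0, 0]
    for ready in ready_ops:
        for core in (0, 1):
            decoded[core] += min(width, ready[core])
    return [d / len(ready_ops) for d in decoded]


# Case 1: every cycle, each core has a full 4 ops ready.
full = [(4, 4)] * 8
# Case 2: every cycle, each core has only 2 ops ready.
half = [(2, 2)] * 8

for name, pattern in (("4 ready", full), ("2 ready", half)):
    print(name,
          "shared:", shared_decoder_throughput(pattern),
          "dual:", dual_decoder_throughput(pattern))
```

In this model the shared decoder only reaches 2 ops per core per cycle when a core always has 4 ops ready on its turn; with 2 ops ready it delivers just 1 per core per cycle, while the dual-decoder case delivers 2.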
 
Last edited:

cytg111

Lifer
Mar 17, 2008
23,537
13,109
136
.. Not sure why hope springs eternal for AMD. They're finished. Will be interesting to see BYT-M eat into AMD's main PC revenue stream next year, dealing the final blow to this disappointment of a company.

Talk about your emotional coming-out of the day. Don't hide it inside, man; just let it all out, it's good for you. One note though: as far as deities go, I think you can get a better contract with one of the older ones than with a semiconductor manufacturer. Just saying.
 

cytg111

Lifer
Mar 17, 2008
23,537
13,109
136
I thought IPC almost always implied ST performance.

As itsmydamnation said. Plus, there are ways to improve MT IPC as well. Take Haswell's TSX (transactional memory): when software uses it, multithreaded workloads scale better, so in essence, with TSX you get the same IPC in ST but better IPC in MT, since you can push more instructions through thanks to less time spent waiting on locks.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,361
136
Only from people who don't know what IPC means: instructions per clock. IPC has nothing to do with performance across ISAs. It only becomes relevant when using the same ISA. I'm sure Intel could double its IPC tomorrow by going to a strict RISC ISA; that says nothing about performance.

IPC is IPC regardless of ISA. You measure instructions per cycle no matter what ISA you are using, and you may compare IPC using the same ISA or not. That's why IPC is application dependent: one application may use AVX and another not, and you get a different IPC in each. Nobody ever said that IPC is legacy-only. You may use an application that doesn't support SIMD instructions to measure IPC, but you may also use any application that supports any ISA to measure IPC.

People here take IPC to mean single-thread performance; actually, CPI (cycles per instruction) is closer to that than IPC. But again, you may use whatever ISA your hardware and software support to measure both IPC and CPI.
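A minimal sketch of the definitions being argued over, assuming made-up counter values (none of these numbers come from a real chip): IPC is instructions divided by cycles, CPI is its reciprocal, and run time depends only on cycles and clock. The two "ISA" cases are hypothetical and only illustrate why a higher IPC on a different ISA need not mean a faster run.

```python
def ipc(instructions, cycles):
    """Instructions per cycle."""
    return instructions / cycles


def cpi(instructions, cycles):
    """Cycles per instruction, the reciprocal of IPC."""
    return cycles / instructions


def runtime_seconds(cycles, clock_hz):
    """Wall time for a run that takes the given number of cycles."""
    return cycles / clock_hz


# Hypothetical counters for the SAME task on two different ISAs.
# ISA B needs twice as many (simpler) instructions for the same work.
isa_a = {"instructions": 1_000_000, "cycles": 500_000}
isa_b = {"instructions": 2_000_000, "cycles": 500_000}
clock_hz = 3.7e9  # same clock assumed for both

for name, run in (("ISA A", isa_a), ("ISA B", isa_b)):
    print(name,
          "IPC:", ipc(run["instructions"], run["cycles"]),
          "CPI:", cpi(run["instructions"], run["cycles"]),
          "time (s):", runtime_seconds(run["cycles"], clock_hz))
```

Here the second case reports double the IPC yet finishes in exactly the same time, because it also executes double the instructions.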
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106

DaZeeMan

Member
Jan 2, 2014
103
0
0
Yeah, those yahoos over at AMD are doomed. With over 6 million XBoxes and PS4's selling in Q4, both record setting levels according to MS and Sony (with AMD's take estimated at $60-$100/unit), an increasing GPU market share, the fact that R9's are selling like hotcakes, oh and AMD increasing their x86 market share versus Intel overall, yeah they won't last a week...

Not to mention them showing profitability in the second half of 2013. Yeah, AMD acquiring ATI was the stupidest thing they ever did...

/sarcasm off

AMD should have been dead decades ago according to a good number of people, and yet they are still here, and putting out decent products. Sure, the pendulum swings between NVidia and AMD on the GPU side, and thank goodness it does! It'd get boring only having one brand on top all the time...

It might have been cooler if 3DFX had managed to stick around, but alas it was not meant to be.

As for AMD vs Intel, no question Intel is on top, but Intel is, what, 10x the size of AMD? The fact AMD does as well as it does against a company with around ten times its revenues is rather remarkable to begin with. Be thankful AMD even bothers to hold on, which helps keep those Intel engineers on their toes.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
Guys please stay on topic of Kaveri/Richland comparison and leave the intel/amd pricing from 10 years ago for some other topic.

Well, comparing Kaveri with Richland/Trinity..... There's an awful lot of DDR3 I/O on Kaveri
About twice as much, so maybe we should congratulate you on your prediction!



I used a Kaveri package photo to estimate the horizontal versus vertical die size ratio.
If there's lots of logic in the I/O, for instance timing delay lines, sample registers
and sample selectors then the I/O can scale roughly with the process node.

Hans.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
Yeah, those yahoos over at AMD are doomed. With over 6 million XBoxes and PS4's selling in Q4, both record setting levels according to MS and Sony (with AMD's take estimated at $60-$100/unit), an increasing GPU market share, the fact that R9's are selling like hotcakes, oh and AMD increasing their x86 market share versus Intel overall, yeah they won't last a week..

Console margins are ridiculously low, and even at those prices we are talking about peanuts in cash flow, even by AMD's standards.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Yeah, those yahoos over at AMD are doomed. With over 6 million XBoxes and PS4's selling in Q4, both record setting levels according to MS and Sony (with AMD's take estimated at $60-$100/unit), an increasing GPU market share, the fact that R9's are selling like hotcakes, oh and AMD increasing their x86 market share versus Intel overall, yeah they won't last a week...

Not to mention them showing profitability in the second half of 2013. Yeah, AMD acquiring ATI was the stupidest thing they ever did...

/sarcasm off

AMD should have been dead decades ago according to a good number of people, and yet they are still here, and putting out decent products. Sure, the pendulum swings between NVidia and AMD on the GPU side, and thank goodness it does! It'd get boring only having one brand on top all the time...

It might have been cooler if 3DFX had managed to stick around, but alas it was not meant to be.

As for AMD vs Intel, no question Intel is on top, but Intel is, what, 10x the size of AMD? The fact AMD does as well as it does against a company with around ten times its revenues is rather remarkable to begin with. Be thankful AMD even bothers to hold on, which helps keep those Intel engineers on their toes.

You're confusing revenue with margins on the consoles.

As for x86 market share, it's simply artificially inflated, as explained here:
http://www.pcworld.com/article/2062...-amd-gain-x86-market-share-against-intel.html

R9s selling like hotcakes? In the US, yes. The rest of the world? Not so much. Just regular sales there.

Increasing GPU share? Don't count on it.

Is VIA dead? No. Nobody says AMD is dead as a company either. But they are dead as competition, and they are in a slow negative feedback cycle. It's only going downwards. Simply look at the company's revenues over the years. It's simple economics.

If Haswell is 10% faster and Kaveri is 10% faster, is the gap expanding or staying at the status quo? It's obviously expanding.

As for the ATI buy, AMD still hasn't been able to capitalize on it. How much of the $5.4B has come back? They only had to sell their fabs....
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
You're confusing revenue with margins on the consoles.

As for x86 market share, it's simply artificially inflated, as explained here:
http://www.pcworld.com/article/2062...-amd-gain-x86-market-share-against-intel.html

R9s selling like hotcakes? In the US, yes. The rest of the world? Not so much. Just regular sales there.

Increasing GPU share? Don't count on it.

Is VIA dead? No. Nobody says AMD is dead as a company either. But they are dead as competition, and they are in a slow negative feedback cycle. It's only going downwards. Simply look at the company's revenues over the years. It's simple economics.

If Haswell is 10% faster and Kaveri is 10% faster, is the gap expanding or staying at the status quo? It's obviously expanding.

Yes, but is Kaveri 10% faster? AMD were claiming bigger improvements than that, though we won't know until we see independent benchmarks. (Hopefully this time next week, when the embargo lifts.)

Besides, you got your maths wrong. Let's try this example: if product A goes at 100fps and product B goes at 90FPS, B is 10% slower than A. A and B are both revised, improving by 10%; A2 goes at 110fps, B2 goes at 99fps. Oddly enough B2 is still 10% slower than A2, so the gap has not widened.
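The arithmetic in this example can be checked with a short sketch. The fps figures are the ones from the post; the function name `gaps` is just an illustrative helper.

```python
def gaps(a_fps, b_fps):
    """Return (absolute gap in fps, gap as a fraction of A's score)."""
    absolute = a_fps - b_fps
    relative = absolute / a_fps
    return absolute, relative


# Numbers from the post: A = 100 fps, B = 90 fps, both improve by 10%.
before_abs, before_rel = gaps(100.0, 90.0)
after_abs, after_rel = gaps(100.0 * 1.10, 90.0 * 1.10)

print(f"before: {before_abs:.1f} fps absolute, {before_rel:.1%} relative")
print(f"after:  {after_abs:.1f} fps absolute, {after_rel:.1%} relative")
```

Both readings fall out of the same numbers: the relative gap stays at 10%, while the absolute gap grows from 10 fps to 11 fps. Which one counts as "the gap" is the actual disagreement.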
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
Yes, but is Kaveri 10% faster? AMD were claiming bigger improvements than that, though we won't know until we see independent benchmarks. (Hopefully this time next week, when the embargo lifts.)

Besides, you got your maths wrong. Let's try this example: if product A goes at 100fps and product B goes at 90FPS, B is 10% slower than A. A and B are both revised, improving by 10%; A2 goes at 110fps, B2 goes at 99fps. Oddly enough B2 is still 10% slower than A2, so the gap has not widened.

The gap widens in absolute terms: from 10 to 11.

The main problem for Kaveri is that Richland is clocked 10%+ higher, and it seems Richland turbos better than Kaveri. The footnotes show only 4% in one test, for example, that we can compare to a fully working 6800K.

The 30% IPC improvement as originally quoted didn't come from AMD. It came from overly optimistic people who took things out of context. AMD now says "up to" 20% IPC. But since Richland is clocked 10%+ higher, I was being generous in using 10%.
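A rough sketch of that estimate, assuming performance scales linearly with IPC times clock (a simplification; real workloads rarely scale that cleanly) and taking the "up to 20%" and "10% higher clock" figures from the post at face value. The function name is hypothetical.

```python
def net_speedup(ipc_gain, old_clock_advantage):
    """Speedup of the new chip over the old one under a linear model.

    ipc_gain: fractional per-clock improvement of the new chip (0.20 = 20%).
    old_clock_advantage: how much higher the old chip is clocked (0.10 = 10%).
    """
    return (1.0 + ipc_gain) / (1.0 + old_clock_advantage)


# Best case from the post: "up to" 20% IPC, Richland clocked 10% higher.
best_case = net_speedup(0.20, 0.10)
print(f"net speedup: {best_case:.3f}x")
```

Under these assumptions the best case works out to 1.2 / 1.1, a bit over 9%; a larger clock deficit or a smaller IPC gain pulls it lower.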
 
Last edited:

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106

Those are Q2 2013 numbers.

Plus, you want to separate it into dGPUs and iGPUs. AMD is moving its entire production line to CPUs with an iGPU. Funnily enough, that messes up those stats. Intel sits on what, 65% graphics share?

Q3 has the latest numbers:
http://www.techpowerup.com/194979/g...entially-in-q3-nvidia-gains-as-amd-slips.html

 
Last edited: