Kaveri Steamroller vs BulldozerPiledriver


inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
First of all, everyone should relax.

The cosmology results are genuine. BUT they come from a pre-production sample and they should be disregarded, as we have no clue what kind of clocks or level of BIOS support that system had.

That amdfx blog is known for sensational information not grounded in reality, but in this case the author didn't just make it up; the result is indeed there. But as I have said, it should definitely be disregarded. Note that AMD also stated that FP performance should not regress with Kaveri (on the contrary; for the source, check the AT article on SR).



As Exophase said, the wide integer execution capability sits in that very FP co-processor unit, and it is used heavily by all the workloads AMD lists in the footnotes for the supposed 30% more ops/cycle improvement versus BD/PD. In a nutshell, that means anything that runs SSE code (footnote <2>), since SSE is executed in the FP unit.
 
Last edited:

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
The fact that the results make no sense and come from a dubious source is what makes me think it is fake. I'd love to see AMD make an excellent comeback, but getting excited over nonsense doesn't help anyone.

It was and is dubbed nonsense because Galego posted it on his site, not because of a thorough examination of the numbers' eventual relevance. As pointed out by Inf64, the numbers seem genuine even if not totally accurate with respect to final silicon.
 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
It was and is dubbed nonsense because Galego posted it on his site, not because of a thorough examination of the numbers' eventual relevance. As pointed out by Inf64, the numbers seem genuine even if not totally accurate with respect to final silicon.

They show a regression in FP, which makes no sense, and they show no improvement between BD and PD, which makes no sense. Either the numbers are faked or it's a nonsense benchmark.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
You are wrong here, NTMBK. The BOINC benchmark was one of the rare ones that showed very little difference between BD and PD. PD is indeed (just) a bit faster in the integer part but also, according to BOINC, slower at the same time in the FP part. We know PD just flat out outperforms BD in common workloads by a healthy margin (between 7% and 17%, averaging out at ~11% according to hardware.fr).
 
Last edited:

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
They show a regression in FP, which makes no sense, and they show no improvement between BD and PD, which makes no sense. Either the numbers are faked or it's a nonsense benchmark.

Why would fake numbers from an AMD fanboy have the FP regression then?

We do know what the benchmark is and we know it's not a remotely representative one.
 

NTMBK

Lifer
Nov 14, 2011
10,269
5,134
136
You are wrong here, NTMBK. The BOINC benchmark was one of the rare ones that showed very little difference between BD and PD. PD is indeed (just) a bit faster in the integer part but also, according to BOINC, slower at the same time in the FP part. We know PD just flat out outperforms BD in common workloads by a healthy margin (between 7% and 17%, averaging out at ~11% according to hardware.fr).

So it's a nonsense benchmark, which was my second option.
 

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,692
136
Of all of these unsubstantiated benchmarks, what I think is most important is the supposed 15.0 GB/s memory bandwidth. If this is really double the 6800K, then it is going to be pretty fast. I wish I knew how much of that increase comes from the RAM speed vs controller optimizations.

I don't know where you got those numbers from, but I have a 6800K/2133MHz doing 21.3GB/s Read and 10.8GB/s write...

If AMD has a memory controller that can do twice that, I can see some serious performance improvements are in store. But that is not going to happen, the theoretical maximum for dual 2133 is only 34.1GB/s, but if they can break 30GB/s I'm impressed.
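The 34.1GB/s ceiling can be checked with a minimal sketch, assuming standard DDR3 with a 64-bit (8-byte) bus per channel and GB meaning 10^9 bytes. The function name and parameters are hypothetical; the 21.3GB/s read figure is the one measured above.

```python
def peak_bandwidth_gbs(transfers_mt_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    Assumes every transfer moves bus_bytes on each channel, with no overhead.
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


peak = peak_bandwidth_gbs(2133)          # dual-channel DDR3-2133
print(f"Peak: {peak:.1f} GB/s")          # Peak: 34.1 GB/s

measured_read = 21.3                     # 6800K read result quoted in this post
print(f"Read efficiency: {measured_read / peak:.0%}")  # Read efficiency: 62%
```

By the same arithmetic, breaking 30GB/s would mean sustaining roughly 88% of the theoretical peak.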
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
I got that 15GB/s from here:



It's "GP memory bandwidth", whatever that means. Maybe part of the benchmark involves copying memory from RAM to GPU RAM, which theoretically would be reduced to 0 ns for a system with unified memory. Maybe that is why it says 15GB/s for Kaveri and only 10GB/s for the top Intel.
 
Last edited:

Insert_Nickname

Diamond Member
May 6, 2012
4,971
1,692
136
I got that 15GB/s from here:

It's "GP memory bandwidth", whatever that means. Maybe part of the benchmark involves copying memory from RAM to GPU RAM, which theoretically would be reduced to 0 ns for a system with unified memory. Maybe that is why it says 15GB/s for Kaveri and only 10GB/s for the top Intel.

That could explain things. I'd really like to see some memory benches from Sandra/Aida64. Just so I have something known I can compare them to. I'm not familiar with that program being used. If someone can point it out, I'd be happy to run some tests...
 

Fjodor2001

Diamond Member
Feb 6, 2010
3,938
408
126
I got that 15GB/s from here:

It's "GP memory bandwidth", whatever that means. Maybe part of the benchmark involves copying memory from RAM to GPU RAM, which theoretically would be reduced to 0 ns for a system with unified memory. Maybe that is why it says 15GB/s for Kaveri and only 10GB/s for the top Intel.

Hmm... you mean the operation is reduced to a NOP more or less? Or perhaps only updating some MMU tables for a large block of data. Then I think we'd see much higher bandwidth than that.
 
Last edited:

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
Yup. "Simulated benchmarks". Not sure why anyone takes this joke of a blog seriously, it feels like it's written by Galego.

lol

Of all of these unsubstantiated benchmarks, what I think is most important is the supposed 15.0 GB/s memory bandwidth. If this is really double the 6800K, then it is going to be pretty fast. I wish I knew how much of that increase comes from the RAM speed vs controller optimizations.

The A10-6800K gets almost exactly 15 GB/sec.

I got that 15GB/s from here:



It's "GP memory bandwidth", whatever that means. Maybe part of the benchmark involves copying memory from RAM to GPU RAM, which theoretically would be reduced to 0 ns for a system with unified memory. Maybe that is why it says 15GB/s for Kaveri and only 10GB/s for the top Intel.

And on my HD 4000 laptop I get 17.5 GB/sec with 1600 MHz CAS 11 RAM in that test.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
It is, isn't it. Wasn't that galego's blog that he kept pimping here before he was politely asked to vacate the premises?

Oh, I wondered what happened to him, not that I missed him or anything. Thanks.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136

Yikes! Looks like the rough improvement between these is only around 20% (from the FP score). So that means the pre-production sample has a clock of only ~2 GHz D:

Unless the IPC jump is much more than we are expecting, Kaveri will need to hit at least 3.5 GHz just to match Richland in ST CPU performance. At 3 GHz or below, Kaveri is going to be a joke, kind of like BD was, in terms of ST CPU performance. This is very sad news if these screenshots are true.

Maybe, as some have suggested, there is no high performance node for 28nm? I think many of us expected the clocks to drop, but not this much. Even if Kaveri comes out @ 2.5 GHz with a 20% IPC bump, it will be so outclassed that it will become an instant loss leader.

I hope it's not true, really.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
I don't know what you are thinking by looking at that image.
The Kaveri in question is a 1.8GHz base / 2.3GHz turbo mobile ES chip...
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
I don't know what you are thinking by looking at that image.
The Kaveri in question is a 1.8GHz base / 2.3GHz turbo mobile ES chip...

OK, which one, the first or the second? The basic FP score went up by 20% between the two. This is most likely a small, very linear FP test, so it would be indicative of a 20% increase in clocks (not affected by drivers, just an FP instruction loop). I would expect an early ES chip to be clocked at 1.8 GHz. I wouldn't expect a 'Near Production Sample' to be clocked @ 2 GHz.

Now maybe it's completely bogus, in this case, that would be a very good thing.
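As a rough check of that clock estimate, here is a minimal sketch that assumes the FP score scales linearly with clock speed (the same assumption made above; real scores can also shift with turbo state, BIOS, or stepping). The function name is hypothetical; the 1.8 GHz base and 2.3 GHz turbo figures are the ones quoted in this thread.

```python
def implied_clock_ghz(base_clock_ghz: float, score_ratio: float) -> float:
    """Clock implied by a score ratio, assuming score is proportional to clock."""
    return base_clock_ghz * score_ratio


score_ratio = 1.20                        # ~20% higher FP score on the second sample
implied = implied_clock_ghz(1.8, score_ratio)
print(f"Implied clock: {implied:.2f} GHz")        # Implied clock: 2.16 GHz

turbo_ratio = 2.3 / 1.8                   # full turbo relative to base clock
print(f"Turbo/base ratio: {turbo_ratio:.2f}")     # Turbo/base ratio: 1.28
```

Under that assumption, a 20% gain lands between the base and turbo clocks, so the score alone cannot distinguish a higher base clock from partial turbo residency.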
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
Well, for a start the chip is completely different. The 1st one has 2M at the beginning of the model string and the second one has 1M (different stepping; one is an early ES, the other is a pre-production ES).
2nd, both obviously have identical base/boost clocks. Whether one had its boost deactivated and the other had it activated is unknown.

This score is basically useless as we know nothing about the ES chip that ran it. So basing your opinion of whether SR is going to rock or suck on it is futile.
 

Abwx

Lifer
Apr 2, 2011
11,172
3,868
136
I would expect an early ES chip to be clocked at 1.8 GHz. I wouldn't expect a 'Near Production Sample' to be clocked @ 2 GHz.

Now maybe it's completely bogus, in this case, that would be a very good thing.

1304 stands for the mobile 35W higher-performance part.
 

Ajay

Lifer
Jan 8, 2001
16,094
8,106
136
So all we get out of this thread is that AMD has Kaveri up and running. Got it.
Man, you'd think I'd learn and just avoid baseless rumor threads...
 

nenforcer

Golden Member
Aug 26, 2008
1,767
1
76
So all we get out of this thread is that AMD has Kaveri up and running. Got it.
Man, you'd think I'd learn and just avoid baseless rumor threads...

Yeah, I'm also more interested in benchmarks run on Windows 8.1, which is supposed to have much better performance for APU-based designs with a shared GPU/CPU memory space.

Those are probably only a month away, however.
 
Aug 11, 2008
10,451
642
126

If correct that means they need a 25% increase in ipc just to maintain the same CPU performance, since clockspeed is down that much. Igp performance could be good if they can solve the bandwidth problem.

They seem to be biasing performance toward the igp. Time will tell if that is the correct strategy. It could work in mobile, but for desktop, I am not so sure. Still would seem easier just to add a discrete card.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
If correct that means they need a 25% increase in ipc just to maintain the same CPU performance, since clockspeed is down that much. Igp performance could be good if they can solve the bandwidth problem

This will be project denver in red. Big iGPU, weak CPU attached.
 

inf64

Diamond Member
Mar 11, 2011
3,765
4,223
136
What kind of a BS article was that (the translated Italian one)? Where did they get those CPU clocks? AMD said that they will "maintain" the high frequency engine, and this is not what that website is saying. Basically, the previously touted "high clock" tuned design done on the 28nm bulk process would end up at the (frequency) level of the 45nm short-pipe K10, which is nonsense.
So adding a decoder, optimizing the FP unit for better perf./watt and doing the design on a smaller node makes you lose 25-30% frequency? Lmao. So how can this thing outperform a 4.1-4.4GHz turboed PD core by ~15-20% in common workloads? It would need to have 50-60% higher IPC in order to perform 15-20% better than a ~4.2GHz PD (meaning it would perform like a 4.8GHz PD while running at ~3GHz). This is complete nonsense.

I love "news" like that
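The IPC argument can be sketched numerically, assuming performance is simply IPC times clock (a simplification that ignores memory and turbo behaviour). The function names are hypothetical; the 4.2GHz PD reference, ~3GHz Kaveri clock, and 15-20% target are the figures from this post.

```python
def required_ipc_gain(ref_clock_ghz: float, new_clock_ghz: float, perf_gain: float) -> float:
    """Fractional IPC increase needed to beat the reference chip by perf_gain.

    Assumes performance = IPC * clock for both chips.
    """
    return ref_clock_ghz * (1.0 + perf_gain) / new_clock_ghz - 1.0


def equivalent_ref_clock_ghz(ref_clock_ghz: float, perf_gain: float) -> float:
    """Clock the reference chip would need to match the target performance."""
    return ref_clock_ghz * (1.0 + perf_gain)


for gain in (0.15, 0.20):
    ipc = required_ipc_gain(4.2, 3.0, gain)
    eq_clock = equivalent_ref_clock_ghz(4.2, gain)
    print(f"+{gain:.0%} over 4.2 GHz PD at 3.0 GHz: "
          f"IPC +{ipc:.0%}, like a {eq_clock:.2f} GHz PD")
```

With these inputs the required IPC gain comes out at about 61% to 68%, slightly above the 50-60% estimated here, which only strengthens the point that a ~3GHz part hitting that target is implausible.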
 

Enigmoid

Platinum Member
Sep 27, 2012
2,907
31
91
They seem to be biasing performance toward the igp. Time will tell if that is the correct strategy. It could work in mobile, but for desktop, I am not so sure. Still would seem easier just to add a discrete card.

In mobile the CPU is already so weak that it bottlenecks a 7970M to 660M levels. In mobile they need a stronger CPU more than anything. You can see in the AT GX60 review (with the A10-4600M and the A10-5750M) how much performance went up in CPU-bound games like SC2 with Richland vs Trinity on the iGPU alone.
 