AMD SPEEDS?

phillyman36

Golden Member
Jun 28, 2004
1,763
160
106
I'm having a hard time understanding AMD and how they set up their speeds. How can an AMD 2.0 GHz processor be faster than an Intel 2.4 GHz processor? What is the AMD equivalent to the Intel 2.8, 3.0 and 3.4?
 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Well, I'll try to put this as plainly as I can.

Intel has a long pipeline, so it has to work harder to keep up. Therefore we say it does fewer operations per clock cycle, so it does less work for every MHz. To compete, Intel has to raise the MHz so it does the same amount of work.

AMD, on the other hand, uses a very, no, extremely short pipeline. Therefore we say it does more operations per clock cycle. It does more work for every MHz, so they do not have to raise the clock speed of their chips to compete effectively, nor would they be able to.

The two are going for one common goal, to get the work done, only there are two ways to do it. Intel, for instance, takes the long straight way around town to get someplace, which allows them to go at high speeds because the road is long but straight. AMD, trying to get to that exact same place, takes the shorter, windier way, so they take it slow (mph-wise) or they wouldn't make it.

That is why you don't see AMD ramping clock speeds up to 3.2GHz: referring to the analogy above, it would "crash".

-Kevin
 

Yanagi

Golden Member
Jun 8, 2004
1,678
0
0
To put it short: imagine you're going through the drive-through at McDonald's, where you place your order, pay, and pick up the food. That's three steps.

Now look at the Intel McDonald's: it has 21 different stations, so you have to go through 21 different steps to get your food. To get to the end you have to travel faster to complete all the steps in the same time as on the AMD route, where you only have to complete 12 steps to get your food.

My god, I suck at putting things in layman's terms, but I hope you get the idea. Or anyone else, for that matter.
 

Sonic587

Golden Member
May 11, 2004
1,146
0
0
To put it shortly, Intel's P4 is an awful design when it comes to efficiency. It does very little work per clock. The trade-off is that Intel can raise the clock speed higher to make up for this.

AMD, on the other hand, is all about efficiency. They cream Intel clock for clock. They don't *need* a 2.4GHz processor to beat a 2.4GHz P4. AMD is fine with much less clockspeed since they are so efficient. If we saw a true 3.0GHz Athlon 64 going up against an Intel 3.0GHz, the A64 would absolutely destroy the P4.

It's pretty easy to tell which AMD processor is marketed against its P4 equivalent. AMD uses a simple naming scheme.

A64 2800+ = P4 2.8GHz
A64 3000+ = P4 3.0GHz
A64 3200+ = P4 3.2GHz
A64 3400+ = P4 3.4GHz
and so on.

Many people think clock speed is the be-all and end-all, i.e. a 3.0GHz processor MUST be faster than a 2.0GHz processor. This pretty much stems from people thinking higher numbers equal faster. That never has been, is not, and never will be true. There are quite a few factors that determine a CPU's overall performance. Clock speed is one factor, but it is definitely not the only factor.
 

elkinm

Platinum Member
Jun 9, 2001
2,146
0
71
I really like Yanagi's McDonald's analogy. In a CPU architecture you need to complete every stage of a pipeline in a single clock or you lose it, so going back to McDonald's:

You order a sandwich, the order is sent in, someone puts a burger in the oven, and when it's done he gives it to the next guy, who adds the ketchup. Next is the mustard, and then the bun, pickles, onions, etc. And then you have the fries.

The slowest part is the cooking of the burger, so a single clock cannot be shorter than the time it takes to cook the burger or it wouldn't finish.

What Intel does is split it up some more, like cooking the burger on one side, then cooking it on the other. Since it takes less time to cook only one side, the single clock is shorter, allowing for higher frequencies, but now you need two clocks to cook the burger.

Then you have other optimizations to get the maximum work done in a clock. For example, one guy can add ketchup, mustard and pickles in the time it takes to cook a burger, so that can be just one stage. And the problem with a mispredict is when you guess the person wants mustard when he does not, so you have to throw it out and start again.
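
The burger line above can be put into a rough Python sketch. The stage names and the nanosecond timings are invented purely for illustration; the only rule taken from the post is that the clock period cannot be shorter than the slowest stage.

```python
# Stage times in nanoseconds. These numbers are made up for illustration only.
short_line = {"take order": 0.3, "cook burger": 0.5, "add toppings": 0.4}

# Same work, but the slowest stage (cooking) is split into two shorter stages.
long_line = {
    "take order": 0.3,
    "cook side A": 0.25,
    "cook side B": 0.25,
    "add toppings": 0.4,
}


def max_clock_ghz(stages):
    """Highest clock the line can run at: limited by its slowest stage."""
    slowest_ns = max(stages.values())
    return 1.0 / slowest_ns  # 1 / nanoseconds gives GHz


def latency_ns(stages):
    """Time for one burger to pass through every stage at that clock."""
    return len(stages) * max(stages.values())


for name, line in (("short", short_line), ("long", long_line)):
    print(name, len(line), "stages,",
          round(max_clock_ghz(line), 2), "GHz max,",
          round(latency_ns(line), 2), "ns per burger")
```

In this toy setup the longer line can be clocked higher, but each burger passes through more stages, and the next-slowest station becomes the new limit on the clock.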

Hope this helps if everything else already said did not.
 

Yanagi

Golden Member
Jun 8, 2004
1,678
0
0
Thanks for adding that to my McDonald's analogy. It feels more complete now. I hope the OP now has some basic insight into how different CPU architectures work. Otherwise he's welcome to ask more.
 

dguy6789

Diamond Member
Dec 9, 2002
8,558
3
76
Here is my highway theory.

Think of AMD CPUs as a 6-lane highway and Intel CPUs as a 4-lane highway.

If both highways have the same mph, of course the AMD one would be able to handle more traffic, but Intel ones tend to have higher mph, so it balances things out.

Athlon 64 2800+ = p4 2.8ghz
64 3000+ = p4 3ghz

and so on. But the AMD ratings are mostly conservative. In most tests, a 64 2800+ will be close to but almost always beat the 2.8, and the same goes for all the 64s.

There are a few things the Intel ones will win in, but for the most part, an Athlon 64 with a given PR rating is superior to the P4 at the matching GHz.
 

Gamingphreek

Lifer
Mar 31, 2003
11,679
0
81
Intel doesn't win anymore; the new 939 processors even beat Intel in encoding/decoding. The latest AT review says so.

-Kevin
 

Ariste

Member
Jul 5, 2004
173
0
71
As the others have said, the main difference is that AMD processors do more work per clock cycle than Intel CPUs. I'll try to explain it to you.

Most people think that, when looking at a processor, the only factor in its overall speed is its GHz rating. This is because they are under the impression that 3GHz=3, where the single three at the end is the overall work done by the processor. This is not really true. The equation should look more like this:

3GHz x (work done per clock cycle)=3.

This is because it doesn't really matter how many clock cycles your processor can go through in one second if it's doing barely any work for each of those clock cycles. So, obviously, for that equation to work, the work done per clock cycle has to be 1. Intel, since it has much longer pipelines than AMD, has had to reduce the work done per clock cycle to a number around .6. So the equation for Intel should look like this:

3GHz x .6 (work done per clock cycle)=1.8.

While Intel has had to change the work done per clock cycle, AMD has been able to keep it at one, leaving its equation looking like this:

3GHz x 1 (work done per clock cycle)=3

So, in other words, an Intel CPU that operates at 3GHz is roughly equivalent to an AMD CPU operating at 1.8GHz. That is basically why Intel CPUs don't blow away AMD CPUs even though they operate at much higher speeds.
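
Here is the same arithmetic as a small Python sketch. The 0.6 and 1.0 "work per clock" figures are just the illustrative numbers from this post, not measured values, and the function names are made up for the example.

```python
def effective_speed(clock_ghz, work_per_clock):
    """Overall work rate = clock cycles per second x work done per cycle."""
    return clock_ghz * work_per_clock


def equivalent_clock_ghz(clock_ghz, work_per_clock, other_work_per_clock):
    """Clock the other CPU needs in order to match the first one's work rate."""
    return effective_speed(clock_ghz, work_per_clock) / other_work_per_clock


intel_3ghz = effective_speed(3.0, 0.6)             # illustrative 0.6 work per clock
amd_3ghz = effective_speed(3.0, 1.0)               # illustrative 1.0 work per clock
amd_to_match = equivalent_clock_ghz(3.0, 0.6, 1.0)

print("Intel at 3GHz:", round(intel_3ghz, 2))
print("AMD at 3GHz:", round(amd_3ghz, 2))
print("AMD clock needed to match Intel at 3GHz:", round(amd_to_match, 2), "GHz")
```

The work-per-clock number is different for every CPU design, so any equivalence that comes out of this only holds for the specific pair of chips the figures were picked for.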
 

coldpower27

Golden Member
Jul 18, 2004
1,676
0
76
There is more to the performance of a processor than its mere GHz rating. For instance:

The number of math units in the processor plays a role in performance. I think AMD has 3 of each of the 3 types,
while Intel has what? 2 of each of the 3 types.

The amount of L1 cache that a processor has will play a role.

The amount of L2 cache will also play a role.

The speed of the pathway from the CPU to RAM, the FSB, also plays a role.



I will try to make an analogy.

The AMD processor has 9 men doing labor, while the Intel processor has 6 men. Since AMD has 9 men, they can work slower to accomplish the same task that the 6 men would have to do faster.

If the men were working at a desk, for example, you could say that the AMD processor has more desk space than the Intel one, since it has more L1 cache. On the other hand, the Intel processor has more drawer space in the desk, which isn't quite as fast but is still quite fast. Both provide a place for the data to wait until the processor can do work on it.

It's quite hard to assign an arbitrary value, like an AMD 1.8GHz being equal to an Intel 3.0GHz, because it would only hold true for a certain case: the Athlon 64 vs. the P4 3.0GHz. It doesn't hold true for the Barton 2.2GHz/400, which only equals a P4 2.8GHz/800.
 

Ariste

Member
Jul 5, 2004
173
0
71
Originally posted by: coldpower27
It's quite hard to assign an arbitrary value, like an AMD 1.8GHz being equal to an Intel 3.0GHz, because it would only hold true for a certain case: the Athlon 64 vs. the P4 3.0GHz. It doesn't hold true for the Barton 2.2GHz/400, which only equals a P4 2.8GHz/800.

You're definitely right about that. I should have made that clearer in my post.

Yes, a 1.8GHz AMD processor will only be as fast as a 3.0GHz Intel processor if the AMD processor is the Athlon 64. Anything lower or higher will change the relationship between the AMD and Intel processors.
 

Calin

Diamond Member
Apr 9, 2001
3,112
0
0
The McDonald analogy for processors is the best thing since sliced bread

Thanks, yanagi and elkinm. I might use the examples myself (if it is ok)

Calin
 

eklass

Golden Member
Mar 19, 2001
1,218
0
0
/me wonders how many more times people can recreate the McDonald's analogy, since there are already half a dozen dupes of what someone else has already said
 

imported_funbun

Junior Member
Jul 29, 2004
6
0
0
I've been a Mac user for a long time, and Apple was big on the "MHz Myth" a couple of years back. AMD basically has the same philosophy.

Here is a sports car/sport motorcycle analogy:

Dodge Viper = V10 engine

Sport motor cycle = V2 (v twin)

It's obvious that the Viper has tons more horsepower than the motorcycle. However, the motorcycle has a better power-to-weight ratio. In other words, the motorcycle doesn't have to work as hard to overcome its own weight as the big clunky Viper. The motorcycle can out-accelerate, out-corner and flat-out outperform the Viper.
 

Yanagi

Golden Member
Jun 8, 2004
1,678
0
0
Originally posted by: Calin
The McDonald analogy for processors is the best thing since sliced bread

Thanks, yanagi and elkinm. I might use the examples myself (if it is ok)

Calin

Thanks, Calin. You have my blessing to use it. But please drop me a PM whenever you use it! It would be cool to see how often it's going to be used when people come in here and ask those kinds of questions.
 

Vee

Senior member
Jun 18, 2004
689
0
0
Originally posted by: funbun
Here is a sports car/sport motorcycle analogy:

Dodge Viper = V10 engine

Sport motor cycle = V2 (v twin)

It's obvious that the Viper has tons more horsepower than the motorcycle. However, the motorcycle has a better power-to-weight ratio. In other words, the motorcycle doesn't have to work as hard to overcome its own weight as the big clunky Viper. The motorcycle can out-accelerate, out-corner and flat-out outperform the Viper.

I'm not fond of that analogy.

Consider a 2 litre 4 cyl engine at 3400rpm (3.4GHz P4) vs. a 3 litre 6 cyl engine at 2400rpm (AMD @ 2.4GHz) instead.
There may be something wrong with that picture, bear with me guys, I don't know much about cars. I think you get the idea anyway.

Basically, the question posed here is how clock rate can be substituted. First, there are two comments to be made about the question as such. One is that, historically, there has never been much correlation between clock speed and the rate of instructions executed. It's just an unenlightened assumption.
The clock's purpose is just to synchronize the switching inside the CPU. The clock is _NOT_ some count of work performed!
Secondly, there's still something like a point to that intuitive assumption. But clock rate can be substituted the same way engine rpm can be substituted: by volume.

Early CPUs needed many clock cycles just to finish a single instruction. As the scale of integration grew, making more transistors available on the chip, the number of clock cycles needed to perform an instruction gradually decreased.

One of the most important techniques to accomplish this is the pipeline. It works like the assembly line in a car factory. (As in the "McDonald's analogy", it is illustrative to think of instructions as items that have to be assembled.) The CPU works simultaneously on a sequence of instructions. The instructions 'travel' down the pipe, while different parts of the work are performed at stages along the pipe. Each instruction still needs many cycles to execute, but a finished instruction 'comes off' the end of the pipeline every few clock cycles, or even every clock cycle.
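
A quick way to see the assembly-line effect is to count cycles. The Python sketch below assumes an idealized pipeline (one instruction enters per cycle, no stalls, no mispredicts), which real CPUs do not achieve; the function names are invented for the example.

```python
def unpipelined_cycles(n_instructions, stages):
    """Each instruction must finish all its stages before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions, stages):
    """Ideal pipeline: the first instruction takes `stages` cycles to come off
    the end, then one more instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


n, depth = 1000, 12  # arbitrary example sizes
print("without pipeline:", unpipelined_cycles(n, depth), "cycles")
print("with pipeline:", pipelined_cycles(n, depth), "cycles")
print("instructions per cycle:", round(n / pipelined_cycles(n, depth), 3))
```

Under these assumptions each instruction still spends `depth` cycles in flight, but the completion rate approaches one per clock as the instruction stream gets longer.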

Theoretically, it may seem at first like you can't execute more than one instruction per clock cycle, even on the best pipeline. And if this were true, clock rate would then finally become a factor. So here's the trick: First, the incoming chain is split and instructions enter several parallel lines, not just one. These lines just figure out what should be done and what is needed. The instructions then enter a pool, where they wait until all parts needed for their execution are available and ready. Once they are ready, they are dispatched Out of Order (not wasting time waiting on latecomers) into the next stages of the 'assembly line', and guess what, we have multiple parallel lines here again. These are the execution units. Simply speaking, K7 and K8 have three identical logical/integer units each, and three individually different, specialized floating point units. Once finished, the executed instructions enter a pool again. This is the queue where they are reordered into the correct sequence again, before the results are finalized/written.

Nothing I have mentioned concerns caches or handling IO bottlenecks in any way at all. This is just raw execution at full speed.
The key to making it perform well is to break the sequential order of the instructions: "Out of Order execution". This is not a simple thing to accomplish though. For one thing, each instruction needs its own version of the registers and their contents.
There are many details left out. I've tried to explain it simply, so the principles are transparent. I hope anyone can now see that, with these techniques, performance is pretty independent of clock rate. To recap: AMD's CPUs have 3 decoder lines, then roughly (sic) four schedulers dispatch into 6 lines of execution.

*********************

Many here have mentioned Intel's long pipelines vs. AMD's shorter ones as an explanation. That is also somewhat true, and I want to comment on that too.

The primary reason for having a very long pipeline is to make it possible to reach higher clock rates! High clock rate <= long pipeline. This is because, in order to be able to sync faster, the chains and lattices of transistors need to finish their switching faster. This is accomplished by keeping the transistor 'chains' shorter and simpler. This, however, also means that less can be done at each stage in the pipe, and the pipeline grows in length.

There seems to be some confusion about this "less work done". Remember that the pipe still ticks off a completed instruction each clock!

The advantage (somewhat illusory) of the higher clock rate is that a pipe can dispatch finished instructions at a higher rate. There are a number of disadvantages with long (deep) pipelines though. One of them is that every time a branch enters the pipe, the CPU makes a guess at what string of instructions should follow it into the pipe. At the end of the pipe, the result of the branch is at hand. If the guess was wrong, we lose all the work in the entire pipe. Code can be full of branches that are hard to guess. This cripples P4 performance on general code, compared to its excellent performance on menial loops (benchmarks and media). This is part of the reason for the poor Intel performance you can see in, for instance, Business Winstone 2002 and 2003, relative to AMD. But only part of the reason. The way P4 performance on that realistic type of application benchmark scales with FSB speed indicates that prefetch is hard to do well when you're concentrating on a deep pipe and high clock rates. So I would say AMD has better prefetch and branch prediction. AMD seems to get away much better with lower memory bandwidth, as well as smaller caches. There is, for instance, not much difference in performance between socket 754 and 939/940 so far. As the Athlons develop more muscle, there eventually will be. But not much for now.
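
The branch-guessing cost can be sketched in Python. This is a deliberately crude model: it assumes a wrong guess throws away roughly one pipeline's worth of cycles, it uses the 12 and 21 stage counts mentioned earlier in the thread, and the clock speeds, branch fraction and mispredict rates are hypothetical. It ignores caches, prefetch and parallel execution units entirely.

```python
def cycles_per_instruction(pipeline_depth, branch_fraction, mispredict_rate):
    """Average cycles per instruction when each wrong branch guess flushes the pipe.
    Assumption: the flush costs about `pipeline_depth` cycles."""
    flush_penalty = pipeline_depth
    return 1.0 + branch_fraction * mispredict_rate * flush_penalty


def time_per_instruction_ns(clock_ghz, cpi):
    """Average time per instruction: cycles needed divided by cycles per ns."""
    return cpi / clock_ghz


def deep_pipe_speedup(mispredict_rate, branch_fraction=0.2):
    """How much faster a hypothetical 21-stage, 3.4GHz pipe is than a
    hypothetical 12-stage, 2.4GHz pipe, counting only flush costs."""
    short = time_per_instruction_ns(
        2.4, cycles_per_instruction(12, branch_fraction, mispredict_rate))
    deep = time_per_instruction_ns(
        3.4, cycles_per_instruction(21, branch_fraction, mispredict_rate))
    return short / deep


for rate in (0.0, 0.05, 0.20):
    print("mispredict rate", rate, "-> deep pipe speedup",
          round(deep_pipe_speedup(rate), 2))
```

In this model the deep pipe's clock advantage shrinks as branches get harder to guess, which is the effect described above. It does not by itself show AMD coming out ahead; that also depends on the things the model leaves out.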

P.S. If you used the benchmark suites from about 4 years ago, AMD (AthlonXP too) would destroy Intel completely. Today, most benchmarking (PCMark, SYSMark, media, etc.) concentrates on branchless SSE2 vector loops that fit into the cache. This is to some extent also true of benchmarking the matrix multiplications inside games' 3D engines and 3D rendering software. They too benefit from Intel's vector processing. I wonder, for instance, how true the typical game benchmark is to real game performance. The benchmark only plays a 'movie' through the 3D engine. The thing that seems wrong to me about that is not only that the CPU will be tasked with other things in a real game, like AI, pathfinding and physics simulation, but also that the caches will not be the exclusive playground of 3D engine vector code. Just looking at flight simulators, for example, AMD tends to humble Intel. Even though that is a CPU-intensive physical simulation, I'm wondering if that effect doesn't spill over to real games too?

AMD must be bitterly disappointed with how benchmarking has changed in the way it shows off their CPUs and Intel's. But I think it's their own fault. I think Intel was absolutely right to concentrate on performance on menial loops, because this is the performance most users will mostly notice on a modern media-rich PC. The 'general' logic code that AMD is so good at mostly only lasts for milliseconds, between the huge chunks of data processed by the loops that the P4 is good at.

On a side track: Intel has made efforts on branch prediction and prefetch on the Prescott. To some extent that (just like the 1MB cache) is probably eaten by the even longer pipe. But maybe, just maybe, popular *Intel biased* benchmarks are doing it an injustice, compared to the P4C.

Even the Athlon64 has difficulties against the P4 on vector computing (in 32-bit mode). And that is not due to GHz; it is due to AMD putting too wimpy SSE2 vector processing in the K8. AMD may have thought the K8 was 'balanced' or something, but that's not good enough for the benchmarking battlefield. I have a notion we might see massive improvement in vector and FP performance in the next AMD core, while integer/general performance stays much the same. That might kill off media encoding as a popular benchmark; it will also redress Opteron2's SPEC_fp2003 for a really devastating blow against the Itanium.
 

dc5

Senior member
Jul 10, 2004
791
0
0
Sucks that Intel sells its CPUs for a higher price for the same performance compared to AMD. Just look at the Best Buy ads: a 2.8E GHz system sells for $1000, but an AMD 2800+ system sells for $650.

Just curious, isn't it possible for Intel to shorten their pipelines?
 

GhandiInstinct

Senior member
Mar 1, 2004
573
0
0
So when you pair identical-speed AMD and Intel chips, what would Intel excel over AMD in? Meaning, what applications? Internet Explorer? lol.
 