AMD Bulldozer architecture "update"

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: thilan29
Not sure how valid this is.

Matthias does a pretty thorough job of sifting through the patents and applying their seemingly relevant points to working up a credible picture of the computing potential of bulldozer.

The caveats that apply are self-evident: he's relying on AMD applying for patents for stuff they intend to incorporate into a CPU microarchitecture as soon as Bulldozer, and he's assuming that what AMD applies for patents on will manifest in the microarchitecture in a manner that conforms to the Occam's razor interpretation of how the IP is presented/documented in the patent application.

Basically he does the best any of us could expect him or ourselves to do, and he (to my uneducated eye) appears to do a pretty darn decent job at it. What I really appreciate is that he not only goes to the trouble of doing this stuff but he furthermore goes to the trouble of making it presentable and digestible to us lay-people. That's nice, he doesn't have to share, but he does anyways.

edit: meant to add at the end there "just as the OP elected to share his web findings of the blog with us here in the forums by creating this thread, thanks :thumbsup:"

So anyone feel up to the challenge of determining whether this drive-by poster in spring 2008 really was "in the know" and as such really was divulging NDA secrets in their post?

Originally posted by: sHeFn
Insiders info...

Buldozer, konveyr four stage, two shared FMA of four core, ALU and FPU now shared too, four instructions per clock, fusion CMP/TEST & Jcc.
I can't say more... NDA

http://forums.anandtech.com/me...AR_FORUMVIEWTMP=Linear

 

KingstonU

Golden Member
Dec 26, 2006
1,405
16
81
A for finding this.


Finally something for me to peruse, even if I don't understand more than half of it...ok, most of it. I drink it all up nonetheless

EDIT: WHOA! When did I become a senior member?!?!?
 

Zensal

Senior member
Jan 18, 2005
740
0
0
Good stuff.

Now I'm gonna be stuck in Wikipedia for an hour trying to learn all the things I need to know to actually understand this.
 

JackyP

Member
Nov 2, 2008
66
0
0
Idontcare, you're not exactly one of those laypeople, are you? Do you know what cluster-based multithreading is? Bulldozer presumably is going to make use of that, and it's supposed to be 'better' than SMT according to those slides.
Is this the correct description:
"These patent application describe ways to execute a single thread on both clusters. This could be done by having a thread run ahead for early prefetches memory or by executing both ways of a branch in parallel and scrap the wrong way after branch resolution. A different variant is the parallel execution of the same code to gain reliability of the results by comparing them afterwards."
I thought that would be terribly inefficient?
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
The 'multi-threading done right' slide is from 2005....

SMT at the time is referring to the P4. Switch on event multi-threading sounds like standard, single core multi-threading.
Chip multiprocessing sounds like multiple cpus/sockets.
Cluster-based multi-threading could be referring to Phenom. 50% area investment for 80% throughput gain. The L3 cache makes up such a large portion of the die (as compared to dedicated L2 caches for each core) that you could claim that adding more cores would give you only a 50% increase in area for an 80% gain in throughput.
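To put the slide's two numbers side by side, here is a minimal arithmetic sketch. The function name is made up for illustration, and it assumes the 50% and 80% figures are both measured against the same single-core baseline, which the slide does not spell out.

```python
def throughput_per_area(throughput_gain: float, area_increase: float) -> float:
    """Relative throughput per unit of die area after the change (1.0 = baseline).

    Both arguments are fractional increases over the same baseline,
    e.g. 0.80 for +80% throughput and 0.50 for +50% area.
    """
    return (1.0 + throughput_gain) / (1.0 + area_increase)


# The slide's claim: +50% area for +80% throughput.
ratio = throughput_per_area(0.80, 0.50)
print(f"{ratio:.2f}x throughput per unit area")
```

Under that assumption the claim works out to roughly 1.8 / 1.5 = 1.2 times the baseline throughput per unit of area.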

I wouldn't look too deeply into a 4-year-old slide meant to promote AMD's upcoming products.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Originally posted by: Idontcare
Originally posted by: sHeFn
Insiders info...

Buldozer, konveyr four stage, two shared FMA of four core, ALU and FPU now shared too, four instructions per clock, fusion CMP/TEST & Jcc.
I can't say more... NDA

http://forums.anandtech.com/me...AR_FORUMVIEWTMP=Linear

2 FMAs for 4 cores? That's... interesting, in a bad way. If that diagram is any good, it would imply instead one 256-bit wide FMA per core which makes a lot more sense for performance.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: JackyP
Idontcare, you're not exactly one of those laypeople, are you? Do you know what cluster-based multithreading is? Bulldozer presumably is going to make use of that, and it's supposed to be 'better' than SMT according to those slides.
Is this the correct description:
"These patent application describe ways to execute a single thread on both clusters. This could be done by having a thread run ahead for early prefetches memory or by executing both ways of a branch in parallel and scrap the wrong way after branch resolution. A different variant is the parallel execution of the same code to gain reliability of the results by comparing them afterwards."
I thought that would be terribly inefficient?

Compared to the designers and experts working on this stuff firsthand at AMD I am most definitely a layperson.

The quote you pulled actually reads to me more like speculative multithreading than anything else, and yeah speculative multithreading is supposed to be terribly inefficient in terms of performance/watt and silicon real-estate allocated to supporting those duplicative compute circuits.

As for "what is clustered multi-threading": it is basically the idea of taking a very wide computation unit (like a 256-bit ALU) that can run in "monolithic" fashion on a single thread, and then (virtually) busting up the same hardware resources into a cluster of multi-thread processing resources (say 2x128-bit in this case, or 4x64-bit) so you can run threads in parallel (truly parallel, not time-sliced) on the same hardware that you would otherwise have run single threads on.
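A toy sketch of that partitioning idea, strictly as an illustration of the interpretation above and not of anything AMD has documented. The 64-bit lane granularity and the names `partition_lanes` and `width_per_thread` are assumptions made up for this example.

```python
from typing import List

LANE_BITS = 64    # assumed granularity of one lane
TOTAL_BITS = 256  # width of the hypothetical shared unit


def partition_lanes(num_threads: int) -> List[List[int]]:
    """Split the unit's lanes evenly among threads.

    Returns, for each thread, the list of lane indices it owns.
    """
    lanes = TOTAL_BITS // LANE_BITS
    if num_threads < 1 or lanes % num_threads != 0:
        raise ValueError("thread count must evenly divide the lane count")
    per_thread = lanes // num_threads
    return [
        list(range(t * per_thread, (t + 1) * per_thread))
        for t in range(num_threads)
    ]


def width_per_thread(num_threads: int) -> int:
    """Datapath width, in bits, that each thread gets."""
    return len(partition_lanes(num_threads)[0]) * LANE_BITS


# 1 thread = monolithic 256-bit; 2 threads = 2x128-bit; 4 threads = 4x64-bit.
for n in (1, 2, 4):
    print(n, "thread(s):", width_per_thread(n), "bits each", partition_lanes(n))
```

The point of the sketch is only that the same lanes are either handed to one thread as a whole or divided among several threads that then run side by side.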

Check out this graphic to better see what I am attempting to speak to.

Kuzi had an awesome thread on AMD Bulldozer speculation back in May that in its life ended up getting jam packed full of lots of good infos and sub-topics from all manner of knowledgeable posters here on the forums. Near the end (check last 30 posts or so) we really started talking about the implications of our interpretations of what clustered multithreading entailed.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,786
136
Originally posted by: Fox5
The 'multi-threading done right' slide is from 2005....

SMT at the time is referring to the P4. Switch on event multi-threading sounds like standard, single core multi-threading.

Switch on Event Multi-Threading differs from SMT in that it only switches threads on long-latency events like a last-level cache miss. It takes even fewer resources (die size, for one) than SMT, but it has limits on how much performance gain you can get, as it doesn't directly improve execution unit utilization like SMT does.

90nm Itanium features SoEMT.
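A rough way to see the switch-on-event behaviour is a toy cycle counter. This is a sketch under simplifying assumptions, not a model of any real chip: each thread is just a list of op latencies, any op at or above a made-up `MISS` threshold counts as a long-latency event, issuing a miss costs one cycle, and thread switches are free.

```python
from typing import List

MISS = 100  # hypothetical last-level cache miss latency, in cycles


def run_serial(threads: List[List[int]]) -> int:
    """Cycles needed when the threads run one after another, stalling on misses."""
    return sum(sum(t) for t in threads)


def run_soemt(threads: List[List[int]], miss_threshold: int = MISS) -> int:
    """Cycles needed when the core switches threads only on long-latency ops."""
    pc = [0] * len(threads)        # next op index per thread
    ready_at = [0] * len(threads)  # cycle at which each thread may issue again
    now = 0
    current = 0
    while any(pc[i] < len(t) for i, t in enumerate(threads)):
        runnable = [i for i, t in enumerate(threads) if pc[i] < len(t)]
        ready = [i for i in runnable if ready_at[i] <= now]
        if not ready:
            # Every unfinished thread is waiting on a miss: the core idles.
            now = min(ready_at[i] for i in runnable)
            continue
        if current not in ready:
            # The switch "event": the current thread is blocked or finished.
            current = ready[0]
        latency = threads[current][pc[current]]
        pc[current] += 1
        if latency >= miss_threshold:
            ready_at[current] = now + latency  # thread blocked until data returns
            now += 1                           # one cycle to issue the miss
        else:
            now += latency                     # short op occupies the core
            ready_at[current] = now
    return max([now] + ready_at)


a = [1, 1, MISS, 1, 1]
b = [1, 1, MISS, 1, 1]
print("serial:", run_serial([a, b]), "cycles")
print("SoEMT: ", run_soemt([a, b]), "cycles")
print("no misses:", run_serial([[1, 1], [1, 1]]), "vs", run_soemt([[1, 1], [1, 1]]))
```

In this toy model the gain comes entirely from overlapping the two misses. With no misses in the instruction streams the two numbers are identical, which matches the point that SoEMT does not raise execution unit utilization the way SMT does.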
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
I don't know about the link (I'd have to read about 10 more articles to understand the terms/diagrams that he is using!)

But if you check the "performance increases per MHz" that AMD has managed per architecture transition over the last decade
and you take the max (A64), things are not looking good for AMD!

With Bulldozer they must give Phenom II an "Athlon XP to Athlon 64" performance increase per MHz
just to catch up (if that) with the "performance per MHz" of a Q1 2009 Intel architecture! (first Nehalem)

If you add up the "performance increases per MHz" and the clock speed increases
that Nehalem is going to get in the 2-year gap from Q1 2009 to Q1 2011
(if AMD executes its plan flawlessly and launches Bulldozer in Q1 2011; I am not so sure, just a feeling!)

then with Bulldozer AMD will need to give Phenom II something like a "Pentium 4 9XX to Core 2" performance increase per MHz
just to catch up (if that) with the performance of a high-end Intel Q1 2011 part!

For me the odds are not looking good for AMD to take the Perf. lead in the next 5 years!

 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: MODEL3
For me the odds are not looking good for AMD to take the Perf. lead in the next 5 years!

For it to happen they need Intel to do what they did the last time it happened, take a step backwards in the IPC dept with something like Netburst II again. But I don't see it happening.

I think we'd all be happy to see the same sort of performance improvements that came from the K6-2 -> Athlon K7 transition.

The K7 didn't put AMD into a dominant lead, it went back and forth every 3 months as AMD and Intel vied to out-do each other in the run-up to 1GHz, but it did make their architecture competitive with the PII/PIII which resulted in a fantastic pace of price cuts and new SKU releases for about 2 yrs.

If Bulldozer architecture can be to PhII (K10.5 Stars core architecture) what the K7 was to the K6 then we'll all get a nice bump in what we can get performance/dollar wise.
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: Idontcare
Originally posted by: MODEL3
For me the odds are not looking good for AMD to take the Perf. lead in the next 5 years!

For it to happen they need Intel to do what they did the last time it happened, take a step backwards in the IPC dept with something like Netburst II again. But I don't see it happening.

I think we'd all be happy to see the same sort of performance improvements that came from the K6-2 -> Athlon K7 transition.

The K7 didn't put AMD into a dominant lead, it went back and forth every 3 months as AMD and Intel vied to out-do each other in the run-up to 1GHz, but it did make their architecture competitive with the PII/PIII which resulted in a fantastic pace of price cuts and new SKU releases for about 2 yrs.

If Bulldozer architecture can be to PhII (K10.5 Stars core architecture) what the K7 was to the K6 then we'll all get a nice bump in what we can get performance/dollar wise.

Well, the K6-3 to K7 transition was a good one!
But I said the transitions of the last 10 years!
I think the K6-2 was 1998 and the K7 1999!

Anyway, this doesn't matter. I didn't use such old transitions because improvements at that rate were probably easier to achieve back then than they are now,
and also many improvements back then came from the implementation of additional instruction sets like MMX/3DNow!/SSE etc., which were very useful for many applications at the time.
(Now we don't see instruction sets that have such a big and wide impact on the general performance level as the original MMX/SSE.)

Actually, I think with the Thunderbird AMD had a slightly faster CPU, but because the motherboard chipsets were not as good as the Intel ones (does anyone remember chipsets like the i440BX?) the system performance was about the same!

Anyway, in general I agree with what you are saying, and like you I hope that AMD's Bulldozer is going to bring good performance increases! (Actually, I would very much like AMD to regain the performance crown; competition is always good for the consumer!)

 

Idontcare

Elite Member
Oct 10, 1999
21,110
59
91
Originally posted by: MODEL3
Originally posted by: Idontcare
Originally posted by: MODEL3
For me the odds are not looking good for AMD to take the Perf. lead in the next 5 years!

For it to happen they need Intel to do what they did the last time it happened, take a step backwards in the IPC dept with something like Netburst II again. But I don't see it happening.

I think we'd all be happy to see the same sort of performance improvements that came from the K6-2 -> Athlon K7 transition.

The K7 didn't put AMD into a dominant lead, it went back and forth every 3 months as AMD and Intel vied to out-do each other in the run-up to 1GHz, but it did make their architecture competitive with the PII/PIII which resulted in a fantastic pace of price cuts and new SKU releases for about 2 yrs.

If Bulldozer architecture can be to PhII (K10.5 Stars core architecture) what the K7 was to the K6 then we'll all get a nice bump in what we can get performance/dollar wise.

Well, the K6-3 to K7 transition was a good one!
But I said the transitions of the last 10 years!
I think the K6-2 was 1998 and the K7 1999!

Anyway, this doesn't matter. I didn't use such old transitions because improvements at that rate were probably easier to achieve back then than they are now,
and also many improvements back then came from the implementation of additional instruction sets like MMX/3DNow!/SSE etc., which were very useful for many applications at the time.
(Now we don't see instruction sets that have such a big and wide impact on the general performance level as the original MMX/SSE.)

Actually, I think with the Thunderbird AMD had a slightly faster CPU, but because the motherboard chipsets were not as good as the Intel ones (does anyone remember chipsets like the i440BX?) the system performance was about the same!

Anyway, in general I agree with what you are saying, and like you I hope that AMD's Bulldozer is going to bring good performance increases! (Actually, I would very much like AMD to regain the performance crown; competition is always good for the consumer!)

(K6-3 was just K6-2 with on-die L2$...no architecture or ISA differences)

Resist the temptation to relegate history as being irrelevant to current day situations just because it falls beyond a threshold of recent memory...consider that Dirk Meyer (AMD's CEO) was hired on to AMD specifically to lead the development of the K7.

If anyone could direct a company to do another K6 -> K7 leap in architecture transition it's Dirk. And if he can't do it then I don't think there is anyone out there for hire at the moment that could. (and Dirk was only available for hire at that time because DEC had just imploded, pushing the Alpha design guys out the door and into the unemployment lines)
 

MODEL3

Senior member
Jul 22, 2009
528
0
0
Originally posted by: Idontcare
Originally posted by: MODEL3
Originally posted by: Idontcare
Originally posted by: MODEL3
For me the odds are not looking good for AMD to take the Perf. lead in the next 5 years!

For it to happen they need Intel to do what they did the last time it happened, take a step backwards in the IPC dept with something like Netburst II again. But I don't see it happening.

I think we'd all be happy to see the same sort of performance improvements that came from the K6-2 -> Athlon K7 transition.

The K7 didn't put AMD into a dominant lead, it went back and forth every 3 months as AMD and Intel vied to out-do each other in the run-up to 1GHz, but it did make their architecture competitive with the PII/PIII which resulted in a fantastic pace of price cuts and new SKU releases for about 2 yrs.

If Bulldozer architecture can be to PhII (K10.5 Stars core architecture) what the K7 was to the K6 then we'll all get a nice bump in what we can get performance/dollar wise.

Well, the K6-3 to K7 transition was a good one!
But I said the transitions of the last 10 years!
I think the K6-2 was 1998 and the K7 1999!

Anyway, this doesn't matter. I didn't use such old transitions because improvements at that rate were probably easier to achieve back then than they are now,
and also many improvements back then came from the implementation of additional instruction sets like MMX/3DNow!/SSE etc., which were very useful for many applications at the time.
(Now we don't see instruction sets that have such a big and wide impact on the general performance level as the original MMX/SSE.)

Actually, I think with the Thunderbird AMD had a slightly faster CPU, but because the motherboard chipsets were not as good as the Intel ones (does anyone remember chipsets like the i440BX?) the system performance was about the same!

Anyway, in general I agree with what you are saying, and like you I hope that AMD's Bulldozer is going to bring good performance increases! (Actually, I would very much like AMD to regain the performance crown; competition is always good for the consumer!)

(K6-3 was just K6-2 with on-die L2$...no architecture or ISA differences)

I didn't say anything different!
The K6-3 was a little faster than the K6-2, and when the K7 came to market the fastest AMD processor was the K6-3, not the K6-2! (But it doesn't matter anyway)



Resist the temptation to relegate history as being irrelevant to current day situations just because it falls beyond a threshold of recent memory...consider that Dirk Meyer (AMD's CEO) was hired on to AMD specifically to lead the development of the K7.

I didn't say that it is irrelevant! I just said the following:

Originally posted by: MODEL3
I didn't use such old transitions because improvements at that rate were probably easier to achieve back then than they are now,
and also many improvements back then came from the implementation of additional instruction sets like MMX/3DNow!/SSE etc., which were very useful for many applications at the time.
(Now we don't see instruction sets that have such a big and wide impact on the general performance level as the original MMX/SSE.)


If anyone could direct a company to do another K6 -> K7 leap in architecture transition it's Dirk. And if he can't do it then I don't think there is anyone out there for hire at the moment that could. (and Dirk was only available for hire at that time because DEC had just imploded, pushing the Alpha design guys out the door and into the unemployment lines)

Are you looking for a job at AMD?
Just kidding!
Well, I see you have a lot of faith in Dirk's abilities! I don't know him, so I have no personal opinion, but I believe you!

 

4lpha0ne

Junior Member
Apr 30, 2004
4
0
0
So anyone feel up to the challenge of determining whether this drive-by poster in spring 2008 really was "in the know" and as such really was divulging NDA secrets in their post?
Insiders info...

Buldozer, konveyr four stage, two shared FMA of four core, ALU and FPU now shared too, four instructions per clock, fusion CMP/TEST & Jcc.
I can't say more... NDA
Just want to update this. Correct link is now http://forums.anandtech.com/showpost.php?p=25623509&postcount=34

CMP/TEST & Jcc fusion is in Bulldozer according to the GCC mailing list. See: http://gcc.gnu.org/ml/gcc-patches/2010-04/msg01464.html
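For anyone wondering what the fusion amounts to, here is a toy decoder pass. It is a sketch of the general idea only: the instruction strings, the `fuse` function and the "any j-mnemonic except jmp" rule are simplifications made up for this example, and real hardware places further restrictions on which pairs can fuse.

```python
from typing import List, Tuple

FUSIBLE_FIRST = {"cmp", "test"}  # flag-setting ops eligible to lead a fused pair


def is_jcc(mnemonic: str) -> bool:
    """Treat any 'j...' mnemonic other than the unconditional jmp as a Jcc."""
    return mnemonic.startswith("j") and mnemonic != "jmp"


def fuse(instrs: List[str]) -> List[Tuple[str, ...]]:
    """Group the instruction stream into macro-ops.

    A CMP or TEST immediately followed by a conditional jump becomes one
    two-instruction macro-op; everything else stays on its own.
    """
    out: List[Tuple[str, ...]] = []
    i = 0
    while i < len(instrs):
        op = instrs[i].split()[0]
        has_next = i + 1 < len(instrs)
        if op in FUSIBLE_FIRST and has_next and is_jcc(instrs[i + 1].split()[0]):
            out.append((instrs[i], instrs[i + 1]))
            i += 2
        else:
            out.append((instrs[i],))
            i += 1
    return out


stream = [
    "mov eax, ebx",
    "cmp eax, 0",
    "jne loop",       # fuses with the cmp directly before it
    "test ecx, ecx",
    "add eax, 1",     # breaks adjacency, so the test stays unfused
    "jz done",
]
fused = fuse(stream)
print(len(stream), "instructions ->", len(fused), "macro-ops")
for group in fused:
    print("  ", " + ".join(group))
```

The payoff in a real core is that the compare and the branch travel through the pipeline as a single operation, so a stream with many such pairs needs fewer slots per clock.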
 