The Cell Processor by IBM and....

BoberFett

Lifer
Oct 9, 1999
37,563
9
81
Originally posted by: Holmecollie
Well, I read some archived post mentioning the Cell processor by IBM and Toshiba and read the article @ gamespot

http://www.gamespot.com/ps2/news/news_6073040.html

Let's say this processor is that good, will it reach the mainstream or end up powering supercomputers and $$$ servers?

Forgive my skepticism, but 100 times more power than a P4?

Do you really think the engineers at IBM are 100 times smarter than the engineers at Intel?
 

Robor

Elite Member
Oct 9, 1999
16,979
0
76
Originally posted by: Holmecollie
Well, I read some archived post mentioning the Cell processor by IBM and Toshiba and read the article @ gamespot

http://www.gamespot.com/ps2/news/news_6073040.html

Let's say this processor is that good, will it reach the mainstream or end up powering supercomputers and $$$ servers?
The real question is, how many of us need a CPU that fast? Plus, as the article said, it's going to take a lot of programming to get an OS and software together for it. But to answer your question, I think it would wind up where the $$$ is.

 

DurocShark

Lifer
Apr 18, 2001
15,708
5
56
Originally posted by: BoberFett
Originally posted by: Holmecollie
Well, I read some archived post mentioning the Cell processor by IBM and Toshiba and read the article @ gamespot

http://www.gamespot.com/ps2/news/news_6073040.html

Let's say this processor is that good, will it reach the mainstream or end up powering supercomputers and $$$ servers?

Forgive my skepticism, but 100 times more power than a P4?

Do you really think the engineers at IBM are 100 times smarter than the engineers at Intel?

No, but they're not hindered by needing x86 compatibility. I'm sure if the Intel engineers were allowed to do something totally unique and unsupported, they'd come up with something super powerful too.
 

cow123

Senior member
Apr 6, 2003
259
0
0
I read this thing a while back. Anyway, it's not 100x faster than the P4 in general, just in floating-point performance. Also, I wonder what P4s were out when that article was published... maybe Willamettes?

Edit: oh, never mind, it said 2.5GHz.
 

WarCon

Diamond Member
Feb 27, 2001
3,920
0
0
I am just curious how they are going to power/cool a beast that has 16 processors on one die. Even if they manage a 50% power reduction per processor, you're still looking at 640 watts of power at full load (based on current processor power usage). Maybe they aren't going to have much onboard cache, which would further reduce power needs.

If this becomes real and affordable, it will make P4/P5/Opteron a thing of the past, as even a poor OS emulator that only goes 1/4 the speed will still be putting out 250 GFLOPS.
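
For anyone who wants to check the arithmetic, here is a rough Python sketch of the two estimates above. The 80 W per-processor figure is not stated anywhere; it is only what the 640 W total implies. The 1 teraflop peak is likewise an assumption, taken from the 250 GFLOPS figure at 1/4 speed.

```python
# Back-of-the-envelope check of the power and emulation estimates.
# All inputs are assumptions inferred from the post, not measured values.

CORES = 16
WATTS_PER_CURRENT_CPU = 80.0   # assumption: implied by 640 W = 16 * W * 0.5
POWER_REDUCTION = 0.5          # the hoped-for 50% saving per processor

# Total draw at full load if every core still needs half of today's power.
total_watts = CORES * WATTS_PER_CURRENT_CPU * (1.0 - POWER_REDUCTION)

CLAIMED_PEAK_GFLOPS = 1000.0   # assumption: roughly 1 teraflop claimed peak
EMULATION_EFFICIENCY = 0.25    # "a poor OS emulator that only goes 1/4 the speed"

# Throughput left over after the emulation penalty.
emulated_gflops = CLAIMED_PEAK_GFLOPS * EMULATION_EFFICIENCY

print(f"Estimated full-load power: {total_watts:.0f} W")
print(f"Estimated emulated throughput: {emulated_gflops:.0f} GFLOPS")
```

Change `WATTS_PER_CURRENT_CPU` or `POWER_REDUCTION` to see how sensitive the total is to either guess.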
 

PlatinumGold

Lifer
Aug 11, 2000
23,168
0
71
Originally posted by: DurocShark
Originally posted by: BoberFett
Originally posted by: Holmecollie
Well, I read some archived post mentioning the Cell processor by IBM and Toshiba and read the article @ gamespot

http://www.gamespot.com/ps2/news/news_6073040.html

Let's say this processor is that good, will it reach the mainstream or end up powering supercomputers and $$$ servers?

Forgive my skepticism, but 100 times more power than a P4?

Do you really think the engineers at IBM are 100 times smarter than the engineers at Intel?

No, but they're not hindered by needing x86 compatibility. I'm sure if the Intel engineers were allowed to do something totally unique and unsupported, they'd come up with something super powerful too.

They were, with the Itanium processor. No x86 compatibility, clean-sheet design. Still not 100 times more powerful than a P4.
 

buleyb

Golden Member
Aug 12, 2002
1,301
0
0
Originally posted by: cow123
Itanium is still x86, it's just IA64 x86

They support 32-bit x86 through emulation.
**EDIT** Meaning, as said below this post, IA64 is not x86.

And making all these processors run together isn't tough. Making software that can use them for a reasonable cost, that's a hell of a lot tougher. Parallel systems are much harder to write code for, because timing and synchronization are such an overwhelming problem.

And making a 100x faster floating-point engine isn't hard. It's called AltiVec, and now it's in parallel.
 

TerryMathews

Lifer
Oct 9, 1999
11,473
2
0
Originally posted by: cow123
Itanium is still x86, it's just IA64 x86

Ummm... Say what? Itanium is not x86. It can't run x86 instructions. IA64 != x86. Opteron counts as x86 as it can run x86 apps.

Here's a perfect example. Every x86 machine should be able to run edit or edlin from MS-DOS. Good luck getting an Itanic to boot MS-DOS (without the assistance of Windows and emulation)
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
Good luck getting an Itanic to boot MS-DOS (without the assistance of Windows and emulation)
You wouldn't need luck. An Itanium boots to MSDOS fine. I don't know why you would want to, but you could easily. Without Windows or emulation. There is an internal hardware translation engine on the core of the Itanium and the Itanium 2 that translates IA32 instructions on the fly into IA64. The system will boot MSDOS, or Windows 3.11 if you want.

As far as Cell goes, my only reply is to wait and see. I seem to remember similar levels of enthusiasm for Transmeta's products. What they are doing is not what I would call "revolutionary", and it's my expectation that, while it may be significantly faster at certain tasks, it won't be substantially faster for more general operations. It's noteworthy that IBM itself says that "elements of its design will be seen in future server chips from IBM". Note that they are not saying "we are replacing our entire product line with Cell." IBM's current high-end server processor is the POWER4 and it's not "100 times faster than a 2.5GHz Pentium 4".
 

BD231

Lifer
Feb 26, 2001
10,568
138
106
Good stuff, it's about time for another big jump in processing power/technology. I really hope processor companies start thinking about heat output though... I'd get a 3+GHz P4, but the need for an air conditioner turns me off.

All-purpose gaming/24-7 server processors that require no cooling and have more power than anyone could ever possibly need.
 

TerryMathews

Lifer
Oct 9, 1999
11,473
2
0
Originally posted by: pm
You wouldn't need luck. An Itanium boots to MSDOS fine. I don't know why you would want to, but you could easily. Without Windows or emulation. There is an internal hardware translation engine on the core of the Itanium and the Itanium 2 that translates IA32 instructions on the fly into IA64. The system will boot MSDOS, or Windows 3.11 if you want.

This is interesting. I'm going to have to bitch-slap my sources. I guess they got confused by Intel's emulation layer for Windows and assumed that the chip couldn't natively do it. Evidently, the Windows emulation layer is just better than the hardware support?
 

buleyb

Golden Member
Aug 12, 2002
1,301
0
0
Don't worry Terry, you'll be right in due time. Intel is removing/disabling the hardware emulation in future Itaniums in favor of software emulation (as it performs better, who knew). So future Itaniums won't be able to boot MS-DOS natively.
 

FearoftheNight

Diamond Member
Feb 19, 2003
5,101
0
71
Can pm or someone here explain this to me? What exactly does x86 architecture mean? And what do "optimizations" such as SSE/SSE2 do? Thanks.
 

wetcat007

Diamond Member
Nov 5, 2002
3,502
0
0
Originally posted by: FearoftheNight
Can pm or someone here explain this to me? What exactly does x86 architecture mean? And what do "optimizations" such as sse/sse2/ do? thnx.

I'll put it as plainly as possible: x86 = the Linux/Windows platform, whereas the Mac uses its own platform, which is why you can't install Windows onto a Mac. x86 is a way to keep all parts compatible with the OSes without having many different platforms competing that need different hardware. Current-day CPUs are 686, and previous generations were 586, 486, 386, and so on.

Optimizations are when a program uses a set of instructions built into the CPU, which can often speed the program up. In the case of Intel, it's pushing software companies to use its instructions instead of AMD's optimizations, heh. When software gets the boost from SSE2 without 3DNow! Professional support, it makes their CPU generally faster in applications that take advantage of it, generally media applications.

As for IBM's CPUs, I don't really have any faith this will be all they claim, and the kind of heat it produces, as well as the insanely complex instructions needed, will result in having it slowed down anyway.
 

pm

Elite Member Mobile Devices
Jan 25, 2000
7,419
22
81
This is interesting. I'm going to have to bitch-slap my sources. I guess they got confused by Intel's emulation layer for Windows and assumed that the chip couldn't natively do it. Evidently, the Windows emulation layer is just better than the hardware support?
To be honest I wasn't completely certain that it would work either. So I walked over and asked someone who works on the systems. He said that he'd booted Windows 95 in 32-bit mode, but had never tried MSDOS. But he said that he was certain that it would work. There is native hardware support for IA32 and IA16 on the Itanium and Itanium 2 (McKinley and Madison). There is also a binary translation mode that does a form of software emulation.
Can pm or someone here explain this to me? What exactly does x86 architecture mean? And what do "optimizations" such as SSE/SSE2 do? Thanks.
This will get pretty far off the original subject of the thread, so I apologize to the person who started the thread for somewhat hijacking it.

"x86" is a generic term that refers to binary compatibility with any x86 chip that Intel has produced (note that one might argue that this definition is very 'Intel-centric', but I think that it's honestly the correct definition). The x86 family includes, among others: the 8086, the 80286, the i386, the i486, the Pentium, etc. So a chip that is compatible with the "x86 architecture" should be able to run any program that was designed to run on any of these chips and any other previous microprocessor that is part of the "x86 family". Nowadays we take it for granted that a program written 14 years ago to run on an 80286 microprocessor will work fine (and much faster) on a modern microprocessor like the Pentium 4, but throughout the greater history of the computer, backwards compatibility with previous generations has been pretty rare. The term "x86" is IMO generic, since it doesn't distinguish between 16-bit code and 32-bit code, and I personally prefer to use IA16 and IA32 instead to be more specific.

As the microprocessor has developed, new instructions have been added, such as MMX, SSE, and SSE2, among others. The purpose of all three of these additional instruction sets is primarily to be able to process multiple chunks of data simultaneously with one instruction. This is usually referred to as SIMD - Single Instruction Multiple Data. Other examples of SIMD include Motorola's AltiVec and AMD's 3DNow! instruction sets.

An example of a SIMD instruction taken completely at random would be the SSE instruction ADDPS - which adds 4 single precision FP numbers in one register to 4 single precision FP numbers in another register all with one instruction. Where this would normally take 4 instructions - if not more - this one instruction processes several numbers in parallel. This kind of parallel functionality is very useful in multimedia, 3D rendering and encryption and can speed up performance in these applications substantially.
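
To make the ADDPS description concrete, here is a minimal Python sketch that models what the instruction does: a lane-by-lane add of four packed single-precision values. It is only an illustration, not real SSE code. The function names `addps` and `scalar_adds` are made up for this example, and Python's `array("f")` stands in for a 128-bit register holding four 32-bit floats.

```python
from array import array

LANES = 4  # an SSE register holds four single-precision floats


def addps(xmm_a, xmm_b):
    """Model of ADDPS: add four packed floats lane by lane in 'one instruction'."""
    if len(xmm_a) != LANES or len(xmm_b) != LANES:
        raise ValueError("each operand must hold exactly four floats")
    # array("f") stores the results as 32-bit single-precision values.
    return array("f", (x + y for x, y in zip(xmm_a, xmm_b)))


def scalar_adds(a, b):
    """The non-SIMD equivalent: four separate scalar add operations."""
    result = array("f", [0.0] * LANES)
    result[0] = a[0] + b[0]
    result[1] = a[1] + b[1]
    result[2] = a[2] + b[2]
    result[3] = a[3] + b[3]
    return result


a = array("f", [1.0, 2.0, 3.0, 4.0])
b = array("f", [10.0, 20.0, 30.0, 40.0])

packed = addps(a, b)
print(list(packed))                          # [11.0, 22.0, 33.0, 44.0]
print(list(scalar_adds(a, b)) == list(packed))  # True
```

Both routes give the same answer; the point is that the hardware does the first one as a single instruction instead of four.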

But, of course, you only see these performance gains if these new instructions are used. If the program you are running never tries to use any of the SIMD instructions - possibly because it was written/compiled before these new instructions existed - then it will never see their benefit. So that leads to the final part of the question, which is optimizing code for SIMD. In many cases, software code (such as code written in the C programming language) can simply be recompiled with the latest compiler, which knows about these instructions, to be able to use them. But to really get the full benefit from any SIMD set of instructions, in most cases the authors of the software need to hand-optimize their code (write it in the native language of the microprocessor rather than a higher-level language like C) to run best under a SIMD instruction set. For example, if they have a set of code that performs an operation on a matrix of numbers, in order to get the highest performance from their code they will probably try to hand-write a routine that implements this operation using SIMD. So optimizations for any of the SIMD instruction sets can take two forms: hand-optimized code, which can be a difficult task but can yield very high performance gains, or recompiled optimizations, where you just stuff your program into the latest compiler and tell the compiler to optimize for certain instruction sets.

For more information on the history of the microprocessor and the "x86 family" I highly recommend borrowing from your local library "The Microprocessor: A Biography" by Michael S. Malone. It's getting a little dated now, as it was written in 1995, but this actually allows it to focus more on the early history of the microprocessor. It's a great book - although a bit "fluffy" when it's not talking about history. It also has a good section on how microprocessors work.

For more information on SIMD and optimization, I would recommend typing expressions like "SSE optimize" and "MMX optimize" into Google.
 

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
The thing has 16 cores, so to be 100 times faster than a P4, each core would have to be about 6x faster. But each core would also be much smaller than a P4 core. I just don't think it's possible. To do about 1 teraflop, each core would have to support something like a 32-word-long FP vector and run at 2GHz.
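
The "about 6x" figure is just the claimed speedup divided evenly across the cores. A quick Python sketch, assuming perfect scaling across all 16 cores (which real software rarely gets):

```python
# Per-core speedup needed to hit the claimed figure.
# Assumption: the work splits perfectly across every core.

CLAIMED_SPEEDUP = 100   # "100 times faster than a P4"
CORES = 16

per_core_speedup = CLAIMED_SPEEDUP / CORES

print(f"Each core must be {per_core_speedup:.2f}x a P4")
```

Anything less than perfect scaling pushes the required per-core number higher.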
 

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
You know, I've been thinking about it some more and I guess it could sorta be possible to attain something vaguely like the speed that is being claimed for the cell processor.

Alright, here's what I imagine the Cell processor will be like:

I reckon it will have about 150-200 million transistors and be made on a 0.09 micron process. So each core would have ~10M transistors. This is roughly equivalent to the number of transistors in an old P2 with off-chip L2 cache.

So if each core ran at 2GHz with 32-word-long vector capabilities, it could attain 1 teraflop. Or, if it ran at 4GHz with 16-word-long vector capabilities, it could also attain 1 teraflop.
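
Here is a small Python sketch of that peak-rate estimate for both configurations. It assumes one floating-point operation per vector word per cycle on every core, which is a simplification for the sake of the arithmetic. The helper name `peak_flops` is made up for this example.

```python
def peak_flops(cores, vector_words, clock_hz, flops_per_word_per_cycle=1):
    """Theoretical peak: every lane of every core retires an FP op each cycle."""
    return cores * vector_words * clock_hz * flops_per_word_per_cycle


CORES = 16

# Configuration 1: 32-word vectors at 2 GHz.
wide_and_slow = peak_flops(CORES, 32, 2_000_000_000)

# Configuration 2: 16-word vectors at 4 GHz.
narrow_and_fast = peak_flops(CORES, 16, 4_000_000_000)

print(f"32 words @ 2 GHz: {wide_and_slow / 1e12:.3f} teraflops")
print(f"16 words @ 4 GHz: {narrow_and_fast / 1e12:.3f} teraflops")
```

Both configurations come out to the same peak, since halving the vector width and doubling the clock cancel out. This is a ceiling only; it ignores memory bandwidth and any instruction that can't cover the whole vector in one cycle.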

But the instruction set that such a core could support would be very limited due to the limited number of transistors available. Also, some of the vector instructions may not be able to operate on an entire vector in one cycle. For example, it should be fairly cheap in terms of transistors to implement 16-word-long add/sub/and/or/xor/shift/cmp/mov instructions, but a mul instruction may only be able to act on 4 words at a time. And the transcendental functions like cos/sin/log? Forget about it. It may be necessary to resort to good old software emulation to do these operations.

So in a way, such a processor I describe would be a sort of hybrid between a modern vector processor and an old school vector processor.

And such a processor would need humongous memory bandwidth.
This is just some speculating on my part so take it with a grain of salt.
 

anthrax

Senior member
Feb 8, 2000
695
3
81
IBM already has processors with 600 million transistors on chip, such as the POWER4 series used in the high-end pSeries servers... The chip and the board, I believe, are housed in a book mounting which integrates a huge heat sink.
 

0roo0roo

No Lifer
Sep 21, 2002
64,862
84
91
Originally posted by: BD231
Good stuff, it's about time for another big jump in processing power/technology. I really hope processor companies start thinking about heat output though... I'd get a 3+GHz P4, but the need for an air conditioner turns me off.

All-purpose gaming/24-7 server processors that require no cooling and have more power than anyone could ever possibly need.


Never heard of Zalman? Their quiet cooler works well on the P4.
 

NateSLC

Senior member
Feb 28, 2001
943
0
0
Originally posted by: 0roo0roo
Originally posted by: BD231
Good stuff, it's about time for another big jump in processing power/technology. I really hope processor companies start thinking about heat output though... I'd get a 3+GHz P4, but the need for an air conditioner turns me off.

All-purpose gaming/24-7 server processors that require no cooling and have more power than anyone could ever possibly need.


Never heard of Zalman? Their quiet cooler works well on the P4.

Zalman might help get that heat off the processor and into the room more efficiently, but he'll still need an A/C unit to get that heat out of the room.
 

0roo0roo

No Lifer
Sep 21, 2002
64,862
84
91
I forget how much a 3GHz P4 puts out, but it can't be more than a 100-watt lightbulb. Do you live in hell or something?
 

zephyrprime

Diamond Member
Feb 18, 2001
7,512
2
81
IBM already has processors with 600 million transistors on chip, such as the POWER4 series used in the high-end pSeries servers
The Power4+ only has 184 million transistors.
 