This is interesting. I'm going to have to bitch-slap my sources. I guess they got confused by Intel's emulation layer for Windows and assumed that the chip couldn't natively do it. Evidently, the Windows emulation layer is just better than the hardware support?
To be honest I wasn't completely certain that it would work either. So I walked over and asked someone who works on the systems. He said that he'd booted Windows 95 in 32-bit mode, but had never tried MSDOS. But he said that he was certain that it would work. There is native hardware support for IA32 and IA16 on the Itanium and Itanium 2 (McKinley and Madison). There is also a binary translation mode that does a form of software emulation.
Can pm or someone here explain this to me? What exactly does "x86 architecture" mean? And what do "optimizations" such as SSE/SSE2 do? Thanks.
This will get pretty far off the original subject of the thread, so I apologize to the person who started the thread for somewhat hijacking it .
"x86" is a generic term that refers to binary compatibility with any x86 chip that Intel has produced (note that one might argue that this definition is very 'Intel-centric' but I think that it's honestly the correct definition) . The x86 family includes among others: the 8086, the 80286, the i386, the i486, the Pentium, etc. So a chip that is compatible with the "x86 architecture" should be able to run any program that was designed to run on any of these chips and any other previous microprocessor that is part of the "x86 family". Nowadays we take it for granted that you could run a program written 14 years ago to run on a 80286 microprocessor will work fine (and much faster) on a modern microprocessor like the Pentium 4, but throughout the greater history of the computer backwards compaibility with previous generations has been pretty rare. The term "x86" is IMO generic since it doesn't distinguish between 16-bit code and 32-bit code and I personally prefer to use IA16 and IA32 instead to be more specific.
As the microprocessor has developed, new instructions have been added, such as MMX, SSE, and SSE2, among others. The purpose of all three of these additional instruction sets is primarily to be able to process multiple chunks of data simultaneously with one instruction. This is usually referred to as SIMD - Single Instruction Multiple Data. Other examples of SIMD include Motorola's AltiVec and AMD's 3DNow! instruction sets.
An example of a SIMD instruction, taken more or less at random, would be the SSE instruction ADDPS - which adds the 4 single-precision FP numbers in one register to the 4 single-precision FP numbers in another register, all with one instruction. Where this would normally take at least 4 instructions, this one instruction processes several numbers in parallel. This kind of parallel functionality is very useful in multimedia, 3D rendering, and encryption, and can speed up performance in these applications substantially.
But, of course, you only see these performance gains if these new instructions are actually used. If the program you are running never tries to use any of the SIMD instructions - possibly because it was written/compiled before these new instructions existed - then it will never see their benefit. So that leads to the final part of the question, which is optimizing code for SIMD. In many cases, software code (such as code written in the C programming language) can simply be recompiled with the latest compiler, which knows about these instructions and can use them. But to really get the full benefit from any SIMD instruction set, in most cases the authors of the software need to hand-optimize their code (write it in the native language of the microprocessor rather than a higher-level language like C) to run best with SIMD instructions. For example, if they have a set of code that performs an operation on a matrix of numbers, in order to get the highest performance they will probably try to hand-write a routine that implements this operation using SIMD. So optimizations for any of the SIMD instruction sets can take two forms: hand-optimized code, which can be a difficult task but can yield very high performance gains, or recompiled optimizations, where you just stuff your program into the latest compiler and tell the compiler to optimize for certain instruction sets.
For more information on the history of the microprocessor and the "x86 family", I highly recommend borrowing from your local library "The Microprocessor: A Biography" by Michael S. Malone. It's getting a little dated now, as it was written in 1995, but this actually allows it to focus more on the early history of the microprocessor. It's a great book - although a bit "fluffy" when it's not talking about history. It also has a good section on how microprocessors work.
For more information on SIMD and optimization, I would recommend typing expressions like "SSE optimize" and "MMX optimize" into Google.