64bit, really all that?


Vee

Senior member
Jun 18, 2004
689
0
0
Originally posted by: glugglug
Vee: is "52-bit" addressing a clean linear mode or is that segmented crap like PAE to get the extra 4 bits?
I'll just pretend I understood the question, and we'll take it from there.

(I think AMD64 supports PAE inside legacy mode.) But no, we are free from Intel's dreaded segmented addressing. 64-bit mode does the opposite of Physical Address Extension: it goes from a larger virtual space (64/48 bits; 48-bit in the K8) to a smaller physical space (52/40 bits; 40-bit in the K8).
In short, no, there's a minimum of "segmented crap". It's linear in everything that matters and can be.

The AMD x86-64 architecture, as defined, supports physical addresses up to a maximum of 52 bits (4 petabytes). In that case we are using all 64 bits for linear virtual addresses. We can't use more than a total of 4 petabytes, of course, but we can put memory blocks of any size (well...) anywhere in the 64-bit range. But all that is just x86-64 theory.

In practice (in the K8 CPUs) we are limited to physical addresses of at most 40 bits (1 terabyte). The K8 also limits the linear virtual address range to the lower 48 bits. As before, we cannot use more than 1 terabyte, but again we can put the memory blocks anywhere in the 48-bit range.

The mapping of addresses from the 48-bit linear space to the 40-bit physical space is done by a paging mechanism. I guess the principle is similar to Windows32's mapping of 32-bit linear virtual addresses to physical addresses in 4KB pages.
This is sound, since it allows the OS to avoid fragmentation of physical RAM, and also allows memory pages to be stored in swap.
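To make that concrete, here is a rough C sketch of how a 48-bit linear address is carved up under the standard four-level, 4KB-page scheme described in the AMD manual (the field names are my own, purely illustrative):

#include <stdint.h>
#include <stdio.h>

/* Illustrative decomposition of a 48-bit linear address in AMD64
 * long mode with 4KB pages: four 9-bit table indices plus a
 * 12-bit offset into the page.  Field names are informal. */
typedef struct {
    unsigned pml4_index;   /* bits 47..39 - top-level table        */
    unsigned pdpt_index;   /* bits 38..30 - page directory pointer */
    unsigned pd_index;     /* bits 29..21 - page directory         */
    unsigned pt_index;     /* bits 20..12 - page table             */
    unsigned page_offset;  /* bits 11..0  - offset in the 4KB page */
} linear_addr_fields;

linear_addr_fields split_linear(uint64_t la)
{
    linear_addr_fields f;
    f.pml4_index  = (la >> 39) & 0x1FF;
    f.pdpt_index  = (la >> 30) & 0x1FF;
    f.pd_index    = (la >> 21) & 0x1FF;
    f.pt_index    = (la >> 12) & 0x1FF;
    f.page_offset =  la        & 0xFFF;
    return f;
}

int main(void)
{
    linear_addr_fields f = split_linear(0x00007F12345678ABULL);
    printf("%u %u %u %u %u\n", f.pml4_index, f.pdpt_index,
           f.pd_index, f.pt_index, f.page_offset);
    return 0;
}

Each index selects an entry in one level of the page tables, and the final entry supplies the physical page frame; that is where the 48-bit linear address gets folded down into the 40-bit physical space.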

In the case of the segmented legacy modes, in 'compatibility mode', addresses are first translated to a linear address with segmentation similar to the original modes. However, I think the segment selectors are used directly, combined with the address, to form the linear address - so no page descriptors at that stage. There seem to be tons of options here, though, and we'll have to read up on Windows64 to understand exactly how MS has implemented this. Anyway, we now have a linear address from the segmented mode. This linear address corresponds to 64-bit (48-bit) mode's linear virtual address, and is then mapped by the same paging mechanism to a physical address.

I think we must read the details about the paging to physical pages in the MS Windows64 documentation (or Linux kernel stuff).
There is also 'AMD64 Programmers Manual, Volume 2: System Programming'
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
But I suspect the OS will be the important part in this, and the cpu manual will only provide us with a headache in this case.
- Sorry, I do not have a link to a suitable Windows x86-64 document.
 

Vee

Senior member
Jun 18, 2004
689
0
0
Originally posted by: Vee
Originally posted by: Gamingphreek
You misunderstood me. There are CPUs that are 128- and 256-bit capable. They are the Crusoe and Efficeon. I think the Efficeon is the 256-bit one.
OK. My point was that you need to consider what those bits mean, in a computing context, when making comparisons. I do not know terribly much about Transmeta's CPUs. I tried to read some of the early documents, but they were awfully unconventional and complicated. But my rough guess is still that, in the sense that the Crusoe is "128-bit", so is the P4 "128-bit" - meaning the width of data that can be held in a register and operated upon. I'll check up on the Crusoe and maybe get back.
OK, here's a brief update on this. Some shallow and inconclusive research seems to indicate that the registers beneath the morph layer in the Crusoe are 32-bit and 80-bit (FPU). Paths seem to be 64-bit. The "128-bit" and "256-bit" properties seem simply to be the length of the VLIW instruction.
So the Crusoe seems to be a "32-bit" CPU and the Efficeon a "64-bit" CPU.
 

White Widow

Senior member
Jan 27, 2000
773
0
71
The 64-bit addressing in the current A64 chips is almost useless. Even if we had WindowsXP-64 right now, the only software that would see substantial gains solely from 64-bit wide registers would be database applications (which address huge amounts of memory), or other applications that require great precision (like physical modeling). This is because right now these huge (or precise) values must be split, fit into multiple registers, and the resulting values recombined. This is very inefficient. The reason current A64 chips perform so well on current 32-bit software is due to completely separate architectural enhancements in the K8 core that have nothing to do with its 64-bit capabilities.
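To make that "split, fit into multiple registers, recombine" point concrete, here is a rough C sketch (purely illustrative, not what any particular compiler actually emits): a 64-bit add on a 32-bit target has to be done in two halves with an explicit carry, while a 64-bit target does it in one register-wide operation.

#include <stdint.h>

/* 64-bit addition the way a 32-bit target has to do it:
 * two 32-bit adds plus an explicit carry between the halves. */
uint64_t add64_on_32bit(uint32_t a_lo, uint32_t a_hi,
                        uint32_t b_lo, uint32_t b_hi)
{
    uint32_t lo    = a_lo + b_lo;
    uint32_t carry = (lo < a_lo);          /* did the low half overflow? */
    uint32_t hi    = a_hi + b_hi + carry;
    return ((uint64_t)hi << 32) | lo;
}

/* The same operation on a 64-bit target: one register-wide add. */
uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}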

As I see it, there are 2 fundamental benefits to the x86-64 ISA. The first is that it allows for the addressing of 8 more GP registers (for a total of 16). The second is that these new registers are 64 bits wide. The first feature will show benefits in the short-to-medium term. The second feature will not really be beneficial to most people for quite some time.

Modern CPUs run much faster internally than the memory systems that support them (L1, L2, RAM, HD, etc.). Right now, 32-bit x86 code has only 8 GP registers with which to do all processing - and incidentally, 2 of these registers are really only used for stack operations. So in fact, assembly programmers are effectively stuck with only 6 GP registers! That means a lot of movement to/from memory, as various registers need to be populated/de-populated to carry out even simple computations and code execution. When a CPU has to read or write a value somewhere other than its internal registers, the CPU must wait and performance drops. It's incredible to think that everything that gets done on today's 32-bit platforms goes through only 6 GP registers. Amazing.

By going from 8 (effectively 6) to 16 registers, the A64 CPUs can do a lot more work without having to stop and talk to - populate/depopulate from - system memory. However, without 64-bit software (a 64-bit OS and 64-bit applications) these extra registers just sit idle and completely unused - this is because there is no means to identify more than 8 GP registers in x86.

It is also true that 64-bit register support has the potential to improve performance on the desktop. This is because certain software operations that formerly required multiple registers can be coded more efficiently to do the same work using fewer registers. But in order for this kind of benefit to materialize, we will have to wait for new 64-bit applications to be written and compiled using optimized 64-bit compilers. This is a slow process at best, and the lag between 64-bit support and native 64-bit software will be substantial. I have seen no realistic and legitimate estimate of the potential gains in DESKTOP software from a 64-bit development process, but I would be shocked if it exceeded 25% for most of the software that is used today. Most high-end games, for example, are GPU limited anyway, so improvement in CPU performance will have less than revolutionary effects on gameplay. And do we really care if MS-Word runs 100% faster anyway? I would imagine that the kinds of desktop/workstation applications that stand to show the most gains from access to 64-bit registers are media encoding, CAD, and 3D design. But again, all these applications will need to be completely re-written to take advantage of the new 64-bit registers.

Having said all this, I am NOT a software engineer. I would really like to hear from people who are writing 64-bit code for these chips. To them I would ask:

- are you more excited about having more registers to play with, or simply the fact that the new registers are 64-bit?

Important Points

- Current performance of A64 CPUs has nothing to do with their 64-bit capability

- 64-bit registers primarily benefit only those applications that require either great precision or manipulation of huge numbers (i.e. huge memory addresses).

- It is the addition of more GP registers - NOT their 64-bit nature - that accounts for the majority of the performance improvement of x86-64 CPUs over x86 CPUs, especially in the near term.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
actually, i'd like to hear from someone who works at amd as to why only 8 additional gprs were added. perhaps it was felt that adding an additional 8 xmm registers as well was sufficient?
 

White Widow

Senior member
Jan 27, 2000
773
0
71
Originally posted by: jhu
...why only 8 additional gprs were added

Because AMD was trying to keep the opcode extension reasonable. x86 instructions already use 3 bits to specify which of the 8 registers to use for source, destination, base, and/or index. To go to 32 registers would mean you would need 2 more bits each, or a total of 6 more bits to specify all the above. This would mean you would need two bytes as an extension to the normal x86 opcode - one byte to indicate that the next byte contains the extension bits. You could do it, but it makes the code much larger.

What AMD did was to redefine 16 opcodes (the one-byte INC reg/DEC reg opcodes) to give themselves four bits: 1 for the operand size override, and three to give 1 extra bit for each of the register specification fields - making each register field 4 bits, or 16 registers.
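A quick illustrative decoder of that repurposed byte (layout 0100WRXB per the AMD docs; the code itself is only a sketch):

#include <stdint.h>
#include <stdio.h>

/* The 16 old one-byte INC/DEC opcodes (0x40-0x4F) became the REX
 * prefix in 64-bit mode, laid out as 0100WRXB:
 *   W - 64-bit operand size override
 *   R - extra high bit for the ModRM 'reg' field
 *   X - extra high bit for the SIB 'index' field
 *   B - extra high bit for the ModRM 'rm' / SIB 'base' field
 * Each extra bit turns a 3-bit register number into a 4-bit one,
 * which is how you get 16 addressable GPRs. */
void decode_rex(uint8_t prefix)
{
    if ((prefix & 0xF0) != 0x40) {
        printf("0x%02X is not a REX prefix\n", prefix);
        return;
    }
    printf("REX.W=%d R=%d X=%d B=%d\n",
           (prefix >> 3) & 1,   /* W */
           (prefix >> 2) & 1,   /* R */
           (prefix >> 1) & 1,   /* X */
           (prefix >> 0) & 1);  /* B */
}

int main(void)
{
    decode_rex(0x48);  /* REX.W - 64-bit operand size          */
    decode_rex(0x44);  /* REX.R - selects r8-r15 as 'reg' field */
    return 0;
}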

Source - here
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Well said, White Widow.

I almost wish AMD didn't call it an Athlon 64 because people take the 64-bit capabilities too seriously. Call it an Athlon XT or Athlon 3 or Athlon 8 or K8.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
oh yeah, i remember reading about that in their application manual several months ago.
 

Giscardo

Senior member
May 31, 2000
724
0
0
I think Nemesis would stop getting in so much trouble if he knew why encoding and encrypting got such a high speed increase in the 64 bit version.

In a 64-bit CPU, the CPU deals with pieces of data in chunks of 64 bits at a time. Magic buzz phrases like "64 bit computing", "64 bit processing", and "welcome to the world of 64 bit" shouldn't be used, because these are just marketing terms that make you think there is a whole new realm being tapped in a 64-bit CPU.

That being said, in applications like zipping (and, I'm not sure, but I think video encoding also), the speed of the application depends on how fast the computer can traverse the large dataset it's dealing with (i.e. the data that you want to compress, the video you want to encode). In a 64-bit CPU, the CPU happens to be able to look at pieces of data twice the size of what a PPro/P2/P3/P4/K2/Athlon can, which allows it to run this fringe type of process faster. This speed benefit "OF TEH WORLD OF 64 BIT COMPUTING" will not be so dramatic in applications where the bottleneck is calculations, like games and professional 3D rendering.

Speed benefits on the A64 may/will still be seen in these other types of apps because of architecture changes, like the added on-die memory controller and the increased register count.

If you ran some code that does geometry calculations over and over on a 32-bit CPU vs a 64-bit CPU, with the rest of the architecture remaining the same (using the same algorithms to do the multiply and divide calcs, etc.), the only benefit of going to 64-bit is that you can do the operations on larger numbers. The speed would be pretty much the same.
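As a rough sketch of what "traversing the dataset in bigger chunks" means (a toy XOR checksum, assuming an aligned buffer whose length is a multiple of 8; real compressors and encoders are far more involved than this):

#include <stdint.h>
#include <stddef.h>

/* Fold a buffer into a checksum 4 bytes at a time - the natural
 * register width of a 32-bit CPU. */
uint32_t xor_fold32(const uint32_t *p, size_t len_bytes)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < len_bytes / 4; i++)
        acc ^= p[i];
    return acc;
}

/* The same pass 8 bytes at a time - half as many loop iterations
 * when 64-bit registers are available. */
uint64_t xor_fold64(const uint64_t *p, size_t len_bytes)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < len_bytes / 8; i++)
        acc ^= p[i];
    return acc;
}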
 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
For a few specific applications, like encryption, the difference is much more than double. This is because encryption keys and the math used in doing the encryption/decryption usually work with 64-bit values, which on a 32-bit CPU needed to be broken up, the overflow dealt with, etc., since 32-bit encryption keys have too few possibilities and are too easy to crack.
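To illustrate (schoolbook arithmetic only, not lifted from any real crypto library): a 64x64-bit multiply on a 32-bit machine decomposes into several 32-bit multiplies plus shifts and adds, while a 64-bit machine does the whole thing in one instruction.

#include <stdint.h>

/* 64 x 64 -> 64-bit multiply built from 32-bit pieces, the way a
 * 32-bit target has to do it (truncated to the low 64 bits).  The
 * (uint64_t) casts stand in for the 32x32->64 widening multiply
 * that 32-bit x86 provides. */
uint64_t mul64_on_32bit(uint32_t a_lo, uint32_t a_hi,
                        uint32_t b_lo, uint32_t b_hi)
{
    uint64_t lo_lo = (uint64_t)a_lo * b_lo;                        /* low partial product   */
    uint64_t cross = (uint64_t)a_lo * b_hi + (uint64_t)a_hi * b_lo; /* cross terms           */
    return lo_lo + (cross << 32);                                  /* a_hi*b_hi falls off the top */
}

/* On a 64-bit target the whole thing is a single multiply. */
uint64_t mul64_native(uint64_t a, uint64_t b)
{
    return a * b;
}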
 

Nemesis2038

Member
May 26, 2004
89
0
0
Thanks Giscardo.

If anyone else has the opportunity to try and recreate what I am doing and see for yourself, this is a basic outline. I cannot give you the software to do this, but I suspect you can get beta copies of the software if you ask kindly. I also cannot reveal the company, but there are really only a handful that even mention this level of video. Most are just playing the DVD mastering route, which is the mainstream. 5 e-mails will get you to the right company.

Basically I am downsampling 1080P/1080i files to 720P in a WMV9 format so some HDTV recordings will fit on a blank DVD. I still back up the 1080P/1080i files. In 720P some will still spill over to a second DVD; however, I hope that dual-layer discs will have good compatibility with the WMV9 DVD players coming out later this year. Bravo D3 and Apex are to produce DVD players that will play WMV9 up to 720P. This is a much sooner path than waiting for the DVD Forum to get Blu-ray out the door. I also like this method because it uses existing DVDs and not an overpriced cartridge unit. The only attractive thing about Blu-ray is its 27-gig capacity, but that's totally not needed when WMV9 players will be out. A dual-layer DVD-9 can easily hold a full-length movie in WMV9 1080P format. I wish they would bump the ApexExtreme gaming machine up to the Athlon 2.4 so it will play 720P DivX movies; otherwise I may wind up with more machinery in the living room and buying 2 similar devices.

If you want to see the level of quality I am playing with, then by all means rent Terminator 2 in the special edition version, where Disc 2 has Terminator 2 in HDTV format, and play it back on your PC compared to the regular DVD 480P version. Of course, to play the 1080P version you will need a monitor capable of resolutions around 1920x1440. And you thought gaming at 1600x1280 was high. I am concentrating on the 720P format since most HDTVs and such can't do 1080P (they will do 1080i) and the 720P DVD players will be here soon. 480P is beautiful, but 720P adds a level of depth that makes images 3D-like. You get motion sickness easily at this level.

In all fairness, my Opteron 64 machine has a triple 36-gig 15K SCSI drive config compared to the triple 10K drives in the P4 machine. RAID 0 for speed. I am sure this adds a level of performance, but I don't believe it reaches full speed because the drives can easily be seen blinking rather than lit solid. Haven't fired up perfmon to see. Once it's done, it's burned to DVD media. Ritek G04 preferably.

I think I took a lot of heat because everyone out there is playing with DivX and DVD formats, not HDTV formats. This is major on the CPU stress scale, since playing back 1080P requires 3.0GHz, but I would say TS files need at least 3.2GHz. I believe the HDTV 32-bit codecs need some more optimizing. Can't wait for hardware to take the load off the CPU. I am eager to get my hands on the HDTV card from ATI to see what can be done with it.
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
Originally posted by: glugglug
For a few specific applications, like encryption, the difference is much more than double. This is because encryption keys and the math used in doing the encryption/decryption usually work with 64-bit values, which on a 32-bit CPU needed to be broken up, the overflow dealt with, etc., since 32-bit encryption keys have too few possibilities and are too easy to crack.

I thought most encryption now was 128-bit, since DC encryption-cracking projects are only up to 112 bits or something like that.

I know my wireless encryption is 128-bit, and I use 256-bit encryption when I create encrypted zip files... what's WinXP's NTFS encryption? I assume it's 128-bit, but I've been wrong before... more times than I care to admit
 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
NTFS encryption is 56-bit (so using 64-bit values you just leave the upper byte 0).

Most https sites are 40-bit by default, not 128-bit. The reason for this is that the tougher encryption schemes (including 128-bit SSL) are considered "munitions" and not legal to export from the U.S. It carries the same penalties as weapons trafficking. Your tax dollars at work.
 

Acanthus

Lifer
Aug 28, 2001
19,915
2
76
ostif.org
I would love to see some screenshot benches nemesis.

I have seen NOWHERE that XP-64 and 64-bit apps have brought performance gains over 40%, and that 40% was in fringe applications no one uses.
 

White Widow

Senior member
Jan 27, 2000
773
0
71
Given the performance benefits of running native 64-bit code, I would like to know how easy it is to port something to 64 bits. A simple Google search tells me there are many 64-bit compilers now available, and the similarity between x86 and x86-64 would seem to make code modifications relatively straightforward. Is this transition to 64-bit software going to be slow and painful, or roll like a snowball downhill?
 

Kermit

Member
Nov 29, 1999
115
0
0
Originally posted by: Giscardo
In a 64-bit CPU, the CPU deals with pieces of data in chunks of 64 bits at a time. Magic buzz phrases like "64 bit computing", "64 bit processing", and "welcome to the world of 64 bit" shouldn't be used, because these are just marketing terms that make you think there is a whole new realm being tapped in a 64-bit CPU.

That being said, in applications like zipping (and, I'm not sure, but I think video encoding also), the speed of the application depends on how fast the computer can traverse the large dataset it's dealing with (i.e. the data that you want to compress, the video you want to encode). In a 64-bit CPU, the CPU happens to be able to look at pieces of data twice the size of what a PPro/P2/P3/P4/K2/Athlon can, which allows it to run this fringe type of process faster. This speed benefit "OF TEH WORLD OF 64 BIT COMPUTING" will not be so dramatic in applications where the bottleneck is calculations, like games and professional 3D rendering.

Forgive me if I'm wrong, but as far as I know, SSE2 allows two 64-bit float values to be processed at once, so using SSE2 there should not be a 32-bit to 64-bit advantage.
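For what it's worth, here's a minimal SSE2 sketch of that: one packed instruction (addpd) adds two 64-bit doubles side by side, and it works the same whether the host CPU is a 32-bit P4 or a 64-bit A64.

#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdio.h>

int main(void)
{
    double a[2] = {1.5, 2.5};
    double b[2] = {10.0, 20.0};
    double r[2];

    /* Load two 64-bit doubles into each 128-bit XMM register,
     * add them pairwise with a single SSE2 instruction (addpd),
     * then store both results back. */
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    __m128d vr = _mm_add_pd(va, vb);
    _mm_storeu_pd(r, vr);

    printf("%f %f\n", r[0], r[1]);  /* 11.500000 22.500000 */
    return 0;
}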
 

Vee

Senior member
Jun 18, 2004
689
0
0
Originally posted by: White Widow
The 64-bit addressing in the current A64 chips is almost useless. Even if we had WindowsXP-64 right now, the only software that would see substantial gains solely from 64-bit wide registers would be database applications (which address huge amounts of memory),...
The wider virtual address range of 64-bit computing is the really big, central issue. At least 99.99% of the reason for 64-bit integer/GPR registers is handling pointers.
And desktop computing today is ready to move to 64-bit addressing. It's not just database apps. A number of applications need more space today. I assume you know enough to realize that PAE schemes a la Oracle are not the solution to that need. The need is not for more physical RAM; the need is for a larger virtual space.
Aside from that, the wider address range is also going to open up new applications not possible today, new uses for existing applications, and new ways of implementing existing applications.
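As a hypothetical sketch of what a larger virtual space buys you (assuming a POSIX system and an imaginary file called huge_dataset.bin): mapping more than 4GB of a file into one contiguous pointer range, something a 32-bit process simply cannot express.

#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const size_t six_gb = (size_t)6 * 1024 * 1024 * 1024;
    int fd = open("huge_dataset.bin", O_RDONLY);   /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    /* One contiguous 6 GB window of virtual addresses.  A 32-bit
     * process cannot even represent this length; a 64-bit process
     * can, whether or not that much physical RAM exists, because
     * pages are faulted in on demand. */
    unsigned char *base = mmap(NULL, six_gb, PROT_READ, MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* Any byte in the 6 GB is now plain pointer arithmetic. */
    printf("byte at offset 5 GB: %d\n", base[(size_t)5 * 1024 * 1024 * 1024]);

    munmap(base, six_gb);
    close(fd);
    return 0;
}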

I've never gotten the feeling that people in general realized what really changed with the 16-to-32-bit migration. It will be the same this time. Intuitively, people want to understand this as a width issue.
32-bit computing sounds as if it is twice as fast as 16-bit computing. 64-bit computing sounds as if it is twice as fast as 32-bit.
I mostly agree with your arguments regarding the width/performance issue.
Computing widths stay the same in the 64-bit environment: chars 8 and 16 bits, floats 32 and 64, integers still defaulting to 32. I have only two things to add:

Achieving performance gains from increasing the width of computation does not require a new (64-bit) computing platform. Traditionally, that goal has been achieved with instruction/register extensions:
'387 FPU, MMX, 3DNow, SSE, 3DNow+, SSE2. For example, if encryption needed a boost, add instructions handling 128-bit integers in the already existing 128-bit SSE2 registers.

Performance is a secondary issue. But there are four reasons to expect a performance improvement from '86-64. I agree again that the most important of those is the increased number of registers. That will require new compiler optimizations to exploit fully. But even if the general performance gain were only 20-40%, that is surely nothing to sneer at!
 

glugglug

Diamond Member
Jun 9, 2002
5,340
1
81
Something else I just thought of... the Java virtual machine is 64-bit, so the JVM will likely run a lot faster when the host CPU is 64-bit as well.
Not that applications which need this kind of performance are usually written in Java anyway... but it's still another place where it will make a big difference.
 

Vee

Senior member
Jun 18, 2004
689
0
0
Here are some early 64-bit applications showing 47% - 57% performance improvement.

http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~87018,00.html

But as I've already stressed several times in this thread, performance is not the really BIG thing about 64-bit computing, and this article notes that too. In just a few years, we will routinely be doing things on our PCs that are not possible with 32-bit addressing. I deeply resent both the fact that Intel is so late, and that they and the compliant media have excreted so much disinformation, claiming 32-bit will be viable until 2007 or something.
 

eastvillager

Senior member
Mar 27, 2003
519
0
0
32bit vs 64bit isn't a decision point for me, at this time. The only 64-bit code I regularly run, I run on ultrasparc anyways.

By the time I'm using mostly 64-bit code at home, I'll be using 1 or more multi-core processors that aren't even available today.

The people who need 64-bit on the desktop today are pretty much only developers who have to compile on their desktop box, imho.


edit: I did notice somebody posted that you shouldn't count out a processor just because it is 64-bit and you don't need 64-bit. That is a very valid point for the Athlon64. I've got a beater notebook with an A64 in it, and it handles 32-bit code just fine.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,581
24,473
146
Originally posted by: eastvillager
32bit vs 64bit isn't a decision point for me, at this time. The only 64-bit code I regularly run, I run on ultrasparc anyways.

By the time I'm using mostly 64-bit code at home, I'll be using 1 or more multi-core processors that aren't even available today.

The people who need 64-bit on the desktop today are pretty much only developers who have to compile on their desktop box, imho.
I think you are right that there's no need for most of us to have 64-bit capabilities in our desktops right now, and that by the time it's useful/practical the dual-core chips will be out. However, I didn't buy an A64 for its 64-bitness, I did it for its performance in 32-bit. I too will have a dual-core beauty when the time comes, but till then I will enjoy everything the A64 has to offer now. I also like the idea that when this chip gets pushed to #2 in my LAN's hierarchy, I'll then have multiple systems capable of 64-bit computing. Perhaps 64-bit will even extend the useful life of my present CPU, so I can't see the downside to having bought a skt754 A64 setup over 6 months ago.
 

eastvillager

Senior member
Mar 27, 2003
519
0
0
ok, that's a good distinction. however, as i said before, practically all modern cpus today can handle all three of those. just look at www.top500.org. most of those supercomputers contain p4s, xeons, opterons, itaniums, ppcs, etc (i think there's even an athlon in there somewhere).

Most of those top500 supercomputers aren't one computer; they are large, distributed compute farms made up of relatively small 'servers'. Generally, when you build one of those, you don't care how well the individual CPU is tailored to a given task; you just want to throw as many relatively inexpensive processors at the problem as possible for a given amount of $$$. You need very little backplane IO, as you're usually just computing a small chunk of the dataset on each element of the cluster, and you're not doing any significant multitasking in each of those elements.

They're not indicative of 'enterprise' class computing, either; they're indicative of supercomputing, or high performance computing: high megaflops, very little IO.

When people talk about 'enterprise' computing, at least in the environments I work in (large UNIX servers, mostly Sun), they're talking about very good SMP on an OS with an excellent scheduler, where you get better than 75% scaling from each additional processor you add to the box, while maintaining very high IO across your backplane/centerplane for typical server loads (think Oracle in a transaction processing environment with 500 or more concurrent users). Most of your 'desktop' processors aren't built to play in this arena, especially if you hamstring them with a Microsoft OS.

I used to run one of the sites in the top500, just barely got in with 256 ultrasparc II processors in that farm.
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
29,581
24,473
146
I used to run one of the sites in the top500, just barely got in with 256 ultrasparc II processors in that farm.
:shocked: :beer:
 

mrSHEiK124

Lifer
Mar 6, 2004
11,488
2
0
Just look at it this way (to be simple and to the point): what performs better in TODAY's applications, on an operating system available in stores TODAY - an Athlon XP 3200+ or an Athlon 64 3200+? The 64. Why? Not because of x86-64 (at the moment you aren't using it), but because of a better/improved architecture, on-die memory controller, etc. But the 64 3200+ costs more than the 3200+ XP. If this is one of those futureproof PCs you are trying to build (I've tried, it doesn't work; just buy mid-end parts and upgrade yearly rather than going for the cream of the crop and waiting 2-3 years), then go ahead and use the Athlon 64: it performs very well in 32-bit apps, is catching up to the P4 in encoding/zipping, and when (I SAID WHEN) a 64-bit OS is available with 64-bit apps, then I guess you'll be ready. Just my .02
 

White Widow

Senior member
Jan 27, 2000
773
0
71
Sheik - If we're all idiots, how are you any different? Many people who have posted to this thread express the exact same opinion you so obnoxiously present. While name-calling is never a good way to participate in a discussion, I would suggest you AT LEAST have a reason for doing so if you choose to be so juvenile.
 