Originally posted by: Cookie Monster
I think G80 is a "unified" shader architecture.
But it's a bit different from the usual definition of a USA (unified shader architecture).
Now, rumours from various sources suggest that the G80 is split into 2 sections:
dedicated pixel shaders and unified VS/GS.
For DX10, we know that a shader can be addressed as a PS/VS/GS, hence giving birth to USA for future DX10 GPUs.
Now, vr-zone said 2 x pixel shader performance and 12 x VS performance.
48 dedicated pixel shaders follows from 2 x 24 = 48. Since the VS/GS part is unified with a total of 96 shaders, the card could act as 48 VS/48 GS if configured to split evenly into two. Also, since it's unified, it can have a total of 96 VS, hence 8 x 12 = 96.
So, basically G80 might have 48 PS, 48 VS and 48 GS. That is 48 dedicated PS plus 96 unified VS/GS units. Note that 48 + 96 = 144, so this split does not add up to a total of 128 shaders.
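The arithmetic behind these counts can be sketched as follows. This is only an illustration of the post's own numbers: the baseline of 24 PS and 8 VS is what the 2 x 24 and 8 x 12 sums imply, and the variable names are made up for the example.

```python
# Baseline unit counts implied by the post's arithmetic (2 x 24 and 8 x 12).
BASE_PS = 24
BASE_VS = 8

# Rumoured performance multipliers attributed to vr-zone.
PS_FACTOR = 2
VS_FACTOR = 12

dedicated_ps = BASE_PS * PS_FACTOR        # dedicated pixel shaders
unified_vs_gs = BASE_VS * VS_FACTOR       # unified VS/GS pool

# One possible configuration: split the unified pool evenly.
vs_if_split = unified_vs_gs // 2
gs_if_split = unified_vs_gs - vs_if_split

# Total physical units under this reading of the rumour.
total_units = dedicated_ps + unified_vs_gs

print(f"PS: {dedicated_ps}, unified VS/GS: {unified_vs_gs}")
print(f"even split: {vs_if_split} VS / {gs_if_split} GS")
print(f"total units: {total_units}")
```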
As for the 384-bit bus, I think this might help some people understand the reason behind it.
Quoting Jawed from B3D:
My earlier idea that the 256-bit bus is for normal work and the "odd" bus is for Constant Buffers, Streamout Buffers and post-GS cache:
http://www.beyond3d.com/forum/showpo...&postcount=693
Maybe it's a 256-bit bus to conventional local RAM plus a 128-bit bus to a pool of memory dedicated to:
constant buffers
post geometry shader cache
stream-out
is making more sense now... In GTX the odd bus is 128 bits wide, total 768MB. In GT the odd bus is 64 bits wide (with half the memory, only 128MB), total 640MB. Also makes the board a bit cheaper to make.
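A quick sanity check of these bus and memory figures, as a sketch only. It assumes each memory chip has a 32-bit interface and holds 64MB; neither number is in the quote, they are simply values that reproduce the 768MB and 640MB totals. The helper name memory_mb is made up for the example.

```python
CHIP_BUS_BITS = 32   # assumption: each memory chip has a 32-bit interface
CHIP_SIZE_MB = 64    # assumption: each memory chip holds 64MB

def memory_mb(bus_bits: int) -> int:
    """Memory attached to a bus of the given width, one chip per 32 bits."""
    chips = bus_bits // CHIP_BUS_BITS
    return chips * CHIP_SIZE_MB

MAIN_BUS_BITS = 256  # the "normal" bus from the quote

for name, odd_bus_bits in (("GTX", 128), ("GT", 64)):
    main_mb = memory_mb(MAIN_BUS_BITS)
    odd_mb = memory_mb(odd_bus_bits)
    print(f"{name}: {MAIN_BUS_BITS + odd_bus_bits}-bit total, "
          f"{main_mb}MB + {odd_mb}MB = {main_mb + odd_mb}MB")
```

Under these assumptions the GTX comes out at 512MB + 256MB and the GT at 512MB + 128MB.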
In many ways, it's arguable that the access patterns to these buffers are quite unlike the normal access patterns for local RAM. Well, that's certainly the case for Streamout and post-GS cache (which are tiled-write, serial read). Constant buffers (and the associated Texture buffers?) are more like textures in one sense, so maybe they'll live in regular memory (which is optimised for tiled-write, tiled-read).
If the two-chip rumour has any foundation then I could see something like:
VS/GS die <-> VB,CB,TB,SO,PGSC odd-RAM -> PS die <-> VRAM
Note that the VS/GS pipes don't generally need access to textures; TBs and CBs serve those functions (as well as VB - vertex buffers for input data). The CPU would send vertices directly to the odd RAM as a serial stream, and it would directly update the CBs in odd RAM too.
The PS die then has read-only access to the odd RAM (need a better name!!!) and all framebuffer operations work against VRAM. Obviously the PS die can access any TBs and CBs in the odd RAM, whilst fetching normal textures from VRAM.
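The speculated two-chip layout in the quote can be written down as a small access table. This is just a restatement of the quote in code, not a description of real hardware; the names ACCESS and can are invented for the example.

```python
# Who can touch which memory pool, per the quoted two-chip speculation.
# "r" = read, "w" = write.
ACCESS = {
    # CPU streams vertices into odd RAM and updates the CBs there.
    "CPU": {"odd_ram": "w"},
    # VS/GS die reads VB/CB/TB and writes stream-out and post-GS cache.
    "VS/GS die": {"odd_ram": "rw"},
    # PS die reads odd RAM only; framebuffer and textures live in VRAM.
    "PS die": {"odd_ram": "r", "vram": "rw"},
}

def can(component: str, memory: str, mode: str) -> bool:
    """Return True if the component has the given access mode to the pool."""
    return mode in ACCESS.get(component, {}).get(memory, "")

print(can("PS die", "odd_ram", "r"))
print(can("PS die", "odd_ram", "w"))
print(can("VS/GS die", "vram", "r"))
```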
I'm also led to believe that G80 will have quite impressive DX10 performance, not to mention DX9 performance.
Edit - finally, 700M is a possibility. That's because Nvidia themselves said their next-gen part was "...over half a billion transistors...", which means >500M. So it's all possible.