Only 208 of the 256 bits are active when it comes to feeding the SMMs. The other 48 bits are inactive compute-wise and are used for the data in the 768MB pool. Do the math starting from 4096MB: you'll have 3328MB addressed by the first 208 bits and 768MB addressed by the remaining 48 bits.
The discrepancy comes from the fact that the 768MB addressed by the remaining 48 bits can't be processed by the functional SMMs; there are no crossbars to send the data into the functional SMMs' caches.
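For anyone who wants to check the math, here is a quick sketch of that split. It assumes, as the post does, that capacity divides evenly across the bus bits; the function and variable names are just made up for illustration.

```python
# Figures from the post above. The even per-bit split is the poster's
# assumption, not a confirmed description of the hardware.
TOTAL_MB = 4096
BUS_BITS = 256
ACTIVE_BITS = 208


def split_pool(total_mb, bus_bits, active_bits):
    """Split total memory in proportion to active and inactive bus bits."""
    per_bit_mb = total_mb // bus_bits          # 4096 / 256 = 16 MB per bit
    fast_mb = per_bit_mb * active_bits         # pool behind the active bits
    slow_mb = per_bit_mb * (bus_bits - active_bits)  # pool behind the rest
    return fast_mb, slow_mb


fast_mb, slow_mb = split_pool(TOTAL_MB, BUS_BITS, ACTIVE_BITS)
print(fast_mb, slow_mb)  # 3328 768
```

The two pools add back up to 4096MB, which is why the card can still be counted as a 4GB card on paper.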
Not sure. All they have to do is use the 768MB for anything, even something close to useless, and technically it will be a 4GB card. The potential weak point is the advertised 224GB/s bandwidth, which can be shown to be inaccurate. The 256-bit bus claim can't be attacked, since the bus effectively is 256-bit; it's just that 48 bits are almost useless for anything other than said marketing.
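To put a rough number on the bandwidth point: if you assume bandwidth scales linearly with the number of active bus bits (a simplification, and the 208-bit figure is the earlier poster's claim), the math would look something like this. The function name is hypothetical.

```python
# Advertised figure and bus widths as quoted in the thread.
ADVERTISED_GBPS = 224
BUS_BITS = 256
ACTIVE_BITS = 208


def scaled_bandwidth(advertised_gbps, bus_bits, active_bits):
    """Scale advertised bandwidth by the fraction of bus bits in use.

    Assumes a purely linear relationship, which is a simplification.
    """
    return advertised_gbps * active_bits / bus_bits


print(scaled_bandwidth(ADVERTISED_GBPS, BUS_BITS, ACTIVE_BITS))  # 182.0
```

Under those assumptions the pool behind the active bits would top out around 182GB/s rather than 224GB/s, which is the gap a benchmark would need to show.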
I think you are absolutely on the right path. But I think there are crossbars to swap the data on the GPU, from what I experience and what I gather.
I think physically swapping the RAM is inconvenient enough by itself, and this is why they set the card up to use up the 3.5GB completely beforehand.
See, I can get my GPU over 3300-3500MB and it's not a slideshow. If the data had to go all the way back down the PCIe, Nvidia would never allow that RAM to be occupied. It's not reasonable and not what I expect is happening.