I believe (and correct me if I'm wrong) that in clamshell mode the data signals are separate for each chip and can be routed point to point, but the command/address (CA) signals, which would normally all run to a single chip, need to run to both. The GDDR standard allows for pin swapping specifically so that you can run clamshell with those signals mirrored on both sides of the board, letting you route the CA signals to where the devices are and then split at the via. CA runs at 2xf, which isn't as critical as the data lines, but it would still create impedance issues if you had to tee the signals tens of mm before the chips and run them to two separate chips.
If you can route 8 (or more) full channels on one side of the board, you can more easily route 8 half channels.
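To put a rough number on the tee problem, here's a minimal sketch. It assumes ideal lossless traces and a 50 ohm characteristic impedance, which is my own placeholder and not a value from the GDDR spec. Where one trace splits into two identical branches, the incoming signal sees the two branches in parallel, so the impedance drops to half and part of the signal reflects back toward the GPU.

```python
def tee_load(z_branch: float, n_branches: int = 2) -> float:
    """Impedance seen at a split into n identical branches (parallel combination)."""
    return z_branch / n_branches


def reflection_coefficient(z_load: float, z0: float) -> float:
    """Voltage reflection coefficient for a line of impedance z0 hitting z_load."""
    return (z_load - z0) / (z_load + z0)


Z0 = 50.0  # assumed trace impedance in ohms, for illustration only

z_tee = tee_load(Z0)                       # two 50 ohm branches in parallel
gamma = reflection_coefficient(z_tee, Z0)  # fraction of voltage reflected
reflected_power = gamma ** 2               # fraction of power reflected

print(f"impedance at tee: {z_tee:.1f} ohm")
print(f"reflection coefficient: {gamma:.3f}")
print(f"reflected power fraction: {reflected_power:.3f}")
```

This ignores losses, via parasitics, and termination, so treat it as an order-of-magnitude picture only. As far as I understand it, the mismatch matters much less when the branches after the split are very short compared to the signal's edge length, which is the whole appeal of splitting at the via right under the two mirrored chips.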
IMO, it's just laziness. Instead of completely rerouting the board (a full redesign), they can keep the current routing, or a minimally changed version of it, and just drop the routes through the board at their current locations. This makes sense from a time and cost savings perspective, but it's a negative for VRAM cooling.
I bet if this had been planned from day one, they would have routed all of the memory on the GPU side.