The decision to do 2+2 rather than 4+0 looks more like a cost thing ("well, crap, we have 60% defect rates on these R7 dice, what are we gonna do?!") than anything technical.
I too have my worries about the scheduler; each core complex almost looks like a NUMA node, kind of like its very own quad-core CPU in the world's tiniest 2P chassis, with all that that implies regarding communication. But going 2+2 is probably still cheaper than an entirely separate 4c mask/die.
Now what I'm wondering about is: Will the R3 series actually be separate masks/dice that really are 1CCX/4C? Because if these are intended to be the mainstream chip that sells like hotcakes, that may actually be a winning value proposition. And related to this, if that turns out to be the case, will we possibly see R3s that outperform the 2+2 R5s in some or all workloads by virtue of never having to talk across CCXs and their separate caches?
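To put a rough number on the cross-CCX worry, here's a back-of-the-envelope sketch. It just counts what fraction of core pairs sit in different CCXs for a given layout. The function name and the "every pair of cores talks equally often" assumption are mine, purely for illustration; it says nothing about actual latencies or how any real scheduler places threads.

```python
from itertools import combinations

def cross_ccx_fraction(layout):
    """Fraction of core pairs that sit in different CCXs.

    layout: enabled cores per CCX, e.g. (2, 2) or (4, 0).
    Assumption (illustrative only): all core pairs communicate equally often.
    """
    # Label each enabled core with the index of the CCX it lives in.
    cores = [ccx for ccx, count in enumerate(layout) for _ in range(count)]
    pairs = list(combinations(cores, 2))
    if not pairs:
        return 0.0
    crossing = sum(1 for a, b in pairs if a != b)
    return crossing / len(pairs)

for layout in [(2, 2), (4, 0), (4, 4)]:
    print(layout, round(cross_ccx_fraction(layout), 3))
```

Under that assumption, 4 of the 6 core pairs in a 2+2 part straddle the CCX boundary, while a 4+0 (or a true 1CCX/4C die) has none. Whether that matters in practice depends entirely on the workload and on how much the cross-CCX hop actually costs.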
Interesting times ahead...