Margins do not have to be as good as they are with Ryzen 5 and 7 chips. The market for APUs is MUCH BIGGER than the market for CPUs alone.
4C/8T with 16 CUs and 4 GB of HBM2 in two stacks, in a 95W TDP package, would presumably kill everything in the mainstream and low-end market if the performance is right, even if the APU cost $300.
The die size of a 4C/8T + 16 CU part would actually be around 210 mm2. A CCX is 44 mm2, and the Polaris 11 die is 123 mm2 with 4 memory controllers. With HBM2 you only need 2, and they should be integrated into the IMC, which should handle both RAM and HBM2 (APUs usually have access to RAM as well).
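To make the die-size estimate above easier to check, here is a quick sketch of the arithmetic using only the figures quoted in the post. The post does not say what fills the gap between the two named blocks and the 210 mm2 total, so treating the remainder as "everything else" (memory controller, I/O and so on) is my assumption, and the variable names are just for illustration.

```python
# Figures quoted in the post, in mm^2.
ccx_area = 44              # one 4C/8T CCX
polaris11_area = 123       # full Polaris 11 die (16 CUs, 4 memory controllers)
estimated_apu_area = 210   # the poster's rough total

# Area accounted for by the two named blocks.
known_blocks = ccx_area + polaris11_area

# Whatever is left over; the post does not break this down (assumption:
# it covers the IMC, I/O and the rest of the uncore).
remainder = estimated_apu_area - known_blocks

print(f"CCX + Polaris 11: {known_blocks} mm^2")
print(f"Left for everything else: {remainder} mm^2")
```

Since the 123 mm2 figure already includes 4 memory controllers and the HBM2 design would need only 2, the GPU portion could plausibly come in a bit smaller, but the post gives no number for that.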
I'm very dubious of HBM being added right now. Look at Intel and Iris Pro - the extra cost of the L4$ didn't turn out to be worth it to most buyers.
Even if the resulting 16 CU Vega GPU were, clock for clock, around 40% faster than a 1024-core GCN Polaris 11 chip?
We are talking about a situation in which, in games at least, that Vega APU lands between the GTX 1050 Ti and the RX 470.
There is a very good reason for this whole idea of HBM2.
Let's look at it this way: a 4C/8T + 16 CU APU with 4 GB of HBM2, in a 95W TDP, at a cost of $299.
Say the CPU cores clock at 3.4/3.8 GHz and the GPU at 1250 MHz, and the GPU architecture is, for example, 40% faster than a 1024-core GCN Polaris GPU (the 1024-core Sapphire RX 460). The idea of APUs makes sense when they bring significant benefits. Here you end up with $199 worth of CPU and $150 worth of GPU, in 50% of the thermal envelope, for a slightly lower cost.
And you can add another level of graphical performance to this idea by adding a discrete GPU from the same architecture.
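For anyone who wants the cost argument spelled out, here is a rough sketch comparing the hypothetical $299 APU with buying $199 worth of CPU and $150 worth of GPU separately. All three prices are the example numbers from the post, not real list prices, and the variable names are just for illustration.

```python
# Example prices from the post, in USD (hypothetical, not real list prices).
cpu_price = 199
gpu_price = 150
apu_price = 299

# Cost of buying the two parts separately.
combined = cpu_price + gpu_price

# How much cheaper the APU would be, in dollars and as a share of the combo.
saving = combined - apu_price
saving_pct = round(100 * saving / combined, 1)

print(f"CPU + discrete GPU: ${combined}")
print(f"APU saves ${saving} ({saving_pct}%)")
```

So under these example numbers the "slightly lower cost" works out to about $50 less than the separate parts.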