Ok, I can understand how this is smart from a business perspective. It still seems kind of dishonest to the customer, though. They use the customer's lack of knowledge of the product in order to get away with offering the same product for around $50 more with the only difference being some foundational software.
1) There's often more than just software changes.
2) There are often reasons for the "software" changes.
Let's take the example of the HD6850 and HD6870.
The 6850 has part of the GPU disabled (960 of the 1120 shaders are active), and is cheaper.
The HD6850 PCB is also different: it has less power coming in (one less PCIe plug) because the card uses less power, thanks to lower clocks and fewer functional units. This also means that the PCB itself can potentially be cheaper due to lower-spec components. Don't need so much power? Don't need so much power circuitry. Don't have the RAM clocked as high? Use some cheaper-specced RAM.
Now, those are the knock-on effects of having a cut-down GPU core: lower power and cheaper "other" components. So the card can be cheaper not only because AMD sells the GPU to the card maker for less, but also because the whole card simply costs less to make, as the other bits are cheaper.
Now, why is AMD selling a GPU for less and disabling some bits?
2 reasons:
1) To have a cheaper product without needing an entirely new design. Some people might not want to spend $200 and might only want to spend $160, but AMD doesn't want to have to make two GPUs to cover those price points, so they make a $200 GPU and can turn it into a $160 GPU.
2) The GPUs don't always work at $200 spec, so AMD has two choices: throw them away, or change the specs so the chips fit them. Instead of 850MHz, we'll say it has to work at 775MHz (or whatever). More GPUs are then sellable and don't need to be thrown in the bin. Then there are chip defects where part of a GPU might not work, or not work right, so they disable that part either physically (laser cut) or through "software" (a different BIOS).
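To make the second reason concrete, here is a rough sketch of how a tested die could be sorted into one of two products. It reuses the $200/$160 and 850MHz/775MHz figures from the example above; the cluster counts, the `Die` class and the `bin_die` function are made up for illustration and are not AMD's actual process.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    max_stable_mhz: int    # highest clock the die passed testing at
    working_clusters: int  # shader clusters that passed testing


# Hypothetical SKU requirements, most expensive first:
# (name, price in $, minimum clock in MHz, minimum working clusters).
# The cluster counts are assumptions for illustration only.
SKUS = [
    ("full", 200, 850, 14),
    ("cut-down", 160, 775, 12),
]


def bin_die(die: Die) -> Optional[str]:
    """Return the best SKU this die qualifies for, or None if it is scrap."""
    for name, _price, min_mhz, min_clusters in SKUS:
        if die.max_stable_mhz >= min_mhz and die.working_clusters >= min_clusters:
            return name
    return None


if __name__ == "__main__":
    for die in (Die(900, 14), Die(800, 14), Die(900, 12), Die(700, 14)):
        print(die, "->", bin_die(die))
```

A die that misses the top spec on either clocks or working clusters drops to the cheaper product instead of being scrapped, and only a die that misses both specs is thrown away.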
Cards can only be flashed to a higher-spec model when the PCB and GPU are exactly the same, as is the case with early HD6950 and HD6970 cards. That's because AMD couldn't be bothered to release two designs, so they gave out a single reference design which works for both. Most of the GPU cores presumably work at 6970 specs, since the success rate for flashing 6950s is fairly high, but that doesn't mean they ALL do, and it doesn't mean they always work at the normal voltage. That's often why they weren't certified as HD6970 GPUs in the first place.
Sometimes they will just downrate them in order to get the right number of GPUs for each segment, but that's reflected in the pricing. The GPU core may be cut down despite working fine at normal specs and speeds, but the price is also cut down.
If they decided to release a single SKU at a reduced price relative to the top end one (e.g. instead of the HD6970 and HD6950, they released the HD6970 at midway between the two), AMD and their partners would lose out in the long run.
AMD wouldn't have a product with which to "use up" the GPUs that fail to hit HD6970 specs, and partners wouldn't be able to eventually come up with a cheaper PCB design for the HD6950 chips to increase their profit margins.
The fact that the early run HD6950s are the same as HD6970s is more a matter of time and efficiency in the early production cycle of the card than them doing anything bad. The fact that they happen to mostly work at full spec is luck of the draw.
Another example is the HD5800 series.
The HD5870 and HD5850 launched first, and were on sale for a long time. Then AMD released the HD5830 with the same GPU core, but further cut down.
This GPU was basically "the crap bits" which had accumulated over time that couldn't make the cut for the 5870 or 5850, so they waited until they had a decent stock and made a new product around it.
This product actually uses more power than the HD5850 despite having fewer functional units and being slower, because they increased the clock speeds and voltages. This shows how different problems can exist and how chips can be binned in different ways. The HD5830 GPU cores have fewer shaders enabled than the HD5850 because their main problem is defective shaders. The shaders that do work can run at decent clock speeds given enough voltage, so the chips get run at a higher clock speed and harvested that way.
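The same idea can be sketched with a third bin that trades shaders for clocks and voltage. The shader counts and core clocks below are the reference specs of the three cards; the voltages, the test data, and the `Die`/`harvest` names are invented for illustration, so treat this as a toy model rather than how AMD actually sorts chips.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Die:
    working_shaders: int
    # Highest stable clock (MHz) measured at each test voltage (mV).
    max_mhz_at: Dict[int, int]


# (name, shaders required, core clock in MHz, voltage in mV).
# Voltages are hypothetical; they only illustrate "more volts for the 5830".
SKUS = [
    ("HD5870", 1600, 850, 1150),
    ("HD5850", 1440, 725, 1100),
    ("HD5830", 1120, 800, 1200),
]


def harvest(die: Die) -> Optional[str]:
    """Return the first SKU whose shader count and clock this die can meet."""
    for name, shaders, mhz, millivolts in SKUS:
        clock_ok = die.max_mhz_at.get(millivolts, 0) >= mhz
        if die.working_shaders >= shaders and clock_ok:
            return name
    return None


if __name__ == "__main__":
    # Too many dead shaders for a 5850, but clocks well with extra voltage.
    leftover = Die(1200, {1100: 760, 1150: 790, 1200: 820})
    print(harvest(leftover))
```

The leftover die in the usage example fails the HD5850 bin on shader count alone, yet still qualifies for the HD5830 bin because it reaches 800MHz at the higher voltage.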
Basically, it's all done to maximise the number of usable GPU dies that get made.