This is fundamentally wrong: in the two cases we have, the CPU operates at a different average frequency and thus uses a different amount of energy. With SST enabled the CPU ramps up to max clocks, while in the other case it may not even reach max clocks before the task completes. Since higher operating frequencies draw disproportionately more power (dynamic power scales roughly with the cube of frequency, because voltage rises along with clock speed), the SST-enabled CPU might burn considerably more energy getting the work done. However, this difference is offset by the savings once the CPU's job is done and it enters sleep mode. So yes, I can compare SST to a garage door. How do you think your car arrived first, if not by accelerating and braking harder, thus using more energy to get from point A to point B?
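A back-of-envelope model makes the tradeoff concrete (all numbers here are my own illustrative guesses, not Intel's or AnandTech's): core dynamic power grows roughly with the cube of frequency, so racing to max clocks burns more core energy per unit of work, but finishing early lets the whole platform drop into a low-power sleep state sooner. Whether that nets out as a saving depends on how much the rest of the platform draws while the task keeps it awake.

```python
# Toy race-to-idle energy model. Assumptions (all hypothetical):
#   - work is frequency-bound: time = work / freq
#   - core dynamic power ~ k * freq^3 (voltage scales with frequency)
#   - base_w: platform power (uncore, screen, etc.) while the task runs
#   - idle_w: deep-sleep power once the task is finished
def task_energy_j(freq_ghz, work_gcycles, window_s, k=1.0, base_w=3.0, idle_w=0.2):
    """Energy in joules over a fixed window: compute, then sleep out the rest."""
    active_s = work_gcycles / freq_ghz           # seconds spent computing
    active_w = k * freq_ghz ** 3 + base_w        # watts while computing
    sleep_s = max(window_s - active_s, 0.0)      # remainder spent asleep
    return active_s * active_w + sleep_s * idle_w

# Same 2.0 G-cycles of work inside a 2-second window, low platform overhead:
slow = task_energy_j(1.0, 2.0, 2.0)   # 2 s at 1 GHz -> 8.0 J
fast = task_energy_j(2.0, 2.0, 2.0)   # 1 s at 2 GHz, then sleep -> 11.2 J

# Same race with a power-hungry platform kept awake by the task:
slow_hi = task_energy_j(1.0, 2.0, 2.0, base_w=10.0)   # 22.0 J
fast_hi = task_energy_j(2.0, 2.0, 2.0, base_w=10.0)   # 18.2 J
print(slow, fast, slow_hi, fast_hi)
```

With a frugal platform the fast run costs more total energy; with a power-hungry one, racing to idle wins outright. Tuning the perf/energy ratio, or swapping in a CPU with lower top clocks, moves the break-even point.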
The fact that SST is tuned primarily for responsiveness, with power savings as a secondary goal, does not mean it cannot offer both. Tune the OS/SST interaction toward a different performance/energy ratio and the data changes. In fact, even using a different CPU such as an i3 or i5 might skew the performance/energy gains: the lower top frequencies limit the energy burn, while the initial jump from low frequencies to base frequency still provides a tangible performance benefit. The Core i7-6600U may have been the perfect vehicle to showcase the performance gains SST has to offer, but not necessarily the one with the best power savings as well.
Intel claimed slight power savings. The AnandTech test fails to measure that, being tailored to consumer workloads (which it should be) rather than to isolating CPU power usage. I don't see how this data invalidates Intel's claim, beyond putting a bigger emphasis on "slight".