I love this in-depth analysis of SMT! While I've seen it casually mentioned, it really needs to be emphasized a bit more: securing SMT requires a lot of potentially performance-killing implementation work. The effort that must be put in to maintain memory isolation between the threads sharing a core is significant. If both threads are accessing similar regions of memory, then there will be a non-trivial amount of locking, deconfliction, and generally intensive memory management work done.
None of that is easy or quick to implement, and all of it requires a lot of extra data pathways and hardware bounds checking that can introduce latency where you least want it. It also costs extra power. While it doesn't account for ALL of the power difference, we do see that AMD, which up to now has had the more robust SMT implementation with respect to thread data security, does spend more power on SMT, and at least some of that goes to this effort.
Another thing that is encountered in the general computing world that may not be a thing for Sarah's world is memory controller contention. All those extra threads are going to generate their own share of memory requests. Supporting lots of threads requires a more robust memory controller and beefier memory throughput. It affects every level of processor memory, from the L1 cache layer all the way down.
Those things don't simply disappear when SMT is disabled in the BIOS. They might not be active, and they may consume less power, but they don't cease to exist. No matter what, if the circuits are present and you have SMT off, a portion of them becomes wasted silicon. If you started from the block diagram of the core and built it for maximum ST performance instead of dedicating some circuits to SMT and its security needs, you'd have many places to see gains in reduced latency, simplified logic, or other improvements to throughput and/or efficiency. What I'm getting at is that just toggling SMT on and off doesn't give you the full picture of the TRUE cost of having SMT present. It may be super-critical in some tasks, but in others, it's more of an albatross around the core's neck...
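For anyone who wants to run the on/off comparison themselves, here is a rough Python sketch for confirming which state a Linux box is actually in before benchmarking. It assumes a kernel that exposes the `/sys/devices/system/cpu/smt` directory; the function names are just mine for illustration. Keep in mind it only tells you whether the sibling threads are enabled, which is exactly the limited view I'm describing above: it says nothing about the circuitry that is still sitting in the core.

```python
from pathlib import Path

# Linux-only sysfs location; assumed to exist on reasonably recent kernels.
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def parse_smt_state(active_text, control_text):
    """Turn the raw contents of the `active` and `control` files into a dict."""
    return {
        # `active` holds "1" when sibling threads are online, "0" otherwise.
        "active": active_text.strip() == "1",
        # `control` holds a word such as "on", "off", "forceoff" or "notsupported".
        "control": control_text.strip(),
    }


def read_smt_state(smt_dir=SMT_DIR):
    """Return the SMT state, or None if the files cannot be read."""
    try:
        active = (smt_dir / "active").read_text()
        control = (smt_dir / "control").read_text()
    except OSError:
        # Not Linux, an older kernel, or sysfs is not mounted.
        return None
    return parse_smt_state(active, control)


if __name__ == "__main__":
    print(read_smt_state())
```

Record that state alongside every benchmark run so the SMT-on and SMT-off numbers don't get mixed up.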