The instruction cache line is invalidated when you write to the memory region that the instruction lives in. On modern processors this causes the processor to fault, invalidate/flush the affected instruction cache line, write back the data cache line (the one that contains the instruction), dump all instructions that are currently in flight, and start over from instruction fetch. Any predecode (or similar) bits are tossed as well. And if you do it on a P4, you may as well turn the trace cache off.
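To make the cost concrete, here is a rough toy model of that sequence in Python. It is only a sketch: `ToyCore`, its fields, and the 64-byte line size are all made up for illustration, and it ignores the data cache write-back, the predecode bits, and the trace cache entirely. All it shows is that a store to a line holding fetched code costs you the cached line plus everything in flight, while an ordinary data store costs nothing extra.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # bytes per cache line (an assumption for this toy model)


@dataclass
class ToyCore:
    """Hypothetical, heavily simplified core: not a model of any real CPU."""
    icache: set = field(default_factory=set)       # line numbers cached as code
    in_flight: list = field(default_factory=list)  # fetched, not yet retired
    flushes: int = 0                               # pipeline flushes so far

    def fetch(self, addr):
        # Fetching an instruction pulls its line into the instruction cache
        # and puts the instruction into the pipeline.
        self.icache.add(addr // LINE_SIZE)
        self.in_flight.append(addr)

    def store(self, addr):
        line = addr // LINE_SIZE
        if line not in self.icache:
            return False  # plain data store, nothing special happens
        # The store hit a line that holds code: drop the cached line,
        # throw away all in-flight work, and count the flush.
        self.icache.discard(line)
        self.in_flight.clear()
        self.flushes += 1
        return True


core = ToyCore()
for pc in range(0x1000, 0x1010, 4):  # fetch four instructions from one line
    core.fetch(pc)

core.store(0x2000)  # data store, unrelated line
core.store(0x1008)  # store into the code that was just fetched
print(core.flushes, core.in_flight)  # prints: 1 []
```

In the model the second store wipes out all four fetched instructions, not just the one that was overwritten, which is the part that hurts on real hardware.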
Oh yeah, and I would LOVE to see this code run on a Transmeta CPU. I bet it really makes a mess of the code morphing.
Steve