What decoder are you using? The typical 720p recode is not particularly demanding. However, while CyberLink is the best AVC decoder, its handling of x264 encodes appears borked, so that DxVA actually degrades performance. Worse, it just flat out does not provide smooth playback regardless of CPU load. So try disabling DxVA, or else (especially for testing purposes) try a freebie software-only decoder such as ffmpeg/libavcodec, which is included with ffdshow.
You may also try Nero/Ateme DxVA if you've got a Burning ROM license, but I haven't bothered with it much (not tried the latest 'n' greatest) since it's restricted to its own player. I don't believe it is on par with CyberLink anyway, but it is possible that the x264 problem is limited to CyberLink. I would certainly avoid WinDVD though, the ugly step-sister of DxVA.
As for X1600Pro acceleration specifically, ATI only claimed acceleration for 720, so 1080 is a bit much to expect if the CPU is borderline. I think an X1650XT is really required for that, if not an X1950 variant (or ye olde X1800), since ATI's acceleration relies on shaders.
On the Nvidia side, by contrast, DxVA performance is essentially uniform from 7300GT up to 8800. PV2 of the 8600 should help, but for now its performance in XP is apparently equivalent to that of the previous PV1 at best. That isn't a bad thing, but it really requires Vista for maximum acceleration (until the promised XP drivers are delivered).
So, a diff'rent GPU should help with standard AVC, and it's just a question of bang for the ruble and required 3D performance. But x264 remains dodgy for the mo' AFAIK. Also, don't expect any VC-1 acceleration from PV2 or Avivo. If that is required, then either go with PV1 or look into an ATI X2000 series card sporting UVD.
While I'm not familiar with the A64 2.0, a P4 3.0 with SSE3 is overkill for 1080p 20Mbps even before being overclocked to 3.6. An SSE2-only P4 is 10% weaker (Northwood versus Prescott). Either can certainly manage 720 sans DxVA, though, and only requires it for 1080.