- Mar 11, 2006
Out of curiosity, has anyone thought about why Conroe is THAT much faster than Dothan/Yonah, and likewise than Rev. E of the A64?
My thoughts are that the 4MB of L2, while shared between both cores, does allow single-threaded applications (e.g. games) to take advantage of the whole cache. Looking at how cache size (accounting for diminishing returns) improves performance for, say, a Venice vs. a San Diego, I would think going from 2MB to 4MB would account for at least 5-10% of the performance difference between A64 and Conroe.
I'm not sure how macro-op fusion would really help, but I guess in the end it lets programs take advantage of otherwise idle execution units, boosting overall performance. I'd say it could account for anywhere from 3-5% of the performance boost.
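To put the two guesses above side by side, here is a quick back-of-the-envelope sketch. It assumes the figures are read as straight percentage speedups and that the two effects are independent and compound multiplicatively; both are my assumptions, not measurements, and `combined_speedup` is just a made-up helper name.

```python
def combined_speedup(gains):
    """Compound independent fractional gains multiplicatively.

    Assumption: each gain is a fractional speedup (0.05 == 5%) and the
    effects do not overlap. Real CPU features rarely combine this cleanly.
    """
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total - 1.0


# Rough guesses from this post (not benchmark results):
cache_gain = (0.05, 0.10)   # larger shared L2: 5-10%
fusion_gain = (0.03, 0.05)  # macro-op fusion: 3-5%

low = combined_speedup([cache_gain[0], fusion_gain[0]])
high = combined_speedup([cache_gain[1], fusion_gain[1]])

print(f"combined estimate: {low:.1%} to {high:.1%}")
```

Even at the high end this only covers part of the gap, which is why I think SSE throughput has to be a big piece of it.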
Lastly, the boost in SSE throughput will probably be pretty huge in applications that use it. I bet if Anand turned off SSE support in games we could see some really interesting numbers (e.g. compare Quake 4 on A64 with SSE off vs. Conroe with SSE off).
This is just my take on where the performance advantage of Conroe actually comes from. The end result is that AMD can gain some ground back with a 4MB L3, implementing a shared cache (which I doubt will be done in the K8 generation, as the additional control logic would be too difficult to add). Once/if AMD fixes DDR2 on AM2 it should be even closer, but I definitely think Intel will hold the lead until AMD comes out with a true next-generation CPU (shared cache, larger cache (2MB per core at least), branch prediction improved to Intel levels, and better SSE/SSE2/SSE3 performance).