Apparently, the AVX2 support of Excavator isn't necessarily a good thing. At least on y-cruncher:
http://www.overclock.net/t/1560230/jagatreview-hands-on-amd-fx-8800p-carrizo/400_100#post_24310470
we can see AVX2 mode being only a hair faster than SSE3. Maybe if AVX2 apps were coded with Excavator in mind, it wouldn't be such a problem, but at that point, one may as well use XOP.
Thank you. Works better now. Took me all night, but here is the dump at a locked frequency of 1.4GHz:
http://pastebin.com/7A7Chkiu
AVX2 as AVX is less flexible to decode and handle than XOP. The latter has the advantage of 3 operand encoding over SSE.
Nice catch. I've come across a lot of posts by The Stilt; he does good work.
Shame to see the AVX/AVX2 scores so poor, but the XOP speed makes up for it I guess.
On another note, the Linux software RAID is said to use AVX2 for large speedups.
A SW RAID with AVX2? OK, maybe for faster memory copying and some data processing. But anything 128 bits wide should be enough, as this would already max out the memory subsystem at some xy GB/s, enough for a bunch of parallel hard drives or SSDs.
Excavator supports 256-bit AVX2 instructions, but it needs to split them up into 2 micro instructions. So any 256-bit AVX2 instruction will look worse or be a wash compared to 128-bit AVX/SSE.
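To put rough numbers on the split described above, here is a minimal back-of-the-envelope sketch. It assumes an idealized core that retires one vector micro-op per cycle and ignores decode, memory and scheduling effects; the function name and the 4096-byte buffer are made up for illustration and are not measured Excavator figures.

```python
import math

def cycles_for_buffer(num_bytes, vector_bits, uops_per_instr, uops_per_cycle=1.0):
    """Idealized cycles to process num_bytes with one vector instruction per chunk."""
    bytes_per_instr = vector_bits // 8
    instructions = math.ceil(num_bytes / bytes_per_instr)
    # Each instruction costs uops_per_instr micro-ops on the (assumed) single pipe.
    return instructions * uops_per_instr / uops_per_cycle

buf = 4096  # hypothetical buffer size in bytes

sse_128 = cycles_for_buffer(buf, 128, uops_per_instr=1)      # 128-bit op, one micro-op
avx2_split = cycles_for_buffer(buf, 256, uops_per_instr=2)   # 256-bit op split in two
avx2_native = cycles_for_buffer(buf, 256, uops_per_instr=1)  # full-width 256-bit unit

print(sse_128, avx2_split, avx2_native)
```

Under these assumptions the split 256-bit case needs exactly as many micro-ops as the 128-bit case, which is the "wash" scenario; only a full-width unit halves the count.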
If you're into coding and instruction latencies on different CPUs interest you, see http://www.agner.org/optimize/. Agner has some nice code, and his data has been used for instruction selection in GCC.
Just picked up a Carrizo; tell me what you want to test.
As an example of different cooling gear for different TDPs, I thought these teardowns of the 13.3" and 15" Apple MacBook Pro Retina (late 2013) models were interesting:
https://www.ifixit.com/Teardown/MacBook+Pro+13-Inch+Retina+Display+Late+2013+Teardown/18695
https://www.ifixit.com/Teardown/MacBook+Pro+15-Inch+Retina+Display+Late+2013+Teardown/18696
The first laptop is the 13.3" model with a 28W Haswell dual core.
The second laptop is the 15" model with a 47W Haswell quad core with Iris Pro graphics.
Copper heatsink size is about the same for both, but the 47W model uses two fans (each blowing air into a very small aluminum finned area). Compare this to the 28W model, which uses only one fan, but notice that its aluminum finned area is much larger.
So overall, I would say the 28W and 47W set-ups have the following amounts of material:
Copper heatsink material: about the same for both the 47W and 28W
Heatpipe: twice as much for the 47W
Fans: twice as many for the 47W
Aluminum finned material (this is the part next to the fan): about the same total weight for both; the 47W has less of it for each fan.
So it appears the amount of metal used for cooling isn't really that different between the two TDPs, but the difference in ventilation is much in favor of the 47W.
Just had a look at the code. They use a lot of AVX2 logical and add operations. With these instructions Skylake has 2x to 4x the throughput of Carrizo.
On boot it benchmarks the various supported types, sse, sse2, avx, avx2, etc. and chooses the fastest.
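For anyone wondering why RAID-6 parity code would be dominated by logical and add operations, here is a scalar sketch of the usual P/Q syndrome math over GF(2^8) with the 0x11d polynomial. It is not the kernel's code, just a byte-at-a-time illustration; the function names are made up, and a vectorized version would apply the same steps to 16 or 32 bytes per instruction.

```python
def gf_mul2(byte):
    """Multiply one byte by 2 in GF(2^8), reduction polynomial 0x11d."""
    shifted = (byte + byte) & 0xFF        # the "add": x + x is a left shift by one
    mask = 0x1D if byte & 0x80 else 0x00  # the "logical" part: reduce if the top bit was set
    return shifted ^ mask

def raid6_pq(data_bytes):
    """Return (P, Q) for one byte position across the data disks, disk 0 first."""
    p = 0
    q = 0
    # Horner's rule: Q = D0 + 2*(D1 + 2*(D2 + ...)), with + meaning XOR.
    for d in reversed(data_bytes):
        p ^= d
        q = gf_mul2(q) ^ d
    return p, q

p, q = raid6_pq([0x01, 0x02, 0x03])
print(hex(p), hex(q))
```

The inner loop is nothing but an add, a compare against the top bit, an AND-style mask and two XORs per byte, so per-instruction throughput on those simple operations decides the result.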
The wiki says "required AVX2, AVX is not sufficient", which is why I mentioned this.
https://git.kernel.org/cgit/linux/k.../?id=2c935842bdb46f5f557426feb4d2bdfdad1aa5f9
That's the AVX2 commit; dunno how it'd fare on Carrizo if we had a desktop version to test.
Output from a recent Intel:
[ 0.165172] raid6: sse2x1 5136 MB/s
[ 0.182199] raid6: sse2x2 6378 MB/s
[ 0.199230] raid6: sse2x4 7460 MB/s
[ 0.216267] raid6: avx2x1 9906 MB/s
[ 0.233299] raid6: avx2x2 11382 MB/s
[ 0.250332] raid6: avx2x4 13363 MB/s
[ 0.250333] raid6: using algorithm avx2x4 (13363 MB/s)
[ 0.250335] raid6: using avx2x2 recovery algorithm
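The "benchmark everything, keep the fastest" step is easy to replay from a log like the one above. This is only a sketch that parses the dmesg text posted here; the regex and the function name are my own, not anything from the kernel.

```python
import re

LOG = """\
[ 0.165172] raid6: sse2x1 5136 MB/s
[ 0.182199] raid6: sse2x2 6378 MB/s
[ 0.199230] raid6: sse2x4 7460 MB/s
[ 0.216267] raid6: avx2x1 9906 MB/s
[ 0.233299] raid6: avx2x2 11382 MB/s
[ 0.250332] raid6: avx2x4 13363 MB/s
[ 0.250333] raid6: using algorithm avx2x4 (13363 MB/s)
"""

# Matches only the benchmark lines: "<name> <number> MB/s" right after "raid6:".
BENCH_LINE = re.compile(r"raid6:\s+(\S+)\s+(\d+)\s+MB/s")

def pick_fastest(log_text):
    """Return (algorithm, MB/s) for the highest benchmark result in the log."""
    results = {}
    for line in log_text.splitlines():
        match = BENCH_LINE.search(line)
        if match:
            results[match.group(1)] = int(match.group(2))
    return max(results.items(), key=lambda item: item[1])

fastest = pick_fastest(LOG)
print(fastest)
```

On this log it lands on the same algorithm the "using algorithm" line reports. Running it against a Carrizo dmesg would show directly whether avx2 or sse2 wins there.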
Sorry for the distraction, just a random detour of the thread I guess.
Dead End Enigmatic carrizo
It seems OEMs, consumers and even AMD don't care about Carrizo. The crazies like me can't sustain it.
I'm starting to think there is deliberate sabotage going on via contracts with Intel.
Occam's Razor. The interesting bit is that there are some more configuration options available to European buyers for these chips than there are for North American folks. It was the same way with mobile Kaveri, more-or-less.
Occam's Razor is that people are stupid. That's usually the correct reason.
Enough already !! us AMDhopeians should apply for a job at AMD for sure now
You look at Intel configurations, and they come by default with one 8GB DIMM, with a 12GB dual channel option, and then you see a Carrizo configuration, and it's not that different.
You speak as though DIMM configurations are the only problems inherent to these Carrizo laptops.
No, Occam's Razor states that:
"Among competing hypotheses, the one with the fewest assumptions should be selected."
We have no direct evidence that anyone involved in selecting Carrizo laptop configurations is fundamentally stupid (and plenty of evidence points to the contrary), so that conclusion requires (potentially erroneous) assumptions on our part. If you look at the above complaints about Carrizo configurations, the logical conclusions are that OEMs are being cheap/lazy or that they're being "encouraged" to deliberately misconfigure Carrizo. The former argument holds less water, since OEMs are raising BoM needlessly with dGPUs, and since the difference between 15W and 35W cooling configurations often amounts to little more than a few fans (read: not a significant increase in BoM).