What is with some guys and AVX2??
There doesn't exist a cpu with AVX2 support.
There doesn't exist 1 program that supports AVX2
There doesn't exist compilers that support AVX2
AVX2 brings GPU technology into the CPU cores. It offers the same computing power, without the overhead or limitations. So there's plenty of reason to get excited over AVX2.
And yes, no CPU supports it yet. But neither does any APU today support a unified address space and context switches. That's only planned to be complete by 2014. So AVX2 will get there sooner.
GCC 4.7 supports AVX2, LLVM 3.1 supports AVX2 and Visual Studio 2012 supports AVX2. So compilers are well ahead of schedule too.
Also, AVX2 is a part of the chain that will be supported by Intel and AMD... so why this is used to draw a rift between Intel and AMD is plain stupid, not even mentioning the fact it doesn't exist yet.
Because AMD has yet to announce that they'll support AVX2. It's inevitable that they will, but they'd rather have people use HSA instead. In other words they're betting the farm on other technology. Looking at what can already be achieved with AVX, and all the phenomenal things added by AVX2, that's really going to turn out to be a big mistake on AMD's part.
Just as NVIDIA realized, AMD should back away from compromising graphics performance for the sake of GPGPU. General purpose computing is what the CPU is for, and AVX2 adds a lot more oomph to it. Heterogeneous computing doesn't scale, due to the round-trip latency and bandwidth bottleneck. So the GPU should concentrate on pure graphics only, which is a one-way process.
OpenCL vs AVX2 is also a meaningless discussion... AVX2 is an instruction set
It's not really OpenCL versus AVX2. It's homogeneous versus heterogeneous general purpose throughput computing. OpenCL is just one way to get code auto-vectorized. But AVX2 supports many more programming languages and frameworks. So it's not a question of one or the other. Indeed as you indicate, one is hardware and the other is software. That said, OpenCL may not survive long after homogeneous computing proves to be superior, since it will have to compete against other languages which have fewer restrictions.
AVX2 can be used by any language as-is. All you need is loops with independent iterations to auto-vectorize them. AVX2's gather support is critical in enabling that. And it means developers can use languages they already know and love, instead of trying to shoehorn things into the OpenCL framework and losing performance on heterogeneous architectures.
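To make "loops with independent iterations" concrete, here's a minimal sketch in plain C of the kind of loop I mean. The function and parameter names are made up for illustration. Each iteration reads through an index array and writes only its own output element, so nothing depends on a previous iteration, and the indexed load is exactly the access pattern a gather instruction covers.

```c
#include <stddef.h>

/* Hypothetical example: look up values through an index array and scale them.
 * Iteration i reads table[idx[i]] and writes only out[i], so iterations are
 * independent. `restrict` promises the compiler that the arrays don't overlap. */
void gather_scale(float *restrict out, const float *restrict table,
                  const int *restrict idx, size_t n, float scale)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = table[idx[i]] * scale;   /* indexed load: a gather candidate */
    }
}
```

With GCC, building with something like `-O3 -mavx2` asks for AVX2 code generation. Whether the compiler really turns this particular loop into gather instructions depends on its version and cost model, so treat it as an illustration of the pattern and not a guarantee.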
People who believe AVX2 will outpace GPUs in raw power... are idiots.... AVX2 is an instruction set... It's completely disconnected from the speed of the hardware below it...
Sure, it depends on the underlying hardware whether it's a high performance implementation or not. But that's equally true for GPUs!
Haswell's implementation of AVX2 will have three 256-bit execution units per core. Two of these will be capable of FMA operations, resulting in a single-precision peak performance of roughly 500 GFLOPS for a quad-core. On a performance/area metric that's actually quite close to any GPU. And you don't lose any of the existing CPU qualities like far superior sequential speed, large cache space per thread, branch prediction to prevent stalls, etc.
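For anyone wondering where a number like that comes from, here's a back-of-the-envelope sketch. The 3.9 GHz clock and the 32-bit float element size are my assumptions for the calculation, not announced specs, and the function name is made up.

```c
/* Theoretical peak in GFLOPS:
 *   cores * FMA units per core * vector lanes * 2 flops per FMA * clock in GHz.
 * The clock speed passed in is an assumption, not an announced figure. */
double peak_gflops(int cores, int fma_units_per_core, int vector_bits,
                   int element_bits, double clock_ghz)
{
    int lanes = vector_bits / element_bits;                        /* 256 / 32 = 8 */
    int flops_per_cycle = cores * fma_units_per_core * lanes * 2;  /* FMA = mul + add */
    return flops_per_cycle * clock_ghz;
}
```

Calling `peak_gflops(4, 2, 256, 32, 3.9)` works out to 128 flops per cycle across the chip, or about 499 GFLOPS at the assumed clock.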
Last but not least, AVX2 is not the end of the road. The encoding format supports extending it up to 1024-bit registers. This can be used to lower the power consumption of the CPU's front-end and out-of-order execution, by executing AVX-1024 instructions in four cycles (i.e. same ALU throughput for four times less power consumption in the rest of the pipeline). This would effectively make the CPU behave much more like a GPU in terms of power consumption. So heterogeneous computing won't have any benefits left.
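Here's a rough sketch of the instruction-count side of that argument. AVX-1024 is hypothetical at this point, and the model below simply assumes a wider instruction gets split into passes over a 256-bit execution unit. The function names are made up, and this says nothing about actual power numbers.

```c
/* Instructions the front-end has to issue to process n_elems elements. */
long vector_instructions(long n_elems, int register_bits, int element_bits)
{
    long lanes = register_bits / element_bits;
    return (n_elems + lanes - 1) / lanes;            /* round up */
}

/* Cycles one execution unit of alu_bits width spends on the same work,
 * assuming each instruction is split into alu_bits-wide passes. */
long alu_cycles(long n_elems, int register_bits, int alu_bits, int element_bits)
{
    long passes = (register_bits + alu_bits - 1) / alu_bits;
    return vector_instructions(n_elems, register_bits, element_bits) * passes;
}
```

For 1024 floats, that's 128 instructions with 256-bit registers versus 32 with 1024-bit registers, while the 256-bit execution unit is busy for 128 cycles either way. Same ALU work, a quarter of the instructions going through fetch, decode and scheduling.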