I'm not sure whether this has been brought up recently in this thread, but Intel has
unveiled its new oneAPI compute stack. The APIs on offer include SYCL and DPC++ (which is SYCL plus Intel's extensions), but as of writing this post Intel's SYCL implementation is not
listed as conformant by Khronos ...
So that officially makes it 3 competing compute API standards! We now have the following:
AMD: HIP kernel language -> hcc (Heterogeneous Compute Compiler) -> GCN ISA
Intel: DPC++ kernel language -> DPC++ Compiler -> SPIR-V binaries
Last but not least ...
Nvidia: CUDA kernel language -> nvcc (Nvidia CUDA Compiler) -> PTX ISA
It seems as though Intel is trying to ride on the success of CUDA as well: just
like AMD, they are providing a source-to-source conversion
tool to migrate from CUDA. SYCL alone is unlikely to gain traction, so it appears that DPC++ is the only practical way to go about supporting Intel hardware ...
Unfortunately, their default compilation model is online (runtime) compilation rather than offline compilation, and using SPIR-V kernels as the intermediate representation (PTX is at least specific to one vendor) only exacerbates this overhead. At least it's single source, and there are extensions to target common CUDA kernels with a compatibility tool, but I still don't see a way to write custom kernels with custom assembly ...
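
To make the online vs. offline point a bit more concrete, here is a toy C++ model of what I mean. This is not real oneAPI, SYCL or CUDA code: ToyRuntime, preload, launch and the "native(...)" strings are all names I made up, and the "compile" is just a string operation. It only shows where the cost lands: with online compilation the first launch of each kernel pays for the IR-to-device translation, whereas a kernel compiled offline for a known device skips that step.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy model only: none of these names come from oneAPI, SYCL, or CUDA.
struct ToyRuntime {
    // Maps a portable IR blob (think: a SPIR-V module) to "device code".
    std::unordered_map<std::string, std::string> cache;
    // Counts how many times a launch had to pay the runtime compile cost.
    std::size_t jit_compiles = 0;

    // Offline path: the build step already produced device code for a
    // known target, so the runtime just registers it.
    void preload(const std::string& ir, const std::string& native) {
        cache[ir] = native;
    }

    // Online path: translate the IR on first launch, then reuse the result.
    const std::string& launch(const std::string& ir) {
        assert(!ir.empty());
        auto it = cache.find(ir);
        if (it == cache.end()) {
            ++jit_compiles;  // stand-in for the driver's IR -> ISA compile
            it = cache.emplace(ir, "native(" + ir + ")").first;
        }
        return it->second;
    }
};
```

In this model, launching the same IR twice bumps jit_compiles only once, and a kernel registered through preload never bumps it at all. A real runtime is obviously far more involved, but that first-launch hit is the overhead I'm complaining about.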