But what about other accelerators, such as Xilinx Alveo? Then you'd need yet another compute API for each of them
Xilinx is big enough to roll out their own compute API if they're truly serious about general purpose compute. In fact, if mobile hardware vendors care about compute as well, they'll each bring out their own compute APIs to save us from the tyranny of OpenCL and, ironically, from their own bad OpenCL drivers too ...
Us complaining about application developers not supporting a specific API for technical reasons isn't going to solve any real problems, OK? We all need to demand the best possible solution from a vendor, and if they're going to stay obstinate and inattentive to customer needs, we need to leave them behind for good ...
If the industry wanted OpenCL to be viable, it would be in much better condition by now. Instead, what we have is a pile of filth from Khronos, so here we are: two hardware vendors already have their own proprietary compute APIs, and a third may be joining the competition. OpenCL should have been under competitive pressure from the beginning to better itself, but AMD, and possibly Intel as well, realized better than anyone that they could not hope to match CUDA's programming model, which was at least a decade ahead, so it was time for both of them to cut OpenCL loose ...
If we can't have one API to rule them all, then APIs NEED to compete in the marketplace. It is time to stop contemplating OpenCL and start considering the other options out there, because the war between CUDA and OpenCL is already over now that Khronos is winding down OpenCL development, but it's not too late for HIP/HCC (AMD) or oneAPI (Intel) to make a break for it ...
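To see why HIP is positioned to compete, it helps to look at how closely it mirrors CUDA's programming model. The sketch below is a standard SAXPY kernel in HIP; it is illustrative only (it needs AMD's ROCm/HIP SDK and a supported GPU to build and run), but note that apart from the `hip` prefixes it is line-for-line the CUDA idiom, which is what makes mechanical porting via tools like hipify feasible:

```cpp
#include <hip/hip_runtime.h>

// HIP kernel: the __global__ qualifier, blockIdx/blockDim/threadIdx
// built-ins, and the grid/block launch model are all CUDA's.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    hipMalloc(&x, n * sizeof(float));   // cudaMalloc, renamed
    hipMalloc(&y, n * sizeof(float));
    // Launch with the same dim3 grid/block configuration as CUDA.
    hipLaunchKernelGGL(saxpy, dim3((n + 255) / 256), dim3(256), 0, 0,
                       n, 2.0f, x, y);
    hipDeviceSynchronize();
    hipFree(x);
    hipFree(y);
}
```

A developer with an existing CUDA codebase faces a rename, not a rewrite, which is exactly the "make a break for it" strategy.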
Rubbish. CUDA on Windows works fantastically. It's the other vendors who aren't providing good Windows support.
And remember, Deep Learning and Compute are two separate things. There are plenty of workstation applications that want access to GPU compute without using machine learning.
It's true that the other vendors have subpar support for GPU compute on Windows, but that's probably because Microsoft has no vision for improving compute on its own platform. Their C++ AMP programming model never got traction, they aren't interested in expanding GPU compute with another API of their own, and they aren't going to fix the kernel-space limitations in Windows that block an HSA runtime. CUDA uses an ICD model built on top of WDDM, but that really isn't possible with HIP/HCC, since they rely on an HSA runtime that comes with its own kernel drivers, and using WDDM instead isn't an appropriate option either. Even on Windows with WDDM, CUDA shows limitations compared to Linux. We have yet to see what Intel intends to do with its oneAPI initiative, but if it ends up being no more than a SYCL implementation then they stop being a real competitor in GPU compute at that point ...
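For context on what "a SYCL implementation" means: SYCL is Khronos's single-source C++ model, where host code and the device kernel live in the same translation unit and the kernel is an ordinary C++ lambda. The sketch below follows the SYCL 1.2.1-era API and is illustrative only (it requires a SYCL toolchain such as ComputeCpp or Intel's DPC++ to build):

```cpp
#include <CL/sycl.hpp>
#include <vector>

int main() {
    sycl::queue q;                          // picks a default device
    std::vector<float> data(1024, 1.0f);
    {
        // Buffer wraps host memory for the device to access.
        sycl::buffer<float> buf(data.data(), sycl::range<1>(data.size()));
        q.submit([&](sycl::handler& h) {
            auto acc = buf.get_access<sycl::access::mode::read_write>(h);
            // The device kernel is this lambda, compiled from the
            // same C++ source file as the host code.
            h.parallel_for(sycl::range<1>(data.size()),
                           [=](sycl::id<1> i) { acc[i] *= 2.0f; });
        });
    } // buffer destructor synchronizes and writes results back to data
}
```

The single-source convenience is real, but it is a portability layer over whatever driver stack sits beneath it, which is why shipping only this would leave Intel without a differentiated compute platform of its own.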
As far as I can see, deep learning is a subset of compute, so calling them two separate things is frivolous, but let's face reality for a moment. The vast majority of data centre, deep learning, and other high performance computing customers run Linux, not Windows, so what exactly are the Windows-exclusive workstation applications that need a CUDA-like GPU compute model? If you meant professional rendering, there are other options out there that don't need single-source C++ support, and HIP/HCC isn't going to help in that case anyway, since it currently doesn't do graphics. Why risk developing an API on a platform you don't even own, when you face the possibility of being locked out as well? With Linux, a vendor has full control of their own compute stack, which has undeniable benefits for their purposes ...