Adds the GPU backend and the "CUDA" feature to all tests. As of MR creation, none of the tested IRs contained forks, so only vanilla non-parallel (single block, single thread) codegen has been tested.