I recently purchased a new workstation with an Intel W-2295 and a Quadro RTX 5000, but I'm having some performance issues.
Currently, the standard solver runs faster without the GPU than with it. In CPU-only mode, the new workstation also slightly outperforms the old workstation running CPU+GPU (10-core Xeon v3 + Tesla K20c). Comparing the raw TFLOPS of each CPU and the expected uplift from adding the Tesla GPU, the slightly higher performance of the W-2295 makes sense and suggests that the CPU is performing as expected.
I've also benchmarked the RTX 5000 with the CUDA build of mixbench, and it too performs as expected for single-precision workloads (~11 TFLOPS).
The only thing that I can think of is that the standard solver is trying to use double-precision on the GPU (which is significantly slower on the RTX 5000 than the Tesla K20c), but I've double-checked the ".com" file and the double-precision flag is off.
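As a rough sanity check on that hunch, here is a minimal sketch of the theoretical peak throughput of both cards. The core counts, boost clocks, and FP64:FP32 ratios are my assumptions taken from public spec sheets (they are not measured values), and `peak_tflops` is just a helper name made up for this post.

```python
# Rough theoretical peak throughput for the two GPUs.
# All spec figures below are assumptions from public datasheets, not measurements.

def peak_tflops(cuda_cores, boost_clock_ghz, fp64_ratio):
    """Return (FP32, FP64) theoretical peak in TFLOPS.

    Assumes one fused multiply-add (2 FLOPs) per CUDA core per clock cycle.
    """
    fp32 = 2 * cuda_cores * boost_clock_ghz / 1000.0
    return fp32, fp32 * fp64_ratio

# (CUDA cores, boost clock in GHz, FP64:FP32 throughput ratio) -- assumed values
gpus = {
    "Quadro RTX 5000": (3072, 1.815, 1 / 32),
    "Tesla K20c": (2496, 0.706, 1 / 3),
}

for name, spec in gpus.items():
    fp32, fp64 = peak_tflops(*spec)
    print(f"{name}: FP32 ~{fp32:.2f} TFLOPS, FP64 ~{fp64:.2f} TFLOPS")
```

If those spec figures are right, the RTX 5000 comes out at roughly 11 TFLOPS in single precision (consistent with my benchmark) but only about a third of a TFLOPS in double precision, while the K20c manages a bit over 1 TFLOPS in double precision. That gap would be consistent with the slowdown I'm seeing if the solver is in fact running double precision on the GPU.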
Does the standard solver require the use of double-precision for GPGPU? Is there a way to make it use single-precision?