I followed the suggestion in this thread to get OpenCL support on newer AMD hardware under Ubuntu 16.04 (though I'm running 17.04):
https://einsteinathome.org/content/ubuntu-1604#comment-160433
You can view the specs of the machine here: https://einsteinathome.org/host/12556329
It is still very early to tell, but it works, and the latest release 1.6.3/1.6-148 has some improvements in memory protection, which seem to have improved stability. Earlier I did experience memory conflicts on the GPU when running two concurrent GPU tasks while also using the PC.
The installation is simple; follow the steps described here:
https://github.com/RadeonOpenCompute/ROCm
I installed onto a fresh partition with Ubuntu 17.04. The one issue I had was that I wanted to run BOINC and ROCm on a separate partition, to avoid clogging up the main partition since I use the PC for other tasks as well. It took some fiddling with the MBR and GRUB to get that right.
The ambitions of the ROCm and HSA projects go well beyond simply enabling OpenCL, but it's worth a try if you are struggling with OpenCL drivers on newer hardware (read about the hardware requirements on the GitHub page).
Thanks Rolf - looks like you are running at x3 at the moment, though I can't quite tell how many CPU tasks are also running.
Once it gets settled it will be interesting to see the task run times.
With the FGRPB1G tasks being slightly easier at the moment, execution times are roughly:
x1: 610s
x2: 1030s
x3: 1490s
With tasks of normal difficulty I think runtimes will increase by some 20-30 s. I have set BOINC to use 8 virtual cores, so when running one GPU task it will run 7 CPU tasks, with x2 it will run 6 CPU tasks, and with x3 it will run 5.
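To put the times listed above on a per-task basis, here is a rough Python sketch. It assumes each listed time is the wall-clock time for all concurrent GPU tasks to finish together, and that each GPU task reserves one of the 8 virtual cores (which matches the 7/6/5 CPU task counts above). The function names are just illustrative.

```python
# Wall-clock seconds for one batch of concurrent GPU tasks, from the post above.
batch_seconds = {1: 610, 2: 1030, 3: 1490}
TOTAL_THREADS = 8  # BOINC limited to 8 virtual cores, per the post


def per_task_seconds(concurrency, seconds):
    """Effective seconds per finished task when `concurrency` tasks run together."""
    return seconds / concurrency


def tasks_per_day(concurrency, seconds):
    """Finished GPU tasks per 24 hours at this concurrency."""
    return 86400 * concurrency / seconds


def cpu_tasks(concurrency, total_threads=TOTAL_THREADS):
    """CPU tasks left over, assuming each GPU task reserves one thread."""
    return total_threads - concurrency


for n, secs in batch_seconds.items():
    print(
        f"x{n}: {per_task_seconds(n, secs):.0f} s/task, "
        f"{tasks_per_day(n, secs):.1f} GPU tasks/day, "
        f"{cpu_tasks(n)} CPU tasks"
    )
```

On these numbers x2 gives about 18% more GPU throughput than x1, while x3 adds only about 4% on top of x2. The figures are for the currently easier tasks, so treat them as approximate.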
The execution of GPU tasks doesn't seem to affect the CPU tasks much, and vice versa. The throughput of O1Spot1Lo tasks - which is what I always get - is about the same whether I run x3 GPU / x5 CPU, x1 GPU / x7 CPU or even x1 GPU / x11 CPU. O1Spot1Lo seems to be heavily limited by RAM access, and FGRPB1G doesn't use much CPU or RAM bandwidth at all.
With the gains from running x3 so small, I will probably go back to x2 in the long run. x3 is more of a stress test; it just creates more hot air and noise. This is from x3: