ROCm + Ryzen + RX580, some early experience

Rolf
Joined: 7 Aug 17
Posts: 22
Credit: 116,027,081
RAC: 83,935
Topic 209743

I followed the suggestion in this thread to get OpenCL support on newer AMD hardware under Ubuntu 16.04 (though I'm running 17.04):

https://einsteinathome.org/content/ubuntu-1604#comment-160433

You can view the specs of the machine here: https://einsteinathome.org/host/12556329

It is still very early to tell, but it works, and the latest release 1.6.3/1.6-148 has some improvements in memory protection which seem to have improved stability. Earlier I did experience memory conflicts on the GPU when running two concurrent GPU tasks while using the PC at the same time.

The installation is simple; follow the steps described here:

https://github.com/RadeonOpenCompute/ROCm

I installed onto a fresh partition with Ubuntu 17.04. The one issue I did have was that I wanted to run BOINC and ROCm on a separate partition, to avoid clogging up the main partition, since I'm using the PC for other tasks as well. It took some fiddling with the MBR and GRUB to get that right.

The ambitions for the ROCm and HSA projects are much higher than simply enabling OpenCL, but it's worth a try if you are struggling with OpenCL drivers on newer hardware (read about the hardware requirements on the GitHub page).

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513,211,304
RAC: 0

Thanks Rolf - looks like you are running at x3 at the moment, though I can't quite tell how many CPU tasks are also running.

Once it gets settled, it will be interesting to see the task run times.

Rolf
Joined: 7 Aug 17
Posts: 22
Credit: 116,027,081
RAC: 83,935

With the FGRPB1G tasks being slightly easier at the moment, execution times are roughly

x1: 610s

x2: 1030s

x3: 1490s

With tasks of normal difficulty I think run times will increase by some 20-30 s. I have set BOINC to use 8 virtual cores, so when running one GPU task it will run 7 CPU tasks, with x2 it will run 6 CPU tasks, and with x3 it will run 5.

The execution of GPU tasks doesn't seem to affect the CPU tasks much, and vice versa. The throughput of O1Spot1Lo tasks - which is what I always get - is about the same regardless of running x3 GPU / x5 CPU, x1 GPU / x7 CPU, or even x1 GPU / x11 CPU. O1Spot1Lo seems to be heavily limited by RAM access, and FGRPB1G doesn't use much CPU or RAM bandwidth at all.

With the gains from running x3 so small, I will probably go back to x2 in the long run. x3 is more of a stress test; it just creates more hot air and noise. This is from x3:

====================    ROCm System Management Interface    ====================
================================================================================
 GPU  DID    Temp     AvgPwr   SCLK     MCLK     Fan      Perf    OverDrive  ECC
  0   67df   72.0c    135.85W  1365Mhz  1750Mhz  53.73%   auto      0%       N/A      
================================================================================
====================           End of ROCm SMI Log          ====================
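
To put a number on "the gains from running x3 so small", the run times quoted above can be converted into effective throughput. This is only a rough sketch: it assumes each quoted figure is the wall-clock time of a single task when that many tasks share the GPU, and the function names are made up for illustration.

```python
# Run times from the post: seconds per FGRPB1G task when N tasks run concurrently.
# Assumption: each figure is the wall-clock time of one task at that concurrency.
run_seconds = {1: 610, 2: 1030, 3: 1490}

SECONDS_PER_DAY = 86400


def seconds_per_task(concurrency, run_time):
    """Effective GPU time spent per finished task."""
    return run_time / concurrency


def tasks_per_day(concurrency, run_time):
    """Finished tasks per day if the GPU runs this load continuously."""
    return SECONDS_PER_DAY * concurrency / run_time


baseline = tasks_per_day(1, run_seconds[1])
for n, t in run_seconds.items():
    rate = tasks_per_day(n, t)
    gain = 100 * (rate / baseline - 1)
    print(f"x{n}: {seconds_per_task(n, t):.0f} s/task, "
          f"{rate:.0f} tasks/day, {gain:+.1f}% vs x1")
```

Under that assumption, x2 finishes roughly 18% more tasks per day than x1, while x3 adds only about 4% on top of x2.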

 
