computation errors on Ubuntu17 + RX460 + OpenCL Mesa

Magiceye04
Joined: 18 Feb 06
Posts: 31
Credit: 818886114
RAC: 726282
Topic 207045

Hi!

On my new PC (AMD Ryzen, RX460) I installed Ubuntu 17.04 and wanted to use the GPU for BOINC.

I use the included free AMDGPU driver and only added the package boinc-client-opencl.

I got some WUs for Einstein@Home, but they all abort with an error after a few seconds.

Is it possible to use Ubuntu 17.04 for Einstein@Home on the RX460, and does it only need some fine tuning?

example: https://einsteinathome.org/task/636200082

Task 636200082

Name: LATeah0022L_1076.0_0_0.0_14084865_1
Work unit ID: 290015418
Created: 22 Apr 2017 12:00:26 GMT
Sent: 22 Apr 2017 12:04:12 GMT
Received: 22 Apr 2017 12:05:57 GMT
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 11 (0x0000000B) Unknown error code
Computer: 12523728
Report deadline: 6 May 2017 12:04:12 GMT
Run time (sec): 3.25
CPU time (sec): 0.98
Validation state: Invalid
Claimed credit: 0.03
Granted credit: 0.00
Application: Gamma-ray pulsar binary search #1 on GPUs v1.18 (FGRPopencl1K-ati) x86_64-pc-linux-gnu

 

Some information (clinfo):

 

Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.0.3
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD POLARIS11 (DRM 3.9.0 / 4.10.0-19-generic, LLVM 4.0.0)
  Device Vendor                                   AMD
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 Mesa 17.0.3
  Driver Version                                  17.0.3
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               14
  Max clock frequency                             1220MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              64
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     No
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              2145640448 (1.998GiB)
  Error Correction support                        No
  Max memory allocation                           1501948313 (1.399GiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       1024 bits (128 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        1501948313 (1.399GiB)
  Max number of constant args                     16
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD POLARIS11 (DRM 3.9.0 / 4.10.0-19-generic, LLVM 4.0.0)
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD POLARIS11 (DRM 3.9.0 / 4.10.0-19-generic, LLVM 4.0.0)

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.2.11
  ICD loader Profile                              OpenCL 2.1


Sa 22 Apr 2017 11:51:22 CEST |  | Starting BOINC client version 7.6.33 for x86_64-pc-linux-gnu
Sa 22 Apr 2017 11:51:22 CEST |  | log flags: file_xfer, sched_ops, task
Sa 22 Apr 2017 11:51:22 CEST |  | Libraries: libcurl/7.52.1 OpenSSL/1.0.2g zlib/1.2.11 libidn2/0.16 libpsl/0.17.0 (+libidn2/0.16) librtmp/2.3
Sa 22 Apr 2017 11:51:22 CEST |  | Data directory: /var/lib/boinc-client
Sa 22 Apr 2017 11:51:22 CEST |  | OpenCL: AMD/ATI GPU 0: AMD POLARIS11 (DRM 3.9.0 / 4.10.0-19-generic, LLVM 4.0.0) (driver version 17.0.3, device version OpenCL 1.1 Mesa 17.0.3, 2046MB, 2046MB available, 1366 GFLOPS peak)
Sa 22 Apr 2017 11:51:22 CEST |  | Host name: AZen
Sa 22 Apr 2017 11:51:22 CEST |  | Processor: 16 AuthenticAMD AMD Ryzen 7 1700 Eight-Core Processor [Family 23 Model 1 Stepping 1]
Sa 22 Apr 2017 11:51:22 CEST |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic overflow_recov succor smca
Sa 22 Apr 2017 11:51:22 CEST |  | OS: Linux: 4.10.0-19-generic

...

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117722852302
RAC: 34981689

MagicEye wrote:

On my new PC (AMD Ryzen, RX460) i installed Ubuntu17.04 and wanted to use the GPU for Boinc.

I use the included free AMDGPU driver, only added the package boinc-client-opencl

I know nothing about Ubuntu or its package system.  However you might be interested in what I did to get a number of RX460s running smoothly.  The hosts are relatively old Intel dual and quad core machines that previously did CPU crunching only.  Here is a link to one of them.  This machine has been crunching 24/7 for 9 years.  With the RX460 upgrade (same everything else including the 300W PSU)  it has become extremely productive - RAC from ~5K to 270K :-).  The distro I use is PCLinuxOS.  A fresh install comes with the free AMDGPU driver already installed.

I decided to experiment by downloading the AMDGPU-PRO package from the AMD website.  From memory I chose the Red Hat version, although for my situation it probably doesn't matter.  Of course, the install script in that package only recognises a couple of Linux distros, including Ubuntu, but I just perused the install script to understand what it was doing.  It was a couple of months ago so I don't remember specific details but it was fairly straightforward to recognise the OpenCL libs and other bits that might be needed.  Those were the only bits I was interested in - a rather tiny portion of the entire package.  I didn't want the driver itself.

I extracted the package and put the bits needed on an external USB hard drive.  A USB stick would be just as good.  On a test machine, I copied the needed files into place and when I tried it out it just worked.  I have some notes somewhere of exactly what I did and I wrote a shell script to implement those commands to make it simpler for future installs.  I have a further 7 installations that used the script and everything works just fine.

I'm not suggesting this will work for you as is.  The way I run BOINC and where things get installed will be completely different to what you might need for Ubuntu.  If you have moderate command line skills, you should be able to work out your own procedure fairly easily.  There really aren't many steps involved.  If you are interested, I can document the directories and their contents on the USB hard drive and give you a copy of the script.  It's quite simple: most of it is error checking to ensure the disk is mounted and that relevant source and destination directories exist, etc.

The only real work is to copy the files into place, create a couple of symlinks and run ldconfig to make sure the libs can be found.  The biggest difference is likely to be that I use the Berkeley version of BOINC installed in a personal directory rather than a repo version installed in /var/lib/boinc or whatever Ubuntu uses but you should be able to work that out.
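For readers who want a starting point, the copy step described above could be sketched roughly like this.  This is not the actual script: the function name copy_opencl_libs and both directory arguments are hypothetical placeholders, and which files are really needed depends on the AMDGPU-PRO version.

```shell
# Sketch of the copy step only. All names and paths here are placeholders,
# not taken from the script described above.
copy_opencl_libs() {
    src=$1
    dest=$2
    # Refuse to run unless both directories exist (e.g. the USB disk is mounted).
    [ -d "$src" ]  || { echo "source $src not found" >&2; return 1; }
    [ -d "$dest" ] || { echo "destination $dest not found" >&2; return 1; }
    # -P copies symlinks as symlinks instead of copying their targets.
    cp -P "$src"/*.so* "$dest"/
}

# Usage (hypothetical paths):
#   copy_opencl_libs /mnt/usb/opencl-libs /opt/amdgpu-pro/lib64
# If the destination is a system library directory, run ldconfig as root
# afterwards so the dynamic linker cache picks up the new libs.
```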

 

Cheers,
Gary.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Hi Magiceye, welcome to the E@H forums.

There are two different drivers: Mesa (open source) and amdgpu-pro.

The second is the one more commonly used for OpenCL here.

I think amdgpu-pro is not officially supported for 17.04 (though it may work); see http://support.amd.com/en-us/download/linux

Edit: nope, see Phoronix.

It looks like you are using Mesa, so you need to resolve either

  • Why Mesa is giving OpenCL errors
  • Why amdgpu-pro did not install.

You may find the easiest solution is 16.04 + amdgpu-pro.

HTH

Details follow

MagicEye wrote:

Number of platforms                               1

  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.0.3

The amdgpu-pro driver should show

$ clinfo
 Number of platforms                               1
   Platform Name                                   AMD Accelerated Parallel Processing
   Platform Vendor                                 Advanced Micro Devices, Inc.
   Platform Version                                OpenCL 2.0 AMD-APP (2236.5)
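
A quick way to compare the two cases is to pull just the vendor line out of the clinfo output.  The helper below is only a sketch: platform_vendors is a made-up name, and it assumes the "Platform Vendor" label is formatted as in the listings in this thread (with or without a trailing colon).

```shell
# Sketch: print the value of every "Platform Vendor" line in clinfo output.
# platform_vendors is a hypothetical helper, not part of clinfo.
platform_vendors() {
    # Reads clinfo output on stdin; strips the label, an optional colon,
    # and the padding, leaving only the vendor string.
    sed -n 's/^ *Platform Vendor:\{0,1\}  *//p'
}

# Usage (assumes clinfo is installed):
#   clinfo | platform_vendors
```

With the Mesa stack this should print "Mesa"; with amdgpu-pro it should print "Advanced Micro Devices, Inc.".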

 

 

Paul
Joined: 3 May 07
Posts: 123
Credit: 1785730885
RAC: 250924

OP, did you find a solution?  I've tried following Gary's advice, as he has been very helpful over the years, but I could not get his solution to work on my system.

Gary Roberts

Paul_134 wrote:
OP, did you find a solution?

I presume that he didn't since the Ryzen host mentioned now has an nVidia 970 GPU.

Quote:
I've tried following Gary's advice, as he has been very helpful over the years, but I could not get his solution to work on my system.

I now have quite a few Polaris GPUs ranging from RX460s to RX580s.  They all use exactly the same procedure we discussed last year.  Recently, I spent a bit of time with the latest versions of AMDGPU-PRO (both RH and Ubu variants) and tried playing around with the successor to the now deprecated --compute option.  It's a lot more complicated (a lot of stuff had changed) and I ended up deciding it was too complicated for my limited knowledge.

In the process of trying to work out a procedure that might work, I made an interesting discovery.  It seems to me there are two separate things that need to happen for Polaris GPU crunching to work here.  First, BOINC, during startup, has to detect the OpenCL capability of the GPU.  You must have that working since BOINC allows you to download tasks, which then fail.  Second, the science app itself (nothing to do with BOINC) has to find the libs it needs.  To find out exactly what those are, I just ran ldd on the hsgamma app.  I did this to see exactly what amdgpu component(s) the app was relying on.  Apart from what looked like pretty standard libs, the app only relies on /usr/lib64/libOpenCL.so.1 and not any of the stuff installed in /opt/amdgpu-pro/lib64/.

I hadn't installed this particular lib that ldd reported - it was already there as part of a standard install.  I've done a bit of googling and found where it came from on github, if you're interested.  My distro has it packaged and in the repo and it was installed by default.

So, it would appear that what I achieved a year ago, pretty much by accident, was to install something that BOINC could find that allowed BOINC to know that a Polaris GPU was a usable GPU.  Once BOINC was satisfied, I already had everything else needed for the science app to run.  In my distro's repo, the package is called lib64opencl1-2.2.11.  On github, I see there is now a 2.2.12 version.  You might find this already packaged somewhere in your distro.  If you can install this, maybe the science app should work for you.  I'm not a programmer so I know very little about this stuff.  I could easily be way off track in my thought process :-).

 

Cheers,
Gary.

Paul

Gary Roberts wrote:
I now have quite a few Polaris GPUs ranging from RX460s to RX580s.  They all use exactly the same procedure we discussed last year.

So so jealous.

Quote:
Recently, I spent a bit of time with the latest versions of AMDGPU-PRO (both RH and Ubu variants) and tried playing around with the successor to the now deprecated --compute option.  It's a lot more complicated (a lot of stuff had changed) and I ended up deciding it was too complicated for my limited knowledge.

Frustrating.  I'm less inclined to try it again, now.

Quote:

So, it would appear that what I achieved a year ago, pretty much by accident, was to install something that BOINC could find that allowed BOINC to know that a Polaris GPU was a usable GPU.  Once BOINC was satisfied, I already had everything else needed for the science app to run.  In my distro's repo, the package is called lib64opencl1-2.2.11.  On github, I see there is now a 2.2.12 version.  You might find this already packaged somewhere in your distro.  If you can install this, maybe the science app should work for you.  I'm not a programmer so I know very little about this stuff.  I could easily be way off track in my thought process :-).

 

ldd shows all the dynamically linked shared libraries are installed.  It looks like the same is true for MagicEye.  As best I can tell, the project apps are just built against the same libraries (OpenCL, obviously), but the proprietary drivers overwrite this library with one that behaves differently, has different capabilities, or possibly a different API.  I don't see any way to fix this, and I don't know how to get anyone to do that.  The devs for this code just don't seem to know how to fix it.  Seems like AMD isn't making good on their promise to help the OSS side.  If we can get them good backtraces with debug info, *maybe* that will work.

Off topic: Since you have both, how do you like the POLARIS 11?  How does your 580 perform compared to your 460?  I have a RX480.

Gary Roberts

Paul_134 wrote:
... LDD shows all the dynamically linked shared libraries are installed.

The devil is always in the detail.  Can you please just humour me for a bit :-).  Can you please open a terminal window and cd to the Einstein project directory on the machine where your RX480 is installed.  I've just done this over ssh for one of my machines as shown in the example below.  Can you run the same sequence of commands on your machine and post all the output in your next message?  Thanks.  I'm just trying to confirm exactly where your particular hsgamma app is going to for OpenCL.  If it's not under /usr/lib64/ then list the details of these files where they actually are installed.

[gary@server ~]$ ssh xxxxx
Last login: Mon Feb 26 07:24:56 2018
[gary@xxxxx ~]$ cd $EAH_PROJ
[gary@xxxxx einstein.phys.uwm.edu]$ ls -l hsgamma*ati
-rwxr-xr-x 1 gary gary 10216922 Feb 8 2017 hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati*
[gary@xxxxx einstein.phys.uwm.edu]$ ldd hsgamma*1K-ati
linux-vdso.so.1 (0x00007ffe95fb9000)
libOpenCL.so.1 => /usr/lib64/libOpenCL.so.1 (0x00007fa23f98a000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa23f76e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fa23f56a000)
libm.so.6 => /lib64/libm.so.6 (0x00007fa23f264000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa23eeb0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa23fba3000)
[gary@xxxxx einstein.phys.uwm.edu]$ ls -l /usr/lib64/libOpenCL*
lrwxrwxrwx 1 root root 18 Aug 5 2017 /usr/lib64/libOpenCL.so.1 -> libOpenCL.so.1.0.0*
-rwxr-xr-x 1 root root 100624 Apr 3 2015 /usr/lib64/libOpenCL.so.1.0.0*
[gary@xxxxx einstein.phys.uwm.edu]$ 
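
The ldd step in the session above can be boiled down to a one-line filter.  This is just a sketch; opencl_path_from_ldd is a hypothetical helper name, and it reads ldd output on stdin so it can be tried without the science app being present.

```shell
# Sketch: print the path that ldd resolved libOpenCL.so.1 to.
# opencl_path_from_ldd is a made-up name used for illustration only.
opencl_path_from_ldd() {
    # ldd prints lines of the form "name => /path/to/lib (address)";
    # the third whitespace-separated field is the resolved path.
    awk '$1 == "libOpenCL.so.1" && $2 == "=>" { print $3 }'
}

# Usage, from the Einstein project directory:
#   ldd hsgamma*1K-ati | opencl_path_from_ldd
```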

Quote:
Off topic: Since you have both, how do you like the POLARIS 11?  How does your 580 perform compared to your 460?  I have a RX480.

The initial cards I bought were 460s and I'm happy with their performance.  Most of the various brands I have don't require a PCIe power connector.  I bought a couple of 560s some time later when they were on special.  Those ones do require PCIe power and give 5-10% more output.  Recently I managed to pick up some Asus 560s that don't need the power connector.  Their RAC has recently plateaued and is quite similar to the earlier powered 560s.

The 570s and 580s are pretty close in output and each one cost pretty much the same to buy. They produce a little more than twice the output of a 460 and they cost a bit more again than twice the price of a 460. A host with a 570 draws about twice the power of a host with a 460.  None of those things was a deciding factor for me.  I had a bunch of 2008-2010 vintage CPU crunchers.  Should I retire them or should I upgrade them was the key question.  When I discovered how to get a 460 to crunch, it was very hard to resist the upgrade urge - so I didn't :-).  The 460s could use existing 300W PSUs so it was very simple to upgrade a bunch of old machines.  When the 570s and 580s were on special, I also bought some 450W PSUs so it cost a bit more for those upgrades.

Here are the pretty stable current RACs for each GPU type as of this morning.

      RX460  ->  260K
      RX560  ->  285K
      RX570  ->  560K
      RX580  ->  575K

I have a machine with dual RX560s.  This morning its RAC was 567K.  It's been going for long enough to be just about stable.  A similar machine with dual RX460s had 541K RAC.

About 10 days ago we had some really hot weather.  I ran a script on two consecutive days to pause crunching on all hosts for about 6 hours each day to keep the room temperature within limits.  It obviously had some effect on RAC for all machines but that seems mostly recovered by now.  A few days before that there was a severe thunderstorm that caused a couple of power fluctuations sufficient to crash most machines.  It took me about 4 hours to get everything restarted.  That would have had some effect on RAC as well.

 

Cheers,
Gary.

Paul

Gary Roberts wrote:

The devil is always in the detail.  Can you please just humour me for a bit :-).  Can you please open a terminal window and cd to the Einstein project directory on the machine where your RX480 is installed.  I've just done this over ssh for one of my machines as shown in the example below.  Can you run the same sequence of commands on your machine and post all the output in your next message?  Thanks.  I'm just trying to confirm exactly where your particular hsgamma app is going to for OpenCL.  If it's not under /usr/lib64/ then list the details of these files where they actually are installed.

[gary@server ~]$ ssh xxxxx
Last login: Mon Feb 26 07:24:56 2018
[gary@xxxxx ~]$ cd $EAH_PROJ
[gary@xxxxx einstein.phys.uwm.edu]$ ls -l hsgamma*ati
-rwxr-xr-x 1 gary gary 10216922 Feb 8 2017 hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati*
[gary@xxxxx einstein.phys.uwm.edu]$ ldd hsgamma*1K-ati
linux-vdso.so.1 (0x00007ffe95fb9000)
libOpenCL.so.1 => /usr/lib64/libOpenCL.so.1 (0x00007fa23f98a000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa23f76e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fa23f56a000)
libm.so.6 => /lib64/libm.so.6 (0x00007fa23f264000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa23eeb0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa23fba3000)
[gary@xxxxx einstein.phys.uwm.edu]$ ls -l /usr/lib64/libOpenCL*
lrwxrwxrwx 1 root root 18 Aug 5 2017 /usr/lib64/libOpenCL.so.1 -> libOpenCL.so.1.0.0*
-rwxr-xr-x 1 root root 100624 Apr 3 2015 /usr/lib64/libOpenCL.so.1.0.0*
[gary@xxxxx einstein.phys.uwm.edu]$ 

 

$ ls -l hsgamma*
-rwxr-xr-x. 1 pdestefa pdestefa  6278871 Feb  5  2016 hsgamma_FGRP4_1.15_x86_64-pc-linux-gnu__FGRP4-SSE2*
-rwxr-xr-x. 1 pdestefa pdestefa  6320021 Feb 20  2016 hsgamma_FGRPB1_1.00_x86_64-pc-linux-gnu*
-rwxr-xr-x. 1 pdestefa pdestefa  6320021 Sep 13  2016 hsgamma_FGRPB1_1.01_x86_64-pc-linux-gnu__FGRPOLD*
-rwxr-xr-x. 1 pdestefa pdestefa  6401891 Sep 22  2016 hsgamma_FGRPB1_1.05_x86_64-pc-linux-gnu__FGRPSSE*
-rwxr-xr-x. 1 pdestefa pdestefa  6401891 Sep 30  2016 hsgamma_FGRPB1_1.05_x86_64-pc-linux-gnu__FGRPSSE-Beta*
-rwxr-xr-x. 1 pdestefa pdestefa 10197252 Dec  5  2016 hsgamma_FGRPB1G_1.12_x86_64-pc-linux-gnu__FGRPopencl-Beta-ati*
-rwxr-xr-x. 1 pdestefa pdestefa 10216922 May  6  2017 hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati*

$ ldd hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati
        linux-vdso.so.1 (0x00007ffdd9bca000)
        libOpenCL.so.1 => /lib64/libOpenCL.so.1 (0x00007f87ba8d3000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f87ba6b4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f87ba4b0000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f87ba15b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f87b9d78000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f87baaf3000)
$ ls -l /usr/lib64/libOpenCL*
lrwxrwxrwx. 1 root root     25 Dec 16  2013 /usr/lib64/libOpenCL.so -> /usr/lib64/libOpenCL.so.1*
lrwxrwxrwx. 1 root root     18 Aug  3  2017 /usr/lib64/libOpenCL.so.1 -> libOpenCL.so.1.0.0*
-rwxr-xr-x. 1 root root 132104 Aug  3  2017 /usr/lib64/libOpenCL.so.1.0.0*

 Okay, how about this one for you?

 

$ clinfo |grep Platform
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.2.4
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 0.15-pre, LLVM 5.0.0
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL
  Platform Name                                   Clover
1 error generated.
=== CL_PROGRAM_BUILD_LOG ===
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
error: <built-in>:2:10: '/usr/lib64/clang/5.0.0/include/opencl-c.h' file not found
  Platform Name                                   Portable Computing Language
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
    Platform Name                                 Clover
    Platform Name                                 Clover
    Platform Name                                 Clover
$ clpeak | head
Platform: Clover
  Device: AMD Radeon (TM) RX 480 Graphics (AMD POLARIS10 / DRM 3.23.0 / 4.15.3-300.fc27.x86_64, LLVM 5.0.0)
    Driver version  : 17.2.4 (Linux x64)
    Compute units   : 36
    Clock frequency : 1288 MHz
    Build Log: input.cl:34:127: error: call to 'mad' is ambiguous
input.cl:30:22: note: expanded from macro 'MAD_64'
input.cl:29:22: note: expanded from macro 'MAD_16'
input.cl:28:25: note: expanded from macro 'MAD_4'

 

Gary Roberts

Paul_134 wrote:
$ ldd hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati
        linux-vdso.so.1 (0x00007ffdd9bca000)
        libOpenCL.so.1 => /lib64/libOpenCL.so.1 (0x00007f87ba8d3000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f87ba6b4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f87ba4b0000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f87ba15b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f87b9d78000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f87baaf3000)
$ ls -l /usr/lib64/libOpenCL*
lrwxrwxrwx. 1 root root     25 Dec 16  2013 /usr/lib64/libOpenCL.so -> /usr/lib64/libOpenCL.so.1*
lrwxrwxrwx. 1 root root     18 Aug  3  2017 /usr/lib64/libOpenCL.so.1 -> libOpenCL.so.1.0.0*
-rwxr-xr-x. 1 root root 132104 Aug  3  2017 /usr/lib64/libOpenCL.so.1.0.0*

Thanks very much for that.  Can you see what's very interesting in your output?

Your hsgamma app is looking for OpenCL in /lib64.  I'm guessing that's the Clover one.  I also believe the one that might actually work for you is in /usr/lib64, as you show in your ls -l listing.  See the size and date of that one?  The one I showed as an example was from one of my early machines, where the size was 100624 and the date Apr 3 2015.  I've just checked a more recent install of mine and the lib is a bit bigger now, 129320, dated 27 Jul 2017.

Both versions of this lib are working for me.  Your one in /usr/lib64 would appear to be pretty much the same as my new version of 27 Jul.  My guess is all you need to do is convince the hsgamma app to use the lib in /usr/lib64/ rather than the one in /lib64/.

Out of interest, what do you get for running

ls -l /lib64/libOpenCL*

Paul_134 wrote:
Okay, how about this one for you?

I'll answer this in the next message.

 

Cheers,
Gary.

Gary Roberts

Paul_134 wrote:

Okay, how about this one for you?

 

$ clinfo |grep Platform
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 Mesa 17.2.4
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 1.2 pocl 0.15-pre, LLVM 5.0.0
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             POCL
  Platform Name                                   Clover
....


I'm not a programmer so I'm guessing the above shows you have two OpenCL implementations installed, OpenCL 1.1 Mesa 17.2.4 and OpenCL 1.2 pocl 0.15-pre, LLVM 5.0.0 and that something in these is not compatible with the hsgamma app.

I installed stuff (using the --compute option as a guide) in /opt/amdgpu-pro/lib64/.  I didn't want this stuff to affect anything else running on my machines (it was just for BOINC's benefit) so to launch BOINC, I prefixed the launch command with the LD library path as shown below:-

% cd $BOINC ; LD_LIBRARY_PATH=/opt/amdgpu-pro/lib64/ ./boinc --daemon

At the time, I didn't know any better and I just assumed that because BOINC would know about these extra libs then so would any science app launched by BOINC.  However, it seems to me that by a fortuitous set of circumstances, the above allowed BOINC to detect the GPUs and then when the science app fired up it found what it needed elsewhere in /usr/lib64/ where this other OpenCL lib just happened to be already installed.  Seems like the science app didn't know about the stuff in /opt/amdgpu-pro/lib64 after all.  I hadn't worked this out until quite a while after the initial success.  It's only quite recently that all of this now seems to make some sense and be somewhat explicable to a non-programmer.
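
A side note on the mechanism, offered as a sketch rather than a statement about BOINC internals: a variable set as a prefix on a command is normally inherited by everything that command launches in turn.  So unless BOINC resets the environment before starting an app (not verified here), the science app would see the same LD_LIBRARY_PATH.  The example below uses a made-up variable, DEMO_LIB_PATH, and a hypothetical helper name.

```shell
# Sketch: a variable set as a command prefix reaches grandchild processes.
# DEMO_LIB_PATH and show_inherited are made up for this demonstration.
show_inherited() {
    # The outer sh stands in for the launcher, the inner sh for a program
    # the launcher starts; the inner one prints the inherited value.
    DEMO_LIB_PATH=/opt/example/lib64 sh -c 'sh -c "echo \$DEMO_LIB_PATH"'
}

# Usage:
#   show_inherited
```

The variable is set only for that one command line, so the calling shell itself is left unchanged.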

If I try to launch clinfo by going to /opt/amdgpu-pro/bin/ and issuing the command ./clinfo, it tries to start and immediately fails.  However, as you would expect, if I prefix the command with the LD library path just as I do for BOINC, then clinfo runs fine and gives the following output:-

Number of platforms:      1
Platform Profile:         FULL_PROFILE
Platform Version:         OpenCL 2.0 AMD-APP (2264.10)
Platform Name:            AMD Accelerated Parallel Processing
Platform Vendor:          Advanced Micro Devices, Inc.
Platform Extensions:      cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

Platform Name:             AMD Accelerated Parallel Processing
Number of devices:         2
Device Type:               CL_DEVICE_TYPE_GPU
Vendor ID:                 1002h
Board name:                Radeon RX Series
Device Topology:           PCI[ B#1, D#0, F#0 ]
Max compute units:         32
....

There are lots more lines of output but no error messages or signs of problems that I could see.

 

Cheers,
Gary.

Paul

Gary Roberts wrote:
Paul_134 wrote:
$ ldd hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati
        linux-vdso.so.1 (0x00007ffdd9bca000)
        libOpenCL.so.1 => /lib64/libOpenCL.so.1 (0x00007f87ba8d3000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f87ba6b4000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f87ba4b0000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f87ba15b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f87b9d78000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f87baaf3000)
$ ls -l /usr/lib64/libOpenCL*
lrwxrwxrwx. 1 root root     25 Dec 16  2013 /usr/lib64/libOpenCL.so -> /usr/lib64/libOpenCL.so.1*
lrwxrwxrwx. 1 root root     18 Aug  3  2017 /usr/lib64/libOpenCL.so.1 -> libOpenCL.so.1.0.0*
-rwxr-xr-x. 1 root root 132104 Aug  3  2017 /usr/lib64/libOpenCL.so.1.0.0*

Thanks very much for that.  Can you see what's very interesting in your output?

Hehe! Ah, that's funny.  No, no, I didn't notice anything odd because that's normal on Fedora/RH.  /lib and /lib64 don't exist as separate directories anymore; they are symlinks into /usr, so those are all the same file.

$ ls -ld /lib
lrwxrwxrwx. 1 root root 7 Aug  2  2017 /lib -> usr/lib/
$ ls -ld /lib64
lrwxrwxrwx. 1 root root 9 Aug  2  2017 /lib64 -> usr/lib64/
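
The same point can be checked for any pair of paths by resolving the symlinks and comparing the results.  A minimal sketch, assuming GNU readlink (for the -f option); same_file is a hypothetical helper name.

```shell
# Sketch: succeed if two existing paths resolve to the same file.
# same_file is a made-up helper; readlink -f is the GNU coreutils option
# that follows every symlink and prints the canonical path.
same_file() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(readlink -f -- "$1")" = "$(readlink -f -- "$2")" ]
}

# Usage:
#   same_file /lib64/libOpenCL.so.1 /usr/lib64/libOpenCL.so.1 && echo "same file"
```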

But!  I take your point, though.  I did not try to override libopencl with AMD's version, which is exactly what I said was needed before.  You are totally right; I didn't try hard enough.  I didn't consider LD_PRELOAD.  I think that is a great idea.  Thanks again Gary!

BTW, the opencl that ldd found is part of the OCL-ICD package, not clover, mesa, or pocl.  But, I think that is strange.  I'm going to look around a bit...

Hmm, so, I don't see anything wrong with that.  This is something I still don't understand about the ICD system.  How is the linker supposed to connect a program that asks for 'libOpenCL.so.1' with a replacement in the ICD?  Is that even how it works?  Ugh, I'm so confused.
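
For what it's worth, the libOpenCL.so.1 from an ICD loader package does not implement OpenCL itself.  At run time it reads small .icd text files, each naming a vendor library, and loads those libraries itself; the linker only ever connects the program to the loader.  The sketch below lists those entries.  list_icds is a hypothetical helper, and /etc/OpenCL/vendors is the conventional Linux location, which may differ on a given distro.

```shell
# Sketch: list the ICD registry entries an OpenCL ICD loader would consult.
# list_icds is a made-up helper name; the default directory is an assumption.
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        [ -e "$f" ] || continue    # no .icd files: the glob stays literal
        # Each .icd file holds the name or path of one vendor library.
        printf '%s: %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}

# Usage:
#   list_icds
#   list_icds /some/other/vendors/dir
```

If only a Mesa entry shows up, the loader has nothing else to hand the science app, whichever libOpenCL.so.1 it was linked against.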

In any case, I think you might be on to something.  There is no reason I cannot preload a particular library into BOINC's environment.  That's worth a shot.
