[Solved] Gamma-ray pulsar binary search GPUs v1.18: OpenCL error occurred

Cyanr & Cinny
Joined: 14 Mar 06
Posts: 7
Credit: 118140284
RAC: 219041
Topic 216637

Well... I guess my AMD/ATI GPU may not be compatible with v1.18 since I am using amdgpu/pro with amd OpenCL... 

$ lspci | grep VGA
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn XT [Radeon HD 7870 GHz Edition]

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 69 (0x45, -187)</message>
<stderr_txt>
16:46:13 (32530): [normal]: This Einstein@home App was built at: Jan 16 2017 08:09:16
16:46:13 (32530): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati'.
16:46:13 (32530): [debug]: 1e+16 fp, 3.6e+09 fp/s, 2879005 s, 799h43m25s40
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati --inputfile ../../projects/einstein.phys.uwm.edu/LATeah0104R.dat --alpha 4.4228137297 --delta -0.0345036602638 --skyRadius 5.817760e-08 --ldiBins 15 --f0start 636.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 2.71528666e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah0104R_0644_685114.dat --debug 1 --device 0 -o LATeah0104R_644.0_0_0.0_685114_1_0.out
output files: 'LATeah0104R_644.0_0_0.0_685114_1_0.out' '../../projects/einstein.phys.uwm.edu/LATeah0104R_644.0_0_0.0_685114_1_0' 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah0104R_644.0_0_0.0_685114_1_1'
16:46:13 (32530): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
16:46:13 (32530): [debug]: glibc version/release: 2.27/stable
16:46:13 (32530): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x183adf0 , 0x7f2e8c080190]
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.
Max allocation limit: 1345394688
Global mem size: 1805643776
Couldn't create OpenCL command queue (error: -6)!
OpenCL shutdown complete!
initialize_ocl returned error [2013]
OCL context null
OCL queue null
Error generating generic FFT context object [5]
16:46:15 (32530): [CRITICAL]: ERROR: MAIN() returned with error '5'
FPU status flags:
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah0104R_644.0_0_0.0_685114_1_0.out.cohfu': No such file or directory
16:46:26 (32530): [normal]: done. calling boinc_finish(69).
16:46:26 (32530): called boinc_finish
</stderr_txt>
]]>

~$ sudo clinfo
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (2671.3)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Host timer resolution 1ns
Platform Extensions function suffix AMD

Platform Name AMD Accelerated Parallel Processing
Number of devices 2
Device Name Pitcairn
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 1.2 AMD-APP (2671.3)
Driver Version 2671.3
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Board Name (AMD) AMD Radeon HD 7800 Series
Device Topology (AMD) PCI-E, 01:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 10
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 1010MHz
Graphics IP (AMD) 6.0
Device Partition (core)
Max number of sub-devices 10
Supported partition types None
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple 64
Wavefront width (AMD) 64
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 32, Little-Endian
Global memory size 1794813952 (1.672GiB)
Global free memory (AMD) <printDeviceInfo:75: get number of CL_DEVICE_GLOBAL_FREE_MEMORY_AMD : error -33>
Global memory channels (AMD) 8
Global memory banks per channel (AMD) 16
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 1336189337 (1.244GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 2048 bits (256 bytes)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Local
Local memory size 32768 (32KiB)
Local memory syze per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 65536 (64KiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 1539660304335763334ns (Tue Oct 16 11:25:04 2018)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
Number of async queues (AMD) 2
Max real-time compute queues (AMD) 0
Max real-time compute units (AMD) 0
SPIR versions 1.2
printf() buffer size 4194304 (4MiB)
Built-in kernels
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

Device Name Pitcairn
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 1.2 AMD-APP (2671.3)
Driver Version 2671.3
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Board Name (AMD) AMD Radeon HD 7800 Series
Device Topology (AMD) PCI-E, 02:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 10
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 1010MHz
Graphics IP (AMD) 6.0
Device Partition (core)
Max number of sub-devices 10
Supported partition types None
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple 64
Wavefront width (AMD) 64
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 32, Little-Endian
Global memory size 2111901696 (1.967GiB)
Global free memory (AMD) printDeviceInfo:75: get number of CL_DEVICE_GLOBAL_FREE_MEMORY_AMD : error -33
Global memory channels (AMD) 8
Global memory banks per channel (AMD) 16
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 1576064614 (1.468GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 2048 bits (256 bytes)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Local
Local memory size 32768 (32KiB)
Local memory syze per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 65536 (64KiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 1539660304335763334ns (Tue Oct 16 11:25:04 2018)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No

Cyanr & Cinny
Joined: 14 Mar 06
Posts: 7
Credit: 118140284
RAC: 219041

I looked up OpenCL error -6 at:

https://streamhpc.com/blog/2013-04-28/opencl-error-codes/

clinfo cannot report the free global memory either (error -33 in the output above).
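For reference, the two codes seen so far can be decoded with a small lookup. The sketch below is a shell function (the name cl_error_name is made up for illustration) that covers only a handful of values from the OpenCL header CL/cl.h; it is not a complete table.

```shell
# Map a few OpenCL status codes to their names as defined in CL/cl.h.
# Only a small subset is listed; anything else is reported as unknown.
cl_error_name() {
    case "$1" in
        0)   echo "CL_SUCCESS" ;;
        -4)  echo "CL_MEM_OBJECT_ALLOCATION_FAILURE" ;;
        -5)  echo "CL_OUT_OF_RESOURCES" ;;
        -6)  echo "CL_OUT_OF_HOST_MEMORY" ;;
        -33) echo "CL_INVALID_DEVICE" ;;
        *)   echo "unknown ($1)" ;;
    esac
}

cl_error_name -6    # code from the failed command queue creation
cl_error_name -33   # code from the clinfo free-memory query
```

So the command queue failure is reported as a host-side memory problem, which does not by itself say whether the driver or the application is at fault.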

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117671082659
RAC: 35167060

Cyanryaku wrote:
Well... I guess my AMD/ATI GPU may not be compatible with v1.18 since I am using amdgpu/pro with amd OpenCL...

It's probably also to do with the fact that GCN v1.0 and v1.1 (Southern Islands (SI) and Sea Islands (CIK)) GPUs are not yet fully supported by amdgpu.  It's still a work in progress but perhaps with kernel 4.19.x (due for release very shortly) those early versions of the GCN architecture will finally be fully supported.

I run 2 main classes of AMD GPUs under Linux.  One group are Pitcairn series (SI) eg. HD7850 and R7 370, which I deliberately run under the older fglrx driver which works just fine.  My distro (PCLinuxOS) is a rolling release and I maintain a full local copy of the repository.  That means I can always install and update to particular points in time.  All these machines are updated to mid 2016 when the appropriate versions of Xorg and fglrx were last available in the repo.

The other group are all Polaris GPUs with quite recent kernels (4.18.x) and the amdgpu kernel module.  On top of that I install the OpenCL libs from the Red Hat version (RPM) of what used to be called AMDGPU-PRO.  I started doing this with the 16.60 version by just selecting the small number of OpenCL related packages that were needed.  These days I'm up to the 18.30 version (the latest).  This also works just fine.

Periodically, I will take a machine with a HD7850 and install the latest kernel with the latest amdgpu kernel module and then install the latest set of OpenCL libs.  By using the appropriate SI and CIK kernel parameters everything boots nicely and clinfo seems to show everything in good order.  When I launch BOINC, the GPU and its OpenCL capabilities are correctly recognised and tasks are downloaded.  However, they crash after about 20 sec of run time.  So I put everything back to fglrx in that machine for the moment.

I don't know exactly where the problem is (I'm not a programmer) but I believe there is still something missing with the proper support of SI and CIK under the amdgpu kernel module.  My plan is to try again once 4.19.x becomes available - a week or two I believe.

 

Cheers,
Gary.

Cyanr & Cinny
Joined: 14 Mar 06
Posts: 7
Credit: 118140284
RAC: 219041

I have already posted a support ticket on AMD's community board to see what advice they can offer. For now, I have had to *disable* GPU support in the Einstein@Home project preferences so that no more GPU tasks are downloaded. :-S

Cyanr & Cinny
Joined: 14 Mar 06
Posts: 7
Credit: 118140284
RAC: 219041

I searched Google and found old articles on github.com in which some people reported the same issue with earlier amdgpu/pro OpenCL releases.

It turns out the driver needs some environment variables to work properly. There are two ways to set them.

1) Add a new script, /etc/profile.d/amdgpu.sh:

export GPU_FORCE_64BIT_PTR=1
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100

2) Or modify /lib/systemd/system/boinc-client.service:

[Service]
Environment=GPU_SINGLE_ALLOC_PERCENT=100
Environment=GPU_MAX_HEAP_SIZE=100
Environment=GPU_FORCE_64BIT_PTR=1
Environment=GPU_USE_SYNC_OBJECTS=1
Environment=GPU_MAX_ALLOC_PERCENT=100

ProtectHome=true
Type=simple
Nice=10
User=boinc
WorkingDirectory=/var/lib/boinc
ExecStart=/usr/bin/boinc --redirectio
ExecStop=/usr/bin/boinccmd --quit
ExecReload=/usr/bin/boinccmd --read_cc_config
ExecStopPost=/bin/rm -f lockfile
IOSchedulingClass=idle
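A possible alternative to editing the packaged unit file, which a package upgrade may overwrite: systemd also reads drop-in files from /etc/systemd/system/<unit>.d/. The sketch below writes the same five variables into such a drop-in. The function name write_boinc_dropin and the file name amdgpu-env.conf are made up for illustration, and the unit name boinc-client.service is taken from the path above, so check it on your own distro.

```shell
# Write a systemd drop-in that carries the five GPU_* variables.
# $1 = drop-in directory (defaults to the standard location for boinc-client.service).
write_boinc_dropin() {
    dropin_dir="${1:-/etc/systemd/system/boinc-client.service.d}"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/amdgpu-env.conf" <<'EOF'
[Service]
Environment=GPU_FORCE_64BIT_PTR=1
Environment=GPU_USE_SYNC_OBJECTS=1
Environment=GPU_MAX_ALLOC_PERCENT=100
Environment=GPU_SINGLE_ALLOC_PERCENT=100
Environment=GPU_MAX_HEAP_SIZE=100
EOF
}

# Typical use (as root), followed by a reload and restart:
#   write_boinc_dropin
#   systemctl daemon-reload
#   systemctl restart boinc-client
```

The drop-in only adds the Environment= lines; everything else in the [Service] section still comes from the original unit file.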

Hope this helps others.

With these settings, /opt/amdgpu-pro/bin/clinfo reports:
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 1232203776
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117671082659
RAC: 35167060

Cyanryaku wrote:
I searched Google and found old articles on github.com in which some people reported the same issue with earlier amdgpu/pro OpenCL releases.

Congratulations on getting your SI GPU working.  Would you happen to have a link to the information you mention, thanks?

Cyanryaku wrote:

export GPU_FORCE_64BIT_PTR=1
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100

In view of your success report, I booted one of my machines with an R7 370 (SI) GPU using the test OS install on a separate disk that I tried a month or two ago.  First of all I upgraded the install to the very latest kernel (4.18.16) and amdgpu module.  I also updated the OpenCL libs to the latest 18.30 version from the amdgpu-pro package from AMD.  The previous test I had done had used 18.20.

I launch BOINC (which resides in a subdir of my home directory) with my own launch script.  For a quick test, I just added the environment variables to the launch script, rather than edit a system file.  I added the exact variables as you have listed them above.  My launch script also sets LD_LIBRARY_PATH to allow BOINC to know where the OpenCL libs are installed.
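A launch script of the kind described might look roughly like the sketch below. It is not the actual script: BOINC_DIR, OPENCL_LIB_DIR, their default values and both function names are placeholders chosen for illustration, and the client is assumed to be an executable called boinc inside BOINC_DIR.

```shell
#!/bin/sh
# Placeholder locations; adjust both to the local install.
BOINC_DIR="${BOINC_DIR:-$HOME/BOINC}"
OPENCL_LIB_DIR="${OPENCL_LIB_DIR:-/opt/amdgpu-pro/lib64}"

# Export the five variables quoted above so that child processes inherit them.
set_amd_opencl_env() {
    export GPU_FORCE_64BIT_PTR=1
    export GPU_USE_SYNC_OBJECTS=1
    export GPU_MAX_ALLOC_PERCENT=100
    export GPU_SINGLE_ALLOC_PERCENT=100
    export GPU_MAX_HEAP_SIZE=100
}

# Set the environment, then replace this shell with the BOINC client.
launch_boinc() {
    set_amd_opencl_env
    # Prepend the OpenCL library directory, keeping any existing search path.
    export LD_LIBRARY_PATH="$OPENCL_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    cd "$BOINC_DIR" || return 1
    exec ./boinc --redirectio
}

# launch_boinc   # uncomment to start the client
```

Because the variables are exported before the exec, the client and the science apps it spawns should see them in the same way as if they had come from a shell startup file.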

As before, BOINC started fine and the event log showed the GPU being correctly detected.  I set a small work cache size and downloaded a couple of tasks.  They were set to crunch one at a time.  The first one launched, ran for about 15 seconds and then gave a computation error - the same result as when I last tried this test with the 4.18.2 kernel.

Do you know any reason why including the environment variables in the launch script wouldn't work just the same as putting them in the shell's startup files?

How is your Pitcairn GPU performing?  Are the tasks validating?  Have you tried running 2 concurrent tasks?  Would you mind sharing a link to the host ID so I can look for myself, thanks?

 For the moment, I've swapped the hard disk back to the old install with the fglrx driver and the machine is crunching again.  It only takes a minute or two to reboot with the alternate disk so I'm keen to try again if I can work out what the problem is.  A link to the information you found would be really helpful, thanks.

 

Cheers,
Gary.

Cyanr & Cinny
Joined: 14 Mar 06
Posts: 7
Credit: 118140284
RAC: 219041

Which user account do you run BOINC under? The boinc account? Could you make sure that those 5 environment variables are actually set for that account and passed on to the amdgpu OpenCL runtime, e.g. with "sudo -E -u boinc xxx"? Or consider putting them in /etc/profile.d/boinc.sh.

Do your best to get the variables passed through first.
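One way to confirm that the variables really reached the client is to read the environment of the running process from /proc, which on Linux exposes it as a NUL-separated file. The function name gpu_vars_in_environ below is made up for illustration, and the process name boinc in the usage comment is an assumption.

```shell
# Print the GPU_* variables found in a NUL-separated environ file, sorted.
# $1 = path such as /proc/<pid>/environ
gpu_vars_in_environ() {
    tr '\0' '\n' < "$1" | grep '^GPU_' | sort
}

# Typical use against the oldest running process named exactly "boinc"
# (needs permission to read that process's /proc entry):
#   gpu_vars_in_environ "/proc/$(pgrep -o -x boinc)/environ"
```

If nothing is printed, the variables were not in the client's environment when it started, whichever file they were written to.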

How about reading this one?

https://www.easycryptomining.com/ethereum_ubuntu_16.html
