Am I reading the following output correctly?
boinc_get_opencl_ids returned [0x1392648 , 0x7f1ef1d86100]
Using OpenCL platform provided by: Mesa
Using OpenCL device "AMD ARUBA (DRM 2.50.0 / 4.15.0-24-generic, LLVM 6.0.0)" by: AMD
Max allocation limit: 750002176
Global mem size: 1071431680
OpenCL compiling FAILED! : -11 . Error message: input.cl:7:26: error: unsupported OpenCL extension 'cl_khr_fp64' - ignoring
input.cl:10:30: error: unknown type name 'double2'; did you mean 'double'?
input.cl:10:30: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
OpenCL device has no FP64 support
LLVM ERROR: Cannot select: 0x17d2890: i32,ch = AtomicCmpSwap 0x187d520, 0x19067c0, 0x1906758, 0x19065b8
0x19067c0: i32,ch = CopyFromReg 0x187d520, Register:i32 %196
0x1906418: i32 = Register %196
0x1906758: i32,ch = CopyFromReg 0x187d520, Register:i32 %198
0x19060d8: i32 = Register %198
0x19065b8: i32 = bitcast 0x177f4d0
0x177f4d0: f32 = fadd 0x17d30b0, 0x18f8fe0
0x17d30b0: f32,ch = CopyFromReg 0x187d520, Register:f32 %190
0x177f8e0: f32 = Register %190
0x18f8fe0: f32 = bitcast 0x1906758
0x1906758: i32,ch = CopyFromReg 0x187d520, Register:i32 %198
0x19060d8: i32 = Register %198
In function: kernel_ts_2_phase_diff_sorted
$ clinfo
Number of platforms 1
Platform Name Clover
Platform Vendor Mesa
Platform Version OpenCL 1.1 Mesa 18.0.5
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd
Platform Extensions function suffix MESA
Platform Name Clover
Number of devices 1
Device Name AMD ARUBA (DRM 2.50.0 / 4.15.0-24-generic, LLVM 6.0.0)
Device Vendor AMD
Device Vendor ID 0x1002
Device Version OpenCL 1.1 Mesa 18.0.5
Driver Version 18.0.5
Device OpenCL C Version OpenCL C 1.1
Device Type GPU
Device Profile FULL_PROFILE
Max compute units 2
Max clock frequency 0MHz
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Preferred work group size multiple 64
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Address bits 32, Little-Endian
Global memory size 1071431680 (1022MiB)
Error Correction support No
Max memory allocation 750002176 (715.3MiB)
Unified memory for Host and Device No
Minimum alignment for any data type 128 bytes
Alignment of base address 32768 bits (4096 bytes)
Global Memory cache type None
Image support No
Local memory type Local
Local memory size 32768 (32KiB)
Max constant buffer size 750002176 (715.3MiB)
Max number of constant args 15
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 0ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Available Yes
Compiler Available Yes
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Clover
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [MESA]
clCreateContext(NULL, ...) [default] Success [MESA]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Clover
Device Name AMD ARUBA (DRM 2.50.0 / 4.15.0-24-generic, LLVM 6.0.0)
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name Clover
Device Name AMD ARUBA (DRM 2.50.0 / 4.15.0-24-generic, LLVM 6.0.0)
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.8
ICD loader Profile OpenCL 1.2
I'm reading this as:
“Your device does not support double2 and is hence useless”
Aside: I have to wrap each chunk of output in code tags rather than the whole lot. Apparently a newline confuses things. A pre tag has no effect. Sigh.
Copyright © 2024 Einstein@Home. All rights reserved.
Double precision (FP64) support is not required for Einstein@Home to use the GPU; a working OpenCL driver configuration, on the other hand, is.
So maybe the Mesa driver you have installed isn't up to the task? Maybe try another driver?
When it comes to Linux I'm clueless but I'm confident that Gary R can help or give hints on this.
Holmis wrote: ... I'm confident that Gary R can help or give hints on this.
I feel your confidence may be a little misplaced :-).
All my machines run Linux and virtually all use graphics cards so, through trial and error, I do know how to make certain hardware architectures work correctly with the Einstein apps. I'll describe what I *think* but with no guarantees that it's the complete picture by any means.
With unknown hardware, I always go to this huge table and find the correct sub-section - in this case the HD 7000 series. If you scroll down to just below the discrete GPUs, there is a small section of HD 7000 IGPs and you will find the specs of the 7480D there. Note that none of these have double precision. As Holmis mentions, this is not an insurmountable obstacle but it does mean that using it for crunching is going to be rather inefficient, with double precision calculations needing to be transferred back to a CPU core. The other issue is that GPU tasks need a lot of VRAM - you won't be sent tasks in the first place with less than 1GB. With an IGP style device, I'm not sure how this might behave as far as sharing memory is concerned. The system has to support CPU cores and drive a display as well as handle GPU task crunching.
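To put numbers on the memory point, here is a rough Python sketch that converts the two byte counts from the log above into MiB and compares the global memory size against a 1GB threshold. I don't know whether the scheduler means 10^9 bytes or 2^30 bytes by "1GB", so both readings are checked; the names used here are purely illustrative.

```python
# Byte counts copied from the posted log ("Global mem size" and
# "Max allocation limit").
GLOBAL_MEM_BYTES = 1071431680
MAX_ALLOC_BYTES = 750002176


def to_mib(num_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


# Assumption: the scheduler's "1GB" could be decimal or binary, so test both.
meets_decimal_gb = GLOBAL_MEM_BYTES >= 10**9
meets_binary_gib = GLOBAL_MEM_BYTES >= 2**30

print(f"Global memory:  {to_mib(GLOBAL_MEM_BYTES):.1f} MiB")
print(f"Max allocation: {to_mib(MAX_ALLOC_BYTES):.1f} MiB")
print("At least 10^9 bytes:", meets_decimal_gb)
print("At least 2^30 bytes:", meets_binary_gib)
```

The reported size sits between the two definitions, so an IGP like this one is right on the edge of the limit however it is counted.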
So looking at the first log snip, the scheduler has agreed to send tasks. The first thing that happens is that some code is run to test the DP capability of the hardware. The 'OpenCL compiling failed' message refers to the test routine so at this point it is known that the computations can't be completed on the GPU. Hence the first line of the 2nd snip.
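If you want to check what the driver claims before running anything, a small Python sketch along these lines pulls the extension list out of saved clinfo output. It assumes the text layout shown in the clinfo dump above, and the function name `advertised_extensions` is made up for this example. Bear in mind it only shows what is advertised: in the log above, clinfo lists cl_khr_fp64 and yet the compiler still rejected it.

```python
def advertised_extensions(clinfo_text):
    """Return the set of names found on 'Device Extensions' lines."""
    prefix = "Device Extensions"
    found = set()
    for line in clinfo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Everything after the label is a space-separated list.
            found.update(stripped[len(prefix):].split())
    return found


# A few lines taken from the clinfo output posted above.
sample = """Platform Extensions cl_khr_icd
Device Name AMD ARUBA (DRM 2.50.0 / 4.15.0-24-generic, LLVM 6.0.0)
Device Extensions cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_fp64
"""

extensions = advertised_extensions(sample)
print("cl_khr_fp64 advertised:", "cl_khr_fp64" in extensions)
```

A mismatch between this result and the 'OpenCL compiling FAILED' message is itself a hint that the problem lies in the OpenCL implementation rather than in the science app.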
I think the bit that comes next is the actual showstopper. My guess is that something the science app is trying to do is not being handled correctly by the Clover OpenCL implementation. I have no direct experience with Clover but I've seen a number of people try to use it and fail.
My experience started with an HD 7770 using the proprietary fglrx driver. That was the first AMD card I used and it still runs today, still using fglrx. As a result of that experience, I purchased a number of HD 7850s at end of life when they were heavily discounted. They too all still run the fglrx version of OpenCL. They are still very productive. I intend to keep them as they are, using the final version of fglrx, until such time as there is workable support for Southern Islands/Sea Islands cards in what has replaced fglrx. I note that the OS being used with the 7480D is Ubuntu 16.04. I know nothing about Ubuntu but it seems to me that at about or just before that time, fglrx was still being used. The hardware may very well work immediately if the last version of Ubuntu with fglrx could be installed - just a guess.
18 months ago I decided to upgrade a bunch of CPU only machines (2009/2010 vintage) with RX 4xx GPUs rather than retire them. With a lot of playing around to start with, I was lucky enough to work out what I needed to extract from the 16.60 version of the Red Hat AMDGPU-PRO package to install on top of the open source amdgpu graphics driver to get a working system. I've steered well clear of Clover. My distro of choice is RPM based so the Red Hat package seemed like the best starting point. I'm not a programmer so until a few things dawned on me I was really stumbling around in the dark. I now have quite a few Polaris GPUs from RX 460 to RX 580 and I've extracted and tested components from AMDGPU-PRO from 16.60 through to the latest 18.20. Funnily enough, the more recent OpenCL components perform pretty much identically to those from 16.60 - no improvement there. There has been about a 5-8% performance improvement from newer versions of the amdgpu graphics driver, however. I'm trying to find time to test that a bit more thoroughly at the moment.
Why am I rabbiting on about all this, you might ask. Simply because I believe the best way to a rewarding crunching experience is not to struggle with an old IGP like the 7480D but to find something like a cheap RX 460 second hand and plug that into a PCIe x16 slot. Most of them don't need an external PCIe power connector and they perform well even with old 2009 vintage Pentium dual core systems. I'm running some with 300W PSUs (spec is 270W on the 12V rail). A cheap power meter at the wall shows something like 170W power consumption (it fluctuates quite a bit) when running 1 CPU task and 2 concurrent GPU tasks on a system with a 2GB RX 460. The output of the GPU is virtually identical to that of the same GPU in a much more modern system (Haswell refresh or even Kaby Lake).
Cheers,
Gary.
Ken Sharp wrote: Aside: I have to wrap each chunk of output in code tags rather than the whole lot.
I'm only guessing here because I haven't actually tried it myself. In the BBCode instructions under the message composition area, it mentions that 'enter' creates a new paragraph and that shift-enter creates a new line. It seems to stress using shift-enter rather than enter. I wonder if you could have used a single set of code tags around the whole lot if you had separated each individual block with a pair of shift-enters?
Cheers,
Gary.
This
is
a
test
Apparently not. :-(
Thanks for the lengthy reply above. I'm still messing and will reply properly later.
This line intrigues me:
NOTE: your OpenCL library declares to support OpenCL 1.2,
but it seems to support up to OpenCL 2.1 too.
Ken Sharp wrote: ... Apparently not. :-(
Is the following example, by any chance, what you were trying to achieve?
There was just one set of code tags (the BBCode type in square brackets) around the set of three sub-blocks.
I talked about two shift-enters because if you are cutting and pasting a block of text from a log, the most convenient thing is to have the cursor at the end of the last line of that block. The first shift-enter 'completes' what you have just pasted and the second gives you the blank line ready to paste a second block. The other thing you might be interested in is the third-last icon at the right hand end of the toolbar - paste as plain text - I tend to click that first, before the actual paste operation, when pasting blocks of text from logs.
Cheers,
Gary.
Ah, BBCode. I was using HTML. Still though: <code></code> should work.
Never mind, I'll stick to BB from now on.
Magic!