Gamma-ray pulsar binary search #1 on GPUs

Robert G
Joined: 6 Nov 16
Posts: 2
Credit: 11,918,033
RAC: 0

Yesterday I got many WUs for a newly downloaded application that uses OpenCL instead of CUDA:

Gamma-ray pulsar binary search #1 on GPUs v1.12 (FGRPopencl-Beta-nvidia-mav) x86_64-apple-darwin

Many of them stopped after a few seconds with errors, and some used only the CPU, but all of them show the same OpenCL errors:

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
19:42:13 (879): [normal]: This Einstein@home App was built at: Nov 25 2016 13:32:38

19:42:13 (879): [normal]: Start of BOINC application 'hsgamma_FGRPB1G_1.12_x86_64-apple-darwin__FGRPopencl-Beta-nvidia-mav'.
19:42:13 (879): [debug]: 2.1e+15 fp, 7e+09 fp/s, 299707 s, 83h15m06s95
command line: hsgamma_FGRPB1G_1.12_x86_64-apple-darwin__FGRPopencl-Beta-nvidia-mav --inputfile ../../projects/einstein.phys.uwm.edu/LATeah2003L.dat --alpha 4.42281478648 --delta -0.0345027837249 --skyRadius 2.152570e-06 --ldiBins 15 --f0start 12 --f0Band 8 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-11 --f1dotBand 1e-12 --df1dot 3.344368011e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah2003L_0020_1368.dat --debug 1 --device 1 -o LATeah2003L_20.0_0_-9e-12_1368_0_0.out
output files: 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah2003L_20.0_0_-9e-12_1368_0_0' 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah2003L_20.0_0_-9e-12_1368_0_1'
19:42:13 (879): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
19:42:13 (879): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x2022700 , 0x7fff0000]
Using OpenCL platform provided by: Apple
Using OpenCL device "GeForce GTX 980 Ti" by: NVIDIA
Max allocation limit: 1610612736
OpenCL device has FP64 support
% Opening inputfile: ../../projects/einstein.phys.uwm.edu/LATeah2003L.dat
% Total amount of photon times: 30007
% Preparing toplist of length: 10
% Read 36 binary points
read_checkpoint(): Couldn't open file 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out.cpt': No such file or directory (2)
% fft_size: 16777216 (0x1000000); alloc: 67108872
% Sky point 1/1
% Binary point 1/36
% Creating FFT plan.
% fft length: 16777216 (0x1000000)
Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'
FwdRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4162:1: warning: no previous prototype for function 'InvRad8B1'
InvRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4200:1: warning: no previous prototype for function 'FwdPass0'
FwdPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4325:1: warning: no previous prototype for function 'FwdPass1'
FwdPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4556:1: warning: no previous prototype for function 'FwdPass2'
FwdPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4787:1: warning: no previous prototype for function 'FwdPass3'
FwdPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4925:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:4940:1: warning: no previous prototype for function 'InvPass0'
InvPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5065:1: warning: no previous prototype for function 'InvPass1'
InvPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5296:1: warning: no previous prototype for function 'InvPass2'
InvPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5527:1: warning: no previous prototype for function 'InvPass3'
InvPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5665:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:5702:8: warning: assigning to '__global __float2 *' from 'const __global __float2 *' discards qualifiers
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
<program source>:5696:7: warning: unused variable 'rw'
uint rw = 1;
^
<program source>:5732:8: warning: assigning to '__global __float2 *' from 'const __global __float2 *' discards qualifiers
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
<program source>:5726:7: warning: unused variable 'rw'
uint rw = 1;
^

Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:25:1: warning: no previous prototype for function 'TW3step'
TW3step(size_t u)
^

Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'
FwdRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4162:1: warning: no previous prototype for function 'InvRad8B1'
InvRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4200:1: warning: no previous prototype for function 'FwdPass0'
FwdPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4325:1: warning: no previous prototype for function 'FwdPass1'
FwdPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4556:1: warning: no previous prototype for function 'FwdPass2'
FwdPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4787:1: warning: no previous prototype for function 'FwdPass3'
FwdPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4925:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:4940:1: warning: no previous prototype for function 'InvPass0'
InvPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5065:1: warning: no previous prototype for function 'InvPass1'
InvPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5296:1: warning: no previous prototype for function 'InvPass2'
InvPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5527:1: warning: no previous prototype for function 'InvPass3'
InvPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5665:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:5694:7: warning: unused variable 'rw'
uint rw = 1;
^
<program source>:5720:7: warning: unused variable 'rw'
uint rw = 1;
^

% Scratch buffer size: 136314880
% Starting semicoherent search over f0 and f1.
% nf1dots: 301 df1dot: 3.344368011e-15 f1dot_start: -1e-11 f1dot_band: 1e-12
% Filling array of photon pairs
.
.
.
.
.
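
For reading a log like the one above, the compiler diagnostics can be tallied by severity. The following is a minimal Python sketch, not part of the Einstein@Home application: the function name `summarize_build_log` is hypothetical, and the sample is an abridged copy of a few lines from this log. In the excerpt posted here, every diagnostic listed under "Error in OpenCL context: OpenCL Build Warning" is a `warning:` line.

```python
import re
from collections import Counter

# Matches clang-style diagnostics such as
# "<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'"
DIAG_RE = re.compile(r"^<program source>:(\d+):(\d+): (warning|error): (.*)$")


def summarize_build_log(text):
    """Count compiler diagnostics in a stderr excerpt by (severity, message kind)."""
    counts = Counter()
    for line in text.splitlines():
        match = DIAG_RE.match(line.strip())
        if not match:
            continue  # source echo lines, caret markers, and other output
        severity, message = match.group(3), match.group(4)
        # Replace quoted identifiers so that similar messages group together.
        kind = re.sub(r"'[^']*'", "'...'", message)
        counts[(severity, kind)] += 1
    return counts


# Abridged sample taken from the log above (source echo lines omitted).
sample = """\
Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'
^
<program source>:4162:1: warning: no previous prototype for function 'InvRad8B1'
^
<program source>:5696:7: warning: unused variable 'rw'
"""

if __name__ == "__main__":
    for (severity, kind), n in sorted(summarize_build_log(sample).items()):
        print(f"{n:3d} {severity}: {kind}")
```

Running the same function over the full stderr text of a failed task would show whether any `error:` diagnostics appear alongside the warnings.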

Christian Beer
Moderator
Joined: 9 Feb 05
Posts: 595
Credit: 96,908,763
RAC: 294

The FGRPB1G application is currently marked as Beta, which means only users who have enabled test applications will get it. They currently can't opt out of specific applications (Beta takes precedence); right now it's either all Beta apps or no Beta apps. That's why you can't select the application in your project preferences: the app listed there is the CPU-only app that was already running.

floyd
Joined: 12 Sep 11
Posts: 133
Credit: 186,326,751
RAC: 129

So now I've been testing a bit on this system:
AMD FX-8320E 8-core processor, one GTX 750Ti 2GB, one GTX 750 1GB, running Debian jessie + backports, NVIDIA driver 367.57.

There are minor issues with progress reporting, which seems to be faked, and with estimated run times, which are much too high. Other than that, the tasks finish fine in every configuration I have tested: one task, two tasks on two GPUs, and two tasks on a single GPU. BUT ... they somehow starve the CPU tasks. My CPU is always idle by up to 12% (1 core) beyond what I set in the BOINC settings. Still, the GPU tasks always run close to 100% CPU+GPU while the CPU tasks take the penalty. IMO something needs to be done about that if this application is not meant for GPU-only systems.

Filipe
Joined: 10 Mar 05
Posts: 148
Credit: 245,604,954
RAC: 168

Will the Gamma-Ray search become a single search at a later date, instead of the two we have now: the normal one (CPU only) and the "G" one (for CPU and GPU)?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,962
Credit: 203,684,186
RAC: 29,169

Once we are confident that the GPU versions give the correct results, we will disable the CPU versions of FGRPB1G and run this as a GPU-only application. Most likely there will still be workunits for the CPU application, so we will end up with a CPU and a GPU app, similar to BRP4(G).

BM

puh32
Joined: 5 Dec 15
Posts: 11
Credit: 681,126,672
RAC: 222,842

>Most likely there will still be workunits for the CPU application

Yes, please keep the CPU version of FGRPB1.

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 1,902,902,254
RAC: 612,573

Do these tasks require DPFP (FP64)? If so, what portion of the work uses DPFP?

Jim1348
Joined: 19 Jan 06
Posts: 381
Credit: 201,998,644
RAC: 158

Yesterday I stopped getting the Betas on my dual GTX 960 installation.  According to the somewhat cryptic log message, it appears that they have stopped sending work to multiple-card installations:

2016-12-06 23:06:37.8850 [PID=10979]    Only one Beta app version result per WU (#264853
2016-12-06 23:06:37.4277 [PID=10973]   SCHEDULER_REQUEST::parse(): unrecognized: <allow_multiple_clients>0</allow_multiple_clients>

I presume they are fixing the problem that I noted above when running on two cards, but we will see.

Conan
Joined: 19 Jun 05
Posts: 161
Credit: 5,808,481
RAC: 0

Christian Beer wrote:
The FGRPB1G application is currently marked as Beta, which means only users who have enabled test applications will get it. They currently can't opt out of specific applications (Beta takes precedence); right now it's either all Beta apps or no Beta apps. That's why you can't select the application in your project preferences: the app listed there is the CPU-only app that was already running.

Interesting bit about Beta taking precedence.

I just had a 16-thread machine request CPU work, expecting to get the "G" type work units (I actually wanted those, to help with testing), but instead it downloaded 20 standard AVX work units with not a "G" in sight.

As there are more "G" type work units ready to send than AVX ones, I found it strange that I didn't get any. They must have all been allocated to other platforms at the time, or something.

Conan

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 804,548,876
RAC: 179,828

Jim1348 wrote:

Yesterday I stopped getting the Betas on my dual GTX 960 installation.  According to the somewhat cryptic log message, it appears that they have stopped sending work to multiple-card installations:

2016-12-06 23:06:37.8850 [PID=10979]    Only one Beta app version result per WU (#264853
2016-12-06 23:06:37.4277 [PID=10973]   SCHEDULER_REQUEST::parse(): unrecognized: <allow_multiple_clients>0</allow_multiple_clients>

I presume they are fixing the problem that I noted above when running on two cards, but we will see.

<allow_multiple_clients> is used for running multiple BOINC clients on the same computer, and Einstein's scheduler is too old to recognize that tag, hence the message in the log. It has nothing to do with running multiple GPU cards. To get rid of the error message, edit your cc_config.xml and remove the tags.
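
The cleanup described above can be sketched in Python. This is only an illustration under assumptions: the helper name `strip_tag` and the sample file contents are hypothetical, and a real cc_config.xml may hold other options that should be left untouched.

```python
import xml.etree.ElementTree as ET


def strip_tag(xml_text, tag="allow_multiple_clients"):
    """Return xml_text with every <tag> element removed, other options kept."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of all elements first, so removing children
    # does not interfere with the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Hypothetical cc_config.xml contents, for illustration only.
example = """<cc_config>
  <options>
    <allow_multiple_clients>0</allow_multiple_clients>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>"""

cleaned = strip_tag(example)
print(cleaned)
```

To apply this to a real file, make a backup copy, read the file's text, pass it through `strip_tag`, and write the result back. Editing the file by hand in a text editor achieves the same thing.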
