Yesterday I got many WUs for a newly downloaded application, which uses OpenCL instead of CUDA:
Gamma-ray pulsar binary search #1 on GPUs v1.12 (FGRPopencl-Beta-nvidia-mav) x86_64-apple-darwin
Many of them stopped after a few seconds with errors, and some used only the CPU,
but all of them show the same OpenCL errors:
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<stderr_txt>
19:42:13 (879): [normal]: This Einstein@home App was built at: Nov 25 2016 13:32:38
The FGRPB1G application is currently marked as Beta, which means only users who have enabled test applications will get it. They currently can't opt out of specific applications (Beta takes precedence): right now it's either all Beta apps or no Beta apps. That's why you can't select the application in your project preferences; the app listed there is the CPU-only app that was already running.
So now I've been testing a bit on this system:
AMD FX-8320E 8-core processor, one GTX 750Ti 2GB, one GTX 750 1GB, running Debian jessie + backports, NVIDIA driver 367.57.
There are minor issues with progress reporting, which seems to be faked, and with estimated run times, which are much too high. Other than that, the tasks finish fine in all configurations that I have tested: one task, two tasks on two GPUs, two tasks on a single GPU. BUT ... they somehow starve the CPU tasks. My CPU is always idle up to 12% (1 core) beyond what I define in the BOINC settings. Still, the GPU tasks always run close to 100% CPU+GPU while the CPU tasks take the penalty. IMO something needs to be done about that if this application is not meant for GPU-only systems.
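On the run-time estimates: the stderr log posted further down in this thread (from a different host) has a debug line reading "2.1e+15 fp, 7e+09 fp/s, 299707 s, 83h15m06s95". A rough sketch of the arithmetic behind such a line, assuming the first number is the estimated floating-point operation count and the second the assumed device speed (my reading of the line, not something the project has confirmed); the function names are made up for illustration:

```python
def estimated_runtime(fpops: float, flops: float) -> float:
    """Estimated run time in seconds: operation count divided by speed."""
    return fpops / flops


def format_hms(seconds: float) -> str:
    """Format a duration in whole seconds as e.g. '83h15m07s'."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"


if __name__ == "__main__":
    # Rounded values as printed in the debug line.
    est = estimated_runtime(2.1e15, 7e9)
    print(est, format_hms(est))
    # The seconds figure from the same debug line, for comparison.
    print(format_hms(299707))
```

With the rounded inputs this gives 300000 s, close to the 299707 s in the log, so the estimate looks like a plain division. If that reading is right, an estimate that is "much too high" would point to the operation count or the assumed speed being off, not to the tasks themselves.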
Will the Gamma-ray search become a single search at a later date, instead of the two we have now: the normal one (CPU only) and the "G" one (for CPU and GPU)?
Once we are confident that the GPU versions give the correct results, we will disable the CPU versions of FGRPB1G and run this as a GPU-only application. Most likely there will still be workunits for the CPU application, so we will end up with a CPU and a GPU app, similar to BRP4(G).
BM
Yesterday I stopped getting the Betas on my dual GTX 960 installation. According to the somewhat cryptic log message, it appears that they have stopped sending work to multiple-card installations:
2016-12-06 23:06:37.8850 [PID=10979] Only one Beta app version result per WU (#264853
2016-12-06 23:06:37.4277 [PID=10973] SCHEDULER_REQUEST::parse(): unrecognized: <allow_multiple_clients>0</allow_multiple_clients>
I presume they are fixing the problem that I noted above when running on two cards, but we will see.
Christian Beer wrote:
>The FGRPB1G application is currently marked as Beta, which means only users who have enabled test applications will get it. They currently can't opt out of specific applications (Beta takes precedence): right now it's either all Beta apps or no Beta apps. That's why you can't select the application in your project preferences; the app listed there is the CPU-only app that was already running.
Interesting bit about Beta taking precedence.
I just had a 16-thread machine request CPU work, expecting to get the "G" type work units (actually wanting to get those types to help testing), but instead it downloaded 20 standard AVX work units with not a "G" in sight.
As there are more "G" type ready to send than AVX types, I just found it strange that I didn't get any. They must have all been allocated to other platforms at the time or something.
Conan
Jim1348 wrote:
>Yesterday I stopped getting the Betas on my dual GTX 960 installation. According to the somewhat cryptic log message, it appears that they have stopped sending work to multiple-card installations:
>2016-12-06 23:06:37.8850 [PID=10979] Only one Beta app version result per WU (#264853
>2016-12-06 23:06:37.4277 [PID=10973] SCHEDULER_REQUEST::parse(): unrecognized: <allow_multiple_clients>0</allow_multiple_clients>
>I presume they are fixing the problem that I noted above when running on two cards, but we will see.
<allow_multiple_clients> is used for running multiple BOINC clients on the same computer, and Einstein's scheduler is too old to recognize that tag, hence the message in the log. It has nothing to do with running multiple GPU cards. To get rid of the message, edit your cc_config.xml and remove the tag.
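For anyone who prefers not to hand-edit the file, here is a minimal sketch of removing that tag with Python's standard xml.etree.ElementTree. The sample cc_config.xml contents are made up for illustration (your file will have different options), and the function name strip_tag is hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical cc_config.xml contents, for illustration only.
SAMPLE = """<cc_config>
  <options>
    <allow_multiple_clients>0</allow_multiple_clients>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>"""


def strip_tag(xml_text: str, tag: str = "allow_multiple_clients") -> str:
    """Return the XML with every element named `tag` removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the tree first so removal does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(strip_tag(SAMPLE))
```

Make a backup of the real file before overwriting it. The client most likely needs to re-read its config files or be restarted before the change takes effect.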
Yesterday I got many WUs for a newly downloaded application, which uses OpenCL instead of CUDA:
Gamma-ray pulsar binary search #1 on GPUs v1.12 (FGRPopencl-Beta-nvidia-mav) x86_64-apple-darwin
Many of them stopped after a few seconds with errors, and some used only the CPU,
but all of them show the same OpenCL errors:
19:42:13 (879): [normal]: Start of BOINC application 'hsgamma_FGRPB1G_1.12_x86_64-apple-darwin__FGRPopencl-Beta-nvidia-mav'.
19:42:13 (879): [debug]: 2.1e+15 fp, 7e+09 fp/s, 299707 s, 83h15m06s95
command line: hsgamma_FGRPB1G_1.12_x86_64-apple-darwin__FGRPopencl-Beta-nvidia-mav --inputfile ../../projects/einstein.phys.uwm.edu/LATeah2003L.dat --alpha 4.42281478648 --delta -0.0345027837249 --skyRadius 2.152570e-06 --ldiBins 15 --f0start 12 --f0Band 8 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-11 --f1dotBand 1e-12 --df1dot 3.344368011e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah2003L_0020_1368.dat --debug 1 --device 1 -o LATeah2003L_20.0_0_-9e-12_1368_0_0.out
output files: 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah2003L_20.0_0_-9e-12_1368_0_0' 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah2003L_20.0_0_-9e-12_1368_0_1'
19:42:13 (879): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
19:42:13 (879): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x2022700 , 0x7fff0000]
Using OpenCL platform provided by: Apple
Using OpenCL device "GeForce GTX 980 Ti" by: NVIDIA
Max allocation limit: 1610612736
OpenCL device has FP64 support
% Opening inputfile: ../../projects/einstein.phys.uwm.edu/LATeah2003L.dat
% Total amount of photon times: 30007
% Preparing toplist of length: 10
% Read 36 binary points
read_checkpoint(): Couldn't open file 'LATeah2003L_20.0_0_-9e-12_1368_0_0.out.cpt': No such file or directory (2)
% fft_size: 16777216 (0x1000000); alloc: 67108872
% Sky point 1/1
% Binary point 1/36
% Creating FFT plan.
% fft length: 16777216 (0x1000000)
Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'
FwdRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4162:1: warning: no previous prototype for function 'InvRad8B1'
InvRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4200:1: warning: no previous prototype for function 'FwdPass0'
FwdPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4325:1: warning: no previous prototype for function 'FwdPass1'
FwdPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4556:1: warning: no previous prototype for function 'FwdPass2'
FwdPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4787:1: warning: no previous prototype for function 'FwdPass3'
FwdPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4925:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:4940:1: warning: no previous prototype for function 'InvPass0'
InvPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5065:1: warning: no previous prototype for function 'InvPass1'
InvPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5296:1: warning: no previous prototype for function 'InvPass2'
InvPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5527:1: warning: no previous prototype for function 'InvPass3'
InvPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5665:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:5702:8: warning: assigning to '__global __float2 *' from 'const __global __float2 *' discards qualifiers
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
<program source>:5696:7: warning: unused variable 'rw'
uint rw = 1;
^
<program source>:5732:8: warning: assigning to '__global __float2 *' from 'const __global __float2 *' discards qualifiers
lwbIn = gbIn + iOffset;
^ ~~~~~~~~~~~~~~
<program source>:5726:7: warning: unused variable 'rw'
uint rw = 1;
^
Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:25:1: warning: no previous prototype for function 'TW3step'
TW3step(size_t u)
^
Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:4124:1: warning: no previous prototype for function 'FwdRad8B1'
FwdRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4162:1: warning: no previous prototype for function 'InvRad8B1'
InvRad8B1(float2 *R0, float2 *R4, float2 *R2, float2 *R6, float2 *R1, float2 *R5, float2 *R3, float2 *R7)
^
<program source>:4200:1: warning: no previous prototype for function 'FwdPass0'
FwdPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4325:1: warning: no previous prototype for function 'FwdPass1'
FwdPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4556:1: warning: no previous prototype for function 'FwdPass2'
FwdPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4787:1: warning: no previous prototype for function 'FwdPass3'
FwdPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:4925:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:4940:1: warning: no previous prototype for function 'InvPass0'
InvPass0(uint rw, uint b, uint me, uint inOffset, uint outOffset, __global float2 *bufIn, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5065:1: warning: no previous prototype for function 'InvPass1'
InvPass1(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5296:1: warning: no previous prototype for function 'InvPass2'
InvPass2(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __local float *bufOutRe, __local float *bufOutIm, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5527:1: warning: no previous prototype for function 'InvPass3'
InvPass3(uint rw, uint b, uint me, uint inOffset, uint outOffset, __local float *bufInRe, __local float *bufInIm, __global float2 *bufOut, float2 *R0, float2 *R1, float2 *R2, float2 *R3, float2 *R4, float2 *R5, float2 *R6, float2 *R7, float2 *R8, float2 *R9, float2 *R10, float2 *R11, float2 *R12, float2 *R13, float2 *R14, float2 *R15)
^
<program source>:5665:19: warning: incompatible pointer types initializing '__global __float4 *' with an expression of type '__global __float2 *'
__global float4 *buff4g = bufOut;
^ ~~~~~~
<program source>:5694:7: warning: unused variable 'rw'
uint rw = 1;
^
<program source>:5720:7: warning: unused variable 'rw'
uint rw = 1;
^
% Scratch buffer size: 136314880
% Starting semicoherent search over f0 and f1.
% nf1dots: 301 df1dot: 3.344368011e-15 f1dot_start: -1e-11 f1dot_band: 1e-12
% Filling array of photon pairs
.
.
.
.
.
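A side note on the build log above: although each block is introduced with "Error in OpenCL context", the individual compiler diagnostics are all tagged "warning". A small sketch that tallies diagnostics by severity, assuming the clang-style "<program source>:LINE:COL: severity: message" format seen in the excerpt; the function name summarize and the short sample (lines copied from the log) are only for illustration:

```python
import re
from collections import Counter

# Matches clang-style diagnostics as they appear in the log excerpt.
DIAG = re.compile(r"^<program source>:(\d+):(\d+): (warning|error): (.*)$")

# A few lines copied from the log above.
SAMPLE = """Error in OpenCL context: OpenCL Build Warning : Compiler build log:
<program source>:25:1: warning: no previous prototype for function 'TW3step'
TW3step(size_t u)
^
<program source>:5696:7: warning: unused variable 'rw'
uint rw = 1;
^"""


def summarize(log_text: str):
    """Count diagnostics by severity and by message kind."""
    severities = Counter()
    messages = Counter()
    for line in log_text.splitlines():
        match = DIAG.match(line.strip())
        if match:
            severities[match.group(3)] += 1
            # Replace quoted identifiers so similar messages group together.
            messages[re.sub(r"'[^']*'", "'*'", match.group(4))] += 1
    return severities, messages


if __name__ == "__main__":
    sev, msgs = summarize(SAMPLE)
    print(dict(sev))
    for text, count in msgs.most_common():
        print(count, text)
```

If the full stderr of a failed task also shows zero "error" diagnostics, the build log on its own probably does not explain why the tasks stop after a few seconds, and the cause would have to be looked for in the part of the output that was cut off.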
>Most likely there will still be workunits for the CPU application
Yes, please keep the CPU version of FGRPB1.
Do these tasks require DPFP (FP64)? If so, what portion of the work uses DPFP?
-----