I keep getting this error

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1592322349
RAC: 772284
Topic 220255

Computer D5, ID: 12499431, just came online as a backup to S@H. It runs SETI just fine and previously ran E@H just fine. Now each task runs for about a minute, then errors out on the GPU.

<core_client_version>7.6.33</core_client_version>

<![CDATA[
<message>
(unknown error) - exit code 1024 (0x400)
</message>
<stderr_txt>
putenv 'LAL_DEBUG_LEVEL=3'
2019-12-21 12:52:57.6095 (11488) [normal]: This program is published under the GNU General Public License, version 2
2019-12-21 12:52:57.6095 (11488) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2019-12-21 12:52:57.6095 (11488) [normal]: This Einstein@home App was built at: Dec 19 2019 12:14:49

2019-12-21 12:52:57.6251 (11488) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_O2MDF_2.07_windows_x86_64__GW-opencl-nvidia.exe'.
Activated exception handling...
[DEBUG} GPU type: 1
[DEBUG} got GPU info from BOINC
[DEBUG} got VendorID 4318
2019-12-21 12:52:58.3906 (11488) [debug]: BSGL output files
2019-12-21 12:52:58.3906 (11488) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2019-12-21 12:52:58.4062 (11488) [debug]: Set up communication with graphics process.

DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.19.2.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)
%% LALPulsar: 1.17.1.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)
%% LALApps: 6.23.0.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)

2019-12-21 12:52:59.0935 (11488) [normal]: Reading input data ... 2019-12-21 12:53:56.1114 (11488) [normal]: Search FstatMethod used: 'ResampOpenCL'
2019-12-21 12:53:56.1114 (11488) [normal]: Recalc FstatMethod used: 'DemodSSE'
2019-12-21 12:53:56.1114 (11488) [normal]: OpenCL Device used for Search/Recalc and/or semi coherent step: 'GeForce GTX 1060 3GB (Platform: NVIDIA CUDA, global memory: 3072 MiB)'
2019-12-21 12:53:56.1114 (11488) [normal]: OpenCL version is used for the semi-coherent step!
2019-12-21 12:54:10.2331 (11488) [normal]: Number of segments: 17, total number of SFTs in segments: 10091
done.
% --- GPS reference time = 1177858472.0000 , GPS data mid time = 1177858472.0000
2019-12-21 12:54:10.2799 (11488) [normal]: dFreqStack = 4.035776e-007, df1dot = 2.558432e-012, df2dot = 1.356969e-018, df3dot = 0.000000e+000
% --- Setup, N = 17, T = 864000 s, Tobs = 19750204 s, gammaRefine = 31, gamma2Refine = 51, gamma3Refine = 1

DEPRECATION WARNING: program has invoked obsolete function InitDopplerSkyScan(). Please see XLALInitDopplerSkyScan() for information about a replacement.
2019-12-21 12:54:10.9985 (11488) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch
% --- Cpt:0, total:141, sky:1/1, f1dot:1/141

0.% --- CG:2118404 FG:123892 f1dotmin_fg:-3.050054795097e-008 df1dot_fg:8.253006451613e-014 f2dotmin_fg:-6.651808823529e-019 df2dot_fg:2.660723529412e-020 f3dotmin_fg:0 df3dot_fg:1
XLAL Error - XLALComputeECLFFT_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:1248): Processing FFT failed: CL_MEM_OBJECT_ALLOCATION_FAILURE
XLAL Error - XLALComputeECLFFT_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:1248): Internal function call failed
XLAL Error - XLALComputeFaFb_Resamp_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:654): Check failed: (*fftfuncs->computefft_func) ( fftfuncs->fftplan, ws->TS_FFT, ((void *)0) ) == XLAL_SUCCESS
XLAL Error - XLALComputeFaFb_Resamp_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:654): Internal function call failed
XLAL Error - XLALComputeFstatResamp_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:441): Check failed: XLALComputeFaFb_Resamp_OpenCL ( resamp, ws, thisPoint, common->dFreq, numFreqBins, TimeSeriesX_SRC_a, TimeSeriesX_SRC_b ) == XLAL_SUCCESS
XLAL Error - XLALComputeFstatResamp_OpenCL (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:441): Internal function call failed
XLAL Error - XLALComputeFstat (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat.c:875): Check failed: (input->method_funcs.compute_func) ( *Fstats, common, input->method_data ) == XLAL_SUCCESS
XLAL Error - XLALComputeFstat (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalpulsar/src/ComputeFstat.c:875): Internal function call failed
MAIN: XLALComputeFstat() failed with errno=1024
2019-12-21 12:54:11.9514 (11488) [CRITICAL]: ERROR: MAIN() returned with error '1024'
2019-12-21 12:54:11.9514 (11488) [debug]: resultfile '../../projects/einstein.phys.uwm.edu/h1_0887.40_O2C02Cl4In0__O2MDFV2_VelaJr1_887.85Hz_62_1_0' (len 92), current config file: 0
Code-version: %% LAL: 6.19.2.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)
%% LALPulsar: 1.17.1.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)
%% LALApps: 6.23.0.1 (CLEAN 98bbe72a728eb25935e9195dafae691335dabf8c)

FPU status flags: PRECISION
2019-12-21 12:54:11.9670 (11488) [debug]: worker done. return(1024) to caller
2019-12-21 12:54:11.9670 (11488) [normal]: done. calling boinc_finish(1024).
12:54:11 (11488): called boinc_finish

</stderr_txt>
]]>

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

The first error in your log is CL_MEM_OBJECT_ALLOCATION_FAILURE: the app could not allocate the memory it needed on the GPU.

Are you running more than one task at a time on the GPU?
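
For anyone reading a similar log: the XLAL errors form a chain, and the first one usually carries the root cause, while the later lines only report the failure bubbling up through the callers. Here is a minimal Python sketch of pulling out that first entry. The function name `first_xlal_error` and the regular expression are illustrative assumptions based only on the line format visible in this log, not part of BOINC or LALSuite.

```python
import re

# Assumed line shape, taken from the log in this thread:
#   XLAL Error - <function> (<source file>:<line>): <message>
XLAL_ERROR = re.compile(r"^XLAL Error - (\w+) \((.+?):(\d+)\): (.*)$")


def first_xlal_error(stderr_text):
    """Return (function, file, line, message) for the first XLAL error, or None."""
    for raw_line in stderr_text.splitlines():
        match = XLAL_ERROR.match(raw_line.strip())
        if match:
            function, path, lineno, message = match.groups()
            return function, path, int(lineno), message
    return None


# Two lines copied from the stderr output posted above.
sample = (
    "XLAL Error - XLALComputeECLFFT_OpenCL (/home/jenkins/workspace/workspace/"
    "EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/"
    "source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:1248): "
    "Processing FFT failed: CL_MEM_OBJECT_ALLOCATION_FAILURE\n"
    "XLAL Error - XLALComputeECLFFT_OpenCL (/home/jenkins/workspace/workspace/"
    "EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/"
    "source/lalsuite/lalpulsar/src/ComputeFstat_Resamp_OpenCL.c:1248): "
    "Internal function call failed\n"
)

result = first_xlal_error(sample)
if result is not None:
    function, path, lineno, message = result
    print(function, lineno, message)
```

Run against the full stderr text, this should point at the FFT step in ComputeFstat_Resamp_OpenCL.c, which is where the memory allocation failed.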

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1592322349
RAC: 772284

Yep, I'm running 2X on a 3GB GTX 1060. That runs just fine on an almost identical dedicated E@H host.

I have switched over to pulsars and I will soon know if they run OK. 

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1592322349
RAC: 772284

Well, I now have 3 of these "Gamma-ray pulsar binary search #1 on GPUs v1.22 () windows_x86_64" tasks that went into pending, so obviously they did not error out.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1592322349
RAC: 772284

I now have 7 pulsars pending and 1 validated, so now I am forced to go for 2nd prize, with a gravitational wave being 1st prize.

Any ideas on how to get this 1060 to run the gravitational wave tasks would really be appreciated.
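
One thing that may be worth trying, assuming the allocation failure comes from two tasks sharing the 3 GB card, is to limit the gravitational wave app to one task per GPU with a BOINC app_config.xml. The Python sketch below only generates that file's contents. The app name `einstein_O2MDF` is an assumption read off the executable name in the log above, and the helper `build_app_config` is hypothetical. Nothing in this thread confirms that a single task fits in 3 GB.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage=1.0, cpu_usage=1.0):
    """Build app_config.xml text; gpu_usage=1.0 asks for a whole GPU per task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# Assumed app name, inferred from
# einstein_O2MDF_2.07_windows_x86_64__GW-opencl-nvidia.exe in the log.
xml_text = build_app_config("einstein_O2MDF")
print(xml_text)
```

The output would be saved as app_config.xml in the project directory (projects/einstein.phys.uwm.edu under the BOINC data directory) and picked up after telling the client to re-read its config files. Other apps, such as the pulsar search, are not named in the file and so keep their current settings.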
