Gamma-Ray Pulsar Search #2 1.04 always fails with compute error

Justin La Sotten
Joined: 9 Dec 05
Posts: 6
Credit: 2930440
RAC: 0
Topic 196978

I'm running a Linux machine (kernel 3.5.0, x86_64) with two NVIDIA Quadro 2000 GPUs.

Each time it downloads the units and starts to process them, they fail about 2-5 seconds in with these errors:

Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_404380_0.0_0 finished
Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404380_0.0_0_0 for task LATeah0024U_944.0_404380_0.0_0 absent
Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404380_0.0_0_1 for task LATeah0024U_944.0_404380_0.0_0 absent
Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Starting task LATeah0024U_944.0_405720_0.0_1 using hsgamma_FGRP2 version 104 in slot 11
Fri 31 May 2013 08:28:03 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_406040_0.0_1 finished
Fri 31 May 2013 08:28:03 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_406040_0.0_1_0 for task LATeah0024U_944.0_406040_0.0_1 absent
Fri 31 May 2013 08:28:03 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_406040_0.0_1_1 for task LATeah0024U_944.0_406040_0.0_1 absent
Fri 31 May 2013 08:28:03 AM EDT | Einstein@Home | Starting task LATeah0024U_944.0_403060_0.0_0 using hsgamma_FGRP2 version 104 in slot 9
Fri 31 May 2013 08:28:04 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_404540_0.0_0 finished
Fri 31 May 2013 08:28:04 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404540_0.0_0_0 for task LATeah0024U_944.0_404540_0.0_0 absent
Fri 31 May 2013 08:28:04 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404540_0.0_0_1 for task LATeah0024U_944.0_404540_0.0_0 absent
Fri 31 May 2013 08:28:04 AM EDT | Einstein@Home | Starting task LATeah0024U_944.0_403260_0.0_1 using hsgamma_FGRP2 version 104 in slot 10
Fri 31 May 2013 08:28:06 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_405720_0.0_1 finished
Fri 31 May 2013 08:28:06 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_405720_0.0_1_0 for task LATeah0024U_944.0_405720_0.0_1 absent
Fri 31 May 2013 08:28:06 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_405720_0.0_1_1 for task LATeah0024U_944.0_405720_0.0_1 absent
Fri 31 May 2013 08:28:07 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_403060_0.0_0 finished
Fri 31 May 2013 08:28:07 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_403060_0.0_0_0 for task LATeah0024U_944.0_403060_0.0_0 absent
Fri 31 May 2013 08:28:07 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_403060_0.0_0_1 for task LATeah0024U_944.0_403060_0.0_0 absent
Fri 31 May 2013 08:28:08 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_403260_0.0_1 finished
Fri 31 May 2013 08:28:08 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_403260_0.0_1_0 for task LATeah0024U_944.0_403260_0.0_1 absent
Fri 31 May 2013 08:28:08 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_403260_0.0_1_1 for task LATeah0024U_944.0_403260_0.0_1 absent
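
For anyone wanting to tally failures like these from a saved event log, here is a minimal sketch. It assumes the log lines look exactly like the ones above; the function name `tasks_with_missing_output` and the `sample` list are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches the client's "Output file ... for task ... absent" messages.
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def tasks_with_missing_output(log_lines):
    """Return a Counter mapping task name -> number of absent output files."""
    missing = Counter()
    for line in log_lines:
        match = ABSENT.search(line)
        if match:
            missing[match.group(2)] += 1
    return missing


# Three lines copied from the log above (illustrative sample only).
sample = [
    "Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Computation for task LATeah0024U_944.0_404380_0.0_0 finished",
    "Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404380_0.0_0_0 for task LATeah0024U_944.0_404380_0.0_0 absent",
    "Fri 31 May 2013 08:28:02 AM EDT | Einstein@Home | Output file LATeah0024U_944.0_404380_0.0_0_1 for task LATeah0024U_944.0_404380_0.0_0 absent",
]

for task, count in tasks_with_missing_output(sample).items():
    print(f"{task}: {count} output file(s) absent")
```

A task that shows up here right after "Computation ... finished" never wrote its result files, which fits an application that exits almost immediately after starting.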

I've done a few things to try to remedy this.
1) I reset the project
2) I turned off Gamma-Ray Pulsar Search #2 in project preferences, updated the project on the machine, and reset the project on the machine.

Neither of them worked. I'm still being sent these work units and they continue to fail.
I'd like to process them, so I'd rather fix the issue than turn them off completely.

Other info:
I'm not running an anonymous platform, just using the stock apps that are auto-downloaded. These are the apps it has currently downloaded:

einsteinbinary_BRP4_1.33_x86_64-pc-linux-gnu__BRP4cuda32nv270
einsteinbinary_BRP5_1.33_x86_64-pc-linux-gnu__BRP4cuda32nv270
einstein_S5R6_1.01_graphics_i686-pc-linux-gnu
hsgamma_FGRP2_1.04_i686-pc-linux-gnu

All other work units seem to process just fine, mostly all the CUDA units.

Any help anyone can give to solve this problem would be greatly appreciated.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2979430805
RAC: 782629

The tasks error with

execv: No such file or directory

You might also mention that you appear to be running an extreme-alpha BOINC v7.1.1.
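
One thing that may be worth checking, as a guess rather than a confirmed diagnosis for this host: on Linux, execv reports "No such file or directory" not only when the program file is missing, but also when the ELF interpreter (dynamic loader) named inside the binary is missing. The failing app listed above is an i686 build on an x86_64 host, so a missing 32-bit loader is one possibility. The sketch below reads the ELF header to show which loader a binary asks for; `elf_interpreter` and `report` are hypothetical helper names.

```python
import os
import struct

PT_INTERP = 3  # program-header type holding the path of the dynamic loader


def elf_interpreter(data):
    """Return (bits, loader_path) for an ELF image; loader_path is None if static."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]   # EI_CLASS
    end = {1: "<", 2: ">"}[data[5]]  # EI_DATA: little or big endian
    if bits == 32:
        (phoff,) = struct.unpack_from(end + "I", data, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 42)
        off_pos, size_pos, word = 4, 16, "I"
    else:
        (phoff,) = struct.unpack_from(end + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", data, 54)
        off_pos, size_pos, word = 8, 32, "Q"
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from(end + "I", data, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from(end + word, data, base + off_pos)
            (p_filesz,) = struct.unpack_from(end + word, data, base + size_pos)
            raw = data[p_offset:p_offset + p_filesz]
            return bits, raw.rstrip(b"\0").decode()
    return bits, None


def report(path):
    """Print the word size of `path` and whether its loader exists on this host."""
    with open(path, "rb") as handle:
        bits, loader = elf_interpreter(handle.read())
    if loader is None:
        print(f"{path}: {bits}-bit, statically linked")
    else:
        state = "present" if os.path.exists(loader) else "MISSING"
        print(f"{path}: {bits}-bit, loader {loader} is {state}")

# Example call, run from the BOINC data directory (its location is an
# assumption and varies by distribution):
# report("projects/einstein.phys.uwm.edu/hsgamma_FGRP2_1.04_i686-pc-linux-gnu")
```

If the loader turns out to be missing, installing the distribution's 32-bit compatibility libraries is the usual remedy; if it is present, the cause lies elsewhere (for example permissions, as suggested further down the thread).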

Neil Newell
Joined: 20 Nov 12
Posts: 176
Credit: 169699457
RAC: 0

I had something very similar on Linux x86-32 (tasks failing shortly after they start). Unfortunately, I can't remember the resolution! Possibly something to do with permissions? (Are other WUs OK?)

Martin P.
Joined: 17 Feb 05
Posts: 162
Credit: 40156217
RAC: 0

Don't know, but I think I have the same problem:

Quote:

7.0.64

(unknown error) - exit code -1 (0xffffffff)

2013-06-18 13:25:00.4726 (4116) [normal]: This program is published under the GNU General Public License, version 2
2013-06-18 13:25:00.4726 (4116) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2013-06-18 13:25:00.4726 (4116) [normal]: This Einstein@home App was built at: Jan 10 2013 13:49:42

2013-06-18 13:25:00.4726 (4116) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S6BucketLVE_1.04_windows_intelx86__SSE2.exe'.
Activated exception handling...
command line: projects/einstein.phys.uwm.edu/einstein_S6BucketLVE_1.04_windows_intelx86__SSE2.exe --Freq=504.513682292 --FreqBand=0.05 --dFreq=1.61431023092e-06 --f1dot=-2.64248266531e-09 --f1dotBand=2.90673093185e-09 --df1dot=5.78907096294e-11 --skyGridFile=../../projects/einstein.phys.uwm.edu/skygrid_GC_Dc0.5_m0.3_0510Hz_S6Bucket.dat --numSkyPartitions=3639 --partitionIndex=648 --gammaRefine=230 --ephemE=../../projects/einstein.phys.uwm.edu/earth_09_11 --ephemS=../../projects/einstein.phys.uwm.edu/sun_09_11 --nCand1=3000 -o ../../projects/einstein.phys.uwm.edu/h1_0504.40_S6GC1__S6BucketLVEa_504.513682292Hz_648_0_0 --gridType=3 --printCand1 --semiCohToplist --segmentList=../../projects/einstein.phys.uwm.edu/S6GC1_T60h_v1_Segments.seg --computeLV --LVrho=0 --LVuseAllTerms=false --SortToplist=3 --recalcToplistStats -d1 --Dterms=8 --DataFiles1=..\..\projects\einstein.phys.uwm.edu\h1_0504.40_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.40_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0504.45_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.45_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0504.50_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.50_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0504.55_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.55_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0504.60_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.60_S6GC1;..\..\projects\einstein.phys.uwm.edu\h1_0504.65_S6GC1;..\..\projects\einstein.phys.uwm.edu\l1_0504.65_S6GC1
2013-06-18 13:25:00.5818 (4116) [debug]: Flags: LAL_NDEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, i386, SSE, SSE2, GNUC X86 GNUX86
2013-06-18 13:25:00.5818 (4116) [debug]: Set up communication with graphics process.
Code-version: %% LAL: 6.7.0.1 (CLEAN a3a9e5096133465fbbcb8e23737d12d20190c497)
%% LALApps: 6.7.0.1 (CLEAN a3a9e5096133465fbbcb8e23737d12d20190c497)

2013-06-18 13:25:00.8626 (4116) [normal]: Reading input data ... done.
% --- GPS reference time = 960499913.5000 , GPS data mid time = 960499913.5000
% --- Setup, N = 90, T = 215977s, Tobs = 22059873s, gammaRefine = 230.000000, gamma2Refine = 35741.180671
2013-06-18 13:25:36.0574 (4116) [debug]: Successfully read checkpoint:728
% --- Cpt:728, total:728, sky:15/14, f1dot:1/52
2013-06-18 13:25:36.0574 (4116) [normal]: Finished main analysis.
2013-06-18 13:25:36.0574 (4116) [normal]: Recalculating statistics for the final toplist...
2013-06-18 13:25:36.0574 (4116) [CRITICAL]: Required frequency-bins [854893, 854908] not covered by SFT-interval [907925, 908416]
[Parameters: alpha:0, Dphi_alpha:8.549002e+005, Tsft:1.800000e+003, *Tdot_al:1.000011e+000]
XLAL Error - LocalXLALComputeFaFb (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/FDS_isolated/OptimizedCFS/LocalComputeFstat.c:554): Input domain error

LocalXALComputeFaFb() failed
Error[1] 5: function LocalComputeFStat, file /home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/FDS_isolated/OptimizedCFS/LocalComputeFstat.c, line 338, $Id$
ABORT: XLAL function call failed
XLALComputeExtraStatsSemiCoherent, line 360 : Failed call to LAL function ComputeFStat(). statusCode=5

XLAL Error - XLALComputeExtraStatsSemiCoherent (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/LineVeto.c:361): Internal function call failed: Input domain error

Error in function XLALComputeExtraStatsForToplist, line 220 : Failed call to XLALComputeLineVetoSemiCoherent().

XLAL Error - XLALComputeExtraStatsForToplist (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/LineVeto.c:221): Internal function call failed: Input domain error
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1814): Check (XLAL_SUCCESS == XLALComputeExtraStatsForToplist ( semiCohToplist, "GCTtop", &stackMultiSFT, &stackMultiNoiseWeights, &stackMultiDetStates, &CFparams, refTimeGPS, uvar_SignalOnly, uvar_outputSingleSegStats )) failed
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1814): XLALComputeExtraStatsForToplist() failed with xlalErrno = 1057.

XLAL Error - MAIN (/home/jenkins/workspace/workspace/EAH-GW-S6LV1/SLAVE/MINGW32/TARGET/windows-x86/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1814): Invalid pointer
2013-06-18 13:25:36.0730 (4116) [CRITICAL]: ERROR: MAIN() returned with error '-1'
FPU status flags: COND_0 PRECISION
2013-06-18 13:25:36.0730 (4116) [normal]: done. calling boinc_finish(-1).
13:25:36 (4116): called boinc_finish
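
To make the [CRITICAL] line in this log easier to read, here is a small sketch of the range check it describes. The bin numbers and Tsft are copied from the log. The conversion from bin index to frequency (index divided by Tsft) is an assumption about how the bins are numbered, and `covers` and `bin_to_hz` are hypothetical helpers, not LAL functions.

```python
TSFT = 1800.0                # Tsft from the log, in seconds
required = (854893, 854908)  # "Required frequency-bins" from the log
loaded = (907925, 908416)    # "SFT-interval" from the log


def covers(outer, inner):
    """True if the closed interval `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def bin_to_hz(index, tsft=TSFT):
    """Convert a bin index to Hz, assuming index = frequency * Tsft."""
    return index / tsft


print("required bins covered:", covers(loaded, required))
print("loaded data spans Hz:", [round(bin_to_hz(k), 3) for k in loaded])
print("requested bins in Hz:", [round(bin_to_hz(k), 3) for k in required])
```

Under that assumption, the loaded data spans roughly 504.40 to 504.68 Hz, which matches the h1_0504.40 through l1_0504.65 data files on the command line, while the requested bins sit near 474.94 Hz. That is a different failure from the execv error in the first post: this application started, resumed from a checkpoint, and then aborted while recalculating the final toplist.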
