Can't run Petri's optimized app

Ben Scott
Joined: 30 Mar 20
Posts: 52
Credit: 906,638,481
RAC: 2,003,014
Topic 229728

I have tried Petri's gamma-ray app, both v0.95 and v1.0, but both fail immediately. I am running the latest Debian on a Ryzen 5 5500 with an RTX 3080. Thank you.

Name: LATeah4021L13_1188.0_0_0.0_3019422_0

Workunit ID: 740176253

Created: 28 Jun 2023 7:07:41 UTC

Sent: 28 Jun 2023 7:24:10 UTC

Report deadline: 12 Jul 2023 7:24:10 UTC

Received: 28 Jun 2023 7:27:19 UTC

Server state: Over

Outcome: Computation error

Client state: Compute error

Exit status: 11 (0x0000000B) Unknown error code

Computer: 13146945

Run time (sec): 3.06

CPU time (sec): 1.43

Peak working set size (MB): 0

Peak swap size (MB): 0

Peak disk usage (MB): 0.02

Validation state: Invalid

Granted credit: 0

Application: Gamma-ray pulsar binary search #1 on GPUs
Anonymous platform


Stderr output

<core_client_version>7.20.5</core_client_version>
<![CDATA[
<message>
process exited with code 11 (0xb, -245)</message>
<stderr_txt>
00:24:24 (141321): [normal]: This Einstein@home App was built at: Mar 30 2022 19:19:39

00:24:24 (141321): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v0.95'.
00:24:24 (141321): [debug]: 1e+16 fp, 1e+09 fp/s, 10500000 s, 2916h40m00s00
00:24:24 (141321): [normal]: % CPU usage: 1.000000, GPU usage: 1.000000
command line: ../../projects/einstein.phys.uwm.edu/HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v0.95 --inputfile ../../projects/einstein.phys.uwm.edu/LATeah4021L13.dat --alpha 0.943218186562 --delta 1.30995332125 --skyRadius 8.726650e-08 --ldiBins 30 --f0start 1180.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 1.413729381e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah4021L13_1188_3019422.dat --debug 0 -o LATeah4021L13_1188.0_0_0.0_3019422_0_0.out
output files: 'LATeah4021L13_1188.0_0_0.0_3019422_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah4021L13_1188.0_0_0.0_3019422_0_0' 'LATeah4021L13_1188.0_0_0.0_3019422_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah4021L13_1188.0_0_0.0_3019422_0_1'
00:24:24 (141321): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
00:24:24 (141321): [debug]: glibc version/release: 2.36/stable
00:24:24 (141321): [debug]: Set up communication with graphics process.
EAH_SLEEP file found, value 1

Eah sleep true, 1
boinc_get_opencl_ids returned [0x557276f6dfb0 , 0x557276f71940]
Using OpenCL platform provided by: NVIDIA Corporation
Using OpenCL device "NVIDIA GeForce RTX 3080" by: NVIDIA Corporation
Max allocation limit: 2623160320
Global mem size: 10492641280
OpenCL device has FP64 support
13 warnings generated.
SemiCoh mode 0 start
(1)read_checkpoint(): Couldn't open file 'LATeah4021L13_1188.0_0_0.0_3019422_0_0.out.cpt': No such file or directory (2)
skypoint loop(1)
S0:InitEpehem called from barycenter.c 0x55727484ca48 0x55727484ca40 0x55727484ca38dpleph[initephem]: Cannot open file .405, result = 104
-- done
dpleph[state]: Time 2454683.289515 outside range of ephemeris
dpleph[state]: Time 2454683.289515 outside range of ephemeris

-- signal handler called: signal 1
9 stack frames obtained for this thread:

End of stcaktrace
00:24:26 (141321): called boinc_finish(11)

</stderr_txt>
]]>







Keith Myers
Joined: 11 Feb 11
Posts: 4,779
Credit: 17,805,491,088
RAC: 3,983,831

You're missing the JPLEPH.405 file in the project directory. You were running correctly before, so somehow that file got deleted or corrupted.

Check for its presence. You may have to reset the project to get the file resent to you. Or, if you feel comfortable downloading it manually, close BOINC, open the client_state.xml file, look for the file's download URL, download it, and put it back into your project directory.
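For anyone who wants to script the "check for its presence" step, here is a minimal Python sketch. The function name check_data_file and the Debian data path in the example are assumptions for illustration, not something taken from the app or from BOINC itself. Note that os.access reports readability for the user running the script, which may differ from the account the BOINC client runs under.

```python
import os
import stat


def check_data_file(project_dir, filename="JPLEPH.405"):
    """Report whether a project data file exists, its size, and its read permissions."""
    path = os.path.join(project_dir, filename)
    report = {
        "path": path,
        "exists": os.path.isfile(path),
        "size": 0,
        "readable": False,
        "world_readable": False,
    }
    if report["exists"]:
        info = os.stat(path)
        report["size"] = info.st_size
        # Readable by the user running this script (not necessarily the boinc user).
        report["readable"] = os.access(path, os.R_OK)
        # "Other" read bit set, i.e. world-readable.
        report["world_readable"] = bool(info.st_mode & stat.S_IROTH)
    return report


if __name__ == "__main__":
    # Assumed location for a Debian package install; adjust to your setup.
    project = "/var/lib/boinc-client/projects/einstein.phys.uwm.edu"
    print(check_data_file(project))
```

A size of 0 or a missing file would point to a failed or deleted download; a file that exists with a plausible size but is not readable would point to a permissions problem instead.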

 

Ben Scott
Joined: 30 Mar 20
Posts: 52
Credit: 906,638,481
RAC: 2,003,014

I tried again and that file is definitely there, and the errors persist.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,749
Credit: 35,724,982,759
RAC: 38,703,853

probably not the issue but do you show any missing dependencies? run 'ldd' on the special app binary.

i've seen this issue pop up a couple times before and it can be a pain. I'm pretty sure one of my systems had this issue once and i don't remember how i solved it, but i think it was a project reset or maybe even a total boinc reinstall. not sure if the other person ever solved it. might also be permissions related. make sure the .405 file isn't protected in any way.
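If you'd rather not eyeball the ldd output, here is a rough Python sketch that lists any libraries ldd reports as "not found". The helper names missing_libraries and run_ldd are made up for illustration, and the binary path in the example simply reuses the relative path from the task's stderr log, so you would need to point it at wherever the app actually lives on your system.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def run_ldd(binary_path):
    """Run ldd on a binary and return its standard output as text."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    # Path copied from the stderr log above; relative to a slot directory.
    app = "../../projects/einstein.phys.uwm.edu/HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v0.95"
    print(missing_libraries(run_ldd(app)) or "no missing libraries reported")
```

An empty list only means the dynamic linker can resolve everything; it says nothing about data files like the .405 ephemeris.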


Ben Scott
Joined: 30 Mar 20
Posts: 52
Credit: 906,638,481
RAC: 2,003,014

ldd shows everything linked up fine. JPLEPH.405 is world-readable.

On a related note, do you know how to catch the stderr.txt file before it vanishes onto the remote server and I have to hunt for it?

Thank you.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,749
Credit: 35,724,982,759
RAC: 38,703,853

Before the task is reported, the stderr output is in the client state file (client_state.xml). While the task is still running, you can also read it in real time in that task's slot directory.


Ben Scott
Joined: 30 Mar 20
Posts: 52
Credit: 906,638,481
RAC: 2,003,014

I nuked the boinc directory after detaching from the project, then reinstalled everything, and still no joy.

mikey
Joined: 22 Jan 05
Posts: 12,058
Credit: 1,834,324,605
RAC: 24,074

Ben Scott wrote:

I nuked the boinc directory after detaching from the project, then reinstalled everything, and still no joy.

But you are running BOINC version 7.20.5. Isn't that past the version 19 that Petri's app is? I wonder if your version of BOINC is too new to accept his tweaks.

Your other PC is running 7.18.1, so it may work there if you run GPU tasks on your AMD6600.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,749
Credit: 35,724,982,759
RAC: 38,703,853

app version and boinc version have nothing to do with each other. boinc simply calls the science app executable and then it's the science app doing everything from there.

the version of 'BOINC' that many on our team run is 7.19.0, meaning it was built from the 7.19.0 master branch. nothing more. and Petri's app isn't versioned in that way anyway. it's v1.0. the BOINC version has nothing to do with Petri's gamma ray app.

mountkidd is running the special app on a 7.20.5 system no problem.


mikey
Joined: 22 Jan 05
Posts: 12,058
Credit: 1,834,324,605
RAC: 24,074

I did not know that, thanks.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,749
Credit: 35,724,982,759
RAC: 38,703,853

OP might also try ditching Debian and running Ubuntu LTS to see if it makes any difference.
