Hello everyone. Some of my PCs are giving me this problem:
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 68 (0x44, -188)
</message>
<stderr_txt>
21:15:40 (9474): [normal]: This Einstein@home App was built at: Jul 26 2017 11:32:40
21:15:40 (9474): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRP5_1.08_x86_64-pc-linux-gnu__FGRPSSE'.
21:15:40 (9474): [debug]: 2.1e+15 fp, 2.1e+09 fp/s, 1022507 s, 284h01m47s25
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRP5_1.08_x86_64-pc-linux-gnu__FGRPSSE --inputfile ../../projects/einstein.phys.uwm.edu/JPLEPH.405 --alpha 2.1039176188 --delta -0.9808959836 --skyRadius 0.001361356817 --ldiBins 15 --f0start 1048 --f0Band 16 --firstSkyPoint 400954 --numSkyPoints 58 --f1dot -1.0e-13 --f1dotBand 1.0e-13 --df1dot 1.344493449e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 4194304.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.15 --reftime 56757.0 --f0orbit 0.005 --freeRadiusFactor 2 --mismatch 0.15 --debug 0 -o LATeah1075F_1064.0_400954_0.0_0_0.out
output files: 'LATeah1075F_1064.0_400954_0.0_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah1075F_1064.0_400954_0.0_0_0' 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah1075F_1064.0_400954_0.0_0_1'
21:15:40 (9474): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
21:15:40 (9474): [debug]: glibc version/release: 2.24/stable
21:15:40 (9474): [debug]: Set up communication with graphics process.
Line 1 in inputfile ../../projects/einstein.phys.uwm.edu/JPLEPH.405 seems to be damaged.
21:15:40 (9474): [CRITICAL]: ERROR: MAIN() returned with error '4'
FPU status flags:
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1075F_1064.0_400954_0.0_0_0.out.cohfu': No such file or directory
21:15:51 (9474): [normal]: done. calling boinc_finish(68).
21:15:51 (9474): called boinc_finish
</stderr_txt>
]]>
I have re-downloaded "http://einstein6.aei.uni-hannover.de/EinsteinAtHome/download/29/JPLEPH.405" and verified its MD5 against the value read from "/var/lib/boinc-client/client_state.xml". The file is good. What can I do to solve the problem?
ExaGroup wrote: Hello ppls..
First, it would help a lot if you would unhide your computers so the experts can see what is really going on. Click on my name and then 'show computers' and you will see all that anyone could see about yours: no names, just the facts, along with the tasks you are running and the outcomes of each of them.
Oh yes, sure. I've unhidden them right now. Sorry, Mikey.
ExaGroup wrote:Hello ppls..
You have 91 hosts in your full list so it would help if you provide a link to the particular host that gave this error message. Does more than one host show the same problem? Do the failures happen intermittently or does every task fail?
Many years ago, I had an example where BOINC would decide (intermittently) that this same file was corrupt. With BOINC stopped, running an MD5 check said that the file was OK. I decided to replace it anyway. My initial assumption was that perhaps it was sitting on a particular part of the disk that contained an intermittent bad block. I renamed the file JPLEPH.BAD so it wouldn't move and that way the replacement would occupy a different part of the disk. For a while, everything seemed OK but then the intermittent failures returned.
After much tearing of (non-existent) hair, I eventually did a full and exhaustive RAM test which revealed a bad memory location. After replacing the stick, the problem went away completely. My final assumption was that when BOINC does the MD5 checks of important files like this one, it (by chance) happened to hit the bad memory location intermittently, causing the check to fail.
I seem to remember that my example actually mentioned (in the log) that the MD5 check failed. The above message doesn't actually say that so it could be something else than the MD5 check. For reference, here is exactly the full ls -l listing for that file from one of my hosts. Is yours exactly the same number of bytes?
-rw-r--r-- 1 gary gary 9319680 Oct 5 2014 JPLEPH.405
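With many hosts to check, the size and MD5 comparison can be scripted. Below is a minimal sketch, assuming Python 3 is available on the host; the function name `file_fingerprint` is hypothetical, and the expected size is simply the byte count from the listing above. The expected MD5 is not hard-coded here because it is not quoted in this thread; compare the printed digest against the value in your own client_state.xml.

```python
import hashlib
import os

# Byte count taken from the ls -l listing above.
EXPECTED_SIZE = 9319680


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, md5_hex_digest) for the file at `path`.

    The file is read in chunks so that large files do not have to fit
    in memory all at once.
    """
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()


# Example use (the path is an assumption; adjust it to your BOINC data directory):
#   size, digest = file_fingerprint("JPLEPH.405")
#   print(size == EXPECTED_SIZE, digest)
```

A check like this only says that the file on disk is intact at the moment it is read. As the story above shows, an intermittent RAM fault can still make a good file look damaged some of the time.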
Cheers,
Gary.
I have more than one host with the problem, and unfortunately the problem is intermittent.
I'm working on this project with 71 VMs and another 20 physical machines (plus another 20 outside the project that are running other things ;-) ).
I haven't checked how many have problems (no free time to do it at the moment). Here are the first three machines:
https://einsteinathome.org/it-it/task/1047078615
https://einsteinathome.org/it-it/task/1047078719
https://einsteinathome.org/it-it/task/1047079004
On those three I have verified the JPLEPH.405 file: it has the correct MD5 and "exactly the same number of bytes".
Thank you for the help, Gary.
Maybe I've found something. I picked 3 Linux VMs and 3 physical Win2003 machines from my list that have the problem.
All of them report trouble with "Line 1 in inputfile ../../projects/einstein.phys.uwm.edu/JPLEPH.405 seems to be damaged."
I checked the MD5 and byte count of JPLEPH.405 on each, and they are OK.
But the Stderr output reads:
<message>
process exited with code 68 (0x44, -188)
</message>
That is for the Linux boxes.
For the Win2003 boxes it reads:
<message>
The name limit for the local computer network adapter card was exceeded.
(0x44) - exit code 68 (0x44)</message>
<stderr_txt>
The two messages seem to be equivalent,
so I have increased the dynamic port range on all 6 machines,
and now I wait. ;-)
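The numbers in the two messages can be related with a little arithmetic. The sketch below only reproduces the three forms shown in the Linux message, "68 (0x44, -188)". Reading the third value as the exit code minus 256 is an assumption based on the numbers themselves, not on BOINC documentation, and the function name `describe_exit_code` is hypothetical.

```python
def describe_exit_code(code):
    """Return the decimal, hexadecimal and (code - 256) forms of an exit code.

    Assumption: the negative number in the BOINC message is the exit code
    minus 256. This matches 68 -> -188, but is not confirmed by the source.
    """
    return code, hex(code), code - 256


print(describe_exit_code(68))  # prints (68, '0x44', -188)
```

On Windows, 68 is also the number of a system error (ERROR_TOO_MANY_NAMES) whose standard text is the "name limit for the local computer network adapter card" sentence. So the Windows message may just be the generic system description of the number 68, the same exit code the Linux boxes report, rather than a sign of a real network adapter problem.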
We had a similar incident just before Christmas. See here: https://einsteinathome.org/content/cpu-tasks-error-out-after-12-seconds
Harri Liljeroos wrote: We
Thanks for posting the link. That incident didn't affect me and I'd completely forgotten about it.
I'd been checking the tasks on one of the OP's hosts and noticed that all the failures dated back to around December 20 - so obviously the same issue. All the current results were being completed successfully so the OP mustn't have noticed how long ago the problem actually was and that everything was now fine :-).
Cheers,
Gary.
Well, thanks Harri and Gary. Yup, I hadn't noticed the dates. Sorry.