> EDIT: THIS MAY NOT BE 4.14 RELATED
>
> Did some more digging.
>
Marco, maybe you should do some more digging on this other one:
Result ID=982764 WU ID=306509 Sent=13 Jan 2005 21:04:31 UTC, Reported= 16 Jan 2005 13:32:45 UTC Over Success Done CPU time =7,072.22 Claimed credit= 21.56 Granted credit= pending
This one is similar to the previously reported unit; I also think there is something odd going on here.
And I also think it has nothing to do with CC 4.14 because Seti and CPDN are working fine with it.
Patience is a virtue
> > EDIT: THIS MAY NOT BE 4.14 RELATED
> >
> > Did some more digging.
> >
>
> Marco, maybe you should do some more digging on this other one:
> Result ID=982764 WU ID=306509 Sent=13 Jan 2005 21:04:31 UTC, Reported= 16 Jan
> 2005 13:32:45 UTC Over Success Done CPU time =7,072.22 Claimed credit= 21.56
> Granted credit= pending
> This one is similar to the previously reported unit; I also think there is
> something odd going on here.
>
> And I also think it has nothing to do with CC 4.14 because Seti and CPDN are
> working fine with it.
>
> this thread.
Richard
Those units may have computation errors (and I think they really do :)), and once they are validated they will be marked as erroneous and will not be chosen as canonical results.
Administrator
Message@Home
> Those units may have computation errors (and i think they really have :)), and
> once they are validated they will be marked as ones with errors, and will not
> be chosen as canonical results.
>
The clients all identify these results as successful, so something isn't right; maybe it's the WU itself that is garbled.
- Marco
Team Canada
Posted by Bernd Machenschalk in this thread.
Posted: 9 Jan 2005 22:40:36 UTC
Last modified: 9 Jan 2005 22:57:47 UTC
"Sorry, a bug in our code is triggered by data in this Workunit. It ends up in an endlees loop at the final calculation, so the counter will stay at 100% until the result reaches the "maximum number of floating point operations" that is defined in the workunit and then the client will continue with the next one. So you don't have to do anything special, just wait..."
Posted: 9 Jan 2005 23:48:37 UTC
Last modified: 10 Jan 2005 22:32:27 UTC
"The maximum FLOPS are, I think, set quite high, so you'll have to wait quite a while and still won't get any credit as the result file isn't valid."
_______________________________________________________________________________
The stderr.txt files of the work units you're talking about look very similar to the ones in the thread I'm pointing to in the link above.
TTYL
Richard
I think that I have found, not the problem, but what is happening with these last units, for which we have been credited far fewer points than for others of the same length, where we were regularly being credited with 95 or more credits.
When a unit is received and sits on standby in our program, it shows an estimated completion time of, say, 8 hr 34' 00" or thereabouts. In normal units of the same length, once they begin to be analyzed, the CPU time advances and a progress percentage is shown; when the client switches to another project and starts that project's unit, the CPU time used stays there, together with the progress percentage. When the unit gets its turn again, it picks up from there and goes on, with time and progress advancing together until it finally reaches 100%. In these faulty units, what happens is that it starts from zero, gets to, say, 12 to 13% progress after 60 minutes of analysis, and stops. When it comes back, the progress percentage keeps growing, but each time the total CPU time starts again from zero, so only the final stage is the one that counts for the unit when it is reported back.
Someone who runs only the E@H project will probably not have this problem, but with other projects attached we must remember that each project gets at least 60 minutes before the client switches to another project's unit. So a unit from E@H with a total estimated time of 8 hours will have to take 8 turns before it finishes. If the unit keeps track of all the previous steps in both the total time and progress columns, that's fine; but if it only keeps track in the progress column and only takes into account the very last time segment, then you are receiving 1/8 of the total.
At the very moment of writing this note: Einstein@Home - 2005-01-16 22:09:22 - Restarting result H1_0064.9__0065.1_0.1_T01_Test03_5 using einstein version 4.71. The unit, which already showed 32% progress, has restarted counting total CPU time from zero, but kept the progress correctly. I have waited 12 minutes and now it shows 00:12:03 CPU time with 35.37% progress. I already know that when it stops it will preserve the progress percentage but erase the total time, and on the next round the time will start again from zero.
This is only happening with the Einstein units; I also have CPDN and Seti, and both run fine with the 4.14 CC, so it's got to be our units here.
Patience is a virtue
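To make the 1/8 figure above concrete, here is a minimal Python sketch of the two accounting behaviours described in that post. It is not BOINC or Einstein@Home code; the function name `reported_cpu_time` and the eight 60-minute slices are assumptions chosen only to mirror the example of an 8-hour unit.

```python
def reported_cpu_time(slice_seconds, carry_over):
    """Return the CPU time a task would report after running in several slices.

    carry_over=True : each slice adds to the saved total (the expected behaviour).
    carry_over=False: the counter restarts at zero on every resume, so only
                      the last slice survives (the behaviour described above).
    """
    total = 0.0
    for seconds in slice_seconds:
        if not carry_over:
            total = 0.0  # counter reset when the task is resumed
        total += seconds
    return total


# Hypothetical 8-hour unit run in eight 60-minute turns.
slices = [3600.0] * 8
kept = reported_cpu_time(slices, carry_over=True)
lost = reported_cpu_time(slices, carry_over=False)
print(kept, lost, lost / kept)  # 28800.0 3600.0 0.125
```

If claimed credit scales with reported CPU time, the second case would claim roughly one eighth of what the first one does.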
First, let me say a big *Thanks* to all for the detailed postings above; we all know it takes time to "assemble" the text.
Over the weekend, with 1 exception, none of the clients got another "never-ending" WU or a crashed download.
All of the clients are attached to several projects.
One is attached to CPDN/EINSTEIN/LHC/PIRATE, switching every 30 minutes (HT CPU).
I have never seen a restart as mentioned after preempting (still crossing all fingers). It is doing work from CPDN && Einstein.
The other multi-attached clients are on EINSTEIN/LHC/PIRATE && SETI, working on Seti && Einstein, with no problems either.
Some of them even got a little work from Einstein; no crashes registered.
Liberto, this might be for you ;-)
1 client, an AMD 2400 running at regular speed, 512 MB, XP, BOINC 4.56, no GUI.
I was typing text in Word, and the time between pressing a key and Word showing it was several times extremely, unusually delayed.
Finally I started the Task Manager to take a closer look at which task was hoovering up the CPU cycles.
The Task Manager showed 3 tasks called setiathome_4.08_windows_intelx86.exe. Yes, three. Normal on a single CPU would be only ONE.
The GUI (4.56), now started, allowed me to END one of the three, the one with the lowest process ID of all three. Boinc.exe was still in memory, and the other 2 still had a heavy CPU load.
The remnants of the other BOINC tasks were ended with the Task Manager.
The next time BOINC started, the Einstein WU restarted from zero, from the beginning, as if it had never been started before (no checkpoint written, or a bad one).
And here is part of stdout.txt; it looks like not all entries could be logged:
2005-01-16 21:51:25 [SETI@home] Pausing result 16se00aa.8304.689.978394.6_2 (left in memory)
2005-01-16 21:51:25 [Einstein@Home] Resuming result H1_0068.4__0068.9_0.1_T00_Test03_4 using einstein version 4.71
2005-01-16 22:49:26 [---] Insufficient work; requesting more
2005-01-16 22:49:26 [Pirates@Home] Requesting 8640.00 seconds of work
2005-01-16 22:49:26 [Pirates@Home] Sending request to scheduler: http://pirates.vassar.edu/cgi-bin/scheduler
2005-01-16 22:49:28 [Pirates@Home] Scheduler RPC to http://pirates.vassar.edu/cgi-bin/scheduler succeeded
2005-01-16 22:49:28 [Pirates@Home] Message from server: No work available
2005-01-16 22:49:28 [Pirates@Home] No work from project
2005-01-16 22:49:28 [Pirates@Home] Deferring communication with project for 3 hours, 11 minutes, and 21 seconds
2005-01-16 23:21:25 [SETI@home] Resuming result 16se00aa.8304.689.978394.6_2 using setiathome version 4.08
2005-01-16 23:21:25 [Einstein@Home] Pausing result H1_0068.4__0068.9_0.1_T00_Test03_4 (left in memory)
2005-01-16 23:49:28 [Pirates@Home] Deferring communication with project for 2 hours, 11 minutes, and 21 seconds
2005-01-16 23:51:26 [SETI@home] Pausing result 16se00aa.8304.689.978394.6_2 (left in memory)
2005-01-16 23:51:26 [Einstein@Home] Resuming result H1_0068.4__0068.9_0.1_T00_Test03_4 using einstein version 4.71
2005-01-17 00:11:19 [Einstein@Home] Unrecoverable error for result H1_0068.4__0068.9_0.1_T00_Test03_4 (Incorrect function. (0x1) - exit code 1 (0x1))
2005-01-17 00:11:19 [Einstein@Home] Deferring communication with project for 1 minutes and 0 seconds
2005-01-17 00:11:19 [Einstein@Home] Computation for result H1_0068.4__0068.9_0.1_T00_Test03 finished
2005-01-17 00:11:19 [Einstein@Home] Starting result H1_0056.9__0057.0_0.1_T01_Test03_4 using einstein version 4.71
To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2005-01-17 00:20:38 [---] Starting BOINC client version 4.56 for windows_intelx86
2005-01-17 00:20:38 [SETI@home] Host location: school
2005-01-17 00:20:38 [SETI@home] Using separate project prefs for school
2005-01-17 00:20:38 [Pirates@Home] Host location: home
2005-01-17 00:20:38 [Pirates@Home] Using separate project prefs for home
2005-01-17 00:20:38 [Einstein@Home] Host location: home
2005-01-17 00:20:38 [Einstein@Home] Using separate project prefs for home
2005-01-17 00:20:38 [SETI@home] Host ID is 414613
2005-01-17 00:20:38 [Pirates@Home] Host ID is 6086
2005-01-17 00:20:38 [Einstein@Home] Host ID is 3808
2005-01-17 00:20:38 [---] General prefs: from SETI@home (last modified 2005-01-16 15:46:45)
2005-01-17 00:20:38 [---] General prefs: using separate prefs for school
2005-01-17 00:20:38 [SETI@home] Deferring computation for result 16se00aa.8304.689.978394.6_2
2005-01-17 00:20:38 [Einstein@Home] Resuming computation for result H1_0056.9__0057.0_0.1_T01_Test03_4 using einstein version 4.71
2005-01-17 00:20:38 [Einstein@Home] Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
2005-01-17 00:20:39 [Einstein@Home] Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
2005-01-17 00:20:39 [Einstein@Home] Host location: home
2005-01-17 00:20:39 [Einstein@Home] Using separate project prefs for home
2005-01-17 00:20:39 [Pirates@Home] Deferring communication with project for 1 hours, 40 minutes, and 10 seconds
2005-01-17 01:20:39 [Pirates@Home] Deferring communication with project for 40 minutes and 10 seconds
2005-01-17 01:26:40 [SETI@home] Restarting result 16se00aa.8304.689.978394.6_2 using setiathome version 4.08
2005-01-17 01:26:40 [Einstein@Home] Pausing result H1_0056.9__0057.0_0.1_T01_Test03_4 (left in memory)
2005-01-17 01:26:40 [SETI@home] Sending request to scheduler:
This "error" could also be due to the fact that it couldn't be terminated in the regular way. But even then, the last checkpoint is usually used.
One host is running the 4.14 version. The AMD 2700 client got 27 Einstein WUs on Friday evening (faster and longer-running work); no crashes, and it has been crunching without a stop since then.
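Since later posts tie the CPU-time reset to tasks that were removed from memory, it can help to pull the pause events out of a log like the one above. The following is a rough Python sketch that assumes the bracketed stdout.txt line format shown in this post; the names `PAUSE_RE` and `pause_events` are made up for illustration, and the GUI message format used elsewhere in the thread (project name first, then the timestamp) is not handled.

```python
import re

# Matches lines such as:
# 2005-01-16 23:21:25 [Einstein@Home] Pausing result NAME (left in memory)
PAUSE_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Pausing result (?P<result>\S+) "
    r"\((?P<mode>left in|removed from) memory\)$"
)


def pause_events(lines):
    """Return (timestamp, project, result, left_in_memory) for each pause line."""
    events = []
    for line in lines:
        m = PAUSE_RE.match(line.strip())
        if m:
            events.append(
                (m["stamp"], m["project"], m["result"], m["mode"] == "left in")
            )
    return events


# Three lines copied from the log above; only the first two are pause events.
sample = [
    "2005-01-16 23:21:25 [Einstein@Home] Pausing result H1_0068.4__0068.9_0.1_T00_Test03_4 (left in memory)",
    "2005-01-16 23:51:26 [SETI@home] Pausing result 16se00aa.8304.689.978394.6_2 (left in memory)",
    "2005-01-16 22:49:28 [Pirates@Home] No work from project",
]
events = pause_events(sample)
for event in events:
    print(event)
```

A pause with `left_in_memory` set to `False` would be the case worth comparing against the CPU time shown after the next resume.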
Ric wrote: First, let me say a big *Thanks* to all for the detailed postings above; we all know it takes time to "assemble" the text.
Over the weekend, with 1 exception, none of the clients got another "never-ending" WU or a crashed download.
Liberto, this might be for you ;-) End of Ric's comment.
Well, I can tell you now that, being that special one, I do not know whether to be happy or angry; I'd rather stay with the happy idea.
I am engaged with the Pirates, CPDN, Seti, LHC, ProteinPredictor and Einstein projects, all six on one single-CPU P4 with WXP SP1, 512 MB RAM and a 26 GB hard disk, and by the way waiting for Astropulse to come into the picture.
Since my previous message, and after a good night's sleep (good morning to all), that same unit has come around again and has reached 88.88% progress in this latest round of restarted time counting, which has now reached 58'02"; so I am very much afraid it will stop before finishing completely. I will wait to report the complete cycle, now that I will be able to see it actually happen. I say this because the time-to-completion column has been increasing instead of decreasing.
Well, it has happened: after 59'56" of processing it switched over to a unit from Seti and left the E@H unit's progress at 89.04%, but the CPU time shows --- and the time to completion also shows --- (meaning who knows!).
I can say in advance that this unit, when sent, will claim a minimal amount of credit, which will probably not exceed 12 or so credits. In normal analysis this unit would have represented 99.5 credits.
But I have decided to keep this text in WordPad and wait until the next round, when it will finish completely and I will be able to report the end of the movie.
Einstein@Home - 2005-01-17 08:45:52 - Pausing result H1_0064.9__0065.1_0.1_T01_Test03_5 (removed from memory)
SETI@home - 2005-01-17 08:45:52 - Restarting result 27ap04ab.21429.27024.303408.121_2 using setiathome version 4.08
In the meantime the messages went like this:
Einstein@Home - 2005-01-17 08:58:59 - Requesting 35553 seconds of work
Einstein@Home - 2005-01-17 08:58:59 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2005-01-17 08:59:02 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2005-01-17 08:59:02 - Message from server: No work available
Einstein@Home - 2005-01-17 08:59:02 - No work from project
Einstein@Home - 2005-01-17 08:59:02 - Deferring communication with project for 1 hours, 0 minutes, and 0 seconds
Surprisingly, after only 45 minutes of processing with Seti, it moved back to Einstein.
Einstein@Home - 2005-01-17 09:45:53 - Restarting result H1_0064.9__0065.1_0.1_T01_Test03_5 using einstein version 4.71
SETI@home - 2005-01-17 09:45:53 - Pausing result 27ap04ab.21429.27024.303408.121_2 (removed from memory)
At this stage, after a restarted time count of 35'45", it shows 95.75% progress and a remaining time to completion of 1'35". Approaching launch time, the count goes like this: CPU time 00:45:00 with a progress of 97.47% and 1'09" to go.
So, red button time approaching: four, three, two, one, ignition... after 58'20"
Einstein@Home - 2005-01-17 10:45:07 - Computation for result H1_0064.9__0065.1_0.1_T01_Test03 finished
climateprediction.net - 2005-01-17 10:45:07 - Restarting result 2pem_100147514_1 using hadsm3 version 4.04
Einstein@Home - 2005-01-17 10:45:07 - Started upload of H1_0064.9__0065.1_0.1_T01_Test03_5_0
Einstein@Home - 2005-01-17 10:45:21 - Finished upload of H1_0064.9__0065.1_0.1_T01_Test03_5_0
Einstein@Home - 2005-01-17 10:45:21 - Throughput 10268 bytes/sec
Einstein@Home - 2005-01-17 10:46:03 - Sending request to scheduler: http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
Einstein@Home - 2005-01-17 10:46:08 - Scheduler RPC to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
Einstein@Home - 2005-01-17 10:46:08 - No work from project
And as I predicted, the result received in Einstein reads like this:
Result ID=985034 WU ID=306628 Sent=15 Jan 2005 15:53:40 UTC, Reported=17 Jan 2005 9:45:40 UTC Over Success Done CPU time=3,514.97 Claimed credit=10.71 Granted credit=pending
So it can be seen that it is even less than what I predicted. I am very sorry for the long story.
Patience is a virtue
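The numbers quoted in this thread can be cross-checked with a little arithmetic. The Python sketch below uses only the two results reported above (982764 and 985034) and the 99.5 credits mentioned for a normal unit. It assumes that claimed credit is proportional to reported CPU time and that both results came from the same host, neither of which the thread states outright.

```python
# (CPU seconds, claimed credit) for the two results quoted in this thread.
results = {
    982764: (7072.22, 21.56),
    985034: (3514.97, 10.71),
}

# Claimed credit per reported CPU second for each result.
rates = {rid: credit / cpu for rid, (cpu, credit) in results.items()}
for rid, rate in rates.items():
    print(rid, f"{rate:.6f} credits per CPU second")

# CPU time that a "normal" 99.5-credit unit would imply at the same rate.
mean_rate = sum(rates.values()) / len(rates)
expected_credit = 99.5
implied_hours = expected_credit / mean_rate / 3600
print(f"{implied_hours:.1f} hours")
```

Both results work out to about 0.003 credits per CPU second, and 99.5 credits at that rate corresponds to roughly nine hours of CPU time. That is in line with the estimated completion time of about 8.5 hours mentioned earlier, and fits the idea that only the last slice of CPU time was being reported.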
After some days running BOINC 4.14 and 4.59 with Einstein@Home 4.71, I can confirm a problem in CPU time handling.
When the task switches from E@H to the next project, CPU time is reset to the value BOINC or the application was started with. This does not seem to happen with other projects' applications.
This only happens when the application didn't stay in memory (removed from memory).
Besides this, there is no problem with BOINC 4.14 (on Windows NT/2000/XP) to report, neither on update nor when running over the weekend on several hosts. This may be different on hosts with Windows 95 family operating systems.
Greetings from Bremen/Germany
Jens Seidler (TheBigJens)
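One way to picture the behaviour that would be expected instead: the accumulated CPU time is written to the checkpoint together with the progress, so that a task removed from memory resumes both values. The Python sketch below only illustrates that idea. It is not the BOINC API or the Einstein@Home application, and `save_checkpoint`, `load_checkpoint`, `run_slice` and the JSON file layout are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, progress, cpu_seconds):
    """Write progress and accumulated CPU time together, replacing the old file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"progress": progress, "cpu_seconds": cpu_seconds}, f)
    os.replace(tmp, path)  # rename over the old checkpoint in one step


def load_checkpoint(path):
    """Return (progress, cpu_seconds), or zeros if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0.0, 0.0
    with open(path) as f:
        state = json.load(f)
    return state["progress"], state["cpu_seconds"]


def run_slice(path, slice_seconds, progress_gain):
    """Simulate one scheduling turn of a task that was removed from memory."""
    progress, cpu = load_checkpoint(path)  # resume BOTH values from disk
    progress += progress_gain
    cpu += slice_seconds
    save_checkpoint(path, progress, cpu)
    return progress, cpu


with tempfile.TemporaryDirectory() as d:
    ckpt = os.path.join(d, "checkpoint.json")
    for _ in range(3):  # three 60-minute turns
        progress, cpu = run_slice(ckpt, 3600.0, 0.125)
    print(progress, cpu)  # 0.375 10800.0
```

The symptom reported above would correspond to restoring only `progress` from the checkpoint while `cpu_seconds` starts again from zero.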
After switching to version 4.14 this afternoon, my CPDN project on a WinNT machine aborted!!
Seti on this machine is still working fine, and on my Win2K machine Seti, Predictor and CPDN are still working.
I got no new work for Einstein on my machines, so I don't know how Einstein is going.
Any ideas?
legolas13