I recently upgraded to BOINC 7.2.39(x86) on a Vista machine.
Before upgrading, I disallowed new tasks and let all existing tasks complete. I then removed Einstein@Home from BOINC.
After installing the new release of BOINC, I added the Einstein@Home project. After BOINC downloaded everything for the project, number crunching began.
Under Tools->Computing Preferences, I have "Tasks checkpoint to disk at most every 120 seconds."
This is the same setting as I was using in 7.2.28.1 from which I upgraded.
I had to restart my system, and after the restart, I noticed that the running tasks had started over from the beginning.
I checked the slots 0 and 1 folders, but there is no checkpoint.cpt file in either folder.
The stderr.txt file in both slots contains the line:
2014-02-17 06:39:18.7348 (2720) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch
After running for 50 minutes, I still find no checkpoint.cpt files.
I must be missing something simple. I am fairly sure that the previous version of BOINC created a checkpoint, so that tasks did not start over from the beginning when I restarted the system.
Constructive input is appreciated.
Joe
Copyright © 2024 Einstein@Home. All rights reserved.
No checkpoint checkpoint.cpt found
Hi!
Unfortunately, some of the CasA GW tasks checkpoint rather infrequently (sometimes at intervals > 50 min, depending on CPU performance); see Bernd's message on this here: http://einsteinathome.org/node/197302&nowrap=true#129184 and the discussion leading up to it. Please bear with us while we look into this.
Cheers
HB
Thank you for the quick response and the link. It has answered my question.
I now have a checkpoint.cpt file in each slot. According to the stderr.txt file, the files were created approximately one hour after Einstein@Home started running.
I will monitor the timestamps of the checkpoint.cpt files to pick the best time to shut down my system without losing all of that work.
Joe
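For anyone who wants to automate the timestamp check described above, here is a minimal Python sketch. It assumes each slot is a subfolder of one "slots" directory and that the checkpoint file is named checkpoint.cpt, as in this thread; the function name and the path you pass in are illustrative, so adjust them to your own BOINC data directory.

```python
import time
from pathlib import Path


def checkpoint_ages(slots_dir, name="checkpoint.cpt", now=None):
    """Return {slot_folder_name: age_in_seconds} for every slot subfolder.

    The age is None when the slot has no checkpoint file yet.
    'name' defaults to the file name seen in this thread (an assumption
    for any other application).
    """
    now = time.time() if now is None else now
    ages = {}
    for slot in sorted(Path(slots_dir).iterdir()):
        if not slot.is_dir():
            continue  # ignore stray files next to the slot folders
        cpt = slot / name
        if cpt.is_file():
            # Seconds since the checkpoint was last written.
            ages[slot.name] = now - cpt.stat().st_mtime
        else:
            ages[slot.name] = None
    return ages
```

Calling `checkpoint_ages(r"C:\path\to\BOINC\slots")` (a placeholder path) just before shutting down shows how many seconds of work each task would lose; a value of None means the task would start from scratch.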
Hi
I'm not quite sure whether suspending (to disk or to RAM) instead of powering off would also help in this scenario, but if that's available on the machine(s) in question, it's worth a try, I guess.
Cheers
HB
RE: Hi I'm not quite sure
There is a line of discussion among the third-party developers at SETI about a very different model of disk write caching that Microsoft implemented in (Vista?)/7/8, compared with XP and earlier.
http://support.microsoft.com/kb/148505
The BOINC API doesn't do any of that, so there's a possibility that data might be lost at shutdown. We've seen it mainly in truncated stderr.txt, but it might affect checkpoint files too - though I personally would be surprised if even Microsoft could delay-write files by as much as an hour.
Edit - I see I linked a very old MS KB article. But there's plenty of discussion on the web about how Windows 7 caching is more aggressive - e.g.
http://cboard.cprogramming.com/c-programming/129192-fflush-not-working-windows-7-a-2.html
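To illustrate the kind of precaution the write-caching concern above points to, here is a minimal Python sketch of a checkpoint write that asks the operating system to commit the data before the new file replaces the old one. This is a general technique, not code from the BOINC API or from any Einstein@Home or SETI application; the function name and the temporary-file naming are hypothetical.

```python
import os
import tempfile


def write_checkpoint_durably(path, data):
    """Write 'data' (bytes) to 'path' so a crash leaves the old or the new file.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same folder so the final rename
    # stays on one volume.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".cpt-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to commit its cache to disk
        # Swap the finished file into place, replacing any older checkpoint.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)        # do not leave a half-written temp file behind
        raise
```

The point of the design is ordering: the data is flushed and synced first, and only then does the new file take over the checkpoint name, so a shutdown at a bad moment should not leave a truncated checkpoint behind.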
RE: Thankyou for the quick
Each project chooses whether its units save checkpoints and how often; that is part of the flexibility the projects have. So if you run multiple projects, it could be tricky.
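A small Python sketch may help show why a preference of "at most every 120 seconds" can still mean an hour between checkpoints. It uses a deliberately simplified model, which is an assumption and not BOINC's actual code: the application reaches a point where it is able to checkpoint every safe_point_every seconds, and it writes one only if at least min_interval seconds have passed since the last write.

```python
def checkpoint_times(run_seconds, safe_point_every, min_interval):
    """Return the run times (in seconds) at which a checkpoint gets written.

    Simplified model (assumption): the application can only checkpoint at
    its own safe points, and the user preference only sets a minimum gap.
    """
    times = []
    last = 0  # time of the previous checkpoint (task start counts as 0)
    t = safe_point_every
    while t <= run_seconds:
        if t - last >= min_interval:
            times.append(t)
            last = t
        t += safe_point_every
    return times


# A task that can only checkpoint once an hour ignores the 120 s preference.
print(checkpoint_times(7200, 3600, 120))  # [3600, 7200]
# A task with safe points every 60 s is throttled to one write per 120 s.
print(checkpoint_times(600, 60, 120))     # [120, 240, 360, 480, 600]
```

In this model the preference is a ceiling on how often checkpoints are written, not a guarantee that they will be; the application's own schedule decides the rest.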