CPU time since checkpoint: 4h

HAL
Joined: 9 Mar 20
Posts: 2052
Credit: 41039990
RAC: 39111

I will search for a solution to the sleep function again today and see what I can come up with. I will report back if I find one ...

Processing work units with "outdated" (according to Microsoft) Ryzen 7 1700

mikey
Joined: 22 Jan 05
Posts: 12676
Credit: 1839077099
RAC: 4012

Be sure to check the power management section as well as the screensaver section under system settings

HAL
Joined: 9 Mar 20
Posts: 2052
Credit: 41039990
RAC: 39111

mikey wrote:

Be sure to check the power management section as well as the screensaver section under system settings

Yea I checked all that it wasn't any of that. It would act like it was going to sleep - power light blinking and all - but it wouldn't wake up.

But ... now it works. Last time I tried it it wouldn't work, since then I think there's been at least one Linux kernel update. Maybe that fixed it I dunno. But now I can put it to sleep to save power and wake it up again next evening and no loss of compute time - starts with the last state in BOINC and the last % progress. Has nothing to do with checkpoints it's going to sleep baby!

mikey
Joined: 22 Jan 05
Posts: 12676
Credit: 1839077099
RAC: 4012

HAL wrote:

mikey wrote:

Be sure to check the power management section as well as the screensaver section under system settings

Yea I checked all that it wasn't any of that. It would act like it was going to sleep - power light blinking and all - but it wouldn't wake up.

But ... now it works. Last time I tried it it wouldn't work, since then I think there's been at least one Linux kernel update. Maybe that fixed it I dunno. But now I can put it to sleep to save power and wake it up again next evening and no loss of compute time - starts with the last state in BOINC and the last % progress. Has nothing to do with checkpoints it's going to sleep baby! 

WOO HOO!!

Keith Myers
Joined: 11 Feb 11
Posts: 4963
Credit: 18699402209
RAC: 6231737

That just means hibernate mode is working as it should. The PC saves its compute state to a file on disk and restores it upon wakeup.

Scrooge McDuck
Joined: 2 May 07
Posts: 1052
Credit: 17869501
RAC: 12422

Scrooge McDuck wrote:
This is an ongoing problem, well more of an annoyance to watch out for to avoid wasting CPU time. It affects the FGRP5 CPU app. [...]

I'd like to add: the current run of O3MD1 CPU work units also has a varying number of checkpoints (between 32 and 64). Those with only 32 checkpoints are also a real nuisance on older computers that don't run 24/7: such work units can easily take an hour between checkpoints too.

The number of checkpoints can only be found in the stderr.txt log file in the WU's slot directory:

2023-02-13 03:52:03.6794 (8268) [normal]: Cpt:0,  total:32,  sky:1/1,  f1dot:1/32
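For anyone who would rather not read stderr.txt by eye, here is a minimal Python sketch that pulls the checkpoint counters out of a line like the one above. It assumes the `Cpt:` and `total:` fields keep exactly this format; the function name `checkpoint_counts` is made up for illustration and is not part of BOINC or the Einstein@Home apps.

```python
import re

# Matches the counters in lines such as:
# 2023-02-13 03:52:03.6794 (8268) [normal]: Cpt:0,  total:32,  sky:1/1,  f1dot:1/32
# Assumption: "Cpt" is the current checkpoint index and "total" the number of checkpoints.
CPT_RE = re.compile(r"Cpt:(\d+),\s*total:(\d+)")

def checkpoint_counts(log_text):
    """Return (current, total) from the last matching line, or None if no line matches."""
    matches = CPT_RE.findall(log_text)
    if not matches:
        return None
    current, total = matches[-1]
    return int(current), int(total)

sample = "2023-02-13 03:52:03.6794 (8268) [normal]: Cpt:0,  total:32,  sky:1/1,  f1dot:1/32"
print(checkpoint_counts(sample))
```

To use it on a real task, read the stderr.txt from the task's slot directory and pass its contents to `checkpoint_counts`.
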

The hibernate mode is necessary for these.
Scrooge McDuck
Joined: 2 May 07
Posts: 1052
Credit: 17869501
RAC: 12422

A bunch of FGRP5 (CPU) tasks are again being sent out at the moment that contain only FIVE (5) skypoints, which means only five checkpoints are written between 0% and 90% progress (i.e. at 18%, 36%, 54%, 72%, and 90%). On old computers it can take hours to get from one checkpoint to the next.

command line: projects/einstein.phys.uwm.edu/hsgamma_FGRP5_1.08_windows_intelx86__FGRPSSE.exe --inputfile ../../projects/einstein.phys.uwm.edu/LATeah2009F.dat --alpha 4.5409627413 --delta 0.1251924672 --skyRadius 0.001125737368 --ldiBins 15 --f0start 72.0 --f0Band 16 --firstSkyPoint 2780 --numSkyPoints 5 --f1dot -5.000000000000006e-11 [...]
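The arithmetic behind those percentages can be sketched in a few lines of Python. This is only an illustration under the assumption described in the post: one checkpoint per skypoint, evenly spaced up to 90% progress. The helper names `num_sky_points` and `checkpoint_percentages` are hypothetical, and only a fragment of the command line is used.

```python
import re

def num_sky_points(command_line):
    """Extract the value of --numSkyPoints from a task command line, or None if absent."""
    match = re.search(r"--numSkyPoints\s+(\d+)", command_line)
    return int(match.group(1)) if match else None

def checkpoint_percentages(n_points, last_percent=90.0):
    """Assumed model: one checkpoint per skypoint, evenly spaced up to last_percent."""
    step = last_percent / n_points
    return [round(step * i, 2) for i in range(1, n_points + 1)]

# Fragment of the command line quoted above
cmd = "--firstSkyPoint 2780 --numSkyPoints 5"
n = num_sky_points(cmd)
print(n, checkpoint_percentages(n))
```

With 5 skypoints this reproduces the 18%, 36%, 54%, 72%, 90% steps; a task with many more skypoints would checkpoint correspondingly more often.
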

Again they have a low number (here: 88.0 < 100) after the prefix "LATeah2009F" in the workunit name.

The only way to finish such tasks without wasting hours of computation is: don't stop them and don't shut down the computer. Instead, send the OS into hibernate mode (suspend to disk). Alternatively, carefully watch the progress meter in BOINC Manager, check the task details for "CPU time since last checkpoint", and shut down the computer only shortly after one of the few checkpoints has been written.
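If you want to script the hibernate step, a rough Python sketch could look like the following. It assumes a systemd-based Linux (`systemctl hibernate`) or Windows (`shutdown /h`), and that hibernation is actually enabled and working on the machine; the function name `hibernate_command` is made up for illustration. The sketch only builds the command and does not run it.

```python
import sys

def hibernate_command(platform=sys.platform):
    """Return the command (as an argument list) that requests suspend-to-disk.

    Assumptions: systemd on Linux, hibernation enabled on Windows.
    """
    if platform.startswith("linux"):
        return ["systemctl", "hibernate"]
    if platform.startswith("win"):
        return ["shutdown", "/h"]
    raise NotImplementedError(f"no hibernate command known for {platform!r}")

# Show the command for a Linux host; to actually hibernate you would pass the
# list to subprocess.run(..., check=True) yourself.
print(hibernate_command("linux"))
```

Test it once with nothing important running, since some machines go to sleep fine but don't wake up properly.
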
