slow work

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

RE: RE: Here's the link

Message 50900 in response to message 50899

Quote:
Quote:

Here's the link to Dr. Allen's post on the matter:

Long WU Criteria

The only part I'm not clear on is how they determine the credit per CPU second.

Essentially it works out that roughly 1GHz class hosts and faster will get long WUs.

Alinator

Hosts whose benchmarks place them among the slowest 20% of hosts are given short WU if possible. The remaining 80% of machines get both slow and fast WU.

Cheers,
Bruce

Hi Dr. Allen,

That part I got. Here's a snippet from the last contact log from my P4:

2006-11-19 06:21:14.7627 [PID=18514] [normal ] [HOST#656342] [RESULT#53452272 l1_1408.5_S5R1__322_S5R1a_1] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2006-11-19 06:21:14.7627 [PID=18514] [debug ] cpu 31471.171875 cpcs 0.003723, cc 111.696181

The part I'm not getting is how the CPCS is calculated. If you multiply the shown CPCS and the CPU seconds reported you get a credit value of 117.167173, which doesn't match the reported CC value of 111.696181. When I calculate CPCS manually I get 0.003549.

Not much of a difference I grant you, but it's a "mystery" and I hate computational mysteries! :-)
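For anyone who wants to reproduce the arithmetic, here is a minimal Python sketch. It only uses the three numbers from the debug line above; the variable names are my own labels and not anything from the server code, and it makes no claim about how the scheduler actually derives CPCS.

```python
# Values copied from the "[debug ]" line in the contact log above.
cpu_seconds = 31471.171875   # "cpu"
logged_cpcs = 0.003723       # "cpcs" (credit per CPU second, as logged)
logged_cc = 111.696181       # "cc" (claimed credit, as logged)

# Credit you would expect if cc were simply cpu * cpcs.
credit_from_logged_cpcs = cpu_seconds * logged_cpcs

# CPCS you get by working backwards from the logged cc.
cpcs_from_logged_cc = logged_cc / cpu_seconds

print(f"cpu * cpcs = {credit_from_logged_cpcs:.6f}")
print(f"cc / cpu   = {cpcs_from_logged_cc:.6f}")
print(f"difference = {credit_from_logged_cpcs - logged_cc:.6f}")
```

The first two printed values are the 117.167173 and 0.003549 quoted above, so the mismatch is in the logged numbers themselves and not a slip in the hand calculation.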

Regards,

Alinator

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

RE: RE: RE: Here's the

Message 50901 in response to message 50900

Quote:
Quote:
Quote:

Here's the link to Dr. Allen's post on the matter:

Long WU Criteria

The only part I'm not clear on is how they determine the credit per CPU second.

Essentially it works out that roughly 1GHz class hosts and faster will get long WUs.

Alinator

Hosts whose benchmarks place them among the slowest 20% of hosts are given short WU if possible. The remaining 80% of machines get both slow and fast WU.

Cheers,
Bruce

Hi Dr. Allen,

That part I got. Here's a snippet from the last contact log from my P4:

2006-11-19 06:21:14.7627 [PID=18514] [normal ] [HOST#656342] [RESULT#53452272 l1_1408.5_S5R1__322_S5R1a_1] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2006-11-19 06:21:14.7627 [PID=18514] [debug ] cpu 31471.171875 cpcs 0.003723, cc 111.696181

The part I'm not getting is how the CPCS is calculated. If you multiply the shown CPCS and the CPU seconds reported you get a credit value of 117.167173, which doesn't match the reported CC value of 111.696181. When I calculate CPCS manually I get 0.003549.

Not much of a difference I grant you, but it's a "mystery" and I hate computational mysteries! :-)

Regards,

Alinator

Your machine must have microscopic benchmark values. What happens if you use the BOINC manager to re-run the benchmarks when the machine is idle? Do the benchmark values change? Note: you can find the benchmark values in client_state.xml
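If it helps, the benchmark values can be pulled out of client_state.xml with a few lines of Python. This is only a sketch under some assumptions: that the floating-point and integer benchmarks are stored as `p_fpops` and `p_iops` inside a `host_info` element (check your own file, the layout may differ between client versions), and the sample numbers below are made up for illustration. The function name `read_benchmarks` is hypothetical.

```python
import xml.etree.ElementTree as ET

def read_benchmarks(xml_text):
    """Return the benchmark values found in a client_state.xml string.

    Assumes the values live in <host_info> as <p_fpops> and <p_iops>.
    Returns None if no <host_info> element is present.
    """
    root = ET.fromstring(xml_text)
    host = root.find(".//host_info")
    if host is None:
        return None

    def number(tag):
        node = host.find(tag)
        return float(node.text) if node is not None and node.text else None

    return {"p_fpops": number("p_fpops"), "p_iops": number("p_iops")}

# Made-up sample in the assumed layout, NOT taken from a real host.
sample = """<client_state>
  <host_info>
    <p_fpops>1000000000.000000</p_fpops>
    <p_iops>2000000000.000000</p_iops>
  </host_info>
</client_state>"""

print(read_benchmarks(sample))
```

To use it on a real machine, read the file from the BOINC data directory (its location depends on the installation) and pass the text to `read_benchmarks` before and after re-running the benchmarks, then compare the two results.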

Bruce

Director, Einstein@Home

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

RE: Your machine must

Message 50902 in response to message 50901

Quote:

Your machine must have microscopic benchmark values. What happens if you use the BOINC manager to re-run the benchmarks when the machine is idle? Do the benchmark values change? Note: you can find the benchmark values in client_state.xml

Bruce

Sorry about not replying sooner, but the P4 and PIII got "locked out" shortly after you posted with a LAN Router meltdown at their location (it's remote) and you folks had your server failure, so I hadn't fully thought about your reply until it was too late. I decided to let them run their course to study their behaviour on full "AutoBOINC" and they just got back in the game yesterday.

Anyway, when I first posted the P4 was showing slightly lower BM's than normal, which was most likely due to Windows trying to figure out why the router was acting up and not responding normally. In addition, I had failed to take into account the other performance metrics. Rerunning the calcs, now they work out as expected.

Conclusion: DUH..... Alinator! Give self slap on head! :-)

Alinator

Omikronman
Joined: 23 Nov 06
Posts: 33
Credit: 83254
RAC: 0

I have now seen several

I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time

Pooh Bear 27
Joined: 20 Mar 05
Posts: 1376
Credit: 20312671
RAC: 0

RE: I have now seen several

Message 50904 in response to message 50903

Quote:

I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time


Many factors can change the run time, depending on what else the computer is doing. If you are video editing and crunching simultaneously, your times will be longer because the CPU is more heavily loaded, which slows the crunching down. Dust and dirt in the fans and heatsink, or other heat issues, can also cause your times to go up. Make sure your system is clean.

Omikronman
Joined: 23 Nov 06
Posts: 33
Credit: 83254
RAC: 0

Everything is new and clean

Message 50905 in response to message 50904

Everything is new and clean here. I do no other work when BOINC is active. :-)

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2980564025
RAC: 758553

RE: I have now seen several

Message 50906 in response to message 50903

Quote:

I have now seen several different sizes of work units here:

a) >= 2000 seconds CPU time
b) >= 20000 seconds CPU time
c) >= 27000 seconds CPU time
d) >= 33000 seconds CPU time
e) >= 38000 seconds CPU time


Your ~2,000 second results are short WUs, running normally, and your ~20,000 second results are long WUs running normally.

Anything more than that is a problem. For example, look at your result 56773680 (~38,500 seconds). It contains the section:

2006-12-05 09:23:20.7530 [normal]: Start of BOINC application 'einstein_S5R1_4.28_i686-apple-darwin'.
2006-12-05 09:23:20.7676 [normal]: Started search at lalDebugLevel = 0
2006-12-05 09:23:22.1964 [normal]: Found checkpoint-file 'Fstat.out.ckp'
Failed to read checkpoint-counters from 'Fstat.out.ckp'!
2006-12-05 09:23:22.1971 [normal]: No usable checkpoint found, starting from beginning.

So the early results got lost, and a substantial part of the work had to be re-done: that's where the extra time went.
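If you want to check your other slow results for the same thing, a small Python sketch like the one below can count how often a result's output reports a lost checkpoint. The marker string is copied from the log above; the function name `count_lost_checkpoints` is just an illustration, and you would have to paste in the text of each result yourself.

```python
# Marker text copied from the result log quoted above.
RESTART_MARKER = "No usable checkpoint found, starting from beginning."

def count_lost_checkpoints(log_text):
    """Count the lines that report a restart from the beginning."""
    return sum(1 for line in log_text.splitlines() if RESTART_MARKER in line)

# The excerpt from result 56773680 shown above.
excerpt = """\
2006-12-05 09:23:20.7530 [normal]: Start of BOINC application 'einstein_S5R1_4.28_i686-apple-darwin'.
2006-12-05 09:23:20.7676 [normal]: Started search at lalDebugLevel = 0
2006-12-05 09:23:22.1964 [normal]: Found checkpoint-file 'Fstat.out.ckp'
Failed to read checkpoint-counters from 'Fstat.out.ckp'!
2006-12-05 09:23:22.1971 [normal]: No usable checkpoint found, starting from beginning.
"""

print(count_lost_checkpoints(excerpt))
```

A count above zero on a result means at least part of the work was thrown away and repeated, which would account for a CPU time well beyond the normal ~20,000 seconds.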

It doesn't say in the text why the program restarted. Possible reasons are:
a) You switched the machine off and went to bed!
b) You are crunching multiple projects, and the machine gave some time to another one.
c) You started doing some other work on the machine, and BOINC is set to exit when the machine is in use.

If the situation is (b) or (c), you could try changing your preferences (this board, 'Your account', click on 'General preferences'). The two to look at are the second and fourth in the top group: "Do work while computer is in use?" and "Leave applications in memory while suspended?". If you change both of these to 'yes' (then save and update the BOINC Manager), you may have a lower chance of this kind of error.

(Some people have said that 'leave in memory' can cause problems on some Macs - keep an eye on it, and be prepared to switch the preferences back if it doesn't work out).

Omikronman
Joined: 23 Nov 06
Posts: 33
Credit: 83254
RAC: 0

Yes, I use the settings "Do

Message 50907 in response to message 50906

Yes, I run with the settings "Do work while computer is in use?" and "Leave applications in memory while suspended?" both set to "yes". I have seen that work has to be re-done when I exit BOINC without first switching the running work to "pause". If I set it to "pause" and then exit BOINC, it is possible to continue later. The work info says that sometimes the checkpoint can't be found. O.o
