Won't finish in time?

tng*
Joined: 6 Sep 05
Posts: 13
Credit: 66,244,577
RAC: 0
Topic 193690

I'm getting the following message when I request work from Einstein:

5/21/2008 6:58:41 PM|Einstein@Home|Sending scheduler request: To fetch work. Requesting 6985 seconds of work, reporting 0 completed tasks
5/21/2008 6:58:46 PM|Einstein@Home|Scheduler request succeeded: got 0 new tasks
5/21/2008 6:58:46 PM|Einstein@Home|Message from server: No work sent
5/21/2008 6:58:46 PM|Einstein@Home|Message from server: (won't finish in time) Computer on 99.8% of time, BOINC on 100.0% of that

This is on a system that is currently crunching 7 Einstein tasks and 1 proteins@home task on its eight cores, with no tasks waiting to run. The result duration correction factor for this host is 0.72277. I'm getting the same message on other hosts.

Clearly, this host would be able to finish before the deadline. The only explanation that I can come up with is that proteins (and some other projects that this host is attached to, but which do not have work at the moment) runs at a much higher priority than einstein, and the relative priorities are being taken into account. Is this what is happening? If not, anybody have any idea what is happening?

Sou'westerly
Joined: 9 Jun 06
Posts: 57
Credit: 715,838
RAC: 0

The server messages are thus:
2008-05-22 00:06:43.1383 [PID=30379] [debug ] in_send_results_for_file(h1_1105.60_S5R3, 0) prev_result.id=97901846
2008-05-22 00:06:43.1539 [PID=30379] [debug ] est cpu dur 89456.300922; running_frac 0.998009; est 64923.272133
2008-05-22 00:06:43.1540 [PID=30379] [debug ] [WU#39641115 h1_1105.60_S5R3__738_S5R3b] needs 64923 seconds on [HOST#1307521]; delay_bound is 1555200 (estimated_delay is 1645887.125029)
2008-05-22 00:06:43.1542 [PID=30379] [normal ] [HOST#1307521] Sent 0 results [scheduler ran 0.234769 seconds]
2008-05-22 00:06:43.1552 [PID=30379] [debug ] [HOST#1307521] MSG(high) No work sent
2008-05-22 00:06:43.1552 [PID=30379] [debug ] [HOST#1307521] MSG(high) (won't finish in time) Computer on 99.8% of time, BOINC on 100.0% of that
2008-05-22 00:06:43.1552 [PID=30379] [normal ] sending delay request 60.000000
These indicate to me that you are in debt to other projects and that BOINC is running more or less continuously with a constant connection. It is therefore only downloading Einstein work units when it absolutely needs them, keeping its options open should any of the other projects come online with work.
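For anyone wanting to check the numbers in that log, here is a minimal Python sketch of the deadline test the scheduler appears to be applying: a task is refused when the host's estimated delay plus the task's estimated run time exceeds the delay bound. The function name is made up for illustration, and the comparison is an assumption inferred from the log lines, not copied from the BOINC source.

```python
def wont_finish_in_time(est_seconds, estimated_delay, delay_bound):
    """Return (rejected, overshoot_seconds) for one candidate task.

    Assumed rule: reject when estimated_delay + est_seconds > delay_bound.
    """
    overshoot = estimated_delay + est_seconds - delay_bound
    return overshoot > 0, overshoot


# Values copied from the server log lines for WU#39641115.
EST_SECONDS = 64923.272133        # "est" / "needs 64923 seconds"
ESTIMATED_DELAY = 1645887.125029  # "estimated_delay"
DELAY_BOUND = 1555200             # "delay_bound" (18 days)

rejected, overshoot = wont_finish_in_time(EST_SECONDS, ESTIMATED_DELAY, DELAY_BOUND)
print("rejected:", rejected)
print("overshoot in days:", round(overshoot / 86400, 2))
```

Note that the estimated delay alone (about 19 days) is already longer than the 18-day delay bound, so under this assumed rule any new task would be refused regardless of its own run time.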
Dave.

tng*

Message 81923 in response to message 81922

OK. That's pretty much the way I want it to work -- the other projects are intermittent or unreliable, so I want to crunch them when I can get work from them. When I can't get work from those projects, I know I can count on einstein for some work.

Just never saw this before, and was curious.



Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,115
Credit: 36,579,747,390
RAC: 37,851,381

Quote:
Clearly, this host would be able to finish before the deadline.

That's not how BOINC would see things. You haven't specified the resource share you have allocated to E@H but I'm guessing it would be quite low, say 10-20%. At 10%, BOINC can't afford to have more than 1.8 days E@H work on hand at the very most and there would be safety margins that would reduce it further. You said that you have 7 E@H tasks crunching and (from your results page) a task takes around 10 hours. Because the DCF hasn't reached its proper value yet (I'm guessing it should be a lot lower than 0.72277) the estimated crunch time for each of your results would probably be of the order of 15-20 hours when first downloaded. I'll bet that 7 times this estimate is way larger than whatever BOINC is actually allowed to have, depending on your actual resource share. So BOINC is quite correct in refusing to get more at the moment.
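To put rough numbers on that, here is a small Python sketch. It assumes the 10% resource share guessed above, takes the 18-day deadline from the delay_bound of 1555200 seconds in the server log, and treats the logged "est cpu dur" of 89456 seconds as the uncorrected estimate that gets scaled by the DCF of 0.72277. The "share times deadline" ceiling is just the rule of thumb from this post, not BOINC's actual work-fetch code, and all variable names are illustrative.

```python
SECONDS_PER_DAY = 86400

delay_bound_days = 1555200 / SECONDS_PER_DAY  # deadline from the server log: 18 days
resource_share = 0.10          # assumption: the 10% guessed in the post
raw_estimate_s = 89456.300922  # "est cpu dur" from the server log (assumed uncorrected)
dcf = 0.72277                  # duration correction factor reported for the host
tasks_on_hand = 7              # E@H tasks currently crunching
actual_task_hours = 10         # approximate observed run time per task

# Ceiling on E@H work under the rule of thumb: share x deadline.
allowed_days = resource_share * delay_bound_days

# What each task is estimated to need with the current DCF.
est_hours_per_task = raw_estimate_s * dcf / 3600

# Total estimated E@H work on hand, counted the way the post counts it.
queued_days = tasks_on_hand * est_hours_per_task / 24

# DCF that would make the estimate match a 10-hour task.
settled_dcf = actual_task_hours * 3600 / raw_estimate_s

print(f"allowed: {allowed_days:.1f} days")
print(f"estimate per task: {est_hours_per_task:.1f} hours")
print(f"on hand: {queued_days:.1f} days")
print(f"DCF for 10-hour tasks: {settled_dcf:.2f}")
```

With these inputs the estimate comes out near 18 hours per task, inside the 15-20 hour range mentioned above, and seven such tasks add up to a bit over 5 days against a ceiling of 1.8 days. The DCF would need to fall to roughly 0.40 for the estimate to match a 10-hour task.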

You can't force BOINC to download more E@H work under these conditions since BOINC will want to run one of the other projects as soon as one of them can actually supply some work. BOINC will not let a CPU lie idle so it will get more E@H work on a "just-in-time" basis, even though this will make the debt situation worse. When other projects get work, expect E@H not to run for quite a long time.

Quote:
The only explanation that I can come up with is that proteins (and some other projects that this host is attached to, but which do not have work at the moment) runs at a much higher priority than einstein ....

BOINC does NOT prioritize any particular project on anything else but the resource share you set and the debts that evolve from the crunching history of the host. BOINC doesn't play favourites :-). You need to think through the implications of the resource shares you set and the likely abilities of various projects to supply work. If you give your backup project a low share, expect it to have debt related problems when it is forced to run beyond its share like at the moment.

Cheers,
Gary.

RandyC
Joined: 18 Jan 05
Posts: 2,663
Credit: 108,317,431
RAC: 19,918

Message 81925 in response to message 81924

Quote:

BOINC does NOT prioritize any particular project on anything else but the resource share you set and the debts that evolve from the crunching history of the host. BOINC doesn't play favourites :-). You need to think through the implications of the resource shares you set and the likely abilities of various projects to supply work. If you give your backup project a low share, expect it to have debt related problems when it is forced to run beyond its share like at the moment.

You should also look at your 'Computer is connected every...' value. It should be set to something realistic rather than the maximum. Use the 'Maintain enough work for an additional...' value to increase the size of your cache by up to an additional 10 days. If you're using a broadband internet connection, try setting the 'connected every...' value to 1 day or less. If you are a dial-up user, set it to what you need to...

I use cable and my values are 0.1 and 1.5 days respectively. I always have plenty of work and I'm attached to at least two projects on each system.

Seti Classic Final Total: 11446 WU.
