E@H Cache vs Deadline

Robby
Joined: 18 Jan 05
Posts: 33
Credit: 403342998
RAC: 2974

Looks like gen prefs do apply across all projects; see http://boinc-doc.net/index.php and http://boinc-doc.net/site-common/glossary/letter-g.php .

But just to test, I left S@H at 5 days and E@H at 2 days to see if it actually works this way. Running the Linux CLI with the -return_results_immediately option. Slow machine, so it may take a while.
http://boinc.berkeley.edu/client_unix.php

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1968 in response to message 1967

> Looks like gen prefs do apply across all projects see
> http://boinc-doc.net/index.php and
> http://boinc-doc.net/site-common/glossary/letter-g.php .
>
> But just to test I left S@H at 5 and E@H at 2 days to see if it actually works
> this way. Running Linux cli with the -return_results_immediately option.
> Slow machine so may take awhile.
> http://boinc.berkeley.edu/client_unix.php

Or, you can just look at what it says on the general preferences page. ;)

General preferences

These apply to all BOINC projects in which you participate.
On computers attached to multiple projects, the most recently modified preferences will be used.

Trust us, it works this way. :)

Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1000560
RAC: 0

Message 1969 in response to message 1965

> The BOINC scheduler does not take deadlines into account when deciding which
> WUs to crunch, or even whether to download more WUs. I have been attempting
> to bring this to the attention of the BOINC developers for quite a while.
>
I don't think it'd be right for BOINC to assign priority to work units based solely upon the deadline. There must be some other mix of criteria to handle this problem. If BOINC looked only at deadlines, then SETI units would never be crunched, since they have two-week deadlines. Or, looking at it another way, a project could lower its deadline and force BOINC users to crunch more work for it.

I don't really think that any project would intentionally rob cpu time from other projects, but that would be the result.

Hope this idea helps.

tony

genes
Joined: 10 Nov 04
Posts: 41
Credit: 3488217
RAC: 10212

Here's my observation on the Cache vs Deadline debate, FWIW. I'm still crunching 4.71 WU's which were deadlined at 1/25, but I'm letting them finish rather than dumping them. I don't really want to intervene in the operation of BOINC, unless necessary to update software, recover from crashes, etc. If I get no credit for this work, then so be it.

The problem seems to be that it's not so much that the deadline for Einstein is too short, it's that the client requests too much work. All my machines are connected to at least 5 projects (including LHC), and some are also connected to Pirates. The BOINC CC appears to request work FOR EACH PROJECT as if that project is the only one operating on a system. When Predictor came back online, my Einstein WU's started to fall behind, since Einstein was now receiving less actual attention on each machine. When LHC comes back online either this week or next, the problem is going to get worse. I may never finish an Einstein WU within the deadline.

Now, everybody seems to think that this is a problem with the deadlines, which is partly true, considering that I COULD be running my machines far less than the 24/7 that I do. Einstein should at least extend the deadlines (or accept the late work anyway) to cover the real-world behavior of the BOINC client and their apps. BUT, the problem really is the dishonest nature of the client requesting more work than it can do. It should take into account the assigned resource share when requesting work. I mean, it knows the amount, since it calculates and displays it for you, so why not use it? On my systems, Einstein gets no more than 20% (assuming LHC active), but in reality 25%, since there is no LHC work. Also, the new Einstein WU's appear to take far longer than the estimates, compounding the problem.
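To put rough numbers on the difference, here is a minimal sketch of a request sized as if the project were alone on the host versus one scaled by resource share. It is an illustration only, not the actual BOINC client logic: the function names, the five-project host, and the equal shares are all assumptions made up for the example.

```python
SECONDS_PER_DAY = 86400

def naive_request(cache_days):
    """Work request that treats the project as the only one on the host."""
    return cache_days * SECONDS_PER_DAY

def share_scaled_request(cache_days, shares, project):
    """Work request scaled down to the project's fraction of the total share."""
    total = sum(shares.values())
    return cache_days * SECONDS_PER_DAY * shares[project] / total

# Hypothetical host: five projects with equal resource shares, 2-day cache.
shares = {"Einstein": 100, "SETI": 100, "Predictor": 100, "LHC": 100, "Pirates": 100}

print(naive_request(2))                             # seconds for a full 2-day cache
print(share_scaled_request(2, shares, "Einstein"))  # seconds for a 20% share of it
```

With a 20% share, the scaled request is one fifth of the naive one, which is the gap being described here.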

BTW, I have my cache set to 2 days. I used to have it set to 5 days, back in the time when the clients would go crazy if they ran out of work on dual processor machines. (I don't know if that's been fixed, but I don't feel like trying it out.) I think that's a reasonable amount and that I shouldn't have to second-guess the client and set the cache based on the observed behavior of the client and the project deadlines.

Sorry for the rather long rant ;-) I've been watching this happen the last few days and it's starting to bug me, as most people seem to be missing the point.

-Gene


Honza
Joined: 10 Nov 04
Posts: 136
Credit: 3332354
RAC: 0

I agree with Gene, or at least with what he has posted.
There is an option in BOINC Manager (now in version 4.62) that allows the user to pause some WU(s) so that a WU near its deadline can be computed ahead of the others. This is, I think, a good option to have.

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1972 in response to message 1970

> It should take into account the assigned
> resource share when requesting work. I mean, it knows the amount, since it
> calculates and displays it for you, so why not use it? On my systems,
> Einstein gets no more than 20% (assuming LHC active), but in reality 25%,
> since there is no LHC work. Also, the new Einstein WU's appear to take far
> longer than the estimates, compounding the problem.

It does take resource share into account, as well as how much up time BOINC has on your machine when assigning work. If you aren't seeing any difference with resource share, perhaps make a more extreme adjustment to see for yourself? Put one of your projects at 10000 and see how many WUs the other projects download.

keputnam
Joined: 18 Jan 05
Posts: 47
Credit: 84986834
RAC: 26893

Message 1973 in response to message 1972

> > It should take into account the assigned resource share when requesting
> > work. I mean, it knows the amount, since it calculates and displays it for
> > you, so why not use it? On my systems, Einstein gets no more than 20%
> > (assuming LHC active), but in reality 25%, since there is no LHC work.
> > Also, the new Einstein WU's appear to take far longer than the estimates,
> > compounding the problem.
>
> It does take resource share into account, as well as how much up time BOINC
> has on your machine when assigning work. If you aren't seeing any difference
> with resource share, perhaps make a more extreme adjustment to see for
> yourself? Put one of your projects at 10000 and see how many WUs the other
> projects download.

Maybe in theory, but this is not observed in the real world (at least on my machines).

At one point, I had three Einstein WUs on one of my machines, at least two of which would obviously miss their deadline (the third just barely squeaked in), and the CC downloaded two more! None of my machines are really state of the art, and most are running all of the public projects, but that is beside the point. If I can't finish the WUs I have before their deadline(s), why did the CC ask for more work from that project?

I have varying resource shares, (weighted for Einstein and Pirates) and I do see those two projects get more CPU time, so that part DOES work.


John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

Message 1974 in response to message 1973

> > > It should take into account the assigned resource share when
> > > requesting work. I mean, it knows the amount, since it calculates and
> > > displays it for you, so why not use it? On my systems, Einstein gets no
> > > more than 20% (assuming LHC active), but in reality 25%, since there is
> > > no LHC work. Also, the new Einstein WU's appear to take far longer than
> > > the estimates, compounding the problem.
> >
> > It does take resource share into account, as well as how much up time
> > BOINC has on your machine when assigning work. If you aren't seeing any
> > difference with resource share, perhaps make a more extreme adjustment to
> > see for yourself? Put one of your projects at 10000 and see how many WUs
> > the other projects download.
>
> Maybe in theory, but this is not observed in the real world (at least on my
> machines).
>
> At one point, I had three Einstein WUs on one of my machines, at least two
> of which would obviously miss their deadline (the third just barely squeaked
> in), and the CC downloaded two more! None of my machines are really state of
> the art, and most are running all of the public projects, but that is beside
> the point. If I can't finish the WUs I have before their deadline(s), why
> did the CC ask for more work from that project?
>
> I have varying resource shares, (weighted for Einstein and Pirates) and I do
> see those two projects get more CPU time, so that part DOES work.

I have Pirates set to 1000 and Einstein set to 50. They both request 8640 seconds of work on each connection. There is a problem if any one of the following is true of a system or WU: (1) you get a WU whose ratio of Crunch Time Remaining to Time To Deadline is nearly one, and you have any other work (even with a much longer deadline); (2) a slow machine finally gets work from ALL of the projects (remember, most of them have been off recently); (3) an outage of some sort takes your machines offline for a couple of days...

A solution to these would be to check for a time crunch problem at the hourly switchover to new work. If there is a time problem, switch to selecting the WU with the closest deadline until the time problem is resolved. If you are in time trouble, do not download any more work (more work will only make the situation worse), and add some flag to the UI indicating the problem. Since the resource usage debt is being tracked, I would suggest that a project with a negative debt (it has done more than its share of processing recently) should not be contacted for more work unless a CPU is starved for work and no project with higher debt will give work. This does mean a change to the scheduler so that it will calculate debt for projects that do not currently have work downloaded on the host.
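A rough sketch of how that proposal might look in code follows. It is not taken from the BOINC client: the `WorkUnit` fields, the function names, and in particular the test for "time trouble" (a WU would finish late if its project only got its normal share of the CPU) are assumptions chosen for illustration, and the "CPU starved and no higher-debt project has work" condition is folded into a single `cpu_starved` flag.

```python
from dataclasses import dataclass

@dataclass
class WorkUnit:
    project: str
    remaining_s: float  # estimated crunch time remaining, in seconds
    deadline_s: float   # seconds from now until the deadline

def in_time_trouble(queue, cpu_share):
    """True if some WU would miss its deadline under normal share-based scheduling.

    cpu_share maps project -> fraction of CPU time it normally receives.
    """
    finish = {}  # per-project projected wall-clock finish time
    for wu in sorted(queue, key=lambda w: w.deadline_s):
        finish[wu.project] = finish.get(wu.project, 0.0) + wu.remaining_s / cpu_share[wu.project]
        if finish[wu.project] > wu.deadline_s:
            return True
    return False

def pick_next(queue, cpu_share, debts):
    """Choose the WU to run at the hourly switchover."""
    if not queue:
        return None
    if in_time_trouble(queue, cpu_share):
        # Time trouble: run the closest deadline first.
        return min(queue, key=lambda w: w.deadline_s)
    # Otherwise run work from the project that is owed the most CPU time.
    return max(queue, key=lambda w: debts.get(w.project, 0.0))

def may_fetch(project, queue, cpu_share, debts, cpu_starved=False):
    """Decide whether to contact a project for more work."""
    if in_time_trouble(queue, cpu_share):
        return False  # more work would only make things worse
    if debts.get(project, 0.0) < 0:
        return cpu_starved  # over-served projects only when a CPU would sit idle
    return True

# Hypothetical queue: a tight Einstein WU next to a SETI WU with two weeks to go.
queue = [WorkUnit("Einstein", 20 * 3600, 24 * 3600),
         WorkUnit("SETI", 10 * 3600, 14 * 86400)]
cpu_share = {"Einstein": 0.5, "SETI": 0.5}
debts = {"Einstein": -100.0, "SETI": 100.0}

print(in_time_trouble(queue, cpu_share))
print(pick_next(queue, cpu_share, debts).project)
print(may_fetch("SETI", queue, cpu_share, debts))
```

In this example the Einstein WU needs 20 hours of crunching but would take 40 wall-clock hours at a 50% share, so the host counts as being in time trouble, the Einstein WU is picked first, and no new work is requested.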

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1975 in response to message 1974

> I have Pirates set to 1000 and Einstein set to 50. They both request 8640
> seconds of work on each connection.

Well then I'm totally confused, because that's not what I see at all. Currently the only projects I have asking for work on my log are LHC and Pirates, so I'll use them as an example.

Resource shares: 5 projects at 100 (5.71% each), E@H at 250 (14.29%), and Pirates at 1,000 (57.14%). Connect to server every 3 days. CC is version 4.19.

--- - 2005-01-29 20:40:26 - Insufficient work; requesting more
LHC@home - 2005-01-29 20:40:26 - Requesting 27811 seconds of work

--- - 2005-01-29 20:51:04 - May run out of work in 3.00 days; requesting more
Pirates@Home - 2005-01-29 20:51:04 - Requesting 269484 seconds of work

That's quite a huge difference in what they are requesting!
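The percentages and the ratio between the two requests can be checked with a few lines of Python. This is only arithmetic on the figures quoted above; it says nothing about how the CC actually computes a request. The four `other_*` names are placeholders for the 100-share projects not named here, and treating LHC as one of the five 100-share projects is an assumption.

```python
# Resource shares as quoted in the post. Placeholder names stand in for the
# 100-share projects that are not named; LHC at 100 is an assumption.
shares = {
    "LHC@home": 100,
    "other_1": 100,
    "other_2": 100,
    "other_3": 100,
    "other_4": 100,
    "Einstein@Home": 250,
    "Pirates@Home": 1000,
}
total = sum(shares.values())
percent = {name: 100 * s / total for name, s in shares.items()}

# Work requests copied from the log lines above, in seconds.
requested = {"LHC@home": 27811, "Pirates@Home": 269484}

share_ratio = shares["Pirates@Home"] / shares["LHC@home"]
request_ratio = requested["Pirates@Home"] / requested["LHC@home"]

for name in ("LHC@home", "Einstein@Home", "Pirates@Home"):
    print(f"{name}: {percent[name]:.2f}%")
print(f"share ratio {share_ratio:.1f}, request ratio {request_ratio:.2f}")
```

The shares come out at 5.71%, 14.29% and 57.14%, matching the figures above, and the Pirates request is a little under ten times the LHC request, close to the 10:1 ratio of the two resource shares.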

In fact, just after joining E@H, I was adjusting resource shares because I didn't know what to expect from the project (and I had also just upgraded my CC from 4.13 to 4.15). With all projects set to an equal resource share (100) and a "connect to server" of 1 day, I get 3-4 WUs in my queue on average from each project.

When I set Pirates to 1000, I need to boost my "connect to server" to 3 days to get the same number of WUs (3-4) from each project. (Not a problem because Pirates rarely has work)

If it's not looking at resource share, what would explain this?

Robby
Joined: 18 Jan 05
Posts: 33
Credit: 403342998
RAC: 2974

After finally setting E@H to 1 day and leaving S@H at 5, the S@H cache on this computer dropped (from 4 down to 2) as expected, but the E@H cache unexpectedly grew by one extra WU (from 3 to 4). H1_0050.9_0051.4_0.1_T02_Test, after completing, hung at 100% and apparently killed the E@H client. The BOINC CC continued, but it stopped switching between projects and no further progress was made in either E@H or S@H.

After waiting about half a day and seeing no change, I stopped and then restarted the BOINC CC. Neither project was running, and the WU above restarted from 0% although it was past the deadline. The BOINC bash script is also reporting CPU usage of 23h 48m for E@H and 16h 19m for S@H, which doesn't seem to be correct. This may in part be related to using the 2.6.8 kernel. At this point I reset E@H, and I currently have the 2 S@H WUs and just 1 new E@H WU, which seems about right for a 1 day setting on this old machine running BOINC CC 4.19 and E@H 4.73, with S@H still at 4.02.

I want to leave S@H with a larger cache, since two other machines are only crunching S@H WUs and I am anticipating some outages with the rumored killing off of S@H Classic and 100% migration to BOINC in the near future.
