Deadlines too short?

Grimm
Joined: 22 Jan 05
Posts: 40
Credit: 239433118
RAC: 80397
Topic 187369

Hello all!

Just joined this project. I downloaded my first two workunits last night. On my old machine (500 MHz) it looks like a workunit will take about 40 hours to process. Since I am splitting my time 50/50 with SETI, by my calculations the second workunit will complete just a couple of hours before the deadline. This seems like an awfully tight margin to work in. I have bumped up the Einstein priority so it will now get 66% of the processing time, but I would still like to suggest a longer report deadline. Two weeks instead of one might be more appropriate.
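The arithmetic behind that estimate can be sketched roughly as below. This is only an illustration: it assumes a one-week deadline, both workunits downloaded at the same moment, crunching 24/7, and workunits processed one after the other. The function names are made up for the example.

```python
def wall_clock_hours(cpu_hours, share):
    """Wall-clock hours needed to get cpu_hours of CPU time at a given share."""
    return cpu_hours / share


def margin_hours(cpu_hours_per_wu, n_wus, share, deadline_days):
    """Hours of slack left once all workunits have finished back to back."""
    total = n_wus * wall_clock_hours(cpu_hours_per_wu, share)
    return deadline_days * 24 - total


# Two 40-hour workunits, 50/50 split, one-week deadline (assumed).
print(margin_hours(40, 2, 0.50, 7))            # 8.0
# Same workunits after raising the project to a 66% share.
print(round(margin_hours(40, 2, 0.66, 7), 1))  # 46.8
# Same 50/50 split with the suggested two-week deadline.
print(margin_hours(40, 2, 0.50, 14))           # 176.0
```

Any time the machine is switched off or busy with other things comes straight out of that margin.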

Thanks.

groucho
Joined: 18 Jan 05
Posts: 5
Credit: 37601
RAC: 0

I totally agree.

G

Military intelligence is a contradiction in terms (Groucho Marx)

Buster Gunn
Joined: 22 Jan 05
Posts: 8
Credit: 1637782
RAC: 0

Hi, just joined a week ago. I agree: deadlines too short and downloads too big.

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

Message 1936 in response to message 1935

> Hi, just joined a week ago. I agree: deadlines too short and downloads too
> big.

The large downloads don't happen very often. Most of the WUs are generated more or less instantly during normal scheduler contact.

Shaktai
Joined: 8 Nov 04
Posts: 183
Credit: 426451
RAC: 0

As mentioned in some other threads, the project is still in the testing stage. The deadlines are being kept intentionally short during the testing period to better facilitate result analysis. The project team has announced that the deadlines will be extended when the project goes public. Remember that you "volunteered for testing." That is why you got the invite.

It shouldn't be too much longer before they go public, though; then we should all see some longer deadlines. When we (early testers) started, the deadlines were about 12 days, but the need for quicker feedback to speed development led to shorter deadlines.

The Ox
Joined: 22 Jan 05
Posts: 11
Credit: 15294359
RAC: 0

I've kept my cache size very low because both Einstein and Predictor are essentially in test phases. When you're running 4 projects at once (5 if LHC comes back online soon like they're hoping), a short cache is the only way to meet these deadlines.

Now CPDN has nice WU deadlines. If only all projects could be like CPDN... Hehe.


www.clintcollins.org - spouting off at the speed of site

Borged by MGP
Joined: 22 Jan 05
Posts: 12
Credit: 95513713
RAC: 0

I appreciate the project is still in testing; however, the short deadlines are now causing problems with WUs not being returned in time.

Rarely are PCs dedicated to Einstein alone (indeed, the idea of boinc is to have at least a couple of projects running in case one goes down). Some of the problems that have been highlighted recently relate to multiple-project setups, I believe, so running non-dedicated is ideal for this beta stage of the project.

The timesharing on slower PCs is resulting in many deadlines being missed. Yet the PC keeps crunching the older WU that is out of time. Meanwhile a new WU has downloaded, even with only a 1 day cache. As that is not started promptly, it too will overrun the deadline. This does not help the project's troubleshooting of results; it does the opposite by adding delay.

This is compounded by boinc not having any facility enabling individual WUs to be prioritised as their deadlines approach. Or even a facility to delete or reject a single WU, without resetting the entire project.

I know that longer crunching deadlines will be set when the project goes public. Can at least an extended time, even if only part of the eventual limit, be implemented now to reduce the current no-reply issues?
______

Edit: Oh, and I thought I'd be clever and temporarily up the resource share of the project, then update so the PC would see that until its next expected update and thus give more priority to the expiring WUs. All that's done is give me even more WUs to do, which there is no hope of returning in time.

Heffed
Joined: 18 Jan 05
Posts: 257
Credit: 12368
RAC: 0

If your system still goes over with 1 day, try less than a day: 0.01 or similar.

And it's interesting that you mention increasing the resource share resulted in more WUs, as there are those that see no difference on their systems. It's extremely obvious on my system. (Copied and pasted from another thread below.)

Out of curiosity, what were the shares you used, how many WUs did it give you, and what version of the CC are you running?

From another thread.

> I have Pirates set to 1000 and Einstein set to 50. They both request 8640
> seconds of work on each connection.

Well then I'm totally confused, because that's not what I see at all. Currently the only projects I have asking for work on my log are LHC and Pirates, so I'll use them as an example.

Resource shares for 5 projects at 100/(5.71%), E@H at 250/(14.29%), and Pirates at 1,000/(57.14%). Connect to server every 3 days. CC is version 4.19.

--- - 2005-01-29 20:40:26 - Insufficient work; requesting more
LHC@home - 2005-01-29 20:40:26 - Requesting 27811 seconds of work

--- - 2005-01-29 20:51:04 - May run out of work in 3.00 days; requesting more
Pirates@Home - 2005-01-29 20:51:04 - Requesting 269484 seconds of work

That's quite a huge difference in what they are requesting!
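The percentages quoted for those resource shares are just each share divided by the total of all shares. A quick sketch to check them follows; the `project_N` names are placeholders for the five unnamed projects at 100, and the function name is made up.

```python
def share_percentages(shares):
    """Convert raw resource shares into percentages of the total."""
    total = sum(shares.values())
    return {name: round(100 * value / total, 2) for name, value in shares.items()}


# Five projects at 100 (placeholder names), E@H at 250, Pirates at 1000.
shares = {f"project_{i}": 100 for i in range(1, 6)}
shares["Einstein@Home"] = 250
shares["Pirates@Home"] = 1000

pct = share_percentages(shares)
print(pct["project_1"], pct["Einstein@Home"], pct["Pirates@Home"])
# 5.71 14.29 57.14
```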

In fact, just after joining E@H, I was adjusting resource shares because I didn't know what to expect from the project (as well as just upgrading my CC from 4.13 to 4.15). When I set all projects to an equal resource share (100) with a "connect to server" of 1 day, I get 3-4 WUs in my queue on average from each project.

When I set Pirates to 1000, I need to boost my "connect to server" to 3 days to get the same number of WUs (3-4) from each project. (Not a problem because Pirates rarely has work)

If it's not looking at resource share, what would explain this?

Shaktai
Joined: 8 Nov 04
Posts: 183
Credit: 426451
RAC: 0

Is the problem really that the deadlines are too short, or is it that the caches are too big? This is not the first project to experience that problem, but it is more obvious here.

There are a lot of factors involved, and the development teams are working on them. Some of the solutions may take time however.

The method that I have adopted for calculating my cache is to take the shortest deadline for any project I am working on and divide it by 3. That then becomes my cache time for "all" projects. Doing that, all my projects are kept fresh, all my computers complete work on time, and all work units get returned well before the deadline. My oldest computer, however, is 600 MHz, so it may be that computers with less than 300 MHz need more consideration. In reality, I set it much lower than that, to accommodate the testing projects that change frequently. That method is also based upon 24/7 crunching. If you crunch less than that, you may need to adjust further for the percentage of the day.
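As a rough sketch of that rule of thumb: take the shortest deadline, divide by 3, and scale down for part-time crunching. The linear scaling by uptime is an assumption made for the example, as are the function name and the sample deadlines.

```python
def cache_days(deadlines_days, uptime_fraction=1.0, divisor=3):
    """Suggested cache size in days for all projects.

    deadlines_days:  deadline (in days) of each project being run
    uptime_fraction: share of the day the machine crunches (1.0 = 24/7);
                     scaling linearly by this is an assumption
    """
    if not deadlines_days:
        raise ValueError("need at least one project deadline")
    if not 0 < uptime_fraction <= 1:
        raise ValueError("uptime_fraction must be in (0, 1]")
    return min(deadlines_days) / divisor * uptime_fraction


# Hypothetical deadlines of 7, 14 and 28 days, crunching 24/7.
print(round(cache_days([7, 14, 28]), 2))       # 2.33
# Same projects on a machine that only crunches 12 hours a day.
print(round(cache_days([7, 14, 28], 0.5), 2))  # 1.17
```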

Resource sharing does work and does compensate for the number of projects, but it doesn't do it immediately. A project and client must "update" before it recognizes the changed resource share. If you don't do a manual update on all clients and all projects after changing resource share, then the next time the client connects to a project and requests work (the automatic update), it will remember the old settings. It will receive the new settings when it connects, but will have already requested a higher amount of work based upon the old settings. The new resource share settings will take effect on the second update after the change. This is a design flaw, admittedly, and they haven't yet found an adequate fix.
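A toy model of that "second update" behaviour is sketched below. It is not how the BOINC client actually sizes its requests; it simply assumes the request is the cache length times the locally stored share fraction, and that the new share only arrives in the scheduler's reply. All names are made up.

```python
def simulate_updates(old_fraction, new_fraction, cache_seconds, n_contacts=3):
    """Seconds of work requested on each scheduler contact after the
    resource share was changed on the website (toy model, see assumptions)."""
    local = old_fraction  # what the client still believes
    requests = []
    for _ in range(n_contacts):
        # The request is built from the locally stored share...
        requests.append(round(cache_seconds * local))
        # ...and only the reply delivers the new setting.
        local = new_fraction
    return requests


# Share cut from 50% to 10% with a 1-day cache (86400 seconds).
print(simulate_updates(0.50, 0.10, 86400))  # [43200, 8640, 8640]
```

The first contact still asks for work at the old share; only from the second contact onward does the request reflect the change.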

Simply speaking, if you are running multiple projects and cannot complete the work units in time, then set a smaller cache, a much smaller cache. If you can, do a manual update on all clients and all projects. Folks with large farms may not be able to do that, so just remember that the new resource share may not take effect until the "second update" after.

Borged by MGP
Joined: 22 Jan 05
Posts: 12
Credit: 95513713
RAC: 0

My caches had been set to 1 day. At the moment I haven't split PCs into different groups of work, home, and school. I have around 20 PCs including those at the office, with widely varying specs: the fastest P4 HTs run all 4 boinc projects, and the slowest PIII 400 runs only 1 (seti). The cache is set both to limit the number of connections needed per day on the office network and to minimise the chance of the single-project PCs drying up when their project's server goes offline.

As for Einstein, the preferences were upgraded from a 200 Seti / 60 Einstein / 60 Predictor share, to 250 for Einstein giving around 40% Seti, 60% Einstein / 8 Predictor (which could give me an issue with Predictor as their deadlines are short too, although the WUs crunch in far less time than Einstein).

The particular PC is an Athlon 1800XP laptop running Win XP Pro, 256MB RAM.
It wanted 44660 seconds of work on triggering the update, deciding it was going to run out of work in 1.00 days.

I do note the points made about cache sizes and resource sharing, as posted by Shaktai, and will see if I can make adjustments with that in mind. It will take a couple of days for that to work through the farm though.

I do remain of the view that the deadlines are too short given that the WUs are bigger than those of seti, around 2/3s as big in time terms. Short deadlines on WUs taking an hour or two to crunch on an average PC are fine. There ought to be some sort of multiplier of the expected crunching time that could factor an optimum time?
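One way such a multiplier could look, purely as an illustration: scale the estimated crunch time by a fixed factor and never go below some minimum. The multiplier of 8 and the 1-day floor are made-up numbers, not anything the project has proposed.

```python
def suggested_deadline_days(est_crunch_hours, multiplier, min_days=1.0):
    """Deadline in days as a multiple of the estimated crunch time,
    never shorter than min_days."""
    return max(min_days, est_crunch_hours * multiplier / 24)


# A 40-hour WU on a slow machine with a (hypothetical) multiplier of 8.
print(round(suggested_deadline_days(40, 8), 1))  # 13.3
# A 2-hour WU hits the 1-day floor instead.
print(suggested_deadline_days(2, 8))             # 1.0
```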

Borged by MGP
Joined: 22 Jan 05
Posts: 12
Credit: 95513713
RAC: 0

Ah, I have another theory as to why many results are timing out. I've been doing Einstein for about a week, since the signup invite.

The client must update to get its new settings. When it first connects it has no settings and uses the boinc default of 100. It ends up downloading too much the first time round. Hence, whilst my clients have not downloaded many WUs recently, in the first couple of days of running the project, before the client picked up the resource share preferences, it downloaded too many WUs, which will now time out.
