Too many WU's

Paul D. Buck

Joined: 17 Jan 05

Posts: 754

Credit: 5385205

RAC: 0

5 Mar 2005 15:22:08 UTC

Message 6665 in response to message 6663

> The setting I like, that is currently partially enabled at seti, is a project
> imposed maximum queue. You can not set your queue longer than 10 days at seti.
> The reason I consider this partially enabled is you can still set your queue
> higher at another project and seti will not lower it automatically.

Yes, but if there were "capped" settings, would they affect the scheduler's decisions enough?

Einstein@Home has a 7 day deadline, which is fine, but with my 2 day queue size I was still overrunning the deadlines on a regular basis.

Jayargh

Joined: 9 Feb 05

Posts: 64

Credit: 1205159

RAC: 0

5 Mar 2005 16:01:48 UTC

Message 6666 in response to message 6665

> Einstein@Home has a 7 day deadline, which is fine, but with my 2 day queue
> size I was still overrunning the deadlines on a regular basis.
Paul, does your CPU time roughly equal the estimated time to completion, or is it skewed like mine? How about anyone else? If it is skewed, that would explain getting too much work. As I have said in another thread, my machines are taking 25-35% longer than Einstein's benchmark calculations say :( which pushes me too close to the deadlines. I have still not seen this question answered by Bruce or anyone else running Einstein.

Ned Ludd

Joined: 9 Feb 05

Posts: 23

Credit: 56045

RAC: 0

5 Mar 2005 16:18:02 UTC

Message 6667 in response to message 6663

> The setting I like, that is currently partially enabled at seti, is a project
> imposed maximum queue. You can not set your queue longer than 10 days at seti.
> The reason I consider this partially enabled is you can still set your queue
> higher at another project and seti will not lower it automatically.

Trouble is, this is a global setting -- change it for one project and you change it for all of them.

That said, you really don't need to set this very high if you are running more than one project -- you only need to set this higher than a half-day or so if you need to cache work when the single project you crunch could be down.

I'm currently running with "connect every 0.08 days" -- about every two hours, and I've always got E@H and LHC, and I've usually got SETI (not at the moment, obviously).

Paul D. Buck

Joined: 17 Jan 05

Posts: 754

Credit: 5385205

RAC: 0

5 Mar 2005 17:35:58 UTC

Message 6668 in response to message 6666

> > Einstein@Home has a 7 day deadline, which is fine, but with my 2 day queue
> > size I was still overrunning the deadlines on a regular basis.
> Paul does your cputime = roughly time to completion or is it skewed like
> mine? How about anyone else? Thusly getting too much work. As I have said in
> another thread my machines are taking 25-35% longer than what Einstein's
> benchmark calculations say :( Pushing me to too close to deadlines . I have
> still not seen this question answered by Bruce or anyone else running
> Einstein.

For the 6 machines I have these results (times in seconds):

[pre]
Computer  Type/Spd       Est. Time  Avg. Time  Nbr. Results
========  =============  =========  =========  ============
P4b       P4-HT 3.0 GHz     31,860     38,047             7
P4a       P4-HT 3.2 GHz     53,220     50,529             8
EQ-1      P4-HT 3.2 GHz     38,100     37,282            17
EQ-2      P4-HT 3.0 GHz     40,320     38,913             8
RaidServ  P4 2.8 GHz        30,120     29,204             5
Mac-G5a   G5 Dual 2 GHz     28,920     28,527             2
[/pre]

A couple of notes: the P4 3.2 GHz processors are identical but the motherboards are not; this will/should change this week when the updated motherboard arrives.

The non-HT machine of course beats the HT machines in processing time, but will not have the same throughput ...

The times seem to track pretty well, though I do not have a large sample size on completion times, and the estimated times are all from a single value per machine.
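To put a number on "track pretty well", the table can be reduced to a percentage overrun per host. This is only a sketch over the six rows posted above; the struct and function names are made up for illustration and are not part of BOINC.

```cpp
#include <string>
#include <vector>

// One row of the table above: estimated and average measured time, in seconds.
// (Hypothetical type, introduced only for this illustration.)
struct HostTimes {
    std::string name;
    double est_seconds;
    double avg_seconds;
};

// Percentage by which the measured average exceeds the estimate.
// Negative values mean results finish faster than estimated.
double overrun_percent(const HostTimes& h) {
    return (h.avg_seconds / h.est_seconds - 1.0) * 100.0;
}

// The six hosts exactly as listed in the table.
std::vector<HostTimes> table_rows() {
    return {
        {"P4b",      31860.0, 38047.0},
        {"P4a",      53220.0, 50529.0},
        {"EQ-1",     38100.0, 37282.0},
        {"EQ-2",     40320.0, 38913.0},
        {"RaidServ", 30120.0, 29204.0},
        {"Mac-G5a",  28920.0, 28527.0},
    };
}
```

By this measure only P4b runs noticeably over its estimate (roughly 19%); the other five hosts come in a few percent under.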

Bruce Allen

Moderator

Joined: 15 Oct 04

Posts: 1119

Credit: 172127663

RAC: 0

5 Mar 2005 18:33:02 UTC

Message 6669 in response to message 6661

> Another point of confusion is that BOINC tries to keep the queue
> between x and 2x days (4.1x clients) per project.

John, could you please point me to the part of the scheduler code where the factor of 2 appears? I don't see it. I see that when the client requests n seconds of work, the scheduler keeps sending WU until it has exceeded n seconds, then it stops. Where is the factor of 2?

Bruce

Director, Einstein@Home

Keck_Komputers

Joined: 18 Jan 05

Posts: 376

Credit: 5744955

RAC: 0

5 Mar 2005 22:23:51 UTC

Message 6670 in response to message 6669

> > Another point of confusion is that BOINC tries to keep the queue
> > between x and 2x days (4.1x clients) per project.
>
> John, could you please point me to the part of the scheduler code where the
> factor of 2 appears? I don't see it. I see that when the client requests n
> seconds of work, the scheduler keeps sending WU until it has exceeded n
> seconds, then it stops. Where is the factor of 2?
>
> Bruce
>
It's in the file boinc_public/client/cs_scheduler.c, about 2/3 to 3/4 of the way down (sorry, I don't have the line number handy). Oops, on looking closer, the times-2 line is commented out in the public branch too; I thought it was just commented out in the development branch.

[pre]
// determine work requests for each project
// NOTE: don't need to divide by active_frac etc.;
// the scheduler does that (see sched/sched_send.C)
//
p->work_request = max(0.0,
    //(2*work_min_period - estimated_time_to_starvation)
    (work_min_period - estimated_time_to_starvation)
    * ncpus
);
[/pre]
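To see what the commented-out line would change, here is a stand-alone sketch of the same expression with the multiplier pulled out as a parameter. The function name and the `queue_factor` parameter are hypothetical, and treating the inputs as seconds is an assumption; only the arithmetic is taken from the snippet quoted above.

```cpp
#include <algorithm>

// Hypothetical stand-alone version of the quoted expression.
// queue_factor = 1.0 mirrors the active line,
// queue_factor = 2.0 mirrors the commented-out "2*work_min_period" line.
// Units are assumed to be seconds for both time arguments.
double work_request_seconds(double work_min_period,
                            double estimated_time_to_starvation,
                            int ncpus,
                            double queue_factor) {
    return std::max(0.0,
        (queue_factor * work_min_period - estimated_time_to_starvation) * ncpus);
}
```

For a 2-day setting (172,800 s) on one CPU with one day (86,400 s) of work left, the active line asks for 86,400 s while the 2x variant would ask for 259,200 s. With three days of work already queued, the active line asks for nothing because of the clamp to zero.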

BOINC WIKI

BOINCing since 2002/12/8

Bruce Allen

Moderator

Joined: 15 Oct 04

Posts: 1119

Credit: 172127663

RAC: 0

5 Mar 2005 23:32:17 UTC

Message 6671 in response to message 6670

> > > Another point of confusion is that BOINC tries to keep the queue
> > > between x and 2x days (4.1x clients) per project.
> >
> > John, could you please point me to the part of the scheduler code where the
> > factor of 2 appears? I don't see it. I see that when the client requests n
> > seconds of work, the scheduler keeps sending WU until it has exceeded n
> > seconds, then it stops. Where is the factor of 2?
> >
> > Bruce
> >
> It's in the file boinc_public/client/cs_scheduler.c about 2/3 to 3/4 of the
> way down (sorry don't have line number handy). Opps on looking closer the
> times 2 line is commented out in the public branch too, I thought it was just
> commented out in the development branch.
>
> // determine work requests for each project
> // NOTE: don't need to divide by active_frac etc.;
> // the scheduler does that (see sched/sched_send.C)
> //
> p->work_request = max(0.0,
> //(2*work_min_period - estimated_time_to_starvation)
> (work_min_period - estimated_time_to_starvation)
> * ncpus
> );

John,

FWIW, the E@H scheduler and backend code is ALL from BOINC development CVS.

The relevant code is in the same place: sched/sched_send.C:

reply.wreq.seconds_to_fill -= wu_seconds_filled;

The right hand side is obtained from estimate_wallclock_duration().

As far as I can tell, a request for N seconds of work gets the smallest amount possible which is >= N.
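The behaviour described here can be modelled in a few lines. This is a simplified sketch, not the code in sched/sched_send.C: the function name and the idea of passing in a plain list of per-WU estimates are assumptions made for illustration, and only the subtraction step mirrors the quoted line.

```cpp
#include <vector>

// Hypothetical model: hand out work units in order until the request is covered.
// wu_seconds holds the estimated duration of each available WU.
// Returns the total estimated seconds actually sent.
double fill_request(double seconds_to_fill, const std::vector<double>& wu_seconds) {
    double sent = 0.0;
    for (double wu : wu_seconds) {
        if (seconds_to_fill <= 0.0) {
            break;                 // request already covered, stop sending
        }
        seconds_to_fill -= wu;     // same shape as: seconds_to_fill -= wu_seconds_filled
        sent += wu;
    }
    return sent;
}
```

With WUs estimated at 30,000 s each, a request for 86,400 s gets three of them (90,000 s): the first total that reaches the request, with no doubling anywhere on the server side.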

Bruce

Director, Einstein@Home

John McLeod VII

Moderator

Joined: 10 Nov 04

Posts: 547

Credit: 632255

RAC: 0

6 Mar 2005 0:36:19 UTC

Message 6672

This was discussed in the Alpha test mail list. There was a time when the x in "connect every x days" was doubled: when the work queue was worked down to x, it was filled to 2x again. This was changed (I thought with 4.50, but I could be wrong here) to fill the queue to x, and refill when it was worked down to 1/2 x. I have not looked at the code, but am relying on comments in the alpha mail list.
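The two refill rules described above differ only in their low and high marks. The sketch below encodes them as recalled from the mailing list, not as read from the client source, so every name in it is hypothetical and the thresholds are an assumption based on that description.

```cpp
// Hypothetical description of a refill rule, in days of queued work.
struct RefillPolicy {
    double low_water;   // ask for more work once the queue is at or below this
    double high_water;  // fill the queue back up to this level
};

// Older behaviour as described: refill at x, fill to 2x.
RefillPolicy old_policy(double x) { return {x, 2.0 * x}; }

// Newer behaviour as described: refill at x/2, fill to x.
RefillPolicy new_policy(double x) { return {0.5 * x, x}; }

// Days of work to request given the current queue length.
double days_to_request(const RefillPolicy& p, double queued_days) {
    return queued_days <= p.low_water ? p.high_water - queued_days : 0.0;
}
```

With x = 2 days and 2 days already queued, the older rule asks for 2 more days while the newer rule asks for nothing until the queue drops to 1 day.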

BOINC WIKI

Ingleside

Joined: 23 Jan 05

Posts: 33

Credit: 82113555

RAC: 0

26 Mar 2005 17:53:42 UTC

Message 6673 in response to message 6671

>
> FWIW, the E@H scheduler and backend code is ALL from BOINC development CVS.
>
> The relevant code is in the same place: sched/sched_send.C:
>

The request for 2x the amount of work is made in the client, not in the scheduling server.

v4.1x and v4.2x are built from the public branch, and if you take a closer look at the dates you'll see boinc_public/client/cs_scheduler.c still had the 2x until everything was upgraded for the v4.20 release.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
