Report deadline: 1 week!

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Message 3358 in response to message 3357

> 100 hours for a single WU!? That's amazingly slow. If it is that slow, I
> would think that YOU would know to sign up for only ONE project on those
> machines. Assign the really slow machines to ONE project, set their cache
> for .1 days and I doubt you would have any problems.
>
> Eeek!! that sounded a bit aggressive and I didn't mean to give that impression
> so I hope you're not offended, just trying to be helpful.

But one of the main design goals (though they are not written down anywhere concrete that I know of) is to support multiple projects. WAY back in the beta days, about a year ago, we had quite a discussion about these issues, with no resolution because we were debating in advance of knowledge. Now that we are in "production" mode on several projects, some of the points made then are becoming relevant, namely the potential problems with scheduling when the projects do not interact with each other.

The design goal of not requiring inter-project communication is still a valid (and correct in my opinion) choice. However, the client/participant side does have the information about what it is doing. As yet, the capture and use of this information is not occurring.

John should be able to sign up for all available projects, but then, when he is out of work, his client should be able to get some within its limitations. If he signs up for only one project, he does not get the benefit of the assurance that work will always be running.

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

Message 3359 in response to message 3358

> > 100 hours for a single WU!? That's amazingly slow. If it is that slow, I
> > would think that YOU would know to sign up for only ONE project on those
> > machines. Assign the really slow machines to ONE project, set their cache
> > for .1 days and I doubt you would have any problems.
> >
> > Eeek!! that sounded a bit aggressive and I didn't mean to give that impression
> > so I hope you're not offended, just trying to be helpful.
>
> But one of the main design goals (though they are not written down anywhere
> concrete that I know of) is to support multiple projects. WAY back in the
> beta days about a year ago we had quite a discussion about these issues with
> no resolution as we were debating in advance of knowledge. Now that we are in
> "production" mode on several projects some of the discussion points made then
> were about the potential problems with scheduling without more interaction
> between the projects.
>
> The design goal of not requiring inter-project communication is still a valid
> (and correct in my opinion) choice. However, the client/participant side does
> have the information about what it is doing. As yet, the capture and
> use of this information is not occurring.
>
> John should be able to sign up for all available projects, but then, when he
> is out of work, his client should be able to get some within its limitations.
> If he signs up for only one project, he does not get the benefit of the
> assurance that work will always be running.
>
Thank you. Better description than I was using.

Cochise
Joined: 11 Feb 05
Posts: 38
Credit: 3717
RAC: 0

I agree, I see now that I missed the point.

genes
Joined: 10 Nov 04
Posts: 41
Credit: 2816243
RAC: 9109

I've got a total of 6 machines running the available projects of Boinc. They ranged at the time of "the problem" from 550MHz P3 to 3.2GHz P4(HT), including a couple of Dual P3's at 933MHz and 1GHz. All are running every available Boinc project (only a few running Pirates) in keeping with Paul's statement of one of the design goals of Boinc -- always being able to get at least some work. This seems to also be important with dual and HT machines, since they need at least two WU's to be happy.

"The Problem" happened when Predictor came back online. All of a sudden all of these machines started getting lots of Predictor WU's and, since it hadn't run in a loooong time, they got priority. I had a 2-day cache at the time, and yet every one of these machines, slow AND fast, started missing deadlines on the Einstein WU's. By the time I noticed what was happening, it was too late. I set my cache back to 0.5 days, and they gradually worked through the backlog. Some of the after-deadline WU's did receive credit, some did not. I didn't want to just reset all the clients, since, after all, this experience was a valuable test of the Boinc system.

What it showed was that the CC asks for work as if each project is the only one running. Some have claimed that it does proportion according to resource share, but it does not add up the shares and take the total into account. Also, it seems that the "debt" calculation is flawed, since Predictor was able to cause mayhem when it came back online. These parameters should be adjusted so that the CC works smoothly without a lot of operator intervention to perform a balancing act. It seems to me that the changes in the latest client are trending towards MORE operator intervention, not less. While many people (myself included) like to fiddle with it, we should not have to do so in order to prevent loss of the scientific results that we are trying to accomplish.
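
To make the difference concrete, here is a rough sketch of the two behaviours. This is not the actual client code: the function names, the 48-hour (2-day) cache and the four equal resource shares are assumptions for illustration only.

```python
def naive_requests(cache_hours, shares):
    """Hypothetical: every project is asked to fill the whole cache by itself."""
    return {name: cache_hours for name in shares}


def share_weighted_requests(cache_hours, shares):
    """Hypothetical: add up the shares and give each project its fraction of the cache."""
    total = sum(shares.values())
    return {name: cache_hours * share / total for name, share in shares.items()}


# Assumed setup: a 2-day cache and four projects with equal resource share.
shares = {"Einstein": 100, "Predictor": 100, "SETI": 100, "LHC": 100}
cache_hours = 48

naive = naive_requests(cache_hours, shares)
weighted = share_weighted_requests(cache_hours, shares)

print(sum(naive.values()))     # 192 hours of work queued for a 48-hour cache
print(sum(weighted.values()))  # 48.0 hours, matching the cache
```

With the first approach the queue grows with the number of attached projects, which would be consistent with deadlines being missed even on fast machines.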

All of that having been said, I hope I'm prepared for when LHC comes back online this week! (BTW, I recently did upgrade the slowest machine (550MHz P3) to a 933MHz one.)

Sorry for the long post -- I really tried not to rant & hope I succeeded.

Crunch long & Prosper
-Gene


Randy
Joined: 20 Feb 05
Posts: 1
Credit: 55301
RAC: 0

I joined the project today after being a BOINC SETI@Home regular for many months; the lack of new WUs in the last few days (current technical problems) persuaded me to sign up to Einstein. My initial thought is that a 7 day deadline is too short and will discourage crunchers.

I have a single P4 2.6GHz (HT), which ran BOINC SETI@Home exclusively until today.

With HT enabled, the first two WUs from Einstein are predicting completion after 13 to 14 hours. Unlike a number of serious crunchers, the PC is only switched on when I'm using it for other work - typically weekday evenings and longer at the weekends. So providing that usage pattern holds true, the WUs will complete within the deadline window. However, if the PC does not get used for a few days then it's likely the deadline will be missed.
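
As a rough check of that reasoning, here is a small sketch. The function name and the 3 hours of uptime per evening are assumptions for illustration; the 14-hour WU and the 7-day deadline are the figures above. With HT, each of the two WUs needs its own 13 to 14 hours of uptime, so one WU is enough to check.

```python
def meets_deadline(wu_hours, hours_on_per_day, deadline_days=7, idle_days=0):
    """Hypothetical check: is there enough uptime before the deadline to finish one WU?

    idle_days is the number of days within the deadline window on which
    the PC is not switched on at all.
    """
    usable_hours = (deadline_days - idle_days) * hours_on_per_day
    return wu_hours <= usable_hours


# Assumed usage pattern: the PC is on for about 3 hours each evening.
print(meets_deadline(14, 3))               # True: 21 hours of uptime available
print(meets_deadline(14, 3, idle_days=3))  # False: only 12 hours of uptime available
```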

I've no issue with the 7 day deadline as such, provided the WUs were smaller, i.e. faster to crunch. The Einstein WUs need 3 to 4 times as much crunching as SETI@Home WUs and have half the deadline duration.

So can I suggest a longer deadline once this project has settled down? We should be crunching for the science, with any credit system just a way of keeping score. An artificial deadline does not help the science, since more people end up crunching the same WUs when deadlines are missed and could feel they're 'wasting' their efforts.

I hope you'll take this as constructive feedback.

Keep crunching

Randy

Ross Morgan
Joined: 20 Feb 05
Posts: 18
Credit: 122639
RAC: 0

I have to agree with Randy here. For users with only one machine, either the deadline has to be extended or the WU's have to be made smaller if you want to attract a wide range of crunchers rather than just the really serious ones.

Thierry Van Driessche
Joined: 9 Feb 05
Posts: 210
Credit: 229929
RAC: 0

There is also another issue.

At home, two users share the same PC, and I’m the only one running Boinc.

I was away from home for 1 week. During that time, Boinc was still running and downloaded more WU’s.
But the other user had to restart the PC for one reason or another. Because of this, the 3 WU’s downloaded 7 days ago and the ones downloaded during my absence passed their deadline.

To speed up the processing of these, I changed the resource share. But by doing that, Boinc downloaded even more WU's.

BTW, the "connect to" I set is only 1 day.

There should be a way to stop Boinc from downloading any more WU’s while there are still WU’s that have already passed their deadline and are still crunching.

Greetings from Belgium
Thierry

Thierry Van Driessche
Joined: 9 Feb 05
Posts: 210
Credit: 229929
RAC: 0

Reading through some more stuff here, I now understand that the WU’s come from 1 big file.

This puts the deadline problem in a whole other perspective.

The question is whether changing the “connect to” setting makes any difference to the WU’s that will be crunched out of this one big file.

I mean the following:
With the original “connect to” setting of 2 days, E@H downloaded 2 files. Now, with a setting of only 0.5 days, does that mean that only a part of the 2 downloaded files will be used to crunch WU’s, so fewer WU’s will be extracted?

Greetings from Belgium
Thierry

Ziran
Joined: 26 Nov 04
Posts: 194
Credit: 605124
RAC: 686

Message 3366 in response to message 3365

> Reading through some more stuff here, I understand now the WU’s are coming
> from 1 big file.
>
> This brings the problem of deadline into a whole other perspective.
>
> Comes the question if changing the “connect to” makes any changes to the
> WU’s that will be crunched hidden in this one big file?
>
> I mean the following:
> With the originally setting “connect to” set at 2 days, E@H downloaded 2
> files. Now using only a setting of 0.5 days, will that means that from the 2
> files downloaded only a part of this file will be used to crunch WU’s, so
> less WU’s will be extracted?

No, this only means that you will have fewer WU’s stored on your computer. If you had 2 WU’s on your machine before, in a couple of days you will have only one WU stored. The number of WU’s extracted from a data set is something you can’t influence.

edit:
Since you have HT enabled on your machine, you will always have at least 2 Einstein WU’s on your machine if you don’t participate in multiple projects. The “connect to” setting only influences how far in advance you download an additional WU.

When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.

Ziran
Joined: 26 Nov 04
Posts: 194
Credit: 605124
RAC: 686

To sum things up a little:
The crunching time of the WU’s is the way it is for scientific reasons, and therefore cannot be shortened.
The deadline is set to 7 days to minimize the size of the database, which reduces the strain on the server and allows more people to participate.

The 7-day deadline is a problem because:
Some machines are too slow to complete a WU in time even running 24/7.
Some machines have trouble completing a WU in time running only parts of the day.
Some machines have trouble completing a WU in time because they aren’t used every day.
Some machines have trouble completing a WU in time because they participate in multiple projects.

So if we can’t get more funding to Bruce and the rest of the team, they are stuck with the resources they have and must do as good a job as they can with them. One could say that not all projects are for everyone, and there will probably be projects in the future that demand turnaround times of less than one day (Folding?).

Now let’s analyze the problem at hand. The 7-day deadline exists because of the size of the database. So if the size of the database were reduced, we could probably talk Bruce into extending the deadline by a couple of hours.

So what could be done to reduce the size of the database? If I am correct, the database we are talking about here is the one that records which host has downloaded which WU. When you download a WU, an entry is made in the database. This entry remains in the database until all other users who have downloaded the same WU have been accounted for, or until the deadline for the unaccounted users has been reached. My question is: are the entries removed a.s.a.p., or is the removal delayed a couple of days so we can look at some past results? If so, I am willing to trade that feature for a couple of hours’ extension to the deadline.

IMHO unfinished WU’s lying around on my HD don’t make anyone happy. The only thing they are good for is when the project is down and I can’t download more. If you participate in multiple projects, the risk of all projects being offline at the same time is small, which reduces the need for a large cache. If you reduce the size of your cache, you reduce the average turnaround time of your WU’s and thereby, in theory, the size of the database. You also “extend” the deadline: the deadline is determined by when you download a WU, not by when you start crunching it. Don’t forget that if you download fewer WU’s at a time, you put more stress on the servers by asking for work more often.
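
A small sketch of how the cache size eats into the deadline. It assumes a machine running 24/7, a WU that waits the full cache time before it starts, and a 13-hour WU; the function name is made up for illustration.

```python
def slack_hours(deadline_days, cache_days, wu_hours):
    """Hypothetical: hours to spare at the deadline if a WU sits in the cache
    for the full cache time before crunching starts (machine on 24/7)."""
    deadline_hours = deadline_days * 24
    wait_hours = cache_days * 24
    return deadline_hours - wait_hours - wu_hours


print(slack_hours(7, 2.0, 13))  # 107.0 hours to spare with a 2-day cache
print(slack_hours(7, 0.5, 13))  # 143.0 hours to spare with a 0.5-day cache
```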

To reduce the trouble for machines that participate in multiple projects I see only one long-term solution: make the BOINC client consider all the projects it is attached to when downloading work. Most of what is needed to make this work smoothly has already been implemented, so the modifications needed aren’t that big. The mechanism for determining which WU to crunch next (resource debt) is already in place. BOINC already considers what percentage of the day your computer is on when downloading work. So the only thing missing is the calculation of how much work is needed for each attached project during the time from now to the value you specified in “Connect to network about every”. If you don’t change your “Switch between applications every” setting, this will not make such a big difference; you will still have a couple of half-finished WU’s lying around on your HD.

Let’s say we participate in 2 projects, Einstein (13H/WU) and SETI (5H/WU), with an equal resource share. If we leave “Switch between applications every” at 60 min, we will have the same problem with 2 downloaded WU’s. But if we set “Switch between applications every” to 14H, BOINC will finish every WU before it is supposed to switch application. Let’s say we are to download 48H of work; with an equal resource share, that is 24H each. Einstein will download 2 WU’s (26H) and SETI 5 WU’s (25H). But if BOINC is set to download only 10H of work each time, it will download one Einstein WU and determine that it doesn’t need to download any SETI WU’s this time.
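
The arithmetic in this example can be sketched as follows. This is only one possible reading of the proposal, not how the BOINC client actually works: the function name, the rounding up to whole WU's, and the rule of stopping once the queued hours cover the target are all assumptions.

```python
import math


def plan_fetch(target_hours, projects):
    """Hypothetical fetch plan.

    projects is a list of (name, hours_per_wu, resource_share) in fetch order.
    Each project gets its share of target_hours, rounded up to whole WU's,
    and nothing more is fetched once the queue already covers target_hours.
    """
    total_share = sum(share for _, _, share in projects)
    plan = {}
    queued_hours = 0.0
    for name, hours_per_wu, share in projects:
        if queued_hours >= target_hours:
            plan[name] = 0  # enough work queued already
            continue
        project_hours = target_hours * share / total_share
        count = math.ceil(project_hours / hours_per_wu)
        plan[name] = count
        queued_hours += count * hours_per_wu
    return plan


projects = [("Einstein", 13, 1), ("SETI", 5, 1)]
print(plan_fetch(48, projects))  # {'Einstein': 2, 'SETI': 5}
print(plan_fetch(10, projects))  # {'Einstein': 1, 'SETI': 0}
```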

When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.
