Petition - Deadline Relief for Longest Results

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,979
Credit: 205,052,741
RAC: 36,340

RE: RE: The other thing

Message 70034 in response to message 70030

Quote:
Quote:
The other thing that SETI does that would help here is the initial replication of 3 with a quorum of 2. With the new BOINC software (5.8.x and up), the software will send 3, wait for the first 2 results, and then cancel the 3rd WU if the host has not started it yet. We could eliminate the 45-60 day waits some people have gotten for credit that way.
Off topic: We could also send smaller, more reasonable WU's that don't scare people off.

But doesn't that take a backend update as well (which is needed here)?

I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.

Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.

BM

ohiomike
Joined: 4 Nov 06
Posts: 80
Credit: 6,453,639
RAC: 0

RE: RE: RE: The other

Message 70035 in response to message 70034

Quote:
Quote:
Quote:
The other thing that SETI does that would help here is the initial replication of 3 with a quorum of 2. With the new BOINC software (5.8.x and up), the software will send 3, wait for the first 2 results, and then cancel the 3rd WU if the host has not started it yet. We could eliminate the 45-60 day waits some people have gotten for credit that way.
Off topic: We could also send smaller, more reasonable WU's that don't scare people off.

But doesn't that take a backend update as well (which is needed here)?

I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.

Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.

BM

There is no work/credit lost. WU's are not aborted if they have begun to run. Only WU's in the client's queue that are no longer needed are aborted. All in all it is a good thing, because almost 100% of the WU's crunched are used. The sending of the "trailer" and the book-keeping overhead might be a pain for the project, however.
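
The rule described above can be put into a rough sketch. This is not BOINC server code: the `State` and `Copy` types and the `cancel_unneeded` function are made-up names for illustration, and the only behavior assumed is what the post states (with a quorum of 2, copies still waiting in a client's queue are cancelled once 2 results are back, while copies that have started running are left alone).

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    QUEUED = "queued"        # downloaded, not yet started
    RUNNING = "running"      # computation has begun
    RETURNED = "returned"    # result reported to the server
    CANCELLED = "cancelled"  # withdrawn by the server before it started


@dataclass
class Copy:
    """One copy of a work unit sent to one host (hypothetical type)."""
    host: str
    state: State = State.QUEUED


def cancel_unneeded(copies, quorum=2):
    """Cancel not-yet-started copies once `quorum` results are back.

    Copies that are already running are never touched.
    Returns the hosts whose copy was cancelled.
    """
    returned = sum(1 for c in copies if c.state is State.RETURNED)
    if returned < quorum:
        return []  # quorum not reached yet, every copy is still needed
    cancelled = []
    for c in copies:
        if c.state is State.QUEUED:
            c.state = State.CANCELLED
            cancelled.append(c.host)
    return cancelled
```

With two results returned and the third copy still queued, the third is cancelled. If the third copy is already running, nothing is cancelled and it runs to completion.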


Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 461,162,614
RAC: 40,293

RE: RE: RE: RE: The

Message 70036 in response to message 70035

Quote:
Quote:
Quote:
Quote:
The other thing that SETI does that would help here is the initial replication of 3 with a quorum of 2. With the new BOINC software (5.8.x and up), the software will send 3, wait for the first 2 results, and then cancel the 3rd WU if the host has not started it yet. We could eliminate the 45-60 day waits some people have gotten for credit that way.
Off topic: We could also send smaller, more reasonable WU's that don't scare people off.

But doesn't that take a backend update as well (which is needed here)?

I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.

Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.

BM

There is no work/credit lost. WU's are not aborted if they have begun to run. Only WU's in the client's queue that are no longer needed are aborted. All in all it is a good thing, because almost 100% of the WU's crunched are used. The sending of the "trailer" and the book-keeping overhead might be a pain for the project, however.

But if 3 WUs get crunched, but 2 would be enough for validation, I would regard the effort for the 3rd result "wasted" (regardless of credits granted). I'm crunching for science, not for credits. I personally would not be happy with such a policy at all.

CU

BRM
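
To put a rough number on the "wasted third result" argument, here is a back-of-the-envelope sketch. The function name and the `p_cancelled` parameter are assumptions made for illustration: `p_cancelled` is the (unknown) chance that an extra copy is still unstarted when the quorum is reached and therefore gets cancelled. Partially crunched work is ignored, so this is only a simplified model.

```python
def redundant_fraction(replication, quorum, p_cancelled):
    """Expected share of completed results that were not needed for validation.

    replication  -- copies sent out initially (e.g. 3)
    quorum       -- results needed to validate (e.g. 2)
    p_cancelled  -- assumed probability that an extra copy is cancelled
                    before it starts (hypothetical input, not measured)
    """
    extra_crunched = (replication - quorum) * (1.0 - p_cancelled)
    total_crunched = quorum + extra_crunched
    return extra_crunched / total_crunched


# 3/2 with no cancellation at all: one result in three is redundant.
print(redundant_fraction(3, 2, 0.0))
# 3/2 where every unneeded copy is cancelled in time: nothing redundant.
print(redundant_fraction(3, 2, 1.0))
# 2/2 as used here: nothing redundant regardless of cancellation.
print(redundant_fraction(2, 2, 0.0))
```

How close a 3/2 scheme gets to the "almost 100% used" case depends entirely on how often the third copy is still unstarted when the quorum is met, which this sketch does not attempt to estimate.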

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289,204
RAC: 0

RE: But if 3 WUs get

Message 70037 in response to message 70036

Quote:
But if 3 WUs get crunched, but 2 would be enough for validation, I would regard the effort for the 3rd result "wasted" (regardless of credits granted). I'm crunching for science, not for credits. I personally would not be happy with such a policy at all.
BRM

Strongly agree.

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

RE: But if 3 WUs get

Message 70038 in response to message 70036

Quote:


But if 3 WUs get crunched, but 2 would be enough for validation, I would regard the effort for the 3rd result "wasted" (regardless of credits granted). I'm crunching for science, not for credits. I personally would not be happy with such a policy at all.

CU

BRM

I started tracking this about 10 months ago when enough Core 2 D's and then Q's started appearing over at SAH to start making an impact on the work stream my hosts were seeing.

As I said before, the criterion SAH used to set the tightness factor for the deadline was that a PI-100 running with 33.3333% machine ontime should be able to make the deadline (IIRC).

What's working out in practice currently is that with a 3/2 IR/MQ there is no scientifically useful reason to run basically anything less than a PIII or Athlon 'Classic' on SAH, since the odds are the result returned will be the trailer for the WU, even if you run it 24/7. Currently, my Katmai 550 running 24/7 with a 0.01 cache coupled CI is still effective, but if I ran it for 12 hours per day or with a 1 or 2 day CI, I estimated it would be returning 50% trailers or more.

So, looking at it from the viewpoint of not wasting my money on electricity, I'd have to seriously consider dropping the project on this host.

The beauty of EAH has been that, ever since you went to 2/2 way back when, as long as the host can meet the deadline, you know for a fact your host has contributed to the science, and therefore it was worth running it here no matter how fast or slow it is. This has only broken down recently in S5R2, and then only because the beta apps have been a lot slower than what we had before, and because for some reason the scheduler has seen fit to send the oldest hosts template frequencies which are beyond their capabilities with a 2 week deadline.

Alinator

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 461,162,614
RAC: 40,293

RE: This has only broken

Message 70039 in response to message 70038

Quote:
This has only broken down recently in S5R2, and then only because the beta apps have been a lot slower than what we had before, ....

Akos made an interesting remark, stating that the new apps are, in fact, several orders of magnitude *faster* than the old ones, probably meaning that they can do the same "scientific work" many times faster. So if the pre-S5R2 apps were biplanes, the new ones seem to be jet fighters. Problem is they get assigned much longer missions (just to stretch your paradigm a bit more :-) ) in the hierarchical all-sky search of S5R2.

I just wanted to clarify "slow" a bit so people don't get the impression that the apps "deteriorated" over time in some way.

CU

BRM

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

RE: RE: This has only

Message 70040 in response to message 70039

Quote:
Quote:
This has only broken down recently in S5R2, and then only because the beta apps have been a lot slower than what we had before, ....

Akos made an interesting remark, stating that the new apps are, in fact, several orders of magnitude *faster* than the old ones, probably meaning that they can do the same "scientific work" many times faster. So if the pre-S5R2 apps were biplanes, the new ones seem to be jet fighters. Problem is they get assigned much longer missions (just to stretch your paradigm a bit more :-) ) in the hierarchical all-sky search of S5R2.

I just wanted to clarify "slow" a bit so people don't get the impression that the apps "deteriorated" over time in some way.

CU

BRM

LOL...

Agreed. It's all relative (as it should be on EAH). ;-)

The new work is more difficult compared to the old work. So even though the new apps have a lot of the improvements which were in the old apps performance wise, in the current configuration it seems like they are slower, relativityly (pun intended) speaking! :-)

Alinator

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

RE: I think so, but even

Message 70041 in response to message 70034

Quote:

I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.

Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.

BM

Bernd,

Take a look at this host:

Bad 4.19 Host

You may have to consider cutting off clients older than the later releases of 4x in some circumstances. IIRC, there were serious client side scheduler and other issues with some of them. While this wasn't such a big deal back then, it seems to cause some problems with the state of the project today. ;-)

Alinator

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 461,162,614
RAC: 40,293

RE: RE: I think so, but

Message 70042 in response to message 70041

Quote:
Quote:

I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.

Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.

BM

Bernd,

Take a look at this host:

Bad 4.19 Host

You may have to consider cutting off clients older than the later releases of 4x in some circumstances. IIRC, there were serious client side scheduler and other issues with some of them. While this wasn't such a big deal back then, it seems to cause some problems with the state of the project today. ;-)

Alinator

Is the link correct?
CU

BRM

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

OOPS..... Bad 4.19

OOPS.....

Bad 4.19 host

LOL...

The one time I didn't check to make sure the link was right before moving on to other problems! ;-)

The thought just occurred to me that hosts like this might have been a contributing factor to the trouble we saw at the end of S5R1. Since it appears it will grab a big load of work every time it connects and then blow the deadline for all but a few, it could leave the project side thinking that a given set of datapaks has an adequate number of hosts running it, and possibly delay the time it takes to get around to issuing them to a host more likely to actually return them on time.

Alinator
