The other thing that SETI does that would help here is the initial replication of 3 with a quorum of 2. With the new BOINC software (5.8.x and up), the software will send 3, wait for the first 2 results, and then cancel the 3rd WU if the host has not started it yet. We could eliminate the 45-60 day waits some people have gotten for credit that way.
Off topic: We could also send smaller, more reasonable WU's that don't scare people off.
But doesn't that take a backend update as well (which is needed here)?
I think so, but even more important is that it also requires a newer "minimal" Client version. We're still issuing work for all Clients from version 4.19 on, and I don't intend to change this without need.
Another aspect is that this way the computation time spent on the canceled result is always wasted (does the participant get credit for it anyway?). My guess would be that, while this gives faster results for the project and faster credit for the fast participants, the waste of computing power is larger than what is lost by results arriving too late in the current scheme.
BM
There is no work/credit lost. WU's are not aborted if they have begun to run; only WU's in the client's queue that are no longer needed are aborted. All in all it is a good thing, because almost 100% of the WU's crunched are used. The sending of the "trailer" and the book-keeping overhead might be a pain for the project, however.
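To make the mechanism described above concrete, here is a rough sketch in Python of how a 3/2 send-and-cancel policy could work. The names, classes, and numbers are illustrative assumptions, not BOINC's actual server code; only the rule itself (issue 3, validate on 2, abort only copies that have not started) comes from the posts above.

    # Toy model of one workunit under initial replication 3 / minimum quorum 2.
    # Illustrative only; not BOINC's real data structures or scheduler code.
    from dataclasses import dataclass

    @dataclass
    class Copy:
        host: str
        started: bool = False   # has the host begun crunching this copy?
        returned: bool = False  # has a result come back?

    def on_result_returned(copies, finished_host, quorum=2):
        """Record one returned result; once the quorum is reached, list the
        copies that can be aborted, i.e. those not yet started on their host.
        Copies already running are left alone and still earn credit."""
        for c in copies:
            if c.host == finished_host:
                c.returned = True
        if sum(c.returned for c in copies) < quorum:
            return []
        return [c.host for c in copies if not c.returned and not c.started]

    # Two fast hosts report back before the slow one has started its copy,
    # so the slow host's copy is the one that would be cancelled (the "trailer").
    copies = [Copy("fast_a", started=True),
              Copy("fast_b", started=True),
              Copy("slow_c", started=False)]
    on_result_returned(copies, "fast_a")          # quorum not met yet -> []
    print(on_result_returned(copies, "fast_b"))   # -> ['slow_c']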
But if 3 WUs get crunched when 2 would be enough for validation, I would regard the effort for the 3rd result as "wasted" (regardless of credits granted). I'm crunching for science, not for credits. I personally would not be happy with such a policy at all.
CU
BRM
Strongly agree.
I started tracking this about 10 months ago, when enough Core 2 D's and then Q's started appearing over at SAH to make an impact on the work stream my hosts were seeing.
As I said before, the criterion SAH used to set the tightness factor for the deadline was that a PI-100 running with 33.3333% machine ontime should be able to make the deadline (IIRC).
What's working out in practice currently is that with a 3/2 IR/MQ there is no scientifically useful reason to run basically anything less than a PIII or Athlon 'Classic' on SAH, since the odds are the result returned will be the trailer for the WU, even if you run it 24/7. Currently, my Katmai 550 running 24/7 with a 0.01 cache-coupled CI is still effective, but if I ran it for 12 hours per day, or with a 1 or 2 day CI, I estimated it would be returning 50% trailers or more.
So looking at it from the viewpoint of not wasting my money on electricity, I'd have to seriously consider dropping the project on this host.
The beauty of EAH, ever since you went to 2/2 way back when, has been that as long as the host can meet the deadline, you know for a fact your host has contributed to the science, and therefore it was worth running it here no matter how fast or slow it is. This has only broken down recently in S5R2, and then only because the beta apps have been a lot slower than what we had before, and because for some reason the scheduler has seen fit to send the oldest ones template frequencies that are beyond their capabilities with a 2-week deadline.
Alinator
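For what it's worth, the deadline rule Alinator recalls boils down to a one-line calculation. In the sketch below, the reference speed (call a PI-100 roughly 100 MFLOPS for round numbers) and the workunit size are made-up assumptions; only the one-third on-time factor comes from the post above.

    # Back-of-envelope deadline from the "PI-100 at 1/3 ontime" rule.
    # Both the 1e8 flop/s reference speed and the 3e13-fpops WU are assumptions.
    def deadline_days(wu_fpops, ref_flops=1.0e8, on_fraction=1.0 / 3.0):
        """Days the reference host needs to finish wu_fpops floating-point
        operations when it only computes on_fraction of the time."""
        seconds = wu_fpops / (ref_flops * on_fraction)
        return seconds / 86400.0

    print(round(deadline_days(3.0e13), 1))   # about 10.4 days for this made-up WU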
Akos made an interesting remark, stating that the new apps are, in fact, several orders of magnitude *faster* than the old ones, probably meaning that they can do the same "scientific work" many times faster. So if the pre-S5R2 apps were biplanes, the new ones seem to be jet fighters. Problem is they get assigned much longer missions (just to stretch your paradigm a bit more :-) ) in the hierarchical all-sky search of S5R2.
I just wanted to clarify "slow" a bit, so people don't get the impression that the apps "deteriorated" over time in some way.
CU
BRM
LOL...
Agreed. It's all relative (as it should be on EAH). ;-)
The new work is more difficult compared to the old work. So even though the new apps have a lot of the performance improvements that were in the old apps, in the current configuration it seems like they are slower, relativity-ly (pun intended) speaking! :-)
Alinator
Bernd,
Take a look at this host:
Bad 4.19 Host
You may have to consider cutting off clients older than the later releases of 4.x in some circumstances. IIRC, there were serious client-side scheduler and other issues with some of them. While this wasn't such a big deal back then, it seems to cause some problems with the state of the project today. ;-)
Alinator
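If I remember right, the BOINC scheduler's config.xml has a min_core_client_version option for exactly this kind of cutoff (treat the exact name and its version encoding as an assumption on my part). The gate itself is just a version comparison, sketched here in Python with a made-up cutoff:

    # Illustrative version gate, not BOINC's actual scheduler code or encoding.
    def parse_version(v):
        """Turn a 'major.minor' string like '4.19' into the tuple (4, 19)."""
        major, minor = v.split(".")[:2]
        return (int(major), int(minor))

    def may_get_work(client_version, min_version="4.45"):
        """True if the reporting client is at or above the (made-up) cutoff."""
        return parse_version(client_version) >= parse_version(min_version)

    print(may_get_work("4.19"))   # False -> this host would no longer get work
    print(may_get_work("5.8"))    # True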
Is the link correct?
CU
BRM
OOPS.....
Bad 4.19 host
LOL...
The one time I didn't check to make sure the link was right before moving on to other problems! ;-)
The thought just occurred to me that hosts like this might have been a contributing factor to the trouble we saw at the end of S5R1. Since it appears it will grab a big load of work every time it connects and then blow the deadline for all but a few, it could leave the project side thinking that a given set of datapaks has an adequate number of hosts running it, and possibly delay the time it takes to get around to issuing them to a host more likely to actually return them on time.
Alinator