I believe it can actually speed things up. BOINC can remove WUs from your cache and issue new WUs in their place once the project has reached a quorum for a WU, i.e. they don't want to waste your processing time on an answer they already have. If they send 6 out, 3 may come back quickly and new WUs can be reissued right away to everyone. I believe they went to an initial replication of 3 when the 'Alberts' came out, to increase their processing efficiency by lowering the replication to the minimum needed to form a quorum (3). That is OK to a degree, but it can take 4 weeks or more to get a WU quorum if one of the hosts fails to return a result and one more replication must be issued, with another possible 2-week wait, and so on. If they issue 6 initially they may avoid this problem by taking advantage of the Akosf Hungarian Revolution. Longer WUs may be an attempt to increase processing time and relieve traffic on their servers... just a guess. Cheers, Rog.
That sounds quite sensible!
[idle speculation]
Another alternative is that they are looking really hard for something exciting and special, but don't want to explicitly give us false hope!? :-)
[/idle speculation]
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
CDNgeezer: That's an interesting possibility. I stepped through a few dozen of the WUs manually. Too few of them appear to have been assigned to judge. I found 1 with 4 results in, but most only had 1 or 2 users, so it's not really possible to judge meaningfully.
Otoh, these WUs are prefixed with a p, not an r or z, so they might be something completely new.
I just got a single result in my input queue which has a leading j (so not p or r or z).
j2_0550.0__88_S4R2a_5
This one also had six initial replications, and has already sent out a seventh, as one has had a client error.
It just started running on my host (I suspended a few standard ones ahead of it to push on). At the 2.5% completion level, it appears that runtime will be roughly twice this host's typical for a large WU, though the estimated runtime based on the claimed work content is typical for a standard WU.
As to the high replication on this one and the p ones, I'd assume we are seeing a test for which they want reasonably quick results with moderately high assurance. When you combine the big "cache" people with the folks who sign up and never return anything, the probable delay for getting a specific set of multiple results at replication 3 must be considerable.
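To put a rough number on that intuition, here is a minimal Python sketch (not project code). It assumes every host's turnaround time is an independent draw from an exponential distribution with a hypothetical mean of 3 days, and that every host eventually returns a valid result. Hosts that never report would make replication 3 look even worse. The function names and parameters are made up for illustration.

```python
import random


def time_to_quorum(turnarounds, quorum=3):
    """Time at which the quorum-th result arrives (quorum-th smallest turnaround)."""
    if len(turnarounds) < quorum:
        raise ValueError("fewer results than the quorum")
    return sorted(turnarounds)[quorum - 1]


def mean_time_to_quorum(replication, quorum=3, mean_days=3.0, trials=20000, seed=1):
    """Monte Carlo estimate of the average wait for a quorum.

    Assumption: turnaround times are exponential with mean `mean_days`.
    Real hosts will not follow this distribution exactly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        draws = [rng.expovariate(1.0 / mean_days) for _ in range(replication)]
        total += time_to_quorum(draws, quorum)
    return total / trials


if __name__ == "__main__":
    for replication in (3, 6):
        days = mean_time_to_quorum(replication)
        print(f"replication {replication}: about {days:.2f} days to a quorum of 3")
```

Under these assumptions the six-copy case reaches a quorum sooner on average, because it only waits for the 3rd fastest of 6 hosts rather than the slowest of 3.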
Too low a quota, plus 100% added redundancy: maybe we're too fast now and they cannot generate work fast enough anymore. Btw., the low quota forces you to increase the cache, so that if you receive too many short WUs on one day, you still have some long ones left from the day before.
On one XP1800+ I need exactly 32 long running WUs to fill 24 hours though, so even a single short one makes me run out of Einstein work. But SIMAP has enough work for the spare hours :-)
btw.: is it the P or is it the 2 that has the longer runtimes?
I don't have P1 and J1 ones, the long running ones seem to be P2 and J2 on my boxes.
Quote:
j2_0550.0__88_S4R2a_5
As to the high replication on this one and the p ones, I'd assume we are seeing a test for which they want reasonably quick results with moderately high assurance. When you combine the big "cache" people with the folks who sign up and never return anything, the probable delay for getting a specific set of multiple results at replication 3 must be considerable.
Thanks for letting us know about yet another WU type (?).
What you suggest later does make sense.
Quote:
I believe that it has the ability to speed things up, actually... BOINC has the ability to remove WUs from your cache and issue new WUs in their place if the project has reached a quorum for a WU. ie they don't want to waste your processing time on an answer they already have...
Interesting speculations, Rog.
Cancelling Results that no longer need to be processed (i.e. the WU has met its quorum and is being validated) is only possible AFTER the host has contacted the scheduler. AFAIK, it's not server driven (i.e. you have to wait until the host contacts the scheduler). We have been through this process on CPDN lately, and it was discussed both before and after the killer trickle was sent, btw... still, I may be wrong.
A standard BOINC core client/installation contacts the scheduler only seldom (when new work is needed and to report a batch of WUs already completed). If you have trux's "return_results_immediately", the scheduler is contacted after every WU is finished, leaving some chance to abort Results in the host's queue. But I don't think it's a general setting.
I'd argue that the chance of contacting the scheduler before a Result actually starts being crunched is quite low... or at least lower than the chance of completing more than 3 results; sorry for the tautology :-)
This is not to say that such an approach (an initial replication of 6) is bad. It may be intentional, to have more results for new WUs than the quorum requires... and given how the BOINC scheduler and BOINC client work so far, plus a replication setting of 6, it may well bring in more than 3 results.
There is a (theoretically simple) way to speed things up, if that was your idea in the above paragraph. As I have suggested several times, the scheduler needs to be smarter, not a primitive feeder.
On a less abstract level: when you send 3 Results to hosts with a low average turnaround time, you have a very high probability of having them back soon (50% of having them back within the highest of those average turnaround times). As simple as that. There are many hosts with a turnaround time of 1/2 day or less, so you get results back overnight.
The bad thing: those machines are likely to use an optimized client and possibly to be overclocked... which is not very suitable for validating 'important' results [sure, they are all important :-)]. Or take it another way: once results are validated on such a powerful arsenal, there's a high chance of having them validated on other machines as well.
When you send 6 results, as in your scenario, you have a 50% chance that at least 3 results are back by the 3rd lowest (or 4th highest) average turnaround time of a bunch of random machines. I'm unable to put a more precise number on it, but statistically it's significantly higher than in the previous scenario of 3 Results sent to fast machines with a low average turnaround time.
AFAIK, an initial replication of 6 should be useful not for getting results back sooner, but for getting results from a wider variety of machines, and hence testing validity better.
ad Mike's idle speculation - wish it was that way :-)
Einstein team, PLEASE, shed some light on the recent questions or we may overheat our brains and CPUs with speculation.
Quote:
If they send 6 out, 3 may come back quickly and new WUs reissued right away to everyone.
If this scenario is correct, then what if there are 4 or more hosts crunching away on the same WU before 3 are reported? Would the work of the 'extra' host(s) be shut down and lose credit for the time spent processing, or would they continue until finished and receive credit? Who knows, until the powers that be clue us in.
Thanks for the 'heads up', Dan... will keep an eye out for them. Cheers, Rog.
For the j2_0550.0__88_S4R2a_5 workunit, see: http://einsteinathome.org/workunit/7706225
Info on the WU that started this.
3 results are in and they have validated.
The remaining 3 are still being crunched.
See here: http://einsteinathome.org/workunit/7705956
Anders n
Thanx for the info. From a credit standpoint, shouldn't they wait until all 6 are finished to post credit? I don't know how they issued credit back when there was a larger (than 3) quorum. I'm relatively new to Einstein (cough). 'Course with 6 there isn't a middle result as there is with 3. (shrug)
No. Once the minimum quorum is achieved, credit is granted to everyone who has submitted. The latecomers will get credit on submission.
The double-length work units sound like an indication that Bruce et al. aren't having much luck convincing their university bureaucrats to get them a better server.
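For anyone who wants the credit rule spelled out, here is a toy Python model of it. This is only a sketch of the behaviour described above, not BOINC's server code: the `WorkUnit` class and its `report` method are hypothetical names, and it assumes every reported result is valid and counts toward the quorum.

```python
class WorkUnit:
    """Toy model: credit is granted once a minimum quorum of results is in."""

    def __init__(self, quorum=3):
        self.quorum = quorum
        self.reported = []  # hosts in the order they reported
        self.credited = []  # hosts that have already been granted credit

    def report(self, host):
        """Record a result and return the hosts granted credit by this report."""
        self.reported.append(host)
        if len(self.reported) < self.quorum:
            return []  # still waiting for a quorum, nobody is credited yet
        # Quorum reached: credit everyone who has reported but is not yet credited.
        newly_credited = [h for h in self.reported if h not in self.credited]
        self.credited.extend(newly_credited)
        return newly_credited


if __name__ == "__main__":
    wu = WorkUnit(quorum=3)
    for host in ["host_a", "host_b", "host_c", "host_d"]:
        print(host, "->", wu.report(host))
```

In this model the first two reports credit nobody, the third credits all three hosts at once, and a fourth host is credited as soon as it reports.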