I noticed my pending credit has ballooned, and I was having a look to see why....
I notice that in all my workunits lately, the unit is sent out to one host first (for my best host, almost always me :-) and then sent out to three others only when the WU comes back from that host (what happens when there is an error, I don't know).
Is this a new policy to send out "frontrunners" to see if the WU is OK before sending it to more hosts?
--miw
Scheduler now sends "frontrunners"
I notice that in all my workunits lately, the unit is sent out to one host first ...
That happened to me for a while. At one point I had 11 pendings, which for my output of less than 2.5 WUs per day is a lot. When I checked the WUs, they had been sent to me way before they were sent to anyone else. Here's an example of 4 days ahead; there were many at 3 days ahead: http://einsteinathome.org/workunit/1045849
After a while the 11 cleared and I no longer get the WUs way ahead of anyone else. It was really odd, and I wondered what was happening to the scheduler. I guess it still has the problem.
Joe B
It's the way E@H sends out
It's the way E@H sends out WUs. Since it doesn't send the large input file out with each WU, it needs to find other hosts that already have the same input file on their machines to process the same WU as you (or wait until another host is free, then send out the same input file). If a host is slower, or doesn't connect for several days, it can sometimes take a while to find another host with the same input file.
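A rough sketch of that matching idea in Python — names and structure are my own illustration, not the actual BOINC/E@H scheduler code: prefer hosts that already hold the WU's large input file, and fall back to sending the file to a new host only when there aren't enough.

```python
# Hypothetical sketch of locality scheduling as described above: prefer
# hosts that already have a workunit's large input file on disk, so the
# file is downloaded as rarely as possible. Illustrative only.

def pick_hosts(wu_input_file, hosts, quorum=4):
    """Return up to `quorum` hosts for a workunit, preferring those
    that already hold its input file."""
    have_file = [h for h in hosts if wu_input_file in h["files"]]
    need_file = [h for h in hosts if wu_input_file not in h["files"]]
    chosen = (have_file + need_file)[:quorum]
    for h in chosen:
        h["files"].add(wu_input_file)  # hosts without it download it
    return chosen

hosts = [
    {"name": "A", "files": {"dataset1"}},
    {"name": "B", "files": set()},
    {"name": "C", "files": {"dataset1"}},
]
print([h["name"] for h in pick_hosts("dataset1", hosts)])  # A and C first
```

With few hosts holding the file, the "need_file" fallback kicks in — which is exactly the slow step the post describes.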
I notice that in all my
I notice that in all my workunits lately, the unit is sent out to one host first (for my best host, almost always me :-) and then sent out to three others only when the WU comes back from that host (what happens when there is an error, I don't know).
Is this a new policy to send out "frontrunners" to see if the WU is OK before sending it to more hosts?
My guess is that this is a non-deliberate side-effect of two other rules. Such unintended consequences are surprisingly common, and are most likely patented by Murphy.
Consider
rule 1 - assign work from the data the client already holds
rule 2 - don't assign consecutive wu to the same pairings of computers
Now suppose A (by luck) is the first computer to be assigned work from a new dataset.
Eventually, along comes B, who has no more wu to be assigned from their old data, and they are assigned wu from the same dataset as A. Because of rule 2, B will only be assigned one wu that is shared with A. B's next wu after that will be a different wu from the same dataset. Meanwhile A may well want a second wu.
Then along come C, D, E; each will only be assigned one of the wu that any other computer has had. We might have this picture just after G gets their first wu from this dataset:
wu 1 : A, B, C, D
wu 2 : B,
wu 3 : A, E, F, G
wu 4 : A,
wu 5 : B, E
wu 6 : A,
wu 7 : C, E
wu 8 : B, F
wu 9 : A
wu 10: D, F
wu 11: C,
wu 12: B
wu 13: A
wu 14: D
wu 15: C
wu 16: B
wu 17: A
wu 18: C
wu 19: D
Notice, even with 7 different computers, we have just 2 wu completed, and most wu issued still have only one computer assigned.
Of course this is just one possible picture - I have assumed all computers are running at the same speed and each new computer arriving one step later, so we have 7xA, 6xB, 5xC, 4xD, 3xE at the time shown. Real life is never that neat, and if A is significantly faster than average, the frontrunner effect will be magnified further.
Later on, when there are many computers all crunching data from this dataset, the frontrunner effect disappears.
I wonder if your observations fit this sort of pattern?
In other words, if you were first (A) the scheduler was not so much waiting for you to finish the wu, but for other computers to arrive?
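The two rules can be put into a toy simulation — my own sketch under a strict reading of rule 2 (any pair of computers shares at most one wu), not the real scheduler. Hosts A..G arrive one step apart, and each step every active host asks for one wu:

```python
# Toy simulation of the two rules above (illustrative, not scheduler
# code). Rule 2 is read strictly: a pair of computers shares at most
# one wu. Rule 1's fallback is to open a fresh wu from the dataset.

QUORUM = 4  # results wanted per workunit

def assign(host, wus, pairs):
    """Give `host` the lowest-numbered wu it hasn't done whose current
    holders it has never been paired with; else open a fresh wu."""
    for holders in wus:
        if host in holders or len(holders) >= QUORUM:
            continue
        if any(frozenset((host, h)) in pairs for h in holders):
            continue  # rule 2: this pairing already used
        pairs.update(frozenset((host, h)) for h in holders)
        holders.append(host)
        return
    wus.append([host])  # rule 1 fallback: new wu, same dataset

wus, pairs = [], set()
hosts = "ABCDEFG"
for step in range(len(hosts)):
    for host in hosts[:step + 1]:  # host k first appears at step k
        assign(host, wus, pairs)

for i, holders in enumerate(wus, 1):
    print(f"wu {i:2}: {', '.join(holders)}")
```

Even this crude model reproduces the shape of the table above: two completed wu (the ones the frontrunner joined early) and a long tail of wu held by only one computer.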
~~gravywavy
I've noticed my "pending" is
I've noticed my "pending" is increasing as well. I'm up to 3,700 or so pending credits now, while my RAC continues to decline from a peak of about 1100 toward 900.
My guess is that this is a
My guess is that this is a non-deliberate side-effect of two other rules. Such unintended consequences are surprisingly common, and are most likely patented by Murphy.
Consider
rule 1 - assign work from the data the client already holds
rule 2 - don't assign consecutive wu to the same pairings of computers
Now this is interesting. So if for some reason computer A manages to download all 150 WU's from the data set, we would need 450 other computers to download the same data set to complete it, if no WU had to be re-sent.
What is the minimum number of computers needed to complete a data set, if no WU has to be re-sent?
Is there a limit on how many WU's a computer can download from the same data set? Should there be? If a computer downloads more WU's from the same data set than is needed to complete it with the minimum number of computers, that computer would block the remaining computers from downloading WU's and force still more computers to download the same data set before it can be completed. It would also mean it takes longer before an unused pairing is possible for the remaining WU's, thereby prolonging the time the WU remains in the database.
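For what it's worth, a back-of-the-envelope check of those numbers — assuming a quorum of 4 results per WU and the strict reading of rule 2 (any pair of computers shares at most one WU); the 150-WU dataset size is the figure quoted above:

```python
import math

QUORUM = 4             # assumed: 1 first host + 3 more results per WU
WUS_PER_DATASET = 150  # dataset size quoted in the post

# Worst case: computer A holds every WU. Each WU needs QUORUM-1 more
# results, and each helper is then paired with A, so under the strict
# reading it can never take a second WU from A's set.
others_needed = WUS_PER_DATASET * (QUORUM - 1)
print(others_needed)  # 450, matching the post

# Minimum number of computers: each completed WU consumes C(4,2) = 6
# distinct pairings, so n computers must supply at least 150*6 = 900
# distinct pairs. This is only a counting lower bound, not a proof
# that a schedule achieving it exists.
n = 2
while math.comb(n, 2) < WUS_PER_DATASET * math.comb(QUORUM, 2):
    n += 1
print(n)  # 43
```

So the gap is striking: one greedy host pushes the requirement from a lower bound of about 43 computers up to 451.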
When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.
Well, actually it is a little
Well, actually it is a little bit nasty. When you look here, you can see that all the WU's on the first page and some on the second page (at this time including WU 1219597) are only assigned to my machine.
Yes, I have downloaded a lot of WU's to stress the scheduler; I thought that a higher number of WU's assigned to only one machine might force it to assign them to a second machine. Well, let's see what happens.
BTW, I learned one thing: at least with the 4.19 client, one should never ever set the connection interval to a value higher than 1. The dumb thing downloaded too many WU's. The logic behind it seems to have been written during some drunken phase of the programmer... Happily the machine is fast enough not to lose any calculation, but downloading 19 WU's should be prevented by the scheduler (actually loading another 8 at a time when there were already 11 hanging around).
So you're disappointed that
So you're disappointed that the client did what you told it to do? You asked for a larger cache of WU's, and the project gave it to you! There are limits of 8 WU's per day per host to keep some sanity going, but the rest is in your hands... :)
So you're disappointed that
So you're disappointed that the client did what you told it to do? You asked for a larger cache of WU's, and the project gave it to you! There are limits of 8 WU's per day per host to keep some sanity going, but the rest is in your hands... :)
Yes, I am disappointed!
I did change the settings to 1.75 days and the @%$! gave me close to 5 days.