I contacted the participant in question. It turned out that the hostid was accidentally shared across a cluster of ~50 machines.
BM
That shouldn't be possible, should it? How on Earth do you share one hostID among a cluster, other than by constantly copying the whole BOINC directory to other computers in the cluster, waiting for them to crunch the tasks, copying the whole directory back to the original computer, etc.? That's hardly accidental. ;-)
You can launch a 50-node job thinking that the BOINC directory is local to each machine when in fact it is shared via NFS. This would require some tinkering with the lockfiles / NFS cache settings, but it could work.
Actually the user copied a BOINC directory to 50 machines and then apparently tweaked the client_state.xml file but forgot to change the hostid.
BM
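For anyone who has cloned a BOINC directory onto several machines, a quick check along the following lines might catch this kind of mix-up before the project server sees it. This is only a minimal sketch: it assumes that each `client_state.xml` records a `<hostid>` element for every attached project, and the function names, machine labels, and sample XML are hypothetical stand-ins rather than anything from the real client or from this user's setup.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def host_ids(client_state_xml: str) -> set[str]:
    """Return every <hostid> value found in one client_state.xml document."""
    root = ET.fromstring(client_state_xml)
    return {
        el.text.strip()
        for el in root.iter("hostid")
        if el.text and el.text.strip()
    }


def shared_host_ids(states: dict[str, str]) -> dict[str, list[str]]:
    """Map each hostid used by more than one machine to those machines.

    `states` maps a machine label (hypothetical) to the text of that
    machine's client_state.xml.
    """
    seen = defaultdict(list)
    for machine, xml_text in states.items():
        for hid in host_ids(xml_text):
            seen[hid].append(machine)
    return {
        hid: sorted(machines)
        for hid, machines in seen.items()
        if len(machines) > 1
    }


# Simplified stand-ins for real files; a real client_state.xml has many more fields.
SAMPLE = "<client_state><project><hostid>{}</hostid></project></client_state>"
states = {
    "node01": SAMPLE.format(1001),
    "node02": SAMPLE.format(1001),  # copied directory, hostid never changed
    "node03": SAMPLE.format(2002),
}
print(shared_host_ids(states))
```

In practice the dictionary would be filled by reading the file from each node; any hostid that appears under more than one machine label points to a copied directory that was never given its own identity.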
Um.. if all those machines return all the same tasks, for the one hostID, shouldn't the Einstein server then throw up a flag, saying "Hey, that hostID already returned those tasks!" ??
I'm not even going into the fact that the original hostID says it has 8 CPUs and was re-registered yesterday, yet has close to 9,000 tasks, all running between 1 and 8.5 hours. Perhaps Jovian days are used. Or Plutonian. ;-)
You sure he isn't using the [trac]wiki:SuperHost[/trac] idea?
If the superhost idea caught on would the top computer list just become the same as the top participant list?
Possibly so, as the SuperHost (SH) is the only hostID (HID) known to the project. It doesn't crunch itself (as far as I can see from the idea); it only stores tasks and passes them to and from the network behind it. With that, one task in the queue can be sent to multiple computers on the network, which then all send it back to the SH, which does the communicating with the project server.
The only catch I see is that while all hosts on the network will have their own HID known to the SH, they don't communicate with the project server directly, so they won't send in the multiple copies of a task they got from the same HID. Only the SH HID will send the work back (validated and all, if need be), and that's going to be the one task it sent out to the network and received back.
I'm sure he was trying his own way of implementing a poor-man's version of it.
I don't know precisely how this happened, nor exactly how he messed up the client_state.xml file, and I surely don't have time to dig deeper into it.
BM
You certainly have more important things to do, we know!
(Such as perhaps an optimized Windows app? ... Sorry, just kidding. I know you have your priorities and will get around to that according to your schedule, and we certainly appreciate that! :)
akosf, what speed is your q6600 clocked at?
~120 credits per hour per core is remarkable. That's what you're getting on your latest work units.
As expected, Akos' 4 core RAC has screamed past that of peanut's 8 core as if the latter were standing still. The rate of climb is so strong that it looks like it'll keep going for a while yet, if Akos keeps crunching :).
I've had a bit of a look through the results list of the current top "superhost". In particular, many results come from a large number of different frequencies and these results are all jumbled together giving the appearance that a large number of "compatible" hosts (identical platform) that used to have quite separate work streams were simply merged together somehow. Interestingly enough, this top superhost hasn't taken on any new work for the last couple of days and the list of results is decaying quite rapidly (almost 3K lower than what it was when I first noticed). Hopefully this means that the individual boxes that made up this superhost are now maintaining their own individual caches.
Cheers,
Gary.
It runs at 3.4 GHz, but I will rebuild this computer tomorrow.
I would like to optimize its performance/power ratio.