In a small network, several computers are crunching E@H, but all on different chunks.
It would be most efficient if those computers ran on the same 14 MB of data.
Does anybody have an idea how to do this, or would this be an item for the wish list?
Copyright © 2024 Einstein@Home. All rights reserved.
Efficiency of crunches per downloaded chunk of data.
Don't think it is possible with current software. It would need different server software, and also differences at the client end to make the server changes worthwhile.
If you'd like to see it, certainly add it to the wish list, but if my guesses are right it will not be implemented in the foreseeable future. I'll sketch out why I think that.
The changes are quite large to make it useful: for example, even if by luck all your clients were crunching the same data, they would not know that, and each client would download the same data separately. The client knows when it already has the file, but has no way of knowing that it is available on the LAN rather than from the server.
One reason that LAN caches of data are not encouraged is that they would only benefit a few users but would cost quite a bit of development time. Another reason is that the security would have to be designed carefully; otherwise I could hijack your machine by spoofing information about where its local cache was, and feeding it dud data.
Yet another is that security demands a way to prevent illicit collaboration between machines with the same owner, i.e. the same wu must not be crunched by two machines owned by the same owner. Otherwise a sneaky user could simply copy the result from the first machine to the second and get a free crunch. Even worse, they could copy a _different_ wu's result to both machines and spoil the data in order to get fast credit, or as deliberate sabotage. Sadly the designers of distributed computing have to be aware of such scoundrels! Those opportunities would be more likely to arise if all your machines were crunching the same dataset, and that would make it harder for the scheduler to generate independent pairs of machines. People with more than N machines would not be able to do what you ask anyway, for the same reason. Again, it is not impossible to overcome all those obstacles, but it would take a lot of effort to do so.
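To make the "independent pairs of machines" point concrete, here is a minimal sketch of the kind of rule I mean. It is not taken from the BOINC scheduler; the function name, the `(host_id, owner_id)` tuples and the default of two copies per wu are all my own assumptions for illustration.

```python
def assign_replicas(hosts, replicas_needed=2):
    """Pick hosts for one work unit so that no two share an owner.

    hosts: list of (host_id, owner_id) tuples, in the order they asked for work.
    Returns the chosen host_ids, or None if there are too few distinct owners.
    (Hypothetical helper, not part of BOINC.)
    """
    chosen = []
    seen_owners = set()
    for host_id, owner_id in hosts:
        if owner_id in seen_owners:
            continue  # this owner already holds a copy of the work unit
        chosen.append(host_id)
        seen_owners.add(owner_id)
        if len(chosen) == replicas_needed:
            return chosen
    return None


# Two of Alice's machines ask first; only one of them may get the wu.
requests = [("a1", "alice"), ("a2", "alice"), ("b1", "bob")]
print(assign_replicas(requests))
```

If every machine on one LAN wanted work from the same data file, a rule like this would leave most of them waiting, which is the difficulty described above.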
So, in short, it would be more complicated to do than it sounds at first.
Of course you are entitled (like all of us) to make your wish on the wish list: just remember that not all wishes come true. Perhaps some BOINC developer (I am not one of them) will take it up after all, or perhaps you will feel happier for just asking.
~~gravywavy
Thanks for the time you put
Thanks for the time you put into your reply.
I don't see what would need to change in the server code!
The present distribution philosophy is Server > single clients only; better would have been Server > LAN gateway > Client(s), where the LAN gateway is optional.
My first impression is that the BOINC client would need an additional package for the LAN option, and some changes in the client module to poll the gateway instead of the server.
This all looks very simple to me, but I know that simple-looking things can be complicated after all.
I will admit that it is not desirable for two machines owned by the same user to crunch the same canonical WU, but for the other security issues I don't see much difference between the two distribution models.
----
EdH
RE: Don't see what should
Had it been designed your way to begin with, no real problem. Give every client a LAN cache, and when it is a true standalone the cache is found at 127.0.0.1 - you're right, it would have been simple.
Altering it now means adding another layer in the design. That is what is complicated.
The server thinks it knows what files are on each client, and treats each client separately. Some data that currently refers to clients would still refer to them; other similar data would need to be combined into data about the LAN cache. In making the changes some would be bound to be forgotten.
Inserting an extra layer into a software design after the initial design phase is a fruitful way of creating bugs. I am not saying so from any knowledge of BOINC; I do not know the code, and a developer might come along and tell me I am wrong here. But in general, from my experience of writing other complicated software, that is the way I would guess.
The time to try it would be at a complete re-write of the code from the ground up.
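For what it is worth, the "ask the LAN cache first, then the project server" idea from this exchange could be sketched roughly as below. This is only an illustration under my own assumptions: the function name, the source labels and the use of a SHA-256 digest supplied by the project are invented here and say nothing about how BOINC actually transfers or checks files. The digest check is there because of the spoofing worry raised earlier: a copy from a cache is only accepted if it matches what the project says the file should be.

```python
import hashlib


def fetch_data_file(name, expected_sha256, sources):
    """Return (source_name, data) for the first source with a valid copy.

    sources: ordered list of (source_name, lookup) pairs, where lookup(name)
    returns the file contents as bytes, or None if that source lacks the file.
    expected_sha256 is assumed to come from the project over a trusted channel.
    (Hypothetical helper, not part of BOINC.)
    """
    for source_name, lookup in sources:
        data = lookup(name)
        if data is None:
            continue  # this source does not have the file
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return source_name, data
        # Digest mismatch: treat the copy as dud data and try the next source.
    raise LookupError(f"no valid copy of {name!r} found")


# Dictionaries stand in for the LAN cache and the project server.
good = b"detector data"
digest = hashlib.sha256(good).hexdigest()
lan_cache = {"datafile": b"tampered copy"}
project_server = {"datafile": good}
sources = [("lan_cache", lan_cache.get), ("project_server", project_server.get)]
print(fetch_data_file("datafile", digest, sources)[0])
```

The lookup itself is short; the hard part argued above is everything on the server side that assumes each client downloads its own files.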
~~gravywavy
RE: had it been designed
I might be a high flyer, but Earth's gravity fields have a stiff grip on you, boy. They keep you down to earth too much!
No, seriously: software goes from version to version. Solaris 10 was based on Solaris 9 and as such was not a complete rewrite.
Changing software certainly introduces side effects, e.g. bugs, but we have in-house testing and public betas to find them.
When you have found the right wave, try a jump and release yourself from your Down.
Cheers, Eric.ie
There has been discussion of
There has been discussion of this issue, but it is not regarded as a high priority. One of the reasons that it MIGHT become a more interesting problem and get some attention is for systems where there is a security need to isolate the actual processing machines from a direct connection.
So, the mid-tier would collect work for the back tier (isolated) and you would use sneaker-net to move the work on a disk from one side to the other. Because this is an issue that only affects certain installations the core development staff is not likely to work on this problem anytime soon as there is no "clamor" for an implementation.
Even if there was, it is likely that the answer would be to develop the tier and show us the code and we will add the changes... ie: you want it ... you write it ...
At least for the foreseeable future.
RE: ie: you want it ... you
Hmm, Paul, you got me here. I have been supporting business software, but that does not mean that I can write it....
Tried it, fell over . / \ and the like, so I gave up in the end.
I understand and respect that there is a priority and that this issue does not have the highest one.
At the moment I am experimenting with a workaround: I just moved the whole boinc folder to another computer and ran the second wu on that.
Let's see the result of that. I know I might get error situations, but E@H wu's are not too long.
In case I have offended you, then I hope it is not too soon to accept my apologies, and I hope Gravywavy is willing to help you out.
With Kind Regards,
Eric
RE: In case I have offended
Eric,
Oh, heavens no. Trust me, it is *REAL* hard to offend me, or even annoy me ... one of the benefits I get with the disability ...
I was just telling you the philosophy as I understood it from the BETA test days. And, like it or not, the basic fundamentals (that is redundant) of the BOINC Developers is that they don't deviate from the initial concept much. The allocation to specific computers is an anti-cheat mechanism but, more importantly, it gives a project traceability.
Let us say that there is a discovery that all the Results processed on the PowerMac are found to be invalid. With traceability, they could identify all the results that are "contaminated". It also adds rigor to the process. Something most scientists like ... so, here is how we got this result ... you try it ... if the second trial works as well, then you have progress.
As I said a long time ago in a discussion, we make the same measurements and do the same experiments we already know the answer to, just to make sure that the universe has not changed on us in the meantime.
Back to the software: that phrase could be taken, um, the way you did, and I am sorry that you did... it was just what I said, no more, no less. No intonation or stress ... :)
But with a small team, open source, add water and stir ... you need it, you are going to have to write it ... (is that better?) ... :)
It's good to hear from you
It's good to hear from you again Paul ! Thanks.
Have been able to upload the WU from that other computer, but uploading the
result via the original computer did not work.
Side-effect was that the host record was updated twice for the second computer,
so a merge operation was needed to correct that.
Regretfully, I was not able to see if the second computer would continue on the
same 'Work Unit Data File'=WUDF now, as the WU done was a T28, which is probably
one of the last wu's from the WUDF.
I am getting some doubts about the usefulness of this mini-experiment. It might save
some time downloading, but you lose that on the operations involved. Besides this,
the first computer only gets 2 WU's at a time, for now, so only two computers
can run with the same WUDF.
This message is spewed out if you try to get more WU's: "Computer on 26.3% of time,
BOINC on 100% of that, this project gets 100% of that", which would indicate that
the computer might later get 4 wu's max or so ??
With a week's margin, an 800 MHz machine would be able to do 7+ and 2 PC's 14+ of the same
WUDF, but it does not get them assigned !
Conclusion: a less rigid wu-assignment policy would also benefit the efficiency.
But I think there is more to it:
With the arrival of the S4 dataset we got WUDF's of 6+ MB. To my surprise my
computers need the same time to crunch a WU. This shows that there is no relation
between the length of the WUDF and the crunch time, and leads me to believe that
every wu crunches on a piece of the WUDF. If this is the case, then it would be
most efficient to reduce the WUDF size to 0.5+ MB, so every wu has just the data
it needs and not a chain which is 28 times the data it needs.
Hopefully the above is the case. If so, then it would be relatively easy to solve, and would give the maximum efficiency for all users.
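To put rough numbers on "crunches per downloaded chunk", here is a small back-of-the-envelope calculation. It assumes the 14 MB file size from the start of this thread and 28 wu's per WUDF (my reading of the T28 numbering); both figures are assumptions, not confirmed values.

```python
def mb_per_wu(file_mb, wus_crunched):
    """Megabytes downloaded per wu crunched from one data file."""
    return file_mb / wus_crunched


FILE_MB = 14       # assumed WUDF size
WUS_PER_FILE = 28  # assumed number of wu's contained in one WUDF

print(mb_per_wu(FILE_MB, 2))             # 7.0 MB per wu with 2 wu's per download
print(mb_per_wu(FILE_MB, 14))            # 1.0 MB per wu if two PCs shared 14 wu's
print(mb_per_wu(FILE_MB, WUS_PER_FILE))  # 0.5 MB per wu if the whole file is used
```

Under those assumptions, a computer that gets only 2 wu's from a file downloads about 14 times more data per wu than the 0.5 MB it would need if each wu came with just its own slice.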
RE: In a small network,
With BOINC 2.25 and above you can remote control BOINC on all your other computers. Now, if you can remote control the other computers, you would be able to write a program that went and found out what data sets the other computers that you manage have. The program would then download all data sets that it didn't have from the other computers. It would then upload the data sets that the other computers are missing, so all the computers that you manage have a copy of all downloaded data sets. This would mostly be of benefit to modem users.
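The bookkeeping part of such a program is simple; a minimal sketch follows. How the list of data sets on each computer is obtained, and how files are actually copied, is left out, and the function and host names are made up for illustration. Whether the client would accept files copied in this way is a separate question that this sketch does not answer.

```python
def plan_sync(inventory):
    """Work out which data sets each computer is missing.

    inventory: dict mapping a host name to the set of data set names it holds.
    Returns a dict mapping each host to a sorted list of data sets to copy to it.
    (Hypothetical helper; it does not talk to BOINC itself.)
    """
    all_files = set().union(*inventory.values())
    return {host: sorted(all_files - files) for host, files in inventory.items()}


inventory = {
    "pc1": {"dataset_a", "dataset_b"},
    "pc2": {"dataset_b", "dataset_c"},
}
print(plan_sync(inventory))
```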
When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.
RE: With BOINC 2.25 and
... I think you mean 4.45.