Edit: italics added to call out a section. In response to the building analogy: what if an inspector has to come in and check the finished house to make sure it meets the requirements? And what if the fast, inexperienced worker can produce a house that meets all the predetermined tolerances, the same as the slower, more experienced worker? Is the house any less good because it was finished sooner? I don't think so, because it still meets the requirements and was built to tolerance.
This is my point above about the validator. The validator is like the building inspector. If it says the work meets tolerances, you either believe it and accept the results, or you don't believe it and say the inspector (validator) is inept and can't do his/her job properly. Are the scientists in the latter group? Do we even know that for sure? As noted above, since the validator ensures the quality of the accepted results, I believe the scientists have programmed it so that it will only accept results within predetermined tolerances. I don't think the work units should be thrown out, if they validate, simply because a different version of the application crunched them.
What we have here is a failure to communicate. (CHL)
Here's my crack at an analogy:
Speaking from the heart of Union Country (Michigan, USA), here is how the situation stands:
E@H is being (run, developed, contracted for... insert your own description here) as a Union project. Therefore, it WILL follow the "Union Rules". The E@H science application was developed, tested, and certified by Journeyman or higher workers based on the 'contract specs' (read: Approved Scientific Protocol). Akos (an apprentice) has looked at the app and figured... "hey, this thing can be tweaked a little bit here and there to make it MUCH faster while still giving the same output (or better) as the original app".
Well, Akos' app goes out in various flavors to various users. Most of the results are (apparently) as good as, or better than, the original. Unfortunately, a couple of the intermediate versions returned some poor-quality (read: Invalid) results, which got the attention of the Shop Steward.
The Shop Steward takes a quick look at the situation and says, "What are you doing? This thing doesn't follow Union Standards... get it recalled immediately!" Which Akos does right away.
Now some of the Users get riled about this and point out that the Inspector (validator) accepted the results as equivalent to the output from the approved science app, to which the Shop Steward replies: "I don't give a rat's @ss whether it's 10 times faster or 20% more accurate. It hasn't been certified and it's not going to be accepted," and all the 'bogus' results are going to be redone.
End of Story!
So can anyone give me a reason why that version is allowed for official crunching, and how the "enough precision" thing fits with what was said here before?
Because it is written by the project developers and tested directly by them, not by a 3rd party.
This is quite old ground, and some of you don't appear to realise we are on about the third lap, I think. So let's get over it team! :-)
I think we've still got a few more laps to go!! :).
Quote:
If it wasn't a duty of a moderator, I would have left this dead horse of a thread ages ago! :-)
Cheers, Mike.
There's still life in this "dead horse" yet, as one of its legs just kicked me while my attention was diverted :). I was laughing at RandyC's latest creative analogy and didn't see it lash out!! :).
Maybe we need to remind people, yet again, that Bruce is the person who makes the decisions and Bruce has been away. It's only just over a week since the first patch hit the streets so surely we can all just relax and wait until Bruce has had a chance to review the whole situation and make a calm and measured judgement about what needs to be done. After all, I'm sure there would be heaps of criticism if some hasty and ill-conceived decision were made without taking the time to think through all the issues.
Could TMR's rescmp tool be used for this? Not sure it'll determine if they're identical, though.
I don't think that TMR's rescmp tool has any certifications...
Oh, I didn't mean that. Just whether it could be used to test the idea that the results are exactly the same. Still up to the Einstein staff and scientists. They'd want to do their own tests.
[EDIT] I suppose it's a bit pointless really. Don't know why I brought it up.
Hmm... Perhaps the Einstein staff could write a tool like TMR's rescmp themselves... and they could certify it... but I think it would be too difficult for them, because it needs a lot of energy and time.
AGAIN, the results are the SAME compared to the standard app, although it takes 30 to 40% less processing time. That's a FACT. Why do you keep ignoring that?
I'm not sure that the results are the same; perhaps there is a hidden calculation fault.
Could you justify your assertion in a way that the Einstein staff can approve?
Hmm. In actual fact, that's what the validator is for. It is a tool that decides if, within experimental error, two results are the same. If this tool doesn't work, then the project is in real trouble, because one *has* to assume there are bogus apps out there. If experience has taught us one thing, it is that there are people out there who are not as ethical as you, and if they can find a way to get a result past the validator without crunching (as much), then they will. Some of them are not just unethical but smart.
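As a rough illustration of what "the same within experimental error" can mean in practice, here is a minimal sketch of a tolerance-based comparison; the (frequency, significance) candidate format, the function name and the tolerance are invented for the example and are not the actual Einstein@Home validator:

    # Minimal sketch of tolerance-based result comparison. The candidate
    # format and tolerance are assumptions for illustration only, not the
    # real Einstein@Home validator.
    def results_agree(result_a, result_b, rel_tol=1e-4):
        """Return True if two candidate lists match pairwise within rel_tol."""
        if len(result_a) != len(result_b):
            return False
        for (fa, sa), (fb, sb) in zip(result_a, result_b):
            # Compare relative to magnitude so tiny floating-point differences
            # between apps, compilers or CPUs still pass.
            if abs(fa - fb) > rel_tol * max(abs(fa), abs(fb), 1.0):
                return False
            if abs(sa - sb) > rel_tol * max(abs(sa), abs(sb), 1.0):
                return False
        return True

    # Two apps producing slightly different floating-point output still agree:
    stock = [(101.2345678, 12.3456), (202.4691356, 8.7654)]
    fast = [(101.2345679, 12.3455), (202.4691355, 8.7655)]
    print(results_agree(stock, fast))  # True

Two results produced by different hosts (or different app versions) can then be called equivalent without either of them being a known-good reference.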
My assumption about what the problem was before is that we had a situation where two identical "bad" results were being presented to the validator for comparison and were either getting through or screwing up the validator. That this could happen is a concern, but if the validator can be trusted to be "sufficiently" robust, then it is fixable.
By the way, I agree with you that we don't know the results are "the same" for all inputs, even if the validator lets a large class of results through. On the other hand, we don't know for sure that the standard app is "correct" for all input streams either, unless of course the algorithm has undergone a formal proof of correctness, the code has undergone a formal proof of adherence to the algorithm, all the compilers have undergone formal proofs... etc. I somehow doubt it, because formal proofs of correctness are very hard to do and rarely undertaken. Does this matter? Not really. The scientific method is all about replication of results. No doubt if any assertions come out of this whole project, other teams will try to get the same results from the same data using their own analysis. The BOINC project will have done its job in that it *should* be showing the scientists where to look in the huge mountain of data. BOINC is well suited to the class of problems where finding a solution is intractable but checking a solution for correctness is actually quite tractable.
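As a toy example of that last sentence (subset-sum here is only a stand-in problem, not the E@H analysis): finding a solution may mean searching an exponential space, while checking a claimed solution takes a single pass.

    # Toy illustration of "finding is intractable, checking is tractable".
    # Subset-sum is a stand-in problem, not the Einstein@Home analysis.
    from itertools import combinations

    def find_subset(numbers, target):
        """Brute-force search: tries up to 2^n subsets (infeasible for large n)."""
        for r in range(len(numbers) + 1):
            for combo in combinations(numbers, r):
                if sum(combo) == target:
                    return combo
        return None

    def check_subset(target, candidate):
        """Checking a claimed solution is cheap: one pass over the candidate."""
        return candidate is not None and sum(candidate) == target

    nums = [3, 34, 4, 12, 5, 2]
    claimed = find_subset(nums, 9)               # the expensive search
    print(claimed, check_subset(9, claimed))     # (4, 5) True -- the cheap check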
So, while I have taken down the optimised app (which validated OK 15/15 times) and gone back to the stock app because the team have requested it, I don't think it does anything for the ultimate validity of the science. (But it is probably a good move to step back and think.)
I'll sit back and wait until people have had time to draw breath and think this through. But if the end result turns out to be "Akos' optimised apps were threatening the validity of the project, and having them taken down fixes it", then I for one will have very serious doubts about the validity of the whole project and the point in donating my electricity to it.
Quote:
Could you justify your assertion in a way that the Einstein staff can approve?
Hmm. In actual fact, that's what the validator is for.
You are right, the validator has to validate the results, but I think this designation is a bit excessive. The validator only does a comparison between the results and checks some feedback signals. So 'result comparator' would be a better name than 'validator'.
Quote:
It is a tool that decides if, within experimental error, two results are the same. If this tool doesn't work, then the project is in real trouble, because one *has* to assume there are bogus apps out there. If experience has taught us one thing, it is that there are people out there who are not as ethical as you, and if they can find a way to get a result past the validator without crunching (as much), then they will. Some of them are not just unethical but smart.
I think this tool cannot do a true validation, because it would need a perfect result for each comparison, and the project doesn't have enough resources to produce these perfect results. (And if you already had a perfect result... why would you want to regenerate it with an imperfect machine?)
Quote:
Then I for one will have very serious doubts about the validity of the whole project and the point in donating my electricity to it.
I'm a second one who has this serious doubt. The distributed projects' staff could write better applications that would produce safer results, but that still wouldn't guarantee perfect results.
Perhaps you remember SETI Classic, one of my favourite examples. That application used very defensive code: it checked itself and its data areas every minute and generated a checksum. But of course, this method was also imperfect. Anybody could put a backdoor into the code and neutralize this method too...
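To make the idea concrete, here is a minimal sketch of that kind of periodic self-check; the names, data layout and work loop are assumptions for illustration, not SETI Classic's actual code:

    # Minimal sketch of a self-checking app: keep a checksum of a supposedly
    # constant data area and re-verify it between chunks of work. All names
    # and the layout are assumptions, not SETI Classic's real implementation.
    import hashlib

    def checksum(buffer: bytes) -> str:
        return hashlib.sha256(buffer).hexdigest()

    data = bytes(1024)            # stand-in for the app's constant data area
    expected = checksum(data)     # recorded once at startup

    def do_some_work():
        pass                      # stand-in for a minute of real crunching

    for _ in range(3):            # re-check between work chunks
        do_some_work()
        if checksum(data) != expected:
            raise RuntimeError("self-check failed: data area was modified")

And the weakness is exactly the one described above: whoever can patch the work code can also patch the check (or the stored expected value), so it only raises the bar.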
You can reduce the probability of wrong results, but it will never reach zero. So perhaps there are no perfect solutions for open computing projects...
I think the first half of this hits the nail right bang on the head. You will never reduce the probability of wrong results to zero. In fact I would say that the probability of *some* wrong results getting into the database is almost unity over the life of the experiment.
But this, I think, is a property of open distributed computing projects. The key is to have a methodology that is robust against a small percentage of bogus results. I'm pretty sure that Einstein has a robust methodology, but I would like to hear someone from the science team come out and say it.
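One common shape for such a methodology is quorum agreement: a value is only accepted once a minimum number of independently computed results agree within tolerance, so a small fraction of bogus results gets outvoted. The quorum size and agreement rule below are illustrative assumptions, not necessarily what Einstein@Home actually uses:

    # Sketch of quorum-based acceptance: keep a value only if at least
    # `quorum` independent results agree on it within a relative tolerance.
    # Quorum size and tolerance are assumptions for illustration.
    def close(a, b, rel_tol=1e-4):
        return abs(a - b) <= rel_tol * max(abs(a), abs(b), 1.0)

    def canonical_result(results, quorum=2):
        """Return a value agreed on by at least `quorum` results, else None."""
        for r in results:
            agreeing = [s for s in results if close(r, s)]
            if len(agreeing) >= quorum:
                return sum(agreeing) / len(agreeing)   # outliers are discarded
        return None

    # Two honest hosts and one bogus one: the bogus value is simply outvoted.
    print(canonical_result([101.2345678, 101.2345679, 999.0]))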
I agree: a robust methodology is the key to the reliability of the scientific result.
miw,
Thank you for some pretty astute and rational thinking.
You have summed up the situation very nicely indeed.
Cheers,
Gary.