First WU finished. The other results are done with the patched app. So the beta app is about 8% slower than the patched app and about 22% faster than the official app.
cu,
Michael
As an added bonus, you cross-validated with Darwin...
Wuuh, I didn't see that yet. :-)
Maybe I should do some more WUs until I get results with Linux boxes too.
Hey Bruce, let's do some together! ;-)
cu,
Michael
But the validation problem is definitely not fixed with this version (just to prevent expectations from getting too high :-) ).
Oh-oh... came home to find my notebook exposed to full sunshine over the last couple of hours (silly me). HD temperature 59°C, not sure what the max was ... still no errors :-). It must have stayed below 75°C or else the system would have shut down automatically. I will try later to simulate some network errors.
CU
BRM
Oh, I know it's not "fixed". My take is that the only two "fix-like" items hoped for in the beta are the AMD/Windows disparity issue and the checkpointing issue (Exit Status 10). Beyond that, verbosity was increased in debug routines...
However, it was still interesting to note that the validation happened with a non-Microsoft OS... I haven't been paired up with another OS in quite a while...
Brian
From the beta page:
'a different function is used for rounding in sin/cos calculation (ftol() instead of modf())...'
To me these validation errors smell like rounding (precision) problems, and this new function might affect them, but of course I can be wrong.
I had several Darwin wingmen and about every fifth WU got 'checked but no consensus yet'; unfortunately I don't remember if it was Darwin/Linux or Darwin/Windows. I will do some more WUs, and one of the next is with a Darwin partner again.
cu,
Michael
The modf function splits a double into its fractional part and its integral part. This is an exact computation that involves no rounding or loss of precision; the result will be the same, bit for bit, under the gcc-compiled code and the MS VC code, with or without SSE2.
I've got a workunit manually stripped down to a single sky coordinate which I can run under Windows and Linux to reproduce the cross-platform validation problem. It does show up with the new app as well, as could be expected.
CU
BRM
The part I'm interested in, and I couldn't find a direct comparison via some searching with Google, is whether modf() is faster or slower than ftol(). If modf() is faster, I would certainly prefer to keep using that instead of having the Windows app use ftol(), unless in the long run assembly is used for both platforms and the speed then balances out.
Brian
It depends...hard to say in general.
Not wanting to hijack this thread, just a few comments:
- E.g. the SSE2 code path of modf used by the version of MS VC concerned is quite fast.
- The non-SSE2 code path of modf is very, very, very slow.
- ftol has only one code path, non-SSE2, which is much faster than non-SSE2 modf but a bit slower than SSE2 modf (all for the version of MS VC used here).
Then again, this beta isn't really that much about performance.
It survived a virus scan so far, I'll pull a few network cables later on.
CU
BRM
Quote:
Then again, this beta isn't really that much about performance.
The performance angle is a required feedback point since it is mentioned by Bernd in his post. Additionally:
One shouldn't really preach inter-project parity unless one can provide somewhat respectable intra-project parity.
Hence, discussion about performance is highly germane to the broader beta discussion, and I will respectfully disagree with anyone who implies otherwise. :-)
Edit: I do understand, though, that continued concentration on that single aspect once the reporting has been done is not really productive, provided it is understood that this was a good step forward but more work is needed. Could that wait until the optimization stage? Perhaps. One could argue, though, that the optimization will help Linux just as much, and thus the penalty would still be there.
Quote:
It survived a virus scan so far, I'll pull a few network cables later on.
Virus scanning is a good test, as it would stress the checkpointing. Another disk test would be a defrag.
The "overheating" scenario is also a good "test", although I'm not sure you would really wish to repeat that very often...
I'm not sure what yanking a cable is going to do for you unless you time it specifically during a network request.
My system runs the screensaver for 5 minutes and then blanks the screen. Perhaps going longer than that could be helpful, so I can try that...
Final damage report from my first result using 4.23:
Total time: 59556.80 seconds
Paired with: Athlon XP system running Windows XP
wuid=34030246
I guess this will be a test of XP vs. XP validation ;-)
I didn't try anything "creative" while running the result, because it really is a good idea to first make sure that you are able to get all the way through at least one submission before doing "creative" stuff, lest you have too many input variables to check.
On to the "creative" attempts at causing a breakage:
In my opinion, this is a noble effort to undertake, but probably largely a waste of time if the system being tested upon was previously fundamentally stable. Edit: If a "creative" crash is caused, then said creativity needs to be regression tested with 4.17 and then multiple people need to attempt the same creativity.
What's really needed is recruitment of people over in the "Problems and Bug Reports" section who were having problems. If Kirsten is still around, perhaps she could ask members of BOINC Denmark who may have had issues in the past to do testing, with the understanding that there will likely be no change to what they experienced before, only that they will be contributing more information than before to help narrow down the causes of the issues.
Fundamentally, without the involvement of people who were having problems before, all we get is a few tests here and there to make sure that "non-creative" (aka "normal") runs of the application don't crash (aka "no new bugs") and that the results still validate by and large. Given that, and considering the largely debug nature of this build of the application, I see no reason not to deploy to the entire user base by the end of the week so as to expedite the retrieval of meaningful debug data for the other problems that exist.
IMO, YMMV, etc, etc, etc...
Brian
Quote:
I'm not sure what yanking a cable is going to do for you unless you time it specifically during a network request.
Was planning to suspend network traffic, wait until there's something to upload, sabotage the network, then push "Update" manually to force it to contact the server while the network is sabotaged.
Let's see what happens.
CU
H-B