I have an "interesting question". I tried to get VTune up and working on my machine to answer it, but since it is an AMD processor, it squawked about the processor architecture...and on top of that, I have no idea how to use the blessed thing... The C++ DLL that I worked on was not a performance drag (credit card auth on tcp/ip usually happened very quickly), so it was never "tuned"...
Sooooo.....
What exactly happens when you "ABC" again? Is that working against the modf() -> ftol() change? Or is there still some activity going to modf() despite the change, meaning there's another "buggy detection" different from the one that was already worked around? Or is that string changing some other function?
Brian
My VTune trial license expired...
Anyway...the effect of the "ABC" patch is that on AMD CPUs that support SSE2, a global flag in the runtime library is set differently. This flag toggles the (usually) faster SSE2 codepath for several functions, not just modf. What Bernd did was rewrite the code in the hot loop so that it no longer calls modf but ftol, for which, in VS 2003, only one code path exists, and that one is reasonably fast. The slow codepath will still be executed in the new Win apps, just no longer in the hot loop, as I understand it, so the overall effect of "ABC"ing the app should be much smaller now.
Gotcha... Yeah, it doesn't have as much octane as on 4.17... I really hope Barcelona/Agena (Phenom...btw, IMO, silly name) will at least get AMD back onto a level playing field from an architecture standpoint...
If there are differences between the various compilers and math libraries, how do we know which ones will give scientifically accurate results?
This is a very good question, and difficult to answer indeed.
"Science" (as it applies here) is based on mathematics, which is based on an ideal world: values are continuous, spaces are infinite etc.
Calculations performed on real-world machines (computers) are not like this: resources (memory, time) are limited, and so is precision, which means values are discrete. In this sense every (non-trivial real-number) calculation done on a computer is wrong with respect to the ideal model the implementation is based on. However, in many (hopefully most) cases the difference ("error") is negligible.
Although every "computation" as mentioned is "wrong", i.e. differs from the mathematical ideal, the difference itself varies between the systems the calculations are done with (CPUs, compilers, libraries etc.). A way to make all computations at least wrong in the same way is to set a standard for how they are performed, independent of the properties named above. This is what IEEE 754 tried to do.
Almost all "systems" have some way of enforcing calculations that conform to this standard. However, most modern processors have evolved beyond it, e.g. by adding faster implementations of their own flavor of floating-point arithmetic, so enforcing "IEEE arithmetic" is still possible, but it would noticeably slow down the computation compared to the system's "native" way.
So for us a way to ensure cross-platform compatibility would be to use IEEE arithmetic (and it would make the various CPUs truly comparable), but it would generally slow down the computation. For a project whose success (i.e. the probability of detecting a gravitational wave) depends so much on "computing power" (here: the number of computations done), this would have a severe impact, too.
Quote:
And, have a lot of us been producing results that are worthless?
Definitely not. In principle all results have been helpful, even though they didn't pass validation. We will need to adjust the App and/or the validator to make the good results pass validation, regardless of the platform they have been calculated on.
Interesting questions indeed. Follow up question: One way to improve validation would be to inject simulated "Pulsar Signals" into the input data and verify that the clients find them. Are there any plans to do that in the future?
Quite true. There actually are two types of "signal injections" already being done: "hardware injections", which actually affect the test masses of the detector (sometimes used for calibration, too) and thus test the whole pipeline from detector to data analysis; and "software injections", where fake pulsar signals are added by software to the data that has been recorded from the detector.
There should be more detailed descriptions of this in the S3 results report (available through a link from the front page).
This, however, is beyond the scope of a single workunit, and thus does not help for technically validating individual results.
For the curious: You can provoke a "client error" (Breakpoint) by putting a file named "EAH_MSC_BREAKPOINT" into the BOINC directory (remember to remove it after testing!).
Oh, just what we need... provoking... ;-) I don't think we should provoke Einstein. He may roll over and declare E=MC3 :-O
Well, this is just for testing the retrieval of the symbols from the symbol store, so only try it if you're really curious. It will happen right at the beginning, so it shouldn't waste computing time. You should probably set the project to "no new work" before you try this, and "allow more work" after you've removed the file, in order not to trash too many results. The result should look like this result; in particular you should find "PDB Symbols Loaded".
1st WU finished w/o incident, and valid. My host was the 3rd to finish, zeroing out the credits of a Linux box. I do hope it wasn't you again, Gary :-(. I was lucky to get away alive with it last time...
Hey Bernd, got some odd errors for my WU and wanted to check back:
Quote:
27/06/2007 23:59:37|Einstein@Home|[error] einstein_S5R2 not responding to screensaver, requesting exit
27/06/2007 23:59:38|Einstein@Home|Task h1_0493.15_S5R2__281_S5R2c_0 exited with zero status but no 'finished' file
27/06/2007 23:59:38|Einstein@Home|If this happens repeatedly you may need to reset the project.
27/06/2007 23:59:38|Einstein@Home|Restarting task h1_0493.15_S5R2__281_S5R2c_0 using einstein_S5R2 version 424
28/06/2007 00:00:40||Suspending computation - user is active
28/06/2007 00:06:23||Resuming computation
28/06/2007 00:06:23|Einstein@Home|[error] einstein_S5R2 not responding to screensaver, requesting exit
28/06/2007 00:06:24|Einstein@Home|Task h1_0493.15_S5R2__280_S5R2c_0 exited with zero status but no 'finished' file
28/06/2007 00:06:24|Einstein@Home|If this happens repeatedly you may need to reset the project.
28/06/2007 00:06:24|Einstein@Home|Restarting task h1_0493.15_S5R2__280_S5R2c_0 using einstein_S5R2 version 424
It doesn't look too relevant to me and I'm not sure where it comes from (maybe from having a second screen connected to my notebook at that time and set to "primary"), but I thought I'd report back anyway just to be on the safe side. Both WUs seem to be crunching away normally now.
First WU with 4.24 completed and validated OK even with WinXP paired with Darwin wingman. Running AMD XP2600+ WinXP against wingman with Intel core 2/T5600 Darwin 8.9.1.
Appears to be using the same datapack as pre-4.24 app with time decrease of 28500 sec (approx 24% better)...that's just under 8 hrs less time/WU. Nice improvement on the AMD/Windows penalty. CPU does NOT have SSE2 capability, only SSE.
BTW, Bernd, did you notice this message from Gary Roberts? http://einsteinathome.org/node/192856&nowrap=true#70886
It seems the l1_* files never get deleted, slowly filling up the disks of hosts until the quota is reached, effectively shutting down work for Einstein@Home after some time. This could be responsible for some hosts dropping out of E@H.
CU
BRM