The debug code wasn't so much removed as changed so that it is more useful (and happens to be faster). The focus of the Windows beta tests is still on stability; I guess that's why the changelists don't dwell on performance so much.
Do you know if there has been more progress on finding out the cause of the wide variations in runtime?
Hi!
There has been significant progress, see this message.
CU
Bikeman
It seems they sorted out the bugs, so is there anybody working on the optimization for MacOS X PPC?
I hope so :)
I am waiting for a new PPC app; the current PPC app is a bit slow (a G5 is 15-20% slower than an Athlon64 at the same clock frequency), whereas with the old S5R2 app a G5 was 15-20% faster than an Athlon64 at the same clock frequency.
The optimized apps for Mac OS X Intel and Windows x86 have been ready for about a month now, so I am getting impatient :)
First, and most importantly, the only SSE-optimized application is for Intel MacOS X. The only speedup that the Windows app has seen is the removal of some debug code and a different way of handling a math error that was faster than what was being done previously. That is not true "optimization". It is more of a "removal of a slowdown" than anything else. The other apps may see similar improvements, which brings us to...
Second, the AMD/Windows combination suffered a performance penalty compared to AMD/Linux back on S5R2. I think this penalty still exists, although the person I used to compare against, FalconFly, no longer runs BOINC on their AMD/Linux machines (don't know why, and haven't asked). I'm talking about a penalty across the same processor architecture. For example, an Athlon64 3200+ would perform significantly better with the Linux application than the identical processor running the Windows application. This delta was around 15%, if I remember correctly... I would like to see that gap closed somewhat...
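For anyone who wants to put a number on a gap like this, one way is to compare mean runtimes for comparable work units on the same CPU model under each OS. A minimal sketch in Python; the runtimes below are made-up placeholders, not measurements from the project, and `relative_penalty` is just an illustrative helper name:

```python
from statistics import mean


def relative_penalty(slow_runtimes, fast_runtimes):
    """Fractional extra runtime of the slower platform versus the faster one."""
    slow = mean(slow_runtimes)
    fast = mean(fast_runtimes)
    return (slow - fast) / fast


# Hypothetical runtimes in seconds for comparable work units on one CPU model.
windows_secs = [46000.0, 46200.0, 45800.0]
linux_secs = [40000.0, 40100.0, 39900.0]

penalty = relative_penalty(windows_secs, linux_secs)
print(f"Windows takes {penalty:.1%} longer than Linux")
```

Given the wide runtime variations mentioned at the top of the thread, a comparison like this only means something if the work units on both sides are really comparable.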
I am tired of this no-information policy. I just reduced my CPU share for Einstein@Home from 50% to 10%. I will be back once they offer a more efficient Mac PPC app (when they switched from S5R2 to S5R3, my RAC dropped from 1,000 to less than 700 with the exact same settings as before).
See you sooner or later ...
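As a rough cross-check, the RAC drop can be compared with the per-clock figures another poster gave earlier in the thread (G5 15-20% faster than an Athlon64 under S5R2, 15-20% slower now). A back-of-the-envelope sketch in Python, under two loud assumptions that are not confirmed anywhere in the thread: "faster/slower" is read as a throughput ratio, and the Athlon64's credit rate is treated as unchanged between the two runs.

```python
def fractional_drop(before, after):
    """Fraction lost when going from `before` to `after`."""
    return (before - after) / before


# RAC figures from the post: 1,000 before, "less than 700" after,
# so this is a lower bound on the drop.
rac_drop = fractional_drop(1000.0, 700.0)

# Assumed reading of the per-clock claims as G5 / Athlon64 throughput ratios.
s5r2_ratio = (1.15, 1.20)  # G5 15-20% faster under S5R2
s5r3_ratio = (0.80, 0.85)  # G5 15-20% slower under S5R3

# Implied relative slowdown of the G5, if the Athlon64 baseline did not move.
smallest_drop = 1 - s5r3_ratio[1] / s5r2_ratio[0]
largest_drop = 1 - s5r3_ratio[0] / s5r2_ratio[1]

print(f"RAC drop: at least {rac_drop:.0%}")
print(f"Implied per-clock drop: {smallest_drop:.0%} to {largest_drop:.0%}")
```

Under those assumptions the two reports land in the same ballpark, but they come from different posters and different machines, so treat this as a plausibility check and nothing more.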
I should definitely build a new PPC App from current code. Some recent changes in BOINC prevented me from doing so, as did the fact that the Apps I made from current code ran slightly slower than the first S5R3 Apps.
I hope that I can get a new PPC App out the door this week, but I wouldn't expect it to be noticeably faster than the current one. I intend to publish a "power App" for a limited range of machines (something like G5 with MacOS >= 10.4) once I get the AltiVec code working with the current code (which it doesn't yet).
Einstein@Home will start S5R4 before there is an optimized S5R3 app for PPC.
The most surprising thing is that it took 3 months to get an optimized S5R3 app for Windows x86, which represents 90% of Einstein users, while x86 Mac OS X users, who represent 3% of Einstein users, have had an optimized S5R3 app for more than 3 months.
Which makes no sense!
S5R3 was split into S5R3 and S5R3b (to start now) for technical reasons; S5R4 is still several months in the future.
From the perspective of minimizing risk to the project, it does make sense to roll out new features first on the less-used platforms, though. There were also technical reasons (mostly related to the compilers used) why this particular optimization was first released on Mac OS-i686, and then on Linux-i686.
All the Intel apps use the same SSE code, which is handcrafted assembly code. Of course the PPC has a completely different vector unit.
CU
Bikeman
I think beta apps are there to minimize risks, no? I'm surprised by your argument, because it seems to me that you did not proceed this way before.
I fully understand that there are technical problems with the PPC app.
But the difference between the x86 apps for Mac OS X, Linux and Windows is very tenuous, so I don't understand why it takes more than 3 months to recompile for Windows!
Any news?
Bernd Machenschalk:
And another month has gone by without any news...
4 months have gone by ...
Is the SSE code assembly code, or C/C++ code?