(snip)We've been working with experienced programmers from Apple to make the AltiVec version as fast as we can, and of all official Apps it makes the most of every clock cycle of a CPU.(snip)
BM
My current vector code has either a minor bug or a numerical instability that shows up as occasional invalid results from the 4.55 Linux App. I want to have this fixed before I do anything else with that code, such as using it on other platforms.
the eah staff hand optimized
The E@H staff hand-optimized their way to a ~2x speedup over the compiled PPC code; Akos has reached ~7x on x86. Maybe you should think about buying him a Mac. :) Seriously though, he wasn't able to orient himself in the x86 Linux app, and I strongly suspect PPC asm will be far worse.
RE: the eah staff hand
Seriously, I'll be thinking more about having him port his optimization to the x86-based Macs instead....
RE: the eah staff hand
:) These processor-based systems aren't too different, so I'm sure the optimization is possible on any system. I modified the Windows app because I use Win-x86 systems and it was the easiest way for me to have fun crunching. Of course, I have some other ideas (e.g. a cheap PCI-bus accelerator), but they would need lots of time and I don't have it. Pure software development gives "fun" much faster, especially on Win-x86. Sorry...
Let's crunch! ;-)
RE: RE: the eah staff
How I wish there were dedicated Mac crunchers here at Einstein who possess your prodigious wisdom, as there were with SETI@home at MacNN....
With the exception of Bernd Machenschalk, of course.... I'm sure he's doing everything he can for the Mac community....
Despite the problems I posted
Despite the problems I posted the new MacOS App for public beta testing, especially to get some feedback. See this thread.
BM
RE: (snip)We've been
Thank you, Bernd, for keeping up the communication as an official member.
May I ask a question? For a user like me, it is difficult to understand why someone like Akos will not be granted a look at the source code to improve it, given what he has managed to pull out of the hat. Could you elaborate a little on why the project board chooses to release speed improvements that are a country mile off Akos' work?
Is the project team lacking the technical expertise to follow Akos' methods? Does the project funding depend on the code NOT leaking?
Please understand that I just find it VERY DIFFICULT to make sense of a decision (?) not to make the project more efficient by a factor of 7 and instead to stick with an optimisation of 2 at best.
I honestly do not want to assign blame here, but if someone comes forward and offers me the chance to get my work done seven times faster, then I don't turn around and say no. Hence my confusion, and hence my question.
Regards
Soenke
:
your thoughts - the ways :: the knowledge - your space
:
RE: RE: (snip)We've been
Hi Soenke,
unfortunately I cannot find the post where someone offered help such as Akos did for x86. Could you point me to that post, please?
RE: My current vector code
Hi Bernd,
any progress with the new science app?
On Intel machines we
On Intel machines we currently use assembly code that avoids this problem. I'm afraid I won't be able to do anything about these invalid results from the PPC code for the last remaining workunits of the S4 run (estimated at one month), but this shouldn't occur in the next run anymore. Code that includes measures to avoid these invalid results with the current workunits wouldn't run faster than the AltiVec code in the current official PPC Mac App.
BM
Will there be room for
Will there be room for improvement on the PowerPC-based Mac App for the next run...?