poor measured performance on Linux vs Windows (bad compiler choices/options)

josep
josep
Joined: 9 Mar 05
Posts: 63
Credit: 1,156,542
RAC: 0

Well, I am new here, and I

Well, I am new here and have no relation with the developers (I am just a home-PC user). But I remember a post from Bernd explaining that they have not yet made the code public because this is not a simple task for them.

He says there is no standalone codebase for the Einstein application. They use parts of the routines already developed for the LSC, and during compilation they have to link code from two CVS trees at the LSC plus some extra code. They say it is not simple to gather all this source code and distribute it to users.

But all this code is not secret or hidden. Everybody is allowed to see it in the LSC group's CVS tree. You can read it via anonymous web access at http://www.lsc-group.phys.uwm.edu/

And there are several interesting things to see there. First of all, notice that the whole LAL library is developed on Linux, and people in the LSC group use this OS on their own machines. So the scientific code of the Einstein application is (I suppose) already optimized for Linux. It is the Windows code that has been developed recently for E@H (Bernd has built some Windows libraries that allow building all the LAL code on Windows). So I suppose they are going to fix the problems soon.

The second thing you can see in this CVS is the ToDo file for Einstein@home developers. There you can see that they have already planned to study differences in compilation between GCC and MSVC, and to try Intel's compiler for Linux, too.

And they are also planning a public release of the E@H code, because the ToDo file contains instructions for adapting the validator "before announcing the anonymous platform". It seems that the validator has to be rewritten to check more accurately the results of "unofficial" code (applications built by users), to protect the database from malicious results, I suppose.

So my personal conclusion is that they are already working on it. Perhaps a little more information on all these questions, posted regularly by the developers in this forum, would be a very good thing...

Paul D. Buck
Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5,385,205
RAC: 0

> The subject was the BOINC

Message 8753 in response to message 8749

> The subject was the BOINC client. There is still a substantial disparity
> between OSs on the benchmarks - therefore the optimised clients are very
> worthwhile.

Yes, I did not dispute that. I looked at my PCs running Windows, then looked in the top performing computers and saw a fair number with runtimes that were longer, but not substantially more than mine.

> > Heck, my best performing system time
> > wise for almost all projects is my G5 which is running OS-X which is
> Linux
>
> OS X is not Linux. It is based on BSD Unix which is a very different animal.
> ;)

Just rechecked my sources, and you are right... oops ... it also is using part of OpenStep and Mach ...

> > As far as not releasing the client, that is their option.
>
> Of course it's their option - just as it is mine not to crunch Einstein. ;)

True, voting with your feet is an option ... I just hate to see you go ... :)

> If it's any consolation to anyone, Predictor is even worse to Linux users and
> that gets my thumbs-down too. :(

If nothing else, it should improve with time. Like I said in the other post, I think the optimization effort in SETI@Home is going to pay off for all of the projects, but that is not going to happen real soon I don't think ...

Seti-Cruncher
Seti-Cruncher
Joined: 9 Feb 05
Posts: 70
Credit: 7,114
RAC: 0

> > > As far as not releasing

Message 8754 in response to message 8753

> > > As far as not releasing the client, that is their option.
> >
> > Of course it's their option - just as it is mine not to crunch Einstein.
> ;)
>
> True, voting with your feet is an option ... I just hate to see you go ... :)

Well, I'm not completely gone. I'm still keeping an eye on things here - even if somewhat sporadically at the moment. ;)
>
> > If it's any consolation to anyone, Predictor is even worse to Linux users
> and
> > that gets my thumbs-down too. :(
>
> If nothing else, it should improve with time. Like I said in the other post,
> I think the optimization effort in SETI@Home is going to pay off for all of
> the projects, but that is not going to happen real soon I don't think ...

I am now very satisfied with my machines' performance in SETI - although that will not stop me looking for further improvement. ;) It can only happen in the other projects if they release the source code. You may have noticed that I have been a small part of the optimisation efforts in SETI. Unfortunately, my attempts to help here in Einstein have been ignored by devs and admins and, without the source code, there is nothing else I can do.

For the time being, the only option for the efficiency (and credit) aware Linux user is to run SETI with CPDN as a poor but acceptable backup.

Be lucky,

Neil

Oliver
Oliver
Joined: 24 Feb 05
Posts: 56
Credit: 129,576
RAC: 0

> You may have noticed that

Message 8755 in response to message 8754

> You may have noticed that I
> have been a small part of the optimisation efforts in SETI. Unfortunately, my
> attempts to help here in Einstein have been ignored by devs and admins and,
> without the source code, there is nothing else I can do.
Sure there is something you can do, dyslexia. Behave in your new dress or get filtered as fast as you can post.

Es gr

history
history
Joined: 22 Jan 05
Posts: 127
Credit: 7,573,923
RAC: 0

Just a comment here; I have a

Just a comment here; I have a 2 GHz Athlon box doing nothing. Based on forum information (not just here), pushing AC power into this thing to run BOINC under SUSE is a waste of CPU cycles. Thank you Boikley. Long on software? I think not. Where's the kick-ass Linux GUI?

Jordan Wilberding
Jordan Wilberding
Joined: 19 Feb 05
Posts: 162
Credit: 715,454
RAC: 0

If you use wine to run the

Message 8757 in response to message 8756

If you use Wine to run the Windows binary on Linux, it's not really a waste at all.

> Just a comment here; I have a 2 ghz athlon box doing nothing. Based on forum
> information (not just here), pushing ac power into this thing to run Boinc
> under SUSE is a waste of CPU cycles. Thank you Boikley. Long on software? I
> think not. Where's the kick ass Linux GUI?
>

such things just should not be writ so please destroy this if you wish to live 'tis better in ignorance to dwell than to go screaming into the abyss worse than hell

Seti-Cruncher
Seti-Cruncher
Joined: 9 Feb 05
Posts: 70
Credit: 7,114
RAC: 0

> Sure there is something you

Message 8758 in response to message 8755

> Sure there is something you can do, dyslexia. Behave in your new dress or get
> filtered as fast as you can post.

Are you a complete idiot? If I were Lysdexia, why would I have been having a go at him/her? ;)

Be lucky,

Neil

Jure Repinc (JLP)
Jure Repinc (JLP)
Joined: 22 Jan 05
Posts: 13
Credit: 7,393,020
RAC: 0

I also hope the E@H

I also hope the E@H developers find out what is wrong with the Linux binary, or even with the code for Linux. I know GCC usually makes slower code than ICC or MSVC, but the difference in E@H is just way too high, and I guess there must be something wrong.

Maybe the E@H developers could talk with some people from the GCC project to find out what is going on and why the Linux binaries are so slow. They could also try the new GCC 4.0.0, which produces quite a lot faster code in my experience with it. But be careful, as it can also miscompile a lot of code. Still, it should be worth trying out.

It would also be nice if the developers could provide 64-bit binaries optimized for AMD64. This is another area where speedups are usually quite high for scientific apps.

It would be nice if some developer could comment on these issues here on the forum and tell us what they are doing about them. I guess they would like to get back as much data as possible, and for that they should get as much as possible out of our hardware.

Wurgl (speak^Wcrunching for Special: Off-Topic)
Wurgl (speak^Wc...
Joined: 11 Feb 05
Posts: 321
Credit: 140,550,008
RAC: 0

> I also hope the E@H

Message 8760 in response to message 8759

> I also hope the E@H developers find out what is wrong with the Linux binary or
> even code for Linux. I know GCC usualy makes slower code then ICC or MSVC but
> the difference in E@H is just way to high and I guess there must be something
> wrong.
>
> Maybe E@H developers could talk with some people from GCC project to find out
> what is going on and why Linux binaries are so slow. They could also try the
> new GCC 4.0.0, which produces quite a lot faster code in my experience with
> it. But be carefull as it can also miscompile a lot of code. But it should be
> worth trying it out.

I analysed the code and compared the relevant parts with the output of µ$'s compiler.

Well, the reason is the instruction scheduling, especially in combination with the stack architecture of the x86 floating-point unit. GCC creates nice code and schedules the instructions in a very sophisticated way. It tries to schedule every instruction so that its operands are ready at the time they are needed (different instructions take different numbers of cycles), and it also tries to take the CPU's multiple execution units into account, among other things.

Now, this is all very fine unless you have two independent dataflows of floating-point data to compute in one block. When this happens, every second instruction works on one dataflow and every other instruction works on the other. This is fine with integers, and fine on every fscking architecture but x86. On x86, every floating-point instruction must use the lowest stack register (plus possibly one other) for computation, and the result always ends up in that lowest stack register. When the next instruction then works on the second dataflow, the data must be swapped, which costs an additional instruction (fxch).

The reason for this is simply the crippled architecture of this CPU's FPU, which can be traced back to the first 8087; it is basically still the same instruction set. And the reason the fxch instructions appear is that the folding from a flat register model to a stack-oriented one is done by GCC after the scheduler has done its work.

That's what is going on.

There are two ways to create better code. First, change the code in the folding part, which actually means rescheduling the instructions. -- This would surely introduce a lot of new bugs in the compiler ...

Or, change the scheduler so that it prefers instructions from the same dataflow, to lower the probability of fxch instructions. This is what I am currently trying to do.

Another way would be to change the CPU description, but I suspect all those internal instruction descriptions would become so complicated that the optimizer and the other passes would simply be unable to handle them. So I think this is impossible too.

According to my sources, the Intel compiler does not create faster binaries on Linux, and neither does GCC 4.0.

G Thomas Wilson
G Thomas Wilson
Joined: 5 Mar 05
Posts: 21
Credit: 2,311,944
RAC: 0

> According to my sources,

Message 8761 in response to message 8760

> According to my sources, the Intel compiler does not create faster binaries on
> Linux, and gcc 4.0 does not too.

In many cases, the Intel compiler does indeed create faster-running code. I've used it on a few occasions and most things tend to run noticeably faster. The only reason I don't use it exclusively is that I prefer to support the GNU guys.

As much as some (myself included) tend to knock GCC, they've really made some improvements over the years, particularly where C++ is concerned. I just thought that needed to be said.

At any rate, I hope a solution can be found that doesn't cause too much inconvenience. 'Tis a great project and I really hated yanking my boxen off of it... but not as much as I hated the thought of fiddle-farting around with Wine on 26 (oops... 31) boxes. :)

