Einstein@Home for 64-bit Linux on AMD Athlon 64 X2

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

Correction: the x86-64 Linux

20 Feb 2007 15:43:03 UTC

Message 43305 in response to message 43304


Correction: the x86-64 Linux client, version 5.8.11, can be downloaded from boinc_5.8.11_x86_64-pc-linux-gnu.tgz (make sure to copy both files to the BOINC working directory). The new x64 Windows client, version 5.8.11, by Crunch3r, can be found at boinc_5.8.11_windows_amd64.zip.

Update on project applications:

* Native 64-bit Application Sent to AMD64 Clients
  * SIMAP (Linux)
  * Chess960 (Linux)
  * ABC (Linux)
  * ABC ß (Linux & Windows)
  * Predictor (Linux)
  * RieselSieve (Linux)
* 32-bit Application Sent to AMD64 Clients
  * SETI & SETI ß (Linux)
  * HashClash (Linux & Windows)
  * Leiden (Linux)
  * Malaria (Linux)
  * Docking (Linux)
  * RieselSieve (Windows)
  * WCG (Linux)
  * Pirates (Linux)

For more information, see BoincStats Forum.

HTH

clownius

Joined: 16 Jun 06

Posts: 42

Credit: 2164665

RAC: 0

I would really really like to

12 Mar 2007 10:29:34 UTC

Message 43306


I would really, really like to see x86_64 supported by Einstein in some form. A native app would be best, but in the short term a 32-bit app issued to 64-bit clients would still be good.
I tried just about everything to get the app working with an app info on my C2D with no luck, and it's stopping my fastest 2 cores from crunching Einstein during AA6.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245208851

RAC: 13214

RE: I'm not sure if they

12 Mar 2007 22:12:45 UTC

Message 43307 in response to message 43303


Quote:

I'm not sure if they made it into the official version, but Akos was experimenting with hotloops that used both SSE and 387 instructions to process data in parallel. If they were put into the deployed version, disabling the 387 would be a significant performance hit for an x86-64 native app.

The current Einstein App uses this method, i.e. doing "more contributing" parts of the calculation in high precision (80bit on FPU) while doing the rest in single precision (SSE). For the current setup doing everything in single precision isn't precise enough.
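To illustrate the idea only: the sketch below is not the Einstein@Home code, and the function name `mixed_precision_sum` and the `threshold` parameter are made up for this example. It assumes that "more contributing" can be approximated by a magnitude test, accumulates those terms in `long double` (which is the 80-bit x87 format on many x86 compilers, but not all), and keeps the rest in single precision.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the Einstein@Home sources.
   Terms with magnitude >= threshold are accumulated in long double,
   all other terms in single precision; the two partial sums are
   combined once at the end. */
static double mixed_precision_sum(const float *terms, size_t n, float threshold)
{
    long double major = 0.0L; /* dominant contributions, high precision */
    float minor = 0.0f;       /* small contributions, single precision  */

    for (size_t i = 0; i < n; ++i) {
        float magnitude = terms[i] < 0.0f ? -terms[i] : terms[i];
        if (magnitude >= threshold)
            major += (long double)terms[i];
        else
            minor += terms[i];
    }
    return (double)(major + (long double)minor);
}
```

The point of the split is that rounding errors in the small terms matter less to the final result, so they can take the faster single-precision path.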

This complicated way of calculation, btw, is the reason why I couldn't simply compile a (native) 64bit App of the current code.

We are working on the code for S5R2, and it looks like it will become a lot cleaner, and probably everything in the "inner loop" can be done in single precision, so it will be a little faster and it should also be easier to build native 64bit Apps (yes, we do care).

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

RE: The current Einstein

12 Mar 2007 22:38:39 UTC

Message 43309 in response to message 43307


Quote:

The current Einstein App uses this method, i.e. doing "more contributing" parts of the calculation in high precision (80bit on FPU) while doing the rest in single precision (SSE). For the current setup doing everything in single precision isn't precise enough...

We are working on the code for S5R2, and it looks like it will become a lot cleaner, and probably everything in the "inner loop" can be done in single precision, so it will be a little faster and it should also be easier to build native 64bit Apps (yes, we do care).

Good to know!

But let me correct you in that although SSE supports only single-precision, SSE2 supports double-precision too. Of course, if Einstein really needs to use x87's extended-precision 80-bit, that's the only way to go.
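For anyone who wants to see the three precisions on their own machine, here is a small sketch using only the standard `<float.h>` macros. The function name `print_precisions` is just for this example. Note the assumption: whether `long double` maps to the x87 80-bit format depends on the compiler and platform, so the last line may simply repeat the `double` value.

```c
#include <float.h>
#include <stdio.h>

/* Print the significand width of each C floating-point type.
   The *_MANT_DIG macros count base-FLT_RADIX digits, which are
   bits on binary machines. */
static void print_precisions(void)
{
    printf("float:       %d\n", FLT_MANT_DIG);
    printf("double:      %d\n", DBL_MANT_DIG);
    printf("long double: %d\n", LDBL_MANT_DIG);
}
```

On IEEE 754 hardware `float` has a 24-bit significand and `double` a 53-bit one; the x87 extended format has 64 bits.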

And in case someone is wondering whether using SSE/SSE2 code side-by-side with x87 code is faster, it isn't, as both SSE/SSE2 and x87 share the same FPU, only through different interfaces.

HTH

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245208851

RAC: 13214

RE: But let me correct you

12 Mar 2007 22:46:32 UTC

Message 43310 in response to message 43309


Quote:

But let me correct you in that although SSE supports only single-precision, SSE2 supports double-precision too. Of course, if Einstein really needs to use x87's extended-precision 80-bit, that's the only way to go.

I know that there is SIMD support for double precision, but 1) there are (or at least were at the time of coding) many more machines that could do SSE but couldn't run SSE2 than machines that could run both, and 2) (re-)aligning the data for double-precision SIMD calculation ate up all the speed we would gain from doing just the four FPU calculations in two double-precision SSE2 calculations. It simply wasn't worth the effort. [Edit] Modern CPUs with their "virtually two FPUs" (another interface to the same physical unit) will combine the FPU calculations for us anyway.
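The "four calculations in two SSE2 calculations" arithmetic can be sketched as follows. This is plain C, not intrinsics, and the helper names are invented for the example; the only hardware fact assumed is that an SSE/SSE2 (XMM) register is 128 bits wide.

```c
#include <stddef.h>

#define XMM_REGISTER_BYTES 16 /* an XMM register is 128 bits wide */

/* Number of elements of the given size that fit in one XMM register. */
static size_t lanes_per_register(size_t element_size)
{
    return XMM_REGISTER_BYTES / element_size;
}

/* Packed operations needed to cover `count` scalar values, rounded up. */
static size_t packed_ops_needed(size_t count, size_t element_size)
{
    size_t lanes = lanes_per_register(element_size);
    return (count + lanes - 1) / lanes;
}
```

With 4-byte floats a register holds four values, with 8-byte doubles only two, so four double-precision values need two packed operations. That is at best a 2x reduction in instruction count, which leaves little room to pay for extra data rearrangement.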

Webmaster Yoda

Joined: 15 Mar 05

Posts: 17

Credit: 608427

RAC: 0

This is all a bit technical

13 Mar 2007 4:21:56 UTC

Message 43311


This is all a bit technical for me, but is it safe to assume that any CPU currently capable of 64 bit supports at least SSE2?

In other words, would the problem with SSE vs SSE2 support be irrelevant for a 64 bit app?

I too have a 64bit (Core 2 Duo) machine that normally runs 64 bit Ubuntu. It's temporarily running Windows so it can participate at Einstein but I would much rather run 64 bit Linux (so it can do a lot of work at projects that have a fast 64 bit app).

Join the #1 Aussie Alliance on Einstein

Metod, S56RKO

Joined: 11 Feb 05

Posts: 135

Credit: 809775752

RAC: 63259

It's not too complicated to

13 Mar 2007 12:35:20 UTC

Message 43312


It's not too complicated to get the current Einstein app running under AMD64 Linux. Details are highlighted in this thread.

Metod ...

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 197

RE: This is all a bit

13 Mar 2007 21:58:06 UTC

Message 43313 in response to message 43311


Quote:

This is all a bit technical for me, but is it safe to assume that any CPU currently capable of 64 bit supports at least SSE2?

In other words, would the problem with SSE vs SSE2 support be irrelevant for a 64 bit app?

Hardware support would be 100% for SSE2, but that wouldn't change the complexity of the software or the need to maintain more concurrent versions of it. The SSE to SSE2 port would still require just as much effort to carry out, and if sufficiently different, more work to maintain as well. AFAIK the only major difference across the codebase for different platforms is that the x86 versions have assembler hotloops instead of C++. Different alignment requirements would require more widespread changes, and from the Akos client days of S4 there was extremely little performance gained from the change.
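The "100% hardware support" point can be checked at build time, since SSE2 is part of the x86-64 baseline. A minimal sketch, assuming the usual predefined macros (`__x86_64__` and `__SSE2__` on GCC/Clang, `_M_X64` on MSVC); the two function names are made up for illustration.

```c
/* Returns 1 when the compiler is targeting x86-64, else 0. */
static int targets_x86_64(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when the build may assume SSE2 instructions, else 0.
   MSVC does not define __SSE2__, but its x64 target always has SSE2. */
static int built_with_sse2(void)
{
#if defined(__SSE2__) || defined(_M_X64)
    return 1;
#else
    return 0;
#endif
}
```

On any 64-bit x86 build both functions return 1, so a native 64-bit app would not need an SSE-only fallback. On other architectures both return 0.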

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245208851

RAC: 13214

I actually made a SSE2

14 Mar 2007 11:04:00 UTC

Message 43314


I actually made an SSE2 version once, not modifying the "hot loop", but other parts of the program (sin/cos LUT). It didn't gain much on some CPUs and was much slower on others (Akos said there _might_ be some advantage on Woodcrests). And yes, it required rearranging the data for a larger part of the program. At that time, the hassle of maintaining (and deploying) yet another version of the code wasn't worth the minimal speedup on only a few CPUs.

For the techs: For the current Apps we maintain four ("production") versions of the source code for the central function (BOINC and graphics are C++, the rest is plain vanilla C):
- Hand-coded Assembler used for all x86 CPUs capable of SSE
- Hand-coded Assembler for x87 calculations (for x86 CPUs that can't do SSE)
- An AltiVec version using Motorola's C/C++-API to AltiVec instructions
- A generic C version that runs on all other CPUs such as G3, MIPS and SPARC
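The choice among the four versions listed above could be pictured roughly like this. It is purely illustrative: the post does not say whether the selection happens at build time or at run time, and every name here (`kernel_kind`, `cpu_features`, `select_kernel`) is hypothetical.

```c
/* The four variants named in the list, as an enum (names are made up). */
typedef enum {
    KERNEL_SSE_ASM,   /* hand-coded assembler, x86 CPUs capable of SSE */
    KERNEL_X87_ASM,   /* hand-coded assembler, x86 CPUs without SSE    */
    KERNEL_ALTIVEC,   /* AltiVec version                               */
    KERNEL_GENERIC_C  /* generic C for all other CPUs                  */
} kernel_kind;

/* Hypothetical capability flags; how they get filled in is out of scope. */
typedef struct {
    int is_x86;
    int has_sse;
    int has_altivec;
} cpu_features;

/* Map CPU capabilities to one of the four variants. */
static kernel_kind select_kernel(cpu_features cpu)
{
    if (cpu.is_x86)
        return cpu.has_sse ? KERNEL_SSE_ASM : KERNEL_X87_ASM;
    if (cpu.has_altivec)
        return KERNEL_ALTIVEC;
    return KERNEL_GENERIC_C;
}
```

An SSE2 or native 64-bit build would add a fifth entry to a table like this, which is the maintenance cost being weighed in the thread.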

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

Thanks for the detailed

14 Mar 2007 18:49:07 UTC

Message 43315


Thanks for the detailed explanation.

Then again, it shouldn't be too hard to send the 32-bit application to the 64-bit clients, as the number of projects already doing this confirms.
