But, as experience on GPUGrid shows, higher-end cards are sometimes the only way to get the work done. That said, the run time of the current tasks on Einstein is about 6-20 hours on typical systems; with a 9800GT card or above, I would expect this to drop to roughly 30 minutes to an hour...
Now that's a head start, since I'm planning a small investment in a 9800GTX+, hoping to kext it on a Mac Pro. Then again, I can use Boot Camp with it in case the Mac GPU client fails to surface...
Paul: The danger with that calculation is that GPUGrid is a single-precision app, while Einstein is double precision. GeForce cards are 8x slower in double precision (I've no idea about ATI). This means that if the Einstein team can't rework their app to run in single precision without losing the needed level of output quality, it will run much slower.
I would not qualify E@H as "double precision". It is true that E@H uses double precision arithmetic in some places, but it also does a whole lot of computation in single precision.
If you compare the performance of the variant of the E@H app that is optimized for SSE (whose SIMD instructions are single precision only) to the one optimized for SSE2 and higher (capable of double-precision SIMD), you'll see that the difference in performance is not that great. You should be able to get quite a lot of boost even from single-precision GPU code.
That's for the S5R5 gravitational wave pulsar search. Note that there's now also the Arecibo EM binary pulsar search here at E@H. This app could benefit from GPUs as well, I guess.
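For anyone wondering what "single vs. double" actually buys you numerically, here is a tiny Python sketch (`f32` is a helper I am defining here to round a value to IEEE-754 single precision; treat this as an illustration, not anything from the E@H code):

```python
import struct

def f32(x):
    # round a Python float (double) to the nearest IEEE-754 single-precision value
    return struct.unpack("f", struct.pack("f", x))[0]

# machine epsilon: the smallest eps with 1 + eps != 1
eps32 = 2.0 ** -23   # single precision (24-bit mantissa), ~1.19e-7
eps64 = 2.0 ** -52   # double precision (53-bit mantissa), ~2.22e-16

# single precision resolves ~7 decimal digits, double ~16:
assert f32(1.0 + eps32) != 1.0 and f32(1.0 + eps32 / 2) == 1.0
assert (1.0 + eps64) != 1.0 and (1.0 + eps64 / 2) == 1.0
```

So single precision carries roughly 7 significant decimal digits against double's 16, which is why only a few error-accumulating spots in an app genuinely need double.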
That's a pity, as the ATI HD 38xx and 48xx series seem to be good at double precision, which is why they are so heavily used at MW. That project is heavily reliant on double precision, as I understand.
Certainly the processing speed-up is well worth the graphics card upgrade (assuming WU feed is not an issue).
Using the old AGP HD38xx GPU will bring a six-year-old P4 PC up to the output of a 25% overclocked, unlocked Penryn extreme quad.
Shih-Tzu are clever, cuddly, playful and rule!! Jack Russell are feisty!
If the code at E@H is "heavy" with double-precision code then, well, the number of CUDA cards that will be usable drops like a rock...
But we are all speculating in a vacuum, with no actual information or application to pick apart. Which is why I am patiently waiting until there is an actual application before I get that interested in making a change to the GPUs I have on hand ... :)
In the "NVIDIA CUDA Programming Guide 2.0", Appendix A "Technical Specifications", section A.1 "General Specifications", I found that only the following devices have compute capability 1.3:
GeForce GTX 280
GeForce GTX 260
Tesla S1070
Tesla C1060
And in section A.1.4 "Specifications for Compute Capability 1.3" I read:
Quote:
Support for double-precision floating-point numbers.
If I understand right, devices "smaller" than the GTX 260/280 and Tesla S1070/C1060 cannot operate on double-precision numbers and "truncate" operands to single-precision floats.
P.S. But for ATI Radeons we have a different situation. Or not?
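That reading matches the guide: double-precision hardware starts at compute capability 1.3. As a sketch of the check (the table below is hand-filled from Appendix A plus a couple of lower-end cards mentioned in this thread; real code would query `cudaGetDeviceProperties()` instead of a hard-coded table):

```python
# Hypothetical lookup table: device name -> compute capability (major, minor).
# Hand-filled for illustration; a real app asks the CUDA runtime at startup.
COMPUTE_CAPABILITY = {
    "GeForce GTX 280":   (1, 3),
    "GeForce GTX 260":   (1, 3),
    "Tesla S1070":       (1, 3),
    "Tesla C1060":       (1, 3),
    "GeForce 9800 GTX+": (1, 1),
    "GeForce 8800 GT":   (1, 1),
}

def supports_double(device: str) -> bool:
    # double-precision hardware arrived with compute capability 1.3
    return COMPUTE_CAPABILITY.get(device, (0, 0)) >= (1, 3)
```

By this check, the 9800-class cards discussed above would have to fall back to single precision (or software emulation) for any double-precision work.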
Paul: The danger with that calculation is that GPUGrid is a single-precision app. Einstein is double precision. GeForce cards are 8x slower in double precision. (I've no idea about ATI.)
Have a look at Milkyway@Home, where an ATI-optimized application is available. A work unit that takes 20 minutes on a 2.4 GHz Core2 with an SSSE3-optimized application is crunched within 26 seconds using an ATI Radeon HD4850; see my statistics page for this. If you wonder about the drop in WUs being processed: the server at M@H is simply not able to create enough WUs for the high demand caused by all the ATI users flooding that project.
M@H is using double precision as well, so assuming the same factor of optimization with Einstein@Home, crunching a WU with the above GPU should be finished within 5 to 10 minutes.
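Just to sanity-check that factor with a back-of-the-envelope script (a rough sketch, not a measurement; the 6-20 hour range is the one quoted earlier in this thread, and E@H has no GPU app yet):

```python
# Scaling sketch using the Milkyway@Home numbers quoted above.
cpu_seconds = 20 * 60        # 20 min per WU on a 2.4 GHz Core2 (SSSE3 app)
gpu_seconds = 26             # same WU on a Radeon HD4850
speedup = cpu_seconds / gpu_seconds          # roughly 46x

# Applying the same factor to the 6-20 hour Einstein task range
# mentioned earlier in the thread (purely hypothetical):
low_minutes  = 6 * 3600 / speedup / 60       # about 8 minutes
high_minutes = 20 * 3600 / speedup / 60      # about 26 minutes
```

So a ~46x factor puts the shorter tasks in the 5-10 minute ballpark; the exact figure obviously depends on which CPU time you start from.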
P.S. But for ATI Radeons we have a different situation. Or not?
Well, NOT for the 38x0 and 48x0 cards ... but yes, likely for all others ...
In other words, if you need lots of double precision you need a 38xx- or 48xx-class card, and all others need not apply ...
This is probably one of the more common questions on GPUGrid with regard to Nvidia cards, and at MW for ATI ...
The truth is that to do GPU computing you need a card that, in general, costs more than $100 just to get into the game at the low end. If you want to do serious work, you need to start thinking of a card in the $200+ range ... Domination requires a true commitment of cash ... :)
Sure, but you replace 60 regular computers with one GPU, so even $200+ is not too much.
If I understand right, devices "smaller" than the GTX 260/280 and Tesla S1070/C1060 cannot operate on double-precision numbers and "truncate" operands to single-precision floats.
Nope. For these, double-precision operations will be emulated in software using multiple single-precision operations, which is much slower than on GPUs that support double precision in hardware, but will still give correct double-precision results.
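For the curious, the core trick behind such emulation can be sketched in a few lines: an error-free transformation (Knuth's two-sum) recovers, using only single-precision operations, the part of a sum that a lone float drops. (Python sketch; `f32` is my helper that rounds to single precision. The real CUDA emulation is considerably more involved.)

```python
import struct

def f32(x):
    # round a Python float (double) to the nearest IEEE-754 single-precision value
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Knuth's error-free addition: returns (s, err) with s + err == a + b
    exactly, using only single-precision operations."""
    s   = f32(a + b)
    bb  = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err

# 16777217 = 2^24 + 1 is NOT representable in single precision ...
a, b = f32(16777216.0), f32(1.0)
s, err = two_sum(a, b)
# ... a lone float32 addition loses the 1.0 (s == 16777216.0),
# but the (s, err) pair still carries the exact result:
assert s == 16777216.0 and err == 1.0
assert s + err == 16777217.0
```

Chaining such pairs ("double-single" arithmetic) is what makes software-emulated double precision correct, at the cost of several single-precision operations per emulated one.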
The core functions of the "HierarchicalSearch" are all single precision. There are a few variables that accumulate many small numeric values and would add up to a large error if simply switched to single precision, but in the code we have used since S5R2 the use of such double-precision variables has been deliberately reduced to a minimum; I don't think they will be the limiting factor.
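To see why those accumulator variables matter, here is a small Python sketch (`f32` rounds to single precision; `naive_sum`/`kahan_sum` are my own names for illustration): once an accumulator passes 2^24, plain single precision silently drops +1.0 increments, while compensated (Kahan) summation keeps them.

```python
import struct

def f32(x):
    # round a Python float (double) to the nearest IEEE-754 single-precision value
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum(start, x, n):
    s = f32(start)
    for _ in range(n):
        s = f32(s + x)           # every add rounds to single precision
    return s

def kahan_sum(start, x, n):
    s, c = f32(start), 0.0       # c carries the rounding error so far
    for _ in range(n):
        y = f32(x - c)
        t = f32(s + y)
        c = f32(f32(t - s) - y)
        s = t
    return s

# Past 2^24, a single-precision accumulator can no longer see a +1.0 increment:
assert naive_sum(2.0 ** 24, 1.0, 100) == 2.0 ** 24         # stuck at 16777216
assert kahan_sum(2.0 ** 24, 1.0, 100) == 2.0 ** 24 + 100   # compensation recovers it
```

This is the kind of error such tricks (or a handful of double-precision variables) guard against, which is why only the accumulators, not the bulk of the math, need extra precision.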