Does Einstein support OpenCL?

Elphidieus
Joined: 20 Feb 05
Posts: 233
Credit: 15,849,722
RAC: 12,286

Message 90702 in response to message 90701

Quote:

But, as experience on GPU Grid shows, higher end cards are sometimes the only way to get the work done. That said, the run time of the CURRENT tasks on Einstein takes about 6-20 hours on typical systems ... with a 9800GT card or above I would expect this to drop to about 30 minutes to an hour on the aforementioned 9800GT...

Now that's a head start, since I'm planning a small investment in a 9800GTX+, hoping to kext it on a Mac Pro. Then again, I can run it under Boot Camp in case the Mac GPU client fails to surface....

DanNeely
Joined: 4 Sep 05
Posts: 1,357
Credit: 2,325,526,999
RAC: 3,214,980

Paul: The danger with that calculation is that GPUgrid is a single precision app. Einstein is double precision. GF cards are 8x slower in double precision. (I've no idea about ATI.) This means that if the Einstein team can't rework their app to run in single precision without losing needed levels of output quality, it'll be running much slower.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 470,678,790
RAC: 54,162

Message 90704 in response to message 90703

Quote:
Paul: The danger with that calculation is that GPUgrid is a single precision app. Einstein is double precision.

I would not qualify E@H as "double precision". It is true that E@H uses double precision arithmetic in some places, but it also does a whole lot of computation in single precision.

If you compare the performance of the variant of the E@H app that is optimized for SSE (whose SIMD instructions are single precision only) to the one optimized for SSE2 and higher (capable of double precision SIMD), you'll see that the difference in performance is not that great. You should be able to get quite a lot of boost even from single precision GPU code.

That's for the S5R5 gravitational wave pulsar search. Note that there's now also the Arecibo EM binary pulsar search here at E@H. This app could benefit from GPUs as well, I guess.

CU
Bikeman

John Clark
Joined: 4 May 07
Posts: 1,087
Credit: 3,143,193
RAC: 0

That's a pity, as the ATI HD 38xx and 48xx series seem to be good at double precision, which is why they are so heavily used at MW. That project is heavily reliant on double precision, as I understand.

Certainly the processing speed-up is well worth the graphics card upgrade (assuming WU feed is not an issue).

Using the old AGP HD38xx GPU will bring a six year old P4 PC up to the output of a 25% overclocked unlocked Penryn extreme quad.

Shih-Tzu are clever, cuddly, playful and rule!! Jack Russell are feisty!

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5,385,205
RAC: 0

If the code at EaH is "heavy" with double precision code then, well, the number of CUDA cards that will be usable goes down like a rock...

But we are all speculating in a vacuum, with no actual information or application to pick apart. Which is why I am patiently waiting until there is an actual application before I get that interested in making a change to the GPUs I have on hand ... :)

hoarfrost
Joined: 9 Feb 05
Posts: 207
Credit: 53,668,476
RAC: 5,158

In "NVIDIA CUDA Programming Guide 2.0", in "Appendix A. Technical Specifications", section A.1 "General Specifications", I found that only the following devices have "Compute capability" equal to 1.3:
GeForce GTX 280
GeForce GTX 260
Tesla S1070
Tesla C1060

And in section A.1.4 "Specifications for Compute Capability 1.3" I read:

Quote:
Support for double-precision floating-point numbers.

If I understand right, devices "smaller" than GTX260/280 and Tesla S1070/C1060 cannot operate on double precision numbers and "truncate" operands to float precision.

P.S. But for ATI Radeons we have a different situation. Or not?

kimmerin
Joined: 29 Sep 08
Posts: 16
Credit: 11,090,767
RAC: 0

Message 90708 in response to message 90703

Quote:
Paul: The danger with that calculation is that GPUgrid is a single precision app. Einstein is double precision. GF cards are 8x slower in double precision. (I've no idea about ATI.)

Have a look at Milkyway@Home, where an ATI-optimized application is available. A work unit that takes 20 minutes on a 2.4 GHz Core2 with an SSSE3-optimized application is crunched within 26 seconds using an ATI Radeon HD4850. See my statistics page for this. If you wonder about the drop in WUs being processed: the server at M@H is simply not able to create enough WUs for the high demand due to all the ATI users flooding that project.

M@H is using double precision as well, so assuming the same factor of optimization with Einstein@Home, crunching a WU with the above GPU should finish within 5 to 10 minutes.
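To make the arithmetic behind this estimate explicit, here is a rough sketch in Python. The only inputs are the Milkyway@Home figures quoted above (20 minutes on the CPU, 26 seconds on the HD4850). That the same factor carries over to Einstein@Home is purely an assumption, and `speedup` and `projected_minutes` are illustrative names, not anything from either project.

```python
# Figures quoted above for Milkyway@Home (CPU vs. ATI Radeon HD4850).
cpu_seconds = 20 * 60
gpu_seconds = 26

# Ratio of CPU run time to GPU run time, roughly 46x.
speedup = cpu_seconds / gpu_seconds


def projected_minutes(cpu_hours, factor=speedup):
    """GPU run time in minutes for a task that takes cpu_hours on the CPU.

    Assumption: the M@H speed-up factor transfers unchanged to another
    project, which nothing in this thread confirms.
    """
    return cpu_hours * 60 / factor


if __name__ == "__main__":
    print(f"speed-up: about {speedup:.0f}x")
    for hours in (4, 6, 20):
        minutes = projected_minutes(hours)
        print(f"{hours} h on CPU -> about {minutes:.0f} min on GPU")
```

Under that assumption, 5 to 10 minutes on the GPU corresponds to roughly 4 to 8 hours on the CPU, while the 6-20 hour range mentioned earlier in the thread would map to roughly 8 to 26 minutes.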

Regards, Lothar

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5,385,205
RAC: 0

Message 90709 in response to message 90707

Quote:
P.S. But for ATI Radeons we have a different situation. Or not?


Well, NOT for 38x0 and 48x0 cards .. but yes likely for all others ...

In other words, if you need lots of double precision you need a 38 or 48 class card and all others need not apply ...

This is probably one of the more common questions on GPU Grid with regard to Nvidia cards, and on MW for ATI ...

The truth is that to do GPU computing you need a card, in general, that costs more than $100 to even get into the game to start at the low end. If you want to do serious work, well, right off you need to start thinking of a card in the $200+ range ... Domination requires true commitments of cash ... :)

kimmerin
Joined: 29 Sep 08
Posts: 16
Credit: 11,090,767
RAC: 0

Message 90710 in response to message 90709

Quote:
The truth is that to do GPU computing you need a card, in general, that costs more than $100 to even get into the game to start at the low end. If you want to do serious work, well, right off you need to start thinking of a card in the $200+ range ... Domination requires true commitments of cash ... :)

Sure, but you replace 60 regular computers with one GPU, so even $200+ is not too much.

Regards, Lothar

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,031
Credit: 217,942,239
RAC: 50,948

Message 90711 in response to message 90707

Quote:
If I understand right, devices "smaller" than GTX260/280 and Tesla S1070/C1060 cannot operate on double precision numbers and "truncate" operands to float precision.


Nope. For these, the double precision operations will be emulated in software using multiple single precision operations, which is way slower than on GPUs supporting double precision in hardware, but will still give correct double precision results.
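As a rough illustration of how extra precision can be built from single precision operations (this is not the E@H or CUDA code, just a sketch of one well-known approach), a value can be kept as an unevaluated sum of two floats, with Knuth's TwoSum recovering the rounding error of each single precision addition. The sketch below uses Python and rounds through `struct` to mimic 32-bit floats. `f32`, `two_sum`, `ds_from` and `ds_add` are names made up for this example. Note that a pair of floats carries roughly 48 significand bits, somewhat fewer than the 53 of a true double, so this shows the principle only.

```python
import struct


def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Knuth's TwoSum in single precision.

    Returns (s, e) where s is the rounded sum and e is the rounding error,
    so that s + e equals a + b exactly.
    """
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e


def ds_from(value):
    """Split a value into a (hi, lo) pair of single precision numbers."""
    hi = f32(value)
    return hi, f32(value - hi)


def ds_add(x, y):
    """Add two (hi, lo) pairs using only single precision operations."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    return two_sum(s, e)  # renormalise so lo stays small relative to hi


if __name__ == "__main__":
    plain = f32(1.0 + 1e-9)                       # the small term is lost
    hi, lo = ds_add(ds_from(1.0), ds_from(1e-9))  # the small term survives in lo
    print(plain == 1.0, hi, lo)
```

Every emulated addition here costs around ten single precision operations, which gives a feel for why emulation is so much slower than native hardware support.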

The core functions of the "HierarchicalSearch" are all single precision. There are a few variables that collect many small numeric values, which would add up to a large error if simply switched to single precision. But in the code we have used since S5R2, the use of such double precision variables has been purposefully reduced to a minimum; I don't think they will be the limiting factor.
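A small sketch of the accumulation effect described here, in Python (assumed purely for illustration; the values and the names `f32`, `sum_single` and `sum_mixed` are hypothetical and have nothing to do with the actual HierarchicalSearch code). The individual terms stay in single precision in both cases; only the accumulator differs.

```python
import struct


def f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(values):
    """Single precision terms and a single precision accumulator."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))  # rounded to 32 bits after every addition
    return acc


def sum_mixed(values):
    """Single precision terms, but a double precision accumulator."""
    acc = 0.0  # Python floats are 64-bit doubles
    for v in values:
        acc += f32(v)
    return acc


if __name__ == "__main__":
    values = [0.1] * 10000  # many small contributions; the ideal sum is 1000
    print("single accumulator:", sum_single(values))
    print("double accumulator:", sum_mixed(values))
```

With these inputs the single precision accumulator ends up off by roughly 0.1, while the double precision accumulator stays within about 0.00002 of 1000. This suggests why keeping just the few collecting variables in double precision can be enough.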

BM
