Gamma-ray pulsar binary search #1 on GPUs

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109411014500
RAC: 34957413

TimeLord04 wrote:
Since the implementation of 1.15 onward, I've noticed a HUGE drop in RAC.  I've gone from a high of 103K RAC from the 1.12 and 1.14 Units to 83K RAC with 1.15 onward...  Anyone else notice this significant drop???

Comments about this have been made in several places already, including earlier in this thread.

In a nutshell, 693 has been the standard award for many previous iterations of CPU tasks in FGRPx searches.  The first GPU tasks have the same 'science payload' and were awarded the same 693.  Modern GPUs were crunching these very quickly, except for Windows hosts because of a problem building the app.  That was sorted.

Then along came tasks with 5x the 'science payload' - needed because of how quickly the standard tasks could be crunched on GPUs. There was also some sort of efficiency gain for modern GPUs because people were reporting quite a bit less than 5x the crunch time - perhaps between 3x and 4x, although this is just a rough guess based on a limited amount of largely anecdotal evidence. There have been several different awards for these but the latest is 1365 - not even 2x the standard CPU task award. Most will be suffering a big RAC drop.
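
As a rough sketch of that arithmetic (Python, using only the figures quoted above; the 3x to 4x runtime ratio is the anecdotal guess already mentioned, and the function name is made up for illustration):

```python
# Figures quoted in this post. The runtime ratios are only the rough
# 3x-4x guess, not measured values.
OLD_CREDIT = 693    # award for a standard task
NEW_CREDIT = 1365   # latest award for a task with 5x the science payload


def relative_credit_rate(runtime_ratio: float) -> float:
    """Credit per unit of crunch time for the new tasks, relative to the old ones.

    runtime_ratio is how many times longer a new task takes than an old one.
    """
    if runtime_ratio <= 0:
        raise ValueError("runtime_ratio must be positive")
    return (NEW_CREDIT / OLD_CREDIT) / runtime_ratio


if __name__ == "__main__":
    for ratio in (3.0, 4.0):
        rate = relative_credit_rate(ratio)
        print(f"runtime x{ratio:.0f}: credit rate is {rate:.2f} of the old rate")
```

On those assumptions the new tasks pay somewhere between about half and two-thirds of the old credit per unit of crunch time. RAC is a running average, so a change like this shows up gradually rather than all at once.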

There seem to be problems with older NVIDIA GPUs like the GTX 5xx and 6xx series. Some are still blaming the Windows app but I don't believe that's correct. I've just started using the Linux app on both GTX 550Ti and GTX 650 GPUs and the performance seems rather poor. It's probably something to do with NVIDIA's implementation of OpenCL. I've yet to start playing with different driver versions to look for any improvement. Modern NVIDIA GPUs seem to be OK so maybe it's a hardware deficiency or a combination of both hardware and driver.

Cheers,
Gary.

TimeLord04
Joined: 8 Sep 06
Posts: 1442
Credit: 72378840
RAC: 0

Gary Roberts wrote:
TimeLord04 wrote:
Since the implementation of 1.15 onward, I've noticed a HUGE drop in RAC.  I've gone from a high of 103K RAC from the 1.12 and 1.14 Units to 83K RAC with 1.15 onward...  Anyone else notice this significant drop???

Comments about this have been made in several places already, including earlier in this thread.

In a nutshell, 693 has been the standard award for many previous iterations of CPU tasks in FGRPx searches.  The first GPU tasks have the same 'science payload' and were awarded the same 693.  Modern GPUs were crunching these very quickly, except for Windows hosts because of a problem building the app.  That was sorted.

Then along came tasks with 5x the 'science payload' - needed because of how quickly the standard tasks could be crunched on GPUs. There was also some sort of efficiency gain for modern GPUs because people were reporting quite a bit less than 5x the crunch time - perhaps between 3x and 4x, although this is just a rough guess based on a limited amount of largely anecdotal evidence. There have been several different awards for these but the latest is 1365 - not even 2x the standard CPU task award. Most will be suffering a big RAC drop.

There seem to be problems with older NVIDIA GPUs like the GTX 5xx and 6xx series. Some are still blaming the Windows app but I don't believe that's correct. I've just started using the Linux app on both GTX 550Ti and GTX 650 GPUs and the performance seems rather poor. It's probably something to do with NVIDIA's implementation of OpenCL. I've yet to start playing with different driver versions to look for any improvement. Modern NVIDIA GPUs seem to be OK so maybe it's a hardware deficiency or a combination of both hardware and driver.

Well, my Win system is an AMD A6-6400K APU, Dual Core with 8GB DDR3 RAM. I do NOT crunch on the CPU; ONLY on the GTX-760 card. The system is just over 3 yrs old.

The MAC/Hackintosh is an Intel Quad Core Extreme QX9650 with 16GB DDR2 RAM. Again, I do NOT crunch on the CPU; ONLY on the Dual GTX-750TI SC cards. The system came used from a couple of friends. The motherboard is circa 2008/2009, the RAM is new, and the CPU is used.

Both systems are air cooled using the CoolerMaster Hyper212 EVO. The three GPUs crunch 2 Units at a time each, for a total of 6 Units at a time. The MAC seems to be doing much better with a "lower end" GPU setup than the Win system... Yet, the Win system is the one taking the heaviest beating.

Both systems crunch Einstein@Home 15 hrs a day, currently from 6 AM to 9 PM Pacific.

As to Drivers: the Win system GTX-760 card is on the 353.30 WHQL Driver. The MAC is MORE INTERESTING... When using GPUs NOT normally recognized by MAC OS X, one MUST use the Alternate NVIDIA Driver to get the card working properly. My Alternate NVIDIA Web Driver for MAC is some LONG WINDED thing starting with 346.xx.xxxx.xx; AND in addition to this Special Driver, one MUST install the SEPARATE NVIDIA CUDA Driver. I'm currently on an 8.xx.xx Driver for CUDA.

The MAC CUDA Driver is used by my MAC for SETI@Home, running TBar's CUDA75 App to crunch SETI MB Units. The 346.xx.xxxx.xx Driver is used for OpenCL everywhere else.

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Gary Roberts wrote:
...  It's probably something to do with NVIDIA's implementation of OpenCL.  I've yet to start playing with different driver versions to look for any improvement.  Modern NVIDIA GPUs seem to be OK so maybe it's a hardware deficiency or a combination of both hardware and driver.

I would expect nVidia will push better performance on CUDA than OpenCL.

It is difficult to find suitable benchmarks comparing OpenCL and CUDA (and then Windows and Linux). It seems to depend very much on "what you are doing".

My understanding is the majority of the FGRPB1G task workload is FFT in single precision, followed by some DP work which runs if possible on the GPU.

There is a (yes, Linux) FFT test for the GTX-1080 which shows a significant difference even for top-end GPUs: https://www.phoronix.com/scan.php?page=news_item&px=GTX-1080-OpenCL-vs-CUDA

Betreger
Joined: 25 Feb 05
Posts: 987
Credit: 1421930721
RAC: 768995

Gary Roberts wrote:
TimeLord04 wrote:
Since the implementation of 1.15 onward, I've noticed a HUGE drop in RAC.  I've gone from a high of 103K RAC from the 1.12 and 1.14 Units to 83K RAC with 1.15 onward...  Anyone else notice this significant drop???

Comments about this have been made in several places already, including earlier in this thread.

In a nutshell, 693 has been the standard award for many previous iterations of CPU tasks in FGRPx searches.  The first GPU tasks have the same 'science payload' and were awarded the same 693.  Modern GPUs were crunching these very quickly, except for Windows hosts because of a problem building the app.  That was sorted.

Then along came tasks with 5x the 'science payload' - needed because of how quickly the standard tasks could be crunched on GPUs. There was also some sort of efficiency gain for modern GPUs because people were reporting quite a bit less than 5x the crunch time - perhaps between 3x and 4x, although this is just a rough guess based on a limited amount of largely anecdotal evidence. There have been several different awards for these but the latest is 1365 - not even 2x the standard CPU task award. Most will be suffering a big RAC drop.

There seem to be problems with older NVIDIA GPUs like the GTX 5xx and 6xx series. Some are still blaming the Windows app but I don't believe that's correct. I've just started using the Linux app on both GTX 550Ti and GTX 650 GPUs and the performance seems rather poor. It's probably something to do with NVIDIA's implementation of OpenCL. I've yet to start playing with different driver versions to look for any improvement. Modern NVIDIA GPUs seem to be OK so maybe it's a hardware deficiency or a combination of both hardware and driver.

Gary, I will install the latest driver tomorrow and see what it does. I have been to happy hour, and I learned a long time ago that alcohol and computer changes don't work well together.

Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17550118744
RAC: 6433773

The Phoronix tests illustrate pretty well the equivalence between OpenCL and CUDA EXCEPT for the FFT Single-Precision benchmarks. The tester puts the difference down to the CUDA code-paths being more mature than the OpenCL code-paths, and hopes that development of the OpenCL 2.0~2.1 code-paths in future drivers will bring OpenCL closer to CUDA.

Kailee71
Joined: 22 Nov 16
Posts: 35
Credit: 42623563
RAC: 0

Hi all,

two quick questions;

1) is anybody else getting numerous "error while computing" results? See here. All my machines are getting this for the last couple of days now. This is GPU tasks only, both GTX580 and R9 280x.

2) I am now also beginning to use a hackintosh with a Quadro K4000 (also GPU only), but this is running *much* slower than anticipated; I'm getting runtimes around 15000 s, where my R9 280x is getting less than 1/10th of that...

Any ideas on either of these? TIA,

Kailee.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

nVidia have not announced any plans for OpenCL 2.0 so it might be some time.

Keith Myers
Joined: 11 Feb 11
Posts: 4704
Credit: 17550118744
RAC: 6433773

Which is to be expected, I guess, given their focus on CUDA development.

Harri Liljeroos
Joined: 10 Dec 05
Posts: 3612
Credit: 2902632217
RAC: 1043864

Kai Leibrandt wrote:

Hi all,

two quick questions;

1) is anybody else getting numerous "error while computing" results? See here. All my machines are getting this for the last couple of days now. This is GPU tasks only, both GTX580 and R9 280x.

2) I am now also beginning to use a hackintosh with a Quadro K4000 (also GPU only), but this is running *much* slower than anticipated; I'm getting runtimes around 15000 s, where my R9 280x is getting less than 1/10th of that...

Any ideas on either of these? TIA,

Kailee.

Here is a link to one of my hosts with a K4000 running Windows 7: https://einsteinathome.org/host/11928053/tasks

It is taking about 13900 s for these longer tasks so the performance is very poor on this card in general.
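
To put that runtime in credit terms, here is a minimal sketch. It assumes the 1365 award per long task mentioned earlier in the thread, one task running at a time, and crunching around the clock; the function name and its parameters are made up for illustration, and real numbers will differ with multiple concurrent tasks or part-time crunching.

```python
SECONDS_PER_HOUR = 3600


def credit_per_day(task_seconds: float,
                   credit_per_task: float = 1365,  # award quoted earlier in the thread
                   concurrent_tasks: int = 1,      # assumption: one task at a time
                   hours_per_day: float = 24.0) -> float:
    """Estimate daily credit from the elapsed time of a single task.

    task_seconds is the elapsed time per task while running
    `concurrent_tasks` tasks side by side on the GPU.
    """
    if task_seconds <= 0:
        raise ValueError("task_seconds must be positive")
    tasks_done = concurrent_tasks * hours_per_day * SECONDS_PER_HOUR / task_seconds
    return tasks_done * credit_per_task


if __name__ == "__main__":
    # 13900 s is the K4000 runtime above; 1500 s stands in for a card
    # that is roughly ten times faster (a hypothetical round figure).
    for seconds in (13900, 1500):
        print(f"{seconds} s per task: about {credit_per_day(seconds):.0f} credits/day")
```

Passing `hours_per_day=15` or `concurrent_tasks=2` adapts the estimate to setups like the ones described earlier in the thread.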

Kailee71
Joined: 22 Nov 16
Posts: 35
Credit: 42623563
RAC: 0

Harri Liljeroos wrote:
Kai Leibrandt wrote:

Hi all,

two quick questions;

1) is anybody else getting numerous "error while computing" results? See here. All my machines are getting this for the last couple of days now. This is GPU tasks only, both GTX580 and R9 280x.

2) I am now also beginning to use a hackintosh with a Quadro K4000 (also GPU only), but this is running *much* slower than anticipated; I'm getting runtimes around 15000 s, where my R9 280x is getting less than 1/10th of that...

Any ideas on either of these? TIA,

Kailee.

Here is a link to one of my hosts with a K4000 running Windows 7: https://einsteinathome.org/host/11928053/tasks

It is taking about 13900 s for these longer tasks so the performance is very poor on this card in general.

Anything we can do about this? If not this machine will be taken off einstein again :-( But thanks for letting me know I'm not the only one.

Is anyone else having the error issues? I have not had any valid results for 2-3 days now: https://einsteinathome.org/host/12464084/tasks

Thanks,

Kailee
