Gamma-ray pulsar binary search #1 on GPUs

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1593389004
RAC: 777042

I updated the driver on my GTX660 from 335.23 to 376.33. The preliminary results do not look promising. Running 2 at a time I am projecting a run time of 5 hrs ea vs 4 hrs on the older driver. That would be a real buzz kill.

TimeLord04
Joined: 8 Sep 06
Posts: 1442
Credit: 72378840
RAC: 0

[RAC Update:]

MAC has dropped 10K in about a week or so... Now down to 53.9K RAC.

Win XP Pro x64 down 20+K since the 1.15 Units were introduced. Now down to 22.59K RAC.

Just for the record(s)...

[Invalid(s) Update:]

MAC still has two current 1.14 Invalids showing. No change there; no 1.17 Units have gone Invalid, yet.

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

mountkidd
Joined: 14 Jun 12
Posts: 176
Credit: 12614082555
RAC: 8013748

I too have been watching productivity drop while some degree of stability returns to the FGR GPU world.  A summary of recent results from my hosts follows:

[Image: mountkidd-eah-stats-28dec16.jpg]

I have a mix of Linux/Win7, AMD/NV, mostly GPU with a single stream of CPU on 3 hosts and E@H is the only project.  All 3x70k hosts are clocked at the same rate.  My 2 AMD hosts have been offline for the past 6 months.  I missed processing of BRP4G and don't have numbers for direct comparison so BRP6 data was used.

I did question the debate of Linux vs Windows and swapped cards between two of my hosts to get a reading.  Two years ago Linux/AMD had a 15% advantage over Windows/AMD, which was the motivation behind the conversion of my AMD hosts. Now Linux does show a hair stronger than Windows, but not enough to get excited about.  Not an issue here for FGR GPU.

There seems to be a performance difference between the 570 & 670 generations of NV that shows in the FGR GPU app.  That might also mean that current NV generations respond differently as well.

My conclusion from all this is that the NV app is the issue.  CPU usage is abnormally high under both Linux & Windows and this is a GPU app.  NV/OpenCL just doesn't perform well.  So this raises a number of questions.  Why was this done as an OpenCL app rather than CUDA which is more NV friendly?  Is there outside NV/OpenCL expertise that the developers can tap into?  Do the developers have a plan B/C/D?

Keith Myers
Joined: 11 Feb 11
Posts: 4968
Credit: 18764200168
RAC: 7161504

Quote:

My conclusion from all this is that the NV app is the issue.  CPU usage is abnormally high under both Linux & Windows and this is a GPU app.  NV/OpenCL just doesn't perform well.  So this raises a number of questions.  Why was this done as an OpenCL app rather than CUDA which is more NV friendly?  Is there outside NV/OpenCL expertise that the developers can tap into?  Do the developers have a plan B/C/D?

I believe it is a matter of developer resource availability.  It is a lot easier to just write an app in OpenCL once, which is cross-platform and OS agnostic, and be done with it.  The performance differences can be attributed solely to the graphics card manufacturer's API for OpenCL.  The performance of OpenCL is much better on AMD platforms than on Nvidia because AMD is tied much more closely to the Khronos Group (developer of OpenCL) than Nvidia, which is still pushing its proprietary CUDA platform.

Frankly, I just don't see the developers spending any more time on CUDA apps.  It isn't efficient.  Unless Nvidia makes more of a concerted effort in their OpenCL development work to match the parallel computing efficiency of the CUDA platform, we probably will always be at a compute disadvantage on Nvidia hardware. My $0.02 of prognostication.

cc
Joined: 12 Dec 16
Posts: 4
Credit: 386939352
RAC: 0

Kai Leibrandt wrote:

Is anyone else having the error issues? I have not had any valid results for 2-3 days now: https://einsteinathome.org/host/12464084/tasks

 Thanks,

 Kailee

Those tasks have this error:
"Got no suitable OpenCL device information from BOINC"

Might need to update your graphics driver.  I had a similar issue, but with an Nvidia card.

Kailee71
Joined: 22 Nov 16
Posts: 35
Credit: 42623563
RAC: 0

ac_3 wrote:
Kai Leibrandt wrote:

Is anyone else having the error issues? I have not had any valid results for 2-3 days now: https://einsteinathome.org/host/12464084/tasks

 Thanks,

 Kailee

Those tasks have this error:
"Got no suitable OpenCL device information from BOINC"

Might need to update your graphics driver.  I had a similar issue, but with an Nvidia card.

Hm, I was hoping not to change the driver, as it's OS X and driver changes are not as trivial as on Windows; in fact, for the 280X machine there are no driver updates, only for Nvidia. But when I'm at the machine again in a week or so I'll try that. As I can't do anything remotely it'll just have to wait until then :-(

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1593389004
RAC: 777042

Betreger wrote:
I updated the driver on my GTX660 from 335.23 to 376.33. The preliminary results do not look promising. Running 2 at a time I am projecting a run time of 5 hrs ea vs 4 hrs on the older driver. That would be a real buzz kill.

The new driver proved to be about 20 min slower, so I reinstalled the older driver. My throughput with this card is sad.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1593389004
RAC: 777042

Well the old driver got the GTX 660 back down to 4 hrs running 2 at a time so I didn't break anything. 
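
For anyone comparing driver versions, the run times quoted above translate into daily throughput roughly as follows. This is only a back-of-the-envelope sketch in Python: the function name `tasks_per_day` is made up for illustration, and it assumes both concurrent tasks take the same wall-clock time and the card crunches 24 hours a day.

```python
def tasks_per_day(concurrent_tasks: int, hours_per_task: float) -> float:
    """Tasks finished per 24 hours when `concurrent_tasks` run side by side."""
    return concurrent_tasks * 24.0 / hours_per_task


# Run times taken from the posts above (GTX 660, 2 tasks at a time).
old_driver = tasks_per_day(2, 4.0)             # 4 hrs each on 335.23
new_driver = tasks_per_day(2, 4.0 + 20 / 60)   # about 20 min slower on 376.33
projected = tasks_per_day(2, 5.0)              # the early 5 hr projection

# Relative throughput loss of the new driver compared with the old one.
loss_pct = 100.0 * (1.0 - new_driver / old_driver)

print(f"old driver: {old_driver:.2f} tasks/day")
print(f"new driver: {new_driver:.2f} tasks/day ({loss_pct:.1f}% fewer)")
print(f"projected : {projected:.2f} tasks/day")
```

On those figures the measured 20 min slowdown costs a bit under 8% of daily output, noticeably less than the 20% the first 5 hr projection would have implied.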

TimeLord04
Joined: 8 Sep 06
Posts: 1442
Credit: 72378840
RAC: 0

12-31-2016 at 4:50 PM - PST

WELL, in one day, I've DROPPED another 3K Total RAC LOSS from both systems combined!!!  Now at 73.6K RAC!!!!!

This is getting TIRESOME crunching 15 Hrs A DAY, and seemingly getting NOWHERE!!!!!  Used to have a High of 131K in the NOT so distant past on BRP6 and BRP4G combined.  When BRP6 died, I fell to roughly 103K RAC, (as mentioned prior), and now CAN'T even hold in the 90K's!!!

At this rate, I MAY NO LONGER be able to support 15 Hours of crunching a day!  Electric Bills are hitting $500 and $600 a month here, AND I'M NOT PAYING THE BILL; DAD IS!!!  I have to justify my activities with the Bill Payer!

I stopped crunching SETI due to the CreditNew issues there, and moved here for the Higher RAC to substantiate the work that is done on my three cards.  I LOVE contributing; but, MUST be able to justify the work.

Something NEEDS MUST CHANGE!  Or, am I alone in my feelings?????

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1593389004
RAC: 777042

TL, RAC does not reflect the value of the work done; it is just a crude measure of throughput. As an aside, I had a RAC over 90K with BRP6; now it is plummeting to the gutter, but the work is still important.
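
To put rough numbers on how RAC trails actual output: RAC is an exponentially weighted average of granted credit, so after a change in daily credit it drifts toward the new level rather than jumping there. The Python sketch below assumes the one-week half-life that BOINC normally uses. The function name `rac_after` and the 100K/50K figures are hypothetical, and it ignores the fact that credit really arrives in lumps as tasks validate.

```python
HALF_LIFE_DAYS = 7.0  # assumption: BOINC's usual one-week credit half-life


def rac_after(days: float, start_rac: float, daily_credit: float) -> float:
    """Approximate RAC `days` after daily credit settles at `daily_credit`.

    The gap between the old RAC and the new steady level halves
    every HALF_LIFE_DAYS.
    """
    decay = 0.5 ** (days / HALF_LIFE_DAYS)
    return daily_credit + (start_rac - daily_credit) * decay


# Hypothetical host: RAC of 100K, then the new app only earns 50K per day.
for day in (0, 7, 14, 28):
    print(f"day {day:2d}: RAC ~ {rac_after(day, 100_000, 50_000):,.0f}")
```

With these made-up inputs the RAC is still at 75K after one week and 62.5K after two, so a falling RAC can keep sliding for weeks after the underlying throughput has already stabilised.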
