The O2-All Sky Gravitational Wave Search on GPUs - discussion thread.

cecht
Joined: 7 Mar 18
Posts: 1533
Credit: 2900168892
RAC: 2174034

n12365 wrote:
Bernd Machenschalk wrote:
First of all, thank you all for participating in this Beta test! It is pretty important for us to verify that this GPU application version produces scientifically valid results. In that sense every bit of information helps, particularly the 'invalid' results. The invalid results tell us that there is work left to do, and (hopefully) give an indication where. The 'valid' results are way less interesting, although they provide a reference for the rate of the 'invalid's.

Is running the GPU application at 1x helpful? I can almost double my output by processing two tasks at a time, but I assume it is easier to debug if only one task at a time is being processed.

See 22 Aug comment by Gary about this. His thought was to run at 1x only if your system is spitting out lots of invalids. I'm now running all my GPUs at 3x because my invalid rate for the v1.07 tasks is low.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18713037624
RAC: 6373596

I just grabbed the app and some tasks. I haven't gotten to them yet, as the FIFO queue put the Gamma Ray Pulsar Search tasks ahead of them. I only run my GPUs at 1X, so I think that should give the devs some decent information on both the valids and the expected invalids. Curious where this new work falls out.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1588392402
RAC: 762527

I switched to 1X to see if that will reduce my invalid rate; 2X and 3X produced as many invalids as valids.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18713037624
RAC: 6373596

Better for the project as 1X is what Bernd asked for when testing the beta app.

 

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220564931
RAC: 970958

I have two RX 570 Windows 10 systems which have been producing very high validation rates (around 98%) running at 2X (though zero validation rates on a substantial number of 3X and 4X jobs).

Very recently I spotted a crop of "inconclusive" listings, and by now I have several invalid results. My formerly near-perfect system (at one point it had a current listing of about 140 valid and 1 invalid for GW work) now has seven invalid results on work returned August 29-31 and 20 (!!) inconclusives, mostly returned yesterday and today. My other system, which got healed once I dropped it from 4X to 2X, now has three invalid results on work returned on August 30, plus two inconclusives in that date range that I confidently expect to resolve as invalid.

Maybe this is luck of the draw of the quorum partner.  Perhaps the validator got a bit stricter recently.  Perhaps both of my systems simultaneously got somewhat less healthy (maybe because of adverse effects of a Windows update).

I have dropped the most troubled system to 1X, and will watch the other.
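For anyone comparing 1X, 2X and higher settings, the validation rate mentioned above is just valid results divided by resolved results. The snippet below is a minimal sketch of that arithmetic using the figures quoted in the post; the function name is made up for illustration, and it assumes inconclusive results are left out until they resolve one way or the other.

```python
def validation_rate(valid: int, invalid: int) -> float:
    """Fraction of resolved results that validated (inconclusives excluded)."""
    resolved = valid + invalid
    if resolved == 0:
        raise ValueError("no resolved results yet")
    return valid / resolved


# Figures quoted in the post above: about 140 valid and 1 invalid.
rate = validation_rate(140, 1)
print(f"{rate:.1%}")  # prints 99.3%
```

If the pending inconclusives do resolve as invalid, they would be added to the invalid count and the rate recomputed.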

 

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18713037624
RAC: 6373596

For anyone running Windows and Nvidia cards, be especially leery of the 436 series of drivers.  These are the ones that have the new experimental integer scaling capability.

SETI users are getting nothing but invalids and errors on them for Arecibo tasks of the angle range 2.7 species. Reverting to the 431 series drivers lets the cards run these same tasks with no issues.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Keith Myers wrote:
For anyone running Windows and Nvidia cards, be especially leery of the 436 series of drivers.

Have those problems arisen only with newer generation cards, or also with older ones? I've been running 436.15 and 436.20 with a GTX 960 and they've been working alright for this GW app. P.S. For fun... there's also a 440.23 Insider driver unofficially available for those who run a regular Windows 10 version. This driver has also been working smoothly here with the GTX 960.
https://forums.guru3d.com/threads/insider-nvidia-driver-440-23-wddm-2-7.428337/

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18713037624
RAC: 6373596

Believe the problem is occurring with all cards.  At least I know of hosts with both Pascal and Turing cards.  1080 and 2070 for example.

 

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220564931
RAC: 970958

archae86 wrote:
I have dropped the most troubled system to 1X, and will watch the other.

My RX 570 system which was running about a 99% validation rate on GW GPU tasks until a couple of days ago and suddenly dropped to zero has now returned four tasks run at 1X.  The three of them which have gotten a quorum all show as inconclusive, meaning they will all fail.

I've done a cold iron (power off, wait a little while) reboot, and am continuing at 1X. If it continues at "all fail", I think I'll put it back to GRP. I doubt repeating the 100% failure mode endlessly produces much new information for the project after the first few hours.

I do wonder what changed.  Time will tell whether the reboot changed it back.

Meanwhile, my other RX 570 Windows 10 system has plenty of validations and no inconclusives on quorums formed for tasks returned in the last 24 hours. Maybe it is not fully healthy, but it is nothing like as awful as the other one.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18713037624
RAC: 6373596

I finally just finished my first CGW task after getting through all my GRP work. I see that the task requires more than a single CPU thread to support the GPU task. Mine topped out at 110% of a CPU thread. I will need to adjust the ncpus value in app_config.

https://einsteinathome.org/task/879061290
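For readers who have not set this up before, here is a rough sketch of how the 1X/2X/3X settings and the CPU reservation discussed in this thread map onto a BOINC app_config.xml file. In that file, the `<gpu_versions>` block takes a `<gpu_usage>` value (the fraction of a GPU each task claims, so 0.5 means two tasks per GPU) and a `<cpu_usage>` value. The helper name and the app name `your_gw_app_name` are placeholders, not the project's real identifiers; the real app name should be taken from your own client_state.xml. The 1.1 CPU figure is only an illustration based on the 110% observation above.

```python
from xml.sax.saxutils import escape


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config.xml text for running N tasks concurrently per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Each task claims 1/N of a GPU, so BOINC schedules N of them at once.
    gpu_usage = 1.0 / tasks_per_gpu
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{escape(app_name)}</name>\n"
        "    <gpu_versions>\n"
        f"      <gpu_usage>{gpu_usage:.2f}</gpu_usage>\n"
        f"      <cpu_usage>{cpu_per_task:.2f}</cpu_usage>\n"
        "    </gpu_versions>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# Hypothetical app name; 1X with a bit more than one CPU thread reserved.
print(build_app_config("your_gw_app_name", tasks_per_gpu=1, cpu_per_task=1.1))
```

The generated text would go into app_config.xml in the project's folder under the BOINC data directory, followed by re-reading the config files from the BOINC Manager or restarting the client.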

 
