The O2 All-Sky Gravitational Wave Search on GPUs - discussion thread.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


Gary Roberts wrote:
I think you made an earlier post about inconclusives (and perhaps invalids) 'disappearing' or something along those lines - I haven't gone back to find exactly how you described it.  From various snippets of information that come out from time to time, there can be 'manual intervention' of various sorts going on behind the scenes.  With the relaxing of validation 'rules', it seems likely that Work Units might be sent back through the validator and have their 'state' changed as a result.  An inconclusive or an invalid might suddenly become valid :-)

Good points. And that might very well be the case. What I do know is that they didn't show up in either the error column or the invalid column.

I'm sure "Scotty" (Bernd, a great and quite appropriate choice of avatar) is doing quite a bit of work in the engine room that we aren't yet privy to. :) Sorry, couldn't help myself; I'm a huge Star Trek fan and Mr. Scott was my favorite character.

Clear skies,
Matt
Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


archae86 wrote:
Matt White wrote:
I should have checked first. The 5 tasks marked inconclusive have either validated, or been removed from the queue. No new error or invalid tasks in the count.

 

Using this method, for your host 12785296 I spot 3 inconclusives

For your host 12785591 I spot 11 inconclusives (I did not count one that probably ran CPU-only based on total CPU time).

One might expect them to be visible if one imposes the "pending" filter, but they are not.

Filtering by application, I found 7 inconclusive results this morning, with 2 additional invalids. I have a total of 7 invalid GW GPU 1.07 tasks, all of them of the AMD/LINUX variety. That number is up from 5, as noted last evening. All of the invalid tasks were run on Aug 13th, the day I had an incompatible task running concurrently, so I'm hopeful that everything since will be okay. Time will tell.

Clear skies,
Matt
Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0


Matt White wrote:
What I do know is that they didn't show up in either the error column or the invalid column.

I can see the same with my host very easily at the moment. I'm running 20 tasks at 4x:

https://einsteinathome.org/host/12331989/tasks/0/0

I'm sure they are all going to end up "validation inconclusive". Six have so far... so 14 are still pending. It looks like in the end there will be 20 tasks in "All", but there won't be a single task shown in any of the definitive categories.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245408944
RAC: 11650

"inconclusive" results are

"inconclusive" results are always looked at again by the validator automatically, when the additional task that was generated for reference is returned  & reported. If the validator was changed since the first validation attempt, it might be that by the next time all results that were previously found 'inconclusive' now 'agree' and are marked 'valid'.

When we release a new application (not a new application version) I usually leave the 'assimilator' turned off. This shows up as a 'disabled' assimilator on the server status page, and usually as 'workunits waiting for assimilation'. In this state I could rather easily 're-validate' workunits manually, such that a task's status could change to 'valid' even from 'invalid'. However, once a workunit has been 'assimilated', this would require too much manual intervention, which could also have unwanted and unnoticed side effects, so I rarely do a manual re-validation once an application has been established and the assimilator is running.
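To illustrate the idea, here is a much-simplified sketch of that quorum re-check. This is not the actual Einstein@Home/BOINC validator code; the function name, result values and tolerances are purely illustrative:

# Toy illustration of quorum re-validation (not real BOINC code).
# Result values and tolerances are placeholders only.
def revalidate(results, tolerance):
    """Re-check a workunit once the extra reference task has been returned.

    results: numeric summary values reported by the tasks in the quorum.
    Returns 'valid' if a majority agree within tolerance, else 'inconclusive'.
    """
    for candidate in results:
        agreeing = [r for r in results if abs(r - candidate) <= tolerance]
        if len(agreeing) >= 2 and len(agreeing) > len(results) // 2:
            return "valid"      # a quorum agrees; 'inconclusive' tasks flip to 'valid'
    return "inconclusive"       # no quorum yet; another reference task is generated

# Two results that fail a strict comparison can 'agree' once the tolerance is relaxed:
print(revalidate([1.00000, 1.00008], tolerance=1e-6))  # -> inconclusive
print(revalidate([1.00000, 1.00008], tolerance=1e-4))  # -> valid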

BM

cecht
Joined: 7 Mar 18
Posts: 1444
Credit: 2488622387
RAC: 1208306


A comparison of concurrent task times on different Polaris cards running the v1.07 app on my Linux hosts (rough throughput math below):

RX460  97 min/task @ 1x
RX570  61 min/task @ 1x

compared to:

RX460  51 min/task @ 3x
RX570  35 min/task @ 3x
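Reading those as effective per-task times, i.e. wall-clock time divided by the number of concurrent tasks (my assumption; a back-of-the-envelope sketch only), the daily throughput works out roughly like this:

# Rough throughput estimate from the per-task times above, assuming the
# @3x figures are effective times (wall-clock time divided by 3).
times = {
    "RX460": {"1x": 97, "3x": 51},   # minutes per task
    "RX570": {"1x": 61, "3x": 35},
}
for card, t in times.items():
    per_day_1x = 1440 / t["1x"]      # tasks completed per 24 h at 1x
    per_day_3x = 1440 / t["3x"]      # tasks completed per 24 h at 3x
    print(f"{card}: {per_day_1x:.1f} -> {per_day_3x:.1f} tasks/day "
          f"(~{per_day_3x / per_day_1x:.1f}x gain)")
# RX460: 14.8 -> 28.2 tasks/day (~1.9x gain)
# RX570: 23.6 -> 41.1 tasks/day (~1.7x gain)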

I'm looking forward to getting back to running 3x concurrent tasks: partly for the higher efficiency, but also because one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related. Anyone else hearing such oddness?


Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110439384723
RAC: 30873084


cecht wrote:
... one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related.

It's probably 'coil whine'.  Inductors (chokes) are used to filter out high frequencies in DC circuits and under certain conditions the filtering action can create audible noise - essentially vibrations in the wire loops.

If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Cheers,
Gary.

cecht
Joined: 7 Mar 18
Posts: 1444
Credit: 2488622387
RAC: 1208306


Gary Roberts wrote:
cecht wrote:
... one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related.

It's probably 'coil whine'.  Inductors (chokes) are used to filter out high frequencies in DC circuits and under certain conditions the filtering action can create audible noise - essentially vibrations in the wire loops.

If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Thanks, good to know.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

archae86
Joined: 6 Dec 05
Posts: 3146
Credit: 7076624931
RAC: 1303693


Gary Roberts wrote:
If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Gary, I'm going to take your comment as "cover" to go back to 2X.  I have many dozens of task validations affirming that both of my Windows 10 RX 570 systems (single-card) had a very high successful validation rate when running at 2X.  I only downgraded out of obedience to Bernd, and in fairness his comment was phrased more in the tone of advice than demand.

If Bernd or someone else I trust comes on here and says "no, you really should be at 1X for the sake of the project", I'll revert promptly.  But the productivity difference is pretty painful if not needed.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110439384723
RAC: 30873084


archae86 wrote:
... take your comment as "cover" to go back to 2X.

For the benefit of all readers, I am not speaking for or on behalf of the project.  I'm just a volunteer and have no official capacity beyond the basic duties of moderation and attempting to answer queries and provide information where I can.

My interpretation of Bernd's "struggling with validation" comment was that the overriding concern is to identify the precise causes of invalid results.  For that purpose, it would be quite distracting to have to filter through results that were invalid for extraneous reasons, such as attempts to run excessive numbers of concurrent tasks on equipment that couldn't really handle it.  So I don't think they will be spending any time going through valid results, irrespective of the level of multiplicity used to produce them.

With the proviso that a volunteer pays proper attention to what is happening, I don't think that running at 2x on decent equipment, where the volunteer has already established that results are almost always valid, would cause any issues for the Devs.  I think the most useful thing a volunteer could do at the moment would be to try to provide examples of tasks that fail validation at 1x.  I've already identified an older CPU machine with a modern, decent GPU that does that.  I have several more 'candidates' with a range of CPU architectures that will further test this.  Hopefully this will provide useful examples to help Bernd.

 

Cheers,
Gary.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


As of 01:45 UTC today, I have a total of 9 invalid results for the GW 1.07 application. 8 of the failures are of the AMD/LINUX variety and 1 is a WIN7/NVIDIA task. This is out of about 80 tasks which validated successfully. Every one of the invalid results was a case where two CPU tasks outvoted the GPU, just as archae86 had predicted. I found a couple more inconclusive results in the pipeline this evening.
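For anyone curious what that outvoting looks like, here is a toy 2-of-3 sketch. It is not the project's validator, just an illustration of the majority rule with made-up numbers:

# Toy 2-of-3 quorum illustration (made-up numbers, not the real validator).
def classify(results, tolerance=1e-3):
    """Mark each task 'valid' or 'invalid' by whether at least two results agree."""
    verdict = {}
    for name, value in results.items():
        agreeing = [v for v in results.values() if abs(v - value) <= tolerance]
        verdict[name] = "valid" if len(agreeing) >= 2 else "invalid"
    return verdict

# Two CPU results that agree with each other outvote a lone GPU result:
print(classify({"cpu_1": 2.5000, "cpu_2": 2.5001, "gpu": 2.5100}))
# -> {'cpu_1': 'valid', 'cpu_2': 'valid', 'gpu': 'invalid'}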

The filter lumps all the GW tasks together, so getting an accurate count of GPU work alone is a bit of a chore.

Clear skies,
Matt
