The O2 All-Sky Gravitational Wave Search on GPUs - discussion thread.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


Gary Roberts wrote:
I think you made an earlier post about inconclusives (and perhaps invalids) 'disappearing' or something along those lines - I haven't gone back to find exactly how you described it.  From various snippets of information that come out from time to time, there can be 'manual intervention' of various sorts going on behind the scenes.  With the relaxing of validation 'rules', it seems likely that Work Units might be sent back through the validator and have their 'state' changed as a result.  An inconclusive or an invalid might suddenly become valid :-)

Good points. And that might very well be the case. What I do know is that they didn't show up in either the error column or the invalid column.

I'm sure "Scotty" (Bernd, a great and quite appropriate choice of avatar) is doing quite a bit of work in the engine room that we aren't yet privy to. :) Sorry, couldn't help myself; I'm a huge Star Trek fan and Mr. Scott was my favorite character.

Clear skies,
Matt
Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


archae86 wrote:
Matt White wrote:
I should have checked first. The 5 tasks marked inconclusive have either validated, or been removed from the queue. No new error or invalid tasks in the count.

 

Using this method, for your host 12785296 I spot 3 inconclusives

For your host 12785591 I spot 11 inconclusives (I did not count one that probably ran CPU-only based on total CPU time).

One might expect them to be visible if one imposes the "pending" filter, but they are not.

Filtering by application, I found 7 inconclusive results this morning, with 2 additional invalids. I have a total of 7 invalid GW GPU 1.07 tasks, all of them of the AMD/LINUX variety. That number is up from 5, as noted last evening. All of the invalid tasks were run on Aug 13th, the day I had an incompatible task running concurrently, so I'm hopeful that everything since will be okay. Time will tell.

Clear skies,
Matt
Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0


Matt White wrote:
What I do know is that they didn't show up in either the error column or the invalid column.

I can see the same with my host very easily at the moment. I'm running 20 tasks at 4x:

https://einsteinathome.org/host/12331989/tasks/0/0

I'm sure they are all going to end up "validation inconclusive". Six have so far... so 14 are still pending. It looks like in the end there will be 20 tasks in "All", but there won't be a single task shown in any of the definitive categories.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245408944
RAC: 11650

"inconclusive" results are

"inconclusive" results are always looked at again by the validator automatically, when the additional task that was generated for reference is returned  & reported. If the validator was changed since the first validation attempt, it might be that by the next time all results that were previously found 'inconclusive' now 'agree' and are marked 'valid'.

When we release a new application (not a new application version) I usually leave the 'assimilator' turned off. This shows up as a 'disabled' assimilator on the server status page, and usually as 'workunits waiting for assimilation'. In this state I could rather easily 're-validate' workunits manually, such that a task's status could change to 'valid' even from 'invalid'. However, once a workunit has been 'assimilated', this would require too much manual intervention, which could also have unwanted and unnoticed side effects, so I rarely do a manual re-validation once an application has been established and the assimilator is running.
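To illustrate the idea, here is a much-simplified sketch of that quorum re-check. This is not the actual Einstein@Home/BOINC validator code; the function name, result values and tolerances are purely illustrative:

# Toy illustration of quorum re-validation (not real BOINC code).
# Result values and tolerances are placeholders only.
def revalidate(results, tolerance):
    """Re-check a workunit once the extra reference task has been returned.

    results: numeric summary values reported by the tasks in the quorum.
    Returns 'valid' if a majority agree within tolerance, else 'inconclusive'.
    """
    for candidate in results:
        agreeing = [r for r in results if abs(r - candidate) <= tolerance]
        if len(agreeing) >= 2 and len(agreeing) > len(results) // 2:
            return "valid"      # a quorum agrees; 'inconclusive' tasks flip to 'valid'
    return "inconclusive"       # no quorum yet; another reference task is generated

# Two results that fail a strict comparison can 'agree' once the tolerance is relaxed:
print(revalidate([1.00000, 1.00008], tolerance=1e-6))  # -> inconclusive
print(revalidate([1.00000, 1.00008], tolerance=1e-4))  # -> valid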

BM

cecht
Joined: 7 Mar 18
Posts: 1444
Credit: 2488622387
RAC: 1208306


A comparison of concurrent task times on different Polaris cards running the v1.07 app on my Linux hosts (rough throughput math below):

RX460  97 min/task @ 1x
RX570  61 min/task @ 1x

compared to:

RX460  51 min/task @ 3x
RX570  35 min/task @ 3x
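Reading those as effective per-task times, i.e. wall-clock time divided by the number of concurrent tasks (my assumption; a back-of-the-envelope sketch only), the daily throughput works out roughly like this:

# Rough throughput estimate from the per-task times above, assuming the
# @3x figures are effective times (wall-clock time divided by 3).
times = {
    "RX460": {"1x": 97, "3x": 51},   # minutes per task
    "RX570": {"1x": 61, "3x": 35},
}
for card, t in times.items():
    per_day_1x = 1440 / t["1x"]      # tasks completed per 24 h at 1x
    per_day_3x = 1440 / t["3x"]      # tasks completed per 24 h at 3x
    print(f"{card}: {per_day_1x:.1f} -> {per_day_3x:.1f} tasks/day "
          f"(~{per_day_3x / per_day_1x:.1f}x gain)")
# RX460: 14.8 -> 28.2 tasks/day (~1.9x gain)
# RX570: 23.6 -> 41.1 tasks/day (~1.7x gain)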

I'm looking forward to getting back to running 3x concurrent tasks: partly for the higher efficiency, but also because one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related. Anyone else hearing such oddness?


Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110439384723
RAC: 30873084


cecht wrote:
... one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related.

It's probably 'coil whine'.  Inductors (chokes) are used to filter out high frequencies in DC circuits and under certain conditions the filtering action can create audible noise - essentially vibrations in the wire loops.

If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Cheers,
Gary.

cecht
Joined: 7 Mar 18
Posts: 1444
Credit: 2488622387
RAC: 1208306


Gary Roberts wrote:
cecht wrote:
... one of my RX570 cards (device 1) has a pulsating squeak only when running tasks at 1x.  It sounds like a squeaky wheel bearing, faint, pulsating about 3 cycles/sec. It's not fan related.

It's probably 'coil whine'.  Inductors (chokes) are used to filter out high frequencies in DC circuits and under certain conditions the filtering action can create audible noise - essentially vibrations in the wire loops.

If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Thanks, good to know.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

archae86
Joined: 6 Dec 05
Posts: 3146
Credit: 7076624931
RAC: 1303693


Gary Roberts wrote:
If the card is producing valid results at 3x, you may as well go back to 3x.  My guess is that the real interest may be in situations where inconclusive/invalid results are produced at 3x to see if that also happens at 1x.

Gary, I'm going to take your comment as "cover" to go back to 2X.  I have many dozens of task validations affirming that both of my Windows 10 RX 570 systems (single-card) had a very high successful validation rate when running at 2X.  I only downgraded out of obedience to Bernd, and in fairness his comment was phrased more in the tone of advice than demand.

If Bernd or someone else I trust comes on here and says "no, you really should be at 1X for the sake of the project", I'll revert promptly.  But the productivity difference is pretty painful if not needed.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110439384723
RAC: 30873084


archae86 wrote:
... take your comment as "cover" to go back to 2X.

For the benefit of all readers, I am not speaking for or on behalf of the project.  I'm just a volunteer and have no official capacity beyond the basic duties of moderation and attempting to answer queries and provide information where I can.

My interpretation of Bernd's "struggling with validation" comment was that the overriding concern is to identify the precise causes of invalid results.  For that purpose, it would be quite distracting to have to filter through results that were invalid for extraneous reasons, such as attempts to run excessive numbers of concurrent tasks on equipment that couldn't really handle it.  So I don't think they will be spending any time going through valid results, irrespective of the level of multiplicity used to produce them.

With the proviso that a volunteer pays proper attention to what is happening, I don't think that running at 2x on decent equipment, where the volunteer has already established that results are almost always valid, would cause any issues for the Devs.  I think the most useful thing a volunteer could do at the moment would be to try to provide examples of tasks that fail validation at 1x.  I've already identified an older CPU machine with a modern, decent GPU that does that.  I have several more 'candidates' with a range of CPU architectures that will further test this.  Hopefully this will provide useful examples to help Bernd.

 

Cheers,
Gary.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0


As of 01:45 UTC today, I have a total of 9 invalid results for the GW 1.07 application. 8 of the failures are of the AMD/LINUX variety and 1 is a WIN7/NVIDIA task. This is out of about 80 tasks which validated successfully. Every one of the invalid results was a case where two CPU tasks outvoted the GPU, just as archae86 had predicted. I found a couple more inconclusive results in the pipeline this evening.
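For anyone curious what that outvoting looks like, here is a toy 2-of-3 sketch. It is not the project's validator, just an illustration of the majority rule with made-up numbers:

# Toy 2-of-3 quorum illustration (made-up numbers, not the real validator).
def classify(results, tolerance=1e-3):
    """Mark each task 'valid' or 'invalid' by whether at least two results agree."""
    verdict = {}
    for name, value in results.items():
        agreeing = [v for v in results.values() if abs(v - value) <= tolerance]
        verdict[name] = "valid" if len(agreeing) >= 2 else "invalid"
    return verdict

# Two CPU results that agree with each other outvote a lone GPU result:
print(classify({"cpu_1": 2.5000, "cpu_2": 2.5001, "gpu": 2.5100}))
# -> {'cpu_1': 'valid', 'cpu_2': 'valid', 'gpu': 'invalid'}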

The filter lumps all the GW tasks together, so getting an accurate count of GPU work alone is a bit of a chore.

Clear skies,
Matt
