Outcomes on MeerKAT 0.05

bluestang

Joined: 13 Apr 15

Posts: 34

Credit: 2492970228

RAC: 0

Gary Roberts

3 Sep 2022 0:13:15 UTC

Message 200523 in response to message 200520


Gary Roberts wrote:

bluestang wrote:
... my lowly AMD 7870XT is doing 2x happily.
Is this the machine you're talking about? It shows as having a 2GB AMD 7800 series GPU.

I looked at the tasks list and the only MeerKAT tasks still showing were run with the v0.01 app over a month ago. Have you actually run concurrent tasks more recently with something closer to a production version??

Yes that's the one. Bad choice of words on my part as it is no longer running tasks, but it was and didn't seem to have many issues then.

I'm not too worried I have an issue, just want the devs to know what I'm experiencing for future development of these apps.

For now I'll just let my cache run at 2x on my NVIDIA GPUs and turn off Beta.

Thanks to all who responded and helped.

mikey

Joined: 22 Jan 05

Posts: 12687

Credit: 1839092911

RAC: 3802

bluestang wrote: Gary

3 Sep 2022 9:35:26 UTC

Message 200546 in response to message 200523


bluestang wrote:

Gary Roberts wrote:

bluestang wrote:
... my lowly AMD 7870XT is doing 2x happily.
Is this the machine you're talking about? It shows as having a 2GB AMD 7800 series GPU.

I looked at the tasks list and the only MeerKAT tasks still showing were run with the v0.01 app over a month ago. Have you actually run concurrent tasks more recently with something closer to a production version??

Yes that's the one. Bad choice of words on my part as it is no longer running tasks, but it was and didn't seem to have many issues then.

I'm not too worried I have an issue, just want the devs to know what I'm experiencing for future development of these apps.

For now I'll just let my cache run at 2x on my NVIDIA GPUs and turn off Beta.

Thanks to all who responded and helped.

There's a whole thread about the MeerKAT tasks, with Bernd the admin explaining how they are working to fix the validations etc. by tweaking the apps, and what's working now and what isn't. It's here:

https://einsteinathome.org/content/em-searches-brp-raidiopulsar-and-fgrp-gamma-ray-pulsar

In short, validations against other operating systems, and other things as well, are increasing every day.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7223524931

RAC: 1001329

With another 24 hours running

3 Sep 2022 14:12:56 UTC

Message 200549 in response to message 200496


With another 24 hours running 0.12 MeerKAT for Windows/AMD on three hosts things continue to look very good.

I looked at my invalid tasks sorted by sent date. I did see a couple of failures against quorum partners running v0.12 Linux applications. I've not yet spotted either success or failure for my tasks matched against v0.13 Linux work. I have high hopes based on reports from others.

I also saw something new to me: half a dozen tasks with the status showing as Completed, can't validate.

These turned out to be tasks created from Work Units which reached the 20 task limit.

While I expected these would turn out to be WUs which had piled up computation errors on too many machines, the dominant problem in all cases was "Error while downloading", with the failures logged largely on September 2 and 3.

As with the previous problem with computation errors, it appears that some recent downloading failures are not random, but preferentially associated with tasks created from a small subset of WUs.
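
One way to check that kind of clustering, as a minimal sketch: tally the failed tasks by work unit and see how much of the total the worst WU accounts for. The records below are made-up illustrative data, and the field names (`wu_id`, `outcome`) are hypothetical, not anything the project's task pages export.

```python
from collections import Counter

# Hypothetical, hand-made task records for illustration only;
# real data would have to be gathered from the task lists.
tasks = [
    {"wu_id": "A", "outcome": "Error while downloading"},
    {"wu_id": "A", "outcome": "Error while downloading"},
    {"wu_id": "A", "outcome": "Error while downloading"},
    {"wu_id": "B", "outcome": "Completed and validated"},
    {"wu_id": "C", "outcome": "Error while downloading"},
    {"wu_id": "D", "outcome": "Completed and validated"},
    {"wu_id": "E", "outcome": "Completed and validated"},
]

def failures_by_wu(records, failure="Error while downloading"):
    """Count tasks with the given failure outcome per work unit."""
    return Counter(r["wu_id"] for r in records if r["outcome"] == failure)

def share_of_top(counts, n=1):
    """Fraction of all failures that fall on the n worst work units."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total

counts = failures_by_wu(tasks)
top_share = share_of_top(counts, n=1)
print(counts.most_common())
print(f"worst WU holds {top_share:.0%} of download failures")
```

If failures were spread evenly, the worst WU's share would sit near one over the number of WUs with any failure; a much larger share hints that the failures are tied to particular WUs.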

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7223524931

RAC: 1001329

I just noticed that there was

3 Sep 2022 16:25:53 UTC

Message 200550 in response to message 200549


I just noticed that there was a window a couple of hours ago in which my system received MeerKAT work on an 0.14 version.

The full listed application reference is

Binary Radio Pulsar Search (MeerKAT) v0.14 (BRP7-opencl-ati-gcc4) windows_x86_64

There was a cuda55 variant also carrying the 0.14 and gcc4 markings.

As my subsequent work fetches have again been 0.12 work, and the application page does not currently list anything above 0.13, I surmise something was not satisfactory about these. I'm currently processing the seven of these I have in hand ahead of their turn, so I'll have seen some behavior on one of my systems within an hour.

[edit to add: I've run some Windows AMD 0.14 gcc4 tasks to completion on one of my systems. It seemed entirely unremarkable, taking very similar elapsed time to 0.12. However, among the seven tasks, all were initially assigned 0.14 cuda55 quorum partners. Of the three or four of those cuda55 trials that had reported when I looked, all had generated fast fails ("error while computing" with elapsed time under 10 seconds). I suppose that may be why these versions were pulled out of service so quickly.]

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3956

Credit: 46907842642

RAC: 64527864

"gcc4" sounds like they are

3 Sep 2022 17:15:11 UTC

Message 200551


"gcc4" sounds like they are playing with an older compiler for their application, though I wonder why using such an old version would be beneficial. gcc4 is VERY old. Depending on which exact subversion they are using, it could date from anywhere in ~2005-2016, the span of the GCC 4 releases.

Ubuntu 22.04 has v11.x, for example.


archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7223524931

RAC: 1001329

So my seven 0.14 gcc4 tasks

3 Sep 2022 18:05:23 UTC

Message 200552 in response to message 200550


So my seven 0.14 gcc4 tasks have had 0.14 cuda quorum partners return results for four different tasks, all of which have generated fast fails.

To add insult to injury, one of those WUs, where I had an initial fast-failing 0.14 cuda partner, was then issued as an 0.12 cuda task. For this I got an inconclusive; in other words, my result did not match well enough to count. This is a bit troublesome, as 0.12 cuda tasks have successfully validated against my 0.12 ati tasks quite often.

https://einsteinathome.org/workunit/669056670

If you have some of these 0.14 gcc4 tasks in your queue, you may wish to consider aborting them. I doubt that running them at this stage will add much value.
