Help a noob out - acceptable rate of invalid returns

Burned

Joined: 25 Jun 21

Posts: 32

Credit: 388221900

RAC: 0

30 Jun 2021 14:02:57 UTC

Topic 225627

Crunching GPU work for GRPBS1. I'm getting some invalid returns. Is this just the nature of the science and computations? Different systems produce different results and you just try to converge on the most likely correct answer?

Keith Myers

Joined: 11 Feb 11

Posts: 4753

Credit: 17681147065

RAC: 5738335

30 Jun 2021 14:15:36 UTC

Message 186878

Yes, unavoidable. You get a better chance if your wingmen have similar hardware and OS.

~~Some discussion about relaxing the validator limits or pairing wingmen with similar hardware taking place.~~

[Edit] Comment for wrong project.

archae86

Joined: 6 Dec 05

Posts: 3145

Credit: 7057804931

RAC: 1601700

1 Jul 2021 1:31:12 UTC

Message 186895

For that specific application a pretty typical ratio of invalid to valid results for an individual system is roughly ~~100:1~~ one out of every hundred. If a system is persistently well above that, say one out of twenty or worse, and a quick check of top systems of quorum partners suggests the background situation has not changed for everybody, then there is real reason for concern about the health of the system in question.

My quick look at your two hosts suggested to me that there was nothing unusual at hand, so far.

In floating point, something as simple as conversions from internal to external representation causes infinitesimal differences in results from runs in which things get paused at different places. IEEE floating point is so very good that such differences are usually inconsequential, but nevertheless detectable. Setting the acceptance limits on "close enough to count" is not a trivial matter.
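To make the point about representation conversions concrete, here is a minimal Python sketch. It assumes a hypothetical checkpoint that stores a value as text with a limited number of digits; the real application's checkpoint format and the project validator's actual tolerance are not known here, so `save_and_restore`, `results_agree`, and the `rel_tol` value are illustrative only.

```python
import math


def save_and_restore(value, digits=10):
    """Round-trip a float through a text form with limited digits.

    Hypothetical stand-in for writing a checkpoint and reading it back.
    """
    text = f"{value:.{digits}e}"  # internal -> external representation
    return float(text)            # external -> internal representation


def results_agree(a, b, rel_tol=1e-9):
    """Return True when two results differ by less than an assumed tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


uninterrupted = 1.0 / 3.0                    # run that was never paused
resumed = save_and_restore(uninterrupted)    # run that was paused and resumed

bitwise_equal = uninterrupted == resumed
close_enough = results_agree(uninterrupted, resumed)

print("bitwise equal:", bitwise_equal)
print("within tolerance:", close_enough)
```

The two values are not bit-for-bit identical, yet they pass a tolerance check. Choosing that tolerance is the hard part: too tight and harmless differences get marked invalid, too loose and genuinely wrong results slip through.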

[edited in response to Gary Roberts pointing out that I had it very wrong indeed]

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5845

Credit: 109973693670

RAC: 29753622

1 Jul 2021 0:21:55 UTC

Message 186898

Just a small clarification of what Archae86 posted. He said, "ratio of invalid to valid results for an individual system is roughly 100:1." but I'm sure he meant the other way around :-).

I would agree that it's quite normal to see around 1% of returned tasks marked as 'invalid' due to very minor precision differences in different math libraries being used. This can vary over time - maybe 0.5% at one point and perhaps as high as 2% at some other time. If you see much greater than that consistently, it might be wise to investigate - things like clocks, voltages and temperature spring to mind :-).

Cheers,
Gary.
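The rules of thumb in the two posts above can be sketched as a small Python check. The thresholds are taken from the thread (up to about 2% treated as normal, one in twenty or worse as a reason for concern); the function names, the middle "watch" band, and the task counts are hypothetical, and nothing here reads real data from the project.

```python
def invalid_rate(valid, invalid):
    """Fraction of returned tasks that were marked invalid."""
    total = valid + invalid
    if total == 0:
        return 0.0
    return invalid / total


def assess(valid, invalid, typical=0.02, concern=0.05):
    """Classify a host using the rough thresholds quoted in the thread.

    typical: upper end of the normal range (about 2%)
    concern: one out of twenty or worse (5%)
    """
    rate = invalid_rate(valid, invalid)
    if rate <= typical:
        return "normal"
    if rate < concern:
        return "watch"
    return "investigate"


# Hypothetical task counts, for illustration only.
print(assess(valid=990, invalid=10))
print(assess(valid=97, invalid=3))
print(assess(valid=95, invalid=5))
```

A single bad day can push a small sample over a threshold, so a check like this is only meaningful over a few hundred returned tasks and when the rate stays high consistently.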

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0

1 Jul 2021 6:54:50 UTC

Message 186908

In QuChemPedIA@home I get a 50% rate of invalid results using Windows 10. It is a Linux project and I have to use VirtualBox. Yet I am number 28 in the RAC ranking list.

Tullio

Burned

Joined: 25 Jun 21

Posts: 32

Credit: 388221900

RAC: 0

1 Jul 2021 13:30:52 UTC

Message 186919

Tullio, I'm not certain how VirtualBox works, but you may want to check your Linux C runtime libraries. The project should probably have a recommendation as to which package(s) they want used.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3713

Credit: 34650213193

RAC: 39696339

1 Jul 2021 16:29:21 UTC

Message 186923 in response to message 186908

tullio wrote:

In QuChemPedIA@home I get a 50% rate of invalid results using Windows 10. It is a Linux project and I have to use VirtualBox. Yet I am number 28 in the RAC ranking list.

Tullio

What does this have to do with Einstein?

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0

2 Jul 2021 15:31:25 UTC

Message 186938

I have a number of Einstein results, both GPU and CPU. Pending rate is almost zero. I have six BOINC projects running. Sometimes it is interesting to compare how different projects handle the valid/invalid ratio.

Tullio
