Validate error - What this really means!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117751025455
RAC: 34814138

Here is the extra info for these two task IDs.

Quote:
272579064


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
273251777


Validate error [6] (00001010)
- result file has entries that aren't numbers
- a number is out of valid range for this result
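The eight-digit strings in these two reports look like bit flags, with one message per set bit. Here is a minimal Python sketch of that reading. It assumes the string is binary with the rightmost digit as bit 0, and the `KNOWN_FLAGS` table is hypothetical: it holds only the two bit/message pairings that can be inferred from the examples above, not the validator's real table.

```python
# Hypothetical mapping, inferred only from the two examples in this thread.
# The validator's real flag table is not shown here.
KNOWN_FLAGS = {
    1: "result file has entries that aren't numbers",
    3: "a number is out of valid range for this result",
}


def decode_validate_flags(bits):
    """Return one message per set bit, reading `bits` as a binary string."""
    value = int(bits, 2)
    messages = []
    for position in range(len(bits)):
        if value & (1 << position):
            # Bits that never appear in the thread are reported as unknown.
            messages.append(
                KNOWN_FLAGS.get(position, "unknown flag (bit %d)" % position)
            )
    return messages


if __name__ == "__main__":
    for flags in ("00000010", "00001010"):
        print(flags, decode_validate_flags(flags))
```

Under that assumption, 00000010 decodes to the "aren't numbers" message alone and 00001010 to both messages, which matches the extra info listed for the two tasks.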

Cheers,
Gary.

Dennis
Joined: 8 Nov 09
Posts: 5
Credit: 9611599
RAC: 0

271583014

Nigel Garvey
Joined: 4 Oct 10
Posts: 51
Credit: 33021269
RAC: 90898

Thanks, Gary. Here are two more which appeared this morning.

273794486
274216816

With the previous three I've reported, that's 1 day 14 hours 42 minutes and 34 seconds of CPU time down the toilet in the past two weeks. I've now turned off FGRP1 tasks for a while in my preferences. I hope the "file has entries that aren't numbers" error means something to someone.

NG

Darren Peets
Joined: 19 Nov 09
Posts: 37
Credit: 107366051
RAC: 48439

270352809
272967613
273026007
274259792 (not the only validate error for this workunit, and I note that the Linux and Mac results so far have been invalid, while the Windows result is pending)

One of my two computers, to my knowledge, has never had an invalid task on anything other than gamma ray work.

Dennis
Joined: 8 Nov 09
Posts: 5
Credit: 9611599
RAC: 0

27158301

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117751025455
RAC: 34814138

Extra info for this task

Quote:
271583014

Validate error [6] (00000010)
- result file has entries that aren't numbers

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117751025455
RAC: 34814138

Quote:
273794486


Validate error [6] (00001010)
- result file has entries that aren't numbers
- a number is out of valid range for this result

Quote:
274216816


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
With the previous three I've reported, that's 1 day 14 hours 42 minutes and 34 seconds of CPU time down the toilet in the past two weeks.


As pointed out in the opening post of this thread, there is a disproportionately high rate of validate errors for FGRP1 tasks on Linux and Mac OS X. Everybody using these systems is suffering and I'm very sorry it's taking so long to find the cause. The Devs were informed at the time and are trying to work out what is doing this. They do have many things competing for their time and this particular problem is obviously not simple to diagnose.

Quote:
I've now turned off FGRP1 tasks for a while in my preferences.


That's the only thing you can do if the 'loss rate' is unacceptable. In your case, you seem to be getting really hammered, so I can fully understand your concerns.

Quote:
I hope the "file has entries that aren't numbers" error means something to someone.


The actual message may bear little relationship to what is actually doing the damage. Since it's not showing up in Windows, I would guess that it's probably some obscure problem somewhere specific to the unix world that is being triggered occasionally by whatever ... Unfortunately, nothing has yet been found and the Devs have other priorities they must also attend to.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117751025455
RAC: 34814138

Quote:
270352809


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
272967613


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
273026007


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
274259792


Validate error [6] (00000010)
- result file has entries that aren't numbers

Quote:
(not the only validate error for this workunit, and I note that the Linux and Mac results so far have been invalid, while the Windows result is pending)


There are now three validate errors (all Mac OS X/Linux) for the WU quorum you mention. It's possible that this could be bad data. The 'in progress' task is on a Windows machine so the answer will be revealed shortly. If it fails (validate error) it's very likely bad data and I'll report it to the Devs. If it succeeds, the data is OK and it's just a random triple coincidence of the validate error problem. In that case, I'll report it as well, just in case a triple occurrence like this might help with the diagnosis. The extra info associated with the other two failed tasks is exactly the same as yours.

Quote:
One of my two computers, to my knowledge, has never had an invalid task on anything other than gamma ray work.


That's not really surprising since validate errors are quite rare for CPU tasks other than FGRP1 tasks.

Cheers,
Gary.

Darren Peets
Joined: 19 Nov 09
Posts: 37
Credit: 107366051
RAC: 48439

That workunit has now validated (Windows-Windows).

I hope it's not something silly like line ends or "5.0E-3" being detected as non-numerical characters...
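To illustrate the worry, here is a minimal Python sketch of a number check that tolerates both of those cases. It is purely illustrative and not the project's validator: `classify_entry` and `check_line` are made-up names, and the whitespace-separated layout of a result line is an assumption.

```python
import math


def classify_entry(text):
    """Classify one whitespace-separated token from a result line."""
    try:
        value = float(text)  # float() accepts forms such as "5.0E-3"
    except ValueError:
        return "not a number"
    if math.isnan(value) or math.isinf(value):
        # float() also parses "nan" and "inf", so those are caught separately.
        return "non-finite"
    return "ok"


def check_line(line):
    # str.split() with no arguments discards "\r" and "\n" along with spaces,
    # so Windows and Unix line ends are treated the same way.
    return [classify_entry(token) for token in line.split()]


if __name__ == "__main__":
    print(check_line("5.0E-3 1.25\r\n"))
    print(check_line("nan 3 abc"))
```

A check written this way would accept "5.0E-3" and ignore the line ending. Whether the real validator behaves like this is not something the thread tells us.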

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117751025455
RAC: 34814138

It won't be something simple or silly. The problem is being looked at and it is elusive.

I have brought this triple validate error quorum to the attention of the Devs in the (probably forlorn) hope that it might provide additional insights. Anyway, fingers crossed ...

Cheers,
Gary.
