Validate Errors

ritterm
Joined: 18 Jun 08
Posts: 23
Credit: 46657826
RAC: 0
Topic 197335

I just noticed that my FX8150/GTX550Ti/Win7-64 host has recently generated several validate errors (and "Completed, marked as invalid"). I'm probably missing something in the stderr output, but I don't see much difference between these invalid tasks and the valid tasks for this host.

Any idea what's happening and what could be causing the problem?

Thanks,

MarkR

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2976741074
RAC: 784797

Oops. I'm seeing the same thing: Invalid tasks for computer 5744895. I did have a few voltage problems with that host before the holidays, but they were diagnosed and fixed, and I've had no problems with it since.

You are having problems with the NVIDIA GPU version of the Binary Radio Pulsar Search; mine are with the Intel GPU version. I suspect that we might both be suffering from related problems at the server end, but I'll keep an eye on things.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806

Reading this, I checked my account.
3 errors: one compute error saying
couldn't start app: Input file rand_PAS.bank.v3 missing or invalid: md5 checksum failed for file
and two download errors saying that this file was corrupted.
All on the same machine: Win7-64, i7, 2 AMD GPUs. Old main win7
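
For anyone who wants to double-check a downloaded data file by hand, here is a minimal Python sketch that computes a file's MD5 digest in chunks and compares it with an expected value. The function names and the example path are hypothetical, and the expected digest has to come from a trusted source. This is not how the BOINC client itself performs its check, only a way to reproduce the comparison manually.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Hypothetical usage: the path and digest below are placeholders, not real values.
# print(matches_expected("rand_PAS.bank.v3", "<expected md5 from a trusted source>"))
```

If the digest of a freshly downloaded copy keeps changing between attempts on the same machine, that would suggest the corruption is happening locally (disk, memory, or network path) rather than on the server.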

ritterm

Richard Haselgrove wrote:
I suspect that we might both be suffering from related problems at the server end, but I'll keep an eye on things.


Sorry to hear that I'm not the only one...ha-ha. Hopefully we'll get some feedback from the admins soon.

Alex wrote:
I checked my account...3 errors...one compute error...two download errors


Yeah, but Richard and I are seeing validate errors. :-(

ritterm

Oh, dear... I might have bigger problems with this host. Every Gravitational Wave S6 Directed Search (CasA) v1.05 (SSE2) task is crashing... :-(

ritterm

Any chance we could get some feedback from the developers or admins on this issue? I've had recurrences since my original post (...and I believe Richard has, too).

Thanks,

MarkR

Richard Haselgrove

Quote:

Any chance we could get some feedback from the developers or admins on this issue? I've had recurrences since my original post (...and I believe Richard has, too).

Thanks,

MarkR


Yes, that list looks longer and newer, doesn't it?

Not worth disturbing anyone at this time of night (it must be a small proportion of tasks failing - that host spits out one every 12 minutes, five per hour, 120 per day) - but maybe a PM in the morning.

Christian Menges
Joined: 26 Nov 05
Posts: 3
Credit: 102787541
RAC: 0

I also have validate errors:
------> Number of samples: 2097152
------> Trial dispersion measure: 266 cm^-3 pc
------> Scale factor: 1.875
[21:49:04][2000][INFO ] Seed for random number generator is 1084926635.
[21:49:08][2000][INFO ] Derived global search parameters:
------> f_A probability = 0.04
------> single bin prob(P_noise > P_thr) = 1.2977e-008
------> thr1 = 18.1601
------> thr2 = 21.263
------> thr4 = 26.2923
------> thr8 = 34.674
------> thr16 = 48.9881
[21:50:06][2000][INFO ] Checkpoint committed!
[21:51:12][2000][INFO ] Checkpoint committed!
[21:52:17][2000][INFO ] Checkpoint committed!
[21:53:23][2000][INFO ] Checkpoint committed!
[21:54:28][2000][INFO ] Checkpoint committed!
[21:55:34][2000][INFO ] Checkpoint committed!
[21:56:39][2000][INFO ] Checkpoint committed!
[21:57:45][2000][INFO ] Checkpoint committed!
[21:58:50][2000][INFO ] Checkpoint committed!
[21:59:56][2000][INFO ] Checkpoint committed!
[22:01:01][2000][INFO ] Checkpoint committed!
[22:02:07][2000][INFO ] Checkpoint committed!
[22:03:12][2000][INFO ] Checkpoint committed!
[22:04:18][2000][INFO ] Checkpoint committed!
[22:05:23][2000][INFO ] Checkpoint committed!
[22:06:29][2000][INFO ] Checkpoint committed!
[22:07:34][2000][INFO ] Checkpoint committed!
[22:08:40][2000][INFO ] Checkpoint committed!
[22:09:45][2000][INFO ] Checkpoint committed!
[22:10:51][2000][INFO ] Checkpoint committed!
[22:11:57][2000][INFO ] Checkpoint committed!
[22:13:02][2000][INFO ] Checkpoint committed!
[22:14:08][2000][INFO ] Checkpoint committed!
[22:15:13][2000][INFO ] Checkpoint committed!
[22:16:19][2000][INFO ] Checkpoint committed!
[22:17:24][2000][INFO ] Checkpoint committed!
[22:18:30][2000][INFO ] Checkpoint committed!
[22:19:35][2000][INFO ] Checkpoint committed!
[22:20:41][2000][INFO ] Checkpoint committed!
[22:21:46][2000][INFO ] Checkpoint committed!
[22:22:52][2000][INFO ] Checkpoint committed!
[22:23:57][2000][INFO ] Checkpoint committed!
[22:25:03][2000][INFO ] Checkpoint committed!
[22:26:08][2000][INFO ] Checkpoint committed!
[22:27:14][2000][INFO ] Checkpoint committed!
[22:28:19][2000][INFO ] Checkpoint committed!
[22:29:25][2000][INFO ] Checkpoint committed!
[22:30:30][2000][INFO ] Checkpoint committed!
[22:31:36][2000][INFO ] Checkpoint committed!
[22:32:41][2000][INFO ] Checkpoint committed!
[22:33:47][2000][INFO ] Checkpoint committed!
[22:34:52][2000][INFO ] Checkpoint committed!
[22:35:58][2000][INFO ] Checkpoint committed!
[22:37:03][2000][INFO ] Checkpoint committed!
[22:38:08][2000][INFO ] Checkpoint committed!
[22:39:14][2000][INFO ] Checkpoint committed!
[22:40:19][2000][INFO ] Checkpoint committed!
[22:41:25][2000][INFO ] Checkpoint committed!
[22:42:30][2000][INFO ] Checkpoint committed!
[22:43:34][2000][INFO ] OpenCL shutdown complete!
[22:43:34][2000][INFO ] Statistics: count dirty SumSpec pages 0 (not checkpointed), Page Size 1024, fundamental_idx_hi-window_2: 1100505
[22:43:34][2000][INFO ] Data processing finished successfully!
22:43:34 (2000): called boinc_finish

like this
Greetings
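
When comparing the stderr output of an invalid task against a valid one, the long run of checkpoint lines is hard to scan by eye. The following is a rough Python sketch, assuming the line format shown in the log above, that extracts the "Checkpoint committed!" timestamps and summarizes the gaps between them. The function names are made up for illustration. A regular cadence says nothing about whether the result is numerically correct; it only helps spot stalls or restarts.

```python
import re

# Matches lines such as: [21:50:06][2000][INFO ] Checkpoint committed!
CHECKPOINT_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\]\[\d+\]\[INFO\s*\] Checkpoint committed!"
)


def checkpoint_intervals(stderr_text):
    """Return the gaps in seconds between consecutive checkpoint lines."""
    times = []
    for line in stderr_text.splitlines():
        match = CHECKPOINT_RE.match(line.strip())
        if match:
            hours, minutes, seconds = map(int, match.groups())
            times.append(hours * 3600 + minutes * 60 + seconds)
    # The log has no date, so wrap around midnight with modulo one day.
    return [(later - earlier) % 86400 for earlier, later in zip(times, times[1:])]


def summarize(stderr_text):
    """Return checkpoint count and min/max/mean gap, or None if there are no gaps."""
    gaps = checkpoint_intervals(stderr_text)
    if not gaps:
        return None
    return {
        "checkpoints": len(gaps) + 1,
        "min_gap_s": min(gaps),
        "max_gap_s": max(gaps),
        "mean_gap_s": sum(gaps) / len(gaps),
    }
```

Running `summarize` on the stderr text of a valid and an invalid task from the same host gives two small dictionaries that can be compared side by side.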

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756102891
RAC: 1155145

Hi!

For the host in the thread starting message I see validate errors and invalid results for both GPU and CPU apps, and at a much higher rate than usual. At the same time, other hosts that were computing the same work were able to return valid results (at least in several cases I checked). All this does point to hardware problems, I'm afraid. Using a memory checking tool and checking that the cooling works are standard things you'll want to try first.

Cheers
HB

ritterm

Quote:
All this does point to hardware problems, I'm afraid...


And I'm afraid you're probably right, HB. I popped the case open and it didn't take long to find a couple of blown components on the motherboard. I'm very disappointed as this mobo is only a few months old. I'll now get to experience my first RMA with a hardware vendor (in this case, ASUS).

Thanks very much, HB, for your feedback. :-)

MarkR
