D41.xx Observation Thread

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4,527,270
RAC: 0

D41.11/12 results:

valid: 21
invalid: 0

Could somebody give me a similar report?
It would be useful to decide the direction of further development.
I can increase the accuracy in some way, so I will do that if it's needed.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1,516,715
RAC: 0

32 completed without errors
9 valid
0 invalid

Beyond
Joined: 28 Feb 05
Posts: 121
Credit: 2,001,416,212
RAC: 6,179,375

Second result with D41.12 on same machine is also bad. That's 2 for 2.

http://einsteinathome.org/task/27335956

All results done with S41.12 on the same machine were good (hundreds of them).
Back to S41.12 on that machine (an Athlon64 3400+) until this gets resolved.

I have 2 other machines that have each produced 1 good result so far with D41.12.

[Sleeper]
Joined: 28 Feb 05
Posts: 10
Credit: 112,524
RAC: 0

17 completed without errors

4 Valid
1 invalid
12 pending

EDIT:
Which is more accurate, Intel or my AMD Athlon XP?

Kerwin
Joined: 22 Jan 05
Posts: 7
Credit: 10,120,613
RAC: 0

As of now, 15 complete without errors.

2 Valid
13 Pending

Brian
Joined: 25 Mar 06
Posts: 22
Credit: 80,237
RAC: 0

Message 29757 in response to message 29756

It seems like (almost?) every invalid WU with D41.12 also had a PowerPC or Linux client in the cluster. Has anyone validated with one of these clients in their cluster, or had an invalid result when the cluster contained only other optimized or standard Windows clients?

Vorik
Joined: 10 Nov 04
Posts: 8
Credit: 21,237,672
RAC: 72,728

Lots of errors on my K6-2 400 MHz:

http://einsteinathome.org/host/599639/tasks

5.3.12.tx36
- exit code -1073741819 (0xc0000005)

2006-05-01 01:13:08.4269 [normal]: Optimised by akosf D41.12 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-05-01 01:13:08.4369 [normal]: Started search at lalDebugLevel = 0
2006-05-01 01:13:11.9920 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-05-01 01:13:11.9920 [normal]: No usable checkpoint found, starting from beginning.

***UNHANDLED EXCEPTION****
Reason: Access Violation (0xc0000005) at address 0x0040AD0A read attempt to address 0x02C30000

1: 05/01/06 01:13:12
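
For anyone wondering why the exit code shows up both as -1073741819 and as 0xc0000005: they are the same 32-bit value, once read as a signed integer and once as unsigned hex. A minimal sketch of the conversion (the helper name `to_ntstatus` is just made up for this illustration):

```python
def to_ntstatus(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex status value."""
    # Masking with 0xFFFFFFFF gives the two's-complement bit pattern
    # of the signed value as an unsigned 32-bit number.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


if __name__ == "__main__":
    print(to_ntstatus(-1073741819))
```

The result matches the access violation code reported in the exception message above.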

The only unit that had no error was one that had been started with D40.

http://einsteinathome.org/task/27206675

I switched back to D40.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1,516,715
RAC: 0

I have at least 1 invalid result that is not directly attributable to my messing around (possibly another). Both look like sync errors to me. Here is the latest.

5.2.13 BoincStudio 0.4b

2006-04-30 20:11:25.4101 [normal]: Optimised by akosf D41.12 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-04-30 20:11:25.4101 [normal]: Started search at lalDebugLevel = 0
2006-04-30 20:11:26.4413 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-04-30 20:11:26.4413 [normal]: No usable checkpoint found, starting from beginning.
2006-04-30 20:12:29.2694 [normal]: Fstat file reached MaxFileSizeKB ==> compactifying ... done.
2006-04-30 20:54:59.1757 [normal]: Search finished successfully.

PS
Changing your 'Write to disk at most every' setting may help to prevent this error. It should not be a near-integer multiple of the completion time.
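
A rough way to check this against the log above, as a sketch only: take the run time from the first and last timestamps and see how close it is to a whole number of write intervals. The function names, the 5-second tolerance and the 60-second interval are assumptions for illustration, not anything taken from BOINC or the science app, and this only tests the suspicion; it does not show that timing is the actual cause.

```python
from datetime import datetime

LOG_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp layout used in the log lines


def run_seconds(start: str, end: str) -> float:
    """Elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return delta.total_seconds()


def near_multiple(run_s: float, interval_s: float, tolerance_s: float = 5.0) -> bool:
    """True if run_s is within tolerance_s of a whole number of intervals."""
    remainder = run_s % interval_s
    # Distance to the nearest multiple, whether just below or just above.
    return min(remainder, interval_s - remainder) <= tolerance_s


if __name__ == "__main__":
    elapsed = run_seconds("2006-04-30 20:11:25.4101", "2006-04-30 20:54:59.1757")
    print(elapsed)
    # 60 seconds is an assumed interval; use your own preference value.
    print(near_multiple(elapsed, 60.0))
```

Swap in your own 'Write to disk at most every' value for the interval to try other settings.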

Brian
Joined: 25 Mar 06
Posts: 22
Credit: 80,237
RAC: 0

Message 29760 in response to message 29759

Quote:

I have at least 1 invalid result that is not directly attributable to my messing around (possibly another). Both look like sync errors to me. Here is the latest.

5.2.13 BoincStudio 0.4b

2006-04-30 20:11:25.4101 [normal]: Optimised by akosf D41.12 --> 'projects/einstein.phys.uwm.edu/albert_4.37_windows_intelx86.exe'
2006-04-30 20:11:25.4101 [normal]: Started search at lalDebugLevel = 0
2006-04-30 20:11:26.4413 [normal]: Checkpoint-file 'Fstat.out.ckp' not found.
2006-04-30 20:11:26.4413 [normal]: No usable checkpoint found, starting from beginning.
2006-04-30 20:12:29.2694 [normal]: Fstat file reached MaxFileSizeKB ==> compactifying ... done.
2006-04-30 20:54:59.1757 [normal]: Search finished successfully.

PS
Changing your 'Write to disk at most every' setting may help to prevent this error. It should not be a near-integer multiple of the completion time.

If you are referring to WU 7602326, again, one of the comps in that cluster was running a Linux client. Don't know yet if this is a trend, but it might be.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1,516,715
RAC: 0

There is no Linux here.
PS: I have increased 'Write to disk at most every' beyond the completion time. Will advise.
7602326
oooops
Sorry, Brian, I misunderstood your post initially. I think it is a timing issue between the science app and the manager.
