Need help with compute errors

mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828219831
RAC: 201209

RE: As the original poster

Message 89869 in response to message 89865

Quote:

As the original poster I would like to get back to my issue with compute errors.

I ran memtest86 without any reported memory errors. I ran two threads of prime95 for 24 hours testing first the cpu/fpu and again testing memory, without any reported errors.

And last I have run chkdsk 3 times without evidence of any physical disk problems, I have checked the "smartdrive" data against an archive data base of 2300 other hard disks like mine and
I will try again to run tasks with S5R5 and hope for better results.

Okay I have checked your results and found these errors:
(1)- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041D64F read attempt to address 0x00000035
(2)- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041D64F read attempt to address 0x000000A0
(3)- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041D340 read attempt to address 0x00000036
(4)- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0041D36F read attempt to address 0x00000004
(5)Error occured on Monday, January 5, 2009 at 04:49:50.
C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R4_6.05_windows_intelx86.exe caused an Access Violation at location 0041d36f in module C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R4_6.05_windows_intelx86.exe Reading from location 0000000d.
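For anyone comparing several crash reports, the exception records listed above can be tabulated with a short script. This is only a sketch: the regular expression is modelled on the exact wording of the lines quoted in this post, the "write" alternative is an assumption (only "read" appears here), and the 0x1000 threshold is an arbitrary illustrative cut-off. Reads from very low addresses are often associated with a null pointer plus a small offset, but that observation alone does not say whether the root cause is software or hardware.

```python
import re

# Matches lines shaped like the ones quoted above, e.g.
# "Reason: Access Violation (0xc0000005) at address 0x0041D64F read attempt to address 0x00000035"
# The "write" alternative is an assumption; only "read" appears in this thread.
RECORD = re.compile(
    r"Access Violation \((0x[0-9a-fA-F]+)\) at address (0x[0-9a-fA-F]+) "
    r"(read|write) attempt to address (0x[0-9a-fA-F]+)"
)


def parse_records(text):
    """Return (status_code, fault_address, access, target_address) tuples."""
    records = []
    for match in RECORD.finditer(text):
        code, fault, access, target = match.groups()
        records.append((int(code, 16), int(fault, 16), access, int(target, 16)))
    return records


def low_targets(records, limit=0x1000):
    """Keep records whose target address is below `limit` (hypothetical threshold)."""
    return [r for r in records if r[3] < limit]
```

Feeding it the text of a task's stderr output gives a list that is easy to sort by fault address, which makes repeated crash locations such as 0x0041D64F stand out.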

The above are either memory or hard drive errors. Maybe permission errors in that you are not an Admin and don't have permission to write to all parts of the drive?

Another common error you are having is for example:
2009-01-04 16:34:18.3437 [normal]: INFO: Couldn't open checkpoint h1_0850.95_S5R4__425_S5R4a_1_0.cpt

Do you keep suspended units in memory when Boinc swaps to another project? You can find this setting on the website under Your Account, Computing Preferences; under Processor Usage you will see "Leave applications in memory while suspended?". If this is set to NO please change it to YES.

One other thing, the last 4 units have all completed just fine and you currently have a cache of 2 units. It seems that your problem may have solved itself or that you found the problem, fixed it and may not even know it. Or you may know and just didn't report back that you had found a problem and fixed it. Either way I hope your problems are over and your future is bright!!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109411831144
RAC: 34874676

RE: The above are either

Message 89870 in response to message 89869

Quote:
The above are either memory or hard drive errors.


How can you be so certain? Why not a CPU error, or a floating point error, or swollen capacitors on the motherboard or a flakey PSU, or a whole range of other hardware or heat related issues? The OP did state that he tested his RAM with memtest and he tested his HDD as well, so you have picked the two things that are probably not involved.

Quote:
Maybe permission errors in that you are not an Admin and don't have permission to write to all parts of the drive?


Surely you are not serious?? If the app ran for a while and wrote quite a few checkpoints before crashing, how can it possibly be a write permission problem?

Quote:
Another common error you are having is for example:
2009-01-04 16:34:18.3437 [normal]: INFO: Couldn't open checkpoint h1_0850.95_S5R4__425_S5R4a_1_0.cpt


Did you miss the "[normal]: INFO:" comment that is prepended to what you are calling an "error message"? Of course when crunching first starts on a task, there cannot be a preexisting checkpoint file. The fact that a checkpoint doesn't yet exist is simply being reported as a normal informational message to announce the fact that crunching is starting from the very beginning rather than picking up somewhere in the middle.

Quote:
Do you keep suspended units in memory when Boinc swaps to another project?


Why would this matter?? It certainly has nothing to do with the OP's problem.

Quote:
If this is set to NO please change it to YES.


Why? If you are going to give blanket advice like this that is not related to the problem being experienced, at least set out the detailed pros and cons for each way that the option might be set. Why would the Devs make this an optional choice if it always should be set one way and not the other?

Quote:
... It seems that your problem may have solved itself or that you found the problem... Either way I hope your problems are over and your future is bright!!


At least we can agree on this :-).

Cheers,
Gary.

mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828219831
RAC: 201209

Edited for content...I did

Message 89871 in response to message 89870

Edited for content...I did not like my message!

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

Geeez, not again! I ran out

Geeez, not again! I ran out all my cache last night, Jan 28, and installed a newer version of MS .NET framework. I rebooted. Then just for sport I rebooted again with an intervening CHKDSK w/repair if necessary. No disk errors found; disk reported as healthy.

I started Boinc and downloaded 1 Rosetta task and 1 Einstein task. Rosetta seems to be running OK.

But Einstein started at 8:45 pm and by 9:11 had crashed with

 - exit code -1073741819 (0xc0000005)

1, INFO: Major Windows version: 5
c
2, WARNING: Fixing yLower (-389 -> 0) [HoughMap.c 771]

C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R5_3.01_windows_intelx86_2.exe caused an Access Violation at location 0041d4c0 in module C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R5_3.01_windows_intelx86_2.exe Reading from location 0000005a.
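(The decimal exit code and the hex value in brackets are the same number: the negative figure is the 32-bit status read as a signed integer. A minimal sketch of the conversion, with a hypothetical helper name:

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex status."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


print(exit_code_to_hex(-1073741819))  # prints 0xc0000005
```

So both figures point at the same access violation status.)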

And the next task crashed after a while longer

exit code -1073741819 (0xc0000005)

C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R5_3.01_windows_intelx86_2.exe caused an Access Violation at location 0041d4c0 in module C:\Program Files\BOINC\projects\einstein.phys.uwm.edu\einstein_S5R5_3.01_windows_intelx86_2.exe Reading from location 0000005a.

1, INFO: Major Windows version: 5
c
2, 3, c
4, 5, c
6, 7, c
8, 9, c
10, 11, c
12, 13, c
14, 15, c
16, 17, c
18, 19, c
20, 21, c
22, 23, c
24, 25, c
26, 27, c
28, 29, c
30, 31, c
32, 33, c
34, 35, c
36, 37, c
38, 39, c
40, 41, c
42, 43, c
44, 45, c
46, 47, c
48, 49, c
50, 51, c
52, 53, c
54, 55, c
56, 57, c
58, 59, c
60, 61, c
62, 63, c
64, WARNING: Fixing yLower (-5997 -> 0) [HoughMap.c 771]

My third task has been running for 2 hours, but I speculate that if it encounters a " Warning: Fixing yLower.. blah...blah..." message that it will die also. Does this not look like a bug? I'm anxiously waiting to see what the third task does.

I looked at some of the previous tasks that ran OK and none of them have the "Warning: Fixing yLower" message in them.
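Checking task output by eye gets tedious, so here is a rough sketch of how the check could be scripted. It assumes the stderr text of each task has already been saved or pasted into a string; the function name is made up for illustration, and the pattern is based only on the warning lines as they appear in this thread.

```python
import re

# Pattern modelled on lines such as:
# "64, WARNING: Fixing yLower (-5997 -> 0) [HoughMap.c 771]"
WARNING = re.compile(
    r"WARNING: Fixing yLower \((-?\d+) -> (-?\d+)\) \[(\S+) (\d+)\]"
)


def find_ylower_warnings(stderr_text):
    """Return (old_value, new_value, source_file, line_number) for each warning."""
    return [
        (int(old), int(new), src, int(line))
        for old, new, src, line in WARNING.findall(stderr_text)
    ]
```

An empty list for the tasks that validated and a non-empty list for each crashed one would support the correlation, though it would not show which is cause and which is symptom.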

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Just for fun, try to update

Message 89873 in response to message 89872

Just for fun, try to update your DirectX version to one of the latest updates. It may not help, but then again, it may do the trick.

mikey
Joined: 22 Jan 05
Posts: 11889
Credit: 1828219831
RAC: 201209

RE: Just for fun, try to

Message 89874 in response to message 89873

Quote:
Just for fun, try to update your DirectX version to one of the latest updates. It may not help, but then again, it may do the trick.

You can update to the latest DirectX, 9.0c, here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=2da43d38-db71-4c1b-bc6a-9b6652cd92a3&DisplayLang=en

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

RE: Just for fun, try to

Message 89875 in response to message 89873

Quote:
Just for fun, try to update your DirectX version to one of the latest updates. It may not help, but then again, it may do the trick.


Already use DirectX 9.0c if that's the current version, and don't use graphics or window savers.

My third task crashed also with same access violation and Warning
1, INFO: Major Windows version: 5
c
2, 3, c
4, 5, c
6, 7, c
8, 9, c
10, 11, c
12, 13, c
14, 15, c
16, 17, c
18, 19, c
20, 21, c
22, 23, WARNING: Fixing yLower (-29538 -> 0) [HoughMap.c 771]

The message says it is fixing yLower; does this mean it is trying to update a file or just something resident in memory?

Also FWIW, the first task downloaded after reboot also downloaded a new skygrid file "skygrid_0510Hz_S5R5.dat". The subsequent downloaded tasks did NOT download a new skygrid file; is the skygrid file the problem? This scenario seems reminiscent of what happened to me around Jan 17.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: Already use DirectX

Message 89876 in response to message 89875

Quote:
Already use DirectX 9.0c if that's the current version, and don't use graphics or window savers.


DirectX isn't just used by the graphics, but also by sound, network, monitor, input devices and other parts of Windows. Having just DirectX 9.0c doesn't mean you have the latest update of that version, it's updated once every 2 months. The link Mike gave you will update your DirectX 9.0c to the latest available.

Saying "I already have that, but I don't use it for such and so, so I don't need to try it" will definitely not fix things. You want to get rid of the errors, so you'll have to update, clean out, upgrade, delete, change whatever you can think of, or we can come up with that may have a chance to fix it.

And else just stop running Einstein on that computer. It hates this project, it's jinxed and needs an ugly voodoo lady with a flock of naked chickens.

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

RE: RE: Already use

Message 89877 in response to message 89876

Quote:
Quote:
Already use DirectX 9.0c if that's the current version, and don't use graphics or window savers.

DirectX isn't just used by the graphics, but also by sound, network, monitor, input devices and other parts of Windows. Having just DirectX 9.0c doesn't mean you have the latest update of that version, it's updated once every 2 months. The link Mike gave you will update your DirectX 9.0c to the latest available.

Saying "I already have that, but I don't use it for such and so, so I don't need to try it" will definitely not fix things. You want to get rid of the errors, so you'll have to update, clean out, upgrade, delete, change whatever you can think of, or we can come up with that may have a chance to fix it.

And else just stop running Einstein on that computer. It hates this project, it's jinxed and needs an ugly voodoo lady with a flock of naked chickens.


OK, I'll humor you. But the problem as I see it is that I can't run a controlled experiment. I can upgrade DirectX, but when I run new tasks there is no guarantee that the DirectX change is the only variable that has changed. A new task can bring new variables into the mix. If it runs OK, everyone immediately jumps to the obvious conclusion that upgrading DirectX fixed it, but it could instead be something different about the new task, which is not so obvious. In fact my machine often responds to repeated reboots as a tried-n-proven way to stabilize it...(mutter...mutter).

Nothing But Idle Time
Joined: 24 Aug 05
Posts: 158
Credit: 289204
RAC: 0

RE: Just for fun, try to

Message 89878 in response to message 89873

Quote:
Just for fun, try to update your DirectX version to one of the latest updates. It may not help, but then again, it may do the trick.

I did as you asked; the DirectX download replaced a few components but I'm still getting access violation and still getting this warning message which is obviously related to whatever my problem is.

60, 61, WARNING: Fixing yLower (-1902 -> 0) [HoughMap.c 771]
9, WARNING: Fixing yLower (-19084 -> 0) [HoughMap.c 771]

As soon as this message appears the task aborts. I've asked repeatedly in this thread for someone to explain to me what this message is for and what it portends, but no one has even tried to answer. So I guess nobody but Bernd knows and he's keeping it secret.

Well, I've tried and ain't tryin' no moh! Einstein doesn't want to play nice-nice with me so our long term engagement is over. A shame, too, 'cause I really liked supporting this project and its goal. Exasperating; I thought everyone's contribution was welcome but I guess not.
