!!! Unrecoverable error for result - exit code 99 (0x63)

Thomas Madigan
Joined: 26 May 05
Posts: 6
Credit: 4985448
RAC: 0
Topic 193310

For the past year, my PC has logged no more than 2 successful work units. The vast majority of them have returned an error or failed for 2 or 3 different reasons.

I've detached from the project, reset the project, reinstalled BOINC, reinstalled the latest version of BOINC, reinstalled my Operating System (for unrelated reasons - Windoze XP-Pro), initiated a thread almost identical in name to this one about a year ago, and yet, my PC is still crunching data but *RECEIVING NO CREDIT* for trillions of CPU cycles, kilowatts of wasted power, wasted bandwidth, Internet traffic. I'm a C programmer and a Physics and Astronomy Professor by profession. I consider myself fairly smart and adept at debugging code but since I didn't write E@H, I DON'T KNOW WHY I'M STILL GETTING EXIT CODE 99 (0x63). No one on this board seems to know SO WHO DO I ASK??!!! The "solutions" posted by all the part-time, self-appointed experts and sysadmins don't work. I've tried all their solutions and THEY DON'T WORK. You really need to post a COMPREHENSIVE list of error codes with SOLUTIONS THAT WORK. The programmers who write the bloody code need to be INTIMATELY involved with this message board and do due diligence for the tens of thousands of good people who are spending their time and treasure participating in this project - NOW!

I’m not going to bother posting the related messages produced in BOINC, BECAUSE, WITH THE EXCEPTION OF THE DATE, THEY’RE THE SAME AS THE LAST TIME.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 691126875
RAC: 256242

!!! Unrecoverable error for result - exit code 99 (0x63)

This is for host 965546, right? It shows a variety of errors: checksum, permission, errors related to seemingly inconsistent input file data which worked before ...

By resetting the project, reinstalling the software involved, etc., you have now excluded all the related software problems I can think of. From my perspective, I strongly suspect a hardware-related problem on this particular host, and not a general problem with what you (somewhat unfairly, IMHO) call "bloody code", which, by the way, seems to work pretty well on all the other PCs under your account (but admittedly not on host 965546). That host must be about 6 years old (production of PIIIs ended in early 2002), so the possibility of a hardware failure should not be dismissed lightly.

Bikeman

Thomas Madigan
Joined: 26 May 05
Posts: 6
Credit: 4985448
RAC: 0

RE: RE: For the past

Message 75324 in response to message 75323

Yes, this is for host 965546. Checksum (MD5) errors in what context? Compared to what? Processing (upload) size compared to server-side size? The "seemingly inconsistent input file data" [size, perhaps] I understand, and I think this may be a clue to the problem. Permission? Again, in what context?
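
For context on the "compared to what?" question: a checksum error generally means that the digest computed from the local copy of a file does not match a digest the sender supplied for it. The exact check the client performs is not shown in this thread, so the following is only a minimal sketch of the general idea. The function names `md5_of_file` and `matches_expected` are made up for illustration and are not part of BOINC or E@H.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Hypothetical helper: True if the local digest equals the expected one."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

If a file that passed such a check once fails it later without having been downloaded again, the bytes on the server are not what changed. The difference has arisen somewhere between the local disk, the RAM, and the CPU.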

I've been running Seti@home since 1999 and then migrated to BOINC when it became mandatory and have been successfully running BOINC ever since, all the while using the same processor and PC (PIII, 1.4 GHz) to participate in Seti@home, climateprediction.net and E@H. About a year ago, for no apparent reason and after working fine for as long as I've participated in E@H with BOINC, E@H started throwing errors and returning bad work units. Additionally, I have slower processors (one 800 MHz PIII and a 450 MHz PII) on this account that are working fine but take a while longer to complete a work unit. So, it is my considered opinion, not only as a professor of Astronomy and Physics at 3 separate institutions of higher learning but also as a former IT professional (C, C++ programmer), that it is the code or the *data that is downloaded* that is at fault, not my PC or its processor.

Based on your reply, it is safe to assume that the "exit code 99 (0x63)" error is a catch-all for errors that can't be compartmentalized or identified with any more specificity. A little more clarity in identifying the error is what is necessary.

Regards,
TM

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110046044589
RAC: 22534841

RE: ....I consider myself

Quote:
....I consider myself fairly smart and adept at debugging code but since I didn't write E@H, I DON'T KNOW WHY I'M STILL GETTING EXIT CODE 99 (0x63). No one on this board seems to know SO WHO DO I ASK??!!! The "solutions" posted by all the part-time, self-appointed experts and sysadmins don't work. I've tried all their solutions and THEY DON'T WORK.

Do you think it's a productive tactic to deliberately and quite unfairly denigrate the one person who actually tried to help you? You actually first posted on June 13 (and not a year ago as you claim) and only one person tried to make suggestions as to what might be causing the problem. So who are all these "part-time, self-appointed experts and sysadmins" who have offered you these unworkable solutions?

I realise you can be forgiven for feeling angry and frustrated with the problems you are encountering but blasting off at those who might otherwise be inclined to attempt to help is hardly a rational course of action to choose.

I assume you do realise there is a sticky thread at the top of this forum listing some of the client errors you might encounter? I also assume that you might have seen the author is listed as a Project Developer and if you've looked at his posting history you would have seen that he is a very regular contributor to the problem solving that is going on through all the message boards here. Wouldn't you consider that it might be much more productive to post all the details of your problem in that thread and attempt to engage in direct conversation with him? If you scan through that thread you will notice that he regularly tries to answer problems that others are reporting.

Quote:
You really need to post a COMPREHENSIVE list of error codes with SOLUTIONS THAT WORK.

If you take the trouble to read the sticky thread you will see that some of the errors are documented. You will also notice that many of the errors have multiple possible causes which are not fully understood. These are actively being worked on with the help of cooperative participants who draw particular errors to the attention of the developers. If what you are demanding was possible to produce, do you think the Devs would deliberately withhold it?

Unfortunately, we participants need to take some responsibility for what is happening on our own computers. Sure, there are bugs in the software which we ourselves have no direct way of fixing. In your case, however, it would seem more likely that your problems have a hardware component somewhere. I have quite a few machines with PIII Tualatin 1400 processors which are running the same software as you are but are not experiencing any of your problems. The things you need to specifically check include:

    * Is your machine overclocked?
    * What is the CPU fan speed?
    * When did you last clean the CPU heatsink fins?
    * Have you checked the CPU full load temperature?
    * Have you run a program like Memtest86 to check your RAM under load?
    * Have you tried removing or replacing RAM a stick at a time?
    * Have you done a full bad sector scan on your hard disk?

As a final comment, a couple of your most recent failures showed the problem when trying to read a saved checkpoint. I wonder what would cause information written to disk to fail to read back in correctly? Another failure was an Unhandled Exception error which had the debugging information attached. Bernd may be able to make some sense from this when he has a chance to look at it.
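
A crude way to probe the checkpoint question is to write known data to disk, read it back, and compare digests. This is only a sketch under assumptions: the function name `write_read_back_mismatches` and the default sizes are invented for illustration, and the read may be served from the operating system's cache rather than from the disk surface, so a clean run proves little. It is no substitute for Memtest86 or a full bad sector scan.

```python
import hashlib
import os


def write_read_back_mismatches(path, size_bytes=16 * 1024 * 1024, rounds=5):
    """Write random data to `path`, read it back, and compare digests.

    Returns the number of rounds in which the data read back differed
    from the data written. The scratch file is removed afterwards.
    """
    mismatches = 0
    for _ in range(rounds):
        data = os.urandom(size_bytes)
        written = hashlib.sha256(data).hexdigest()
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        with open(path, "rb") as f:
            read_back = hashlib.sha256(f.read()).hexdigest()
        if read_back != written:
            mismatches += 1
    os.remove(path)
    return mismatches
```

Any non-zero result on a scratch file would point at the machine itself (RAM, disk, controller or cabling) rather than at the downloaded data.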

I hope you will be able to find a hardware issue that solves most (if not all) of your current problems.

Cheers,
Gary.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 691126875
RAC: 256242

RE: Based on your reply,

Message 75326 in response to message 75324

Quote:


Based on your reply, it is safe to assume that the "exit code 99 (0x63)" error is a catch-all for errors that can't be compartmentalized or identified with any more specificity. A little more clarity in identifying the error is what is necessary.

Regards,
TM


If you inspect the results produced by your P-III, you'll see that it generates an abundance of different errors, most of them related to problems reading files. Please follow Gary's advice and perform some hardware checks on that machine. I'd focus on memory and disk problems.

Quote:

So, it is my concerted opinion, not only as a professor of Astronomy and Physics at 3 separate institutions of higher learning and a former IT professional (C, C++ programmer), that it is the code or the *data that is downloaded* that is at fault, not my PC or its processor.

Some of your failed results show the following symptoms: The app starts up OK, crunches along, then (e.g. when the computer is powered down) it stops. It does so several times but then, on the n-th restart, it fails when reading the downloaded data. As the app had run before, this data had already been read in successfully after it was downloaded. In all my experience as an IT professional, I'd say this might perhaps point to a memory or disk failure. Don't you think so?
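
The symptom described here (the same unchanged file reads fine several times and then fails) can be looked for directly by hashing one file repeatedly and counting how many different digests come out. This is a rough sketch, not a diagnostic tool: the names `file_digest` and `count_distinct_digests` are hypothetical, and repeated reads may come from the OS cache, which exercises RAM more than the disk.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_distinct_digests(path, reads=10):
    """Hash the same file `reads` times and count the distinct results.

    A healthy machine should always return 1 for a file that is not
    being modified; anything higher means the reads are inconsistent.
    """
    return len({file_digest(path) for _ in range(reads)})
```

Pointing `count_distinct_digests` at one of the project's downloaded data files while the client is suspended would be one way to try it.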

CU
Bikeman
