exit code 21

Canada_Fred
Joined: 25 Dec 05
Posts: 10
Credit: 14600
RAC: 0
Topic 190448

So far all my work for this project has dumped with exit code 21 and it's driving me nuts. The first result died at the 19-minute mark, but the rest seem to run for over an hour before being trashed. I've read past posts and the BOINC Wiki and searched Google with no success. I have one more result, 59 minutes in, which I suspect will meet the same fate on its next turn. Any ideas would be appreciated.

P4 1.8 GHz
Windows 2000 Pro
BOINC Manager 5.2.13
Einstein version 4.79

I had the Windows Installer Service disabled when I attached to the project, so is it possible something wasn't installed?

Here are my log entries covering the last trashed result.

2005-12-25 08:39:30 [Einstein@Home] Starting result l1_0977.5__0977.8_0.1_T05_S4lD_0 using einstein version 479
2005-12-25 09:39:30 [boincsimap] Resuming result 20051213.007235_2 using simap version 505
2005-12-25 09:39:30 [Einstein@Home] Pausing result l1_0977.5__0977.8_0.1_T05_S4lD_0 (left in memory)
2005-12-25 10:39:31 [boincsimap] Pausing result 20051213.007235_2 (left in memory)
2005-12-25 10:39:31 [Predictor @ Home] Resuming result bprion_5_116157_3 using mfoldB125 version 428
2005-12-25 10:42:12 [---] request_reschedule_cpus: process exited
2005-12-25 10:42:12 [Predictor @ Home] Computation for result bprion_5_116157_3 finished
2005-12-25 10:42:12 [Predictor @ Home] Starting result bprion_5_128374_4 using mfoldB125 version 428
2005-12-25 10:42:14 [Predictor @ Home] Started upload of bprion_5_116157_3_0
2005-12-25 10:42:14 [Predictor @ Home] Started upload of bprion_5_116157_3_1
2005-12-25 10:42:18 [Predictor @ Home] Finished upload of bprion_5_116157_3_0
2005-12-25 10:42:18 [Predictor @ Home] Throughput 7818 bytes/sec
2005-12-25 10:42:18 [Predictor @ Home] Started upload of bprion_5_116157_3_2
2005-12-25 10:42:19 [Predictor @ Home] Finished upload of bprion_5_116157_3_1
2005-12-25 10:42:19 [Predictor @ Home] Throughput 51924 bytes/sec
2005-12-25 10:42:22 [Predictor @ Home] Finished upload of bprion_5_116157_3_2
2005-12-25 10:42:22 [Predictor @ Home] Throughput 9678 bytes/sec
2005-12-25 11:42:12 [Einstein@Home] Resuming result l1_0977.5__0977.8_0.1_T05_S4lD_0 using einstein version 479
2005-12-25 11:42:12 [Predictor @ Home] Pausing result bprion_5_128374_4 (left in memory)
2005-12-25 11:59:51 [Einstein@Home] Unrecoverable error for result l1_0977.5__0977.8_0.1_T05_S4lD_0 (The device is not ready. (0x15) - exit code 21 (0x15))
2005-12-25 11:59:51 [---] request_reschedule_cpus: process exited
2005-12-25 11:59:51 [Einstein@Home] Computation for result l1_0977.5__0977.8_0.1_T05_S4lD_0 finished

Sharky T
Joined: 19 Feb 05
Posts: 159
Credit: 1187722
RAC: 0

exit code 21

Your results have this message in stderr out:
5.2.13
The device is not ready. (0x15) - exit code 21 (0x15)

The assumption xTemp >= 0 failed ... that should not be possible!!
DEBUG: loop=28254, xTemp=1760193.164606, f=977.796104, alpha=6, tempInt1[alpha]=25
DEBUG: skyConst[ tempInt1[ alpha ] ] = 1800.163814, xSum[ alpha ]=0.000000

*** PLEASE report this bug to ***

Can't recall that I've ever seen this before.
Maybe you should report this bug. ;)
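
For what it's worth, the "device is not ready" wording is most likely just the generic Windows description for numeric code 21 (ERROR_NOT_READY in the Win32 system error table), attached to whatever exit code the science app returned. It does not necessarily mean a real device failed. A minimal sketch of that kind of lookup, assuming a small hand-copied table; the function name `describe_exit_code` is made up for illustration and is not BOINC code:

```python
# Hand-copied subset of the Win32 system error table (winerror.h).
# Illustrative only; this is NOT how BOINC itself is implemented.
WIN32_ERRORS = {
    2: ("ERROR_FILE_NOT_FOUND", "The system cannot find the file specified."),
    5: ("ERROR_ACCESS_DENIED", "Access is denied."),
    21: ("ERROR_NOT_READY", "The device is not ready."),
}


def describe_exit_code(code: int) -> str:
    """Format an exit code in the same shape as the stderr line above."""
    name, text = WIN32_ERRORS.get(code, ("UNKNOWN", "Unknown error."))
    # Hypothetical extra: append the symbolic name in brackets.
    return f"{text} (0x{code:x}) - exit code {code} (0x{code:x}) [{name}]"


print(describe_exit_code(21))
```

So the number to chase is 21 as returned by the app, and the xTemp assertion in stderr is probably the more useful clue.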


Canada_Fred
Joined: 25 Dec 05
Posts: 10
Credit: 14600
RAC: 0

gravity waves or bugs...progress is progress :)

bug reported

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

This is just a guess... as a new member, you got the new application that was just released. You never had the _old_ application on your system. It's possible that something was "missed" in the list of things that have to be downloaded for the new application to work, and nobody else has seen it because whatever is missing from this one was already downloaded for the old one.

So you may have just saved the project from seeing hundreds of reports just like yours! That may not earn you any credits, but it certainly deserves a "thank you".

Welcome to Einstein, by the way. :-)

Stick
Joined: 24 Feb 05
Posts: 790
Credit: 33137945
RAC: 1033

RE: This is just a guess...

Message 22726 in response to message 22725

Quote:

This is just a guess... as a new member, you got the new application that was just released. You never had the _old_ application on your system. It's possible that something was "missed" in the list of things that have to be downloaded for the new application to work, and nobody else has seen it because whatever is missing from this one was already downloaded for the old one.

So you may have just saved the project from seeing hundreds of reports just like yours! That may not earn you any credits, but it certainly deserves a "thank you".

Welcome to Einstein, by the way. :-)

I have to disagree with Bill here. All your failed results look like the standard v4.79 type. Also, all your units look like they processed for a while before crashing (one of them processed for a little over an hour). Furthermore, all the error messages indicate some kind of "device problem" when they failed. Now I am guessing, but I wonder if you are running into an issue where your computer goes to sleep and/or hibernates and your disk suddenly becomes unavailable. Alternatively, the device that is "not ready" might be your display adapter, if you are using the BOINC screensaver. In any case, the error messages are not the usual ones we see for these kinds of problems. That makes me wonder whether you have been keeping your Windows 2000 current through Windows Update; sometimes older device drivers give strange error messages. But I am quickly getting beyond my level of competence. Hopefully somebody with a little more expertise will be along soon.

But I do agree with Bill on this issue - welcome to Einstein!

Canada_Fred
Joined: 25 Dec 05
Posts: 10
Credit: 14600
RAC: 0

I've been using this computer for distributed computing projects for about four years, so I know to turn all power management features off, including the monitor sleep mode and even the network card power-saving feature, because they can cause issues. I don't use a screen saver, not even the BOINC one. All automated tasks that need lots of cycles, like anti-virus scans and defragging, I run manually, and the timing of all other events doesn't line up with the error.

This may be something as simple as a file (or files) not getting installed because, as was previously suggested, they were already present for earlier members and the omission was just an oversight. Or there may be a driver that needs to exist and doesn't, for whatever reason. Or, as I mentioned before, the Windows Installer Service was turned off when I attached to this project, which may have prevented a critical file from being installed. If this is a driver conflict, then it is something unique to Einstein's code, because the machine continues to crunch happily with Predictor and SIMAP.

For now I've told BOINC not to download any new work for this project but I'll be hanging out here when I can. :)

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

Hopefully you'll get a quick response from the "bug" email, and it'll get you up and going. I haven't gotten any of the new WUs yet, still working off an old set - but when I do, I'll try to remember to check this thread, and I'll get a list of the files I have, to paste here. Then we can see what is missing from yours, if anything. It _shouldn't_ be related to Windows Installer Service; all necessary files other than OS stuff are supposed to be handled by BOINC itself.

Stick is right that it seems odd that it has such a "delay" before the error, if it's a missing file, but I've seen things happen on other projects in "phase two" of a result that didn't in "phase one". All the "normal" problems, you already know how to avoid (screensaver, virus scan, etc.), and unfortunately the new Einstein "Albert" app just hasn't had the "history" where we can spot previously-seen problems yet. Glad to know you're able to keep that CPU warm on other projects. I recently switched all my resource share from Predictor over to Rosetta as I prefer the way they're running their project, much more "active". Don't know anything about SIMAP. There's plenty to choose from!

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4316
Credit: 250799954
RAC: 34160

Well, this message was put there on purpose, as we made certain assumptions when we optimized the code. However, in the searches we are running on Einstein@Home this issue should not normally occur.

Usually this points to a problem in the FPU of the computer, probably due to overheating. If, on the other hand, this were a real problem in the code or the search parameters, it should happen on all results for a given workunit. Unfortunately, no results for the WUs in question have been returned from other machines yet, so we'll have to wait a bit longer to see where this comes from.

BM
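
The host-versus-workunit reasoning above can be sketched in a few lines. This is only an illustration under stated assumptions: the function `classify_failures`, the tuple layout, and every workunit and host name in the sample are invented, not taken from the project's database or tools.

```python
from collections import defaultdict


def classify_failures(results):
    """Group (workunit, host, succeeded) tuples and guess where a fault lies.

    Returns a dict mapping each workunit to one of:
    "ok", "undetermined", "workunit or code", "host".
    """
    by_workunit = defaultdict(list)
    for workunit, host, succeeded in results:
        by_workunit[workunit].append((host, succeeded))

    verdicts = {}
    for workunit, outcomes in by_workunit.items():
        hosts = {host for host, _ in outcomes}
        failures = sum(1 for _, succeeded in outcomes if not succeeded)
        if failures == 0:
            verdicts[workunit] = "ok"
        elif len(hosts) < 2:
            # Only one machine has reported: nothing to compare against yet.
            verdicts[workunit] = "undetermined"
        elif failures == len(outcomes):
            # Fails everywhere: points at the code or search parameters.
            verdicts[workunit] = "workunit or code"
        else:
            # Fails on some machines only: points at those machines.
            verdicts[workunit] = "host"
    return verdicts


# Hypothetical sample data; all names are made up.
sample = [
    ("wu_A", "host_1", False),
    ("wu_A", "host_2", True),
    ("wu_B", "host_1", False),
    ("wu_C", "host_2", False),
    ("wu_C", "host_3", False),
]
for workunit, verdict in classify_failures(sample).items():
    print(workunit, "->", verdict)
```

In this thread every affected workunit is still in the "undetermined" state, since no other machine has returned a result for it yet.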

Canada_Fred
Joined: 25 Dec 05
Posts: 10
Credit: 14600
RAC: 0

I did some more digging into my system and found my disk caching was turned off so I enabled it. I downloaded one work unit as a test unit and adjusted my project priorities so if it's going to crash it will happen overnight tonight. I'll let everyone know if it worked.

Canada_Fred
Joined: 25 Dec 05
Posts: 10
Credit: 14600
RAC: 0

Nope, that wasn't it. Predictor and SIMAP apps are much happier now so it wasn't a total waste of effort.

Dumped after 39 seconds. :-(

5.2.13
The device is not ready. (0x15) - exit code 21 (0x15)

The assumption xTemp >= 0 failed ... that should not be possible!!
DEBUG: loop=19727, xTemp=1759691.473442, f=977.566497, alpha=18, tempInt1[alpha]=73
DEBUG: skyConst[ tempInt1[ alpha ] ] = 1800.073427, xSum[ alpha ]=0.000000

*** PLEASE report this bug to ***
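
One thing that stands out in both stderr dumps in this thread: the assertion says `xTemp >= 0` failed, yet the xTemp value printed on the DEBUG line is a large positive number. A rough sketch that pulls the fields out of the two DEBUG lines to check this (the regex and the function name `parse_debug` are my own, and the printed value may not be exactly the value the app tested, so treat it as a hint rather than proof):

```python
import re

# The two DEBUG lines quoted in this thread, copied verbatim.
DEBUG_LINES = [
    "DEBUG: loop=28254, xTemp=1760193.164606, f=977.796104, alpha=6, tempInt1[alpha]=25",
    "DEBUG: loop=19727, xTemp=1759691.473442, f=977.566497, alpha=18, tempInt1[alpha]=73",
]

# Matches name=value pairs, where the name may carry an index such as [alpha].
FIELD = re.compile(r"([A-Za-z_]\w*(?:\[\w+\])?)=(-?\d+(?:\.\d+)?)")


def parse_debug(line: str) -> dict:
    """Turn the name=value pairs of one DEBUG line into a dict of floats."""
    return {name: float(value) for name, value in FIELD.findall(line)}


for line in DEBUG_LINES:
    fields = parse_debug(line)
    print(f"loop={int(fields['loop'])} xTemp={fields['xTemp']} "
          f"non-negative as printed: {fields['xTemp'] >= 0}")
```

If the printed numbers are the ones that were compared, a failed `>= 0` check on them would be consistent with the comparison itself misbehaving on this machine, which fits the FPU suggestion made earlier in the thread.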

Sharky T
Joined: 19 Feb 05
Posts: 159
Credit: 1187722
RAC: 0

Don't know if the big WU file can cause this problem (it looks like all your WUs are cut from the same data block).
I would try replacing it via "reset project" to get a new one and see if that helps.

