Hi,
These 2 WUs, 676685 and 673808, both crashed when they were 99.5% - 99.7% complete. I had taken a backup 3 hours before the first one crashed, so I reverted to it thinking that it was a one-off problem, but it crashed again when it was almost finished. Unfortunately I didn't have a backup handy when the second one crashed, so I didn't have a chance to rerun it.
The error message for the WUs is as follows:
02/04/2005 15:27:59 - Einstein@Home - Unrecoverable error for result H1_0260.9__0261.4_0.1_T25_Test02_3 ( - exit code -1073741819 (0xc0000005))
03/04/2005 15:23:26 - Einstein@Home - Unrecoverable error for result H1_0260.9__0261.3_0.1_T25_Test02_6 ( - exit code -1073741819 (0xc0000005))
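A note on reading that exit code: the negative decimal and the hex value in parentheses are the same 32-bit number, and 0xC0000005 is the Windows status code for an access violation (the application touched memory it was not allowed to). The conversion can be checked with a minimal sketch like the one below. The function names and the one-entry lookup table are made up for this illustration and are not part of BOINC or Einstein@Home.

```python
# Minimal sketch: show that the signed exit code from the BOINC log and the
# hex value printed next to it are the same 32-bit number.
# The helper names and the lookup table are illustrative only.

KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",  # Windows access violation
}


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def describe_exit_code(code: int) -> str:
    """Return the hex form of the code plus a name if it is in the table."""
    unsigned = to_unsigned32(code)
    name = KNOWN_STATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


print(describe_exit_code(-1073741819))  # prints 0xC0000005 (STATUS_ACCESS_VIOLATION)
```

This only identifies the kind of crash. It does not say why the application hits it near the end of the WU.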
If you look at these two WUs, you will see that others are experiencing client errors, too (3 client errors in the first one and 4 in the second). Is anyone out there having similar problems? Any ideas as to what might be happening?
A WU takes ~11 hours to crunch on my laptop, so ~22 hours for the two WUs + 3 hours for the rerun of the first WU from backup = ~25 hours of crunching have gone to waste! I have never experienced such crashes with Einstein before; it was quite consistent and stable until yesterday. I have another H1_0260.9 WU, if this one also crashes, I'm afraid I will have to say goodbye until a new kind of H1_XXXX WU is submitted.
Any kind of help, advice, clue, tip, etc. will be greatly appreciated, especially from project owners and admins.
Best regards,
Ertugrul.
Problem with H1_0260.9 WUs?
I have something rather strange going on here!
> I have another H1_0260.9 WU, if this one also crashes, I'm afraid
> I will have to say goodbye until a new kind of H1_XXXX WU is submitted.
Well, I took a backup just before this one was 99% through, and when I restarted it, it completed fine without any errors. But I still had the impression that those H1_0260.9 WUs were problematic, so I detached and reattached, and got a new set of WUs. And guess what, one of those crashed before my eyes at 99% after 11 hours of crunching, too!!! I HATE to waste WUs, so I went back to a backup I had taken when the WU was 30% through, and reran it from there. But this time I stopped and took a backup at 99%, restarted, and voila, it completed without any problem!
So I have a strange pattern like this:
1) Let the WU finish on its own, and get a crash,
2) Babysit the WU just before it's over, and you are OK.
I don't have any problem with babysitting, since I regularly back up my BOINC folder anyway, but this pattern seems totally illogical and annoying to me.
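For anyone who wants to try the same workaround, the backup step can be scripted. Here is a minimal sketch, assuming BOINC has been stopped first so the files are not changing during the copy. The function name and the example paths are hypothetical, so adjust them to wherever your BOINC folder actually lives.

```python
# Minimal sketch: copy the whole BOINC folder to a timestamped backup.
# Assumption: BOINC is not running while the copy is made.
# The function name and the example paths below are hypothetical.
import shutil
import time
from pathlib import Path


def backup_folder(src, dest_root):
    """Copy src into dest_root/<folder name>-<timestamp> and return the new path."""
    src = Path(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / f"{src.name}-{stamp}"
    # copytree creates the target (and missing parents) and fails if it already exists
    shutil.copytree(src, target)
    return target


# Example call (hypothetical paths):
# backup_folder(r"C:\Program Files\BOINC", r"D:\boinc-backups")
```

Restoring is the reverse: stop BOINC, replace the BOINC folder with the saved copy, and start BOINC again.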
I'm running einstein 4.79 with CC 4.25, BTW...
Any suggestions?
Ertugrul.