I've had another hang as well. Curious: is it happening just after one WU is reported to the server? I think that is what is going on. I have 2 Einstein WUs running at the same time. When one finishes and reports, the other freezes and a new one starts. At least that is what I have noticed. Anyone else?
Ok, ok - the 0.11 still suffers from the hang problem. We have already made some suggestions to David Anderson on how to fix this completely, but he is the one in charge of fixing it. We're waiting for that and will make a new App ASAP.
For the time being I suggest you quit and restart BOINC if you observe a hang. Enable the screensaver only if you can remotely log into the machine and kill a hanging einstein process (kill -QUIT) if necessary.
BM
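For anyone who wants to automate spotting the hang described above (process busy, progress frozen), here is a minimal sketch. It assumes you already have some way to record (timestamp, fraction done) pairs for a work unit; how you collect them is left open, and the name `is_stalled` and the 10-minute window are made up for illustration, not part of BOINC or the Einstein app.

```python
from typing import Sequence, Tuple


def is_stalled(samples: Sequence[Tuple[float, float]],
               window_s: float = 600.0) -> bool:
    """Return True if progress has not advanced for at least window_s seconds.

    samples: (timestamp in seconds, fraction done) pairs, oldest first.
    Both the sampling method and the default window are assumptions.
    """
    if len(samples) < 2:
        return False  # not enough history to judge
    latest_t, latest_p = samples[-1]
    # Progress values from samples that are at least window_s old.
    old = [p for t, p in samples if latest_t - t >= window_s]
    if not old:
        return False  # history does not yet cover the whole window
    # Compare against the most recent sample that is old enough.
    return latest_p <= old[-1]


# Progress frozen at 20% for the last 10 minutes.
hung = [(0, 0.10), (300, 0.20), (600, 0.20), (900, 0.20)]
# Progress still advancing.
healthy = [(0, 0.10), (300, 0.20), (600, 0.30), (900, 0.40)]
print(is_stalled(hung), is_stalled(healthy))
```

If this reports a stall, the workaround is still the manual one given above: quit and restart BOINC, or kill the hanging einstein process.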
Does that mean you don't want to hear that I had another work unit hang last night? ;-) Same as the last one: Activity Monitor shows Einstein running at full speed, but zero progress on the work unit. It seems to occur about every three or four units.
I will wait for the fix when it is available. With a 2x processing speed improvement, not much is lost as long as you pay attention. I used the scheduling feature in Energy Saver to do a daily restart automatically. I have it shut down at a specified time and then start back up 2-3 minutes later.
Team MacNN - The best Macintosh team ever.
LoL, Shaktai. Same thing here! Is there a way to reinstall Einstein .08? At least it didn't waste CPU cycles.
The auto restart thing seems like a reasonable solution (with the caveat that C@H might lose up to 15 min of work if it happens at the wrong time). Let everyone know how this works for you. I might try it as kind of a last resort. On my system, once things get into a rhythm, it runs for a while without issues. I would hate to stop it when it is working.
Regards
Phil
PS - Bernd, No criticism from here. All my posts are meant just to let you guys know what is happening and what I am trying. I agree that the speed increase is worth the extra effort. Keep up the good work. Looking forward to the next release.
We Must look for intelligent life on other planets as,
it is becoming increasingly apparent we will not find any on our own.
Does anyone out there know why E@H requests more work before it completes a WU? Minervini is correct that this occurs 30 min before E@H completes the running WU. On my system the current WU stops for a moment while the new WU arrives. Sometimes work picks right up again, and sometimes it does not. I can't say that this is where all the hangs occur, but I do have more hangs near the end of a WU than in the middle.
I have been watching, and E@H is the only app that does this. All the rest of the apps request new work at the end of a WU. Now it is true that sometimes delivery of WUs for these other projects is delayed until later, but the request is always at the end of processing.
It could be that E@H could be fixed by ensuring that the request only occurs after processing of the current WU is complete. Sort of "Ok I'm done, give me another one."
Just a thought
Phil
Two problems...
1. At some point BOINC stops showing any progress. Quitting and restarting updates the progress.
2. After peaking at about twice the average rate it had before the new software, the average rate has been dropping for over a week and is now halfway back to its prior value.
I would agree that the hangs slow progress (item 1) and therefore adversely affect the average rate of production (item 2). But we are really talking about a workaround, and a fix is coming soon (I hope). So the important part of this is that it requires extra human monitoring time (or Shaktai's automated approach). In any case a solution will be found. We should all just keep reporting findings here so the programmers have something to go on. While I feel certain that part of the problem is with the app, I can also see that the BOINC Client has some issues that should be addressed. The Client issues may take a lot longer to be addressed.
Regards
Phil
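As a rough back-of-the-envelope check on how the hangs (item 1) could account for the falling average (item 2), here is a small sketch. It assumes the drop is caused entirely by time spent hung, which nobody in this thread has confirmed, and the function name `hung_fraction` is made up for illustration. The inputs are the figures reported above: about 2x at peak and halfway back to the old rate, i.e. about 1.5x.

```python
def hung_fraction(speedup: float, observed: float) -> float:
    """Fraction of wall-clock time lost to hangs.

    Assumes (hypothetically) that the app produces at `speedup` times the
    old rate while running and at zero while hung, so that
        observed = speedup * (1 - hung_fraction).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - observed / speedup


# Peak of about 2x, now halfway back to the prior rate (about 1.5x).
print(hung_fraction(2.0, 1.5))  # 0.25
```

Under that assumption the machine would be sitting in a hang about a quarter of the time, which is in the same ballpark as a hang every three or four units that goes unnoticed for a while.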
When BOINC progress appears to hang, the nice processes continue to run.
Quite right. Tonight I had a P@H WU and an S@H WU running on one CPU while an E@H WU ran on the other. Although I was uncertain about it, I let it run to see what would happen. They all ran to the next swap time and switched. All of them reflected the proper progress, although the two that shared a CPU only progressed about 2/3 as far as they would have by themselves.
At this point my main worry is a WU (w1_0979.5__0979.6_0.1_T05_S4hA) that shows up in my stats on the E@H site but is not in my BOINC queue. I think it may have been lost in uploading.
Any ideas from the E@H guys about this would be helpful.
Regards
Phil