Estimated run-time too high

mikey
Joined: 22 Jan 05
Posts: 11933
Credit: 1831905535
RAC: 212911

Message 90278 in response to message 90272

Quote:

Now the DCF for the Q6600 has increased to 5.87!!!
The DCF for the E4300 stays at 3.95.

So the estimated time rose again from 16 hours to 23 hours!

I only run Einstein.

You can manually edit the client_state.xml file and change the DCF to 1.000000. This is a workaround: it only changes the time-to-completion numbers and lets you get more work if needed. It is only necessary in extreme cases; as long as everything is running fine now, you do not need to do it. Eventually BOINC should settle back down and "fix" itself. If you choose to edit the file, make sure you exit BOINC first. Also, if you are running multiple projects, change ONLY this project's DCF, not the DCF of all the others too.
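
As a rough illustration of that manual edit, here is a minimal Python sketch. It assumes each `<project>` block in client_state.xml carries a `<master_url>` and a `<duration_correction_factor>` element; check your own file before relying on those names. The helper name `reset_dcf` is made up for this example. Exit BOINC and keep a backup copy of the file before trying anything like this.

```python
import xml.etree.ElementTree as ET


def reset_dcf(xml_text, project_url, new_dcf=1.0):
    """Return xml_text with the DCF of one project set to new_dcf.

    Assumed layout (verify against your own client_state.xml):
      <client_state>
        <project>
          <master_url>...</master_url>
          <duration_correction_factor>...</duration_correction_factor>
        </project>
      </client_state>
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for project in root.iter("project"):
        url = (project.findtext("master_url") or "").strip()
        if url != project_url:
            continue  # leave every other project's DCF untouched
        dcf = project.find("duration_correction_factor")
        if dcf is not None:
            dcf.text = f"{new_dcf:.6f}"
            changed += 1
    if changed == 0:
        raise ValueError(f"no DCF entry found for {project_url}")
    return ET.tostring(root, encoding="unicode")
```

To use it, read the state file into a string, pass the project's master URL exactly as it appears in the file, and write the result back. Re-serialising may change whitespace in the file, so editing the one value by hand in a text editor is just as good.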

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1206151
RAC: 0

OK, I have now let my cache run empty, uninstalled 6.4.5 and installed 6.6.3, even though it is a beta.

Seems fine so far.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109869202037
RAC: 30435430

Message 90280 in response to message 90279

Quote:
Seems fine so far.


While you were doing the version change, did you take the opportunity to correct the DCF back to a more reasonable value?

If not, are you getting a steady (10% of the difference) reduction (in both DCF and estimated crunch time) every time a task completes?
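
To make the "10% of the difference" idea concrete, here is a small Python sketch. It assumes a simplified rule in which each completed task moves the DCF 10% of the way toward the value the task actually implies (taken as 1.0 here). The function names and that rule are illustrative assumptions, not BOINC's actual code.

```python
def dcf_after_tasks(dcf, tasks, true_ratio=1.0, step=0.10):
    """DCF after `tasks` completions, closing `step` of the gap each time."""
    for _ in range(tasks):
        dcf -= step * (dcf - true_ratio)
    return dcf


def tasks_until_below(dcf, target, true_ratio=1.0, step=0.10):
    """Count the completions needed before the DCF drops below `target`."""
    if target <= true_ratio:
        raise ValueError("target must be above true_ratio, or it is never reached")
    count = 0
    while dcf >= target:
        dcf -= step * (dcf - true_ratio)
        count += 1
    return count


# Starting from the DCF of 5.87 reported for the Q6600:
print(round(dcf_after_tasks(5.87, 1), 3))   # DCF after one completed task
print(tasks_until_below(5.87, 2.0))         # completions until DCF < 2.0
```

Under this assumed rule the correction is slow: the gap shrinks geometrically, so a badly inflated DCF needs many completed tasks to come back down.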

Cheers,
Gary.

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1206151
RAC: 0

Because I uninstalled BOINC completely, there is no client_state.xml left from 6.4.5.

The DCF in the new client_state file is 1.00000000.

And the first 4 WUs were estimated correctly (around 4 hours).

About the decrease and increase of the estimated time in 6.4.5:

I really don't know, and can't remember, when this started.

But overall I would say that the time decreases by 10% after each completed task, until it gets to around 15-16 hours, and then it increases rapidly to almost 30 hours. I don't know whether it increases after an S5R4 6.10 or an S5R5 3.01 task.

Because it was present on both of my machines, I think it is caused by BOINC 6.4.5.
That is what looks most suspicious to me, because BOINC 6.4.5 has not been installed for long on either rig.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109869202037
RAC: 30435430

Message 90282 in response to message 90281

Quote:
Because I uninstalled BOINC completely, there is no client_state.xml left from 6.4.5.

As I mentioned in an earlier message, uninstalling BOINC does not remove ANY of your configuration or state files. No files specific to a project are removed - unless you do so manually, afterwards. If you are uninstalling just to try out a different version of BOINC, you should not manually delete anything, particularly your state file.

If your state file had been removed, the hostIDs stored in it would have also gone and BOINC would have needed to assign new ones when starting up after the reinstall. I've just had a look and you still have the original hostIDs for your active hosts with no sign of any new ones, so your state file can't have been deleted. I'm puzzled as to how the DCF got back to 1.0 (the default for a new installation) when your state file seems to be otherwise unchanged.

It doesn't really matter any more now that things seem to have returned to normal. I guess it's possible that some of the other actions (like resetting the project or detaching) that you can choose from within BOINC Manager might reset the DCF to its default value without interfering with the hostID. I was just curious and you know what they say about curiosity :-).

Please let us know if the DCF starts increasing dramatically again. If it really was a problem with 6.4.5, that problem may still be present in a later version. If it had been identified and fixed, I'm surprised that 6.4.5 would still be left as the recommended version.

Cheers,
Gary.

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1206151
RAC: 0

I uninstalled the 6.4.5 AND removed the whole BOINC-folder after the uninstallation so all files were gone.

Then I installed the 6.6.3 and DCF was at 1.00000000 for default.

Just before the first four WUs were finished, the next four had been downloaded (because I set the cache to 0.1 days). All four had the right estimated times, because the DCF had not been corrected yet: no WU had been completed when the new ones came in (am I right?).

But after the first four had been uploaded, the DCF should have been corrected!? I didn't look into client_state.xml, because I thought everything was fine.

Then, when the second four WUs were almost done, the next 4 came in, also with the right estimated time. Because of that I was sure that everything was fine now and set the cache to 2 days. So I got around 30 WUs, I think, all with the right estimated time. Then I went to bed and I shut down my computer (for hibernation). This morning I booted it up from hibernate-mode and I was shocked, because the four WUs being crunched right now have a way too high estimated time (12 hours left with only 50% to go).

It's the same effect on my E4300 after resuming from hibernation.
I think it has something to do with a wrong calculation of wall time vs. CPU time, which gives a wrong CPU usage and therefore a way too high DCF.

But what surprises me this time is that the DCF only went to 1.3.
So I will wait until the current WUs are completed and see what happens to the new ones.

So I think it is the combination of BOINC 6.4.5 and later and the hibernation.

I will go back to a version older than 6.4.5.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2767944137
RAC: 988832

Message 90284 in response to message 90282

Quote:
Please let us know if the DCF starts increasing dramatically again. If it really was a problem with 6.4.5, that problem may still be present in a later version. If it had been identified and fixed, I'm surprised that 6.4.5 would still be left as the recommended version.


My belief is that there is a serious DCF problem in v6.4.5 - there were many reports of excessive runtime estimates, unexpected 'high priority' running, and similar troubles soon after release.

v6.4.5 was made official when it was because BOINC decided they absolutely had to have a CUDA support platform, and a high-profile CUDA project application, by a pre-determined date. Eric Korpela's Blog at SETI gives some of the story, but doesn't explain why the date chosen for a press release should govern the entire software release timetable: that seems to be putting the cart before the horse.

Anyway, BOINC v6.4.5 sort-of works, and I don't think the DCF bug is serious enough to withdraw the whole CUDA project. [There might have been other reasons for that - the SETI app needed three debug revisions before it even ran reliably without crashing the host computer - but DCF isn't one of them]

There have been at least four BOINC versions since v6.4.5, but none of them have been fit to make it beyond alpha. So for the time being, we're stuck with v6.4.5 as the "recommended", "stable" version.

I'll repeat the advice I gave to CPDN: BOINC v6.4.5 is only a required upgrade if you have a CUDA-capable graphics card, and want to use it to crunch for a project that has a CUDA application for your platform. That pretty much narrows it down to GPUGRID for Windows or Linux, and SETI for Windows. Anyone who doesn't fit that very narrow specification would be better off sticking with their current BOINC, or fetching an older version from the 'all versions' link on the BOINC download page.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Message 90285 in response to message 90283

Quote:
Then I installed the 6.6.3 and DCF was at 1.00000000 for default.

Be aware that 6.6.3 has a rather serious LTD calculation bug that has bitten everyone that has installed that version over at GPU Grid.

So far the "best" versions are 5.10.45 for those that do not need CUDA and 6.5.0 which is marginally better than 6.4.5 if you do need CUDA support.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109869202037
RAC: 30435430

Message 90286 in response to message 90283

Quote:
I uninstalled the 6.4.5 AND removed the whole BOINC-folder after the uninstallation so all files were gone.


So how did you manage to retain your original host ID??? I must be missing something because your host ID (along with a lot of other vital stuff - such as details of all tasks in progress) will be lost when you delete the remnants of the BOINC tree. In this situation you get issued with a brand new ID when you next install together with brand new data files and tasks. Your new tasks do have a different frequency so you obviously got new data files but you don't have a new ID.

For the benefit of anybody reading who is not sure, if you want to try out a different (either newer or older) version of BOINC to see if BOINC itself might be the cause of some problem, the procedure is very simple. There is no need to manually delete anything - all you need to do is:-

    * Stop BOINC
    * Uninstall the old version (Windows Add/Remove programs)
    * Install the version you wish to try out
    * Restart BOINC

Whatever tasks (for whatever projects) you have on board will continue to be crunched from where crunching left off when you stopped the previous BOINC.

Quote:
Just before the first four WUs were finished, the next four had been downloaded (because I set the cache to 0.1 days). All four had the right estimated times, because the DCF had not been corrected yet: no WU had been completed when the new ones came in (am I right?).


Yes. Whatever correction is needed for other tasks can only happen when a task finishes and the actual elapsed time is available. BOINC isn't even smart enough to change estimates for other tasks in any way even if the crunch time already exceeds the estimate when crunching is only partly complete. As an example, imagine the current estimate for a new task is 5 hours. Imagine the current task is at 80% completed with an elapsed time of 6 hours. There would be a BOINC estimate that this current task is still going to take another 1.5 hours to complete giving an all-up time of around 7.5 hours. If a new task were to be downloaded right now, BOINC would still assign the old estimate of 5 hours and would continue to do so right up to the point that the current task reached 100%. The instant that happened, and because it would be a big error in the estimate (5 hrs -> 7.5 hrs) BOINC would likely change all tasks on board by the full amount rather than just 10% of the difference.
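
The arithmetic in that example can be sketched in a few lines of Python. The rule in `corrected_estimate` is a simplifying assumption based on the description above (an underestimate is corrected in full, an overestimate by 10% of the difference); both function names are made up for illustration and this is not BOINC's actual code.

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Extrapolate a task's total run time from its progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def corrected_estimate(old_estimate_hours, actual_hours, step=0.10):
    """Estimate for the waiting tasks after one task completes.

    Assumed rule: an underestimate is corrected in full, an
    overestimate by `step` (10%) of the difference.
    """
    if actual_hours > old_estimate_hours:
        return actual_hours
    return old_estimate_hours - step * (old_estimate_hours - actual_hours)


total = projected_total_hours(6.0, 0.80)   # task at 80% after 6 hours
remaining = total - 6.0                    # what is left of this task
print(f"projected total {total:.1f} h, remaining {remaining:.1f} h")
print(f"estimate after completion: {corrected_estimate(5.0, total):.1f} h")
```

Until the running task actually reaches 100%, newly downloaded tasks would still be shown with the old 5-hour estimate; only the completion triggers the correction.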

Quote:
But after the first four had been uploaded, the DCF should have been corrected!? I didn't look into client_state.xml, because I thought everything was fine.


Actually, it's nothing to do with uploading and it's nothing to do with the fact you have 4 cores and 4 tasks 'in flight' at all times. The check (and any subsequent correction) is made precisely at the completion of each and every task. Usually real times and estimated times are close enough for BOINC to be happy to nudge the DCF, and hence the estimate, by 10% of the difference. Unless you are looking closely, the changes (of the order of seconds to perhaps a few minutes) often go unnoticed.

Quote:
... I shut down my computer (for hibernation). This morning I booted it up from hibernate-mode and I was shocked ...


I wish you had mentioned hibernation much earlier .... :-).

In theory hibernation is useful but .... I'll bet the story goes something like this. I'm surmising, but I'd be surprised if I'm way off the mark. In hibernate mode, no further processing is done but the clock keeps ticking over. BOINC is not smart enough to understand that it should be ignoring time that elapses during hibernation. The science app is able to properly account for actual CPU time so there is nothing strange about the reported CPU time that the task actually takes. BOINC, on the other hand, thinks that the task has actually been crunching for the whole time of hibernation so that when you come out of hibernation, BOINC gets a nasty surprise at all this extra time that has passed with no progress being reported by the science app. So, of course BOINC has to blow out the estimated remaining time and when the task finishes it will blow out the DCF and all the future estimates as well. During the daytime, BOINC will be able to correct downwards 10% each time a new task finishes, but the next night, have a guess what is going to happen ... :-).
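
A tiny Python sketch of that surmise: if the client counts wall-clock time, hours spent in hibernation are added to the task's apparent run time even though no CPU time was used. The function names and the numbers in the example are hypothetical, chosen only to show the effect.

```python
def apparent_run_hours(cpu_hours, hibernated_hours):
    """Run time as seen by a client that counts wall-clock time."""
    return cpu_hours + hibernated_hours


def inflation_factor(estimate_hours, cpu_hours, hibernated_hours):
    """Ratio of apparent run time to the original estimate."""
    if estimate_hours <= 0:
        raise ValueError("estimate_hours must be positive")
    return apparent_run_hours(cpu_hours, hibernated_hours) / estimate_hours


# Hypothetical task: estimated at 4 h, really needs 4 h of CPU time,
# but the machine hibernated for 1.2 h while the task was in progress.
print(f"{inflation_factor(4.0, 4.0, 1.2):.2f}")
```

With no hibernation the factor stays at 1.0. Every hibernated hour pushes it up, and a whole night of hibernation would push it much further than the short pause used here.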

This will all fit in very nicely with what you reported.

I'll bet you would be able to go back to 6.4.5 with no further problems if you shut down rather than hibernate any time you want to turn your machine off.

Quote:
But what surprises me this time is that the DCF only went to 1.3.


... pretty much what you would expect for a few hours hibernation. Imagine how this might accumulate over a week or more :-).

Quote:
So I think it is the combination of BOINC 6.4.5 and later and the hibernation.


You may find it will happen with all versions of BOINC and that the key to success is just to give hibernation a miss.

Cheers,
Gary.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Message 90287 in response to message 90286

Quote:
Quote:
I uninstalled the 6.4.5 AND removed the whole BOINC-folder after the uninstallation so all files were gone.

So how did you manage to retain your original host ID???

Could it be that he deleted the main BOINC folder but kept the data folder? Remember that version 6 splits the installation into 2 folders. All .xml files and project files are stored in the data folder, which usually resides in a hidden folder.
