Estimated run-time too high

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1,206,151
RAC: 0
Topic 194164

Hi!

I've noticed that my run-time estimates have risen, but I don't know exactly what caused this.

So, instead of 3-5 hours (that's the real run-time), BOINC estimates a run-time of 20-30 hours???
It doesn't matter whether the task is S5R4 or S5R5, and the presence of app_info.xml has no effect either.

This effect is visible on both of my machines (Q6600 and E4300, both with BOINC 6.4.5).

And the estimate doesn't decrease; it hovers around this value.

Does this have something to do with BOINC itself (6.4.5), or is it caused by Einstein?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,405
Credit: 53,432,389,379
RAC: 73,389,436

Quote:
I've noticed that my run-time estimates have risen, but I don't know exactly what caused this.


When you say 'rising' was it a sudden, once-off jump or a more gradual and progressive change?

Quote:
So, instead of 3-5 hours (that's the real run-time), BOINC estimates a run-time of 20-30 hours???
It doesn't matter whether the task is S5R4 or S5R5, and the presence of app_info.xml has no effect either.


That's very unusual and is most likely to be caused by a single rogue result (or rather rogue crunch time) that causes BOINC to think that all future tasks will take longer by such a large factor. BOINC will therefore make the increase in estimate in one big hit and then gradually decrease again as future tasks take the more normal values for crunch time. I've looked in the tasks lists for both machines and don't see any rogue results and also it would be very unlikely for this to be seen on two different machines simultaneously. So it would not appear to be due to a rogue crunch time unless such a result has already expired from the online database.

What I did notice was that both machines have each done R4 resend tasks recently and, as expected, these tasks did take more time than the surrounding R5 tasks, but not the 20-30 hours you mentioned. There would probably be some adverse and temporary effect on the DCF (duration correction factor) stored in your state file (client_state.xml) each time one of these transitions occurred, but I would think the changes would be small enough not to cause any real concern. It would be much more of a problem (as it was for the R3/R4 transition) if the project-supplied estimates had a large disparity for one run and not the other. This time, however, I think the estimates are much better and shouldn't be causing any such problems.

You should browse the DCF value in your state file and see if there are large changes after tasks from each different run complete.
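
If you'd rather not open the file by hand each time, something like the following could pull the value out for you. It's only a sketch: it assumes the state file stores the value as a simple `<tag>number</tag>` pair whose tag name contains "correction" (commonly `duration_correction_factor`, but treat that as an assumption for your client version), and the function names are made up for illustration.

```python
import re

# Matches simple pairs such as <duration_correction_factor>3.82</duration_correction_factor>.
# The tag name is not hard-coded: any tag containing "correction" is reported.
_PATTERN = re.compile(
    r"<(\w*correction\w*)>\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*</\1>"
)


def find_correction_factors(state_text):
    """Return (tag, value) pairs for every numeric tag whose name contains 'correction'."""
    return [(tag, float(value)) for tag, value in _PATTERN.findall(state_text)]


def print_correction_factors(path):
    """Read a state file (path supplied by the caller) and print what was found."""
    with open(path, encoding="utf-8", errors="replace") as f:
        for tag, value in find_correction_factors(f.read()):
            print(tag, value)
```

Call `print_correction_factors(...)` with the location of client_state.xml in your BOINC data directory after each task completes, and note which run (R4 or R5) the task belonged to.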

Quote:
This effect is visible on both of my machines (Q6600 and E4300, both with BOINC 6.4.5).


Which is why you would tend to rule out a rogue result or a rogue benchmark run, both of which could screw up the estimate for crunch time. You wouldn't expect both machines to suffer exactly the same fate at exactly the same time. Also, you should not lump both machines together since the quad core is rather faster than the dual core. Its 20 most recent results have taken from just under 4 hours to around 7.5 hours, whereas the range for the dual core is from about 6.5 hours to nearly 14 hours. Are you saying that both of these machines have time estimates of around 25 hours which are not decreasing at all? Also, the time estimates for future R4 tasks (if any) would be quite different to the time estimates for R5 tasks. However, a time estimate for an R4 task shouldn't be anywhere near 20 hours, even for the slower machine.

Quote:
And the estimate doesn't decrease; it hovers around this value.


Are you really sure about that? BOINC should correct by 10% of the difference each time a new task is completed.

Quote:
Does this have something to do with BOINC itself (6.4.5), or is it caused by Einstein?


Unlikely in both cases since there are no other reports that I know of where people are experiencing such dramatic changes.

Cheers,
Gary.

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 828,623,527
RAC: 215,555

Quote:

Does this have something to do with BOINC itself (6.4.5), or is it caused by Einstein?

I do think this is a bug in BOINC 6.4.5. I have a vague memory of reading about it somewhere.
I also got hit with the same thing, not on Einstein but on Seti: the estimates climbed rather quickly to ridiculous values. I manually edited the DCF in client_state.xml and changed the BOINC version to 6.6.0, and it has been fine so far.

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1,206,151
RAC: 0

Ok, thanks for your answers!

I don't remember whether the estimated time rose in one jump or steadily, but I think it was a jump after I installed BOINC 6.4.5.

I will let my cache run empty, then uninstall BOINC and reinstall it, and hope that this corrects it.

So far the estimated crunch time has decreased to 16 hours on my Q6600, but I have seen this before, and then it rises again to almost 30 hours.

One of the tasks being crunched right now has an estimate of 35 hours with 75% to go (S5R5, 25% in 1h 13min so far). Very strange. The other 3 tasks have 33% to go with an estimated time to completion of 10h!! (66% done in 3h 45min)

DCF is at 3.82 right now on the Q6600.
DCF is at 3.96 right now on the E4300.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 463,791,934
RAC: 21,425

Message 90271 in response to message 90270

Quote:

DCF is at 3.82 right now on the Q6600.
DCF is at 3.96 right now on the E4300.

This is way too high; anything between 1.0 and 2.0 is OK.

I wonder: wouldn't it be enough to just stop BOINC, edit the client_state.xml file (search for "correction") and restart BOINC?

Not sure what caused this in the first place, tho.

CU
Bikeman

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1,206,151
RAC: 0

Now the DCF for the Q6600 has increased to 5.87!!!
The DCF for the E4300 stays at 3.95.

So the estimated time rose again from 16 hours to 23 hours!

I only run Einstein.

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

Well, you are running 6.4.5, which is known to have problems with work fetch behaviour and other issues. Basically, it's an Alpha release (even though the minor version number implies otherwise) and should be considered 'junk' unless you are willing to deal with its idiosyncrasies with frequent manual interventions.

Since there is no CUDA app for EAH currently, and neither of your hosts could run it even if there was, you may want to consider rolling back to 6.2.x for now.

Alinator

Bluesilvergreen
Joined: 20 May 06
Posts: 23
Credit: 1,206,151
RAC: 0

OK!

I will try an older version.

But my Q6600 is capable of CUDA (8800GT ;-)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,405
Credit: 53,432,389,379
RAC: 73,389,436

Message 90275 in response to message 90270

Quote:
I will let my cache run empty, then uninstall BOINC and reinstall it, and hope that this corrects it.


You don't need to empty your cache. A reinstallation will pick up again exactly from where you were when you decided to uninstall. You don't even need to uninstall first (you can install 'over the top') but I always feel better doing so anyway :-). If there is any problem with 6.4.5, you may be better off trying a different version. I have no experience with 6.4.5. I decided to stay on 5.10.45 quite a while ago because I've never had any problems with it and new BOINC versions often seem to be pushed out before they are really ready for prime time - just a personal opinion.

Quote:
One of the tasks being crunched right now has an estimate of 35 hours with 75% to go (S5R5, 25% in 1h 13min so far). Very strange. The other 3 tasks have 33% to go with an estimated time to completion of 10h!! (66% done in 3h 45min)


A rapidly falling estimate of the remaining crunch time as crunching proceeds is exactly what you expect when the initial estimate before crunching started was way too large. The thing that is puzzling is why the estimate got so large in the first place. It's also very puzzling as to why it doesn't self correct. At one point I wondered if speedstep was kicking in and throttling back your CPU frequency but that should be visible in the form of extended crunch times at the lower frequency.
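
As a rough cross-check, a plain linear extrapolation from the progress figures quoted above gives a feel for where the estimates should end up. This is only a sketch of the arithmetic, not how BOINC itself computes the remaining time (the client's own formula isn't shown in this thread), and the helper names are made up.

```python
def extrapolate_total_minutes(elapsed_minutes, fraction_done):
    """Linearly extrapolate the total run time from the progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done


def format_hm(minutes):
    """Format a duration given in minutes as e.g. '4h52m'."""
    total = round(minutes)
    return f"{total // 60}h{total % 60:02d}m"


# Figures from the quote: 25% done after 1h 13min, and 66% done after 3h 45min.
for elapsed, done in [(73, 0.25), (225, 0.66)]:
    total = extrapolate_total_minutes(elapsed, done)
    print(format_hm(total), "total,", format_hm(total - elapsed), "remaining")
```

On those figures both tasks extrapolate to somewhere between about 4h50m and 5h40m in total, which is in line with the real run-times and nowhere near the 35 hours or 10 hours to completion being displayed.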

To make the estimate increase so much (i.e. to make the DCF increase from 3.82 to 5.87 as you mention in a later post) you usually need a completed task to have actually taken this huge amount of time. However, there is no evidence of this in the task times reported on the website. The most recent values listed there all appear to be pretty much as expected.

The other interesting parameter in your state file would be cpu_efficiency. This is really a measure of crunch time as a fraction of wall clock time. As an example, if a task had a recorded CPU time of 9 hours but the total elapsed time was 10 hours, the cpu_efficiency would be 0.9. If the host is not doing much else, you would expect values of 0.99+. It would be useful to know if there is anything unusual with your cpu_efficiency.

It's probably worthwhile trying to eliminate the BOINC version from being the problem by simply going back to an earlier recommended version. It's hard to believe that 6.4.5 would still be listed as recommended if it were this bad however, so I don't hold much hope.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,405
Credit: 53,432,389,379
RAC: 73,389,436

Message 90276 in response to message 90272

Quote:
Now the DCF for the Q6600 has increased to 5.87!!!

DCF doesn't change during the crunching of a task - it's not a continuously updating function. A change like this only happens just after task completion when BOINC can see the real time taken as compared to the original estimate. BOINC usually makes a change in DCF based on 10% of the difference between actual crunch time and the original estimate. Anytime the DCF is adjusted, BOINC also adjusts the estimates for all future tasks in your cache, including ones that are already partly crunched.

In cases where the difference between actual and estimated crunch times is large in the upwards direction, BOINC will not simply adjust by 10%. It will adjust by the full amount since such an increase might have implications for future work fetch (possible deadline issues).
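
To make the behaviour described in the last two paragraphs concrete, here is a simplified model of it. It is a sketch of the rule as described here, not the client's actual code: it treats any upward miss as "large" and jumps straight to the observed ratio, the 10% rate is applied only on the way down, and the 5-hour raw estimate and the function names are hypothetical.

```python
def update_dcf(dcf, raw_estimate_hours, actual_hours, downward_rate=0.1):
    """Return the new DCF after one task completes (simplified model).

    raw_estimate_hours is the estimate before the DCF is applied.
    """
    ratio = actual_hours / raw_estimate_hours  # the DCF this task "deserved"
    if ratio > dcf:
        # Task took longer than predicted: adjust upwards by the full amount.
        return ratio
    # Task was quicker than predicted: move only 10% of the way down.
    return dcf + downward_rate * (ratio - dcf)


def displayed_estimate(raw_estimate_hours, dcf):
    """Estimate shown for a queued task: the raw estimate scaled by the DCF."""
    return raw_estimate_hours * dcf


dcf = 3.82  # value reported earlier in the thread
for task in range(1, 6):
    # Hypothetical tasks that take exactly as long as the raw estimate says.
    dcf = update_dcf(dcf, raw_estimate_hours=5.0, actual_hours=5.0)
    print(f"after task {task}: DCF = {dcf:.2f}, "
          f"estimate = {displayed_estimate(5.0, dcf):.1f} h")
```

In this model a DCF of 3.82 only drifts down slowly when tasks run normally, while a single task would have to take 5.87 times its raw estimate to push the DCF up to 5.87.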

Without any evidence for vastly increased crunch times, it's hard to understand why BOINC is making this sort of an extremely large increase, unless there really is something to do with DCF that is badly broken in this particular version.

Cheers,
Gary.

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

Message 90277 in response to message 90274

Quote:

OK!

I will try an older version.

But my Q6600 is capable of CUDA (8800GT ;-)

Ahhh, yes.

You are showing as CUDA capable for the 6600 at SAH. However, here at EAH they aren't even detecting that yet.

Alinator
