New "Remaining (estimated)" time computation

George Johnson
Joined: 20 Nov 06
Posts: 2
Credit: 57,530,616
RAC: 0
Topic 197716

I was wondering if you could improve/correct your Remaining (estimated) time computation. It is very inaccurate! I also run Cosmology@Home and Milkyway@Home.

Their time remaining field computation is pretty accurate (within a few percent), compared to Einstein@Home. Current example for E@H: Progress 16.204% Elapsed 16:35:20 Remaining 580:23:20.

mikey
Joined: 22 Jan 05
Posts: 9,084
Credit: 1,092,008,569
RAC: 5,956,067


Quote:

I was wondering if you could improve/correct your Remaining (estimated) time computation. It is very inaccurate! I also run Cosmology@Home and Milkyway@Home.

Their time remaining field computation is pretty accurate (within a few percent), compared to Einstein@Home. Current example for E@H: Progress 16.204% Elapsed 16:35:20 Remaining 580:23:20.

The problem has to do with the units themselves. Some units are more complicated to crunch than others, and BOINC does not look at the overall time to crunch, just an on-the-fly calculation. An example: if you are buzzing along at 60 mph in your car and your destination is 60 miles away, your GPS will tell you that you have 60 minutes to go. But whoa, you hit a traffic jam. Your GPS recalculates and your time to destination goes from 60 minutes to 90 minutes, 120 minutes and so on, the longer you sit in that traffic. Then all of a sudden the traffic is gone and your GPS goes back to the original estimate in the 60 minute range, minus the time you spent stuck in traffic. All this means that when BOINC sees a slowdown in crunching it automatically extends the time-to-finish estimate, thinking the rest of the unit will take as long as the part you are on is taking. This could be correct, but at Einstein it usually is not. It is not something Einstein has any control over; it is an internal BOINC thing.

It is also something the BOINC developers, the main programmers, have been working on ever since BOINC was released many years ago. It IS better than it first was, but as you can see it can still give wildly inaccurate numbers sometimes.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,593
Credit: 85,633,479,288
RAC: 67,024,201


Quote:
... Current example for E@H: Progress 16.204% Elapsed 16:35:20 Remaining 580:23:20.


Unfortunately, it is the behaviour of BOINC that is now causing an ongoing problem for you. Yes, the badly estimated run times of the beta test tasks caused the problem in the first place but the problem will continue for a long time (perhaps months) without manual intervention.

You have two tasks showing on the website for your computer, one a completed beta test task and the other, an 'in progress' task from the recently started FGRP4 run. I presume the above progress figures relate to the 'in progress' task.

Some beta test tasks took a lot longer than the estimates and yours would have been one of these. When the task completed, BOINC would have noted the discrepancy between estimate and actual and would have adjusted the duration correction factor (DCF) to account for it. If the discrepancy had been a factor of 10 (it could have been even more), the DCF would be 10 times larger than it should be. This means that subsequent tasks (e.g. the one you are currently processing) will be estimated to take far longer than they actually will. On your above progress details, 16% in 16 hours would indicate that the task will take perhaps 100 hours in total, probably less. BOINC will be pretty lousy at refining the estimate of remaining time until right at the end.
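
For anyone who wants to check the "around 100 hours" figure, here is a minimal Python sketch of the straight-line extrapolation from the progress numbers quoted above. It assumes the task progresses at a constant rate, which is only a rough approximation. The function names are made up for this illustration and are not part of BOINC.

```python
def parse_hms(text: str) -> float:
    """Convert 'H:MM:SS' (hours may exceed 24) to hours as a float."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours + minutes / 60 + seconds / 3600


def linear_estimate(progress_percent: float, elapsed_hours: float) -> tuple:
    """Return (total_hours, remaining_hours), ASSUMING a constant progress rate."""
    fraction = progress_percent / 100.0
    if not 0.0 < fraction <= 1.0:
        raise ValueError("progress must be in (0, 100]")
    total = elapsed_hours / fraction
    return total, total - elapsed_hours


# Figures from the original post: Progress 16.204%, Elapsed 16:35:20.
elapsed = parse_hms("16:35:20")
total, remaining = linear_estimate(16.204, elapsed)
print(f"projected total {total:.1f} h, remaining {remaining:.1f} h")
# prints: projected total 102.4 h, remaining 85.8 h
```

So a naive extrapolation gives roughly 86 hours remaining, against the 580 hours that BOINC was displaying.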

Your biggest problem will be the fact that it will take an incredibly long time for BOINC to correct the damage to the DCF that the single test task has caused. Each subsequent task completing normally will cause a downward refinement to the DCF (in small steps) but it will take perhaps 20 or more to achieve a full correction. At several days per task, it's going to take a very long time. If it were my machine, I would correct the problem by manually editing the state file (client_state.xml) in the BOINC data folder to reduce the current DCF for the EAH project by a factor of 10. It doesn't really matter if you go too far with the reduction. The first completed task afterwards will immediately fix this.
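
To give a feel for why the downward correction is so slow, here is a toy model in Python. It is an assumption made for illustration, not the actual BOINC client code: the DCF jumps up immediately when a task runs longer than estimated, and drifts down by 10% of the gap per completed task otherwise. The 10% step and the function names are hypothetical.

```python
def update_dcf(dcf: float, ratio: float, smoothing: float = 0.1) -> float:
    """One toy update after a task completes.

    ratio = actual run time / uncorrected estimate for that task.
    ASSUMPTION (not taken from BOINC source): jump up at once,
    drift down by exponential smoothing.
    """
    if ratio > dcf:
        return ratio
    return (1.0 - smoothing) * dcf + smoothing * ratio


def tasks_until_close(dcf: float, ratio: float = 1.0, tolerance: float = 0.5) -> int:
    """Count tasks needed until dcf is within `tolerance` (relative) of ratio."""
    count = 0
    while abs(dcf - ratio) > tolerance * ratio:
        dcf = update_dcf(dcf, ratio)
        count += 1
    return count


# A DCF inflated to 10 when the correct value is about 1:
print(tasks_until_close(10.0))
# A DCF edited too low is repaired by the very next task in this model:
print(update_dcf(0.5, 1.0))
```

Under these assumed numbers it takes a couple of dozen normal tasks to get the DCF back near 1, whereas an over-corrected value recovers after a single task. That asymmetry is why going too far with a manual reduction does no harm.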

It's not normal to have to edit the state file. If you make a mistake you can do a lot of damage. You should not try to edit this file if you don't fully understand exactly what you are trying to achieve. If you are not comfortable with editing, you should set No New Tasks (NNT) for EAH and complete and return the current task. If you then leave and rejoin the project, you should end up with corrected values for all such parameters.

If you decide to edit, you have to stop BOINC and open the file with a plain text editor like Windows Notepad. You have to find the line <duration_correction_factor>x.xxxxxx</duration_correction_factor>, being careful to check that this line is within the block of lines that belong to the Einstein project. As an example, if you found the value to be 24.123456, you would edit it to be 2.412345 - a factor of 10 reduction. This is the only change you need to make. When you save the file, make sure it's still called client_state.xml precisely (no extra .txt extension).
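
For those who prefer a script to a hand edit, the same change could be sketched in Python roughly as below. It works on the file contents as a string and only touches the DCF inside the project block whose text contains a given fragment. The tag names (`<project>`, `<master_url>`, `<duration_correction_factor>`) and the sample contents are assumptions about the file layout, so check them against your own client_state.xml, work on a backup copy, and only with BOINC stopped. `scale_dcf` is a hypothetical helper, not a BOINC tool.

```python
import re

# ASSUMED layout: one <project>...</project> block per attached project.
PROJECT_BLOCK = re.compile(r"<project>.*?</project>", re.DOTALL)
DCF_LINE = re.compile(
    r"(<duration_correction_factor>)([0-9.]+)(</duration_correction_factor>)"
)


def scale_dcf(state_xml: str, url_fragment: str, divisor: float) -> str:
    """Divide the DCF by `divisor`, but only inside the matching project block."""

    def fix_block(match: re.Match) -> str:
        block = match.group(0)
        if url_fragment not in block:
            return block  # some other project: leave untouched
        return DCF_LINE.sub(
            lambda m: f"{m.group(1)}{float(m.group(2)) / divisor:.6f}{m.group(3)}",
            block,
        )

    return PROJECT_BLOCK.sub(fix_block, state_xml)


# Made-up sample, much shorter than a real state file.
sample = """<client_state>
<project>
    <master_url>http://example.org/other/</master_url>
    <duration_correction_factor>1.000000</duration_correction_factor>
</project>
<project>
    <master_url>http://einstein.example/</master_url>
    <duration_correction_factor>24.123456</duration_correction_factor>
</project>
</client_state>"""

print(scale_dcf(sample, "einstein", 10))
```

The script rounds to six decimals rather than truncating, so the last digit may differ from a hand edit. That makes no practical difference.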

After doing this, you can restart BOINC and your estimates will look a lot better. It would be very wise to stay with the FGRP4 run only if you don't want any further disruption to estimates on EAH. Because the DCF is 'per project' and not 'per science app', running different apps may cause problems when some apps take shorter or longer than what is built into a task.

Also, you should probably stay away from beta test apps. The deadlines are usually short and results need to be returned quickly. The beta test task you returned had an elapsed time of over 4 days. Because it exceeded the deadline, an extra copy was sent out and yours would NOT have received credit but for the fact that you managed to sneak in before the third one was returned. Your machine is a bit too slow to be suited to beta tests.

The other thing you should look into is the big time difference between CPU time and run time. This seems unusually large and is an indication that something else outside of BOINC is consuming *lots* of CPU cycles on your machine.

Cheers,
Gary.

George Johnson
Joined: 20 Nov 06
Posts: 2
Credit: 57,530,616
RAC: 0


Thanks for the info. I can wait :)

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 905
Credit: 25,166,626
RAC: 1


FYI, the estimated runtime has been improved...

Oliver

 

Einstein@Home Project
