I have noticed that the new S5R4 WUs are only scheduled, according to my client, to take 3-4 hours. Unfortunately they actually take 12-18 hours depending on which CPU they are running on (my two machines currently attached: AMD Athlon 2500, 3200). This resulted in my clients receiving way too much work for me to finish by the deadline.
Task duration
That's due to BOINC's result duration correction factor being set to a number that corrects the run time of the S5R3 tasks. It will have to learn again how long the new tasks take.
Since you knew the new search was coming, you could've told BOINC to keep a smaller queue of work. You can still do so now.
The new fpops_est seems to
The new fpops_est seems to require a _very_ different correction factor, especially compared to the "old" optimized apps.
Even with a cache setting of only 1.5 days, I received 16 workunits on a single Pentium III / 800.
It is possible, though, that removing the app_info.xml has reset the DCF to the default of 1.00.
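For anyone who wants to check, the current value is stored per project in the client's state file. A rough sketch of the relevant fragment of client_state.xml is below (element names from memory, value purely illustrative); stop the client before touching the file, and simply reading it is the safer option:

    <project>
        <master_url>http://einstein.phys.uwm.edu/</master_url>
        ...
        <duration_correction_factor>0.250000</duration_correction_factor>
    </project>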
RE: The new fpops_est seems
/me watching with interest and some dismay...
Have to correct my above
Have to correct my above post - I had looked at the wrong venue. The setting for that P3 is actually only 0.35 days; is that really good for 16 tasks?
RE: That's due to BOINC's
Well, I may have known it was coming, but I did not know it would need to re-learn the task duration factor, as this is the first time I have run into a situation like this. Your post seems a bit condescending; excuse me for my noobness.
Of course the logical thing to do is to reduce the work cache settings, which I did right away; however, that does not really help me after the fact. <_<
RE: It is possible though,
Nah. I removed mine and my RDCF is still at 0.427171 on this computer.
I only got 1 task, due to my very low additional work requests (0.05).
Unless there is some sort of
Unless there is some sort of dramatic speed increase during a task's runtime, my P4 is really struggling. It's been chewing on a task for about 2 hours and 40 minutes and is only at around 9.25% estimated completion.
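Just extrapolating linearly from those numbers (a rough estimate, assuming the reported percentage is trustworthy): 2 hours 40 minutes is 160 minutes, and 160 / 0.0925 ≈ 1730 minutes, so roughly 29 hours for the whole task.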
I found the problem now. It
I found the problem now. The scheduler does deliver S5R4 tasks to computers that still have an app_info.xml for S5R3.
Those tasks are rejected by the BOINC client (5.10.28 in this case):
but the BOINC client does not tell the scheduler that it has rejected them.
So the web site still lists the result for that box, although the box doesn't actually have it.
So that P3 box mentioned above must have received a bunch of S5R4 WUs before I removed the app_info.xml.
I guess this is either a configuration problem on the Einstein project side or a bug in this BOINC server-side version.
This seems quite critical to me because it will create thousands of ghost WUs.
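For context, under the anonymous platform mechanism the client will only accept tasks for applications declared in app_info.xml. A minimal sketch of the kind of file involved is below (app and file names are made up for illustration, not copied from a real Einstein@Home setup); any task sent for an app name not listed here, such as the new S5R4 app, has nothing to run against and gets discarded:

    <app_info>
        <app>
            <name>einstein_S5R3</name>
        </app>
        <file_info>
            <name>einstein_S5R3_opt_app.exe</name>
            <executable/>
        </file_info>
        <app_version>
            <app_name>einstein_S5R3</app_name>
            <version_num>404</version_num>
            <file_ref>
                <file_name>einstein_S5R3_opt_app.exe</file_name>
                <main_program/>
            </file_ref>
        </app_version>
    </app_info>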
RE: I found the problem
It's already been reported as a 'critical' bug in the BOINC server-side code: trac #713. It's been seen at SETI, SETI Beta and CPDN Beta as well, so it's nothing (special) to do with Einstein - just another BOINC trip-wire for Bruce to fall over during his server upgrade weekend.
RE: RE: That's due to
I've picked this post to reply to, not because I want to be condescending towards anyone or pick on anyone for perhaps "noobness" or otherwise. I just want to try to explain the situation and give advice on how to best rectify things. Right up front I want to assure people that it's quite easy to get back to some equilibrium.
Firstly a bit of history. The estimate of crunch time is set by the work unit generator (WUG). Back at the start of S5R3, the apps were rather slower than they were at the end. The WUG estimate was OK at the start of S5R3 but towards the end, the tasks were being done much faster than the WUG estimate. BOINC handles this by lowering the duration correction factor (DCF) stored in your host's state file (client_state.xml). BOINC knows nothing about the efficacy of the WUG estimate. It simply reacts to reality as it perceives it over time. BOINC learns to correct the now incorrect estimate built into the WUG by lowering the DCF. At the end of S5R3, it would not be uncommon for the DCF to be 0.25 or less just to cope with the old WUG estimate which is now hopelessly too long for reality.
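To a first approximation (this is a simplification, not the exact client code), the estimate the client shows works out as something like:

    estimated runtime ≈ (rsc_fpops_est from the WUG) / (host's benchmarked flops) × DCF

so when the WUG's estimate is too generous relative to the real work, the only knob BOINC has is to push the DCF well below 1.0.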
Enter S5R4, with a new WUG providing new and updated estimates. Your BOINC client cannot possibly know in advance about the changes in the new WUG so until it knows better (by relearning) it will assume that a DCF of 0.25 is still OK. On the other hand, the new WUG now knows a much better estimate of the real time so a new task now contains a realistic estimate of the true time but BOINC then applies the existing correction of say 0.25 and comes up with its own estimate that is 4 times shorter than it should be. When the first new task is completed, BOINC will get a very sudden wakeup call and will (in one big hit) change the DCF from 0.25 to perhaps a lot closer to the "ideal" value of 1.0.
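Plugging in the numbers from the opening post as an illustration (the exact figures are only for the example): if a new task really takes about 14 hours and the new WUG now estimates that correctly, a leftover DCF of 0.25 turns it into 14 × 0.25 = 3.5 hours on the client - which is exactly the "3-4 hours estimated, 12-18 hours actual" mismatch reported above. After the first new task completes, the DCF snaps back towards 1.0 and the estimates become sane again.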
Until BOINC has a chance to do this, the only real problem is that you will likely get a whole bunch of new work - more than your machine can handle - until the new DCF takes effect. This will only be a problem if you have left your cache size at something like 10 days. In that case you might end up with about 40 days of actual work.
Is this a big deal? Well actually, no it's not! All you have to do is firstly return your cache settings to something a bit more realistic, say 1 - 2 days at the most and then simply abort whatever tasks you feel are in excess of what your machine can comfortably handle. Immediately after you abort the excess, hit the "update" function so that the server can immediately be notified and then it will simply resend these excess tasks to someone else. End of problem. Very little bandwidth has been wasted because a task is just a very small set of parameters that tell the science app how to crunch the data that is already on your machine. Aborting tasks is NOT aborting large data files.
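For those who prefer the command line, the same abort-and-update sequence can be done with the client's command-line tool (boinccmd in current releases, boinc_cmd in the older 5.x clients); the task name below is just a placeholder:

    boinccmd --task http://einstein.phys.uwm.edu/ <task_name> abort
    boinccmd --project http://einstein.phys.uwm.edu/ update

Selecting the excess tasks in BOINC Manager's Tasks tab, hitting Abort and then Update on the Projects tab achieves exactly the same thing.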
Is there a way to avoid all this? No, not really. Whenever the WUG changes, BOINC can't know in advance whether the new crunch time estimates are good, bad, or indifferent. It will always have to relearn the new reality. We can assist by not leaving machines with overly large caches.
Cheers,
Gary.