FGRP4 Observations and Problems

Anonymous

Recently received 2 FGRP4 WUs. BOINC is estimating 54 Hours. Will wait and see. My "partner" on one WU is Nemo at the University of Wisconsin... :>)

Jasper
Joined: 14 Feb 12
Posts: 63
Credit: 4032891
RAC: 0

I got a couple a few minutes ago. There is still some crunching going on for FGRP3 resends, so those will be started sometime tonight.

They're LATeah0001D_80 and estimated at ... 69 hours, 57 minutes, 40 seconds! The deadline is set to two weeks, so that's OK. I'm holding my breath and hope it's not going to be anywhere near that figure! FGRP3 took around 12:30 on my system (estimates increased to some 22 hours after my DCF was destroyed by GW and BRP4), as long as I left it alone and didn't load it with heavy stuff like games or other demanding applications. We'll see... 😜 😳

Anonymous

Quote:
Recently received 2 FGRP4 WUs. BOINC is estimating 54 Hours. Will wait and see. My "partner" on one WU is Nemo at the University of Wisconsin... :>)

3 jobs completed in about 8 hours, not 54.

archae86
Joined: 6 Dec 05
Posts: 3165
Credit: 7409641687
RAC: 1907717

Regarding the very large (roughly 5x to 10x) overestimate of required completion time for FGRP4 work, on my flotilla of hosts, there appears to have been a sudden and very recent change to much greater realism.

As of this writing, on three out of three hosts with downloaded nonstarted work, ALL of the unstarted units display reasonably plausible completion time estimates on the order of a quarter of those shown up until today.

Completion time realism on work already started is a different matter: one unit, which started running just over an hour ago, shows a somewhat realistic remaining time, while units started between three and seven hours ago display inflated remaining times consistent with the previous state of affairs.

However, since all my unstarted units were downloaded within the last eight hours, it may be that download time is the changeover determinant, in which case it appears that the critical time is just about 12 hours ago at this writing.

It seems likely to me that someone turned a knob.

Of course a side effect of this change is more units are being downloaded to satisfy a stated queue size preference. Happily I am set to about two days, so expect not to be swamped. If you have taken extreme measures to get "enough" work under the previous regime, you may wish to alter settings to avert a deluge.

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4658
Credit: 3387081263
RAC: 1931898

I also see the reduced time estimates for recently downloaded FGRP4 tasks. But for me they now look far too low, about one quarter of the time the tasks will actually use. That's based on times I see on the host I am typing this on. Another host seems to have new estimates much closer to what I would expect.

archae86
Joined: 6 Dec 05
Posts: 3165
Credit: 7409641687
RAC: 1907717

Of my four hosts, three are desktop PCs which include an NVIDIA graphics card running only Einstein Perseus work, and have recently been running only FGRP4 work on their CPUs. On these hosts the DCF was moving up and down "chasing" the estimate error. FGRP4 work completion would give notice of an overestimate and the DCF would be adjusted down, then Perseus work completion would have the opposite effect. These three recently had DCFs in the 0.9 to 1.5 range, and with these DCFs were gravely overestimating FGRP4 completion times and moderately underestimating Perseus times.

My fourth is a laptop, with no GPU, a not especially fast CPU on which I only allow one task to run, and far less than 100% turn-on time. It was steadily grinding down the DCF slope, having currently reached 0.47188, which was nowhere near low enough to get an accurate estimate. Progress on this slope was slow because it was only completing a work unit about once every two days at best.

If someone here has a fast host running pure FGRP4 work at a high rate with no conflicting DCF propulsion from other work types, it is possible they may already have reached a DCF under which the "old style" downloads provided reasonably accurate estimates. I think in that case they will see a very large (perhaps 4x) upshift in DCF upon completion of the very first of the new style work units.
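The "chasing" and "grinding down" behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the update rule (jump up at once when a task was underestimated, drift down slowly when it was overestimated) and its constants are assumptions about how the BOINC client adjusts DCF, and the two raw ratios are made-up illustrative values, not measurements from any host in this thread.

```python
def update_dcf(dcf, raw_ratio):
    """Return the new DCF after one task completes.

    raw_ratio = actual runtime / uncorrected (DCF-free) estimate.
    Assumed rule: raise DCF immediately on an underestimate,
    lower it gradually on an overestimate.
    """
    adj_ratio = raw_ratio / dcf  # actual runtime / DCF-corrected estimate
    if adj_ratio > 1.1:
        return raw_ratio  # underestimate: jump straight up
    if adj_ratio < 0.1:
        return 0.99 * dcf + 0.01 * raw_ratio  # gross overestimate: barely move
    return 0.9 * dcf + 0.1 * raw_ratio  # overestimate: drift down slowly


def simulate(dcf, raw_ratios):
    """Apply update_dcf once per completed task; return the DCF history."""
    history = [dcf]
    for ratio in raw_ratios:
        dcf = update_dcf(dcf, ratio)
        history.append(dcf)
    return history


# Hypothetical raw ratios (assumptions for illustration only):
# CPU tasks finish in 1/5 of the raw estimate, GPU tasks take 1.5x of it.
CPU_RATIO = 0.2
GPU_RATIO = 1.5

# Mixed host: four CPU completions, then one GPU completion, repeated.
mixed = simulate(1.0, ([CPU_RATIO] * 4 + [GPU_RATIO]) * 3)

# CPU-only host: DCF decays steadily toward CPU_RATIO.
cpu_only = simulate(1.0, [CPU_RATIO] * 40)

print([round(x, 3) for x in mixed])
print(round(cpu_only[-1], 3))
```

In this model the mixed host never settles: each GPU completion resets the DCF upward and the CPU completions pull it back down, while the CPU-only host creeps toward the value at which its estimates would be accurate. A host that had already settled near a low DCF would jump straight to the new raw ratio on the first underestimated task.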

Phil
Joined: 8 Jun 14
Posts: 861
Credit: 814021521
RAC: 9685494

One of my machines is a mini-Mac i7 running all 8 threads on FGRP4-SSE2.

When FGRP first came back out, time estimates started around 60 hours, with an actual completion time of around 12 hours.

The estimated time eventually ground down to around 13 hours with actuals staying about the same.

The current batch of WUs has estimated completion times of 2 hours 13 min.

Something has definitely been adjusted.

Phil

I thought I was wrong once, but I was mistaken.

archae86
Joined: 6 Dec 05
Posts: 3165
Credit: 7409641687
RAC: 1907717

Quote:
The estimated time eventually ground down to around 13 hours with actuals staying about the same.


Could you take a look at the details page for that machine on your account and report what it is currently showing for Task Duration Correction Factor? Then again what it changes to after the first of the adjusted units is reported?

My guess is that it is currently showing something below 0.2, and will jump up by at least a factor of four on first report.

Phil
Joined: 8 Jun 14
Posts: 861
Credit: 814021521
RAC: 9685494

Current DCF is 0.263017.

I will post an updated one after some new units have completed.

Phil
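
A rough back-of-the-envelope check of the expected jump, using only figures quoted in this thread (an estimate of 2 hours 13 min, a DCF of 0.263017, and actuals of around 12 hours). It assumes that the displayed estimate is the raw estimate multiplied by the DCF, that actual runtime stays near 12 hours, and that the DCF moves straight to actual/raw after an underestimated task; none of these is confirmed here.

```python
# Figures quoted in the thread
displayed_estimate_h = 2 + 13 / 60   # 2 hours 13 min shown for new tasks
dcf_now = 0.263017                   # current Task Duration Correction Factor
actual_h = 12.0                      # assumed: actual runtime stays near 12 hours

# Assumption: displayed estimate = raw estimate * DCF
raw_estimate_h = displayed_estimate_h / dcf_now

# Assumption: after an underestimated task, DCF becomes actual / raw
dcf_after = actual_h / raw_estimate_h
jump_factor = dcf_after / dcf_now

print(f"raw estimate: {raw_estimate_h:.2f} h")
print(f"expected DCF: {dcf_after:.2f}")
print(f"jump factor:  {jump_factor:.1f}x")
```

Under these assumptions the DCF would land near 1.42, a jump of roughly 5.4x, which is simply the ratio of actual runtime to the displayed estimate.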

archae86
Joined: 6 Dec 05
Posts: 3165
Credit: 7409641687
RAC: 1907717

This post is with regard to variations in work unit length in FGRP4. As I did not pay close attention to this matter during the previous GRP work, I don't know if this is materially different, but I've noticed that in addition to the primary population of work, there is a considerable zoo of short units about.

If you look at the Stderr output listing on the task page after an FGRP4 job has finished, you can see two things which so far nearly agree with each other and which come pretty close to classifying the relative execution time needed:

The first states how many skypoints there will be, in a line like this one, found a dozen or so lines from the beginning, shortly before the big gaps start:
% Sky point 1/3

The other is near the very end, and is the last of a sequential set indicating checkpoints, with the last one looking like this:

% checkpoint 3
% Time spent on semicoherent stage: 4148.8073s
% Writing semicoherent output file.

% Following up candidate number: 1


and so on for five candidates...

Since I first noticed the skypoints number, which was only recently, it has so far either matched the maximum checkpoint number or been one more.
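
For anyone who wants to pull these two numbers out of a saved Stderr listing, here is a minimal sketch. The two line formats are taken from the excerpts above; the function name `summarize_stderr` is hypothetical, and the sample text (in particular the intermediate checkpoint lines) is an assumed reconstruction rather than a real task log.

```python
import re

# Patterns based on the two line formats quoted above
SKY_POINT_RE = re.compile(r"^\s*%\s*Sky point\s+(\d+)/(\d+)", re.MULTILINE)
CHECKPOINT_RE = re.compile(r"^\s*%\s*checkpoint\s+(\d+)", re.MULTILINE)


def summarize_stderr(text):
    """Return (total skypoints, highest checkpoint number) from stderr text.

    Either value is None if the corresponding line is not found.
    """
    sky = SKY_POINT_RE.search(text)
    total_skypoints = int(sky.group(2)) if sky else None
    checkpoints = [int(n) for n in CHECKPOINT_RE.findall(text)]
    last_checkpoint = max(checkpoints) if checkpoints else None
    return total_skypoints, last_checkpoint


# Assumed sample, pieced together from the excerpts in this post
sample = """\
% Sky point 1/3
% checkpoint 1
% checkpoint 2
% checkpoint 3
% Time spent on semicoherent stage: 4148.8073s
% Writing semicoherent output file.
% Following up candidate number: 1
"""

total, last = summarize_stderr(sample)
print(total, last)
```

Comparing the two returned values across several tasks is a quick way to check whether they keep matching or differing by one.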

On the "production" FGRP4 work, I have seen work units with at least the following numbers of skypoints:

3
5
7
12
13
16
19
20
21
23
26
27
30

Of these, the 30 checkpoint/30 skypoint type is by far the most common, so far.

Gary Roberts or another guru has indicated that the smaller jobs are termed "short ends" and perhaps a search on that term might give some insight into why these exist. I confess I don't know.

While broadly the number of skypoints correlates well to execution time, it is not the end of the story. There are definitely work units with the same number of skypoints but different predicted and actual execution time, and awarded credit.

During the testing phase of FGRP4 there were some units with many more skypoints than 30, but I have seen none in the "LAT...." production phase.

Also during the testing phase I saw strong correlation of execution time to a substring of the task name after the initial portion. In production this has largely disappeared, save that "16.0" tasks seem to me still to be highly likely to be very short.
