S6CasA and FGRP2 credit and estimated run time

low2
Joined: 27 Jan 08
Posts: 3
Credit: 32206
RAC: 0
Topic 197363

The ratio between the S6CasA CPU time (about 38 hours on my computer) and the FGRP2 time (52 hours) is 0.73:1, which is quite close to the ratio of the estimated computation sizes (100800 GFLOPs / 150000 GFLOPs), namely 0.672:1.

However, these ratios do not match the granted-credit ratio or the estimated-run-time ratio. The ratio of the credit granted for S6CasA (390.83) to that for FGRP2 (880) is 0.444:1 (which made it better to run FGRP2 tasks), and the estimated run times give a similar ratio: after running two S6CasA tasks which took about 40 hours each, the estimated run time of the FGRP2 tasks went up to 98 hours, a ratio of 0.408:1.
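In case anyone wants to check the arithmetic, here is a quick Python sketch of the ratios above (all the numbers are the ones quoted in this post):

```python
# Ratios quoted above, computed explicitly.
cpu_hours = {"S6CasA": 38.0, "FGRP2": 52.0}          # actual CPU time per task
est_gflops = {"S6CasA": 100_800, "FGRP2": 150_000}   # estimated computation size
credit = {"S6CasA": 390.83, "FGRP2": 880.0}          # credit granted per task

print(cpu_hours["S6CasA"] / cpu_hours["FGRP2"])    # ~0.731  actual run-time ratio
print(est_gflops["S6CasA"] / est_gflops["FGRP2"])  # 0.672   computation-size ratio
print(credit["S6CasA"] / credit["FGRP2"])          # ~0.444  granted-credit ratio
print(40.0 / 98.0)                                 # ~0.408  estimated run-time ratio
```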

I'm not sure about the new FGRP3 tasks, as my computer has only just finished two S6CasA tasks and two FGRP2 tasks. Has anyone else noticed this, or has it been posted before?

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7234907715
RAC: 1202361

S6CasA and FGRP2 credit and estimated run time

Estimated run times don't stay the same. As your host runs more tasks of a given type, a couple of twiddle factors are meant eventually to match the estimates to your host's overall processing capability and to the relative rates at which the different task types run.
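To illustrate the kind of twiddle factor I mean: one of them is the host-wide Duration Correction Factor (DCF), which the client nudges after every completed task. Here is a simplified sketch of the general "rise quickly, decay slowly" behaviour (an illustration of the idea, with an assumed 10% decay step; not the client's actual code):

```python
def update_dcf(dcf, estimated_secs, actual_secs):
    """Illustration only, not BOINC's actual code: the DCF jumps up
    immediately when a task overruns its estimate, but when tasks finish
    faster than estimated it only drifts down a fraction at a time."""
    ratio = actual_secs / estimated_secs
    if ratio > dcf:
        return ratio                      # rise quickly
    return dcf + 0.1 * (ratio - dcf)      # decay slowly (assumed 10% step)

dcf = 1.0
for est_h, actual_h in [(40, 98), (40, 42), (40, 42)]:  # arbitrary example hours
    dcf = update_dcf(dcf, est_h * 3600, actual_h * 3600)
    print(round(dcf, 3))   # 2.45, then a slow drift down: 2.31, 2.184
```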

So basing a credit complaint on early estimates is more likely an observation about inaccurate estimates than about unfair credit. It's better to base any concern about getting maximum productivity from your own machines on multiple actual run times, and to base any global fairness concerns on observed run times across a diverse set of machines--preferably one representative of the overall Einstein population.

I had not been running FGRP2 on any of my hosts, and when they initially ran FGRP3 the whole estimation scheme went askew for a little while. I posted about this in the Problems forum.

On my hosts, the last time I checked, I got slightly more credit per hour for CasA than for FGRP2 (so I ran CasA). This time around they get slightly more credit for FGRP3 than for CasA (so I've switched to FGRP3). I have little doubt that people with different configurations and processors will see somewhat different ratios.

low2
Joined: 27 Jan 08
Posts: 3
Credit: 32206
RAC: 0

This was not really a

This was not really a complaint; I just wanted to see if anyone knew a more specific reason for the difference. I had run a few other S6CasA and FGRP2 tasks before with similar results (actually, I remember the FGRP2 tasks took ~60 hours, which brings the ratio (0.667:1) even closer to the estimated computation size ratio). I was also partly curious about how the estimated run time is calculated; I had assumed it would be proportional to the estimated computation size.

Since there are not many other FGRP2 tasks left now, I couldn't really compare the FGRP2 and S6CasA tasks for other people. I'll just see how the new FGRP3 tasks go.

Have you run FGRP2 tasks before or not? You said that the S6CasA tasks gave more credit per hour than FGRP2 tasks, so did you mean you ran a test one? I don't really understand what you mean.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7234907715
RAC: 1202361

RE: Have you run FGRP2

Quote:
Have you run FGRP2 tasks before or not? You said that the S6CasA tasks gave more credit per hour than FGRP2 tasks, so did you mean you ran a test one? I don't really understand what you mean.

I think the BOINC adjustment processes don't work especially well when one has a mix of WU types running on the CPU from the same project. So generally I try to find which Einstein CPU WU type gives my particular systems the most credit per hour and standardize on that one. From time to time (usually when a new type of work shows up), I'll run a test for a few days, averaging the wall-clock time per WU on each host for each type of work and doing the simple arithmetic to compare credit productivity.
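The "simple arithmetic" is nothing more than credit per WU divided by average hours per WU; whichever type yields more credit per hour wins. A minimal Python sketch, using the figures from the opening post purely as example inputs:

```python
# Per-host measurements: (credit per WU, average wall-clock hours per WU).
# The numbers here are the opening post's figures, used only as an example.
measurements = {
    "S6CasA": (390.83, 38.0),
    "FGRP2":  (880.00, 52.0),
}

# Credit productivity = credit per hour; standardize on whichever type wins.
rates = {app: credit / hours for app, (credit, hours) in measurements.items()}
for app, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{app}: {rate:.2f} credit/hour")   # FGRP2: 16.92, S6CasA: 10.28
```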

My records suggest I last did this in early November 2013, and at that time found a GW6 advantage in credit per hour over what I labelled "GRP2" ranging from 0.99 (a very slight disadvantage) to 1.19 (a moderate advantage) across my five hosts--so from then until now I standardized on GW6. My most recent comparisons spanned a few days, and suggested that for my particular hosts GW6 falls behind FGRP3 by ratios in the 0.76 to 0.87 range.

See what I mean now?

low2
Joined: 27 Jan 08
Posts: 3
Credit: 32206
RAC: 0

I was just a bit confused

I was just a bit confused because in your first post you said you had not been running FGRP2 on any of your hosts. So do you mean that you only ever ran FGRP2 on your hosts to compare credit productivity, and never specifically ran FGRP2 tasks outside of those tests?

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7234907715
RAC: 1202361

RE: I was just a bit

Quote:
I was just a bit confused because in your first post you said you had not been running FGRP2 on any of your hosts. So do you mean that you only ever ran FGRP2 on your hosts to compare credit productivity, and never specifically ran FGRP2 tasks outside of those tests?


While your second sentence is true, what I actually meant was that when FGRP3 showed up (uninvited), my machines had not been running FGRP2 recently, and thus did not have up-to-date relative application performance fiddle factors for that application.

The host page at SETI displays something it labels "average processing rate", specific to each application. I assume this is the fiddle factor I refer to, that BOINC has such a number for Einstein work also, and that some of the oddities you and I have both observed relate to an initially mistaken value and the subsequent adjustment of that number.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2963795702
RAC: 711826

RE: The host page at

Quote:
The host page at SETI displays something it labels "average processing rate", specific to each application. I assume this is the fiddle factor I refer to, that BOINC has such a number for Einstein work also, and that some of the oddities you and I have both observed relate to an initially mistaken value and the subsequent adjustment of that number.


The APR is an integral part of the runtime estimation process brought into BOINC as part of the 'CreditNew' reforms - for good or ill. It is calculated and updated dynamically and automatically for each separate application version, so it should keep task runtime estimates realistic for each task type.
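In outline, the APR is a per-app-version running average of how fast a host actually gets through the estimated FLOPs, and the duration estimate is just estimated FLOPs divided by that rate. A simplified sketch of the idea (CreditNew's real statistics use sample thresholds and weighting not shown here):

```python
class AveragedProcessingRate:
    """Per-app-version running average of (estimated FLOPs / elapsed seconds).
    Simplified illustration of the CreditNew idea, not the server's actual code."""
    def __init__(self):
        self.samples = 0
        self.rate = 0.0  # FLOPs per second

    def update(self, est_flops, elapsed_secs):
        observed = est_flops / elapsed_secs
        self.samples += 1
        # Simple cumulative average; the real scheme weights samples differently.
        self.rate += (observed - self.rate) / self.samples

    def estimate_duration(self, est_flops):
        return est_flops / self.rate if self.rate else None

apr = AveragedProcessingRate()
apr.update(150_000e9, 52 * 3600)          # one FGRP2-sized task done in 52 hours
print(apr.estimate_duration(150_000e9))   # -> 187200.0 seconds (52 h)
```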

Einstein have chosen not to implement CreditNew here until its validity has been proven. As a consequence, they don't have the tool for dynamic management of estimates either: it would be truer to say that Einstein applies a (manual) fiddle-factor to try and keep the estimates reasonable.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7234907715
RAC: 1202361

RE: As a consequence, they

Quote:
As a consequence, they don't have the tool for dynamic management of estimates either: it would be truer to say that Einstein applies a (manual) fiddle-factor to try and keep the estimates reasonable.


OK, but something besides the host's overall Task Duration Correction Factor (DCF) offsets the host-specific prediction of execution time. At the transition from running GW6 as my exclusive CPU task--with quite accurate estimates across my flotilla of five hosts--I could easily see that the initial estimates for FGRP3 were much wider of the mark, and that through some oddity of the adaptation computation they actually got much worse after execution of the first FGRP3 task than they were at first download. The "much worse" part may just have been the huge leap that somehow got computed for the DCF on more than one of the hosts at first FGRP3 task completion, but even when the heavily GPU-dominated machines had ground their DCF back down to where the Perseus GPU task estimates were realistic, the FGRP3 estimates were about 50% high on average and only slowly trending down with accumulated experience.

So I'll readily take your insight, Richard, that APR is not it on Einstein, but there must be something host- and application-specific somewhere, in addition to the raw benchmark output and the host's overall DCF, mustn't there? Or is it not application-specific, but just an adjustment of GPU vs. CPU? Or something else?

I'm not trying to be argumentative here--just genuinely curious, and offering some tidbits of observation.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2963795702
RAC: 711826

RE: RE: As a consequence,

Quote:
Quote:
As a consequence, they don't have the tool for dynamic management of estimates either: it would be truer to say that Einstein applies a (manual) fiddle-factor to try and keep the estimates reasonable.

OK, but something besides the host's overall Task Duration Correction Factor (DCF) offsets the host-specific prediction of execution time. At the transition from running GW6 as my exclusive CPU task--with quite accurate estimates across my flotilla of five hosts--I could easily see that the initial estimates for FGRP3 were much wider of the mark, and that through some oddity of the adaptation computation they actually got much worse after execution of the first FGRP3 task than they were at first download. The "much worse" part may just have been the huge leap that somehow got computed for the DCF on more than one of the hosts at first FGRP3 task completion, but even when the heavily GPU-dominated machines had ground their DCF back down to where the Perseus GPU task estimates were realistic, the FGRP3 estimates were about 50% high on average and only slowly trending down with accumulated experience.

So I'll readily take your insight, Richard, that APR is not it on Einstein, but there must be something host- and application-specific somewhere, in addition to the raw benchmark output and the host's overall DCF, mustn't there? Or is it not application-specific, but just an adjustment of GPU vs. CPU? Or something else?

I'm not trying to be argumentative here--just genuinely curious, and offering some tidbits of observation.


Sure - I'm likewise just curious.

This project is one of the best for exercising this sort of curiosity, because the server logs are visible and accessible. Here's some food for thought from your two CUDA hosts. From:

http://einstein.phys.uwm.edu/host_sched_logs/4234/4234243
2014-02-01 14:04:16.4473 [PID=25747] [send] active_frac 0.999923 on_frac 0.969362 DCF 1.119750
2014-02-01 14:04:17.2456 [PID=25747] [send] [HOST#4234243] Sending app_version 521 hsgamma_FGRP3 2 109 ; 3.33 GFLOPS
2014-02-01 14:04:17.2481 [PID=25747] [send] est. duration for WU 183364804: unscaled 45032.53 scaled 52022.93
2014-02-01 14:04:17.2518 [PID=25747] [send] [HOST#4234243] Sending app_version 479 einsteinbinary_BRP5 2 139 BRP5-cuda32-nv301; 33.31 GFLOPS
2014-02-01 14:04:17.2539 [PID=25747] [send] est. duration for WU 183369314: unscaled 13509.76 scaled 15606.88

http://einstein.phys.uwm.edu/host_sched_logs/3409/3409259
2014-02-01 17:35:01.5184 [PID=28394] [send] active_frac 0.999940 on_frac 0.999714 DCF 0.892226
2014-02-01 17:35:01.9448 [PID=28394] [send] [HOST#3409259] Sending app_version 479 einsteinbinary_BRP5 2 139 BRP5-cuda32-nv301; 24.91 GFLOPS
2014-02-01 17:35:01.9463 [PID=28394] [send] est. duration for WU 183378964: unscaled 18063.77 scaled 16122.55

So there *is* some sort of fiddle-factor going on on the server. It thinks that app_version 479 is about a third faster on the first of the two (apparently identical) hosts - 33.31 GFLOPS vs. 24.91 GFLOPS. But by the time the estimates have been scaled (primarily by DCF), they turn out almost identical. Spooky.
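In fact the logged numbers are self-consistent if you assume the scheduler computes the unscaled duration as estimated FLOPs divided by the per-app-version speed, and then scales by DCF / (on_frac x active_frac). That formula is my inference from these logs, not something I've checked against the server source, but it reproduces all three estimates above:

```python
# Inferred scaling (not verified against the server source):
#   unscaled = rsc_fpops_est / app_version_flops
#   scaled   = unscaled * DCF / (on_frac * active_frac)
def scaled_duration(unscaled, dcf, on_frac, active_frac):
    return unscaled * dcf / (on_frac * active_frac)

# Host 4234243: DCF 1.119750, on_frac 0.969362, active_frac 0.999923
print(scaled_duration(45032.53, 1.119750, 0.969362, 0.999923))  # ~52022.9 (FGRP3)
print(scaled_duration(13509.76, 1.119750, 0.969362, 0.999923))  # ~15606.9 (BRP5)

# Host 3409259: DCF 0.892226, on_frac 0.999714, active_frac 0.999940
print(scaled_duration(18063.77, 0.892226, 0.999714, 0.999940))  # ~16122.6 (BRP5)

# The unscaled times also fall straight out of the GFLOPS figures:
# 13509.76 s * 33.31 GFLOPS ~= 18063.77 s * 24.91 GFLOPS ~= 450,000 GFLOPs per BRP5 WU
```

Which would also explain why the scaled estimates land so close together: each host's DCF has already absorbed the server's mis-estimate of that host's speed.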
