I tested whether setting a work cache of '0.1 days' would limit task fetching on two hosts that already had some results. At that point they also had about 1 day of work in queue. No. Those hosts kept downloading more work (1 task per contact) even after there was already about 100 hours of work in queue. I wonder if they would eventually have hit some kind of ridiculously high daily quota limit. It looks like the work cache setting can behave almost like a non-limiting ON/OFF switch with these tasks. I'd suggest a little bit of manual monitoring on that.
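(For anyone wondering which setting that is: I believe it's the 'Store at least ... days of work' value in the computing preferences, assuming the additional buffer is left at 0. 0.1 days works out to only 0.1 × 86400 = 8640 seconds of requested buffer, so queues of ~100 hours clearly shouldn't keep growing.)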
I see you are running BOINC 7.16.3 too. I had that same problem (even worse) on WCG after upgrading, and posted on their forum. Someone suggested it was 7.16.3, but I doubted it. Now I am beginning to wonder what is going on.
EDIT: I think (but am not sure) that it straightens itself out eventually. Maybe just some initial value is set wrong?
So here is a snippet from the last time my computer contacted the server. I only had 'Continuous Gravitational Wave search O2 All-Sky' checked.
That's interesting, thanks for posting. It might well be that this has something to do with the Boinc version.
I currently have all hosts set to 'no new tasks'. I'll open the gates on those two hosts with '0.1 days' tomorrow and see if the scheduler still thinks more work should be downloaded.
Jim1348 wrote:
EDIT: I think (but am not sure) that it straightens itself out eventually. Maybe just some initial value is set wrong?
Actually, just a moment ago with one other host I saw behaviour that could possibly support that straightening. This other host had a few dozen tasks in queue at its peak, but I had set it to 'no new tasks' and about 6 tasks were left. I tested what would happen now with '0.1 days'. The host started downloading tasks one at a time, but it stopped after there were 16 tasks total (of which 4 are running). The scheduler now says "Not requesting tasks: don't need". So it's happy with 0.1 days being only 16 tasks (which will take about 40 hours to drain out). That is positive already. I hope the other hosts will straighten out likewise.
REAL-TIME-EDIT: Haha... the scheduler was just bluffing me on that "positive" case. It clearly noticed I wasn't watching that host and had started to download more tasks behind my back. There are now 30 'in progress' and the number is still going up. 'No new tasks' for that system too! I absolutely can't leave the gate open for them. I'll see if something eventually changes, but I'm starting to believe this is happening for me because of BOINC v7.16.3. No problems with the O2MD1 itself whatsoever!
Something that might help is that I set all my machines to converge more rapidly to the correct value.
In the cc_config.xml file, insert this:
<cc_config>
   <options>
      <rec_half_life_days>1.000000</rec_half_life_days>
   </options>
</cc_config>
I am not having major problems at the moment, but am not sure everything is back to normal yet.
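(A general BOINC note, in case it saves someone a client restart: the client only reads cc_config.xml from the BOINC data directory at startup or on request, so after editing it you can reload it without restarting, e.g. with
boinccmd --read_cc_config
or via the Manager's 'Read config files' menu entry, which sits under Options or Advanced depending on the version.)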
For anyone running the default BOINC 7.14.2 version who changes to the experimental 7.16 branch: you will definitely see changes in work requests. There have been lots of changes in the latest client to how BOINC balances work fetch, deadlines and device busy time among projects.
The REC half-life change in cc_config is very helpful in balancing work requests among projects.
Thank you both !
I added that extra line to cc_config.xml now. Half-Life was a freakin' great game... and at least a couple of those great moments playing that masterpiece in my man cave back then... I should have recorded some of that atmosphere. I knew. That will be fixed now.
To get an idea of where each project stands in regard to other projects, turn on work_fetch_debug in the Event Log logging options for a brief period. Then look at the output and pay attention to the REC value for each project. That shows you the ratio of resource share between projects. For example, earlier today I changed my resource share for Milkyway from 100 to 200. That caused Milkyway to have to "catch up" to Seti: all my GPUs had to shift to exclusively running MW instead of their normal Seti work. My Seti resource share is 1000, and when I changed MW to 200 the ratio changed from 1/10 to 1/5. MW is now caught up, and you can see that the REC of Seti (59589923) divided by 5 now roughly matches the REC of Milkyway (11947152).
Mon 07 Oct 2019 03:18:27 PM PDT | | [work_fetch] ------- start work fetch state -------
Mon 07 Oct 2019 03:18:27 PM PDT | | [work_fetch] target work buffer: 43200.00 + 0.00 sec
Mon 07 Oct 2019 03:18:27 PM PDT | | [work_fetch] --- project states ---
Mon 07 Oct 2019 03:18:27 PM PDT | Einstein@Home | [work_fetch] REC 1524337.105 prio -9.940 can't request work: "no new tasks" requested via Manager
Mon 07 Oct 2019 03:18:27 PM PDT | GPUGRID | [work_fetch] REC 1846456.036 prio -1.000 can request work
Mon 07 Oct 2019 03:18:27 PM PDT | Milkyway@Home | [work_fetch] REC 11947152.895 prio -3.386 can't request work: scheduler RPC backoff (49.01 sec)
Mon 07 Oct 2019 03:18:27 PM PDT | SETI@home | [work_fetch] REC 59589923.555 prio -2.293 can't request work: scheduler RPC backoff (277.14 sec)
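(If you prefer setting that flag in a file rather than through the Manager, the same option can go in cc_config.xml via the standard log_flags mechanism; a minimal sketch:
<cc_config>
   <log_flags>
      <work_fetch_debug>1</work_fetch_debug>
   </log_flags>
</cc_config>
And to make the ratio above concrete: resource shares of 1000 (Seti) and 200 (MW) give a 5:1 target, and 59589923 / 11947152 ≈ 5.0, so the REC values really have settled at that ratio.)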
So after I made the changes to my preferences there is a difference. I'm now seeing the below, and have 8 new O2MD1 tasks in my cache.
2019-10-08 01:25:21.2616 [PID=9439 ] [version] Checking plan class 'LIBC215'
2019-10-08 01:25:21.2645 [PID=9439 ] [version] reading plan classes from file '/
2019-10-08 01:25:21.4638 [PID=9441 ] SCHEDULER_REQUEST::parse(): unrecognized: <allow_multiple_clients>0</allow_multiple_clients>
2019-10-08 01:25:21.2646 [PID=9439 ] [version] plan class ok
2019-10-08 01:25:21.2647 [PID=9439 ] [version] Best version of app einstein_O2MD1 is 1.01 ID 1188 LIBC215 (8.90 GFLOPS)
The amount of granted credit seems to be tied to the frequency of a task.
21.70 Hz ... 210 credits
21.80 Hz ... 220
22.10 Hz ... 430
22.40 Hz ... 440
42.00 Hz ... 810
43.40 Hz ... 840
55.55 Hz ... 1000
etc.
That feature makes sense with the freq-runtime-curve. Nice !
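(Reading the numbers above: doubling the frequency from ~21.7 Hz (210 credits) to ~43.4 Hz (840 credits) roughly quadruples the credit, at least for these examples, which would fit runtime growing faster than linearly with frequency.)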