Work is being created so slowly that the RTS buffer never has a chance to fill, because any task produced is downloaded immediately by the starving hosts.
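The arithmetic behind that can be sketched with a toy queue model. The rates here are made-up illustration numbers, not measured project figures: whenever aggregate demand exceeds the generation rate, the ready-to-send queue stays pinned at zero.

```python
# Toy illustration (assumed rates, not real project numbers): if starving
# hosts drain tasks as fast as the generator makes them, the ready-to-send
# (RTS) queue never grows.
gen_rate = 100          # tasks/minute produced (assumption)
demand_rate = 5_000     # tasks/minute requested by starving hosts (assumption)

queue = 0
for _ in range(60):                      # one simulated hour, per-minute steps
    queue += gen_rate
    queue -= min(queue, demand_rate)     # hosts take whatever is available
print(queue)                             # the buffer never fills
```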
magic_sam wrote:
Would it help if I aborted the FGRP#1 tasks I have on my PCs? I've been slowly stopping them and running other projects instead, to give you guys a chance to get some.
For the benefit of anyone checking here, the shortage of FGRPB1G work seems to be over - at least for the moment.
Cheers,
Gary.
There is a project-induced FGRPB1G work shortage currently. I am unable to retrieve any work for any of my hosts when I contact the scheduler.
I get a "no work is available" message on every request. The SSP shows plenty of work in the RTS category for the FGRPB1G and BRP4/7 tasks, but none is received when requested.
I am positive this is because of the beta GW O3MD* tasks, which have the work generators running wild. There are over a million tasks in the RTS buffer for that sub-project.
The effect is that the download servers are swamped by the GW work going out, and no other work makes it into the RTS buffer for those not running the beta work.
I am down over 2000 tasks in my caches across my five hosts, and will be out of work in less than 8 hours.
The admins need to turn off the beta task generators and deplete the massive overabundance of work in the RTS buffers.
Keith Myers wrote: The admins need to turn off the beta task generators and deplete the massive overabundance of the work in the RTS buffers.
I suspect there is an important additional element that may be more significant than the sheer volume of O3 GPU task generation. My hosts currently have thousands of pending tasks whose necessary quorum-partner task remains unsent, in some cases for over a week. Based on that, I suspect the rules for deciding which host is suitable to receive the second task in a workunit are so restrictive that hardly any host requesting work qualifies. Those tasks then just sit in the buffers.
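That hypothesis is easy to illustrate with a small simulation. This is not the actual Einstein@Home scheduler logic; the eligibility fraction and all the counts are assumed parameters, used only to show how a restrictive second-copy rule makes partner tasks pile up unsent.

```python
import random

# Hypothetical sketch (not real BOINC scheduler code): if only a small
# fraction of requesting hosts qualify to receive the second copy of a
# workunit (e.g. because they must already hold the right seed files),
# partially-sent workunits accumulate in the buffer.

def simulate(n_workunits, n_requests, eligible_fraction, rng):
    """Return how many workunits still lack their second task after
    n_requests work requests arrive at the scheduler."""
    pending = n_workunits          # workunits whose partner task is unsent
    for _ in range(n_requests):
        if pending and rng.random() < eligible_fraction:
            pending -= 1           # an eligible host takes the partner task
    return pending

rng = random.Random(42)
# With 1% of hosts eligible, almost all partner tasks stay unsent even
# after 10,000 requests against 5,000 pending workunits.
print(simulate(5_000, 10_000, 0.01, rng))
# With 50% eligible, the backlog clears almost completely.
print(simulate(5_000, 10_000, 0.50, rng))
```

The point of the sketch: the backlog is governed by the eligibility fraction, not by how fast hosts ask for work, which would match seeing thousands of tasks pending for over a week.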
This has happened every time GW work has been available: the scheduler tries to find suitable wingmen that already have the previously downloaded seed files for the work on board.
This demands attention from the admins. I don't understand why they continue to let it occur.
All that needs to be done is to put a regulator on the work generators to restrict the quantity of tasks in the work buffers, as has been implemented for the other sub-projects.
The download servers and schedulers work best when the RTS buffers are restricted to around 10,000 - 15,000 tasks, like the FGRPB1G and BRP4/7 RTS buffers.
That is sufficient to keep all hosts properly fed.
But when the GW work generators are left to run unchecked and generate over 2M tasks, things grind to a halt quickly.
Now only 3 hours of work left on my fastest host.
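The regulator being asked for is essentially the "cushion" pattern used by BOINC's sample work generator: only create new work while the unsent count is below a cap. The sketch below is a hedged Python model of that feedback loop, not Einstein@Home's actual daemon; `RTS_CAP`, `BATCH`, and the two callables are all assumed names for illustration.

```python
import time

# Hedged sketch of a work-generator regulator (modeled on the cushion
# pattern in BOINC's sample work generator, not the project's real code):
# only create workunits while the ready-to-send (RTS) count is below a
# cap, so one sub-project cannot flood the buffers.

RTS_CAP = 15_000        # assumed cap, per the 10,000 - 15,000 figure above
BATCH = 500             # workunits created per wakeup (assumed)

def regulator_loop(count_unsent, make_workunits, poll_seconds=10, cycles=None):
    """Generate work only while the RTS buffer is below RTS_CAP.

    count_unsent(): returns the current RTS backlog (assumed helper).
    make_workunits(n): creates n new workunits (assumed helper).
    """
    n = 0
    while cycles is None or n < cycles:
        backlog = count_unsent()
        if backlog < RTS_CAP:
            make_workunits(min(BATCH, RTS_CAP - backlog))
        else:
            time.sleep(poll_seconds)   # buffer full: idle instead of flooding
        n += 1
```

With a loop like this in front of each generator, the O3MD buffer would level off at the cap instead of growing past 2M tasks.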
Excess O3MD1 tasks have been purged, and FGRPB1G and BRP7 tasks are flowing normally again.
It was definitely the extraneous O3MD tasks clogging up the works, since the two events coincide. Maybe something like a full-disk scenario, or the system getting severely bogged down trying to manage so many tasks.
Yes, happy to see Bernd post that they goofed in letting the GW work generators run wild and foul up the works for the other sub-projects.
Overnight I refilled my caches of GR#1 work, so I am happy now.