FGRPB1G work shortage

mikey
Joined: 22 Jan 05
Posts: 12689
Credit: 1839094724
RAC: 3728

Keith Myers wrote:

magic_sam wrote:

Hi all,

What's the status of the FGRPB1G work shortage?

I just picked up 4 new tasks, even though the "tasks to send" counter remains at 0:

https://einsteinathome.org/server_status.php

Cheers, Sam

Work is being created so slowly that the RTS buffer never has a chance to fill, because every task produced is downloaded immediately by the starving hosts.

Would it help if I aborted the FGRP#1 tasks I have on my PCs? I've been gradually winding them down and running other projects instead, to give you guys a chance to get some.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117653062763
RAC: 35168459

For the benefit of anyone checking here, the shortage of FGRPB1G work seems to be over - at least for the moment.

Cheers,
Gary.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18742118197
RAC: 7013800

There is currently a project-induced FGRPB1G work shortage. I am unable to retrieve work for any of my hosts when I contact the scheduler.

I get a "no work is available" message on every request. The SSP shows plenty of work in the RTS category for the FGRPB1G and BRP4/7 tasks, but none is received when requested.

I am positive this is because of the beta GW O3MD* tasks, whose work generators are running wild. There are over a million tasks in the RTS buffer for that sub-project.

The effect of that is the download servers are getting swamped by the GW work going out and there is no other work making it to the RTS buffer for those not running the beta work.

I am down over 2000 tasks in my caches across my five hosts.  Will be out of work in less than 8 hours now.

The admins need to turn off the beta task generators and deplete the massive overabundance of the work in the RTS buffers.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7224424931
RAC: 1019852

Keith Myers wrote:

The admins need to turn off the beta task generators and deplete the massive overabundance of the work in the RTS buffers.

I suspect there is an important additional element that may be more significant than the sheer amount of O3 GPU task generation. My hosts currently have thousands of pending tasks for which the necessary quorum partner task remains unsent, in some cases after more than a week. Based on that, I suspect the rules for which hosts may receive the second task in a WU are so restrictive that hardly any host requesting work qualifies. Those tasks then remain in the buffers.
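A toy Python sketch of how a restrictive eligibility rule could leave quorum partners unsent. The rule used here (a host qualifies only if it already holds every data file the task needs) is an assumption for illustration, not the project's documented scheduler logic. `Host`, `eligible_hosts`, and the file names are all made up.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    files_on_disk: set[str] = field(default_factory=set)


def eligible_hosts(required_files: set[str], hosts: list[Host]) -> list[str]:
    """Names of hosts that already hold every file the task needs (assumed rule)."""
    return [h.name for h in hosts if required_files <= h.files_on_disk]


# Hypothetical hosts and file names, for illustration only.
hosts = [
    Host("host_a", {"seed_0100", "seed_0101"}),
    Host("host_b", {"seed_0200"}),
    Host("host_c", set()),
]

# Only host_a holds both files, so it is the only candidate wingman.
print(eligible_hosts({"seed_0100", "seed_0101"}, hosts))
```

Under a rule like this, a task whose files are held by no requesting host simply stays in the buffer until a matching host asks for work.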

Keith Myers

This has happened every time GW work is available, because of the difficulty of finding suitable wingmen that already have the previously downloaded seed files for the work on board.

This demands attention from the admins. I don't understand why they continue to let this occur.

All that needs to be done is to put a regulator on the work generators to restrict the quantity of tasks in the work buffers, like they have implemented for the other sub-projects.

The download servers and schedulers work best when the RTS buffers are restricted to around 10,000 - 15,000 tasks, like the FGRPB1G and BRP4/7 RTS buffers.

That is sufficient to keep all hosts properly fed.

But when they let the GW work generators run unchecked and generate over 2M tasks, things grind to a halt quickly.
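To illustrate the kind of regulator described above, here is a minimal Python sketch. It is not the project's actual work generator code: the names `tasks_to_generate`, `low_water`, and `high_water` are hypothetical, and the 10,000 and 15,000 thresholds are simply the figures quoted in this post.

```python
def tasks_to_generate(ready_to_send: int,
                      low_water: int = 10_000,
                      high_water: int = 15_000) -> int:
    """Return how many new tasks one generator pass should create.

    Assumed rule: create nothing while the ready-to-send (RTS) buffer
    is at or above low_water; otherwise top it up to high_water.
    """
    if ready_to_send < 0:
        raise ValueError("ready_to_send must be non-negative")
    if ready_to_send >= low_water:
        return 0
    return high_water - ready_to_send
```

With these assumed thresholds, `tasks_to_generate(2_000_000)` returns 0, so a generator governed this way would stay idle until the backlog drained, while `tasks_to_generate(4_000)` returns 11,000.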

Now only 3 hours of work left on my fastest host.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3956
Credit: 46953772642
RAC: 64599287

Excess O3MD1 tasks have been purged, and FGRPB1G and BRP7 tasks are flowing normally again.

It was definitely the extraneous O3MD tasks clogging up the works, since the two events coincide. Maybe it was something like a full-disk scenario, or the system getting severely bogged down trying to manage so many tasks.

Keith Myers

Yes, happy to see Bernd post that they goofed in letting the GW work generators run wild and foul up the works for the other sub-projects.

Overnight I refilled my caches of GR#1 work so I am happy now.
