I ran out of GPU work yesterday, and not for the first time. On this particular machine, work finishes, is uploaded, and then sits in "Ready to Report" status (my other machine uploads results and the work unit is deleted immediately). Most of the time this eventually clears: new units are downloaded and the "Ready to Report" work units are deleted. Sometimes, however, all work units are consumed and none are downloaded until I manually Update the project. Here is a relevant log snippet.
7/8/2021 1:27:33 PM | Einstein@Home | Computation for task LATeah4011L00_1132.0_0_0.0_17141295_1 finished
7/8/2021 1:27:33 PM | Einstein@Home | Starting task LATeah4011L00_1140.0_0_0.0_1394425_0
7/8/2021 1:27:35 PM | Einstein@Home | Started upload of LATeah4011L00_1132.0_0_0.0_17141295_1_0
7/8/2021 1:27:35 PM | Einstein@Home | Started upload of LATeah4011L00_1132.0_0_0.0_17141295_1_1
7/8/2021 1:27:36 PM | Einstein@Home | Finished upload of LATeah4011L00_1132.0_0_0.0_17141295_1_0
7/8/2021 1:27:36 PM | Einstein@Home | Finished upload of LATeah4011L00_1132.0_0_0.0_17141295_1_1
7/8/2021 1:27:40 PM | Einstein@Home | Sending scheduler request: To fetch work.
7/8/2021 1:27:40 PM | Einstein@Home | Reporting 2 completed tasks
7/8/2021 1:27:40 PM | Einstein@Home | Requesting new tasks for CPU and AMD/ATI GPU
7/8/2021 1:27:42 PM | Einstein@Home | Scheduler request completed: got 0 new tasks
7/8/2021 1:27:42 PM | Einstein@Home | No work sent
7/8/2021 1:27:42 PM | Einstein@Home | No work is available for Gamma-ray pulsar binary search #1 on GPUs
7/8/2021 1:27:42 PM | Einstein@Home | No work is available for Gravitational Wave search O2 Multi-Directional
7/8/2021 1:27:42 PM | Einstein@Home | (reached daily quota of 512 tasks)
7/8/2021 1:27:42 PM | Einstein@Home | Project has no jobs available
7/8/2021 1:27:42 PM | Einstein@Home | Project requested delay of 25628 seconds
7/8/2021 1:29:44 PM | Einstein@Home | Computation for task LATeah4011L00_1116.0_0_0.0_4047210_2 finished
Before the 25628 seconds passed, I pressed update and new work was downloaded.
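For reference, the requested delay in the log can be converted into a wall-clock time. This is a minimal sketch that only does the date arithmetic on the two values shown in the log; the function name is made up for illustration, and it assumes the client simply waits out the full delay before its next automatic scheduler request.

```python
from datetime import datetime, timedelta


def next_allowed_request(log_timestamp: str, delay_seconds: int) -> datetime:
    """Add the project-requested delay to the timestamp of the log line."""
    # The timestamp format matches the BOINC event log lines quoted above.
    issued = datetime.strptime(log_timestamp, "%m/%d/%Y %I:%M:%S %p")
    return issued + timedelta(seconds=delay_seconds)


# Values taken from the "Project requested delay" log line.
resume = next_allowed_request("7/8/2021 1:27:42 PM", 25628)
print(resume)  # 2021-07-08 20:34:50
```

So without a manual Update, the machine would have had a little over seven hours to drain its cache.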
I've checked relevant settings. I store at least 0.2 days of work, and an additional 0.2 days of work. Networking and other resource consumption settings are wide open. The machine constantly runs with no other workload. I do two GPU work units in parallel in about 6 minutes. I have a very very low invalid and error rate. I may game an hour a day on the machine. When I game, I suspend GPU tasks for that hour and start them back up when I'm finished.
Your daily quota of tasks to download is derived from your count of usable CPUs and GPUs. The GRP GPU task runs fast enough on your 6900XT to give the symptom you describe (daily task limit of 512 exceeded). Another common way to exceed daily quota is quota reduction caused by errors, but that is not your problem.
This quota limit has for several years been a bit of a problem here at Einstein for people with the fastest GPUs. If you look at top producing host lists you may notice that many of the machines list an oddly high number of processors. Some of them have genuinely high core count CPU chips, but others have used a simple trick to falsify reporting on this point, solely in order to allow a larger daily maximum task download quota.
A standard way to do this is to edit your existing cc_config.xml file, which on a current Windows system is probably located at:
c:\ProgramData\Boinc
Within the <options> section, look to see if you already have a line for ncpus, and if so, modify it to a higher count. While you can find posts here advising a value as high as 256, that is currently pointless, as the project does not assign a higher quota for values above some threshold, which I think may be 64.
Here is the exact entry I have in cc_config for the machine on which I am typing:
<ncpus>24</ncpus>
As your machine is currently reporting 16 processors, you'll want to set a value higher than 16. Possibly you will find yours is currently set to 0, which is a code meaning "report the real number".
Be sure to save the file as text, having by preference used a proper text editor. Then you'll need to advise boincmgr to reread config files:
Options|read config files
Good luck.
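For anyone who has no cc_config.xml yet, here is a rough sketch of what a minimal file containing only the ncpus entry might look like. It builds the text with Python's standard xml.etree.ElementTree (ET.indent needs Python 3.9 or later). The helper name build_cc_config is made up for illustration, and the script only prints the XML; saving it into the BOINC data directory is left as a manual step.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus: int) -> str:
    """Return the text of a minimal cc_config.xml that sets only <ncpus>."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    ET.indent(root)  # pretty-print with two-space indentation (Python 3.9+)
    return ET.tostring(root, encoding="unicode")


# 24 is the value quoted in the post above; pick your own.
print(build_cc_config(24))
```

If you already have a cc_config.xml with other options in it, add the ncpus line to the existing <options> section rather than replacing the file.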
Archae86, thanks. I didn't realize the project throttled work. Interesting. I suspect at one point there was a rationale for this based on my reading of Ars' forum discussion of the project ten to fifteen years ago. That rationale has probably long since disappeared, at least based on my very simplistic reading of the server status page. Anyway, I didn't have a cc_config.xml file so I created one in the ProgramData\BOINC directory with your suggestions and it is now reporting 24 CPUs.
As someone with several of the fastest systems on the project, I can't say I've ever seen this issue (outside of a temporary work fetch ban from too many errors, as arch described). Though I do run relatively high core count CPUs for the most part. So maybe they're high enough already to ensure enough work is flowing.
do we know the allotment per CPU core and per GPU? how much more is added to the quota with each core/device?
_________________________________________________________________________
Ian&Steve C. wrote: do we know ...
Yes we do.
32 tasks/day for each CPU, plus 256 for each GPU.
But if you limit the fraction of your CPUs that BOINC may consider available for use, via the Preferences | Advanced Settings | Processor Usage | "Use at most nn% of the processors" mechanism, that cuts the count for daily task download quota purposes.
I think Intel GPU functions on the CPU chip may not count at all, or at least don't count if you have them disabled. I don't know the particulars of that, nor of other special cases. I recall there is a limit of claimable CPUs and GPUs for this purpose. For CPUs it is somewhere between 64 and 100, and for GPUs I think it is eight.
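To make the arithmetic concrete, here is a small sketch of the formula as stated above (32 tasks/day per counted CPU plus 256 per counted GPU). The function name and parameters are hypothetical. Since the exact caps on claimable CPUs and GPUs are uncertain, they are left as optional arguments rather than hard-coded.

```python
def daily_quota(cpus, gpus, per_cpu=32, per_gpu=256, cpu_cap=None, gpu_cap=None):
    """Estimate the daily task quota from counted CPUs and GPUs.

    per_cpu and per_gpu are the figures quoted in this thread. cpu_cap and
    gpu_cap are assumptions: pass a value only if you know the real limit.
    """
    if cpu_cap is not None:
        cpus = min(cpus, cpu_cap)
    if gpu_cap is not None:
        gpus = min(gpus, gpu_cap)
    return cpus * per_cpu + gpus * per_gpu


# One GPU with 16, 24 and 32 counted CPUs.
for counted_cpus in (16, 24, 32):
    print(counted_cpus, daily_quota(counted_cpus, gpus=1))
# 16 768
# 24 1024
# 32 1280
```

The "counted" CPU figure is what matters here, so a reduced "Use at most nn% of the processors" setting lowers the result accordingly.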
Radeon VII card owners ran into this limit a long time ago. Big Navi cards can get there also, as shown by the Original Poster on this thread (and seen on my own systems).
Burned wrote: ... work ...
There's a good reason for that. It's called 'protecting' the servers and online database :-). The biggest saving is in reporting a bunch of tasks in one hit rather than each one individually.
The two operations of 'reporting' completed work and issuing new work both require significant interaction with the database. By combining the two, the total load on the system is minimised. Your BOINC client is designed to not immediately report completed work (the files are already safely uploaded so there's no particular hurry) but rather to wait (with an upper limit on time, just in case) until there is a work request required. At that point, new work fetch and completed work reporting can occur simultaneously, minimising the overall impact.
Your choice of work cache settings makes this "holding of unreported work" a more noticeable side effect. With values of 0.2 + 0.2 for the two settings, the client will 'fill up' to 0.4 days but then won't need to make a further work request until the work on hand drops below the 'low water mark' of 0.2 days. If you were to choose a compromise setting of 0.3 + 0, the effect would be that the client would make more frequent work requests in maintaining the fixed 0.3 days of work. You would then see completed work being cleared much more regularly.
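The low/high water mark behaviour described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the real client logic: the buffer is measured in minutes of work, it drains in fixed steps, and every request instantly refills it to the high water mark. Scheduler backoff, quotas and server-requested delays are ignored, and the function name is invented for the example.

```python
def count_work_requests(min_minutes, extra_minutes, step_minutes=6, total_minutes=1440):
    """Count work requests over a period in a toy low/high water mark model."""
    target = min_minutes + extra_minutes  # high water mark
    buffered = target                     # start with a full cache
    requests = 0
    for _ in range(total_minutes // step_minutes):
        buffered -= step_minutes          # work consumed during this step
        if buffered < min_minutes:        # dropped below the low water mark
            requests += 1
            buffered = target             # refill to the high water mark
    return requests


# 0.2 days = 288 minutes, 0.3 days = 432 minutes.
print(count_work_requests(288, 288))  # 0.2 + 0.2 days -> 4
print(count_work_requests(432, 0))    # 0.3 + 0 days   -> 240
```

The absolute numbers mean little, since the real client spaces its requests out, but the contrast shows why a zero "additional" setting gives completed work far more chances to be reported.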
Cheers,
Gary.
archae86 wrote: Ian&Steve C. ...
thanks. I think my high core count CPUs are keeping me afloat then. My 3080Ti is doing about 750 tasks a day, so with 32 cores I’m below the limit on 1 GPU.
probably getting close on the 7x2080ti system tho.
_________________________________________________________________________
Gary, thanks. I didn't understand that subtlety between uploading results and reporting them complete. I'm all for taking load off the back end. My goal is to maximize throughput. I don't want any idle cycles on my computers. One of my frustrations with other projects is that sometimes the generation of work units requires a bit more babysitting than the researchers are able to do. Nothing worse than looking at your logs and seeing you haven't done any work for 48 hours.
Ian&Steve C. wrote: My ...
That's amazing output. In Folding@Home, a 3080ti is only about 24% faster than a 6900XT. I guess I should do some benchmarking.
FWIW, my 2070 Super (X2) are doing 322 per day, or 161 each.
That works out to ~8:56 per task, but my BOINC Manager is showing ~8:19 to ~8:22 per task. I need to work on this; I'm sure they should do better than that.
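The conversion from tasks per day to average time per task is easy to check. Here is a minimal sketch (the function name is hypothetical) that assumes each GPU runs around the clock and the daily total is split evenly between the cards.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_task_time(tasks_per_day, gpus=1):
    """Average wall-clock interval between completed tasks on one GPU, as m:ss."""
    seconds = SECONDS_PER_DAY * gpus / tasks_per_day
    minutes, secs = divmod(int(seconds), 60)  # fractional seconds are dropped
    return f"{minutes}:{secs:02d}"


print(average_task_time(322, gpus=2))  # 8:56
```

If the run time shown in BOINC Manager is shorter than this average, one possible reading is that the cards were not busy for the full 24 hours, for example because of gaps between tasks or time spent suspended.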
...[EDIT]...
If I recall correctly, they used to do a GPU task in ~6:00 to ~6:15, but that was a different task description. But then again, I just may be dreaming.
...[EDIT]...
I'm talking about the current batch of LATeah4011L01 tasks, which are taking me well over 8 minutes per task.
Proud member of the Old Farts Association
My 2060 Super is doing about 150 tasks a day (about 580 seconds per). This is the LATeah4 stuff. The LATeah3 tasks ran in about 450 seconds. You can really see the project "slowing down" since the 3's gave way to the 4's.