Platform: Win 8.1, 64-bit Prof; Xeon E3-1270 CPU 3.7 GHz; 32 GB Memory; GTX 1070 and RTX 2070.
It takes 11.5 to 12.5 hours for the computer to complete a CPU WU, so at most about 6 per day are completed. Preferences are set to store at least 2 days of work and to process two WUs per GPU. App_config.xml is set to simulate 7 CPUs: 4 supporting GPU WUs and 3 running CPU WUs.
BOINC controls the inventory size of GPU WUs perfectly, but caches 65-75 CPU WUs, which is equivalent to 10-12 days of work. So as the deadline for CPU WUs approaches, BOINC pushes more and more GPU WUs off the schedule and out of core in an attempt to complete the CPU WUs on time, until eventually only 7 CPU WUs are being processed and the GPUs are idle. Needless to say, BOINC creates an endless stream of error WUs, because they are cancelled when not started before the deadline.
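As a rough check on the numbers above, here is a minimal sketch of the cache arithmetic. The function name is made up for illustration, and running 3 CPU WUs at a time is an assumption chosen to match the "about 6 per day" figure; it is plain arithmetic, not BOINC's own work-fetch calculation.

```python
def cache_depth_days(cached_tasks, hours_per_task, concurrent_tasks):
    """Days needed to drain a queue of CPU tasks at a steady rate."""
    tasks_per_day = concurrent_tasks * 24.0 / hours_per_task
    return cached_tasks / tasks_per_day


# Figures from the post: 65-75 cached CPU WUs at roughly 12 hours each.
# concurrent_tasks=3 is an assumption, not something BOINC reports.
low = cache_depth_days(65, 12.0, 3)
high = cache_depth_days(75, 12.0, 3)
print(f"{low:.1f} to {high:.1f} days of CPU work cached")
```

With a 2-day cache preference, a queue this deep is roughly five to six times larger than requested.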
What should have worked was to set the simulate cpus option to 3, then put that each GPU WU uses little or no CPU time and 1/2 a GPU in app_config.xml, thusly:
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <max_concurrent>4</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.15</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
BOINC says it reads app_config.xml and does not indicate any error, but the UI still indicates each GPU WU requires one CPU and 0.5 GPUs. I have changed app_config.xml hundreds of times before, but it does not work now. BOINC should process 3 CPU WUs, but it processes only 1. The <cpu_usage>0.15</cpu_usage> value is correctly recorded in client_state.xml, but BOINC will not use it or show it in the UI.
So, the only way to process 4 GPU WUs and 3 CPU WUs as the CPU deadline approaches is to suspend all but 3 CPU WUs, then check BOINC several times a day to unsuspend CPU WUs to replace the completed ones. That is very error prone. Moreover, BOINC will not download new workunits if any are suspended in the UI, so once a day all the CPU WUs have to be resumed for a half hour or so to allow the GPU cache to fill. During that time no GPU WUs are finished, and most times the partially completed work on the GPU WUs is lost when the CPU WUs are suspended again and the GPU WUs are resumed.
So, does anyone know of any settings or preferences that can be changed to allow simultaneous processing of GPU and CPU WUs without human intervention, without giving up fully utilizing the GPUs all the time, and without creating any error WUs as BOINC cancels them?
Thanks in advance for your kind attention in this matter.
Copyright © 2024 Einstein@Home. All rights reserved.
Your own post points very clearly to a solution--lower your preference settings for "Store at least" and "store up to an additional" days of work. On the evidence you give, a simple factor of four reduction in the sum of these two would very likely fulfill all of your stated conditions. You could then tweak it upward to find the limit.
Here at Einstein, it would likely work quite nicely, as work supply outages tend usually to be a small part of a day. Possibly you participate in other projects for which you would find the result unsatisfactory.
Perhaps someone else will come along and propose a solution more to your liking--but this one is simple, and meets your stated goals, I think.
Your Nvidia GPUs require the attention of a full CPU core for each task run on them, so 2 tasks on each card will need a total of 4 CPUs in support. Changing the default CPU count and usage parameters is likely going to do more harm than good. Your <cpu_usage> setting should not be less than 1 for GPU tasks to run at full steam on Nvidia cards ;-)
Depending on your processor's 'logical' CPU count, you may not be able to run CPU tasks at all because of the CPU support your GPU tasks need.
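The core budget described in this reply can be sketched as simple arithmetic. This is only a back-of-the-envelope illustration, not how BOINC's scheduler actually allocates CPUs; the function name is hypothetical, and the logical CPU counts passed in are example inputs, since the thread does not state how many threads the Xeon exposes.

```python
def cpu_tasks_that_fit(logical_cpus, gpu_tasks, cpu_per_gpu_task):
    """Whole CPU tasks left after reserving CPU time for GPU tasks."""
    reserved = gpu_tasks * cpu_per_gpu_task
    free = logical_cpus - reserved
    # Never report a negative count; fractional leftovers are dropped.
    return max(0, int(free))


# 2 tasks on each of 2 cards, one full core per GPU task (per the reply).
# The logical CPU counts below are hypothetical example values.
for logical in (8, 4):
    fit = cpu_tasks_that_fit(logical, gpu_tasks=4, cpu_per_gpu_task=1.0)
    print(f"{logical} logical CPUs -> room for {fit} CPU tasks")
```

Under these assumptions, a 4-thread processor would have no room left for CPU tasks, which is the situation the reply warns about.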
Per PC I run 1 BOINC client for CPU tasks and another client for GPU tasks. One of the reasons is exactly this: managing different work queues for each processor type. It's even more of a pain at E@H if you wish to run the CPU here, as the average task time is somewhere in the middle of either processor.
With two changes this is working: "What should have worked was to set the simulate cpus option to 3, then put that each GPU WU uses little or no CPU time and 1/2 a GPU in app_config.xml." First, on the E@H website I set the WU cache size to 0.5 days of inventory, which is 2 days divided by 4, as per @ARCHAE86. Then I rebooted. For reasons I don't understand, BOINC and the UI picked up the <cpu_usage>0.15</cpu_usage> line in app_config.xml in the E@H project directory. So, 3 CPU WUs are being scheduled on the CPU and 4 on the GPUs, and everything is running automatically. Only time will tell if it stays that way.
I am eager to try @MMONNIN's solution of running two BOINC clients, one to schedule the CPU and one to schedule the GPUs. It is innovative, principled, gives the user many more options to play with, and sounds like it should work. But I had to try one proposed solution at a time. @ARCHAE86's was first and mine was already in the machine. So, when I rebooted, it all just took off.
Many thanks to all for your help.
A guide I made at my home forum:
https://www.overclock.net/forum/18056-boinc-guides-tutorials/1628924-guide-setting-up-multiple-boinc-instances.html
I run 2x tasks on my RX 580 in one client and a full 8x CPU threads on another client. (Partially for WUProp hours.) In Win7, I use Process Lasso to push the CPU executables to threads 0-6, and the 2x E@H exe get CPU thread 7. Since I have an ATI card, there is low CPU usage by the GPU exe. If you allow Windows to manage it, the tasks hop between CPUs and the GPU task ends up choked waiting on CPU. So you may need to give at least 1 CPU thread per task, maybe an additional one. Even if you cut the number of CPU tasks down, I would still use Process Lasso in Windows.
E@H uses an old version of the BOINC server code which uses the DCF (Duration Correction Factor) mechanism. That mechanism doesn’t handle multiple app versions with differing run times such as a CPU and a GPU app. Newer BOINC code works around that but has various other issues.
Agreed, and that is why I almost never run CPU and GPU units from the same project on the same machine!! I will have one machine run GPU units from project A and CPU units from project B, and a 2nd machine will do the reverse. It's just another reason to have multiple PCs.