Well, the work from GPUGrid is spotty all the time, so you can't depend on getting constant work to keep your gpu busy while waiting for work from Einstein.
Plus, lately you are likely to snag a Python task from GPUGrid, and those use a lot of system memory and cpu cores. Which may be a good thing, I guess, given the lack of available Universe work.
You should set aside 3-5 cpu cores for the Python tasks so they don't impact the crunching times of the Universe tasks.
The regular acemd3/4 GPUGrid tasks only need a single cpu core to run.
Both types of gpu work at GPUGrid will keep a card occupied for 4-16 hours depending on the model and generation.
I am assuming you are referring to GPU tasks. How do you set aside 3-5 CPU cores on the gpu Python tasks? Is that something on the website or app_config.xml magic?
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Well, since you spoke of GPUGrid and I replied pertaining to that project: GPUGrid only provides gpu tasks, as implied in its name.
Yes, reserve 3-5 cpu cores for a Python task in an app_config.xml file. You are going to have to use one anyway to achieve your goal of running only a single GPUGrid task at a time.
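As a rough sketch (the <max_concurrent> line and the cpu_usage value here are just an illustration to adjust to taste, not my exact settings), something like this in app_config.xml would cover both the core reservation and the single-task limit:
<app_config>
   <app>
      <name>PythonGPU</name>
      <!-- run only one GPUGrid Python task at a time -->
      <max_concurrent>1</max_concurrent>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <!-- reserve 3 cpu threads per Python task -->
         <cpu_usage>3.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>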
If I am not mistaken, the beta python tasks on GPUgrid will use way more than a few cores. They run 32 parallel agents on the processor, so you pretty much have to pause or reduce other work. You CAN run other work at the same time depending on your setup, but it will just slow the python task down. I didn't know this at first, and it took one of those work units something like 3 days to complete on our most powerful system. I paused other work and it sped up a great deal, but they are definitely not my favorite tasks.
The Python on GPU tasks are no longer beta. They are a stock application now along with acemd3/4.
I don't know what you are doing with your system but the longest Python tasks on my slowest gpus have only run for 7-8 hours max. Not multiple days. My fastest gpus do them in 5 hours typically.
Yes the Python tasks do start multiple parallel processes, but you don't need to stop all other cpu processing to run them. The tasks are a hybrid mix of cpu-gpu processing depending on what the task is doing in its inferencing/ML profiling runs.
I have run them with cpu_usage settings of between 3 and 5 in an app_config and see only a minimal extension of Universe or TN-Grid task times, about 5-15 minutes depending on the processor.
At a cpu_usage setting of 3.0 in my config, I see residual cpu usage in BoincTasks of between 200-300%, so they are only using 2-3 more cpu threads than what I allow BOINC to see for scheduling.
I could set the usage to 5 cpu threads to see 100% single-thread usage on the Python task in BoincTasks, but that would unnecessarily reduce the count of other projects' cpu tasks. I accept the slightly longer cpu runtimes and still keep up my cpu production RAC on my projects.
And I am still running 20-28 Universe tasks along with four TN-Grid and another 2 yoyo tasks on the daily driver.
Strange. I will need to mess around with the config, I think (still learning that part). The host it was running on is https://einsteinathome.org/host/12901776. It barely challenged the GPUs, so the bottleneck was the CPUs, I believe. I have BOINC set to use 95% of cores, and E@H was using all available cores when I received the Python tasks from GPUgrid.
Is there a good place to read about proper configuration for using "x" amount of cores for one project and "y" amount for another project? I think I am confused on the topic. Still learning as I go.
Thanks for any advice!
As I stated, the tasks use a mixed cpu-gpu application. You will see very little gpu usage, then periodic spurts of activity, and then a return to low utilization.
This task design and behavior was explained by abouh, the scientist-developer, here:
https://www.gpugrid.net/forum_thread.php?id=5233&nowrap=true#56977
Having alternating phases of lower and higher GPU utilisation is normal in Reinforcement Learning, as the agent alternates between data collection (medium cpu usage, generally low GPU usage) and training (higher GPU memory and utilisation).
The document for client configuration/application configuration has always been here.
https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration
This is my app_config.xml for the project for your perusal and understanding.
<app_config>
   <app>
      <name>acemd3</name>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <!-- acemd3/4 only need a single cpu core each -->
         <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>acemd4</name>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>PythonGPU</name>
      <gpu_versions>
         <gpu_usage>1.0</gpu_usage>
         <!-- reserves 3 cpu threads for each Python task -->
         <cpu_usage>3.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
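For what it's worth, an app_config.xml like this normally lives in the GPUGrid project folder under projects/ in the BOINC data directory, and it gets picked up after you use Read config files in the BOINC Manager (under the Options menu in recent versions) or restart the client.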
I read over on Universe at Home that the "contest" would end on May 19th (thank you, Peter). I am hoping that applies to E@H too. I am getting update backoffs of 3.5 hours.
Meanwhile, after watching an unending series of Milkyway@Home tasks being processed on my GPU server, I switched to "any E@H GPU tasks but Gamma Ray #1" mode.
I am happy to report I am getting tasks from E@H on my gpus again.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Looks like you are only getting GW work though.
Gamma ray work is still very scarce.
That is the "plan". Otherwise, the E@H RAC is completely unsupported.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!