Conversations about your/my setup

Keith Myers
Joined: 11 Feb 11
Posts: 5014
Credit: 18905774869
RAC: 6576420

Well the work from GPUGrid is spotty all the time.  So you can't depend on getting constant work to keep your gpu busy while waiting for work from Einstein.

Plus you are likely to snag a Python task from GPUGrid these days, and those use a lot of system memory and cpu cores.  Which may be a good thing, I guess, with the lack of available Universe work.

You should set aside 3-5 cpu cores for the Python tasks so they don't impact the crunching times of the Universe tasks.

The regular acemd3/4 GPUGrid tasks only need a single cpu core to run.

Both types of gpu work at GPUGrid will keep a card occupied for 4-16 hours depending on the model and generation.

 

Tom M
Joined: 2 Feb 06
Posts: 6556
Credit: 9644378071
RAC: 2846739

Keith Myers wrote:

Well the work from GPUGrid is spotty all the time.  So you can't depend on getting constant work to keep your gpu busy while waiting for work from Einstein.

Plus you are likely to snag a Python task from GPUGrid these days, and those use a lot of system memory and cpu cores.  Which may be a good thing, I guess, with the lack of available Universe work.

You should set aside 3-5 cpu cores for the Python tasks so they don't impact the crunching times of the Universe tasks.

The regular acemd3/4 GPUGrid tasks only need a single cpu core to run.

Both types of gpu work at GPUGrid will keep a card occupied for 4-16 hours depending on the model and generation.

I am assuming you are referring to GPU tasks.  How do you set aside 3-5 CPU cores for the gpu Python tasks?  Is that something on the website, or app_config.xml magic?

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

Keith Myers
Joined: 11 Feb 11
Posts: 5014
Credit: 18905774869
RAC: 6576420

Well, since you spoke of GPUGrid and I replied pertaining to that project: GPUGrid only provides gpu tasks, as implied by its name.

Yes, reserve 3-5 cpu cores for a Python task in an app_config.xml file. You are going to have to use one anyway to limit yourself to a single GPUGrid task at a time.
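Something along these lines would do it, as a minimal sketch only. The Python on GPU app name at GPUGrid is PythonGPU (you can confirm the exact name in client_state.xml or in your task list), and the <max_concurrent> line is what holds it to one Python task at a time; if you also pull acemd work, a top-level <project_max_concurrent> element caps the whole project instead.

<app_config>
    <app>
        <name>PythonGPU</name>
        <max_concurrent>1</max_concurrent>
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>
            <cpu_usage>3.0</cpu_usage>
        </gpu_versions>
    </app>
</app_config>

The cpu_usage value is what tells BOINC to budget 3 cpu threads per Python task when it schedules other work; raise it toward 5.0 if the Universe tasks still take a noticeable hit.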

 

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 264
Credit: 10861928185
RAC: 11537286

If I am not mistaken, the beta python tasks on GPUgrid will use way more than a few cores. They will run 32 parallel agents on the processor. You pretty much have to pause or reduce other work. You CAN run other work at the same time depending on your setup, but it will just slow the python task down. I didn't know this at first and it took one of those work units something like 3 days to complete on our most powerful system. I paused other work and it sped up a great deal, but these are definitely not my favorite tasks. 

Keith Myers
Joined: 11 Feb 11
Posts: 5014
Credit: 18905774869
RAC: 6576420

The Python on GPU tasks are no longer beta.  They are a stock application now along with acemd3/4.

I don't know what you are doing with your system but the longest Python tasks on my slowest gpus have only run for 7-8 hours max.  Not multiple days. My fastest gpus do them in 5 hours typically.

Yes the Python tasks do start multiple parallel processes, but you don't need to stop all other cpu processing to run them. The tasks are a hybrid mix of cpu-gpu processing depending on what the task is doing in its inferencing/ML profiling runs.

I have run them with cpu_usage settings of between 3 and 5 in an app_config and see only a minimal extension of the Universe or TN-Grid task times, about 5-15 minutes depending on the processor.

At a cpu_usage setting of 3.0 in my config I see residual cpu usage in BoincTasks of between 200-300%, so they are only using an extra 2-3 cpu threads beyond what I allow BOINC to see for scheduling.

I could set the usage to 5 cpu threads to see single-thread 100% usage on the Python task in BoincTasks, but that would unnecessarily reduce the count of other projects' cpu tasks. I accept the slightly longer cpu runtimes and still keep up my cpu production RAC on my projects.

And I am still running 20-28 Universe tasks along with four TN-Grid and another two yoyo tasks on the daily driver.

 

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 264
Credit: 10861928185
RAC: 11537286

Strange. I will need to mess around with the config, I think (still learning that part). The host it was running on is https://einsteinathome.org/host/12901776. It barely challenged the GPUs, so the bottleneck was the CPUs, I believe. I have BOINC set to use 95% of cores and E@H was using all available when I received the python tasks from GPUgrid. 

Is there a good place to read about proper configuration for using "x" amount of cores for one project and "y" amount for another project? I think I am confused on the topic. Still learning as I go. 

Thanks for any advice!

Keith Myers
Joined: 11 Feb 11
Posts: 5014
Credit: 18905774869
RAC: 6576420

As I stated, the tasks use a mixed cpu-gpu application.  You will see very little gpu usage, then periodic spurts of activity, and then a return to low utilization.

This task design and behavior was explained by abouh, the scientist-developer, here:

https://www.gpugrid.net/forum_thread.php?id=5233&nowrap=true#56977

Having alternating phases of lower and higher GPU utilisation is normal in Reinforcement Learning, as the agent alternates between data collection (medium cpu usage, generally low GPU usage) and training (higher GPU memory and utilisation).

The document for client configuration/application configuration has always been here.

https://boinc.berkeley.edu/wiki/Client_configuration#Application_configuration

This is my app_config.xml for the project for your perusal and understanding.

<app_config>
    <app>
        <name>acemd3</name>
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>
            <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
    </app>
    <app>
        <name>acemd4</name>
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>
            <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
    </app>
    <app>
        <name>PythonGPU</name>
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage>
            <cpu_usage>3.0</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
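A couple of practical notes if you copy it: the file goes in the GPUGrid project folder under projects/ in the BOINC data directory, and the client only rereads it after Options > Read config files in the Manager or a client restart. Also keep in mind the cpu_usage/gpu_usage numbers only tell BOINC how much to budget for scheduling; they don't hard-limit what the running task actually uses.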

 

Tom M
Joined: 2 Feb 06
Posts: 6556
Credit: 9644378071
RAC: 2846739

I read over on Universe at Home that the "contest" would end on May 19th (thank you, Peter).  I am hoping that applies to E@H too.  I am getting update backoffs of 3.5 hours.

Meanwhile, after watching an unending series of Milkyway@Home tasks being processed on my GPU server, I switched to "any E@H GPU tasks but Gamma Ray #1" mode.

I am happy to report I am getting tasks from E@H on my gpus again.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4042
Credit: 47952456561
RAC: 35649914

Looks like you are only getting GW work though.

Gamma ray work is still very scarce.


Tom M
Joined: 2 Feb 06
Posts: 6556
Credit: 9644378071
RAC: 2846739

Ian&Steve C. wrote:

Looks like you are only getting GW work though.

Gamma ray work is still very scarce.

That is the "plan".  Otherwise, the E@H RAC is completely unsupported.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!
