Maximizing Nvidia production

Raistmer*
Joined: 20 Feb 05
Posts: 208
Credit: 177576220
RAC: 222733


petri33 wrote:

To decrease CPU-usage I have tried this: (with lowered CPU usage, higher GPU utilization and lower throughput :( 

Quite a paradoxical outcome. Any thoughts on why?

petri33
Joined: 4 Mar 20
Posts: 117
Credit: 3341045819
RAC: 0


1) CPU load is still from 24 to 70 percent. My guess is that the durations of the enqueue calls vary a lot. Some would require a short sleep and some a much longer one. I'd have to analyze the call behavior and find a pattern. Maybe a previous call that sets function parameters or kernel size would reveal how long to wait.

2) The calls to FFT, if implemented as in SETI through live compilation, may use another call (not clEnqueueNDRangeKernel).

3) I have not had much time to test with different sleep times and numbers of tasks per GPU.

Feel free to try and find out.

 

Petri

 

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3681
Credit: 33817650983
RAC: 37830717


from what I had read here in the past, I was under the impression that the GW GPU app was actually using the CPU for some functions of the WU processing, not just feeding work to the GPU. presumably because they weren't able to port whatever function to the GPU.

 

petri would know better if he digs more into it. that was just always my impression though.

 

 

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 5585
Credit: 7672972902
RAC: 1745116


Zalster wrote:

@Tom

I don't see where you are excluding 3 of 5 GPUs from your machine in the cc_config file.  From what I see, you would be using all GPUs for GW. Where did you put the <exclude>?
Z

Rats.

Let me look.  I think it was on the "other box" I was trying that on. 

And its "cc_config.xml" file is almost completely cleaned out on that box.

The following is from my Intel i9 box, which has two GTX 1060 3GB cards that I have excluded from running Gravity Wave GPU tasks.

------------------------------------------------------

<cc_config>
   <log_flags>
      <sched_op_debug>1</sched_op_debug>
   </log_flags>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>180</save_stats_days>
      <max_tasks_reported>50</max_tasks_reported>
      <max_file_xfers>4</max_file_xfers>
      <max_file_xfers_per_project>2</max_file_xfers_per_project>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>90</save_stats_days>
      <rec_half_life_days>1.000000</rec_half_life_days>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
   </options>
</cc_config>
----------------------------------------------------------------------------------------------------------------------------------------------------------

The following is a "recreation"; I haven't actually tested it.  I am also now reasonably confident that if you have two <app></app> entries in a row, the last one is the one that is "remembered".

The AMD 3950X box, which is where I tried to split running GW GPU and Pulsar#1 GPU tasks, would have looked like this:

<cc_config>
   <log_flags>
      <sched_op_debug>1</sched_op_debug>
   </log_flags>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>180</save_stats_days>
      <max_tasks_reported>50</max_tasks_reported>
      <max_file_xfers>4</max_file_xfers>
      <max_file_xfers_per_project>2</max_file_xfers_per_project>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>90</save_stats_days>
      <rec_half_life_days>1.000000</rec_half_life_days>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRP5</app>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRP5</app>
         <app>hsgamma_FGRPB1G</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
   </options>
</cc_config>

----------------------------------------------------------------------------------------------------------------

The goal was to run some Pulsar tasks and some GW tasks simultaneously on different GPUs.  As far as I could tell, I had the data and the apps in place, but they were not running at the same time, even though the day before they had been.

And I never did get it to the point where, once I thought I had both kinds of tasks, both were running.

Tom M

 

 

 

 

 

 

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

Tom M
Joined: 2 Feb 06
Posts: 5585
Credit: 7672972902
RAC: 1745116


Gary Roberts wrote:
I don't know if I'm properly understanding what you are trying to do.  Please correct me if I'm not.

I think what I was trying to do was get some Pulsar tasks and some GW gpu tasks to run at the same time.  So I was trying to exclude GW from some gpus and Pulsar tasks from other gpus.

When I displayed all tasks in the BOINC Manager, it seemed to be showing both Pulsar and GW tasks.  And I couldn't get both to run at the same time, even though previously (prior to when the question was posted) both HAD been running.

So I was trying to see if I could get both to run at the same time.

I have read advice that you can switch from one type of task (say GW) to the other (say Pulsar#1) but can't get both to download "together."

Since I have been running E@H as a backup task previously I wasn't paying much attention to what was going on just as long as the gpus stayed busy when S@H wasn't shipping out any tasks.

It is entirely possible I was laboring under a misunderstanding of what should be going on and how it should be going on.

I would really like to be able to "set and forget" and have "it" share running GW gpu and Pulsar#1 gpu tasks without having it run all one for a while and then run all other for a while, and never shall the two share one system.

My impression is that is simply not yet possible.

Tom M.

 

 

 

 

 

 


Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0


Hi Tom, 

 

I reviewed the cc_config that you posted. The first thing that jumps out is that you have 2 apps listed in each exclude. You can only have 1 there, as BOINC will go with the second. The other thing I see is that you listed both the CPU and GPU apps together in those excludes. O2MD1 and FGRP5 are, I believe, both CPU-only applications, so you should just remove those altogether and leave the other line in there. For the very last one, GPU #1, you have 3 apps listed. There should only be 1.

In order to get work, you would need to move the cc_config out of the BOINC folder, launch BOINC, have a large cache set and download a ton of work. Pause BOINC and exit. Put the cc_config back in and relaunch. It should then follow the excludes. Once it runs through all the work of 1 type (it usually does, as 1 type processes faster than the other), you would need to pause BOINC and exit, remove the cc_config, relaunch BOINC, connect, update and download more work. I used to do that every couple of days for Einstein.
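As a rough, untested sketch of what that advice could look like for the 3950X box: one <app> per <exclude_gpu> block, with only the GPU apps named. It assumes, following the post above, that einstein_O2MDF and hsgamma_FGRPB1G are the GPU app names, and it reuses the device numbers from Tom's recreation (GW kept off devices 2 and 3, Pulsar kept off devices 0 and 1). The other options from Tom's file are left out for brevity.

```xml
<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <!-- Assumption: einstein_O2MDF is the GW GPU app. Keep it off devices 2 and 3. -->
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <!-- Assumption: hsgamma_FGRPB1G is the Pulsar GPU app. Keep it off devices 0 and 1. -->
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
   </options>
</cc_config>
```

If a second app ever needs excluding from the same device, the safer pattern is a second <exclude_gpu> block for that device rather than a second <app> line inside one block.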

 

Z

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17542335049
RAC: 6372174


I just let the scheduler send me whatever it wants within my 120 gpu task limit.  I crunch both FGRPB1G and GW 2.07 gpu tasks on my three or four cards. They crunch according to FIFO and I seem to always have one of each type crunching at the same time.  Don't need to play any games with cc_config and app_config other than some max_concurrent statements and my pandora.config file.
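For readers unfamiliar with max_concurrent statements, here is a hypothetical app_config.xml sketch. Keith's actual file is not shown in this thread, so the app names and every number below are illustrative assumptions, not his settings. The file would go in the Einstein@Home project directory, and the idea is to cap how many tasks of each GPU app run at once so that neither type takes over all the cards.

```xml
<app_config>
   <!-- Assumed app name for the Pulsar GPU app; the limit of 2 is illustrative only -->
   <app>
      <name>hsgamma_FGRPB1G</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <!-- Assumed app name for the GW GPU app; the limit of 2 is illustrative only -->
   <app>
      <name>einstein_O2MDF</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
```

On a four-card box, limits like these would leave room for both task types to run side by side; the right values depend on the number of cards and how many tasks run per GPU.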

 

Tom M
Joined: 2 Feb 06
Posts: 5585
Credit: 7672972902
RAC: 1745116


Keith Myers wrote:

I just let the scheduler send me whatever it wants within my 120 gpu task limit.  I crunch both FGRPB1G and GW 2.07 gpu tasks on my three or four cards. They crunch according to FIFO and I seem to always have one of each type crunching at the same time.  Don't need to play any games with cc_config and app_config other than some max_concurrent statements and my pandora.config file.

 

TY.  I am going to switch to my "both" setting for my boxes.

 

Tom M

