Maximizing Nvidia production

Raistmer*
Joined: 20 Feb 05
Posts: 198
Credit: 70,418,276
RAC: 167,341

petri33 wrote:

To decrease CPU usage I have tried this: (with lower CPU usage, higher GPU utilization and lower throughput :( )

Quite a paradoxical outcome. Any thoughts on why?

petri33
Joined: 4 Mar 20
Posts: 77
Credit: 1,643,249,093
RAC: 6,203,559

1) CPU load still ranges from 24 to 70 percent. My guess is that the durations of the enqueue calls vary a lot: some would need a short sleep and some a much longer one. I'd have to analyze the call behavior and look for a pattern. Maybe a preceding call that sets function parameters or the kernel size would reveal how long to wait.

2) And the calls to FFT, if implemented as in SETI through live compilation, may use another call (not clEnqueueNDRangeKernel).

3) And I have not had much time to test with different sleep times and numbers of tasks per GPU.

Feel free to try and find out.

 

Petri

Ian&Steve C.
Joined: 19 Jan 20
Posts: 1,009
Credit: 8,291,088,138
RAC: 33,487,079

from what I had read here in the past, I was under the impression that the GW GPU app was actually using the CPU for some functions of the WU processing, not just feeding work to the GPU. presumably because they weren't able to port whatever function to the GPU.

 

petri would know better if he digs more into it. that was just always my impression though.

 

 

_____________________________________________

Tom M
Joined: 2 Feb 06
Posts: 1,237
Credit: 2,428,846,582
RAC: 5,676,408

Zalster wrote:

@Tom

I don't see where you are excluding 3 of 5 GPUs from your machine in the cc_config file.  From what I see, you would be using all GPUs for GW. Where did you put the <exclude>?
Z

Rats.

Let me look.  I think it was on the "other box" I was trying that on. 

And its "cc_config.xml" file is almost completely cleaned out on that box.

The following is from my Intel i9 box, which has two GTX 1060 3GB cards that I have excluded from running Gravity Wave GPU tasks.

------------------------------------------------------

<cc_config>
   <log_flags>
      <sched_op_debug>1</sched_op_debug>
   </log_flags>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>180</save_stats_days>
      <max_tasks_reported>50</max_tasks_reported>
      <max_file_xfers>4</max_file_xfers>
      <max_file_xfers_per_project>2</max_file_xfers_per_project>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>90</save_stats_days>
      <rec_half_life_days>1.000000</rec_half_life_days>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
   </options>
</cc_config>
----------------------------------------------------------------------------------------------------------------------------------------------------------

The following is a "recreation"; I haven't actually tested it. I am also now reasonably confident that if you have two <app></app> in a row, the last one is the one that is "remembered".

The AMD 3950X box, which is where I tried to split running GW GPU and Pulsar#1 GPU tasks, would have looked like this:

<cc_config>
   <log_flags>
      <sched_op_debug>1</sched_op_debug>
   </log_flags>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>180</save_stats_days>
      <max_tasks_reported>50</max_tasks_reported>
      <max_file_xfers>4</max_file_xfers>
      <max_file_xfers_per_project>2</max_file_xfers_per_project>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>90</save_stats_days>
      <rec_half_life_days>1.000000</rec_half_life_days>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>

      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MD1</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRP5</app>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRP5</app>
         <app>hsgamma_FGRPB1G</app>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
   </options>
</cc_config>

----------------------------------------------------------------------------------------------------------------

The goal was to run some Pulsar tasks and some GW tasks simultaneously on different GPUs. As far as I could tell, I had the data and the apps in place, but they were not running at the same time, even though they had been the day before.

And I never did get it to the point where, when I thought I had both kinds of tasks, both were running.

Tom M

People before Prophets (or Profits as the case may be...)

 

Tom M
Joined: 2 Feb 06
Posts: 1,237
Credit: 2,428,846,582
RAC: 5,676,408

Gary Roberts wrote:
I don't know if I'm properly understanding what you are trying to do.  Please correct me if I'm not.

I think what I was trying to do was get some Pulsar tasks and some GW gpu tasks to run at the same time.  So I was trying to exclude GW from some gpus and Pulsar tasks from other gpus.

When I displayed all tasks in the BOINC Manager, it seemed to be showing both Pulsar and GW tasks. But I couldn't get both to run at the same time, even though both HAD been running before the question was posted.

So I was trying to see if I could get both to run at the same time.

I have read advice that you can switch from one type of task (say GW) to the other (say Pulsar#1) but can't get both to download "together."

Since I had been running E@H as a backup project previously, I wasn't paying much attention to what was going on, just as long as the GPUs stayed busy when S@H wasn't shipping out any tasks.

It is entirely possible I was laboring under a misunderstanding of what should be going on and how it should be going on.

I would really like to be able to "set and forget" and have "it" share running GW gpu and Pulsar#1 gpu tasks without having it run all one for a while and then run all other for a while, and never shall the two share one system.

My impression is that this is simply not yet possible.

Tom M.


 

Zalster
Joined: 26 Nov 13
Posts: 3,063
Credit: 3,362,824,762
RAC: 0

Hi Tom, 

 

I reviewed the cc_config that you posted. The first thing that jumps out is that you have 2 apps listed in each exclude. You can only have 1 there, as BOINC will go with the second. The other thing I see is that you listed both the CPU and GPU apps together in those excludes. O2MD1 and FGRP5 are, I believe, both CPU-only applications, so you should just remove those altogether and leave the other line in there. For the very last GPU, #1, you have 3 apps listed. There should only be 1.

In order to get work, you would need to move the cc_config out of the BOINC folder, launch BOINC, have a large cache set, and download a ton of work. Pause BOINC and exit. Put the cc_config back in and relaunch. It should then follow the excludes. Once it runs through all the work of 1 type (it usually does, as 1 processes faster than the other), you would need to pause BOINC and exit, remove the cc_config, relaunch BOINC, connect, update, and download more work. I used to do that every couple of days for Einstein.
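
As a rough, untested sketch of that advice applied to the 3950X layout posted earlier: one <app> per <exclude_gpu>, with only the GPU apps listed. It assumes einstein_O2MDF and hsgamma_FGRPB1G are the GPU app names, and that devices 0 and 1 are meant for GW while devices 2 and 3 are meant for Pulsar. Adjust the device numbers to your own box.

```xml
<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <!-- Assumption: einstein_O2MDF is the GW GPU app. Keep it off devices 2 and 3. -->
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>3</device_num>
         <type>NVIDIA</type>
         <app>einstein_O2MDF</app>
      </exclude_gpu>
      <!-- Assumption: hsgamma_FGRPB1G is the Pulsar GPU app. Keep it off devices 0 and 1. -->
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>1</device_num>
         <type>NVIDIA</type>
         <app>hsgamma_FGRPB1G</app>
      </exclude_gpu>
   </options>
</cc_config>
```

The CPU-only apps are left out entirely, since a GPU exclude should not matter for them.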

 

Z

Keith Myers
Joined: 11 Feb 11
Posts: 1,480
Credit: 3,094,362,296
RAC: 8,614,789

I just let the scheduler send me whatever it wants within my 120 gpu task limit.  I crunch both FGRPB1G and GW 2.07 gpu tasks on my three or four cards. They crunch according to FIFO and I seem to always have one of each type crunching at the same time.  Don't need to play any games with cc_config and app_config other than some max_concurrent statements and my pandora.config file.
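
For anyone wondering what such max_concurrent statements might look like, here is a minimal app_config.xml sketch for the Einstein project folder. The app names are the GPU app names mentioned earlier in the thread, and the limits are made-up placeholders, not the values actually used on these hosts.

```xml
<app_config>
   <!-- Hypothetical limit: at most 2 Pulsar GPU tasks running at once -->
   <app>
      <name>hsgamma_FGRPB1G</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <!-- Hypothetical limit: at most 2 GW GPU tasks running at once -->
   <app>
      <name>einstein_O2MDF</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
```

With per-app limits lower than the number of GPUs, neither task type can occupy every card by itself, which leaves room for the other type to run alongside.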

 

Tom M
Joined: 2 Feb 06
Posts: 1,237
Credit: 2,428,846,582
RAC: 5,676,408

Keith Myers wrote:

I just let the scheduler send me whatever it wants within my 120 gpu task limit.  I crunch both FGRPB1G and GW 2.07 gpu tasks on my three or four cards. They crunch according to FIFO and I seem to always have one of each type crunching at the same time.  Don't need to play any games with cc_config and app_config other than some max_concurrent statements and my pandora.config file.

 

TY.  I am going to switch to my "both" setting for my boxes.

 

Tom M


 
