Discussion Thread for the Continuous GW Search known as O2MD1 (now O2MDF - GPUs only)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117699335793
RAC: 35101307

Bill Lumley wrote:
This forum was where I found out that my 2 GPUs (GCN 1.0 architecture, Radeon HD 7970M with 2 GB operating at 850 MHz) were not compatible anymore ... with 2 months of nothing but failures it was frustrating trying to find out if it was my computer ... drivers etc. ... stopping and starting again ... resetting the project ...

It's only one particular search (the GW search on GPUs) that doesn't seem to work with GCN 1st gen hardware.  The gamma-ray pulsar search works fine.

Bill Lumley wrote:
This should be announced from the pulpit ... front and centre, that this architecture is no longer compatible with the new runs and not to even try it ... not be found in the back of a forum after 26 pages!

Maybe you need to stop and think for a moment about what your expectations actually imply.  Projects like this one have a group of scientists/research students working on particular projects requiring analysis of large amounts of data.  Somebody (perhaps a scientist) writes the code to do the analysis.  A small (perhaps very small) group of technical staff are responsible for building and deploying apps, preparing data, maintaining servers and databases, etc, etc, so that the analysis can be done.

They would have access to a couple of test machines on which to test everything.  How could they possibly know if every hardware/software variant that a volunteer might bring to the table is going to work without issue?  They don't have the vast spread of possible hardware and they most certainly don't have the time to test and document everything.

This is where the forums come in.  When a new search is started, lots of people will share experiences, report problems and help with finding solutions.  If you run into an issue, rather than banging your head against a brick wall for 2 months, start a thread in the problems section giving as much information as possible about your problem.  Chances are very good that someone will be able to help you solve it.  It takes maybe 10 mins per day to keep up with problems/solutions being discussed.

Don't expect an edict from the pulpit.  It's the plebs down in the pews that will be your best source of comfort :-).

 

Cheers,
Gary.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

I'm not receiving new tasks. I guess the staff are examining and maybe tuning something. It will be nice to see what comes once pressure is let back into the pipelines.

 

edit: It was just a short pause. Receiving tasks again.

cecht
Joined: 7 Mar 18
Posts: 1535
Credit: 2909725401
RAC: 2106160

Here are data showing potential limitations of CPU resources on GW GPU task crunching. For comparison, data are also provided for FGRBPG1 pulsar GPU tasks. Measurements were from a Linux host with a Pentium G5600 (2c/4t; 3.9 GHz) running two RX 570 GPUs. Table footnotes and methods are given at the bottom.

Gravitational Wave GPU app v2.08 for O2MDFG3 data series

Run config | Task t | Yield | W   | CPU % | Wh/T
1 GPU@1x   | 37     | 39    | 69  | 25    | 42
1 GPU@2x   | 19     | 77    | 89  | 50    | 27
2 GPU@1x   | 33     | 88    | 122 | 45    | 33
2 GPU@2x   | 54     | 54    | 104 | 90    | 46

Gamma-ray binary pulsar app v1.18

Run config | Task t | Yield | W   | CPU % | Wh/T
2 GPU@3x   | 9.7    | 297   | 214 | 25    | 17

What this shows is that the fastest GW task times occur with a single GPU running 2x tasks, although the best daily task yield is realized with two GPUs running 1x tasks. This higher yield, however, comes at a somewhat lower power efficiency (higher Wh per task). As long as CPU usage is 25%-50%, GPU task utilization scales nearly proportionally.

Running two GPUs @ 2x, however, gives the slowest task time by far. This configuration also had a disproportionately lower wattage (from lower GPU utilization, not shown) and CPU usage that was near its limit. These numbers demonstrate what others have observed, that multiple GW GPU tasks can be throttled by limited CPU resources.

Therefore, if I'm interpreting these numbers correctly, to get the most out of GPUs for GW crunching, two CPU cores/threads are needed for each GPU task. For example, if I upgraded to an 8-core CPU (e.g. i7-9700K), then I should realize task times of ~19 min when running two GPUs @ 2x (4 concurrent tasks). Right? Maybe?

Table footnotes and methods:

BBcode table formatting was generated at https://theenemy.dk/table/

Run configuration: 1x is a GPU utilization factor (GUF) of 1, 2x is 0.5 GUF. GUF is set in Project Preferences or the app_config.xml file.

Task t, Realized single task completion time in minutes = BOINC run time in minutes X GUF.

Yield, Tasks per day = 24 hr / Task t in hours X #GPUs.

W, Crunch Watts = Wattage as measured with a power meter at the wall, minus the host resting state of 56W. 

CPU % is CPU usage averaged by eye across four CPU threads from a 1-minute graph of CPU activity in gnome-system-monitor. (I need a better way to measure this.)

Wh/T, Watt hours per task = W X 24h / Yield
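As a cross-check, the Yield and Wh/T formulas above can be sketched in a few lines of Python. The function names are my own, and the inputs are the rounded Task t and W values copied from the tables, so small differences from the posted Yield figures are to be expected if those were computed from unrounded times.

```python
# Recompute Yield and Wh/T from the footnote formulas.
# Rows are (label, task_minutes, n_gpus, crunch_watts), copied from the tables.

def tasks_per_day(task_minutes, n_gpus):
    """Yield = 24 hr / Task t in hours x number of GPUs."""
    return 24 * 60 / task_minutes * n_gpus

def watt_hours_per_task(crunch_watts, yield_per_day):
    """Wh/T = W x 24 h / Yield."""
    return crunch_watts * 24 / yield_per_day

ROWS = [
    ("GW 1 GPU@1x", 37, 1, 69),
    ("GW 1 GPU@2x", 19, 1, 89),
    ("GW 2 GPU@1x", 33, 2, 122),
    ("GW 2 GPU@2x", 54, 2, 104),
    ("FGRBPG1 2 GPU@3x", 9.7, 2, 214),
]

for label, minutes, gpus, watts in ROWS:
    y = tasks_per_day(minutes, gpus)
    wh = watt_hours_per_task(watts, y)
    print(f"{label:18s} yield = {y:6.1f} tasks/day   Wh/T = {wh:5.1f}")
```

Swapping in your own run times and wall-meter readings gives a quick comparison between configurations on other hosts.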

One GPU was excluded from crunching by using the exclude_gpu option in the cc_config.xml file:

<options>
  <exclude_gpu>
    <url>einstein.physics.uwm.edu</url>
    <device_num>1</device_num>
    <app>einstein_O2MDF</app>
  </exclude_gpu>
</options>

Notes & Observations:
System processes sdma0 and comp_1.0.0 generate MUCH higher rates of CPU utilization when boinc is running 2 GPUs @ 2x vs @ 1x. ($ top) SDMA has something to do with memory allocation between the GPU and CPU, I think; no idea what comp does.
GW GPU tasks at 2 GPUs@1x seem to come in two flavors: short run times (~26 min) and longer run times (~37 min, and up to 45 min). The short times are accompanied by higher GPU utilization and higher wattage. Times were averaged from both flavors using 90 samples.
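On the CPU % footnote: one possible alternative to eyeballing a graph, on Linux, is to take two snapshots of the aggregate "cpu" line in /proc/stat and compute the non-idle fraction between them. This is only a sketch with function names of my own choosing. It counts iowait as idle, which is an assumption you may want to change, and it was not the method used for the numbers in the tables.

```python
import time

def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu ...' line of /proc/stat."""
    values = [int(v) for v in line.split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    total = sum(values[:8])  # guest times are already included in user/nice
    return idle, total

def cpu_percent(before, after):
    """Percent of CPU time spent non-idle between two (idle, total) snapshots."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - idle_delta / total_delta)

def sample(interval_s=60.0, path="/proc/stat"):
    """Average CPU usage across all threads over interval_s seconds."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return cpu_percent(before, after)
```

Calling sample(60.0) while tasks are crunching returns one number averaged over all four threads for that minute, which should be comparable to the CPU % column.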

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Did you measure to see what the data stream looks like crossing the bus? Way back with SETI, during the OpenCL days, we noticed saturation of the bus when running more than 1 work unit per card. The higher the number of work units, the slower the results were, due to the saturation.

cecht
Joined: 7 Mar 18
Posts: 1535
Credit: 2909725401
RAC: 2106160

Zalster wrote:
Did you measure to see what the data stream looks like crossing the bus? Way back with SETI, during the OpenCL days, we noticed saturation of the bus when running more than 1 work unit per card. The higher the number of work units, the slower the results were, due to the saturation.

Good thought, thanks.  I need to look up how to do that.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

cecht wrote:
Good thought, thanks.  I need to look up how to do that.

 

I wish I could help with that. It was in the Windows era, before I moved to the dark side with Linux. Nvidia's Precision X allowed you to look at the bus and see what percentage was being utilized, along with how much was crossing. You could literally watch it hit a point where it didn't go up anymore, and then time to complete would just get longer. I'm sure there is other software out there that does the same thing, but I don't know enough about it to help. Keith Myers might. He seems to be up to date on all things Linux. I'll have to ask him the next time I talk to him.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47045792642
RAC: 65130132

He’s talking about an AMD GPU though.

No clue how to see that info for AMD.

_________________________________________________________________________

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18752661628
RAC: 7124898

The best tool for AMD cards is the Wattman utility. Another mandatory tool is amdgpu-utils from Rick's Lab. You know Rick from the SETI forums as RuieKe and for the great Bench-MT tool.

https://github.com/Ricks-Lab/amdgpu-utils

There are resources for AMD cards actually.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47045792642
RAC: 65130132

Does Wattman run on Linux though?

Does Rick’s tool show PCIe utilization for AMD cards?

_________________________________________________________________________

Mr P Hucker
Joined: 12 Aug 06
Posts: 838
Credit: 519315251
RAC: 13889

In the last day, my O2MDF WUs have been taking 4-5 times longer.  Is this normal or is there something up with my system?  Gamma on the same GPU takes the usual time.

If this page takes an hour to load, reduce posts per page to 20 in your settings, then the tinpot 486 Einstein uses can handle it.
