Discussion Thread for the Continuous GW Search known as O2MD1 (now O2MDF - GPUs only)

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117695605814

RAC: 35069280

1 Mar 2020 22:45:51 UTC

Message 175809 in response to message 175795

Bill Lumley wrote:

This forum was where I found out that my 2 GPUs (GCN 1.0 architecture, Radeon HD-7970M with 2GB operating at 850 MHz) were not compatible anymore ... with 2 months of nothing but failures it was frustrating trying to find out if it was my computer ... drivers etc ... stopping and starting again ... resetting the project ...

It's only one particular search (the GW search on GPUs) that doesn't seem to work with GCN 1st gen hardware. The gamma-ray pulsar search works fine.

Bill Lumley wrote:

This should be announced from the pulpit .... front and centre that this architecture is no longer compatible with the new runs and not to even try it .... not be found in the back of a forum after 26 pages....!!!!!

Maybe you need to stop and think for a moment about what your expectations actually imply. Projects like this one have a group of scientists/research students working on particular projects requiring analysis of large amounts of data. Somebody (perhaps a scientist) writes the code to do the analysis. A small (perhaps very small) group of technical staff are responsible for building and deploying apps, preparing data, maintaining servers and databases, etc, etc, so that the analysis can be done.

They would have access to a couple of test machines on which to test everything. How could they possibly know if every hardware/software variant that a volunteer might bring to the table is going to work without issue? They don't have the vast spread of possible hardware and they most certainly don't have the time to test and document everything.

This is where the forums come in. When a new search is started, lots of people will share experiences, report problems and help with finding solutions. If you run into an issue, rather than banging your head against a brick wall for 2 months, start a thread in the problems section giving as much information as possible about your problem. Chances are very good that someone will be able to help you solve it. It takes maybe 10 mins per day to keep up with problems/solutions being discussed.

Don't expect an edict from the pulpit. It's the plebs down in the pews that will be your best source of comfort :-).

Cheers,
Gary.

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

31 Mar 2020 23:48:03 UTC

Message 176291

I'm not receiving new tasks. I guess the staff are examining and maybe tuning something. It will be nice to see what comes once pressure is let back into the pipelines.

edit: It was just a short pause. Receiving tasks again.

cecht

Joined: 7 Mar 18

Posts: 1535

Credit: 2909492073

RAC: 2104268

13 Apr 2020 22:42:33 UTC

Message 176583

Here are data showing how limited CPU resources can constrain GW GPU task crunching. For comparison, data are also provided for FGRPB1G pulsar GPU tasks. Measurements were taken on a Linux host with a Pentium G5600 (2c/4t; 3.9 GHz) running two RX 570 GPUs. Table footnotes and methods are given at the bottom.

Gravitational Wave GPU app v2.08 for O2MDFG3 data series

Run config   Task t (min)   Yield (tasks/day)   W (watts)   CPU %   Wh/T (Wh per task)
1 GPU@1x     37             39                  69          25      42
1 GPU@2x     19             77                  89          50      27
2 GPU@1x     33             88                  122         45      33
2 GPU@2x     54             54                  104         90      46

Gamma-ray binary pulsar app v1.18

Run config   Task t (min)   Yield (tasks/day)   W (watts)   CPU %   Wh/T (Wh per task)
2 GPU@3x     9.7            297                 214         25      17

What this shows is that the fastest GW task times occur with a single GPU running 2x tasks, although the best daily task yield is realized with two GPUs running 1x tasks. This higher yield, however, comes at a somewhat lower power efficiency (higher Wh per task). As long as CPU usage stays in the 25% - 50% range, throughput scales nearly proportionally with the number of concurrent GPU tasks.

Running two GPUs @ 2x, however, gives the slowest task time by far. This configuration also had a disproportionately lower wattage (from lower GPU utilization, not shown) and CPU usage that was near its limit. These numbers demonstrate what others have observed, that multiple GW GPU tasks can be throttled by limited CPU resources.

Therefore, if I'm interpreting these numbers correctly, to get the most out of GPUs for GW crunching, two CPU cores/threads are needed for each GPU task. For example, if I upgraded to an 8-core CPU (e.g. i7-9700K), then I should realize task times of ~19 min when running two GPUs @ 2x (4 concurrent tasks). Right? Maybe?
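As a rough cross-check on that interpretation, the CPU % column can be turned into "threads kept busy per concurrent task". This is only arithmetic on the GW table, assuming the CPU % figure is an average across all four hardware threads of the G5600; the variable and function names are made up for illustration.

```python
# Back-of-envelope arithmetic on the GW table; not a measurement.
# Assumption: CPU % is averaged across the host's 4 hardware threads.
HOST_THREADS = 4

# (run config, concurrent GW tasks, CPU % from the table)
runs = [
    ("1 GPU@1x", 1, 25),
    ("1 GPU@2x", 2, 50),
    ("2 GPU@1x", 2, 45),
    ("2 GPU@2x", 4, 90),
]

def threads_busy(cpu_pct, host_threads=HOST_THREADS):
    """Convert an all-threads CPU percentage into a count of busy threads."""
    return cpu_pct / 100 * host_threads

for label, tasks, cpu_pct in runs:
    busy = threads_busy(cpu_pct)
    print(f"{label}: {busy:.1f} threads busy, {busy / tasks:.2f} per task")
```

On these numbers each concurrent task keeps roughly one thread busy, and the 2 GPU@2x run leaves almost no headroom on a 4-thread CPU, which fits the throttling described above. Whether a second thread per task is really needed is something the table alone can't settle.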

Table footnotes and methods:

BBcode table formatting was generated at https://theenemy.dk/table/

Run configuration: 1x is a GPU utilization factor (GUF) of 1, 2x is 0.5 GUF. GUF is set in Project Preferences or the app_config.xml file.

Task t: realized single-task completion time in minutes = BOINC run time in minutes × GUF.

Yield: tasks per day = (24 h / Task t in hours) × number of GPUs.

W: crunch watts = wattage measured with a power meter at the wall, minus the host's resting-state draw of 56 W.

CPU %: CPU usage averaged by eye across four CPU threads from a 1-minute graph of CPU activity in gnome-system-monitor. (I need a better way to measure this.)

Wh/T: watt-hours per task = W × 24 h / Yield.

One GPU was excluded from crunching by using the exclude_gpu option in the cc_config.xml file:

<options>
  <exclude_gpu>
    <url>einstein.physics.uwm.edu</url>
    <device_num>1</device_num>
    <app>einstein_O2MDF</app>
  </exclude_gpu>
</options>
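The Yield and Wh/T footnote formulas are easy to check against the tables. Here is a small Python sketch of them; the function names are just for illustration, and the inputs are the rounded values as posted.

```python
def tasks_per_day(task_minutes, n_gpus):
    """Yield = 24 hr / task time in hours x number of GPUs."""
    return 24 * 60 / task_minutes * n_gpus

def watt_hours_per_task(crunch_watts, yield_per_day):
    """Wh/T = W x 24 h / Yield."""
    return crunch_watts * 24 / yield_per_day

# GW app, 1 GPU@1x row: 37 min per task, 69 W
print(round(tasks_per_day(37, 1)), round(watt_hours_per_task(69, 39)))

# Pulsar app, 2 GPU@3x row: 9.7 min per task, 214 W
print(round(tasks_per_day(9.7, 2)), round(watt_hours_per_task(214, 297)))
```

Not every row reproduces exactly from the rounded table values (1 GPU@2x works out to about 76 tasks per day rather than 77), presumably because the posted task times are themselves rounded.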

Notes & Observations:
System processes sdma0 and comp_1.0.0 generate MUCH higher rates of CPU utilization when BOINC is running 2 GPUs @ 2x vs @ 1x (as seen in top). SDMA has something to do with memory allocation between the GPU and CPU, I think; no idea what comp does.
GW GPU tasks at 2 GPUs @ 1x seem to come in two flavors: short run times (~26 min) and longer run times (~37 min, and up to 45 min). The short times are accompanied by higher GPU utilization and higher wattage. Times were averaged from both flavors using 90 samples.
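On the CPU % footnote's wish for something better than eyeballing a graph: one option on Linux is to sample the aggregate counters in /proc/stat at the start and end of an interval and take the difference. A minimal Python sketch, assuming a Linux host; the function names are made up for illustration and nothing here comes from BOINC itself.

```python
import time

def busy_total(values):
    """Split the numeric fields of the aggregate 'cpu' line into (busy, total).

    Field order: user nice system idle iowait irq softirq steal guest guest_nice
    """
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    total = sum(values[:8])  # guest time is already counted in user/nice
    return total - idle, total

def read_cpu_times(path="/proc/stat"):
    """Read the first ('cpu') line of /proc/stat, which sums all threads."""
    with open(path) as f:
        fields = f.readline().split()
    return busy_total([int(v) for v in fields[1:]])

def cpu_percent(before, after):
    """Average CPU usage in percent between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total else 0.0

def average_cpu_percent(seconds):
    """Average CPU usage across all threads over the given interval."""
    start = read_cpu_times()
    time.sleep(seconds)
    return cpu_percent(start, read_cpu_times())

# e.g. print(f"{average_cpu_percent(60):.1f}% average over 60 s")
```

This counts everything running on the host, not just the GW tasks, so it is comparable to the system-monitor graph rather than a per-task figure.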

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Zalster

Joined: 26 Nov 13

Posts: 3117

Credit: 4050672230

RAC: 0

14 Apr 2020 14:41:20 UTC

Message 176604 in response to message 176583

Did you measure to see what the data stream looks like crossing the bus? Way back with Seti during OpenCL, we noticed a saturation of the buses when running more than one work unit per card. The higher the number of work units, the slower the results were due to the saturation.

cecht

Joined: 7 Mar 18

Posts: 1535

Credit: 2909492073

RAC: 2104268

14 Apr 2020 19:44:34 UTC

Message 176611 in response to message 176604

Zalster wrote:

Did you measure to see what the data stream looks like crossing the bus? Way back with Seti during OpenCL, we noticed a saturation of the buses when running more than one work unit per card. The higher the number of work units, the slower the results were due to the saturation.

Good thought, thanks. I need to look up how to do that.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Zalster

Joined: 26 Nov 13

Posts: 3117

Credit: 4050672230

RAC: 0

14 Apr 2020 20:08:32 UTC

Message 176613 in response to message 176611

cecht wrote:

Good thought, thanks. I need to look up how to do that.

I wish I could help with that. It was in the Windows era before I moved to the dark side with Linux. Nvidia's Precision X allowed you to look at the bus and see what percentage was being utilized along with how much was crossing. You could literally watch it hit a point where it didn't go up anymore and then time to complete would just get longer. I'm sure there is other software out there that does the same thing but I don't know enough about it to help. Keith Myers might. He seems to be up to date on all things Linux. I'll have to ask him the next time I talk to him.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3958

Credit: 47039252642

RAC: 65108076

15 Apr 2020 1:39:09 UTC

Message 176621 in response to message 176613

He’s talking about an AMD GPU though.

No clue how to see that info for AMD.

_________________________________________________________________________

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18751785730

RAC: 7105539

15 Apr 2020 2:29:31 UTC

Message 176625 in response to message 176621

The best tool for AMD cards is the Wattman utility. Another mandatory tool is amdgpu-utils from Rick's Lab. You know Rick from the Seti forums as RuieKe and from the great Bench-MT tool.

https://github.com/Ricks-Lab/amdgpu-utils

There are resources for AMD cards actually.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3958

Credit: 47039252642

RAC: 65108076

15 Apr 2020 5:35:28 UTC

Message 176629 in response to message 176625

Does Wattman run on Linux though?

Does Rick’s tool show PCIe utilization for AMD cards?

_________________________________________________________________________

Mr P Hucker

Joined: 12 Aug 06

Posts: 838

Credit: 519312922

RAC: 13792

15 Apr 2020 11:06:01 UTC

Message 176634

In the last day, my O2MDF WUs have been taking 4-5 times longer. Is this normal or is there something up with my system? Gamma on the same GPU takes the usual time.

If this page takes an hour to load, reduce posts per page to 20 in your settings, then the tinpot 486 Einstein uses can handle it.
