Discussion Thread for the Continuous GW Search known as O2MD1 (now O2MDF - GPUs only)

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5888

Credit: 119691773863

RAC: 25361076

3 Sep 2020 20:48:48 UTC

Message 179828 in response to message 179824

Bernd Machenschalk wrote:

... it was reported somewhere ...

I sent you a PM about the warning that gets issued (clFFT resources not freed) if a GPU task gets suspended, e.g. if the multiplicity of concurrent tasks gets reduced and a task is suspended as a result.

For more background about why I sent the PM, you can check this message, and the messages that led up to it. The warning was in the stderr.txt output and I wasn't sure if it was important or not.

Thanks for the new test app. I guess cecht will be interested to see if it performs any differently when his script is reducing task multiplicity :-). Is there expected to be any benefit (reduced memory consumption) for running multiple tasks under conditions where multiplicity is being changed like this?

Cheers,
Gary.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4350

Credit: 253685298

RAC: 34623

4 Sep 2020 9:43:06 UTC

Message 179833 in response to message 179827

Richie wrote:

I'm receiving that only for AMD cards (for about 10 hours now). My Nvidia cards still get only 2.07.

Thanks for reporting. Indeed there was something wrong on the server side. Should be fixed.

cecht

Joined: 7 Mar 18

Posts: 1618

Credit: 3031290235

RAC: 1445295

5 Sep 2020 22:02:52 UTC

Message 179862

Gary Roberts wrote:

...Thanks for the new test app. I guess cecht will be interested to see if it performs any differently when his script is reducing task multiplicity :-). Is there expected to be any benefit (reduced memory consumption) for running multiple tasks under conditions where multiplicity is being changed like this?

So far as I can tell, 2.09 is running tasks with the same memory usage as 2.08. My fancy little script-in-development hasn't yet been put to the test to resolve excessive GPU memory use because the Spotlight tasks to date have had relatively light memory requirements. Wouldn't it be wonderful if Bernd's new app rendered obsolete my fancy little script? :-\

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4350

Credit: 253685298

RAC: 34623

10 Dec 2020 14:44:47 UTC

Message 181569

Apparently there's a problem with our GW app and the new OSX version "Big Sur" (described here). For now we'll stop delivering GW work to such systems.

Tom M

Joined: 2 Feb 06

Posts: 6884

Credit: 9800323299

RAC: 3565360

13 Dec 2020 1:30:24 UTC

Message 181615

I have found it possible to run a mix of RX 5700s and RX 580s under Windows 10.

However, the RX 580s are not really happy running 3 tasks per GPU with Gamma Ray. They might run 1 or 2 GW tasks, though.

I thought I would try segmenting the cards so that the RX 5700s do not run GW and the RX 580s do not run GR.

I think I have the profile set up to download both GR and GW.

Right now I am getting a "GW app not found" error because I don't have any downloaded.

According to the BOINC log, GPUs 0 and 1 are the RX 580s.

The cc_config.xml looks like this:

<cc_config>
  <log_flags>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
  <options>
    <max_tasks_reported>150</max_tasks_reported>
    <max_file_xfers>4</max_file_xfers>
    <max_file_xfers_per_project>4</max_file_xfers_per_project>
    <save_stats_days>180</save_stats_days>
    <ncpus>16</ncpus>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
      <app>hsgamma_FGRPB1G</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>1</device_num>
      <app>hsgamma_FGRPB1G</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>2</device_num>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>3</device_num>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>4</device_num>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>5</device_num>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
  </options>
</cc_config>

Am I missing anything or screwing something up?
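One way to sanity-check a file like this before restarting BOINC is to parse it and list which apps are excluded on which GPU. The following is only a rough sketch in Python: the function name summarize_exclusions and the shortened SAMPLE are made up for illustration, and the script only checks that the XML is well-formed and groups the exclude_gpu entries. It cannot tell whether the app names or the project URL match what the client actually expects.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize_exclusions(xml_text):
    """Map each excluded device number to the app names excluded on it."""
    # ET.fromstring raises xml.etree.ElementTree.ParseError on malformed XML,
    # so unbalanced or mistyped tags are caught here.
    root = ET.fromstring(xml_text)
    excluded = defaultdict(set)
    for entry in root.iter("exclude_gpu"):
        device = entry.findtext("device_num")
        app = entry.findtext("app")
        # Assumption: a missing <device_num> or <app> is treated as "all",
        # recorded here as None / "(all apps)".
        key = int(device) if device is not None else None
        excluded[key].add(app.strip() if app else "(all apps)")
    return {dev: sorted(apps) for dev, apps in excluded.items()}


# Shortened, hypothetical stand-in for the posted file: one entry per GPU group.
SAMPLE = """
<cc_config>
  <options>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
      <app>hsgamma_FGRPB1G</app>
    </exclude_gpu>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <device_num>2</device_num>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
  </options>
</cc_config>
"""

for device, apps in summarize_exclusions(SAMPLE).items():
    print(f"GPU {device}: excluded apps = {apps}")
```

Feeding it the full file text instead of SAMPLE should show GR excluded on devices 0 and 1 and GW excluded on devices 2 to 5, if the entries are as intended.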

Tom M

A Proud member of the O.F.A. (Old Farts Association).

363fc9cda368b2e...

Joined: 4 Sep 17

Posts: 10

Credit: 172312824

RAC: 0

maybe try removing exclude

13 Dec 2020 5:06:04 UTC

Message 181621

Try restarting BOINC after you've saved the config file, if you haven't already.

Tom M

Joined: 2 Feb 06

Posts: 6884

Credit: 9800323299

RAC: 3565360

6 Jan 2021 14:43:30 UTC

Message 182220

Because I wasn't getting any GR tasks, I toggled the "run non-preferred if preferred are not available" setting and it started downloading GW GPU tasks.

Which leads me to the question: how many concurrent tasks should I be running on my RX 5700s?

So far, running two or even three tasks at a time has not pushed my GPU utilization above 80% on a regular basis.

And with 2 tasks running I am noticing a very high level of "copy" activity in Windows Task Manager for the GPUs. I am used to 1-4%. I am getting 7-21%.

That leads me to wonder whether I am getting "thrashing", with the two tasks being shuttled in and out of GPU memory. Usually, that implies a slowdown in processing.

So far the tasks seem to be executing OK.

Any guidance on if I should be running 3-4 tasks for GW or 1 only would be appreciated. As usual, the goal is maximizing production.

Tom M

A Proud member of the O.F.A. (Old Farts Association).

Ian&Steve C.

Joined: 19 Jan 20

Posts: 4155

Credit: 50100078219

RAC: 42358047

6 Jan 2021 15:36:53 UTC

Message 182221

GW tasks are much more dependent on CPU speed/capability. Make sure you are leaving enough free CPU resources. I'm not sure about the behavior of the Windows_AMD app, but the Linux_nvidia app routinely uses more than 1 CPU thread per GPU GW task. In my mind, your 3950X should be more than capable of feeding the GPU, unless there's some major difference in the behavior of the Windows_AMD app.

I wouldn't run 4x on your 5700s, since they only have 8GB of VRAM. I'd stick to 2-3x. Again, make sure that you have enough spare CPU available, and reduce your CPU work if you have to.

I run 1x on my GPUs, and get good GPU utilization on my systems, so no reason to run 2x for me:

RTX 2080 Ti @225W (4.2GHz Ryzen 3900X): ~90-95%

RTX 2080 Ti @215W (3.3GHz EPYC 7402P): ~85-90%

GTX 1650 @75W (4.2GHz Ryzen 3900X): ~95%

RTX 2070 @150W (3.3GHz EPYC 7402P): ~90%

RTX 3070 @200W (2.8GHz EPYC 7642): ~90%

GTX 1660 Super @100W (2.8GHz EPYC 7642): ~90-95%
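
The two limits mentioned in this post (VRAM per card and spare CPU threads) can be turned into a rough upper bound on multiplicity. This is a back-of-envelope sketch only: the function max_multiplicity, the per-task VRAM figures, and the 0.5GB reserve are all hypothetical placeholders, not measured values for any GW app. Real per-task memory use has to be read off your own system.

```python
import math


def max_multiplicity(vram_gb, per_task_vram_gb, free_cpu_threads,
                     cpu_threads_per_task=1.0, vram_reserve_gb=0.5):
    """Rough upper bound on concurrent GPU tasks for one card.

    All inputs are user-supplied estimates; vram_reserve_gb is an assumed
    allowance for the desktop/driver, not a documented figure.
    """
    if per_task_vram_gb <= 0 or cpu_threads_per_task <= 0:
        raise ValueError("per-task requirements must be positive")
    # Tasks that fit in VRAM after the reserve is set aside.
    by_vram = math.floor((vram_gb - vram_reserve_gb) / per_task_vram_gb)
    # Tasks the spare CPU threads can support.
    by_cpu = math.floor(free_cpu_threads / cpu_threads_per_task)
    # The tighter of the two limits wins; never report a negative count.
    return max(0, min(by_vram, by_cpu))


# Hypothetical example: 8GB card, 2.5GB per task, 16 free CPU threads.
print(max_multiplicity(8, 2.5, 16))
```

Whether running at that bound is actually faster than 1x still has to be measured, since utilization can already be high at 1x on some setups.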

Ian&Steve C.

Joined: 19 Jan 20

Posts: 4155

Credit: 50100078219

RAC: 42358047

6 Jan 2021 17:13:52 UTC

Message 182228

A quick look at various systems suggests that Nvidia has a significant edge over AMD for these GW tasks.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5888

Credit: 119691773863

RAC: 25361076

6 Jan 2021 23:56:55 UTC

Message 182235 in response to message 182220

Tom M wrote:

Any guidance on if I should be running 3-4 tasks for GW or 1 only would be appreciated.

About a month ago, I set up 5 hosts with AMD RX 570 8GB GPUs specifically for GW tasks. Each machine runs tasks 4x and none have had any 'lack of memory' issues with the 8GB GPUs at 4x multiplicity. At the start, when I first tested, there were distinct improvements in going from 1x right through to 4x, although going from 1x to 2x was the biggest gain.

The CPUs are quite basic and do not crunch. They include AMD 3000G, Intel G4560, and Intel i3-9100F, all with 4 CPU threads which support the 4x GPU tasks. There is also a 6 core Ryzen 5 2600 for comparison. Somewhat surprisingly, the hosts with weaker CPUs keep up with and often beat the Ryzen. 4 threads are certainly enough to support 4 concurrent GPU tasks it would seem.

The machine performing the best is actually the 3000G, which completes 4 tasks in about 32 to 36 mins most of the time (i.e. 8-9 mins on a per-task basis). All the others take around 35 to 39 mins most of the time. There are a very small number of 'outliers' - tasks that take 40-50% longer than normal. Out of around 2500-3000 tasks in a particular frequency 'series', I've noticed perhaps 30 or so tasks so far that show this behaviour. I don't have full coverage of all the issue numbers in a full series, so there could be a few more that show up.

Something similar happened in the previous run and it seems to be due to this small number of tasks simply having that much more work to do than the standard task. I deliberately crunched some of these at lower multiplicities and they took that much longer than the standard tasks there as well. So not a behaviour caused by lack of memory, it would seem.

These hosts are currently crunching frequencies in the range 450Hz to 475Hz. Like last time, memory issues might creep in at much higher frequencies but everything seems OK at 4x for an 8GB AMD GPU at the moment.
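
For anyone comparing multiplicities on their own hosts, the per-task figures above are just the batch wall time divided by the number of concurrent tasks. Here is a small sketch of that arithmetic in Python. The function names are made up for illustration, and it assumes all tasks in a batch start and finish together, which is only approximately true in practice.

```python
def per_task_minutes(batch_minutes, multiplicity):
    """Effective minutes per task when `multiplicity` tasks finish together."""
    return batch_minutes / multiplicity


def tasks_per_day(batch_minutes, multiplicity):
    """Tasks completed in 24 hours at a steady batch time."""
    return multiplicity * 24 * 60 / batch_minutes


# Batch times quoted in the post for 4x on the 3000G host: 32 to 36 mins.
for batch in (32, 36):
    print(batch, per_task_minutes(batch, 4), tasks_per_day(batch, 4))
```

Comparing tasks_per_day at 1x, 2x, 3x and 4x, using batch times measured on the same host, is a simple way to see whether higher multiplicity is still paying off.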

Cheers,
Gary.
