Bernd Machenschalk wrote:...
I sent you a PM about the warning that gets issued (clFFT resources not freed) if a GPU task gets suspended, e.g. if the multiplicity of concurrent tasks is reduced and a task gets suspended as a result.
For more background about why I sent the PM, you can check this message, and the messages that led up to it. The warning was in the stderr.txt output and I wasn't sure if it was important or not.
Thanks for the new test app. I guess cecht will be interested to see if it performs any differently when his script is reducing task multiplicity :-). Is there expected to be any benefit (reduced memory consumption) for running multiple tasks under conditions where multiplicity is being changed like this?
Cheers,
Gary.
Richie wrote: I'm receiving
Thanks for reporting. Indeed there was something wrong on the server side. Should be fixed.
BM
Gary Roberts wrote:...Thanks
So far as I can tell, 2.09 is running tasks with the same memory usage as 2.08. My fancy little script-in-development hasn't yet been put to the test to resolve excessive GPU memory use because the Spotlight tasks to date have had relatively light memory requirements. Wouldn't it be wonderful if Bernd's new app rendered obsolete my fancy little script? :-\
Ideas are not fixed, nor should they be; we live in model-dependent reality.
Apparently there's a problem with our GW app and the new OSX version "Big Sur" (described here). For now we'll stop delivering GW work to such systems.
BM
I have found it possible to run a mix of RX 5700s and RX 580s under Windows 10.
However, the RX 580s are not really happy running 3 tasks per GPU with Gamma Ray. They might run 1 or 2 GW tasks, though.
I thought I would try segmenting things so that the RX 5700s do not run GW and the RX 580s do not run GR.
I think I have the profile set up to download both GR and GW.
Right now I am getting a "GW app not found" error because I don't have any GW tasks downloaded.
According to the BOINC log, GPUs 0 and 1 are the RX 580s.
The cc_config.xml looks like this:
Am I missing anything or screwing something up?
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
maybe try removing exclude
Try restarting BOINC after you’ve saved the config file, if you haven’t already.
Because I wasn't getting any GR tasks, I toggled the "run non-preferred if preferred are not available" setting and it started downloading GW GPU tasks.
Which leads me to the question: how many tasks should I be running at a time on each of my RX 5700s?
So far, running two or even three tasks at a time has not pushed my GPU utilization above 80% on a regular basis.
And with 2 tasks running I am noticing a very high level of "copy" activity for the GPUs in the Windows Task Manager. I am used to 1-4%. I am getting 7-21%.
That leads me to wonder whether I am getting "thrashing", with two different tasks being shuttled in and out of GPU memory. Usually, that implies a slowdown in processing.
So far the tasks seem to be executing OK.
Any guidance on whether I should be running 3-4 tasks for GW, or only 1, would be appreciated. As usual, the goal is maximizing production.
Tom M
GW tasks are much more dependent on CPU speed/capability. Make sure you are leaving enough free CPU resources. I'm not sure about the behavior of the Windows_AMD app, but the Linux_nvidia app routinely uses more than 1 CPU thread per GPU GW task. In my mind, your 3950X should be more than capable of feeding the GPU, unless there's some major difference in the behavior of the Windows_AMD app.
I wouldn't run 4x on your 5700s, since they only have 8GB of VRAM. I'd stick to 2-3x. Again, make sure that you have enough spare CPU available, and reduce your CPU work if you have to.
I run 1x on my GPUs and get good GPU utilization on my systems, so there's no reason for me to run 2x:
RTX 2080 Ti @225W (4.2GHz Ryzen 3900X) : ~90-95%
RTX 2080 Ti @215W (3.3GHz EPYC 7402P) : ~85-90%
GTX 1650 @75W (4.2GHz Ryzen 3900X) : ~95%
RTX 2070 @150W (3.3GHz EPYC 7402P) : ~90%
RTX 3070 @200W (2.8GHz EPYC 7642) : ~90%
GTX 1660 Super @100W (2.8GHz EPYC 7642) : ~90-95%
_________________________________________________________________________
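For anyone wanting to try the 2x or 3x settings discussed above, here is a minimal Python sketch that generates an app_config.xml for a given multiplicity. It assumes the standard BOINC app_config.xml layout (`gpu_versions` with `gpu_usage` and `cpu_usage`). The function name `build_app_config` and the app name `YOUR_GW_APP_NAME` are placeholders invented for illustration; the real app name has to be looked up in your own client files, and the right `cpu_usage` for your host is something to test, not a recommendation.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpus_per_task: float = 1.0) -> str:
    """Return app_config.xml text asking BOINC to run `tasks_per_gpu` tasks per GPU.

    `app_name` is a placeholder here; use the short app name from your own client.
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims this fraction of a GPU: 0.50 gives 2x, 0.33 gives 3x, 0.25 gives 4x.
    ET.SubElement(gpu, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.2f}"
    # CPU threads budgeted per GPU task (an assumption to tune per host).
    ET.SubElement(gpu, "cpu_usage").text = f"{cpus_per_task:.2f}"

    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example: 3 concurrent tasks per GPU, 1 CPU thread reserved for each.
    print(build_app_config("YOUR_GW_APP_NAME", tasks_per_gpu=3))
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory, followed by re-reading the config files or restarting the client.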
From a quick look at various systems, it looks like Nvidia has a significant edge over AMD for these GW tasks.
_________________________________________________________________________
Tom M wrote: Any guidance on if I should be running 3-4 tasks for GW or 1 only would be appreciated.
About a month ago, I set up 5 hosts with AMD RX 570 8GB GPUs specifically for GW tasks. Each machine runs tasks 4x and none have had any 'lack of memory' issues with the 8GB GPUs at 4x multiplicity. When I first tested, there were distinct improvements in going from 1x right through to 4x, although going from 1x to 2x gave the biggest gain.
The CPUs are quite basic and do not crunch. They include AMD 3000G, Intel G4560, and Intel i3-9100F, all with 4 CPU threads which support the 4x GPU tasks. There is also a 6 core Ryzen 5 2600 for comparison. Somewhat surprisingly, the hosts with weaker CPUs keep up with and often beat the Ryzen. It would seem that 4 threads are certainly enough to support 4 concurrent GPU tasks.
The machine performing the best is actually the 3000G, which completes 4 tasks in about 32 to 36 mins most of the time (i.e. 8-9 mins on a per task basis). All the others take around 35 to 39 mins most of the time. There are a very small number of 'outliers': tasks that take 40-50% longer than normal. Out of around 2500-3000 tasks in a particular frequency 'series', I've so far noticed perhaps 30 or so tasks that show this behaviour. I don't have full coverage of all the issue numbers in a full series, so there could be a few more that show up.
Something similar happened in the previous run, and it seems to be due to this small number of tasks simply having that much more work to do than the standard task. I deliberately crunched some of these at lower multiplicities and they took that much longer than the standard tasks there as well. So it would seem this is not a behaviour caused by lack of memory.
These hosts are currently crunching frequencies in the range 450Hz to 475Hz. Like last time, memory issues might creep in at much higher frequencies, but everything seems OK at 4x for an 8GB AMD GPU at the moment.
Cheers,
Gary.
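The per-task arithmetic in the post above can be reproduced with a short Python sketch. It only uses the figures quoted there (4x multiplicity, 32 to 36 mins per batch on the 3000G, 35 to 39 mins on the other hosts); the helper names `per_task_minutes` and `tasks_per_day` are made up for illustration, and the daily figure assumes the host runs GW tasks around the clock at a steady rate, which real hosts may not do.

```python
def per_task_minutes(batch_minutes: float, multiplicity: int) -> float:
    """Effective wall-clock minutes per task when `multiplicity` tasks finish together."""
    return batch_minutes / multiplicity


def tasks_per_day(batch_minutes: float, multiplicity: int) -> float:
    """Tasks completed in 24 hours, assuming back-to-back batches with no idle time."""
    minutes_per_day = 24 * 60
    return multiplicity * minutes_per_day / batch_minutes


if __name__ == "__main__":
    multiplicity = 4
    # (host label, fastest batch time, slowest batch time) in minutes, from the post.
    hosts = [("3000G", 32, 36), ("other hosts", 35, 39)]
    for label, fast, slow in hosts:
        print(
            f"{label}: {per_task_minutes(fast, multiplicity):.1f}-"
            f"{per_task_minutes(slow, multiplicity):.1f} min/task, "
            f"{tasks_per_day(slow, multiplicity):.0f}-"
            f"{tasks_per_day(fast, multiplicity):.0f} tasks/day"
        )
```

For the 3000G this gives the 8-9 mins per task quoted in the post, which works out to roughly 160-180 tasks per day under the round-the-clock assumption. The same two functions can be fed your own batch times at 1x, 2x and 3x to see where the gain from extra multiplicity levels off.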