Observations on FGRPB1 1.15 for Windows

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023314931
RAC: 1824054
Topic 203727

As mentioned in the Technical News forum thread on GAMMA-RAY PULSAR BINARY SEARCH #1 ON GPUS, work has appeared for Windows.

There are oddities, which make this experience very, very different from recent GPU crunching at Einstein.

The application itself makes very heavy use of the CPU, with the GPU confined to one specific (important) component of the calculations.

Project settings mean the work arrives at my hosts set for "1 CPU + 1 NVidia GPU", regardless of the value of the project preference "GPU utilization factor of FGRP apps".

So if, as I do, you have constrained CPU usage to well under 100% in an effort to tune productivity for previous application mixes, you may wish to revise settings. In my case the hunger for CPU slots nudged aside in-process GW CPU tasks.

Also, unlike previous GPU tasks, I'm seeing CPU usage at very high levels: 81%, 99%, and 91% on three different hosts.

Also GPU temperature is not reaching anywhere near so high a level as usual, as the single task allowed by the forced multiplicity setting does not keep the GPU remotely fully occupied.

I do not yet have any successful completions, let alone validations, but do have work in process on three of four available machines.

ravenigma
Joined: 20 Aug 10
Posts: 69
Credit: 80548883
RAC: 173

I've not yet received a second FGRPB1 task, but I was able to set up app_config.xml to allow multiple instances. My GPU is currently working on one of these tasks and a GPUGrid task at the same time. It's currently set for two tasks at a time; I may increase that once I see how it behaves.

<app>
   <name>hsgamma_FGRPB1G</name>
   <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
   </gpu_versions>
</app>

EDIT: GPUGrid tasks finished and I received a second FGRPB1 task. I now have two running on the GPU concurrently. GPU utilization has increased and is topping out around 15 - 20%.

Der Mann mit der Ledertasche
Joined: 12 Dec 05
Posts: 151
Credit: 302594178
RAC: 0

Hi Archae86,

if I understand you correctly, we shouldn't set a GPU utilization factor of e.g. 0.33, and should leave one CPU core free!?

BTW, ATI and Intel GPUs are still getting no work so far.

BR

Der Mann mit der Ledertasche 

Greetings from the North

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023314931
RAC: 1824054

I have three machines running this work.  For two of the three the work is running at 1 task per GPU (those two machines happen to have two GPUs each) despite my preference setting for more.

However on the third machine (which perhaps coincidentally has just one GPU) there are two 1.15 tasks running on the single GPU (and GPU-Z is showing an average GPU load of 9%, far higher than the others).

Just to complete the picture, my fourth machine is not getting this work at all.  I've not figured out whether it has just been unlucky in request timing for oversubscribed supply, or whether some difference I've not guessed is blocking it.

After writing that I reviewed the most recent work request logs, and the key difference for the host which is not getting work appears to be this line:

"[version] NVidia device (or driver) doesn't support OpenCL"

That host has a slightly older driver, but not old at all (372.54 on the "non-supporting" host).  The cards also differ: the failing host has a 1080 and a 6GB 1060, while the successful hosts have a 3GB 1060, a 1050, a 970 and a couple of 750Ti cards.

 After posting this note, I intend to install the current Nvidia driver on the machine which is failing to get work.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023314931
RAC: 1824054

archae86 wrote:
After posting this note, I intend to install the current Nvidia driver on the machine which is failing to get work.

While I can't prove cause and effect, I'll observe that immediately after I rebooted the machine which had been unable to get 1.15 work (the one with the "doesn't support OpenCL" comment), it got one 1.15 task, and further automatic requests got more.  So maybe a very recent Nvidia driver is actually required.

That machine is honoring the two GPU tasks at a time setting from my preferences.

On checking my preferences, I now believe that I made a false claim in accusing the 1.15 work of ignoring my GPU utilization factor setting.  Apparently I changed that setting for one venue recently in a period when no GPU work was available, and did not notice it when I thought I checked.

I think Gary Roberts is well known for advocating that you lower your work queue settings when taking on new work types or making other major configuration changes.  I, personally, have lowered my requested amount from a bit over two days to about half a day.  Initial indications on my machines are that the first estimate for completion time is well below the actual run time, so without this precaution I'd have considerably over-fetched.

 

ravenigma
Joined: 20 Aug 10
Posts: 69
Credit: 80548883
RAC: 173

What sort of run-times are people seeing on these tasks? Here's one I just finished.

I'm the one with the much, much longer run time.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

I am seeing three hours on a pair of GTX 750 Ti's.

https://einsteinathome.org/workunit/265708013

https://einsteinathome.org/workunit/265708012

I will stop until help arrives.

 

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023314931
RAC: 1824054

I just got my first two finished, with elapsed times a little over 4 hours (so even slower than yours).  The good news is that they validated.  However, they took a hundred times longer to run than their quorum partner, which ran on an ATI Tahiti board under Linux using Gamma-ray pulsar binary search #1 on GPUs v1.13 (FGRPopencl-ati) x86_64-pc-linux-gnu.

WU 1
WU 2

As these are showing a very non-linear relationship between elapsed time and the percent-complete indication, I don't have reliable estimates for the other three hosts, but it appears that the next two will take over four hours, possibly a lot over.  The fourth has the fastest CPU but got work later than the others; it will certainly be well over two hours.

I did find another leftover configuration setting from how I liked to run things with FGRP4 which hurt some initial running: I had a CPU affinity set using Process Lasso which restricted the CPU "support" application (in this case the main worker, I think) to a subset of available cores.  I've undone that, but don't expect a really dramatic improvement.

These things are seriously slow.

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

My 1070s ran 4 of them; all took the same amount of time, 2 hr 20 minutes.  I believe that is shorter than the 6 hours on the CPU, but still a lot longer than I would have thought.  But these are Beta apps, so...

Jesse Viviano
Joined: 8 Jun 05
Posts: 33
Credit: 133045917
RAC: 0

I just used TechPowerUp's GPU-Z (the latest version is linked from https://www.techpowerup.com/downloads/; I do not link directly to it because such a link would point to a specific version instead of the latest one), and found that my GPU load is absolutely zero. This project is running entirely on my CPU. Other projects are still able to max out my GPU.

archae86
archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023314931
RAC: 1824054

Jesse Viviano wrote:
I just used TechPowerUp's GPU-Z
<snip>
and found out that my GPU load is absolutely zero. This project is running entirely on my CPU. Other projects are still able to max out my GPU.

Since you are using GPU-Z, I suggest you look again at the load entry and enable averaging.  If your system is like mine, you'll find the long-term average to be somewhere in the 2 to 8% range over the full WU.  The load is somewhat spiky, so lots of individual readings show zero.  But your GPU is not idle unless it is unlike any of my seven.  Another way to see this is to look at the reported GPU temperature, which is nowhere near as high as when running BRP4, but is considerably higher than when the GPU is truly unused.
