Gamma-ray pulsar binary search #1 on GPUs

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

7 Dec 2016 22:36:34 UTC

Message 152531 in response to message 152528

Holmis wrote:

To get rid of the error message edit your cc_config.xml and remove the tags.

The first thing I did was look in my cc_config.xml, and didn't see it. This is what I have:

<cc_config>
<options>
    <rec_half_life_days>1.000000</rec_half_life_days>
    <use_all_gpus>1</use_all_gpus>
    <ignore_nvidia_dev>1</ignore_nvidia_dev>
</options>
</cc_config>

You are undoubtedly right that it has nothing to do with why I am not getting the Betas; that must be something else. I will try again later.

EDIT: I added the "ignore one GTX 960" to just run on one card and avoid the problem with the two cards. That seemed to work for a while, but when I stopped getting work units, I thought maybe they had decided to restrict distribution to systems with just one card, and that mine was still considered two cards even if one was ignored.

floyd

Joined: 12 Sep 11

Posts: 133

Credit: 186610495

RAC: 0

8 Dec 2016 8:46:28 UTC

Message 152548 in response to message 152531

Jim, the other message means that the WU the scheduler considered for you had already been processed by the beta application. They don't want a second beta result but a stable result to verify against, so that WU is not assigned to you. Whether there are no fresh WUs or the scheduler doesn't find them and gives up, I can't tell.

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

8 Dec 2016 8:51:34 UTC

Message 152549 in response to message 152548

OK, thanks. I will try again later. They seem to run well on the GTX 960, about 600 seconds, as compared to 1000 seconds that your GTX 750 Ti is getting, so the memory bandwidth does not seem to be as much of a handicap as for the BRP4Gs. That is what I was looking for.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4349

Credit: 253403373

RAC: 37587

8 Dec 2016 12:38:00 UTC

Message 152557

Currently the scheduling array is filled with tasks that are bound to be run by the CPU version to validate results from the GPU version. However, the results of the GPU apps validate pretty well, so I'll drop this restriction later today.

Jeroen

Joined: 25 Nov 05

Posts: 379

Credit: 740030628

RAC: 0

9 Dec 2016 3:16:40 UTC

Message 152573

Thanks for the update and the work done on the new FGRP GPU application. I see my NVIDIA GPU in Linux has been validating some of the tasks successfully, and so far none of the completed tasks have been reported as invalid. The tasks are completing consistently at 221 seconds apiece.

choks

Joined: 24 Feb 05

Posts: 16

Credit: 150632536

RAC: 88443

9 Dec 2016 12:40:59 UTC

Message 152598

About warnings in the logs.

Since BOINC does not report FP64 support, a dummy kernel compile check using FP64 is performed when the OpenCL device is opened. If FP64 is OK, we use the GPU for almost everything (even sorting results). If the device does not support FP64, all kernels requiring "double" support are run on the CPU (about 10x slower).

If you see "OpenCL device has FP64 support" in the logs, it means that the GPU has been recognized as supporting double-precision floating point. Don't worry about performance; double precision is not the major part of the processing.
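A rough sketch of the probe-and-fall-back logic described above, written in Python for readability. This is not the application's actual code: `compile_kernel` is a hypothetical callable standing in for whatever builds an OpenCL program and raises on failure, and the probe kernel and function names are made up for illustration.

```python
# Tiny OpenCL C kernel that only builds if the device accepts "double".
FP64_PROBE_SOURCE = """
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void probe(__global double *out) { out[0] = 1.0; }
"""


def device_supports_fp64(compile_kernel):
    """Return True if the (hypothetical) compile_kernel builds the FP64 probe."""
    try:
        compile_kernel(FP64_PROBE_SOURCE)
    except Exception:
        # Any build failure is treated as "no usable FP64 on this device".
        return False
    return True


def choose_backend(kernel_needs_double, fp64_ok):
    """Kernels needing double go to the CPU only when the GPU lacks FP64."""
    if kernel_needs_double and not fp64_ok:
        return "cpu"
    return "gpu"


# Stand-in compilers so the sketch can be run without any OpenCL installed.
def fake_compiler_ok(source):
    return None


def fake_compiler_fail(source):
    raise RuntimeError("build failed")


print(choose_backend(True, device_supports_fp64(fake_compiler_ok)))
print(choose_backend(True, device_supports_fp64(fake_compiler_fail)))
```

The point of probing with a real build rather than trusting a reported capability is the one given above: BOINC does not report FP64 support, so the application has to find out for itself.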

On OSX, there are lots of warnings when compiling the FFT library, but these are harmless and should be ignored.

As Bernd said, we are still having issues with the Windows driver. I hope we will soon find what's causing the biggest OpenCL kernel to fail on Windows only.

Christophe

Trotador

Joined: 2 May 13

Posts: 58

Credit: 2122645985

RAC: 8

9 Dec 2016 15:04:13 UTC

Message 152602

Crunching these WUs on an HD 7950 in Linux, no issues so far: around 315 seconds when running only one, 460 with two and 610 with three simultaneous WUs.
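To put those numbers side by side, here is a quick back-of-the-envelope calculation in Python. It assumes the quoted times are the elapsed time of each task when that many run at once, which the post implies but does not state outright. The function names are made up for illustration.

```python
# Elapsed seconds per task on the HD 7950, keyed by how many tasks run at once
# (figures quoted in the post above).
reported = {1: 315, 2: 460, 3: 610}


def effective_seconds_per_task(concurrent, elapsed):
    """Average wall-clock cost of one task when `concurrent` tasks share the GPU."""
    return elapsed / concurrent


def tasks_per_hour(concurrent, elapsed):
    """Throughput in completed tasks per hour."""
    return 3600.0 * concurrent / elapsed


for n, secs in sorted(reported.items()):
    print(f"{n} at once: {effective_seconds_per_task(n, secs):.1f} s/task, "
          f"{tasks_per_hour(n, secs):.1f} tasks/hour")
```

Under that assumption, two at a time works out to 230 s per task and three to about 203 s, so each extra task still helps on this card, but with diminishing returns.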

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

9 Dec 2016 17:04:18 UTC

Message 152605 in response to message 152602

I normally run just one at a time, but thought I would try two on my GTX 960 (Ubuntu 16.10, 367.57 drivers). This is a minimally factory overclocked card that runs at 58 C normally. But as soon as it hit two work units, it errored out. And all the remaining work units errored out too, after 11 seconds. This is like the problem that I initially had with two cards, except that I was running just one card. Apparently my machine does not like two work units at once in any manner.

choks

Joined: 24 Feb 05

Posts: 16

Credit: 150632536

RAC: 88443

9 Dec 2016 18:42:14 UTC

Message 152612

@Jim: Looks like you have 2 GB of GPU RAM, which should be enough to run 2 tasks at the same time. Does it occur if you run two BRP4G WUs? (Just quit the Beta to test it.)

What's clear is that the Nvidia driver went completely crazy after the first error, and computations have been done outside normal GPU memory, which leads to the FP exception you get.

Did you have to reboot your machine to run GPU tasks successfully again, or did just restarting BOINC do it?

Thx

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

9 Dec 2016 19:05:30 UTC

Message 152614

I tried to run 3 tasks in parallel on a GTX 960 2GB (driver 375.20) + Linux Mint 18 (4.9.0-040900rc8-generic). Allocated 0.33 GPU resources per task. All three 'Gamma-ray pulsar binary search #1 on GPUs v1.12 (FGRPopencl-nvidia) x86_64-pc-linux-gnu' tasks started at the same time.

Two of them ran fine all the way to the end, but the third "line" in parallel kept erroring out, always after about 18 secs. Here's the error message for all those tasks:

[CRITICAL]: ERROR: MAIN() returned with error '-4'
FPU status flags:
Error in OpenCL context: CL_MEM_OBJECT_ALLOCATION_FAILURE error executing CL_COMMAND_NDRANGE_KERNEL on GeForce GTX 960 (Device 0).

https://einsteinathome.org/host/12468219/tasks/error
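The post doesn't say how the 0.33 GPU allocation was set. One way to do it on the client side is a BOINC app_config.xml with a `gpu_usage` of 0.33, and the small Python sketch below just generates such a file's contents. The app name `EXAMPLE_APP_NAME` is a placeholder, not the project's real short name (look that up in your client_state.xml), and the `cpu_usage` value of 1.0 is likewise only an assumption.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage=1.0):
    """Return app_config.xml text that reserves `gpu_usage` of a GPU per task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # 0.33 of a GPU per task lets the client start three tasks on one card.
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# EXAMPLE_APP_NAME is a placeholder; substitute the real app short name.
print(build_app_config("EXAMPLE_APP_NAME", 0.33))
```

The output is a single line of XML. Note that this only controls how many tasks the client starts per GPU; it does nothing to make them fit in GPU memory, which is what the CL_MEM_OBJECT_ALLOCATION_FAILURE above is complaining about.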
