New GPU errors running Gravitational Wave search O2

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 415
Credit: 10261623455
RAC: 20342397

... most (I'd say 90%) of the WUs use no more than 950MB.

So, some of his WUs should work - unless, as already said, he is running a factor of more than x1.

Hope CHEROKEE will answer.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 415
Credit: 10261623455
RAC: 20342397

@RICHIE:

... of course, but somewhere in this thread, someone pointed at the somewhat "old" driver version.

So, the OP should maybe update the NVIDIA driver to rule out this possible source of error.

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Harri Liljeroos wrote:
Wasn't there a limitation for Nvidia opencl that it can only access about 25% of the GPU memory?

Yes....

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 415
Credit: 10261623455
RAC: 20342397

Harri Liljeroos wrote:
Wasn't there a limitation for Nvidia opencl that it can only access about 25% of the GPU memory?

check/read:

 https://devtalk.nvidia.com/default/topic/992502   (opencl)

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 415
Credit: 10261623455
RAC: 20342397

UPDATE:

I'm surprised by my findings today.

Max VRAM used on the GTX 960 is suddenly around 1600MB.

So I guess the VRAM needed varies widely depending on which WUs are sent.

I am even seeing max VRAM over 2100MB on one of my high-end GPUs.

All running factor x1.
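
As a rough check of the numbers above, here is a small Python sketch of how many tasks would fit side by side on one card. The 2048MB card size is only an assumption for illustration (the thread does not state the card's VRAM), and the check ignores VRAM used by the desktop or other applications.

```python
def tasks_that_fit(card_vram_mb, peak_task_vram_mb):
    """Return how many tasks with the given peak VRAM use fit on one card."""
    return card_vram_mb // peak_task_vram_mb


if __name__ == "__main__":
    CARD_VRAM_MB = 2048  # assumption: a hypothetical 2GB card
    # Peak VRAM values per task as reported in this thread (MB).
    for peak_mb in (950, 1600, 2100):
        print(peak_mb, "MB per task ->", tasks_that_fit(CARD_VRAM_MB, peak_mb), "task(s)")
```

Under that assumption, a 950MB WU leaves room for a second task, a 1600MB WU only runs at factor x1, and a 2100MB WU does not fit at all.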

Dp
Joined: 27 Aug 05
Posts: 14
Credit: 195572303
RAC: 227697

I've had increasing problems since June 2019: tasks were invalid or just errored out, e.g. /einsteinathome.org/task/921190883.

I have disconnected from Einstein and reconnected after a couple of days, but the frequent errors still return.

Other projects seem to run fine, but I'm at a loss for E@H.

Any suggestions?

RWP

Keith Myers
Joined: 11 Feb 11
Posts: 4972
Credit: 18774266366
RAC: 7209481

Purge your drivers and then reinstall them.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Dp wrote:
I've had increasing problems since june2019, tasks were invalid or just errored out /einsteinathome.org/task/921190883.

Stderr output of that task says "2020-02-09 03:53:59.0325 (3060) [normal]: OpenCL Device used for Search/Recalc and/or semi coherent step: 'Tahiti (Platform: AMD Accelerated Parallel Processing, global memory: 3072 MiB)' "

Boinc says your host has "[2] AMD AMD Radeon(TM) RX 560 Series (4096MB)".

The problem is that your host has two GPUs. The RX 560 is compatible with the current GW GPU tasks, but the older card with a Tahiti chip is NOT. BOINC doesn't know that and is feeding these GW tasks to both cards.

You would need to exclude that Tahiti card from running these GW tasks for now (or remove the card physically from your system).

Here's a message that has useful information about how to exclude a GPU:

https://einsteinathome.org/content/discussion-thread-continuous-gw-search-known-o2md1-now-o2mdf-gpus-only?page=16#comment-173836

Both of your GPUs are AMD, so the <device_num> parameter is what tells them apart: it has to name the Tahiti card inside the <exclude_gpu> section. If I remember correctly, you can see which GPU is '0' and which one is '1' by looking at the BOINC event log messages at BOINC startup. Or find out by trial and error.
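
For illustration, here is a minimal Python sketch that writes out an <exclude_gpu> section in the layout BOINC's cc_config.xml uses. The device number 1 is only a placeholder (check the event log for the Tahiti card's real number), the function name is made up for this example, and "ATI" is the type string BOINC uses for AMD cards. No <app> element is added by default, which excludes the card from all Einstein@Home GPU applications; pass the GW application's short name as `app` to limit the exclusion to it.

```python
import xml.etree.ElementTree as ET


def build_exclude_gpu_config(project_url, device_num, gpu_type="ATI", app=None):
    """Build cc_config.xml text that excludes one GPU for one project."""
    cc_config = ET.Element("cc_config")
    options = ET.SubElement(cc_config, "options")
    exclude = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(exclude, "url").text = project_url
    ET.SubElement(exclude, "device_num").text = str(device_num)
    ET.SubElement(exclude, "type").text = gpu_type
    if app is not None:
        # Optional: restrict the exclusion to a single application.
        ET.SubElement(exclude, "app").text = app
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(cc_config)
    return ET.tostring(cc_config, encoding="unicode")


if __name__ == "__main__":
    # Placeholder device number; replace with the Tahiti card's number.
    print(build_exclude_gpu_config("https://einsteinathome.org/", device_num=1))
```

The printed text would go into cc_config.xml in the BOINC data directory, merged with any <options> you already have there. The client picks it up after a restart or after re-reading its config files.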
