2nd GPU not running tasks.

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299
Topic 226764

I have set <use_all_gpus> to 1 and had both GPUs running until recently; now the second GPU only kicks in once in a while. I don't see anything in the log files that would prevent it from running. I turned on coproc_debug in the Event Log options and get the messages below. BOINC does assign a work unit to the second GPU instance, but that's it. Is this a BOINC issue or an Einstein@Home one? Instance 0 has additional messages, but instance 1 just repeats the message about assigning a WU to the instance. Here are the messages:

1/12/2022 10:49:15 AM | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for LATeah3011L06_692.0_0_0.0_19627524_0
1/12/2022 10:49:15 AM | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for LATeah3011L06_692.0_0_0.0_19627524_0
1/12/2022 10:49:15 AM | Einstein@Home | [coproc] Assigning NVIDIA instance 1 to LATeah3011L06_692.0_0_0.0_19648611_1
 

I have tried resetting the project and rebooting the computer.
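
For reference, a minimal sketch of a cc_config.xml containing the two settings described above (use_all_gpus and the coproc_debug log flag). This is illustrative only and is not the actual file from this host; any other options the host may have are omitted.

```xml
<cc_config>
  <log_flags>
    <!-- writes the [coproc] GPU assignment messages shown above to the Event Log -->
    <coproc_debug>1</coproc_debug>
  </log_flags>
  <options>
    <!-- use every detected GPU, not only the most capable one -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

Restarting the client is the safest way to be sure a change to this file is picked up.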

 Regards,

BoincSpy

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299

Hmm, there is a hint in the cpu_sched_debug output: the message indicates that the CPU is committed. The message is:

1/12/2022 11:08:11 AM | Einstein@Home | [cpu_sched_debug] skipping GPU job LATeah3011L06_692.0_0_0.0_15274656_0; CPU committed

So now I need to figure out what "CPU committed" means. I only let BOINC use 75% of the CPUs.
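
The flag that produced the message above can also be set by hand in the <log_flags> section of cc_config.xml. A minimal fragment, assuming the rest of the file is left as it is:

```xml
<log_flags>
  <!-- logs scheduler decisions such as "skipping GPU job ...; CPU committed" -->
  <cpu_sched_debug>1</cpu_sched_debug>
</log_flags>
```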

 


 

GWGeorge007
Joined: 8 Jan 18
Posts: 3065
Credit: 4970777686
RAC: 1427761

First I would suggest that you make your computer(s) visible.  Without this, we do not know what system you are running, what GPUs you are using, etc.  You would get more help with the computer(s) visible.

George

Proud member of the Old Farts Association

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47009532642
RAC: 64913009

what CPU project are you running?

_________________________________________________________________________

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299

Hi GWGEORGE007,

I have made the computers visible, computer name wks-ubuntu-005.

Regards

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299

Digging a bit deeper, there is a discussion about this on the BOINC forums:

GPU tasks skipped after scheduler overcommits CPU cores (berkeley.edu)

But it appears there is no real solution to the problem.

GWGeorge007
Joined: 8 Jan 18
Posts: 3065
Credit: 4970777686
RAC: 1427761

BoincSpy wrote:

Hi GWGEORGE007,

I have made the computers visible, computer name wks-ubuntu-005.

Regards

We cannot see your named computer "wks-ubuntu-005". Could you identify the computer in question by its ID number?

George

Proud member of the Old Farts Association

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299

COMPUTER 12903076

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3958
Credit: 47009532642
RAC: 64913009

Does the second GPU start working if you suspend CPU work? What CPU project are you crunching? 

_________________________________________________________________________

BoincSpy
Joined: 8 Nov 04
Posts: 32
Credit: 1103664671
RAC: 361299

Hmm, I was able to get both GPUs going by deleting all of the non-GPU work units. I will see how long they both continue to work.

GWGeorge007
Joined: 8 Jan 18
Posts: 3065
Credit: 4970777686
RAC: 1427761

If possible, could you copy/paste your  cc_config.xml  and your  app_config.xml  files here so we can see what you're up against?

George

Proud member of the Old Farts Association
