GT 650M ERRORS
J Borthwick
Joined: 18 Dec 11
Posts: 5
Credit: 10012797
RAC: 0
2 Nov 2013 18:05:24 UTC
Topic 197247
I have been running BRP 5 tasks since JAN 13 on a rMBP with no trouble, but some 31 WUs have now errored out. Have the latest driver from NVIDIA.
Boinc 7.0.31
What's Darwin 12.5.0? You haven't upgraded to Mavericks, have you? That - it has become apparent - has problematic CUDA drivers.
OK - false alarm. He says (in PM) he's staying on Mountain Lion 10.8.5
Any Mac specialists in the house?
By no means a Mac specialist, but I had a look at the relevant part of the tasks list and you can see a sudden problem developing in the middle of a running task. It would appear that some software update might have happened which killed things for the GPU. CPU tasks were unaffected. The last good BRP5 task was returned at 31 Oct 2013 7:30:05 UTC and the first failed one at 1 Nov 2013 23:31:52 UTC. By examining the stderr output for that failed task we see (close to the end)
....
[18:58:07][350][INFO ] Checkpoint committed!
[18:59:08][350][INFO ] Checkpoint committed!
dyld: DYLD_ environment variables being ignored because main executable (/Library/Application Support/BOINC Data/slots/0/../../switcher/switcher) is setuid or setgid
[23:00:01][8567][INFO ] Application startup - thank you for supporting Einstein@Home!
[23:00:01][8567][INFO ] Starting data processing...
[23:00:01][8567][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[23:00:01][8567][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[23:00:01][8567][ERROR] Demodulation failed (error: 1002)!
23:00:01 (8567): called boinc_finish
which indicates that the problem occurred sometime after 18:59:08 when the last checkpoint was saved.
Perhaps the OP may recall what happened (software update-wise) between that time and 23:00:01 when the app attempted to restart and immediately errored out.
Cheers,
Gary.
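For anyone wanting to repeat this kind of check on their own stderr output, here is a minimal sketch in Python. It is not an Einstein@Home or BOINC tool, just an illustration: it takes the log lines quoted in this thread, finds the last "Checkpoint committed!" line before the first ERROR line, and lists the values of the second bracketed field. Treating that field as a process ID is an assumption on my part (the "called boinc_finish" line suggests it), and the names `parse` and `failure_window` are made up for this example.

```python
import re

# Excerpt of the stderr output quoted earlier in this thread (dyld line left out).
STDERR = """\
[18:58:07][350][INFO ] Checkpoint committed!
[18:59:08][350][INFO ] Checkpoint committed!
[23:00:01][8567][INFO ] Application startup - thank you for supporting Einstein@Home!
[23:00:01][8567][INFO ] Starting data processing...
[23:00:01][8567][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[23:00:01][8567][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[23:00:01][8567][ERROR] Demodulation failed (error: 1002)!
"""

# [HH:MM:SS][number][LEVEL] message
# The number is ASSUMED to be a process ID; the thread does not confirm this.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\]\s*(.*)$")


def parse(text):
    """Return (time, pid, level, message) tuples for lines matching the pattern."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            time, pid, level, msg = m.groups()
            entries.append((time, int(pid), level, msg))
    return entries


def failure_window(entries):
    """Return (time of last checkpoint before the first error, time of first error)."""
    last_checkpoint = None
    for time, pid, level, msg in entries:
        if level == "INFO" and msg.startswith("Checkpoint committed"):
            last_checkpoint = time
        elif level == "ERROR":
            return last_checkpoint, time
    return last_checkpoint, None


entries = parse(STDERR)
window = failure_window(entries)
pids = sorted({pid for _, pid, _, _ in entries})
print(window, pids)
```

On the excerpt above, the window comes out as 18:59:08 to 23:00:01, and the second field changes from 350 to 8567, which fits the app having been restarted in between.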
Hi Gary
As far as I can recall no software updates were done over the time in question.
Have returned to CPU work meantime.
Will wait for CUDA updates.
Cheers
JB
What's the CUDA driver version being used…?
I'm on 10.8.5 with BOINC 7.0.31 on CUDA 5.5.28… But mine's an old GT 330M…
Off-topic: Ain't that a little hot to be running on an rMBP…? Form factor's a lot thinner than mine...
Using CUDA 5.5.28 and Boinc 7.0.31
The rMBP has been running approximately 12 hrs a day since February 2013 with no obvious heat problems. Cooling fan is of course running constantly. I have 8 projects on it but usually only run 2 concurrently.
At the moment Einstein CPU and T4T.
It just keeps on running.
JB
Edit: Forgot to include POGS.
JB
Problem solved. User error.
Checked Boinc preferences and found multiprocessor usage set at 100%. Changed to 75% and now GPU work is back to normal.
Thanks.
JB