No new workunits - communication deferred also after reset of project

Philipp Marc Neuhaus
Joined: 24 May 05
Posts: 2
Credit: 636836991
RAC: 241096
Topic 225288

Hi, 

After intensive calculations for years, all of a sudden there are no work units coming. Communication is deferred. I have reset the project in the client - still no work units.

What to do?

Thanks upfront for guidance….

Phil

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 264
Credit: 7197828261
RAC: 9996790

... which applications ??

... what does the "event log" under "tools" say ??

mikey
Joined: 22 Jan 05
Posts: 11971
Credit: 1834014834
RAC: 225583

Philipp Marc Neuhaus wrote:

Hi, 

After intensive calculations for years, all of a sudden there are no work units coming. Communication is deferred. I have reset the project in the client - still no work units.

What to do?

Thanks upfront for guidance….

Phil 

Umm, I just looked at your PC and you have 20 tasks of this type in progress right now:

Gamma-ray pulsar binary search #1 on GPUs v1.20 () x86_64-pc-linux-gnu

So you are not out of work units.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110036684023
RAC: 22384119

Philipp Marc Neuhaus wrote:

... all of a sudden there are no work units coming.

Your tasks list shows 1014 compute errors. You have probably used up your daily allowance. The scheduler tends to notice these things and you would have been denied new work for a period because of that. There would have been messages in the event log to tell you exactly what happened.

Resetting the project was a complete waste of time.  You need to look at what caused the errors and fix the real problem.

You do that by clicking on the Task ID link for any of your tasks showing as 'Error while computing' on the website (you have a lot to choose from) and reading what it says - even if you think you don't understand it.  There will always be some clues.

Here is an excerpt from one I looked at:-

boinc_get_opencl_ids returned [(nil) , (nil)]   - Failed to get OpenCL platform/device info from BOINC (error: -1)!

There's more that goes with the above.  I've just chosen enough to show the problem.

This would seem to indicate that your machine no longer has the OpenCL libraries properly installed and available to BOINC.  Perhaps you may have done something recently to do with graphics drivers?
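As a rough illustration of the check described above, here is a minimal Python sketch that scans the stderr text copied from a failed task's page for the OpenCL failure line quoted in the excerpt. The function name `find_opencl_failures` is made up for this example, and the pattern is taken only from that one excerpt, so other failure messages will not be matched.

```python
import re

# Pattern built from the excerpt quoted above; other wordings are not covered.
OPENCL_FAILURE = re.compile(
    r"Failed to get OpenCL platform/device info from BOINC \(error: (-?\d+)\)"
)


def find_opencl_failures(stderr_text):
    """Return a list of (line_number, error_code) for each matching line.

    stderr_text is the text pasted from a task's stderr output
    (hypothetical helper, not part of BOINC).
    """
    hits = []
    for number, line in enumerate(stderr_text.splitlines(), start=1):
        match = OPENCL_FAILURE.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits
```

If this returns a non-empty list for several errored tasks, that points to the same cause for all of them: BOINC could not get an OpenCL device from the system.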

 

Cheers,
Gary.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3716
Credit: 34687366409
RAC: 26514163

can say with 99% certainty that you simply need to reboot your system, then wait 12-24hrs to get out of Einstein jail for returning so many error'd tasks.

 

The OP is running Ubuntu 20.04, which has been known to forcefully push nvidia driver updates that are marked as critical security updates. it swaps out the libraries on the fly, even while computation is happening. this breaks the driver, and all tasks then fail with an error saying they can't find a GPU. I've had this same scenario happen to me several times since changing to 20.04. if you run "nvidia-smi" in the terminal before rebooting, you will be greeted with a nice "driver/library mismatch" error. reboot, and the problem is resolved. but you still need to sit in timeout from Einstein for all the errors. there is nothing you can do to get Einstein to send you more work until your detention is over.
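For anyone who wants to script the "nvidia-smi" check described above, here is a minimal Python sketch. It assumes the mismatch shows up as text containing both "driver/library" and "mismatch" somewhere in the command's output. The exact wording can differ between driver versions, so treat the match as a hint, not a guarantee. The function names are made up for this example.

```python
import shutil
import subprocess


def looks_like_driver_mismatch(output):
    """True if the text resembles the driver/library mismatch error.

    Assumption: the message contains both 'driver/library' and 'mismatch'.
    """
    text = output.lower()
    return "driver/library" in text and "mismatch" in text


def check_nvidia_smi():
    """Run nvidia-smi once and return a short status string."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found on PATH"
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    combined = result.stdout + result.stderr
    if looks_like_driver_mismatch(combined):
        return "driver/library mismatch: reboot before resuming GPU work"
    if result.returncode != 0:
        return "nvidia-smi failed for another reason"
    return "driver responds normally"
```

Calling `check_nvidia_smi()` after an unattended driver update gives a quick yes/no on whether a reboot is needed. It does nothing about the wait for new work from the project.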
