No GPU tasks any more from Einstein@home?

Philipp Marc Neuhaus
Joined: 24 May 05
Posts: 2
Credit: 636737001
RAC: 244767
Topic 219991

Hi,

I used to crunch around 400´000 units for Einstein@home per day. After a reinstall of Ubuntu 18.04.3 LTS, running an NVIDIA GeForce GTX 980 Ti with NVIDIA's recommended driver 435.21 and CUDA 10.1, I do not get any GPU units from Einstein@home, although I do get them from other projects such as SETI@home and Moo wrapper. Several days have passed since the reinstall, so it is not clear to me why I do not get work units.

Can you enlighten me? Is there anything I have to do, or are there simply no work units available?

Thx upfront ..

Phil 

archae86
Joined: 6 Dec 05
Posts: 3146
Credit: 7059894931
RAC: 1139958

The first step in this frequently asked question is to review the information posted from the most recent project update by that machine.  You can find this by going to your account page, then to the list of your computers, then clicking on the link giving the date of most recent contact.

In this case, your host 12544534 had contact at 17:39 UTC on November 16.  The two crucial work request lines read:

2019-11-16 17:39:45.3485 [PID=1633 ] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2019-11-16 17:39:45.3486 [PID=1633 ] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00

As your machine requested zero work, it was sent zero work.

You mention that you run other projects.  I suspect BOINC on your machine thinks that you need to get work from other projects to get your project-level work into balance with your expressed intention.
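The check described above can be sketched in a few lines of Python. This is only an illustration, assuming the `[send] <resource>: req ... sec` line format shown in the log excerpt; the function name `work_requests` is hypothetical and not part of BOINC.

```python
import re

# Matches scheduler log lines of the form shown in this thread, e.g.
#   ... [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
REQ_LINE = re.compile(
    r"\[send\]\s+(?P<resource>\w+): req (?P<secs>[\d.]+) sec, "
    r"(?P<inst>[\d.]+) instances"
)

def work_requests(log_text):
    """Return {resource: requested_seconds} for each work request line."""
    requests = {}
    for line in log_text.splitlines():
        m = REQ_LINE.search(line)
        if m:
            requests[m.group("resource")] = float(m.group("secs"))
    return requests

# The two lines quoted from host 12544534.
sample = """\
2019-11-16 17:39:45.3485 [PID=1633 ] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2019-11-16 17:39:45.3486 [PID=1633 ] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
"""

for resource, secs in work_requests(sample).items():
    if secs == 0:
        print(f"{resource}: client requested no work")
    else:
        print(f"{resource}: client requested {secs:.0f} s of work")
```

If every resource reports a request of zero seconds, the server has nothing to send, and the cause is on the client side rather than a shortage of tasks.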

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110026272824
RAC: 22501435

Philipp Marc Neuhaus wrote:
... After a Re-install of Ubuntu 18.04.3 LTS running with NVIDIA GeForce GTX 980 Ti recommended driver 435.21 from NVIDIA and CUDA 10.1 I do not get any GPU units from Einstein@home ...

Have you checked on the website to see what GPU tasks the server thinks you have?  At the moment, the online database says you have a total of 1261 GPU tasks, of which 211 are "in progress", 1 is "pending", and 18 are already "valid".  The real concern however is that there are 1030 tasks listed as "error".

I looked at just the error tasks and these are quite a puzzle.  It shows that you have received those over a long period from 2nd Nov to 10th Nov.  They show a status of "timed out - no response" and a date of 15th Nov which means they were all given an error status before the official deadline.  Did you know you had such a large number of tasks over that period?  Did you do your reinstall with such a large number already on board?  It looks very much like those tasks had become 'lost' in some way - probably due to the reinstall of the OS.

In this very recent comment from Bernd, he announced that he had made a change in how the scheduler handles lost tasks.  It may be just a coincidence but the timing seems to be pretty much correct.  If you did have all those tasks as 'lost' perhaps the scheduler has now turned them into errors.  That shouldn't have happened to FGRPB1G tasks because the change was to do with test tasks for the GW search using GPUs (O2MDF) where there was no CPU app to fall back to when a GPU test task became lost.

It would be very helpful if you could give a lot more information about your reinstall and about what tasks you had at various stages.  Since you support multiple projects, why do you try to keep such a large work cache size such that your BOINC client would request such a large amount from just one project?  Do you set 'No new tasks' and work off the excess before doing major maintenance like a full OS re-install?

Cheers,
Gary.

Jake1402
Joined: 18 Jul 19
Posts: 4
Credit: 177462631
RAC: 176495

I am having the same problem with three of my machines running Linux Mint 19.2 and NVIDIA driver 430.50. The card is an NVIDIA GeForce GTX 650.

I get the following message when the computer contacts the server:

2019-11-17 08:47:41.3669 [PID=29556] [version] NVidia device (or driver) doesn't support OpenCL

I was under the impression NVIDIA included OpenCL support in all their drivers.

Thoughts?

Jake1402
Joined: 18 Jul 19
Posts: 4
Credit: 177462631
RAC: 176495

Here is the whole file if you need it:

2019-11-17 08:47:41.2111 [PID=29556]   Request: [USER#xxxxx] [HOST#12795055] [IP xxx.xxx.xxx.50] client 7.9.3
2019-11-17 08:47:41.2676 [PID=29556] [debug]   have_master:1 have_working: 1 have_db: 1
2019-11-17 08:47:41.2676 [PID=29556] [debug]   using working prefs
2019-11-17 08:47:41.2676 [PID=29556] [debug]   have db 1; dbmod 1573849640.000000; global mod 1573849640.000000
2019-11-17 08:47:41.2676 [PID=29556]    [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2019-11-17 08:47:41.2676 [PID=29556]    [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2019-11-17 08:47:41.2676 [PID=29556]    [send] Not using matchmaker scheduling; Not using EDF sim
2019-11-17 08:47:41.2676 [PID=29556]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2019-11-17 08:47:41.2676 [PID=29556]    [send] CUDA: req 43231.47 sec, 0.33 instances; est delay 0.00
2019-11-17 08:47:41.2676 [PID=29556]    [send] work_req_seconds: 0.00 secs
2019-11-17 08:47:41.2677 [PID=29556]    [send] available disk 96.56 GB, work_buf_min 86400
2019-11-17 08:47:41.2677 [PID=29556]    [send] active_frac 0.999973 on_frac 0.947561 DCF 1.000000
2019-11-17 08:47:41.2687 [PID=29556]    [mixed] sending locality work first (0.3418)
2019-11-17 08:47:41.2692 [PID=29556]    [locality] send_work_locality(): locality_appid_filter 'appid=54 AND'
2019-11-17 08:47:41.2991 [PID=29556]    [send] send_old_work() no feasible result older than 336.0 hours
2019-11-17 08:47:41.2991 [PID=29556]    [locality] send_work_locality(): sending work for new file(s)
2019-11-17 08:47:41.2991 [PID=29556]    [locality] send_new_file_work(): try to send old work
2019-11-17 08:47:41.3330 [PID=29556]    [version] Checking plan class 'GW-opencl-ati'
2019-11-17 08:47:41.3363 [PID=29556]    [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2019-11-17 08:47:41.3363 [PID=29556]    [version] beta test app versions not allowed in project prefs.
2019-11-17 08:47:41.3363 [PID=29556]    [version] Checking plan class 'GW-opencl-nvidia'
2019-11-17 08:47:41.3363 [PID=29556]    [version] beta test app versions not allowed in project prefs.
2019-11-17 08:47:41.3364 [PID=29556]    [version] no app version available: APP#54 (einstein_O2MDF) PLATFORM#7 (x86_64-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3364 [PID=29556]    [version] no app version available: APP#54 (einstein_O2MDF) PLATFORM#1 (i686-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3366 [PID=29556]    [locality] send_work_locality(): returning with work still needed
2019-11-17 08:47:41.3366 [PID=29556]    [mixed] sending non-locality work second
2019-11-17 08:47:41.3669 [PID=29556]    [version] Checking plan class 'FGRPopencl-ati'
2019-11-17 08:47:41.3669 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3669 [PID=29556]    [version] No ATI devices found
2019-11-17 08:47:41.3669 [PID=29556]    [version] Checking plan class 'FGRPopencl-nvidia'
2019-11-17 08:47:41.3669 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3669 [PID=29556]    [version] NVidia device (or driver) doesn't support OpenCL
2019-11-17 08:47:41.3670 [PID=29556]    [version] Checking plan class 'FGRPopencl1K-ati'
2019-11-17 08:47:41.3670 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3670 [PID=29556]    [version] No ATI devices found
2019-11-17 08:47:41.3670 [PID=29556]    [version] Checking plan class 'FGRPopencl1K-nvidia'
2019-11-17 08:47:41.3670 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3670 [PID=29556]    [version] NVidia device (or driver) doesn't support OpenCL
2019-11-17 08:47:41.3670 [PID=29556]    [version] Checking plan class 'FGRPopenclTV-nvidia'
2019-11-17 08:47:41.3671 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3671 [PID=29556]    [version] NVidia device (or driver) doesn't support OpenCL
2019-11-17 08:47:41.3671 [PID=29556]    [version] no app version available: APP#40 (hsgamma_FGRPB1G) PLATFORM#7 (x86_64-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3671 [PID=29556]    [version] no app version available: APP#40 (hsgamma_FGRPB1G) PLATFORM#1 (i686-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3730 [PID=29556]    [send] [HOST#12795055] is looking for work from a non-preferred application
2019-11-17 08:47:41.3731 [PID=29556]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#7 (x86_64-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3731 [PID=29556]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#1 (i686-pc-linux-gnu) min_version 0
2019-11-17 08:47:41.3809 [PID=29556] [debug]   [HOST#12795055] MSG(high) No work sent
2019-11-17 08:47:41.3809 [PID=29556] [debug]   [HOST#12795055] MSG(high) see scheduler log messages on https://einsteinathome.org/host/12795055/log
2019-11-17 08:47:41.3809 [PID=29556] [debug]   [HOST#12795055] MSG(high) Binary Radio Pulsar Search (Arecibo) is not available for your type of computer.
2019-11-17 08:47:41.3810 [PID=29556]    Sending reply to [HOST#12795055]: 0 results, delay req 60.00
2019-11-17 08:47:41.3811 [PID=29556]    Scheduler ran 0.175 seconds
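
A log like this is easier to read once the reason given for each plan class is pulled out. The following is a rough sketch, assuming the `[version] Checking plan class '...'` line format seen above; the helper name `plan_class_rejections` and the list of skipped informational messages are illustrative choices, not anything defined by the scheduler.

```python
import re

CHECK = re.compile(r"\[version\] Checking plan class '([^']+)'")
VERSION_MSG = re.compile(r"\[version\] (.+)$")

# Informational [version] messages that do not explain a rejection.
SKIP_PREFIXES = ("parsed project prefs", "reading plan classes")

def plan_class_rejections(log_text):
    """Map each checked plan class to the [version] messages that follow it."""
    result = {}
    current = None
    for line in log_text.splitlines():
        line = line.rstrip()
        m = CHECK.search(line)
        if m:
            current = m.group(1)
            result[current] = []
            continue
        m = VERSION_MSG.search(line)
        if not m or current is None:
            continue
        msg = m.group(1)
        if msg.startswith("no app version available"):
            # Summary line for the whole application, not for one plan class.
            current = None
        elif not msg.startswith(SKIP_PREFIXES):
            result[current].append(msg)
    return result

# A few lines copied from the log above.
sample = """\
2019-11-17 08:47:41.3669 [PID=29556]    [version] Checking plan class 'FGRPopencl-ati'
2019-11-17 08:47:41.3669 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3669 [PID=29556]    [version] No ATI devices found
2019-11-17 08:47:41.3669 [PID=29556]    [version] Checking plan class 'FGRPopencl-nvidia'
2019-11-17 08:47:41.3669 [PID=29556]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2019-11-17 08:47:41.3669 [PID=29556]    [version] NVidia device (or driver) doesn't support OpenCL
"""

for plan_class, reasons in plan_class_rejections(sample).items():
    print(plan_class, "->", "; ".join(reasons) or "no reason logged")
```

Only the messages attached to the `-nvidia` plan classes matter for a host with an NVIDIA card; the "No ATI devices found" lines are expected there.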

mikey
Joined: 22 Jan 05
Posts: 11969
Credit: 1833888015
RAC: 224498

Jake1402 wrote:

Here is the whole file if you need it:

2019-11-17 08:47:41.3669 [PID=29556]    [version] No ATI devices found 

Your PCs are hidden, but if you are using Win10, Microsoft is pushing updates to PCs that often replace the GPU drivers with Microsoft ones. Try reinstalling the NVIDIA drivers.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110026272824
RAC: 22501435

mikey wrote:
... if you are using Win10 ...

He stated that he's using Linux Mint so it's not a Windows problem :-).  However it is a problem with missing OpenCL libs, since the server log states that OpenCL support is missing.

I know nothing about Ubuntu or its derivatives so he'll need to find out how to install the missing OpenCL capability.  Perhaps he can get advice from the Linux Mint forums.  It's probably something like a separate package that provides CUDA/OpenCL.

Cheers,
Gary.

mikey
Joined: 22 Jan 05
Posts: 11969
Credit: 1833888015
RAC: 224498

Gary Roberts wrote:
mikey wrote:
... if you are using Win10 ...

He stated that he's using Linux Mint so it's not a Windows problem :-).  However it is a problem with missing OpenCL libs, since the server log states that OpenCL support is missing.

I know nothing about Ubuntu or its derivatives so he'll need to find out how to install the missing OpenCL capability.  Perhaps he can get advice from the Linux Mint forums.  It's probably something like a separate package that provides CUDA/OpenCL.

I missed that... note to self: read more!

I use Linux Mint with third-party drivers enabled, and they just work. The newest drivers don't show up as options, though, so I use the recommended ones from NVIDIA. The setting is under System Settings, Drivers. I have several PCs running Linux Mint, all with different NVIDIA cards, using driver versions 384.13, 430.26, 390.77 and 390.11. On one PC I had to uncheck the newer driver because it did not work, so I selected an older one and it works just fine.

kksplace
Joined: 24 Feb 18
Posts: 7
Credit: 810947082
RAC: 682067

I had a similar problem when setting up a new build last year. It appears the OpenCL components don't get installed with some drivers along the way.

See the following: setiahome.berkeley.edu/forum_thread.php?id=83234

It (and a couple of other places) recommends executing:

sudo apt-get install ocl-icd-libopencl1

It appears ocl-icd-libopencl1 is also available in the Software Manager on Mint.

I only had to do this once and everything has been OK since.
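
To confirm whether an OpenCL loader is present before and after installing the package, something like the following can help. It is a minimal sketch, assuming the loader is exposed as `libOpenCL.so.1` (the fallback name used here is an assumption) and relying only on the standard OpenCL call `clGetPlatformIDs`; the function name `opencl_platform_count` is made up for this example.

```python
import ctypes
import ctypes.util

def opencl_platform_count():
    """Return the number of OpenCL platforms, or None if no loader library loads."""
    # Assumption: fall back to the usual Linux soname if find_library finds nothing.
    name = ctypes.util.find_library("OpenCL") or "libOpenCL.so.1"
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None  # loader library missing: the situation discussed in this thread

    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    lib.clGetPlatformIDs.restype = ctypes.c_int
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint),
    ]
    count = ctypes.c_uint(0)
    status = lib.clGetPlatformIDs(0, None, ctypes.byref(count))
    if status != 0:
        # Non-zero status (for example, no vendor driver registered): no usable platform.
        return 0
    return count.value

n = opencl_platform_count()
if n is None:
    print("No OpenCL loader library found")
elif n == 0:
    print("OpenCL loader found, but no platform (vendor driver) is registered")
else:
    print(f"OpenCL loader found with {n} platform(s)")
```

A result of zero platforms would suggest the loader is installed but the NVIDIA OpenCL driver component is not, which is a separate thing to check with the distribution's driver packages.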
