There appears to be an incompatibility between the E@H gamma-ray pulsar search GPU application and Linux Fedora 26, at least on my systems. The work units tried to date under F26 simply fail immediately after starting. Before the upgrade to F26, these work units ran fine on all my systems. Listed below is a summary of the system hardware, software, drivers, environment and settings, along with some troubleshooting observations:
AMD x86_64 FX8350 (8 core) 4.0 GHz 16 GB ram, nVidia GTX-1060 6 GB.
Boinc-client 7.6.22 distro release (fedora), nVidia drivers 375.66 distro release (rpmfusion).
Boinc runs as a daemon in /usr/lib/boinc with owner/group as boinc.
No overclocking of any system stock speeds either CPU or GPU.
Research Efforts/Observations:
1. Reviewed the distro release notes regarding the F25 to F26 upgrade but did not notice anything
documented that might relate to this issue.
2. Both F25 and F26 are at kernel version 4.12.9.
3. Downgraded one system back to F25 and it now runs E@H GPU WU's fine.
4. Tried distro test release nVidia drivers, libs, cuda version 384.59 (rpmfusion) with F26 but the WU
still failed.
5. None of the 3 GPU WU capable systems will run E@H GPU work with F26 (two are as above, the other is a
Phenom II with the same graphics card).
6. GPUgrid work units run fine on all systems using Fedora 26 with the nVidia 375.66 or 384.59 drivers.
7. Reading stderr doesn't tell me much, as I haven't figured out what exit code 6 means yet
(it is the same on all machines I have looked at). It's as if the WU fails before it ever starts.
I'm also not a developer, so back traces are above my pay grade :).
I doubt that the CUDA libs were updated between the two versions, but I wanted to try the video drivers and libs anyway. It would be interesting to know if anyone else is using F26 for GPU WU's and having similar problems or success. I have all the x86 and x86_64 libs required to run Windows apps quite well under Wine and CrossOver, so I shouldn't be missing much in the way of graphics libs. I will continue to investigate to see if I can find a missing lib or something similar.
Crunching since Feb 2003 (United Devices, Find-a-Drug back then)
You are using the Mesa OpenCL drivers on Fedora 26, which at the moment are not compatible with our OpenCL applications. Other users have the same problem, and a workaround is posted here: https://einsteinathome.org/content/fgrpopencl1k-ati-polaris10-amdgpu-llvm391-mesa-17-crash The workaround is essentially to replace the Mesa-provided libraries with the official libraries provided by Nvidia.
There is also a discussion here: http://boinc.berkeley.edu/dev/forum_thread.php?id=11806
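For anyone who wants to check which OpenCL implementations are registered on their system before removing anything, here is a minimal sketch. It assumes the usual Linux convention that the OpenCL ICD loader reads `*.icd` files from `/etc/OpenCL/vendors`, each naming the vendor library to load. The function names and the "mesa" substring match are my own illustration and only a heuristic, not something taken from the E@H applications.

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Return (icd_file_name, library) pairs for every *.icd file found.

    The default directory is the conventional ICD location on Linux;
    pass another path if your distribution differs (assumption).
    """
    entries = []
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return entries
    for icd in sorted(directory.glob("*.icd")):
        # Each ICD file normally holds one line: the vendor library to load.
        text = icd.read_text(errors="replace").strip()
        library = text.splitlines()[0].strip() if text else ""
        entries.append((icd.name, library))
    return entries


def mesa_icds(entries):
    """Pick out entries whose file name or library mentions Mesa (heuristic)."""
    return [e for e in entries if "mesa" in e[0].lower() or "mesa" in e[1].lower()]


if __name__ == "__main__":
    found = list_opencl_icds()
    for name, library in found:
        print(f"{name}: {library}")
    if mesa_icds(found):
        print("A Mesa OpenCL ICD is registered.")
```

If a Mesa entry shows up next to the Nvidia one, that would be consistent with the problem described above, although the script only reports what is registered and does not prove which platform the application actually picked.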
Thank you Christian Beer for the very helpful information. I removed mesa-OpenCL and installed the latest driver from nVidia, and all appears to be running fine. I will check the Fedora release notes again, and if the Mesa OpenCL issue isn't mentioned, I'll inform the community. I think the distro version from rpmfusion will probably also work now with Mesa OpenCL removed (it's listed as a dependency, but it can be excluded from installation). I prefer the distro version because, with akmod, I don't have to rebuild the nVidia modules every time a new kernel is released. Thanks again.
Roger
Crunching since Feb 2003 (United Devices, Find-a-Drug back then)