What about adding the --headless argument?
Which version of the AMD driver are you installing? AMD drivers seem to be particularly picky about which driver you install on each system, especially with respect to the kernel you are running.
I would try the latest 22.20 AMD driver, making sure to get the one for Ubuntu 22.04 (and NOT the one for 20.04 HWE).
Get the deb file here: https://repo.radeon.com/amdgpu-install/22.20/ubuntu/jammy/amdgpu-install_22.20.50200-1_all.deb
_________________________________________________________________________
Good thought. I tried, but --headless is not recognized with --opencl=rocm,legacy in 22.20. However, according to the documentation (https://amdgpu-install.readthedocs.io/en/latest/install-installing.html#), the install argument --usecase=opencl is what should be used for a headless compute environment. I tried that (allowing dkms this time), but am still getting computation errors. This is all on kernel 5.15.0-47-generic.
Ideas are not fixed, nor should they be; we live in model-dependent reality.
Looks like we have the same issue, Cecht. The GPU task errors seem to be an issue specific to ROCm and newer BOINCs somehow, but I haven't yet found a real solution, just a work-around to use a BOINC older than 7.18.x - looks like you've done the same. I've known about this for months, even before updating to Ubuntu 22.04.
https://boinc.berkeley.edu/forum_thread.php?id=14786
On the note of installation, I use this for the most recent amdgpus:
amdgpu-install --opencl=rocr --usecase=opencl
That seems to be the equivalent of the old 'headless' argument, which is no longer supported.
Soli Deo Gloria
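For anyone following along, the install steps mentioned so far could be strung together roughly as below. This is only a sketch: the .deb URL and the amdgpu-install flags are the ones quoted in this thread, while the `run` helper, the `DRY_RUN` variable and the `install_amdgpu_opencl` name are made up for illustration. By default it only prints the commands, so nothing is changed on the system unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. `run`, DRY_RUN and install_amdgpu_opencl are hypothetical
# helpers for this example; they are not part of amdgpu-install.
DEB_URL="https://repo.radeon.com/amdgpu-install/22.20/ubuntu/jammy/amdgpu-install_22.20.50200-1_all.deb"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_amdgpu_opencl() {
    # Strip everything up to the last "/" to get the .deb file name.
    deb_file="${DEB_URL##*/}"
    run wget "$DEB_URL"
    run sudo apt-get install "./$deb_file"
    # Flags as quoted in this thread for a headless compute setup.
    run sudo amdgpu-install --opencl=rocr --usecase=opencl
}
```

Calling `install_amdgpu_opencl` as-is lists the three commands; `DRY_RUN=0 install_amdgpu_opencl` would actually run them. Note that the URL is the Ubuntu 22.04 (jammy) package, so it would need changing for another release.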
Wedge009, I ended up dialing back the kernel and drivers, as outlined at https://einsteinathome.org/content/troubleshooting-ubuntu-20-and-fresh-install-amd-drivers in my 10 Sept 2022 post.
The only way I could get everything working together is with Ubuntu 20.04.1, keeping the 5.4.0-42 kernel static, while running 21.10 AMD Radeon drivers and BOINC 7.16. I'm sure there are other ways (that host was running fine with the 5.14 kernel before I tried to change things up).
Ideas are not fixed, nor should they be; we live in model-dependent reality.
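One possible way to keep a kernel "static" as described above is to put its packages on hold with apt-mark. The sketch below is an assumption about how that might be done, not necessarily what was done on that host: the package names assume the stock "-generic" flavour, and the `run`, `DRY_RUN` and `hold_kernel` names are hypothetical. It only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. `run`, DRY_RUN and hold_kernel are hypothetical helpers.

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hold the image, headers and modules packages for one kernel version
# so that apt does not upgrade or remove them.
# Assumption: package names follow the stock Ubuntu linux-<part>-<version> scheme.
hold_kernel() {
    kver="$1"
    run sudo apt-mark hold "linux-image-$kver" "linux-headers-$kver" "linux-modules-$kver"
}
```

For the kernel mentioned above this would be `hold_kernel 5.4.0-42-generic`. It is worth checking the installed package names with `dpkg -l 'linux-*'` first, and `apt-mark unhold` reverses the hold.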
It's a pity I didn't see your posts sooner. I've seen for a long time that the issue seems to be with BOINC itself, or perhaps its expectations of the GPU driver, rather than with the driver or kernel. I can demonstrate that on kernel 5.15 and amdgpu 22.20.3, with the current ROCm 5.2.3, BOINC 7.16.17 works fine with ROCr-based OpenCL, whereas BOINC 7.18 and later do not.
Edit: The above applies to both Ubuntu 20.04 and 22.04.
Soli Deo Gloria
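To check quickly whether a given BOINC client falls into the range reported as problematic above (7.18 and later), a version comparison along these lines might help. It is a minimal sketch: the `boinc_version_affected` name is made up, the 7.18 threshold is taken from the post above rather than from any official statement, and it relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Sketch only. boinc_version_affected is a hypothetical helper name.
# Returns 0 (true) if the given version is 7.18 or later, 1 otherwise.
boinc_version_affected() {
    ver="$1"
    threshold="7.18"
    # sort -V orders version strings; if the threshold sorts first
    # (or both are equal), the given version is at or above it.
    lowest="$(printf '%s\n%s\n' "$threshold" "$ver" | sort -V | head -n 1)"
    [ "$lowest" = "$threshold" ]
}
```

For example, `boinc_version_affected 7.16.17 && echo affected || echo "not affected"`. The version string has to be supplied by hand or taken from the package manager, and a distro package version with extra suffixes may not compare cleanly.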
And I hope you realize that the distro maintainers providing BOINC 7.18.1 versions screwed up royally and compiled the Android release code into the x64 version. That is why it doesn't work. There never was an official 7.18 x64 .deb version release at BOINC's github.
If you want to use a later version, install Gianfranco's ppa and get your latest versions from him. He compiled them correctly.
https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc
I was already using that PPA for 7.16 and 7.20 (and quite a while before them). 7.20 does not work nicely with ROCm. Are you suggesting Gianfranco compiled Android code into 7.20? I have 7.20 working fine on everything except two hosts that rely on ROCm.
Soli Deo Gloria
No, emphatically not. The distro maintainers made the mistake, not Gianfranco.
[Edit]
As you can see from the BOINC All Versions release page, there is NO 7.18.1 Linux x86/x64 release. The latest is a 7.16.6 x64 release.
Official BOINC client releases
You will only find the 7.18.1 release in the Android section.
The distro maintainers pulled the source code from the incorrect branch tag when they compiled the client for amd64 release.
My point is that BOINC itself appears to be the issue here, not the packaging issue you describe. It seems as though Cecht is willing to stick with Ubuntu 20.04 for now (it's supported till next April anyway), so it doesn't look like he's willing to test my theory. I only know of one other person who's confirmed the same situation I've found.
Soli Deo Gloria
Since I have never had to deal directly with supporting an AMD gpu, I am at a disadvantage in offering expert assistance. I have helped teammates who have tried to run AMD gpus though and have helped them work through the typical difficulties of getting AMD gpus to run BOINC project work correctly.
All I'm saying is that the Android 7.18.1 branch had only specific Android fixes merged into it and was multiple merges behind the amd64 branch. So it was meant to work only on Android devices. It is not unreasonable to expect that lots of code for the amd64 branch was not included.
You can actually see the point the 7.18 branch was frozen in development and compare it to the amd64 branches at the BOINC github site.
Compare code changes between 7.18 and 7.20