Unfortunately, early this morning I aborted several hanging tasks that were reporting 100% but not finishing on my Linux hosts.
I thought then that it was a defective batch.
You'll find these tasks as Error - Aborted tasks for each host.
Some of them were running for more than 30000 seconds...
One of my Mint machines is still at Mint 19.1 (Ubuntu 18.04), but I've just started a pair of 8-day CPDN tasks on it. I could try a driver roll-back if it's really important, but I'd prefer not to.
I can confirm that the faulting tasks run OK on a Windows GeForce GTX 1050 Ti with driver: 442.74, and on the HD 530 intel_gpu on the same machine (host 12496320).
yeah, no worries if you don't want to. I was just reaching for some extra data points. this situation feels similar to the issue seen on SETI last year with the SoG tasks on certain nvidia drivers. so far, this particular issue seems to be driver-agnostic, but you never know. If someone is able to give a conclusive answer it would just be another variable to rule out.
can anyone with a Linux host and older kernel (5.4 or earlier) try the nvidia driver from the 440 generation or older?
I have kernel 5.8+ and it seems I can't install the older driver on this kernel.
ServicEnginIC
Same here. Aborting all the bad tasks.
GR tasks:
Win7 and Win10
Newest drivers.
All GR get computation error.
Nvidia driver recovery several times, then PCs get BSOD with driver error 116.
Have aborted ALL tasks.
Hope tech guys are up and running!
FWIW, the following setting is doing fine:
GPU:NVIDIA GeForce GTX 1060 3GB (3016MB) driver: 450.10
OS:Linux LinuxMint Linux Mint 19.3 Tricia [5.4.0-62-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1.4)]
BOINC client version:7.9.3
_________________________________________________________________________
solling2 wrote: FWIW, the following setting is doing fine:
the issue seems to only affect Volta/Turing/Ampere cards. anything Pascal and earlier seems to be unaffected. your GTX 1060 is Pascal.
_________________________________________________________________________
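Ian&Steve C. wrote: can anyone with a Linux host and older kernel (5.4 or earlier) try the nvidia driver from the 440 generation or older?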
could this be similar to the driver issue on ubuntu 20 and amd drivers here:
No. Totally separate issue.
_________________________________________________________________________
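Ian&Steve C. wrote: can anyone with a Linux host and older kernel (5.4 or earlier) try the nvidia driver from the 440 generation or older?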
I've got a GTX 1660 Ti on a Ryzen 3700X running Ubuntu 18.04 (kernel 5.3) with version 440.10 drivers, and it was failing these - hope that's old enough for a useful data point!
That machine is, of course, on NNT and having an E@H holiday at present (running GW doesn't seem to go well with CPDN and some WCG CPU stuff, so I don't switch over...)
Cheers - Al.
Thanks for the new datapoint. That is a Turing card which is one of the problem architectures.
But you have an older kernel and an older driver which was one of the parameters that we were hoping was not affected.
So, it is beginning to look as though the OS, kernel and drivers are not the problem. The hardware architecture looks like the root cause.
I wonder if one of the task parameters is not able to handle the CC level or the commands available to the latest architectures of Volta/Turing/Ampere.
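For anyone who wants to sort their own hosts against the pattern reported so far, here is a rough Python sketch. It maps a CUDA compute capability (CC) to its architecture name and flags the architectures that have been failing in this thread. The helper names (`architecture`, `looks_affected`) and the `REPORTED_CARDS` table are made up for illustration, and the "affected" set is only what posters have observed here, not a confirmed diagnosis from the project.

```python
# Rough sketch: classify a GPU by CUDA compute capability (CC) and compare it
# with the pattern reported in this thread. The AFFECTED set is an assumption
# based on forum reports only, not an official statement from the project.

AFFECTED = {"Volta", "Turing", "Ampere"}


def architecture(major: int, minor: int) -> str:
    """Return the NVIDIA architecture name for a compute capability."""
    if major == 8 and minor in (0, 6, 7):
        return "Ampere"
    if major == 7 and minor == 5:
        return "Turing"
    if major == 7 and minor in (0, 2):
        return "Volta"
    if major == 6:
        return "Pascal"
    if major == 5:
        return "Maxwell"
    if major == 3:
        return "Kepler"
    return "unknown"


def looks_affected(major: int, minor: int) -> bool:
    """True if the card belongs to an architecture reported as failing here."""
    return architecture(major, minor) in AFFECTED


# Cards mentioned in this thread, with their compute capabilities.
REPORTED_CARDS = {
    "GTX 1050 Ti": (6, 1),   # reported running the tasks OK
    "GTX 1060 3GB": (6, 1),  # reported running the tasks OK
    "GTX 1660 Ti": (7, 5),   # reported failing the tasks
}

if __name__ == "__main__":
    for name, (major, minor) in REPORTED_CARDS.items():
        status = "in the affected group" if looks_affected(major, minor) else "not in the affected group"
        print(f"{name}: CC {major}.{minor}, {architecture(major, minor)}, {status}")
```

The three cards listed match what has been posted so far: both Pascal cards land outside the affected group and the Turing card lands inside it.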