Here is my computer. Intel with Win 7, BOINC 6.12.34 and NVIDIA GeForce GTX 560 Ti (1023MB) driver: 30124.
The CPU cores crunch CPDN and the GPU crunches Einstein. Usually the Einstein tasks are problem-free but on Thursday 7 June several crashed, sometimes almost immediately, with the 1008 error code.
In a thread about a different error Jord said about the 1008 error:
And exit code 1008:
Error during CUDA device->host power spectrum data transfer (error: 999)
[11:19:23][3784][ERROR] Demodulation failed (error: 1008)!
This is definitely another CUDA application error, it even says so!
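For anyone comparing the Stderr output of several crashed tasks, here is a minimal Python sketch that pulls the error codes out of lines shaped like the one quoted above. The bracket layout `[time][number][LEVEL] message (error: N)` is inferred only from that single quoted line, and what the second bracketed number stands for is not stated anywhere in this thread, so the field name `ident` is a placeholder. The function names `parse_log_line` and `error_codes` are hypothetical helpers, not part of BOINC or the Einstein@Home application.

```python
import re

# Assumed layout, inferred from the one quoted line:
#   [HH:MM:SS][number][LEVEL] message (error: CODE)!
LOG_LINE = re.compile(
    r"\[(?P<time>\d{2}:\d{2}:\d{2})\]"
    r"\[(?P<ident>\d+)\]"          # meaning of this number is an assumption
    r"\[(?P<level>[A-Z]+)\]\s*"
    r"(?P<message>.*?)"
    r"(?:\s*\(error:\s*(?P<code>\d+)\))?!?\s*$"
)


def parse_log_line(line):
    """Split one bracketed log line into its fields, or return None."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["code"] is not None:
        fields["code"] = int(fields["code"])
    return fields


def error_codes(stderr_text):
    """Return every '(error: N)' code found in the text, in order."""
    return [int(code) for code in re.findall(r"\(error:\s*(\d+)\)", stderr_text)]


if __name__ == "__main__":
    quoted = (
        "Error during CUDA device->host power spectrum data transfer (error: 999) "
        "[11:19:23][3784][ERROR] Demodulation failed (error: 1008)!"
    )
    print(error_codes(quoted))     # [999, 1008]
    print(parse_log_line(quoted))
```

Running `error_codes` over the Stderr text of each failed task makes it easy to see whether they all end with the same 999 and 1008 pair.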
Here are the Stderr messages for one of my crashed tasks:
http://einsteinathome.org/task/291877250
The Stderr output for the other crashed tasks seems to me very similar.
Assuming, as Jord says, that this is a CUDA application error, I expected to find that other computers crunching the same tasks with NVIDIA GPUs would also crash them, probably with the same error code. But for this workunit, for example, the other computer completed it. Here's another example with a successful completion by a computer that doesn't seem very different from mine. This makes me think that the problem lies not in the task but in my computer (or in its GPU).
On the same day some tasks completed successfully and were validated but 5 produced a validation error. Here is one of the five. Another computer with the same type of GPU appears to have completed the task successfully.
As the 1008 error crashes and the validate errors are interspersed on the same date, and both are unusual for this computer, I am wondering whether both have the same cause.
On this computer if I don't reboot it soon enough, after about six or seven days I see a green mark in a particular position on the blue title bar. It's a bad sign. Later the visual display may crash out for a second, producing a totally black screen, and a message appears saying that the graphics card encountered a problem but has recovered. Random spots of misplaced colour may also appear. A reboot solves this graphics display problem completely. This also happened on 7 June.
Could the NVIDIA card's need to be rebooted have caused both the 1008 crashes and the validate errors?
Error 1008
Definitely yes. After such driver resets, only a reboot can reactivate the GPU (memory?) for crunching.
Regards,
Gundolf
Computers aren't everything in life. (Just a little joke)
Thanks. That explains the situation. I shall have to get into a better habit of rebooting more frequently and before I see the green mark on the title bar.