Hi, I started working with E@H a few days ago, but I cannot get it to use my GPU (Nvidia GTX 1080). All drivers are up to date.
Log:
02.04.2017 20:34:27 | | Starting BOINC client version 7.6.33 for windows_x86_64
02.04.2017 20:34:27 | | log flags: file_xfer, sched_ops, task, cpu_sched_status
02.04.2017 20:34:27 | | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
02.04.2017 20:34:27 | | Running as a daemon (GPU computing disabled)
02.04.2017 20:34:27 | | Data directory: C:\ProgramData\BOINC
02.04.2017 20:34:27 | | Running under account boinc_master
02.04.2017 20:34:27 | | No usable GPUs found
02.04.2017 20:34:27 | | Host name: PC
02.04.2017 20:34:27 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz [Family 6 Model 158 Stepping 9]
02.04.2017 20:34:27 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx tm2 pbe fsgsbase bmi1 hle smep bmi2
02.04.2017 20:34:27 | | OS: Microsoft Windows 10: Core x64 Edition, (10.00.14393.00)
02.04.2017 20:34:27 | | Memory: 15.90 GB physical, 18.27 GB virtual
02.04.2017 20:34:27 | | Disk: 223.02 GB total, 62.17 GB free
02.04.2017 20:34:27 | | Local time is UTC +2 hours
02.04.2017 20:34:27 | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 12512705; resource share 100
02.04.2017 20:34:27 | | No general preferences found - using defaults
02.04.2017 20:34:27 | | Reading preferences override file
02.04.2017 20:34:27 | | Preferences:
02.04.2017 20:34:27 | | max memory usage when active: 8138.25MB
02.04.2017 20:34:27 | | max memory usage when idle: 14648.84MB
02.04.2017 20:34:27 | | max disk usage: 65.11GB
02.04.2017 20:34:27 | | max CPUs used: 6
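Two lines in the startup log above mention the GPU directly: "Running as a daemon (GPU computing disabled)" and "No usable GPUs found". A minimal Python sketch for pulling such lines out of a pasted event log follows. The function name and the phrase list are illustrative assumptions; the phrases are copied from this log and are not an exhaustive list of BOINC messages.

```python
# Phrases copied from the log above; an assumption, not a complete list.
GPU_WARNINGS = (
    "GPU computing disabled",
    "No usable GPUs found",
)


def find_gpu_warnings(log_text):
    """Return (timestamp, message) pairs for log lines that mention a GPU problem.

    Assumes the 'timestamp | project | message' layout shown in the log above.
    """
    hits = []
    for line in log_text.splitlines():
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not an event log line
        timestamp, _project, message = parts
        if any(phrase in message for phrase in GPU_WARNINGS):
            hits.append((timestamp, message))
    return hits


sample = (
    "02.04.2017 20:34:27 | | Running as a daemon (GPU computing disabled)\n"
    "02.04.2017 20:34:27 | | Data directory: C:\\ProgramData\\BOINC\n"
    "02.04.2017 20:34:27 | | No usable GPUs found\n"
)
for timestamp, message in find_gpu_warnings(sample):
    print(timestamp, message)
```

Both messages appear before any project is contacted, so they describe the state of the local client at startup and are worth checking alongside the driver.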
Copyright © 2024 Einstein@Home. All rights reserved.
It doesn't look like you have any drivers for OpenCL. Did Windows install your graphics drivers? Windows is notorious for downloading its own drivers, which disables OpenCL. Go to Nvidia's website, download the driver directly from them, and do a clean install. Don't allow Windows to install your drivers in the future, or next Tuesday you will find yourself right back where you started with no OpenCL.
I have just installed the latest NVIDIA driver directly from NVIDIA, but the GTX 1080 is still not identified.
Johannes_35
You aren't the only one to have problems.
On my Windows 7 laptop with a GeForce GTX 460M, all E@H GPU tasks fail after a few hours: the screen starts flashing on and off, a message says the driver crashed and had to be recovered, and the task fails.
Last one :
27/08/2017 01:01:05 | Einstein@Home | Computation for task LATeah0138L_1060.0_0_0.0_16336335_0 finished
27/08/2017 01:01:05 | Einstein@Home | Output file LATeah0138L_1060.0_0_0.0_16336335_0_0 for task LATeah0138L_1060.0_0_0.0_16336335_0 absent
27/08/2017 01:01:05 | Einstein@Home | Output file LATeah0138L_1060.0_0_0.0_16336335_0_1 for task LATeah0138L_1060.0_0_0.0_16336335_0 absent
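When several tasks fail, it can help to list which tasks ended with missing output files. The sketch below is a rough Python example that groups the "Output file ... absent" lines by task. The regular expression is an assumption based only on the wording of the log lines quoted above, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines above; an assumption, not a BOINC spec.
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the output files the log reports as absent."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = ABSENT.search(line)
        if match:
            output_file, task = match.groups()
            result[task].append(output_file)
    return dict(result)


sample = (
    "27/08/2017 01:01:05 | Einstein@Home | Computation for task "
    "LATeah0138L_1060.0_0_0.0_16336335_0 finished\n"
    "27/08/2017 01:01:05 | Einstein@Home | Output file "
    "LATeah0138L_1060.0_0_0.0_16336335_0_0 for task "
    "LATeah0138L_1060.0_0_0.0_16336335_0 absent\n"
    "27/08/2017 01:01:05 | Einstein@Home | Output file "
    "LATeah0138L_1060.0_0_0.0_16336335_0_1 for task "
    "LATeah0138L_1060.0_0_0.0_16336335_0 absent\n"
)
for task, files in absent_outputs(sample).items():
    print(task, "is missing", len(files), "output file(s)")
```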
Is there a recommended driver version? Do I need a specific software package? Is there a special way to load BOINC? I can't find any information about it.
Johannes_35 wrote: I have just installed the latest NVIDIA driver directly from NVIDIA. But the GTX 1080 is still not identified.
Once, when I had a persistent failure to identify the GPU, it only got fixed when I did a full driver uninstall, followed by an install of the current driver direct from Nvidia with the "clean install" option selected. I had already failed using the same driver, including with the use of DDU. I think in that specific case the clean install option removed some leftover setting which somehow contributed to the problem.
This is not to say that preliminary uninstall, use of DDU, or selection of the clean install option are always necessary. I have often had good success just running the downloaded installer with no special measures.
'tis a mystery, it is.
_AF_France_Aquitaine_Cote-Ado,
Laptops generally tend to be susceptible to overheating, especially since low-cost producers may skimp on fans or power supply units. Have you monitored temperature and memory load recently, or overclocked anything? To my knowledge no special BOINC installation is required, other than a recent release (which generally applies to all software).
In fact my GPU is an NVIDIA GeForce GT 540M (2048MB), driver 385.41.
There is no overheating: my laptop has a very big fan, not silent at all. GPU tasks have been running on it for years, including 7 x 24 use.
I'm currently crunching SETI GPU tasks (CUDA and OpenCL) 7 x 24, and they are running without any problem.
The strange thing is that Einstein GPU tasks fail with an error just before reaching 100%. Overheating would stop them at any point, not only at the end.
Result: my computer works for hours and hours for nothing.
_AF_France_Aquitaine_Cote-Ado,
How do you know there is no overheating? A fan making a lot of noise is a sign of a fan needing to work hard. If you have been running "for years" how often have you inspected/cleaned/serviced the heat sink and fan assembly? I used to run tasks on just the CPU cores of a couple of laptops a few years ago. I found I needed to remove fluff/dust from fins/filters etc, about every 6 months. Even then, both died prematurely from the stress of crunching. Laptops are not designed to cope with crunching loads 24/7.
It's a fallacy to think that because a different project works OK, the problem must be with this project. Different apps put different stresses on devices. I run lots of desktop style GPUs and my experience is that when tasks fail there are two main culprits - heat and quality/adequacy of power. Occasionally, problems may be caused by a particular driver/software update but when that happens it's usually rather obvious because lots of people have the same issue. At the moment, that doesn't seem to be happening. After heat and power, the third most likely cause of problems is hardware issues, particularly as components (eg capacitors) age.
You currently have 5 failed tasks showing. Two were around the 11k and 16k mark. Three were over 40k. What %complete was showing for the longest run time? The estimate stops increasing at 89.997% on all of mine and then jumps to 100%. Initial crunching has actually finished at this point. There is a followup stage which assesses the 10 highest ranking candidate signals that have been found in the initial pass. Double precision is used in this followup stage. If the GPU isn't capable of double precision, the followup stage is done on the CPU. I imagine this is the point of highest stress so perhaps it's not surprising that tasks may fail at this point if the hardware is not 'robust' enough.
I know you probably don't want to hear all this but I agree with Solling2's assessment.
Cheers,
Gary.
Your host is running Windows, so you could test whether downclocking the GPU changes the situation in some way. MSI Afterburner, for example, can be used to set the maximum core clock somewhat lower than the default. How far the default speed can be adjusted depends on the GPU model.
https://www.msi.com/page/afterburner##downloads
TThrottle is another program you could use to set a maximum temperature limit for the GPU.
https://efmer.com/tthrottle/
Both programs can be used to monitor the load temperatures of both the CPU and GPU. I would try setting the GPU core clock 200 MHz lower than default (if possible) or setting the max temp to 70 C, and then see if tasks still error out.
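As a command-line alternative to those GUI tools, the 70 C check can be sketched in Python using nvidia-smi, which is installed with the Nvidia driver. This is only a sketch under assumptions: the function names are made up for illustration, and some older or mobile GPUs report fields as unsupported, in which case the integer parsing below would fail.

```python
import subprocess

MAX_TEMP_C = 70  # the limit suggested above


def parse_readings(csv_text):
    """Parse 'temperature, clock' CSV rows (one per GPU) into integer pairs."""
    readings = []
    for line in csv_text.strip().splitlines():
        temp, clock = (field.strip() for field in line.split(","))
        readings.append((int(temp), int(clock)))
    return readings


def over_limit(readings, max_temp_c=MAX_TEMP_C):
    """Return the indexes of GPUs whose temperature exceeds the limit."""
    return [i for i, (temp, _clock) in enumerate(readings) if temp > max_temp_c]


def read_gpus():
    """Query temperature (C) and graphics clock (MHz) from nvidia-smi."""
    output = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=temperature.gpu,clocks.gr",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_readings(output)
```

Calling `over_limit(read_gpus())` every minute or so while an Einstein task is running would show whether the temperature climbs past 70 C near the end of the task.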
Latest attempt on my Windows 10 PC, GTX 1050 Ti, nVidia driver 385.41