E@H Suddenly Stops Using One Card

Gamboleer
Joined: 5 Dec 10
Posts: 173
Credit: 168389195
RAC: 0
Topic 196678

Hello,

I have a Windows 8 system running a GTX650ti and a GTX650; both were doing two simultaneous tasks with all four cores of a 4-core processor free (on my list of computers, STRIKEAIR). Ran great for several weeks, then one morning I get up and find that overnight, something has changed -- only one GPU is running, and it's running only one task.

I hadn't changed a thing - the system hadn't been rebooted, nor BOINC or drivers updated, though Windows 8 updates itself so it's possible there was a Win8 update.

GPU-z shows both cards. Device manager the same. I ended up aborting a bunch of tasks, reinstalling BOINC and resetting my local settings. Now I'm getting two simultaneous tasks again, but only on card 1.

Any suggestions for an order of battle for debugging this? Because the system had dropped to one task at a time on the primary card, I'm suspecting a software problem.

Thanks - Alec

juan BFP
Joined: 18 Nov 11
Posts: 839
Credit: 421443712
RAC: 0

Besides a hardware malfunction... Just follow this thread and you will get your answer and how to fix it. By default BOINC uses only the "most capable GPU"; I just don't know what that really means, but it makes no real difference, just follow the thread.

http://einsteinathome.org/node/196636

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Hello,

I am not too sure what changed to cause your second card to become inactive.

However, try creating a file called cc_config.xml at the following location:

C:\ProgramData\BOINC\projects\einstein.phys.uwm.edu

Add the following:


<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>

Restart BOINC and check the event log to see if both GPUs are detected and crunching.

If that does not resolve the issue, then there is another item to check.

Right-click on the desktop and open the NVIDIA Control Panel. On the left-side of the window, select 3D Settings and Manage 3D settings. Under the Global Settings tab, confirm that CUDA - GPUs is set to All.

Gamboleer
Joined: 5 Dec 10
Posts: 173
Credit: 168389195
RAC: 0

Thanks for the replies. I'm now having a vague memory of having to use the config file to force recognition of a second card in SETI on another machine (however, that one never recognized card 2 until I made the change). I'll try and report back, probably tomorrow.

Alec

mikey
Joined: 22 Jan 05
Posts: 12705
Credit: 1839110349
RAC: 3608

Quote:

Thanks for the replies. I'm now having a vague memory of having to use the config file to force recognition of a second card in SETI on another machine (however, that one never recognized card 2 until I made the change). I'll try and report back, probably tomorrow.

Alec

Be sure to reboot the machine too! Windows CANNOT reset a gpu without a full system restart!! So IF a gpu crashes it will NOT start back up on its own in Windows.

Janus
Joined: 10 Nov 04
Posts: 27
Credit: 23862534
RAC: 58

Quote:
Quote:

Thanks for the replies. I'm now having a vague memory of having to use the config file to force recognition of a second card in SETI on another machine (however, that one never recognized card 2 until I made the change). I'll try and report back, probably tomorrow.

Alec

Be sure to reboot the machine too! Windows CANNOT reset a gpu without a full system restart!! So IF a gpu crashes it will NOT start back up on its own in Windows.


I know this is off-topic, but you can reset a non-primary GPU in Windows by disabling it in Device Manager and then re-enabling it. This will bring those pesky downclocked Nvidias back on track.
To reset a primary GPU without restarting, simply reinstall the current GPU driver.
If anyone has a better solution please say so; I've been looking for a smoother way to knock Nvidia GPUs out of safe mode for a while.

Gamboleer
Joined: 5 Dec 10
Posts: 173
Credit: 168389195
RAC: 0

So...I made the cc_config.xml file using notepad, pasted the text from above, saved, changed the extension to .xml, placed inside my E@H folder in the ProgramData > BOINC > Projects > einstein.phys.uwm.edu folder that was there.

No change.

1) Flashed my BIOS, since I had V13 and Gigabyte's website for my mobo said V17 was available, which supported my CPU and Windows 8.

No change.

2) Re-seated the card, verified that its 6-pin is hooked up and fan running (it doesn't run if I forget the cable), verified that my card is seen in device manager and working properly, and hooked up another monitor; it's displaying video just fine.

3) Checked to see if Windows Update had downloaded something new - it had not.

4) Verified that my nVidia settings are set for all cards to use CUDA.

5) Rebooted. Several times, including full powerdown.

Here's something interesting from my Event Log (see bottom lines):

12/13/2012 7:31:16 PM | | No config file found - using defaults
12/13/2012 7:31:16 PM | | Starting BOINC client version 7.0.28 for windows_x86_64
12/13/2012 7:31:16 PM | | log flags: file_xfer, sched_ops, task
12/13/2012 7:31:16 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
12/13/2012 7:31:16 PM | | Data directory: C:\ProgramData\BOINC
12/13/2012 7:31:16 PM | | Running under account Ostrich
12/13/2012 7:31:16 PM | | Processor: 4 GenuineIntel Intel(R) Core(TM) i3-3220T CPU @ 2.80GHz [Family 6 Model 58 Stepping 9]
12/13/2012 7:31:16 PM | | Processor: 256.00 KB cache
12/13/2012 7:31:16 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt pbe
12/13/2012 7:31:16 PM | | OS: Microsoft Windows 8: x64 Edition, (06.02.9200.00)
12/13/2012 7:31:16 PM | | Memory: 7.96 GB physical, 15.96 GB virtual
12/13/2012 7:31:16 PM | | Disk: 931.00 GB total, 896.64 GB free
12/13/2012 7:31:16 PM | | Local time is UTC -7 hours
12/13/2012 7:31:16 PM | | NVIDIA GPU 0: GeForce GTX 650 Ti (driver version 306.97, CUDA version 5.0, compute capability 3.0, 1024MB, 8364946MB available, 1646 GFLOPS peak)
12/13/2012 7:31:16 PM | | NVIDIA GPU 1 (not used): GeForce GTX 650 (driver version 306.97, CUDA version 5.0, compute capability 3.0, 1024MB, 940MB available, 813 GFLOPS peak)
12/13/2012 7:31:16 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 650 Ti (driver version 306.97, device version OpenCL 1.1 CUDA, 1024MB, 8364946MB available)
12/13/2012 7:31:16 PM | | OpenCL: NVIDIA GPU 1 (not used): GeForce GTX 650 (driver version 306.97, device version OpenCL 1.1 CUDA, 1024MB, 940MB available)

As you can see, BOINC sees but simply isn't using the second card.
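For anyone wanting to check a startup log for the same symptom, here is a minimal Python sketch (not part of BOINC) that scans event-log lines for the "No config file found" message and for GPUs flagged "(not used)". The function name summarize_event_log and the regular expression are assumptions based only on the line format shown in the log above, and the sample lines are abbreviated copies of it.

```python
import re

# Assumed line shape, taken from the log excerpt in this post:
# "<date> <time> | | NVIDIA GPU 1 (not used): GeForce GTX 650 (driver version ..."
GPU_LINE = re.compile(
    r"\|\s*(?P<api>OpenCL: )?NVIDIA GPU (?P<index>\d+)"
    r"(?P<unused> \(not used\))?: (?P<name>.+?) \(driver version"
)


def summarize_event_log(lines):
    """Return (no_config_warning, gpus).

    no_config_warning is True if any line reports a missing config file.
    gpus maps GPU index -> (name, used) for the CUDA detection lines;
    the duplicate "OpenCL:" lines are skipped.
    """
    no_config_warning = False
    gpus = {}
    for line in lines:
        if "No config file found" in line:
            no_config_warning = True
        match = GPU_LINE.search(line)
        if match and not match.group("api"):
            used = match.group("unused") is None
            gpus[int(match.group("index"))] = (match.group("name"), used)
    return no_config_warning, gpus


# Abbreviated copies of lines from the log above.
SAMPLE = [
    "12/13/2012 7:31:16 PM | | No config file found - using defaults",
    "12/13/2012 7:31:16 PM | | NVIDIA GPU 0: GeForce GTX 650 Ti (driver version 306.97, CUDA version 5.0)",
    "12/13/2012 7:31:16 PM | | NVIDIA GPU 1 (not used): GeForce GTX 650 (driver version 306.97, CUDA version 5.0)",
    "12/13/2012 7:31:16 PM | | OpenCL: NVIDIA GPU 1 (not used): GeForce GTX 650 (driver version 306.97)",
]

no_config, detected = summarize_event_log(SAMPLE)
print("config file missing:", no_config)
for index, (name, used) in sorted(detected.items()):
    print(index, name, "used" if used else "NOT used")
```

A missing-config warning together with a "(not used)" GPU is the combination seen here; other BOINC versions may word these lines differently, so treat the pattern as a starting point.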

One odd thing I did notice; somehow my Windows clock had got about 16 hours ahead of true time; I could not say for certain if I had somehow mucked up the time setting when I first installed, but that would be unlike me.

Any further ideas?

Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

Quote:

So...I made the cc_config.xml file using notepad, pasted the text from above, saved, changed the extension to .xml, placed inside my E@H folder in the ProgramData > BOINC > Projects > einstein.phys.uwm.edu folder that was there.

No change. Here's something interesting from my Event Log (see bottom lines):

12/13/2012 7:31:16 PM | | No config file found - using defaults


The cc_config.xml file has to be placed in the "ProgramData > BOINC" folder, not inside the project folder.
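If you would rather not hand-edit the file in Notepad, a rough Python sketch of the same step follows. It assumes the option being set is BOINC's use_all_gpus and that the target is the BOINC data directory (C:\ProgramData\BOINC according to the event log quoted above). The helper name write_cc_config is made up for illustration, and the demo writes to a temporary directory rather than a real BOINC install.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_cc_config(data_dir):
    """Create cc_config.xml with use_all_gpus=1 directly inside data_dir.

    Refuses to overwrite an existing file so that other client options
    already configured there are not lost. Returns the path written.
    """
    target = Path(data_dir) / "cc_config.xml"
    if target.exists():
        raise FileExistsError(f"{target} already exists; edit it by hand")

    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "use_all_gpus").text = "1"

    target.write_text(ET.tostring(root, encoding="unicode"), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory; on a real machine you would pass
    # the BOINC data directory instead (assumed: C:\ProgramData\BOINC).
    with tempfile.TemporaryDirectory() as tmp:
        print(write_cc_config(tmp).read_text(encoding="utf-8"))
```

After the file is in place, restart BOINC and check that the first line of the event log no longer says "No config file found".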

cliff west
Joined: 19 Feb 05
Posts: 6
Credit: 16063759
RAC: 0

I have been having the same problem. I am running two GTX 570s. One day they were both working; then an update was pushed, my system rebooted, and now only one is crunching. I also have some 3D software called 3D Vision that I did not allow to finish loading (I don't have a 3D monitor, so why would I need it?).

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Sorry, I accidentally specified the wrong location for the cc_config.xml file. It should be in the base BOINC directory, not the project directory.

Gamboleer
Joined: 5 Dec 10
Posts: 173
Credit: 168389195
RAC: 0

Fixed. Both GPUs crunching now.

Somehow I feel like I should have figured out myself the location was wrong when I saw the top line of the Event Log being "No config file found - using defaults".

But! I am happy I have updated my BIOS.

Thanks much for the assistance.
