GeForce 8400 GS PCI errors in GPU computing

Mike
Joined: 19 Jan 16
Posts: 4
Credit: 120271621
RAC: 0
Topic 198468

I just resurrected an old P4 with no PCIe slot and added a PCI version of the 8400 GS. BOINC recognizes it as a usable GPU. I received three "error while computing" results for CUDA tasks with it in Einstein, but the same rig has successfully completed tasks and received credit in SETI.

Can anyone out there enlighten me?

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Hi!

I'm an amateur but I'm sure the gurus at this place could use some additional information.

What OS, BOINC version, Einstein@Home application and graphics driver version are you running?

Also... please post what "system recognition info" the BOINC Manager event log shows when you start BOINC.

Also... it might help spot something if you could share links to those three tasks that ended with "error while computing".

Best luck!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118281729460
RAC: 24958507

In your list of hosts you have two with 8400 GS GPUs. One has successfully returned a task, the other has three compute errors.

By clicking on the taskID for one of the problem tasks, you can see the same start to the output in each case, namely

Stderr output

7.6.22

An attempt was made to reference a token that does not exist.
(0x3f0) - exit code 1008 (0x3f0)

....
....


That doesn't tell you much on its own. My guess is that the hardware is operating beyond its limits.
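
For reference, the 1008 and 0x3f0 in that stderr output are the same number in decimal and hex, and the message text appears to be the standard Windows system error string for that code (ERROR_NO_TOKEN). A minimal Python sketch of the conversion; the helper name to_hex is made up for illustration, and the message lookup only works on Windows:

```python
import sys

EXIT_CODE = 1008  # decimal exit code from the stderr output above


def to_hex(code):
    """Return the hex form of an exit code, as it appears in the log."""
    return hex(code)


print(to_hex(EXIT_CODE))   # 0x3f0
print(int("0x3f0", 16))    # 1008

# On Windows only, ctypes can look up the system message for the code.
if sys.platform == "win32":
    import ctypes
    print(ctypes.FormatError(EXIT_CODE))
```

The code itself says nothing about which component failed, so it doesn't narrow things down beyond what the task output already shows.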

In each case the task was partly crunched, which suggests that some sort of intermittent hardware issue (possibly temperature related) brought things crashing down.

You could try checking the cooling of the card - maybe renew the thermal paste. My guess is that it's a thermal issue, so if you can open the case and direct a fan at the card, you might be able to prove this if tasks then complete.

You could also try swapping the card with the other one of the same type. Does the problem go with the card or stay with the host? That might help narrow it down.

EDIT: Sorry, just realised that you mentioned this was a PCI card and not PCIe, so I guess you won't be swapping cards any time soon :-).

Cheers,
Gary.

Mike
Joined: 19 Jan 16
Posts: 4
Credit: 120271621
RAC: 0

Thank you both. I actually do have another PCI 8400 GS en route, so I can swap cards and see if it's a host problem.

If that crunches OK, then it is the card. I thought maybe it was a driver issue and that I might have to roll back a version or two.

In the meantime, I've suspended Einstein tasks for that system.

Again, I appreciate the help.

Mike
Joined: 19 Jan 16
Posts: 4
Credit: 120271621
RAC: 0

The same card crunched CUDA for SETI after the Einstein errors:

7.6.22

v8 task detected
setiathome_CUDA: Found 1 CUDA device(s):
nVidia Driver Version 340.52
Device 1: GeForce 8400 GS, 511 MiB, regsPerBlock 8192
computeCap 1.1, multiProcs 1
pciBusID = 2, pciSlotID = 0
clockRate = 1400 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce 8400 GS is okay

Task was completed and validated.

Puzzling.

Thanks Again

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118281729460
RAC: 24958507

OK, then swapping the cards will be a useful test.

You should also consider that when cards are quite old, there could easily be an issue with voltage stability and regulation on the card. Capacitors in the voltage regulator circuitry on consumer grade equipment have a habit of failing internally (not always externally visible) after a 'normal' lifetime, leading to a progressive degradation of power quality which can cause crashes and lockups.

There comes a time when replacing a number of old clunkers with a single more modern unit with a decent GPU will drastically increase production whilst giving you lower running costs and heat output. This can ultimately more than pay for itself with just the power savings.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 118281729460
RAC: 24958507

Quote:

The same card crunched CUDA for SETI after the Einstein errors:

Task was completed and validated.

Puzzling.


Not really.

Different apps put different stresses on the hardware. This behaviour really confirms that there is some part of the card that is overstressed by one app but not the other. That is not really unexpected.

It could be heat or it could be voltage stability. Can you try an Einstein task while you have a room fan blowing furiously on the card? Does the GPU have a fan and, if so, does it run at full speed when crunching?

How old is the PSU in the machine? Does it have decent specs on the 12V rail? Is it a decent brand? Can you look inside it for signs of swollen capacitors?

The Einstein app has been made more efficient over the years. It will most likely be putting more strain on the card.

Cheers,
Gary.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

Your card might just be too old to run the required CUDA version: SETI only needs CUDA 2.0, whereas E@H requires 3.2.

In addition, I have some vague recollections that the oldest generation of GeForce cards fell out of support here a few years ago.
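
For anyone wanting to check what the card actually reports, the SETI stderr output quoted earlier in the thread already lists the driver version and compute capability. Here is a minimal Python sketch that pulls those fields out of such a log. The function name parse_cuda_log is made up for illustration, and the regular expressions assume the exact line format shown in that output; other apps may log differently.

```python
import re

# Excerpt of the SETI stderr output quoted earlier in this thread.
LOG = """\
nVidia Driver Version 340.52
Device 1: GeForce 8400 GS, 511 MiB, regsPerBlock 8192
computeCap 1.1, multiProcs 1
clockRate = 1400 MHz
"""


def parse_cuda_log(text):
    """Extract driver, device name, memory and compute capability.

    Hypothetical helper: fields that are not found are returned as None.
    """
    patterns = {
        "driver": r"Driver Version\s+([\d.]+)",
        "device": r"Device \d+:\s*([^,\n]+),",
        "memory_mib": r"(\d+)\s*MiB",
        "compute_cap": r"computeCap\s+([\d.]+)",
    }
    info = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        info[key] = match.group(1) if match else None
    return info


info = parse_cuda_log(LOG)
for key, value in info.items():
    print(f"{key}: {value}")
```

This only shows what the card and driver report. Whether that is enough for the Einstein app is a separate question that the project's own requirements would have to answer.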
