RAC drop?

Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0
Topic 196642

G'day all

Recently I swapped out one of my older machines for a newer 975xbx2 with a quad core and a newer (but still oldish) GPU. I've noticed, however, that my RAC has dropped off quite a lot.

Am I missing something? That machine, like all the others, is still happily crunching away. None of the temps are excessive, and core utilisation is fine.

I did update the Nvidia drivers, so that's the only thing I changed other than adding faster hardware.

Has Nvidia borked their drivers again? Or has E@H changed credit amounts?

Any ideas?

Cheers
Darren

archae86
Joined: 6 Dec 05
Posts: 2,950
Credit: 3,965,889,502
RAC: 5,124,669

Darren wrote:
Or has E@H changed credit amounts?

No

Quote:

Any ideas?


I have two GTX460 hosts. They will run happily for weeks at a time, then suddenly find a way to drop to much lower productivity. The only consistent symptoms are lower power consumption (the model of UPS I use on both hosts happens to support a decent onscreen power monitor), lower than expected GPU temperature, and suddenly much longer completion times.

Sometimes GPU clock speed monitoring tools report a major downclock (i.e. to 100 MHz), but other times they report normal clock rate and yet the GPU productivity, as given directly by completion times, and indirectly by clock rate and temperature, is impaired.

Once when I had a particularly bad case of this it got drastically better when I updated to the Nvidia driver version I'm currently on (306.97), but I have had at least a case or two since. Sometimes when I'm in that state rebooting the PC fixes things right up, sometimes not. Sometimes just starting up a different monitoring program seems to have tipped it back to normal behavior (I think I've seen cases of this both from OC Guru II and from SIV64X), but for me nothing has been a consistent fix, and the different monitoring tools don't always report the system to be in a problem state.

None of which may be the trouble you have, but if you have not already done so, it might be good to occasionally review your completion time distribution, and watch for changes.

Good luck, and let us know if you figure out something.
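
For anyone wanting to automate the completion-time check suggested above, here is a minimal Python sketch. It is only an illustration: the function name `runtime_shift`, the window size, and the sample run times are all made up, and the real run times would have to be copied from the host's task list.

```python
from statistics import median


def runtime_shift(run_times, recent=10):
    """Return median(recent run times) / median(earlier run times).

    run_times: completion times in seconds, oldest first, all for the
    same application so the comparison is like for like.
    """
    if len(run_times) <= recent:
        raise ValueError("need more tasks than the recent window")
    baseline = median(run_times[:-recent])
    latest = median(run_times[-recent:])
    return latest / baseline


# Hypothetical GPU task run times in seconds, oldest first.
times = [3600, 3650, 3580, 3620, 3610, 3590, 5400, 5500, 5450, 5380]
ratio = runtime_shift(times, recent=4)
print(f"recent/baseline median ratio: {ratio:.2f}")
```

A ratio well above 1 suggests the host has slipped into a slower state. A ratio near 1 means completion times are steady and the cause of a RAC drop lies elsewhere, for example in results that fail validation.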

Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0

Decent thought there.

I went through them and checked, but completion times haven't really changed.

Might revert to older drivers and see what happens...

I've dropped about 20% off RAC over the last couple of weeks, even though the new hardware is a fair bit more powerful.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,390
Credit: 51,891,827,773
RAC: 71,126,205

Quote:
I've dropped about 20% off RAC over the last couple of weeks, even though the new hardware is a fair bit more powerful.


Have you noticed all the validate errors on your i7-860 box? Over time, this would have quite a negative effect on your RAC.

I would guess that something in that setup is going past its limits. I wouldn't think it was anything to do with drivers. Time to check all the usual stuff :-). Good luck with the hunt!

Cheers,
Gary.
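
To put rough numbers on how failed validations drag RAC down, here is a small Python sketch. It assumes RAC behaves like an exponentially weighted average of daily granted credit with a one-week half-life, which is how BOINC's RAC is generally described. The real calculation updates per granted result rather than once a day, and the function name and credit figures here are hypothetical.

```python
def simulate_rac(start_rac, daily_credit, days, half_life_days=7.0):
    """Simplified daily RAC model: exponential average with a fixed half-life.

    Assumption: one update per day with a constant daily credit figure.
    """
    weight = 0.5 ** (1.0 / half_life_days)  # share of old RAC kept each day
    rac = start_rac
    history = []
    for _ in range(days):
        rac = rac * weight + (1.0 - weight) * daily_credit
        history.append(rac)
    return history


# Hypothetical host: steady at 100,000 credits/day, then 20% of results
# stop validating, so only 80,000 credits/day are granted.
history = simulate_rac(start_rac=100_000, daily_credit=80_000, days=14)
print(f"RAC after 7 days:  {history[6]:,.0f}")
print(f"RAC after 14 days: {history[13]:,.0f}")
```

With these made-up numbers RAC sits at 90,000 after one week and 85,000 after two, so a steady loss of validated work takes several weeks to show up in full.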

Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0

Quote:
Quote:
I've dropped about 20% off RAC over the last couple of weeks, even though the new hardware is a fair bit more powerful.

Have you noticed all the validate errors on your i7-860 box? Over time, this would have quite a negative effect on your RAC.

I would guess that something in that setup is going past its limits. I wouldn't think it was anything to do with drivers. Time to check all the usual stuff :-). Good luck with the hunt!

Hmm, didn't notice that...

Overheating? The temps say they're OK, but would that cause this issue?

Edit: temps are all in the low to mid 70s for both GPUs and the CPU. The only recent change to that machine was swapping one of the 560s for a 660ti.

Edit 2: One of the other machines is also getting lots of invalidate errors http://einsteinathome.org/host/5022192/tasks&offset=0&show_names=1&state=4&appid=0

Temps on that one are waaaaay too high, in the mid to high 80s. Will have to add some fans or repaste the GPUs tonight...

Jord
Joined: 26 Jan 05
Posts: 2,952
Credit: 5,734,350
RAC: 241

Oh oops, you'd seen that already. ;-)

Anyway, validate errors are different from tasks not validating. For validate errors, see Gary's first post of this thread.

Tasks not validating can happen due to dust, heat, overclocking, bad memory, bad capacitors, bad karma. ;-)

Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0

Yes, it seems that whole machine's output and a lot of the WUs from the other one were throwing out those errors.

I repasted both GPUs on the worst machine and added another 12cm fan in a suitable spot to try for more airflow, and the temps have come down to more acceptable levels.

Will let it run a bit and see how it goes. If it keeps erroring the work units I'll try to find a case with better airflow (or have some waterblocks machined up lol).

Thanks for the replies and for pointing me to the right place to look for solutions.

juan BFP
Joined: 18 Nov 11
Posts: 839
Credit: 421,443,712
RAC: 0

If you don't already, think about using a program like EVGA Precision or MSI Afterburner to increase the fan speed at lower temperatures. That will help a lot to keep your GPUs cooler, even more than any external fan (which is always needed on a crunching host).


Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0

Well, it seems it might be a driver issue with the 660ti.

Temps are fine but WUs all seem to error.

Found this thread about it on SETI but don't know if E@H has a similar issue.
http://setiathome.berkeley.edu/forum_thread.php?id=69735

Anyone else with the same card have issues?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1,973
Credit: 398,030,397
RAC: 1,363,911

Quote:

Well, it seems it might be a driver issue with the 660ti.

Temps are fine but WUs all seem to error.

Found this thread about it on SETI but don't know if E@H has a similar issue.
http://setiathome.berkeley.edu/forum_thread.php?id=69735

Anyone else with the same card have issues?


No, that particular driver bug doesn't affect Einstein. The GTX 670 on which I researched that workaround for the SETI application runs just fine here (host 5744895).

Darren
Joined: 13 Nov 09
Posts: 38
Credit: 171,972,935
RAC: 0

Hmm lol square one...

I can't see how it's buggered RAM or a cap, since everything else works 100% fine... the temps are fine, games all work no worries, and it hasn't crashed for weeks...

Might have to take that machine off E@H until I can get time to swap cards around a bit and see if it's the 660 or something else. The bulk of the errors seem to have appeared after changing that GPU.

Edit: I did find a guy here using the 660ti without problems...
http://einsteinathome.org/host/5807458/tasks&offset=0&show_names=1&state=4&appid=0
