Generic Multiple GPU discussion

Keith Myers
Joined: 11 Feb 11
Posts: 5002
Credit: 18875345619
RAC: 6214670

Unless you modify the power limits, PBO stresses the system too far for the overclock it provides, resulting in instability.

 

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

Peter,

After looking at the results of your RX 6800 XT, and given that my RX 5700s are averaging a million RAC per card, I think a case could be made that your 2-GPU mix should make it up to 3 million RAC.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

archae86
Joined: 6 Dec 05
Posts: 3160
Credit: 7255705646
RAC: 1462373

Tom M wrote:
After looking at the results of your RX 6800 XT, and given that my RX 5700s are averaging a million RAC per card, I think a case could be made that your 2-GPU mix should make it up to 3 million RAC.

My box with one 5700 plus one 6800 (neither of these is an XT) should be quite a bit over 3 million on the high-pay GRP work recently being issued.

I've had trouble that appears to take the form of the AMD driver suddenly deciding to downclock the 6800 quite severely.  Nominal is up near 2200 MHz.  I was purposely running it near 1950 MHz as a single card (for power conservation and temperature reduction with little computation output lost) and at about 1400 MHz in the two-card configuration (for yet more power reduction and temperature margin).  Yet the temperature would suddenly drop by about 10 °C and stay there for hours.  When I checked the clock speed it would be remarkably low, something near 500 MHz.

Just this morning I noticed there was a new WHQL driver update from AMD, version 21.4.1 of April 20.  I'm interested to see whether any of my driver/clock rate/fan control oddities seem to have improved in this release.  Initial indications seem possibly promising.

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

archae86 wrote:

I've had trouble that appears to take the form of the AMD driver suddenly deciding to downclock the 6800 quite severely. 

I am currently running the latest release of the Enterprise driver on my top-performing box.  Since the "content" that driver is supposed to support (and is certified on) includes quite a bit of number crunching, you might give it a try if the latest gaming/regular driver doesn't pan out.

Tom M

archae86
Joined: 6 Dec 05
Posts: 3160
Credit: 7255705646
RAC: 1462373

archae86 wrote:

My box with one 5700 plus one 6800 (neither of these is an XT) should be quite a bit over 3 million on the high-pay GRP work recently being issued.

{snip}

Just this morning I noticed there was a new WHQL driver update from AMD, version 21.4.1 of April 20. 

With one day of running this new driver release, I can report that my undesired major stable downclock went away entirely.  So far the system seems stable.  With the current "high-pay" GRP tasks, the apparent system production is right about 3,400,000/day.  I think it very likely the driver is the difference.  Of that 3.4M, the 5700 (an XFX in mining mode with a -40% power limit imposed) is contributing somewhat over 1.4M, while the 6800, held back by a 1400 MHz clock limit, is doing a bit under 2M.
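The split quoted above can be checked with a little arithmetic. This is only a sketch using the rounded figures from the post ("right about 3.4M", "somewhat over 1.4M"); the variable names and the `share` helper are made up for illustration, and the real per-card numbers will differ a little.

```python
# Approximate daily credit figures quoted in the post.
# These are rounded readings, not measurements.
total_per_day = 3_400_000    # reported production of the two-card box
rx5700_per_day = 1_400_000   # "somewhat over 1.4M" for the RX 5700

# Whatever the 5700 does not account for is attributed to the 6800.
rx6800_per_day = total_per_day - rx5700_per_day


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(rx6800_per_day)                        # 2000000
print(share(rx5700_per_day, total_per_day))  # 41.2
print(share(rx6800_per_day, total_per_day))  # 58.8
```

Since the 5700 is actually a little over 1.4M, the 6800 comes out a little under 2M, which matches the post.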

My previous plan was to run the two-card high-capability, high-power configuration just for the current cold snap, then pull out the 5700 for the summer.  My new plan is to have a try at using the capability the undesired major downclock showed me by intentionally configuring a two-card low-power configuration for summer.  At the most extreme I could probably run 1X (instead of the current 3X), with the 6800 set to about 500 MHz GPU clock and the 5700 set to -50% power limitation.  Just to see the full range of possibility, I currently intend to do a trial run at those settings this weekend, which is forecast to be hot here.  Then perhaps I can fill in some intermediate possibilities and give myself a configuration menu for seasonal adjustment.  Not opening up the case to add and subtract a card seems good.
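One way to keep such a "configuration menu" written down is a small table of named profiles. The sketch below only records the two endpoints mentioned in the post (the current 3X setup and the proposed extreme 1X setup); the `Profile` class, its field names, and the profile labels are hypothetical, and nothing here actually applies the settings to the driver or to BOINC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """One entry in a seasonal settings menu (hypothetical structure)."""
    tasks_per_gpu: int           # the "1X" / "3X" multiplicity from the post
    rx6800_clock_mhz: int        # GPU clock cap on the 6800
    rx5700_power_limit_pct: int  # power limit offset on the 5700


# Only the two endpoints given in the post; intermediate entries
# would be filled in after trial runs.
PROFILES = {
    "cold_snap": Profile(tasks_per_gpu=3, rx6800_clock_mhz=1400, rx5700_power_limit_pct=-40),
    "hot_weekend": Profile(tasks_per_gpu=1, rx6800_clock_mhz=500, rx5700_power_limit_pct=-50),
}


def pick_profile(label: str) -> Profile:
    """Look up a profile by label; raises KeyError for unknown labels."""
    return PROFILES[label]


print(pick_profile("hot_weekend"))
```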

I got a pause in task fetching overnight, as the daily quota of 1024 tasks for this configuration was not quite enough.  So I've raised my cc_config overstatement of the real number of CPUs from 16 to 24.
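For anyone who has not done this before, the CPU-count override lives in the `<ncpus>` option of the BOINC client's `cc_config.xml`. Below is a minimal sketch that builds such a file as text; the `build_cc_config` helper is hypothetical, the value 24 is the one from the post, and a real `cc_config.xml` may well hold other options that should be kept.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus: int) -> str:
    """Return cc_config.xml text asking the BOINC client to act as if it had `ncpus` CPUs."""
    return (
        "<cc_config>\n"
        "  <options>\n"
        f"    <ncpus>{ncpus}</ncpus>\n"
        "  </options>\n"
        "</cc_config>\n"
    )


xml_text = build_cc_config(24)
ET.fromstring(xml_text)  # raises ParseError if the text is not well-formed XML
print(xml_text)
```

The output goes into `cc_config.xml` in the BOINC data directory, and the client has to re-read its config files (or be restarted) before the new count takes effect.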

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4029
Credit: 47832061401
RAC: 39283792

Tom, I noticed you're back down to 5 GPUs on that box. Did you have issues again with the 7-GPU setup?

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

Ian&Steve C. wrote:

Tom, I noticed you're back down to 5 GPUs on that box. Did you have issues again with the 7-GPU setup?

Yes.  Windows 10 went wonky.  I just finished a Windows "reset" (keeping personal files) with one RX 5700 installed, so as part of the reset it installed a Feb 24th, 2021 driver.  Because the personal files were kept, I didn't have to re-download the BOINC Manager installer.

Made a copy of the BOINC folder after resetting all the projects.  Then reset Windows.  Then re-installed/copied the folder back.

After turning off PBO and CPU boost (with Above 4G decoding enabled), I have gotten it to boot without crashing.

It's currently practicing on a single thread per GPU (7 GPUs).

"Headlines at 11" :)

It's been 15+ minutes and nothing has "crashed" yet.

Tom M

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

Tom M wrote:

It's been 15+ minutes and nothing has "crashed" yet.

 

Gee whiz!  It's still running...

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

Tom M wrote:

Tom M wrote:

It's been 15+ minutes and nothing has "crashed" yet.

 

Gee whiz!  It's still running...

It looks like it is stable.  I will wait until at least tomorrow to start increasing the number of tasks per gpu.  It "should" be stable and more productive up to 3 per gpu.

Tom M
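Running several tasks per GPU is commonly done in BOINC by giving each task a fractional `<gpu_usage>` in the project's `app_config.xml`. The sketch below builds such a file for N tasks per GPU. The `build_app_config` helper is hypothetical, the app name `hsgamma_FGRPB1G` is an assumption about what the GRP GPU application is called on this project, and one full CPU thread per task is likewise an assumption rather than something stated in the thread.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text that lets `tasks_per_gpu` tasks share one GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    # Truncate (never round up) so that N * gpu_usage stays <= 1.0;
    # rounding up could leave room for only N - 1 tasks.
    gpu_usage = (1000 // tasks_per_gpu) / 1000
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        "    <gpu_versions>\n"
        f"      <gpu_usage>{gpu_usage:.3f}</gpu_usage>\n"
        f"      <cpu_usage>{cpu_per_task:.1f}</cpu_usage>\n"
        "    </gpu_versions>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# App name is an assumption; check the project's application list.
xml_text = build_app_config("hsgamma_FGRPB1G", tasks_per_gpu=3)
ET.fromstring(xml_text)  # raises ParseError if the text is not well-formed XML
print(xml_text)
```

With `tasks_per_gpu=3` this writes a `gpu_usage` of 0.333, so three tasks fit on each card and a fourth does not.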

Tom M
Joined: 2 Feb 06
Posts: 6534
Credit: 9632418151
RAC: 2857710

Tom M wrote:

Tom M wrote:

Tom M wrote:

It's been 15+ minutes and nothing has "crashed" yet.

Gee whiz!  It's still running...

It looks like it is stable.  I will wait until at least tomorrow to start increasing the number of tasks per gpu.  It "should" be stable and more productive up to 3 per gpu.

Locked up again.  Then it crashed with the RX 5600 XTs unplugged, and crashed again during a driver upgrade.  I got the upgrade done running 1 GPU, and have added 2 back in to see if it will run for a while.

Why 3 total?  Because another possible target is simply a 3 gpu system plugged into the MB.

The system just reported a driver time-out.

Time to migrate back to the Enterprise driver and see if that helps.

Tom M
