Observations on FGRBP1 1.18 for Windows

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Gary Roberts wrote:
The 970 value is pretty much in line with what Holmis reported in the Technical News thread.  I guess 970 owners in general will be highly delighted :-).

And that was an eyeball estimate based on one task. Keeping up the eyeballing, I get the feeling that subsequent tasks have taken a bit more time, so it's getting closer to archae86's calculated 53% runtime compared to 1.17.

And I do feel delighted and very happy about the new beta version!

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

RX-480 running x3 (Ubuntu 16.04 and amdgpu-pro-16.50)

Average times went from 1864s (1.17) to 1369s (1.18), over 20 tasks.

That's also about a 27% reduction in runtime, well done Christophe.
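
For anyone who wants to check the arithmetic, here's a quick sketch (Python; the numbers are the averages from this post):

    # Runtime ratio and percent reduction, 1.18 vs 1.17 (averages over 20 tasks)
    t_117 = 1864.0  # average seconds per task under app 1.17
    t_118 = 1369.0  # average seconds per task under app 1.18

    ratio = t_118 / t_117              # ~0.734
    reduction = (1.0 - ratio) * 100.0  # ~26.6%, i.e. "about 27%"
    print(f"ratio {ratio:.3f}, runtime reduction {reduction:.1f}%")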

This may tempt me to get the HD7990 out of retirement.

The new baby is off and running! 

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7223854931
RAC: 1001116

archae86 wrote:
As it happens, the single-GPU host, which has had some occasional troubles in recent weeks, but had run fine for days, ran OK for about 20 minutes with a single 1.18 task.  But very shortly after I added a second it failed.  Further it failed in such a way that subsequent 1.18 and 1.17 tasks failed very quickly.  In other words the system had somehow gotten into a lethal condition.  A full power-down reboot cleared this condition, and the system resumed apparently normal processing.  But shortly after I got brave and again allowed two 1.18 tasks, it failed again.  Now I can't do further testing as the project denies it new work until tomorrow on the grounds that the daily quota of 12 tasks is exceeded.  I understand that failures lower the task limit, but the system has only 27 error returns reported against it today, which I would not have expected to put me out of business.

So after midnight UTC I was indeed able to download fresh work to this problem host.  On examination I learned I had been overclocking it, to the tune of +150 core clock, +400 memory clock in Afterburner terms.  As I had several indications that the condition of the system had been marginal running previous work, I turned these down to +50/+250.

It has run a couple of complete 1.18 WUs at 1X.  A direct comparison of 1.17 vs 1.18 running 1X on this host shows the 1.18 running a reported 4 degrees C hotter.  If I was right at the edge before, this could easily have pushed me over.

However I had a disturbing incident while running 1.18 at 2X at these slower clocks.  As I had started two WUs together, I intended to pause one after both had got about 15% done, in order to get a better use of GPU by not having them in synch.  Promptly on suspending one WU (so another ready one would start) the screen went black for several seconds.  Afterward I got a pop up announcing a driver restart.  Also the active tasks (and the couple I had on deck but not suspended) promptly errored out.

Now that was really odd.  I think (but can't be sure) that I've lowered the clocks enough for it not to be a simple excess clock speed problem.  Possibly 1.18 is imperfectly compatible with this system in some other way.  Anyway, I intend to build some 1X time for confidence before exploring 2X behavior on this host again.

Meanwhile six other cards on three other hosts are happily running 2X 1.18 tasks at their unchanged clock settings.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

My two GTX 750 Ti's are each fed by a core of an i7-4771 running under Win7 64-bit (373.06 drivers). 

These are minimally-overclocked cards running a single WU each at 1210 MHz, and are cool at 53 C. 

1.17 -> 4350 seconds

1.18 -> 2550 seconds

So the ratio is 0.586

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

Holmis wrote:
A more optimized app will/should put more stress on the hardware, so it might be a good time to check the running conditions and maybe make some adjustments to running parameters. Even a validate error once in a while might "invalidate" an overclock, as the wasted time might be more than the gain from the overclock.

I'm sure you're right; the problem is not with the app (1.17 or 1.18) but with my card. It's very old by now and I guess it's heading for recycling soon.
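
To put Holmis's point in concrete terms, here's a back-of-the-envelope sketch (Python; all the numbers are made-up assumptions for illustration, not measurements from this thread):

    # Does an overclock still pay off once it starts causing occasional
    # validate errors? All figures below are assumed, not measured.
    base_time = 1800.0   # seconds per task at stock clocks (assumed)
    oc_speedup = 0.05    # 5% faster per task when overclocked (assumed)
    error_rate = 0.06    # fraction of overclocked tasks failing validation (assumed)

    oc_time = base_time * (1.0 - oc_speedup)
    # A task that fails validation earns nothing, so its runtime is wasted:
    stock_throughput = 1.0 / base_time            # valid tasks per second
    oc_throughput = (1.0 - error_rate) / oc_time  # valid tasks per second

    print(f"stock:       {stock_throughput * 3600:.2f} valid tasks/hour")
    print(f"overclocked: {oc_throughput * 3600:.2f} valid tasks/hour")
    # With these numbers the overclock is a net loss: a 6% error rate
    # outweighs a 5% speed gain, which is exactly the trade-off Holmis describes.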

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1886
Credit: 1410097869
RAC: 1174069

I hope this new version works faster for my cards so I can get back to work here to start my 13th year off full blast.

I tried a few on my 660Ti and 560Ti OC's and they took hours just running X2 on the quads.

That was bad enough to not even try any on my 650Ti's or the 550Ti, and they are all OC or SC.

I miss the good old days with GPU-only BRPs.

n12365
Joined: 4 Mar 16
Posts: 26
Credit: 6491436572
RAC: 0

My 1060 (6GB) is running two at a time with no CPU tasks on an i5-4690 under Windows 10 with 376.33 drivers.

1.17: 2529.29 seconds

1.18: 1610.58 seconds

Ratio: 0.637

walton748
Joined: 1 Mar 10
Posts: 94
Credit: 1504818376
RAC: 3119930

archae86 wrote:

However I had a disturbing incident while running 1.18 at 2X at these slower clocks.  As I had started two WUs together, I intended to pause one after both had got about 15% done, in order to get a better use of GPU by not having them in synch.  Promptly on suspending one WU (so another ready one would start) the screen went black for several seconds.  Afterward I got a pop up announcing a driver restart.  Also the active tasks (and the couple I had on deck but not suspended) promptly errored out.

I observed the very same on this host. It is running Windows 7 with no overclock on the card at all, all 4 processor cores pinned to 4.2 GHz turbo instead of 4.4 GHz through BIOS and Windows settings, RAM at its XMP setting, and temperatures in the mid-60s degrees C at most, mostly lower. The system sports a relatively new NVidia 1070 by MSI, NVidia driver 376.33.

The system had been up and crunching for a week or so without interruption or any other use. The errored-out WUs were standard 1.17s.

While I'm at it, one more observation: I suffered these driver resets from Windows when I shut down BOINC/Einstein a while ago, but with no effect on the Einstein tasks then (well... they weren't being worked on any longer when that happened, I'd say).

I could not reproduce the suspend error right now on my other host, but that one was restarted recently, and the driver failure/reset on BOINC/Einstein shutdown applied here, too. I just did not pay too much attention then.
One more thing: it looks like validate failures start to occur after a week's uptime or so. This qualifies as an observation for me, as I am used to my systems crunching on for weeks producing only valids, but in reality I have seen it only twice now. For what that is worth. On the other hand, if I am right, the problem builds up so slowly that it is hard to diagnose quickly.

Cheers,

Walton

walton748
Joined: 1 Mar 10
Posts: 94
Credit: 1504818376
RAC: 3119930

I could reproduce the suspend failure on both hosts now. Uptime (= crunching time) is about one hour, so my suspicion probably points in the wrong direction.

<edit>Wow, that really wrecked it - did not recover after the driver reset</edit>

<edit2>I mean, the system as such recovered, but processing Einstein FGRBP1 did not. Sorry for being unspecific.</edit2>

Cheers,

Walton

Shafa
Joined: 31 May 05
Posts: 53
Credit: 627005014
RAC: 0

For 64-bit Linux on AMD, DDR2 or DDR3:

Ratio 1.18 / 1.17 is mostly:

nVidia Fermi: 0.62-0.68 (GTX 460, 570, 580, 590)

nVidia Kepler: slightly below 0.6 (GTX 760)

nVidia Quadro (Maxwell, M1000M, laptop): 1.18 N/A, tasks always finish with errors after 17 seconds
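
And pulling together the ratios reported in this thread so far, a rough summary sketch (Python; 0.65 stands in for the mid-point of my Fermi range, and the failing Quadro is left out):

    # 1.18 / 1.17 runtime ratios reported in this thread (lower = faster)
    reported = {
        "RX 480 (AgentB)": 1369 / 1864,       # ~0.734
        "GTX 750 Ti (Jim1348)": 0.586,
        "GTX 1060 6GB (n12365)": 0.637,
        "Fermi GTX 460/570/580/590": 0.65,    # mid-point of 0.62-0.68
        "Kepler GTX 760": 0.60,               # "slightly below 0.6"
    }
    for card, ratio in reported.items():
        print(f"{card}: {ratio:.3f}")
    print(f"rough average: {sum(reported.values()) / len(reported):.3f}")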
