GPU Upgrade Shows No Improvement in Work Unit Completion

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7055004931
RAC: 1615082

RE: 2. How do I get into

Quote:
2. How do I get into the CUDA beta testing program


Beta testing is the term we tend to use in posts here on the forums, and Beta is also a substring that appears in the actual application name once you have downloaded work and it is running on your machine.

However, the actual web site preference item you need to activate is this one: "Run test applications?"
I suggested that you activate that option, giving both motivations and detailed instructions, some hours back on this thread. Perhaps you have already done so, in which case it should take effect the next time you download work, and the newly downloaded work actually runs.

One advantage to running a very much shorter work queue than five days (or 15 days!!!) is that you could see the effect of such changes much sooner.

Commonly when I change to a new card, or make other major changes, I operate at work queues of a small fraction of a day, only going back to the three or four days needed to ride through a weekend outage when I think things are pretty stable on my setup. Unlike SETI, the servers here at Einstein have a history of long periods of very steady service; though the occasional outage happens, more than a weekend is uncommon.

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

I did as you asked and

I did as you asked and checked the "Run test applications" box yesterday. Today, I looked at my "Ready to start" tasks but don't see any CUDA55 listed yet.

I also took your advice and shortened my work queue from 5 to 3 days.

In the past the two WUs running concurrently on the 970 were pretty much equal in elapsed time and in sync, finishing almost in a dead heat. Since I made the changes you, Gary and Jurgen recommended, the WUs are no longer in sync and one is lagging the other by about 50%.

I don't understand why the two elapsed times differ so much. It appears that the two run times vary by about 1800 to 2000 seconds, if I understand the results correctly. These are the two WUs for the 970 and not the piece of junk 730.

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

Next time before buying a

Next time before buying a video card I need to do my due diligence. Checking first with someone like you who has a wealth of experience would be smart.

You make a good point by suggesting I pull the 730. I don't know what options to use to set up a config file but I do know that the settings for each are not interchangeable.

When I went from 1 to 2 simultaneous tasks for the 970 I didn't see any real improvement in elapsed times. If anything the elapsed times are noticeably longer. In the past 1 WU took about 95 minutes while 2 WUs now take about 140 minutes each.

So, I'm reluctant to run more than 2 unless I'm missing something. Do you have any ideas why this is?

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7055004931
RAC: 1615082

RE: When I went from 1 to 2

Quote:

When I went from 1 to 2 simultaneous tasks for the 970 I didn't see any real improvement in elapsed times. If anything the elapsed times are noticeably longer. In the past 1 WU took about 95 minutes while 2 WUs now take about 140 minutes each.

So, I'm reluctant to run more than 2 unless I'm missing something. Do you have any ideas why this is?


The useful output per day running 2 at once is better than 1 at once so long as the average elapsed time is less than twice as long.

Using your numbers, your 95 minutes running 1 at a time means 15.16 WU/day, while your 140 minutes running 2 means 20.57 WU/day. That is a rather nice benefit, and on the order of what I'd expect.
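The arithmetic behind those figures, as a minimal Python sketch. The function names are made up for illustration and nothing here is part of BOINC; the only inputs are the 95 and 140 minute elapsed times quoted above.

```python
MINUTES_PER_DAY = 24 * 60


def tasks_per_day(elapsed_minutes: float, concurrent: int = 1) -> float:
    """Work units finished per day when `concurrent` tasks each take `elapsed_minutes`."""
    return concurrent * MINUTES_PER_DAY / elapsed_minutes


def running_two_helps(single_minutes: float, paired_minutes: float) -> bool:
    """Two at once wins as long as each paired task takes less than twice the single time."""
    return paired_minutes < 2 * single_minutes


one_at_a_time = tasks_per_day(95, concurrent=1)    # about 15.16 WU/day
two_at_a_time = tasks_per_day(140, concurrent=2)   # about 20.57 WU/day
gain = two_at_a_time / one_at_a_time - 1           # about 0.36, i.e. roughly a 36% gain

print(f"1 at a time: {one_at_a_time:.2f} WU/day")
print(f"2 at a time: {two_at_a_time:.2f} WU/day")
print(f"gain: {gain:.0%}")
```

With these numbers the break-even point for 2 at a time would be 190 minutes per task, so 140 minutes is comfortably on the winning side.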

While I fall back on "always try it, don't count on a rule of thumb," I nevertheless think that going from 1 at a time to 2 at a time nearly always helps, sometimes by only a little, but more often by quite a bit (as in your reported case). But above 2 at a time the gains tend to thin out very, very rapidly. So for people not oriented to careful experimentation, measurement, and revision, "just put it at 2 and leave it there" will usually give them nearly all the gusto available, with very little grief.

The main benefit you might get from removing the 730 would be that BOINC would more accurately estimate the time required to do work, which would make your requested queue size much more accurately honored. However, I think you would lose total output, not gain. Your choice, but in your shoes I'd leave the 730 in there for now.

I do this in real life, as one of my boxes currently runs one 970 plus one 750 Ti--not so disparate as yours, but different enough. Both run 2 tasks at a time, so I don't need a fancy config file to control them.

As you have reduced your requested queue size, it will be a little while yet before your rig fetches work again. Only then can the test application setting result in CUDA55 work coming to you, and a few days more after that before you'll see the benefit in returned work. I have little doubt you will see considerable benefit.

cliff
Joined: 15 Feb 12
Posts: 176
Credit: 283452444
RAC: 0

Hi Phil, RE: When I

Hi Phil,

Quote:


When I went from 1 to 2 simultaneous tasks for the 970 I didn't see any real improvement in elapsed times. If anything the elapsed times are noticeably longer. In the past 1 WU took about 95 minutes while 2 WUs now take about 140 minutes each.

So, I'm reluctant to run more than 2 unless I'm missing something. Do you have any ideas why this is?


It 'might' be the P02 state memory problem. NVidia in their wisdom decided that any compute task would automatically put the card into P02 state and limit memory speed to 3305MHz.
Since most cards are capable of a memory speed of 3505MHz that can affect throughput, or at any rate it does at my end.

With that in mind, this problem can be addressed by using Nvidia Inspector to correct the P02 memory speed. There's a thread about this along the lines of low memory clock etc etc.

NVInspector is available from 3DGURU.Com download pages...

I run 'all' my GPUs with the P02 state memory speed reset to 3505MHz via NVI, without any problems. That's across 3 computers and 6 GPUs :-)

Including at one stage a GTX970 in rig 3. [Now replaced by a palit GTX980 clocked to 3600MHz p02 state] Palit GTX's 'seem' to allow higher memory speeds than other brands, but not by much.

As usual YMMV:-)
Regards

Cliff,

Been there, Done that, Still no damm T Shirt.

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

Hey Cliff: That's good

Hey Cliff:

That's good stuff you're sending me. I did as you said and downloaded NVIDIA Inspector 1.9.7.6. I brought up the settings for the 970, selected overclocking and P2 then changed the memory clock to 3505 MHz.

I clicked apply but it defaults back to 3004 MHz. What's up with that?

Regards,
Phil

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

Duh! Sorry Archae for being

Duh! Sorry Archae for being such a dunce and not looking closer at my own math. Of course 20 WUs per day is a nice 33% increase.

Furthermore, excuse my mistake about the 2nd card being a 730. Actually it's a Dell NVIDIA card that came with my Dimension 8900.

Regards,
Phil

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7055004931
RAC: 1615082

RE: That's good stuff

Quote:

That's good stuff you're sending me. I did as you said and downloaded NVIDIA Inspector 1.9.7.6. I brought up the settings for the 970, selected overclocking and P2 then changed the memory clock to 3505 MHz.

I clicked apply but it defaults back to 3004 MHz. What's up with that?


There is a longish thread on this exact topic here on Einstein at: 970 Memory overclocking thread

Getting this working once is a bit fiddly. Getting set up so that it automatically gets going the way you want after each reboot is yet more fiddly.

Another point of fiddling is that absolutely no one can tell you which exact settings will be the limit of how fast your particular card in your particular box running current applications can go, so if you want to press hard on this route you are in for an extended period of trial, failure, adjustment, try again.

The good news is that it is very likely that at the end you can make your 970 appreciably more productive by applying this measure (and the improvement will be multiplicative with your previous improvement by going from 1X to 2X, and your pending improvement from enabling use of the test/beta application), but it will be way, way more work to get it going reliably and automatically than for those two click once and done adjustments.
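To illustrate what "multiplicative" means here, a small Python sketch. The 2X factor comes from the 95 and 140 minute figures quoted earlier in this thread; the other two factors are made-up placeholders, not measurements or predictions for any particular card.

```python
import math


def combined_speedup(factors):
    """Independent throughput gains multiply rather than add."""
    return math.prod(factors)


# Measured in this thread: 2 tasks at 140 min each versus 1 task at 95 min.
factor_2x = (2 * 95) / 140        # about 1.36

# Hypothetical placeholder values, for illustration only.
factor_test_app = 1.20            # assumed gain from the test/beta application
factor_memory_clock = 1.10        # assumed gain from the P2 memory clock fix

total = combined_speedup([factor_2x, factor_test_app, factor_memory_clock])
print(f"combined throughput factor: {total:.2f}x")
```

Swap in your own measured factors once you have them; the point is only that each step scales whatever the previous steps already gained.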

To get back to your specific situation, as I recall one needs to apply the NVidia Inspector enabling adjustment after a reboot before BOINC has run even a little--so even to try this you have to figure out how to stop BOINC from automatically starting up after reboot, then reboot and begin the matter.

My memory may be wrong--so better to sit back with considerable beverage and snack supply and read through the thread.

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

Cliff: This was in the

Cliff:

This was in the "Low memory" post:

Thanks to skgiven from GPU-Grid I can now overclock my memory! Here's how:

- the GPU must not be crunching BOINC (either pause your GPU project, or all GPUs, or suspend BOINC completely)
- in the nVidia Inspector OC tab set the overclock for P0 (because you can't go any higher than this in P2)
- now you can set up to this memory clock for P2 as well
- apply & have fun

I don't understand what he's saying in the second point about setting the overclock for P0. Do I change the memory clock for P0 and then change P2 by the same amount? P0 is already set to 3505 MHz.

Help,
Phil

Florida Rancher
Joined: 4 Oct 13
Posts: 31
Credit: 23998436
RAC: 0

I've read down to the part

I've read down to the part where ExtraTerrestrial states:

Quote:

Overclocking the memory:

Thanks to skgiven from GPU-Grid I can now overclock my memory! Here's how:

- the GPU must not be crunching BOINC (either pause your GPU project, or all GPUs, or suspend BOINC completely)
- in the nVidia Inspector OC tab set the overclock for P0 (because you can't go any higher than this in P2)
- now you can set up to this memory clock for P2 as well
- apply & have fun

GTX970/980 are reported to reach about 4 GHz memory clock. It should be interesting to see if this can boost Einstein performance another 10%.

I don't understand how to do the second point he makes about P0.

Phil
