Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6"

|MatMan|
Joined: 22 Jan 05
Posts: 24
Credit: 249005261
RAC: 0

RE: Planning is still like

Quote:

Planning is still like described here: http://einsteinathome.org/node/197990&nowrap=true#138717

In a nutshell, once we have this app version stable we are planning to offer both CUDA 3.2 and 5.5 app versions for a transition period, and then we will see a) what we gain by including CUDA 5.5 support but also b) how many hosts we would lose by dropping CUDA 3.2 support and requiring CUDA 5.5+ in the future. We hope to be able to drop CUDA 3.2 support and switch to 5.5. We'll see.


Thanks for the info! I wasn't aware of that thread. Why are you planning a CUDA 5.5 app? The CUDA 6.0 toolkit deprecates targeting G80 (sm_10) but still supports that architecture. CUDA 6.5 removed sm_10 support but still supports the sm_11, sm_12, and sm_13 architectures (although deprecated).
I can't imagine there are still a lot of G8x GPUs out there but who knows...
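For anyone wondering whether their own card would be affected, here's a minimal sketch (plain CUDA runtime host code, not anything taken from the Einstein@Home sources) that lists the compute capability each local GPU reports; the sm_1x parts are the ones the newer toolkits drop. Build with something like "nvcc list_sm.cu -o list_sm".

Code:

// list_sm.cu - hedged sketch: print each GPU's compute capability (sm_XY).
// Devices reporting 1.x are the G8x/G9x/GT200 parts dropped by newer toolkits.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA-capable device found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("GPU %d: %s, compute capability %d.%d (sm_%d%d)\n",
                    i, prop.name, prop.major, prop.minor, prop.major, prop.minor);
    }
    return 0;
}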

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 734426288
RAC: 1299374

RE: Why are you planning a

Quote:
Why are you planning a CUDA 5.5 app? The CUDA 6.0 toolkit deprecates targeting G80 (sm_10) but still supports that architecture. CUDA 6.5 removed sm_10 support but still supports the sm_11, sm_12, and sm_13 architectures (although deprecated).
I can't imagine there are still a lot of G8x GPUs out there but who knows...

There are three things to consider: a) what performance gain we get by moving to a higher CUDA version, b) what performance loss we suffer by dropping those hosts whose hardware or installed drivers do not support that version, and c) whether there are any new bugs in the CUDA runtime or toolchain for that particular version that interfere with our code.

At the moment CUDA 5.5 looks like a good compromise. We will do some performance testing later, working down from CUDA 7, but if we don't see a substantial benefit (performance-wise), we will choose a level of compatibility that includes as many hosts as possible.
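For reference, whether a host can run a given app build depends both on the CUDA runtime the app is linked against and on the CUDA level the installed driver supports. A rough illustration of the kind of check involved, using the standard runtime API (this is just a sketch, not our actual code):

Code:

// cuda_versions.cu - illustrative sketch: compare the CUDA level supported by
// the installed driver with the runtime version this binary was built against.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVer = 0, runtimeVer = 0;
    cudaDriverGetVersion(&driverVer);    // 0 if no CUDA driver is installed
    cudaRuntimeGetVersion(&runtimeVer);  // e.g. 5050 for a CUDA 5.5 runtime
    std::printf("Driver supports CUDA %d.%d, app runtime is CUDA %d.%d\n",
                driverVer / 1000, (driverVer % 1000) / 10,
                runtimeVer / 1000, (runtimeVer % 1000) / 10);
    if (driverVer < runtimeVer) {
        std::printf("Driver too old for this runtime - this host would be lost.\n");
        return 1;
    }
    return 0;
}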

HB

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7230154841
RAC: 1157694

I've seen some surprising

I've seen some surprising variability of 1.52 completion times recently on my GTX 970 host.

Initially, after 1.52 arrived, this host, which runs tasks 3X (three concurrently on the GPU), had elapsed times and CPU times that were extremely tightly distributed, with ET about 2:48:00 and CPU about 0:17:00.

But I've now had two units with elapsed times of 3:37:15 and 3:18:02, which would be many sigma above the previous tight distribution. Unlike long-running units before 1.52, these had only moderately elevated CPU times, at 0:23:30 and 0:20:58, though still far outside the primary distribution.

It appears that "normal" units running alongside these got a bit of a boost, presumably as the slowed units swapped out faster, so the beneficiary units had elapsed times as low as 2:30:09. This partially compensating side effect means the real productivity loss from this couple of slow units is not as great as it might initially seem, and in any case they are not at all exceptional compared to the outliers readily seen in 1.47/1.50 work.

Of course it is also possible these timings are artifacts of something not quite right on my host, as the same host was generating about one invalid a day, has recently generated two compute errors, and has an overclock which I backed down slightly just yesterday in hopes of averting the invalids.

One not only needs a decent sample size, but may also need work issued over a substantial interval, as I think I've seen somewhat systematic variation in the rate and severity of "chewy" work units of the kind seen in 1.47/1.50.

I'm still a huge fan of 1.52; I just doubt things will look quite as wonderful in the long term as they did to some of us at first glance.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

RE: I'm still a huge fan of

Quote:
I'm still a huge fan of 1.52; I just doubt things will look quite as wonderful in the long term as they did to some of us at first glance.

There's Wonderful, then there's average wonderful!
Either way, the beta app(s) is (are) far better than what came before :-) My hat's off to HB et al. for the time spent on this.

Aside from the obvious speedup of the GPU app, I'm amazed by the 'levelling' that's taken place between my systems.
Gone are the advantages of PCIe 3 over PCIe 2, and gone are the advantages of higher-end multicore CPUs over the cheap 'everyday' mainstream dual-core CPUs (unless you run multi-card setups). There's no longer an apparent advantage to having fast RAM over standard 1333 MHz RAM.
My AMD 7970s/280(X)s now perform equally well regardless of backbone, whereas before there was a definite gulf in performance between GPU variants/system backbones...

Anybody else seen this or is it just me?!

Gavin.

Phil
Joined: 8 Jun 14
Posts: 617
Credit: 228905397
RAC: 34278

RE: Anybody else seen this

Quote:

Anybody else seen this or is it just me?!

Gavin.

My 750s running on older E5400 CPUs have taken off like rockets with the new software. I have not posted anything because mining data just grinds on me, but they have gone from a RAC in the low 20K's to the low 30K's and don't appear to be done climbing yet.

I'm getting ready to leave for the evening but when I get back later I'll try to post some system specs and such.

Phil

I thought I was wrong once, but I was mistaken.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 579243531
RAC: 203686

RE: Anybody else seen this

Quote:
Anybody else seen this or is it just me?!


I don't have any numbers myself, but this is a very logical consequence of what was changed in the app. Nice to see it manifest this way in the real world (in the results thread) :)

MrS

Scanning for our furry friends since Jan 2002

Daniels_Parents
Joined: 9 Feb 05
Posts: 101
Credit: 1877689213
RAC: 0

RE: There's Wonderful, then

Quote:
There's Wonderful, then there's average wonderful!

Yeah! Smooth and powerful - all hosts' GPUs at 97%!
But now I have to order big fans immediately to banish the heat :-)

I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

RE: Aside from the obvious

Quote:

Aside from the obvious speedup of the GPU app, I'm amazed by the 'levelling' that's taken place between my systems.
Gone are the advantages of PCIe 3 over PCIe 2, and gone are the advantages of higher-end multicore CPUs over the cheap 'everyday' mainstream dual-core CPUs (unless you run multi-card setups). There's no longer an apparent advantage to having fast RAM over standard 1333 MHz RAM.

Anybody else seen this or is it just me?!


I run two GTX 750 Ti's on a Haswell motherboard (PCIe 3.0 x8 for each card, Win7 64-bit). They are very consistent, though after only 18 hours the statistics are hardly complete. When both cards are running 1.52, the times are between 2 hours 21 minutes and 2 hours 24 minutes; very nice. When only one of the cards is running 1.52 and the other is running a different project (POEM or GPUGrid), the time drops slightly to just over 2 hours 20 minutes. So there is a slight bus-loading effect, but if that is the worst of it, I am all for it.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

RE: Anybody else seen

Quote:

Anybody else seen this or is it just me?!

Before the beta series, my two-GTX 460 setup (one card at x16, the other at x4) would complete tasks at a ratio of 70:30, with a RAC typically around 48K/day working only BRP5. CPU levels would typically be above 50%, PCIe bandwidth being the limiting factor.

With 1.52 the ratio is almost 50:50 and all tasks are faster; I am now expecting a RAC maybe near 96K, and most surprisingly the CPU usage is usually below 20% on all 4 cores. I would say the beta CPU times may be as low as 10% of the original!

Finally the GPU fans are above 50%.

Beta is betta, much betta.

This is a great example of GPU computing, so developers take a bow, you have released the Krakens.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 579243531
RAC: 203686

RE: But now I have to order

Quote:
But now I have to order big fans immediately to banish the heat :-)


I know you're not complaining, but for everyone concerned about the increased power consumption and temperatures of their GPUs:

Einstein is still working the GPU less hard than e.g. GPU-Grid. If you still want to decrease the stress on the GPU a bit, you can do so easily on nVidia GPUs since Kepler (600 series) by reducing the power target. If you lower it far enough (Einstein is probably running at significantly less than 100% of your stock limit), the GPU will automatically downclock slightly and - most importantly - will lower the voltage accordingly. This makes the GPU run more efficiently and reduces power draw and heat output (they're the same thing). You also stay within the specs guaranteed by your card manufacturer.

Personally I like my 28 nm GPUs (all desktop nVidias starting from the 600 series) to run at around 1.10 V. This is just a few tens of MHz slower than full throttle at 1.17 - 1.20 V, but saves around 20% power (depending on the actual setting, of course).

The usual OC utilities can set this. Obviously you could also increase the fan speed, but that creates noise and somewhat increases fan wear and fan power draw (realistically negligible unless it gets really loud).
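If you'd rather script this than click through an OC utility, NVML (the library behind nvidia-smi) exposes the board power limit directly. A hedged sketch, assuming a single GPU; the 150 W target is only a placeholder, setting the limit needs administrator/root rights, and not every board permits it:

Code:

// power_target.cu - sketch only: read the current board power limit via NVML
// and lower it to a placeholder target. Link against NVML (e.g. -lnvidia-ml).
#include <cstdio>
#include <nvml.h>

int main() {
    if (nvmlInit() != NVML_SUCCESS) return 1;

    nvmlDevice_t dev;
    nvmlDeviceGetHandleByIndex(0, &dev);            // first GPU only, for brevity

    unsigned int minMw = 0, maxMw = 0, curMw = 0;   // NVML reports milliwatts
    nvmlDeviceGetPowerManagementLimitConstraints(dev, &minMw, &maxMw);
    nvmlDeviceGetPowerManagementLimit(dev, &curMw);
    std::printf("Power limit: %u W (board allows %u-%u W)\n",
                curMw / 1000, minMw / 1000, maxMw / 1000);

    unsigned int targetMw = 150000;                 // placeholder: 150 W
    if (targetMw < minMw) targetMw = minMw;         // stay inside the board's range
    if (targetMw > maxMw) targetMw = maxMw;
    nvmlReturn_t rc = nvmlDeviceSetPowerManagementLimit(dev, targetMw);
    std::printf("Set to %u W: %s\n", targetMw / 1000, nvmlErrorString(rc));

    nvmlShutdown();
    return 0;
}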

For AMD GPUs it's not so easy, but there are some indirect settings in the Catalyst Control Center where you should be able to ask the card to use less power. From what I've seen in reviews, they hardly scale voltage and throttle mostly via clock speed, which gives a far worse (roughly 1:1) relationship between power savings and performance loss. Here I'd rather tweak clock and voltage myself.

MrS

Scanning for our furry friends since Jan 2002
