Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6"

|MatMan|
Joined: 22 Jan 05
Posts: 24
Credit: 249005261
RAC: 0

RE: Planning is still like

Quote:

Planning is still like described here: http://einsteinathome.org/node/197990&nowrap=true#138717

In a nutshell, once we have this app version stable we are planning to offer both CUDA 3.2 and 5.5 app versions for a transition period, and then we will see a) what we gain by including CUDA 5.5 support but also b) how many hosts we would lose by dropping CUDA 3.2 support and requiring CUDA 5.5+ in the future. We hope to be able to drop CUDA 3.2 support and switch to 5.5. We'll see.


Thanks for the info! I wasn't aware of that thread. Why are you planning a CUDA 5.5 app? The CUDA 6.0 toolkit deprecates targeting G80 (sm_10) but still supports this architecture. CUDA 6.5 removed sm_10 support but still supports the sm_11, sm_12, and sm_13 architectures (although deprecated).
I can't imagine there are still a lot of G8x GPUs out there, but who knows...
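
For reference, here's a minimal sketch (not the E@H app, just an illustration) of how the compute capability a host's GPU reports can be checked at runtime with the standard CUDA runtime API. The sm_13 cutoff is only an assumption for the example, not the project's actual requirement.

/* Sketch only: list each GPU's compute capability and compare it
 * against a hypothetical minimum of sm_13. Build with nvcc. */
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA-capable device found\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        /* prop.major/prop.minor is the compute capability, e.g. 1.0 for G80 */
        int ok = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);
        printf("GPU %d: %s, sm_%d%d -> %s the assumed sm_13 minimum\n",
               i, prop.name, prop.major, prop.minor,
               ok ? "meets" : "is below");
    }
    return 0;
}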

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 729919596
RAC: 1187775

RE: Why are you planning a

Quote:
Why are you planning a CUDA 5.5 app? The CUDA 6.0 toolkit deprecates targeting G80 (sm_10) but still supports this architecture. CUDA 6.5 removed sm_10 support but still supports the sm_11, sm_12, and sm_13 architectures (although deprecated).
I can't imagine there are still a lot of G8x GPUs out there, but who knows...

There are three things to consider: a) what is the gain in performance we get by moving to a higher CUDA version, b) what is the loss in performance we suffer by losing those hosts whose hardware or installed drivers do not support that version, and c) are there any new bugs in the CUDA runtime or toolchain for that particular version that interfere with our code.

At the moment CUDA 5.5 looks like a good compromise. We will do some performance testing later, from CUDA 7 downwards, but if we don't see a substantial benefit (performance-wise), we will choose a level of compatibility that includes as many hosts as possible.
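
(Just to make the trade-off concrete with made-up numbers: if a newer CUDA build were, say, 10% faster per task but 5% of the project's GPU hosts couldn't run it, the net effect on total throughput would be roughly 1.10 × 0.95 ≈ 1.05, only about a 5% gain - which is why the host count matters as much as the raw speedup.)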

HB

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7226228262
RAC: 1071026

I've seen some surprising

I've seen some surprising variability of 1.52 completion times recently on my GTX 970 host.

Initially, after 1.52 arrived, this host, which is running 3X, had extremely tightly distributed elapsed times and CPU times, with ET running about 2:48:00 and CPU about 0:17:00.

But I've now had two units with elapsed times of 3:37:15 and 3:18:02, which would be many sigma above the previous tight distribution. Unlike long-running units before 1.52, these had CPU times that were only moderately elevated, at 0:23:30 and 0:20:58, though still far outside the primary distribution.

It appears that "normal" units paired with these got a bit of a boost, presumably as the slowed units swapped out faster, so the beneficiary units had ET as low as 2:30:09. This partially compensating side effect means the real productivity loss from these couple of slow units is not as great as it might initially seem, and in any case they are not at all exceptional compared to the outliers readily seen in 1.47/1.50 work.

Of course it is also possible these timings are artifacts of something not quite right on my host, as the same host was generating about one invalid a day, has recently generated two compute errors, and has an overclock which just yesterday I backed down slightly in hopes of averting the invalids.

One not only needs a decent sample size, but may also need work issued over a material interval, as I think I've seen somewhat systematic variation in the rate and severity of "chewy" work units in the form seen in 1.47/1.50.

I'm still a huge fan of 1.52; I just doubt things will look quite as wonderful in the long term as they did to some of us at first glance.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

RE: I'm still a huge fan of

Quote:
I'm still a huge fan of 1.52; I just doubt things will look quite as wonderful in the long term as they did to some of us at first glance.

There's Wonderful, then there's average wonderful!
Either way, the beta app(s) is (are) far better than what came before :-) My hat's off to HB et al. for the time spent on this.

Aside from the obvious speedup of the GPU app, I'm amazed by the 'levelling' that's taken place between my systems.
Gone are the advantages of PCIe 3 over PCIe 2, and gone are the advantages of higher-end multicore CPUs over cheap 'everyday' mainstream dual-core CPUs (unless you run multi-card setups). There's no longer an apparent advantage to fast RAM over standard 1333 MHz RAM.
My AMD 7970s/280(X)s now perform equally well regardless of the backbone, whereas before there was a definite gulf in performance between GPU variants and system backbones...

Anybody else seen this or is it just me?!

Gavin.

Phil
Joined: 8 Jun 14
Posts: 583
Credit: 228589942
RAC: 8833

RE: Anybody else seen this

Quote:

Anybody else seen this or is it just me?!

Gavin.

My 750s running on older E5400 CPUs have taken off like rockets with the new software. I have not posted anything because mining the data just grinds on me, but they have gone from a RAC in the low 20Ks to the low 30Ks, and they don't appear to be done climbing yet.

I'm getting ready to leave for the evening but when I get back later I'll try to post some system specs and such.

Phil

I thought I was wrong once, but I was mistaken.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578650204
RAC: 198551

RE: Anybody else seen this

Quote:
Anybody else seen this or is it just me?!


I don't have any numbers myself, but this is a very logical consequence of what was changed in the app. Nice to see it manifest this way in the real world (in the results thread) :)

MrS

Scanning for our furry friends since Jan 2002

Daniels_Parents
Joined: 9 Feb 05
Posts: 101
Credit: 1877689213
RAC: 0

RE: There's Wonderful, then

Quote:
There's Wonderful, then there's average wonderful!

Yeah! Smooth and powerful - all hosts' GPUs at 97%!
But now I have to order big fans immediately to get rid of the heat :-)

I know I am a part of a story that starts long before I can remember and continues long beyond when anyone will remember me [Danny Hillis, Long Now]

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

RE: Aside from the obvious

Quote:

Aside from the obvious speedup of the GPU app, I'm amazed by the 'levelling' that's taken place between my systems.
Gone are the advantages of PCIe 3 over PCIe 2, and gone are the advantages of higher-end multicore CPUs over cheap 'everyday' mainstream dual-core CPUs (unless you run multi-card setups). There's no longer an apparent advantage to fast RAM over standard 1333 MHz RAM.

Anybody else seen this or is it just me?!


I run two GTX 750 Ti's on a Haswell motherboard (PCIe 3.0 x8 for each card, Win7 64-bit). They are very consistent, though after only 18 hours the statistics are not exactly complete. When both cards are running 1.52, the times are between 2 hours 21 minutes and 2 hours 24 minutes; very nice. When only one of the cards is running 1.52 and the other card is running a different project (POEM or GPUGrid), the time drops slightly to just over 2 hours 20 minutes. So there is a slight bus-loading effect, but if that is the worst of it, I am all for it.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

RE: Anybody else seen

Quote:

Anybody else seen this or is it just me?!

Before the beta series, my two-GTX 460 setup (one at x16, the other at x4) would complete tasks at a ratio of 70:30, with RAC typically around 48K/day working only BRP5. CPU usage would typically be above 50%, PCIe bandwidth being the limiting factor.

With 1.52 the ratio is almost 50:50 and all tasks are faster; I am now expecting a RAC maybe near 96K, and most surprisingly the CPU usage is usually below 20% on all 4 cores. I would say the beta CPU times may be as low as 10% of the original!

Finally the GPU fans are above 50%.

Beta is betta, much betta.

This is a great example of GPU computing, so developers, take a bow; you have released the Krakens.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 578650204
RAC: 198551

RE: But now I have to order

Quote:
But now I have to order big fans immediately to ban the heat :-)


I know you're not complaining, but for everyone concerned about the increased power consumption and temperatures of their GPUs:

Einstein is still working the GPU less hard than e.g. GPU-Grid. If you still want to decrease the stress on the GPU a bit, you can do so easily on nVidia GPUs since Kepler (600 series) by reducing the power target. If you lower it far enough (Einstein is probably running at significantly less than 100% of your stock limit), the GPU will automatically downclock slightly and - most importantly - lower the voltage accordingly. This makes the GPU run more efficiently and reduces power draw & heat output (they're the same thing here). You also stay within the specs guaranteed by your card manufacturer.

Personally I like my 28 nm GPUs (all desktop nVidias starting from the 600 series) to run at around 1.10 V. This is just a few tens of MHz slower than full throttle at 1.17 - 1.20 V, but saves around 20% power (depending on the actual setting, of course).

The usual OC utilities can set this. Obviously you could also increase the fan speed instead, but this creates noise and somewhat increases fan wear and fan power draw (realistically negligible unless it gets really loud).
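
If you'd rather script it than click through a tool, here's a minimal sketch using NVML (the library behind nvidia-smi) that reads the allowed power-limit range and lowers the limit to a hypothetical 120 W. The device index 0 and the 120 W target are assumptions for illustration only, and setting the limit normally needs administrator rights.

/* Sketch only: query and lower the board power limit via NVML.
 * Values are in milliwatts. Link with -lnvidia-ml. */
#include <stdio.h>
#include <nvml.h>

int main(void)
{
    nvmlDevice_t dev;
    unsigned int minmw, maxmw, curmw;
    unsigned int target = 120000;   /* hypothetical 120 W target */

    if (nvmlInit() != NVML_SUCCESS)
        return 1;
    if (nvmlDeviceGetHandleByIndex(0, &dev) != NVML_SUCCESS) {
        nvmlShutdown();
        return 1;
    }

    nvmlDeviceGetPowerManagementLimitConstraints(dev, &minmw, &maxmw);
    nvmlDeviceGetPowerManagementLimit(dev, &curmw);
    printf("current limit %u mW, allowed range %u - %u mW\n", curmw, minmw, maxmw);

    /* Only apply the new limit if the board actually allows it. */
    if (target >= minmw && target <= maxmw)
        nvmlDeviceSetPowerManagementLimit(dev, target);

    nvmlShutdown();
    return 0;
}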

For AMD GPUs it's not so easy, but there are some indirect settings in the Catalyst Control Center where you should be able to ask the card to use less power. From what I've seen in reviews, they hardly scale the voltage and throttle mostly via clock speed, which gives a far worse, roughly 1:1, relationship between power savings and performance loss. Here I'd rather tweak clock & voltage myself.

MrS

Scanning for our furry friends since Jan 2002
