Binary Radio Pulsar Search (Perseus Arm Survey) "BRP5"

Neil Newell
Joined: 20 Nov 12
Posts: 176
Credit: 169699457
RAC: 0

Quote:

As both power consumption and GPU load were down at 3x from BRP4 on this system, and I am currently running only a single CPU job on this 4-core host, I thought I might get appreciable speedup of 4x over 3x, but the improvement in throughput was very small, and came at a cost of degraded system level power productivity. So I've reverted to 3x. For this system the 3x benefit over 2x is moderate, but definite, including a power efficiency improvement.

Thanks for the detail; so looks like 3x is still optimum (for nvidia) with 2x not far behind?

I'll probably stick with 2x; it keeps the GPUs that little bit cooler (even if the science per watt isn't quite as good).

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7219434931
RAC: 978117

Neil Newell wrote:
so looks like 3x is still optimum (for nvidia) with 2x not far behind?


Yes, for my host configuration for the GTX660. I don't feel like plugging my GTX460 back in, but suspect it would prefer 2x to 3x.

I think the rough and ready rule of thumb of running 2x and restricting the pure CPU jobs to n-1 cores is still a fast way to get pretty close to optimum without testing for Einstein work on Nvidia cards. It seems almost universal that 2x beats 1x appreciably if it works at all, and gains above 2x seem to vary from modest to negative. Optimum CPU core count varies more, but n-1 won't often be badly off optimum in performance.

I'm pretty confident my rig would get higher total throughput with one or more additional CPU jobs to the single one I'm running--but I am on summer power conservation, and already throttling a fair number of hours per day in normal service (not these tests), so the poor incremental power efficiency of adding CPU jobs has me not going that way until heating season returns about November.

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Quote:


I've been testing 1x and 2x on a linux PCIe 2.0 system with GTX580.

1x: ~12,000s
2x: ~19,000s (9,500s/task)

As this is quite a difference, I'm wondering if even higher utilisation would be better (compared to BRP4, where NVIDIA at least don't seem to improve much beyond 2x).

I think that higher utilization may be a possibility. I have not checked power consumption on my NVIDIA systems yet, but I noticed that one of my AMD systems is now drawing 100 W less at the wall while running BRP5 tasks. I suspect there may be some additional headroom for running more tasks but have not tried so far.

Due to high electric costs in the summer, I think I may just keep my systems as is with the reduced power consumption of the BRP5 tasks rather than trying to increase the utilization.

astro-marwil
Joined: 28 May 05
Posts: 531
Credit: 642696543
RAC: 1106512

Hello!
I'm missing on the SERVER STATUS page at the lower right corner a data field called "BRP5 progress".

Kind regards and happy crunching.
Martin

zablociak
Joined: 5 Sep 11
Posts: 10
Credit: 636140972
RAC: 0

Quote:

As far as I remember we discovered and reported this while CUDA 4.0 was in alpha test to developers, I think about 2.5y ago. We tested and reported with every new CUDA version that has come out since.

BM


Two and a half years ago means that NVIDIA is not going to correct this bug in any near future. Maybe it's time to move on and find another solution or workaround, in order to make use of CUDA 4.2 and CUDA 5.0. CUDA 4.2 really works great in GPUGRID when compared to CUDA 3.1.

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3517032001
RAC: 1636050

I got a notice a few days ago that CUDA 5.5 is available. Maybe check that?

-----

tolafoph
Joined: 14 Sep 07
Posts: 122
Credit: 74659937
RAC: 0

Quote:

I've been testing 1x and 2x on a linux PCIe 2.0 system with GTX580.

1x: ~12,000s
2x: ~19,000s (9,500s/task)

As this is quite a difference, I'm wondering if even higher utilisation would be better (compared to BRP4, where NVIDIA at least don't seem to improve much beyond 2x).

With a really small sample on my GTX 580 I got:

2x BRP5 = 7790s GPU@ 90%
3x BRP5 = 7570s GPU@ 94%

So only a small (~3%) gain for running 3 tasks.
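
For anyone who wants to redo this arithmetic with their own timings, here is a minimal Python sketch. It assumes the quoted 2x figure (19,000s) is the elapsed time of each of two concurrent tasks, and that the 7790s and 7570s figures above are already effective per-task times; the function names are just illustrative.

```python
def effective_seconds_per_task(elapsed_seconds, concurrent_tasks):
    """Wall-clock seconds per finished task when several tasks share one GPU."""
    return elapsed_seconds / concurrent_tasks


def throughput_gain(baseline_seconds_per_task, new_seconds_per_task):
    """Fractional throughput gain of the new setting over the baseline."""
    return baseline_seconds_per_task / new_seconds_per_task - 1.0


# Quoted GTX 580 figures (Linux, PCIe 2.0): 1x vs 2x
one_x = effective_seconds_per_task(12_000, 1)
two_x = effective_seconds_per_task(19_000, 2)  # 9,500 s per task
gain_2x_over_1x = throughput_gain(one_x, two_x)

# Figures from this post, assumed to be effective per-task times already
gain_3x_over_2x = throughput_gain(7790, 7570)

print(f"2x over 1x: {gain_2x_over_1x:.1%}")  # 2x over 1x: 26.3%
print(f"3x over 2x: {gain_3x_over_2x:.1%}")  # 3x over 2x: 2.9%
```

This only compares throughput. It says nothing about power draw or GPU temperature, which several people in this thread weigh just as heavily.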

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250382418
RAC: 34803

BRP5 credit definitely requires some more investigation and thinking, more than we have time for now. For the time being I raised the credit to 5000 for newly generated BRP5 workunits.

Generation of BRP4 workunits has been disabled altogether; what's already there will be sent out and processed by GPUs. (In a few days BRP4 workunit generation will be reconfigured and the remaining Arecibo data will be processed by slower CPUs.)

BM

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250382418
RAC: 34803

Quote:
I'm missing on the SERVER STATUS page at the lower right corner a data field called "BRP5 progress".

Me too. But for various technical reasons this will take a while. For one, the first ~30k WUs that were sent out last weekend have to be completely processed. You will see that this has happened when the number of "BRP5 Workunits waiting for assimilation" starts dropping again.

BM

BM

zablociak
Joined: 5 Sep 11
Posts: 10
Credit: 636140972
RAC: 0

Quote:
BRP5 credit definitely requires some more investigation and thinking, more than we have time for now. For the time being I raised the credit to 5000 for newly generated BRP5 workunits.
BM


Great news. Thank you!
