Binary Radio Pulsar Search (Parkes PMPS XT) "BRP6"

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956733075
RAC: 716097

My observations of total time per task (both elapsed and CPU time) agree with archae86.

But I haven't looked at instantaneous CPU loading (or GPU loading, come to that) during the course of a run. That might be worth doing sometime.
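One way to do that on Linux is to sample the science app's CPU time from /proc and differentiate over an interval. A minimal stdlib-only sketch (Linux-specific; you would pass it the PID of the running BRP6 app, found e.g. with your process monitor of choice -- Windows users would stick with Process Explorer):

```python
# Sketch: watch a task's instantaneous CPU load by reading
# /proc/<pid>/stat (fields 14/15: utime/stime, in clock ticks)
# and differentiating over a sampling interval. Linux only.
import os
import time

CLK_TCK = os.sysconf("SC_CLK_TCK")  # clock ticks per second, usually 100

def cpu_seconds(pid):
    """Cumulative user+system CPU time of a process, in seconds."""
    with open(f"/proc/{pid}/stat") as f:
        # the command name can contain spaces/parens, so split
        # on the last ')' and index fields from there
        fields = f.read().rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])
    return (utime + stime) / CLK_TCK

def sample_load(pid, interval=5.0, samples=12):
    """Instantaneous CPU load (fraction of one core) per interval."""
    loads = []
    prev = cpu_seconds(pid)
    for _ in range(samples):
        time.sleep(interval)
        cur = cpu_seconds(pid)
        loads.append((cur - prev) / interval)
        prev = cur
    return loads
```

Sampled over a whole run, these values would show how the task's CPU demand varies with time rather than just the end-of-run total.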

And @ Stef - make a note of which task is using high CPU and which one low CPU, so you can match them with the total times at the end of the run. I think you might have them the wrong way round.

Stef
Joined: 8 Mar 05
Posts: 206
Credit: 110568193
RAC: 0

You were right, the slow one is the CPU-hog.
However, I hadn't noticed that with the BRP6 tasks I ran before; they were running with very low CPU load.
It doesn't matter to me as I keep one CPU thread free for the GPU tasks anyway.

There is only one question left for me: is it still worth running n tasks in parallel? I guess nobody has tested that yet.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956733075
RAC: 716097

So, of the two I have running at the moment,

PM0007_01111_94_0 looks to be fast-running, predicted to use very little CPU, and indeed shows around 2% CPU usage in Windows Process Explorer (on an 8-core CPU, so say ~15% of a core, for Linux comparisons).

PM0007_01071_224_0 is slow-running, predicted to use a lot of CPU, and shows 6% CPU / 50% core usage.
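For readers comparing across tools: Process Explorer reports CPU usage as a percentage of the whole machine, while per-core figures (as Linux's top shows them) are that value multiplied by the core count. A trivial sketch of the conversion being done above:

```python
def machine_pct_to_core_pct(machine_pct, n_cores):
    """Convert whole-machine CPU% (Process Explorer style) to per-core%."""
    return machine_pct * n_cores

# The two figures quoted above, on an 8-core CPU:
print(machine_pct_to_core_pct(2, 8))  # 16 -- roughly the "~15% of a core"
print(machine_pct_to_core_pct(6, 8))  # 48 -- roughly the "50% core usage"
```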

Process Explorer is only showing me the GPU usage of my HD 4000 Intel GPU, zero for the NV GTX 670. And GPU-Z shows the total load on the 670, but doesn't break it down by process.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7221544931
RAC: 972157

Quote:
But I haven't looked at instantaneous CPU loading (or GPU loading, come to that) during the course of a run.


My informal observation is that the CPU loading seems pretty consistent throughout the run of a particular WU (so long as the companion task characteristics remain the same).

I've not been watching GPU loading. I'm running 2X on my 660s and 750s, and 3X on my 970, and have not yet attempted to find the preferred multiple for this application. I don't plan to try until a higher CUDA-level version comes out, or until it seems one is unlikely to be distributed for a long time.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 725891000
RAC: 1227662

Quote:
Quote:
But I haven't looked at instantaneous CPU loading (or GPU loading, come to that) during the course of a run.

My informal observation is that the CPU loading seems pretty consistent throughout the run of a particular WU (so long as the companion task characteristics remain the same).

In general the CPU load will decrease during the runtime of a sub-workunit (BRP6 workunits consist of a bundle of two sub-units). By how much the load will decrease and how quickly is data-dependent, but the general trend should always be a sawtooth-like curve with two teeth, so to speak (for the two sub-workunits).

Cheers
HB

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956733075
RAC: 716097

Quote:
Quote:
Quote:
But I haven't looked at instantaneous CPU loading (or GPU loading, come to that) during the course of a run.

My informal observation is that the CPU loading seems pretty consistent throughout the run of a particular WU (so long as the companion task characteristics remain the same).

In general the CPU load will decrease during the runtime of a sub-workunit (BRP6 workunits consist of a bundle of two sub-units). By how much the load will decrease and how quickly is data-dependent, but the general trend should always be a sawtooth-like curve with two teeth, so to speak (for the two sub-workunits).

Cheers
HB


Do you know whether the two sub-workunits will always be of consistent 'chewiness', for want of a better word? If they were, then the transition point would always be in the middle of the run, which would help us with the analysis.

Or maybe, I presume, the transition will be at the 50% point by definition, even if the two halves have different durations.

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3520461658
RAC: 1608234

Is there a similar improvement in performance expected for BRP4G too, since it uses the same application?


Stef
Joined: 8 Mar 05
Posts: 206
Credit: 110568193
RAC: 0

So the long v1.50 finished over night.

Here is a first summary of running 2 tasks in parallel on a 750Ti:

Using v1.39 the average of 40 workunits was:
20366s runtime and 2103s CPU time.

The long v1.50 task (PM0007_01161_126_1) was:
22643s runtime and 5254s CPU time. (!)

The other six v1.49/v1.50 tasks I've done so far have each taken pretty much the same time, and on average:
15942s runtime and 550s CPU time.
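Putting those reported averages side by side, a quick back-of-envelope script (using only the numbers quoted above, and excluding the outlier long task):

```python
# Stef's reported averages (seconds), 2 tasks in parallel on a 750Ti
v139_runtime, v139_cpu = 20366, 2103   # v1.39, average of 40 workunits
v150_runtime, v150_cpu = 15942, 550    # v1.49/v1.50, average of the 6 typical tasks

print(f"runtime:    {(1 - v150_runtime / v139_runtime) * 100:.1f}% shorter")
print(f"throughput: x{v139_runtime / v150_runtime:.2f}")
print(f"CPU time:   {(1 - v150_cpu / v139_cpu) * 100:.1f}% lower")
```

That works out to roughly a 22% shorter runtime and about a quarter of the CPU time per task for the new app version.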

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 725891000
RAC: 1227662

Quote:
Is there a similar improvement in performance expected for BRP4G too, since it uses the same application?

That is a very good question. It's using the same application, but different search parameters, and to make things more complicated, the BRP4G tasks go out to a very special breed of GPUs (Intel GPUs integrated in the CPU, not dedicated GPUs). Too many variables for me to make a good guess; we will try this later.

HBE

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 725891000
RAC: 1227662

Quote:

Do you know whether the two sub-workunits will always be of consistent 'chewiness', for want of a better word? If they were, then the transition point will always be in the middle of the run, which will help us with the analysis.

Or maybe, I presume, the transition will be at the 50% point by definition, even if the two halves have different duration.

I think I remember seeing cases where the sub-units had quite a different 'chewiness', so the sub-task switch can happen at points other than 50% of the total runtime.

HB
