Gamma-ray pulsar binary search #1 on GPUs

Jeroen

Joined: 25 Nov 05

Posts: 379

Credit: 740030628

RAC: 0

16 Dec 2016 22:03:56 UTC

Message 152909

My 980 Ti cards in Win 10 went from 7850 seconds per task with version 1.15 to 350 seconds per task with version 1.16.

Edit:

Running two tasks at once, each takes approximately 480 seconds. I tested with the stock GPU memory frequency of 3304 MHz in P2 and with an overclocked 3800 MHz in P2, but the higher memory frequency made little difference in performance: only a few seconds in runtime.
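For comparing the 1x and 2x figures above, a minimal sketch of the effective time per completed task. The function name is made up for illustration; the 350 and 480 second inputs are the ones quoted in the post, and the sketch assumes both concurrent tasks take the same elapsed time.

```python
def effective_seconds_per_task(elapsed_seconds: float, concurrent_tasks: int) -> float:
    """Wall-clock seconds per completed task when several tasks share one GPU."""
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return elapsed_seconds / concurrent_tasks


# Figures quoted in the post for version 1.16 on a 980 Ti.
single = effective_seconds_per_task(350, 1)  # one task at a time
double = effective_seconds_per_task(480, 2)  # two tasks at a time

# Throughput gain from running two tasks concurrently.
speedup = single / double
print(f"1x: {single:.0f} s/task, 2x: {double:.0f} s/task, gain: {speedup:.2f}x")
```

Under these assumptions, running 2x works out to roughly 240 seconds per task, so the card finishes noticeably more work per hour than at 1x.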

Filipe

Joined: 10 Mar 05

Posts: 186

Credit: 405584640

RAC: 420621

16 Dec 2016 20:40:55 UTC

Message 152927 in response to message 152877

So, no GPU work for any of the systems running a 32-bit OS, like mine? There are probably a lot of people out there still using a 32-bit OS.

Marteinstein

Joined: 21 Jun 11

Posts: 4

Credit: 9758945

RAC: 0

16 Dec 2016 21:09:50 UTC

Message 152930

There is a problem with this app. My computer lags while it is running (i5-4460, Nvidia GTX 750, Win8).

Task Manager shows one full core for the Einstein GRPBS-1-GPU app, roughly as expected.

However, since the computer lagged, I looked at which other tasks were active:
There was a task named "System" active at about 11%, which on my computer is close to half a core.
This task went to 0% usage when one of the GRPBS-1-GPU tasks ended.
As soon as the next GRPBS-1-GPU task started, this "System" task was up again.

So a GRPBS-1-GPU task effectively uses up to 1.5 cores overall, which is unmanageable in the long run.

Any solutions?
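
The "close to half a core" estimate above can be checked with a rough sketch. It assumes that Task Manager reports CPU usage as a percentage of all logical cores combined and that the i5-4460 exposes 4 logical cores; the function and variable names are illustrative only.

```python
def percent_to_cores(total_cpu_percent: float, logical_cores: int) -> float:
    """Convert a share of total CPU (in percent) into an equivalent number of cores."""
    return total_cpu_percent / 100.0 * logical_cores


LOGICAL_CORES = 4   # assumption: i5-4460, 4 cores without hyper-threading
app_cores = 1.0     # the GPU app itself, "one full core" per the post

# The "System" task was observed at about 11% of total CPU.
system_cores = percent_to_cores(11, LOGICAL_CORES)
total_cores = app_cores + system_cores

print(f"System task: {system_cores:.2f} cores, combined: {total_cores:.2f} cores")
```

On those assumptions the combined load comes to a little under 1.5 cores, consistent with the figure given in the post.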

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250389218

RAC: 34726

16 Dec 2016 21:10:12 UTC

Message 152931

I honestly doubt that there are enough 32-bit GPU systems attached to E@H to make it worth investing a lot of time in porting the application to a different FFT library. But we'll try to dig through our DB to find out.

When the FGRP GPU app is a bit more established, we could possibly limit BRP4G to the machines that can't run FGRP, so that when we get new Arecibo data, these machines get work again and it will last a bit longer. However, it's not clear whether we will get more Arecibo data at all, and if so, when.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7219624931

RAC: 975722

16 Dec 2016 21:33:31 UTC

Message 152934 in response to message 152931

I don't know how the maximum jobs allowed to be downloaded per day is computed, but I have a system which is currently waiting until tomorrow UTC to request more work because it already hit the 640 task download limit for today for that system.

A little arithmetic on the elapsed times for 1.16 work on that system suggests that it is capable of about 600 tasks/day of this flavor at my current settings and tuning.

So while in principle I'm OK, a slightly faster system that earned the same limit would be starved. Unless there is a near-term plan to gather multiple of the actual tasks into a single BOINC task, perhaps a moderate limit relaxation could be considered.

[Note on calculation--which is approximate. The system has one GTX 1070 card which is currently running 2X, and giving elapsed times of a bit over 8 minutes. Rounding down to 8, that gives 360 tasks/day for that card. The 6GB GTX 1060 card is averaging a little under 12 minutes. Rounding up to 12 gives 240 tasks/day for that card.]
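
The arithmetic in the note above can be reproduced with a short sketch. The 8 and 12 minute elapsed times and the 640 task limit come from the post; the function name is illustrative, and the sketch assumes the GTX 1060 is also running 2X, which the post does not state outright but which the 240 tasks/day figure implies.

```python
MINUTES_PER_DAY = 24 * 60


def tasks_per_day(elapsed_minutes: float, concurrent_tasks: int) -> float:
    """Tasks a card completes per day at a given elapsed time and concurrency."""
    return MINUTES_PER_DAY / elapsed_minutes * concurrent_tasks


gtx_1070 = tasks_per_day(8, 2)    # rounded down from "a bit over 8 minutes"
gtx_1060 = tasks_per_day(12, 2)   # assumption: this card also runs 2X
host_total = gtx_1070 + gtx_1060

DAILY_LIMIT = 640                 # download limit quoted in the post
headroom = DAILY_LIMIT - host_total

print(f"GTX 1070: {gtx_1070:.0f}/day, GTX 1060: {gtx_1060:.0f}/day")
print(f"Host total: {host_total:.0f}/day, headroom under limit: {headroom:.0f}")
```

With these inputs the host lands at about 600 tasks/day, leaving only about 40 tasks of headroom under the 640 limit, which is why a slightly faster host with the same limit would run dry.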

TimeLord04

Joined: 8 Sep 06

Posts: 1442

Credit: 72378840

RAC: 0

16 Dec 2016 21:52:34 UTC

Message 152936

[Update:]

OK - MAC Units (1.13 FGRP) still crunching away and completing TWO at a time on TWO EVGA GTX-750TI SC cards. Completion time running 2 at a time is under 36 Min.

Win XP Pro x64 now on 1.16 FGRP Units, (all Arecibo BRP4G Units now done), and crunching on EVGA GTX-760. Completion times 1 Hr and 2 Min. to 1 Hr and 3 Min. respectively, crunching TWO at a time. MUCH improved over the 1.15 Units.

In closing, between my two systems, I'm crunching 6 Total Units at a time on three GPUs. Systems seem stable, GPU Fan noise is minimal. NO Fan Control on MAC; but Precision X is running on Win XP Pro x64.

Both systems run 15 Hrs per day, from 6 AM to 9 PM - PST; in summer from 6 PM to 9 AM - PDT.

[EDIT:]

MAC should start on the 1.14 Units tonight before 9 PM; or some time tomorrow morning.

[EDIT 2:]

Still getting Invalids on the MAC due to OpenCL Bug. Now up to 4 Invalids. Prior to FGRP work, MAC was on BRP6 work and NEVER had an Invalid, nor an Inconclusive.

I will keep monitoring.

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250389218

RAC: 34726

16 Dec 2016 22:00:00 UTC

Message 152937 in response to message 152934

archae86 wrote:

So while in principle I'm OK, a slightly faster system that earned the same limit would be starved. Unless there is a near-term plan to gather multiple of the actual tasks into a single BOINC task, perhaps a moderate limit relaxation could be considered.

The current tasks are still the ones sized for testing, in particular to be validated with CPU jobs, so meant to finish in reasonable time with CPU Apps as well. The next type of workunits that we will probably issue on Monday will run quite a bit longer. These should also have a better ratio of CPU vs. GPU utilization.

Trotador

Joined: 2 May 13

Posts: 58

Credit: 2122643213

RAC: 0

16 Dec 2016 22:58:23 UTC

Message 152941

It seems that the WU duration has increased since yesterday. Is that correct?

Mad_Max

Joined: 2 Jan 10

Posts: 154

Credit: 2211544756

RAC: 319943

17 Dec 2016 5:59:15 UTC

Message 152964 in response to message 152934

archae86 wrote:

I don't know how the maximum jobs allowed to be downloaded per day is computed, but I have a system which is currently waiting until tomorrow UTC to request more work because it already hit the 640 task download limit for today for that system.

A little arithmetic on the elapsed times for 1.16 work on that system suggests that it is capable of about 600 tasks/day of this flavor at my current settings and tuning.

So while in principle I'm OK, a slightly faster system that earned the same limit would be starved. Unless there is a near-term plan to gather multiple of the actual tasks into a single BOINC task, perhaps a moderate limit relaxation could be considered.

Last time I checked (though it was long ago, so it may no longer be relevant), the daily quota was dynamic and calculated from the number of CPU cores and GPUs a specific host has. The quota is also lowered with every failed WU and restored with validated WUs. So for a faster system, the daily quota will be higher.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250389218

RAC: 34726

17 Dec 2016 7:37:17 UTC

Message 152967 in response to message 152941

Trotador wrote:

It seems that the WU duration has increased since yesterday. Is that correct?

Not systematically, i.e. not in the "size" of the workunits, which is what we control.

However, the duration of the last part of the computation is data-dependent. If your GPU isn't capable of double precision computation, this part is done on the CPU and will have a noticeable contribution to the overall runtime.
