Gamma-ray pulsar binary search #1 on GPUs

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0


Christian Beer wrote:

If some of you could monitor your DCF and report any changes, that would be great.

and

The issue with wrong estimated time and DCF is still under investigation.

I've done some experimenting and here are the findings on my system (i7-3770K @ 4.2GHz, HT on, running 4 CPU tasks; GTX 970 running x2).

To get faster results I've edited client_state.xml to change the DCF. (Not recommended unless you know what you're doing and are willing to risk losing all the work on board.)

FGRP CPU app:
DCF   Estimated RT    Run time (RT)
1.0   6h14m36s        6h31m42s

FGRP GPU app running x2:
DCF   Estimated RT    Run time
1.0   2h13m47s        47m34s
0.4   53m30s

Multi-Directed CV app:
DCF   Estimated RT    Run time
1.0   4h45m24s        12h46m1s
2.65  12h36m20s

The run times above are for the most recently returned tasks as of this post, and my observation is that tasks usually take about this long to run. For the Multi-Directed CV app there are also shorter tasks; they have shorter estimates but still drive up the DCF.
So on my system the DCF swings from just under 0.4 to about 2.7. Needless to say, I run with a small cache so as not to over-fetch CPU tasks when the DCF is driven down by the FGRP GPU app.
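
For anyone who wants to check the numbers: the estimate the client shows is just the DCF=1.0 estimate scaled by the current DCF (the 0.4 and 2.65 rows above are consistent with that), so the DCF each app "wants" on this host is roughly run time divided by the DCF=1.0 estimate. A quick Python sketch of that arithmetic, using only the figures from the table (the layout and helper are just for illustration):

```python
# The DCF each app "wants" is roughly run time / (estimate at DCF = 1.0).
# Times below are taken from the table above; hms() is just a helper for this sketch.

def hms(h=0, m=0, s=0):
    """Convert hours/minutes/seconds to seconds."""
    return h * 3600 + m * 60 + s

apps = {
    "FGRP CPU":          (hms(6, 14, 36), hms(6, 31, 42)),   # (estimate @ DCF 1.0, run time)
    "FGRP GPU (x2)":     (hms(2, 13, 47), hms(0, 47, 34)),
    "Multi-Directed CV": (hms(4, 45, 24), hms(12, 46, 1)),
}

for name, (estimate, runtime) in apps.items():
    print(f"{name:<18} implied DCF ~ {runtime / estimate:.2f}")

# Prints roughly: FGRP CPU 1.05, FGRP GPU (x2) 0.36, Multi-Directed CV 2.68,
# which matches the observed swing from just under 0.4 up to about 2.7.
```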

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 125301294
RAC: 328432


rbpeake wrote:
Just curious, what is the difference between the CPU and the GPU searches, scientifically?  Thanks!

There is no scientific difference between the two searches. One is CPU only, the other is GPU only; it's the same setup we use for BRP4 and BRP4G. The content of the FGRPB1G tasks is of course modified to fit on GPUs, but that is just a technical thing. Scientifically both apps are doing the same thing.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 125301294
RAC: 328432


Holmis wrote:
The run times above are for the most recently returned tasks as of this post, and my observation is that tasks usually take about this long to run. For the Multi-Directed CV app there are also shorter tasks; they have shorter estimates but still drive up the DCF.
So on my system the DCF swings from just under 0.4 to about 2.7. Needless to say, I run with a small cache so as not to over-fetch CPU tasks when the DCF is driven down by the FGRP GPU app.

OK, so it seems I need to increase the speedup factor a little bit more to get at least that estimate right. The problem with the multi-directed search is that run times vary per host, so the estimate is correct for a lot of systems but incorrect for others. We can't really fix that right now; we would need to get rid of DCF, which means updating things server-side that we don't want to update, because we already know that update won't work either.
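
As a rough illustration of the size of the remaining gap, and assuming the time estimate shrinks in proportion as the speedup factor grows (an assumption about the scheduler, not a description of the actual code), Holmis's FGRPB1G numbers above give:

```python
# Rough illustration only; assumes the estimate scales inversely with the speedup factor.
estimate_dcf1 = 2 * 3600 + 13 * 60 + 47   # 8027 s: FGRPB1G estimate at DCF = 1.0 on Holmis's host
observed      = 47 * 60 + 34              # 2854 s: observed run time there, running x2

print(f"factor would still need to grow by ~{estimate_dcf1 / observed:.1f}x for that host")
# -> ~2.8x; a host running a single task per GPU, or a slower card, would need a
#    different value, which is the same per-host problem as with the CV run times.
```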

chester
Joined: 15 Jun 10
Posts: 15
Credit: 506261798
RAC: 0


Hi all

My CPU crunches the CPU version 1.5 times longer than the GPU version.

1. If the GPU version is more efficient on certain CPU models, why not send them the GPU version instead?

2. Can I somehow force the GPU version to download and run on the CPU?

 

Machine: AMD A10-6800K https://einsteinathome.org/host/12192829

CPU version: 28,251.13 sec https://einsteinathome.org/workunit/265968478

GPU version: 19,907.76 sec https://einsteinathome.org/workunit/265691764

Mad_Max
Joined: 2 Jan 10
Posts: 153
Credit: 2139702840
RAC: 228351


Holmis wrote:

I've done some experimenting and here are the findings on my system (i7-3770K @ 4.2GHz, HT on, running 4 CPU tasks; GTX 970 running x2).

To get faster results I've edited client_state.xml to change the DCF. (Not recommended unless you know what you're doing and are willing to risk losing all the work on board.)

FGRP CPU app:
DCF   Estimated RT    Run time (RT)
1.0   6h14m36s        6h31m42s

FGRP GPU app running x2:
DCF   Estimated RT    Run time
1.0   2h13m47s        47m34s
0.4   53m30s

Multi-Directed CV app:
DCF   Estimated RT    Run time
1.0   4h45m24s        12h46m1s
2.65  12h36m20s

The run times above are for the most recently returned tasks as of this post, and my observation is that tasks usually take about this long to run. For the Multi-Directed CV app there are also shorter tasks; they have shorter estimates but still drive up the DCF.
So on my system the DCF swings from just under 0.4 to about 2.7. Needless to say, I run with a small cache so as not to over-fetch CPU tasks when the DCF is driven down by the FGRP GPU app.

Yep, my stats (from 6 different hosts) look similar:

Multi-Directed CV CPU WUs try to drive the DCF into the 1.5-2 range.
FGRP CPU WUs into the 1.0-1.5 range.
BRP4G needs a DCF of around 1 (2x WUs per GPU).
FGRP GPU tries to drag the DCF down to the 0.3-0.4 range on hosts with 2x WUs per GPU, and to only 0.2-0.3 with a single WU per GPU (before the last speedup correction it was even below 0.2 for some time).

I was also forced to reduce the cache size to under 1 day on hosts running a CPU/GPU mix to avoid a huge over-fetch of CPU tasks.
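
To put a rough number on the over-fetch: CPU task estimates get scaled by the same host-wide DCF that the GPU app drags down, so the client thinks CPU tasks are about 3x shorter than they really are and fills the cache accordingly. A small sketch, where the 0.35 and 1.0 values are just the typical figures from above rather than measurements of any particular host:

```python
# The client fills the work buffer using estimates scaled by the current host-wide DCF,
# so CPU work gets over-fetched by roughly (true CPU DCF / current DCF).
cache_days   = 1.0    # requested work buffer
dcf_current  = 0.35   # typical value after the FGRP GPU app has pulled the DCF down
dcf_true_cpu = 1.0    # value at which the CPU task estimates would be about right

overfetch = dcf_true_cpu / dcf_current
print(f"~{cache_days * overfetch:.1f} days of CPU work fetched for a "
      f"{cache_days:.0f}-day cache (about {overfetch:.1f}x too much)")
```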

TimeLord04
Joined: 8 Sep 06
Posts: 1442
Credit: 72378840
RAC: 0


chester_4 wrote:

Hi all

My CPU crunches the CPU version 1.5 times longer than the GPU version.

1. If the GPU version is more efficient on certain CPU models, why not send them the GPU version instead?

2. Can I somehow force the GPU version to download and run on the CPU?

 

Machine: AMD A10-6800K https://einsteinathome.org/host/12192829

CPU version: 28,251.13 sec https://einsteinathome.org/workunit/265968478

GPU version: 19,907.76 sec https://einsteinathome.org/workunit/265691764

Do what I'm doing and choose NOT to crunch CPU units: go to Account ---> Preferences ---> Project (Preferences), set "Use CPU" to "No", and save the changes.

 

When finished with these modifications, hit Update in BOINC. Your existing CPU queue may continue crunching until it's gone, but no new CPU work will download to your systems.

 

[EDIT:]

Once your CPU work queue is gone, the CPU will be dedicated to feeding the GPU and free to run your system. You may find that with this setup you can surf the web, or even stream videos while crunching, without lag or frame drops.

 

I say "MAY" because on Plex Streaming, right now, on 1.16 Units being crunched on my XP Pro x64 system running EVGA GTX-760 and crunching two Units at a time, that if I stream video, I get frame drops and lagging in video quality.  Sound is OK, but video seems to suffer with these newer GPU Units.

 

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

chester
Joined: 15 Jun 10
Posts: 15
Credit: 506261798
RAC: 0


You would be correct if I had a GPU installed.

The computer has no GPU (coprocessor), but it was somehow getting GPU versions of this app and crunched them on the CPU in 2/3 of the time of the CPU version.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109937342815
RAC: 31375719


chester_4 wrote:
You would be correct if I had a GPU installed.

Essentially, you do have a GPU installed.  You should do some reading about what AMD APUs actually are.  The advice you were given would maximise the output of the 'GPU' part of your processor.  In addition, you might be able to run some CPU type tasks to improve the total output of your machine, but at the expense of higher power consumption, more heat produced and possibly somewhat longer GPU crunch times.

You'll need to experiment to find the best balance suited to your needs.  That would be much easier if you know what an app_config.xml file is and how to tweak things with it.  It may well be that the best result for you is just to exclude the CPU version of the search.  I don't own any AMD APUs so I can't give advice based on experience.
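
For anyone who hasn't used one before: app_config.xml is a small file you drop into the Einstein@Home project folder inside the BOINC data directory. Something along these lines would run two GPU tasks at once and reserve a CPU core for each; treat it purely as a starting point to adapt, and note that the app name below is only what the FGRPB1G GPU tasks are usually listed under, so check the app names in your own client_state.xml if it doesn't take effect:

```xml
<!-- Example only; adjust the values to suit your host. -->
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>       <!-- assumed FGRPB1G GPU app name; verify in client_state.xml -->
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>       <!-- 0.5 = run two tasks per GPU ("x2") -->
      <cpu_usage>1.0</cpu_usage>       <!-- reserve one CPU core to feed each GPU task -->
    </gpu_versions>
  </app>
</app_config>
```

After saving it, 'Options -> Read config files' in BOINC Manager (or simply restarting the client) should pick it up.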

 

Cheers,
Gary.

AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0


chester_4 wrote:

Hi all

My CPU crunches the CPU version 1.5 times longer than the GPU version.

1. If the GPU version is more efficient on certain CPU models, why not send them the GPU version instead?

2. Can I somehow force the GPU version to download and run on the CPU?

Machine: AMD A10-6800K https://einsteinathome.org/host/12192829

CPU version: 28,251.13 sec https://einsteinathome.org/workunit/265968478

GPU version: 19,907.76 sec https://einsteinathome.org/workunit/265691764

I think there may be some misunderstandings here.  We are looking at one of the rare work units (1 WU = two tasks) where one task went to the CPU application and the other to the GPU application.

So you are not running a GPU application on that host.  Interestingly, these dual-app WUs have "Binary points=36" whereas normally they have "Binary points=175", which may mean they use more memory (longer arrays but fewer of them).  I might crunch a few of these on my CPU and see if I get a time reduction as well.

As Gary mentioned, you could run the GPU app on the APU. I think I saw someone with a Kaveri running these apps, so it might be worth looking into.  If clinfo shows the GPU with 1GB of memory it should work.  See some clues in this thread.

Edit: yes, I have run 72 such "GPU" tasks on this CPU host, and they were actually slightly slower, averaging 42K seconds compared with 40K normally.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109937342815
RAC: 31375719


Christian Beer wrote:
...
So in terms of credit we should be back at the same level we had with the BRP4G search.

There have been BRPx searches for quite a long time.  I chose to do the BRP5/BRP6 searches based on quite a few reasons, such as

  1. The data came from an Australian radio telescope (near the town of Parkes in country New South Wales).
  2. The data download per task was a lot less than that of BRP4G so I didn't need a huge monthly limit.
  3. The number of hosts trying to get BRP4G data could completely swamp the available bandwidth of my internet connection if I went with BRP4G.
  4. The 'pay rate' for doing BRP4G was markedly lower than that for doing BRP6.

The first three are the real reasons.  The 4th is a happy side benefit of the choice I'd made.  I'd have been doing BRP6 even if the 'pay rate' details had been completely reversed.

I'm very happy to see FGRPB1G come along since it is very data volume friendly for me.  If a CPU task is worth 693 credits and the latest GPU tasks have 5 times the work content of a CPU task, I don't think some people are going to be happy with 1365 credits.  It would be interesting to know where the higher efficiency comes from. 5 times the work in 3.5 times the duration is an amazing gain in productivity.  People are very encouraged to contribute when their donated resources are being used to maximum efficiency.

As a compromise, I would suggest that at least 3.5 times the standard 693 - say 2425 credits - should be considered as a minimum.  If the efficiency of an app improves over time, shouldn't the credit award be maintained to reward volunteers for their continuing support? It would be quite an inducement for volunteers to consider adding a decent GPU to a CPU only host.
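
Just to spell out the arithmetic behind those figures (nothing more than multiplying the numbers quoted in this post):

```python
cpu_credit = 693      # credit per CPU task, as quoted above
work_ratio = 5.0      # "5 times the work content of a CPU task"
time_ratio = 3.5      # the "3.5 times the duration" figure above

print(work_ratio * cpu_credit)   # 3465.0 -- 5 x 693, i.e. strictly pro rata by work content
print(time_ratio * cpu_credit)   # 2425.5 -- 3.5 x 693, the "say 2425" minimum suggested above
```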

 

Cheers,
Gary.
