Binary Radio Pulsar Search (Perseus Arm Survey) "BRP5"

Tom*
Joined: 9 Oct 11
Posts: 54
Credit: 366729484
RAC: 0

tbret,

If the Fermi cards get a 10:1 ratio against BRP4s, then using Fermis as the baseline works for several reasons.

1. There are a ton of Fermis crunching Einstein.
2. My GTX 660 and HD 7950 also get a 10:1 ratio.
3. Other systems closer to 7:1 will get that bonus several of us have talked about to compensate for longer runtimes.
4. Do it like the Olympics: throw out the fastest and the slowest and average what's left (see the sketch below). I have seen ratios of 10.5:1 and 7:1. If we settle on 10:1, i.e. 5,000 per task, most if not all will either match or exceed their current RAC for the bonus.

Win-win, or what's the downside?
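
A minimal Python sketch of the "Olympic" averaging from point 4; the ratio list is illustrative (a few values quoted in this thread), not actual project data:

# Drop the single fastest and slowest ratio, then average the rest.
def olympic_mean(ratios):
    if len(ratios) < 3:
        raise ValueError("need at least 3 samples to trim both extremes")
    trimmed = sorted(ratios)[1:-1]  # discard the min and the max
    return sum(trimmed) / len(trimmed)

# Illustrative BRP5:BRP4 ratios mentioned in the thread.
ratios = [10.5, 10.11, 10.0, 9.7, 7.7, 7.0]
r = olympic_mean(ratios)
print(f"trimmed mean {r:.2f} -> credit at 500/BRP4: {r * 500:.0f}")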

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955283220
RAC: 719767

That's fine by me. I was the one who proposed the 4,000 credit setting - and I did it on the basis of one single host (albeit a pretty clean, simple one, without any complications from multiple GPUs or multiple tasks per GPU).

But I did it with a full audit trail, with clickable links to the source data - tasks and machine specifications.

Since then, we've had a number of other suggestions - all higher, none lower. But at least one of the 10:1 claims was picked apart by Ageless, with a similar degree of rigour to what I attempted in my initial post. And the answer turned out to be between 7.6:1 and 7.7:1 (message 124753).

So I'd back any change, but only if you cite the source data, and give an indication of method. Scientific method.

Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

On my host with the two GTX 560 Tis, a BRP5 takes 10.11 times longer than a BRP4, under the same config and with the computer in normal use. The ratio was calculated by averaging the last 20 validated BRP4s and the 7 valid BRP5s I have. (The real times for the BRP5s are all within ±5% of the average, while the BRP4s are within ±20%; using the worst BRP4 and the best BRP5 gives a ratio of 8.24, and using the best BRP4 with the worst BRP5 gives 13.33. The average of those two ratios is 10.81, and the ratio of the averaged extreme times is 10.33.)
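
To make that parenthetical arithmetic explicit, a small Python sketch; the runtime lists are hypothetical stand-ins for the 20 BRP4 and 7 BRP5 validated times, which are not reproduced in the thread:

# Hypothetical runtimes in seconds; stand-ins for the real task times.
brp4 = [2100.0, 2250.0, 2500.0, 1900.0]    # last validated BRP4s
brp5 = [22000.0, 23000.0, 21500.0]         # valid BRP5s

avg_ratio = (sum(brp5) / len(brp5)) / (sum(brp4) / len(brp4))

low  = min(brp5) / max(brp4)   # best BRP5 vs worst BRP4 (8.24 with the real data)
high = max(brp5) / min(brp4)   # worst BRP5 vs best BRP4 (13.33 with the real data)

print(f"average ratio {avg_ratio:.2f}, bounds {low:.2f} .. {high:.2f}")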

It seems that the 10:1 ratio is very consistent on Fermi GPUs...

And while I don't care about the credits in relation to other projects, with BRP5 I am going to get 20% less credit than with BRP4, so I'm seriously tempted not to crunch any more BRP5s until there are no more BRP4s...

ritterm
Joined: 18 Jun 08
Posts: 23
Credit: 46657826
RAC: 0

It's quite possible that I don't understand how the run-time calculations are being made by those who claim that the Perseus WUs (BRP5?) run about 10x longer than the Arecibo WUs (BRP4?). Or maybe I'm completely missing the difference between BRP4 and BRP5. Anyway, based on some recent WUs on my GTX 260 host (on which I run two WUs at a time), my experience is that the Perseus WUs do run about 10x longer. (No selective editing is intended to mislead.)

Task / Outcome / Run time (s) / CPU time (s) / Credit / Application

380633360 Completed and validated 56,188.13 12,443.14 4,000.00 Binary Radio Pulsar Search (Perseus Arm Survey) v1.33 (BRP4cuda32nv301)

380569251 Completed and validated 56,758.23 12,624.07 4,000.00 Binary Radio Pulsar Search (Perseus Arm Survey) v1.33 (BRP4cuda32nv301)

380520764 Completed and validated 5,405.68 1,276.01 500.00 Binary Radio Pulsar Search (Arecibo) v1.33 (BRP4cuda32nv301)

380502449 Completed and validated 5,680.81 794.90 500.00 Binary Radio Pulsar Search (Arecibo) v1.33 (BRP4cuda32nv301)

GPU load with the Arecibo WUs was about 75%, and with the Perseus WUs about 65%. The mix of CPU projects was the same on this host throughout (but I don't know exactly what was running alongside the Einstein WUs at any given time).
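
Taken at face value, the run-time column above gives much the same figure; a quick sketch using the four runtimes as pasted:

# Run times (s) copied from the four tasks above.
perseus = [56188.13, 56758.23]
arecibo = [5405.68, 5680.81]

ratio = (sum(perseus) / len(perseus)) / (sum(arecibo) / len(arecibo))
print(f"Perseus/Arecibo run-time ratio: {ratio:.2f}")  # ~10.19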

tbret
Joined: 12 Mar 05
Posts: 2115
Credit: 4862535627
RAC: 119544

Quote:

But I did it with a full audit trail, with clickable links to the source data - tasks and machine specifications.

I was in error when I agreed that 8:1 (4,000/WU) would be just about parity across BRP4 and BRP5 work units.

What follows is as representative as I can muster without giving up my day job. The difficulty for me is that so many of my machines have different "versions", clocks, etc., of the same card in them.

As you noted, they tend to be almost oddly uniform if you look at the same card, etc.

Each of the figures below is "2 at a time", rounded, and picked apart to compare the same device when I know the machines were doing only this work. In other words, my sampling attempts to control for times when other work was running, or when I was running out of work (which produces 1-at-a-time numbers). Call this "a preponderance of the evidence" rather than "all of the evidence there is."

Computer 6112121 23800/2250 = 10.57 Twin 560Ti Galaxy, standard (WinXP)

Computer 6101762 27900/2500 = 11.16 Triple GTX 560 standard Galaxy + 2 OC EVGA (Win7)

Computer 5891285 23300/2400 = 9.7 Dual 560Ti - NVIDIA reference (Win7)

Computer 6109868 30000/2800 = 10.7 Triple 670 - EVGA 2 SC, 1 reference (Win7)

Computer 5870046 21300/2200 = 9.68 Dual 470, EVGA standard clock (Win7)

Computer 6109845 25400/2300 = 11.04 Dual GTX 660Ti (WinXP)
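
For what it's worth, a plain average across these six hosts (a sketch that weights every machine equally, which may or may not be the right weighting) lands a little above 5,000:

# Per-host BRP5:BRP4 ratios from the list above.
ratios = [10.57, 11.16, 9.7, 10.7, 9.68, 11.04]
mean = sum(ratios) / len(ratios)
print(f"mean ratio {mean:.2f} -> credit at 500/BRP4: {mean * 500:.0f}")  # ~10.48 -> ~5237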

My research into bus and GPU usage makes me believe the 600-series cards may improve with "3 at a time" crunching.

So what do we compare? Best case to best case? One at a time crunching? Two at a time on everything?

I reluctantly withdraw my support for 4,000/WU. It looks like 5,000 would be fairer for the greatest number of machines.

Yes, I think I can design an experiment to prove that more convincingly, but I do not think it will materially change the results.

Now I need to rejoin real life, which is already in progress.

tbret
Joined: 12 Mar 05
Posts: 2115
Credit: 4862535627
RAC: 119544

BRP5 / BRP4 =
24,147.34 / 2,389.97 = 10.10361636
23,319.87 / 2,398.42 = 9.723013484
23,602.20 / 2,369.89 = 9.959196418
23,329.16 / 2,357.59 = 9.895342277
23,314.24 / 2,371.30 = 9.831839076
23,496.48 / 2,363.84 = 9.939962096
24,429.69 / 2,386.51 = 10.23657559
20,460.41 / 2,377.87 = 8.604511601
20,696.03 / 2,386.54 = 8.671981195
24,070.60 / 2,438.75 = 9.870056381
23,584.48 / 2,147.45 = 10.9825514
24,862.60 / 2,390.28 = 10.40154292

Two NVIDIA reference 560 Ti GPUs, 2 at a time.
AMD Phenom II 1090T CPU; I would expect times to shorten with an Intel CPU.

One BRP5 averages 9.851682399 BRP4s; * 500 credits = 4,925.84.

If you pair the BRP5 times sorted largest-to-smallest against the BRP4 times sorted smallest-to-largest, you still get 9.861196937.

If you pair them with both lists sorted in the same direction, you still get 9.837926114.
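
A sketch that reproduces all three averages from the twelve pairs above; the two sorted variants re-pair the BRP5 and BRP4 times after ordering them:

brp5 = [24147.34, 23319.87, 23602.20, 23329.16, 23314.24, 23496.48,
        24429.69, 20460.41, 20696.03, 24070.60, 23584.48, 24862.60]
brp4 = [2389.97, 2398.42, 2369.89, 2357.59, 2371.30, 2363.84,
        2386.51, 2377.87, 2386.54, 2438.75, 2147.45, 2390.28]

def mean(xs):
    return sum(xs) / len(xs)

# As-listed pairs: ~9.8517
print(mean([a / b for a, b in zip(brp5, brp4)]))

# Longest BRP5 against shortest BRP4, and so on: ~9.8612
print(mean([a / b for a, b in zip(sorted(brp5, reverse=True), sorted(brp4))]))

# Both lists sorted the same direction: ~9.8379
print(mean([a / b for a, b in zip(sorted(brp5), sorted(brp4))]))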

5,000 "credits" might be a tiny fraction high, but better that than to ask people to accept less credit AND delayed gratification AND the project needs less firepower on BRP4 than BRP5.

I'm going to go see if Perseus has a pulse in his Arm. Hang the credits.

Eric_Kaiser
Joined: 7 Oct 08
Posts: 16
Credit: 25699305
RAC: 0

Quote:
And the answer turned out to be between 7.6:1 and 7.7:1 (message 124753).


In my opinion, pointing to that post is unfortunate. Sure, the math is correct, but it doesn't reflect the fact that there is a big variance in the runtimes of both BRP4 and BRP5 WUs.
In that post, the math was done with the runtimes of my finished WUs. No one right now can really explain why BRP4 runtimes range between ~1,200 s and ~6,300 s.
Likewise, BRP5 runtimes vary from ~13,000 s to ~41,000 s.
So the factor calculated in the named message doesn't capture this.
If you take the shortest runtime of each, the factor is >10. Too bad that most of the validated results have since been deleted from the database.
Because of my broad range of runtimes, I think the factor calculated in message 124753 is not representative.
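
To see how wide that spread really is, a sketch using only the approximate endpoints quoted above:

# Approximate runtime ranges (s) quoted above.
brp4_min, brp4_max = 1200.0, 6300.0
brp5_min, brp5_max = 13000.0, 41000.0

print(brp5_min / brp4_min)   # shortest vs shortest: ~10.8 (the >10 case)
print(brp5_max / brp4_max)   # longest vs longest:   ~6.5
print(brp5_min / brp4_max)   # extreme low bound:    ~2.1
print(brp5_max / brp4_min)   # extreme high bound:   ~34.2

Any single factor drawn from inside that range is going to be arguable.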

Just my 2 cents

tbret
Joined: 12 Mar 05
Posts: 2115
Credit: 4862535627
RAC: 119544

Dual reference GTX 470s running 2 at a time

BRP5 / BRP4
21,275.48 / 2,212.35 = 9.616688137
21,360.70 / 2,209.62 = 9.667137336
21,358.64 / 2,210.99 = 9.66021556
21,382.37 / 2,205.16 = 9.696516353
21,326.00 / 2,207.75 = 9.659608198
21,347.48 / 2,217.87 = 9.625216987
21,336.51 / 2,201.60 = 9.691365371

average = 9.65953542 * 500 = 4,829.77

Edit: These GPUs are more heavily loaded than the 560 Tis. They stay in the 92%+ range, while the 560 Tis are often in the 70s and 80s.

Eric_Kaiser
Joined: 7 Oct 08
Posts: 16
Credit: 25699305
RAC: 0

One final word:
I'm out of this discussion. A big thank you to everyone who gave me hints and tips that might solve the problem with the broad range of BRP WU runtimes.
I've supported Einstein for various reasons; one of them was fun, and I've reached the point where the fun ends. I'm not justifying myself any longer. This also means that I'm aborting the WUs I've already fetched at home and quitting Einstein for a while (however long that might be), or for good.
I regret that I wrote postings in this forum.

Good luck & happy crunching

Eric

Maximilian Mieth
Joined: 4 Oct 12
Posts: 130
Credit: 10269803
RAC: 3228

On my system, BRP5 tasks seem to run 7.8 times longer than BRP4 tasks, and CPU runtime is about 8.3 times higher. To me it seems that 4,000 credits for BRP5s is a good choice. However, I have only completed two WUs, so that is not representative.
My system is a Lenovo laptop (Win7) with an NVIDIA 610M and an Intel i5-3210M. I run two GPU tasks and four CPU tasks.
