tbret,
If the Fermis get a 10-1 ratio against BRP4s, then using Fermis as the baseline works for several reasons:
1. There are a ton of Fermis crunching EINSTEIN.
2. My GTX660 and HD7950 also get a 10-1 ratio.
3. Other systems closer to 7-1 will get that bonus several of us have talked about to compensate for longer runtimes.
4. Do it like the Olympics: throw out the fastest and the slowest and average what's left (a quick sketch follows this post). I have seen ratios of 10.5-1 and 7-1. If we settle on 10-1, i.e. 5,000 per task, most if not all will either be at an equivalent RAC or exceed it with the bonus.
Win-win, or what's the downside?
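A minimal sketch of that "Olympic" averaging; the ratio values below are illustrative, picked from numbers quoted in this thread:

    def olympic_average(ratios):
        # Throw out the fastest and slowest ratio, average what's left.
        trimmed = sorted(ratios)[1:-1]
        return sum(trimmed) / len(trimmed)

    reported = [10.5, 10.1, 9.9, 8.2, 7.0]  # example BRP5:BRP4 ratios
    print(olympic_average(reported))        # averages 10.1, 9.9, 8.2 -> ~9.4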
That's fine by me. I was the one that proposed the 4,000 credit setting - and I did it on the basis of one single host (albeit a pretty clean, simple one, without any complications from multiple GPUs or multiple tasks per GPU).
But I did it with a full audit trail, with clickable links to the source data - tasks and machine specifications.
Since then, we've had a number of other suggestions - all higher, none lower. But at least one of the 10-1 claims was picked apart by Ageless, with a similar degree of rigour to what I attempted in my initial post. And the answer turned out to be between 7.6-1 and 7.7-1 (message 124753).
So I'd back any change, but only if you cite the source data and give an indication of method. Scientific method.
On my host with the 2 GTX560Tis, BRP5 takes 10.11 times longer than BRP4...
Same config, using the computer as usual. The ratio was calculated by averaging the last 20 validated BRP4s and the 7 valid BRP5s I have. (The run times for the BRP5s are all within ±5% of the average, while the BRP4s are within ±20%. Using the worst BRP4 and the best BRP5 gives a ratio of 8.24, and using the best BRP4 with the worst BRP5 gives 13.33; the average of those two ratios is 10.81, and the ratio of the averaged extreme times is 10.33.)
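A sketch of that extreme-pairing check; the averages below are assumed round numbers chosen to reproduce the 10.11 ratio, not the poster's actual task times:

    brp4_avg, brp5_avg = 2_400.0, 24_264.0   # assumed: 24_264 / 2_400 = 10.11
    brp4_best, brp4_worst = brp4_avg * 0.8, brp4_avg * 1.2    # BRP4 spread: +/-20%
    brp5_best, brp5_worst = brp5_avg * 0.95, brp5_avg * 1.05  # BRP5 spread: +/-5%
    low = brp5_best / brp4_worst   # best BRP5 vs worst BRP4  -> ~8.0
    high = brp5_worst / brp4_best  # worst BRP5 vs best BRP4  -> ~13.3
    print(low, high, (low + high) / 2)  # midpoint lands near 10.6

With the real task times the poster gets 8.24 and 13.33 rather than these idealized endpoints.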
It seems that the 10:1 ratio is very consistent on Fermi GPUs...
And while I don't care about the credits in relation to other projects, with BRP5 I am going to get 20% less credit than with BRP4, so I'm seriously tempted not to crunch any more BRP5s until there are no more BRP4s...
It's quite possible that I don't understand how the run time calculations are being made by those who claim that the Perseus WUs (BRP5?) run about 10x longer than the Arecibo WUs (BRP4?). Or maybe I'm completely missing the difference between BRP4 and BRP5. Anyway, based on some recent WUs on my GTX 260 host (on which I run 2 WUs at a time), my experience is that the Perseus WUs do run about 10x longer. (No selective editing is intended to mislead.)
Task / Status / Run time (s) / CPU time (s) / Credit / Application
380633360 Completed and validated 56,188.13 12,443.14 4,000.00 Binary Radio Pulsar Search (Perseus Arm Survey) v1.33 (BRP4cuda32nv301)
380569251 Completed and validated 56,758.23 12,624.07 4,000.00 Binary Radio Pulsar Search (Perseus Arm Survey) v1.33 (BRP4cuda32nv301)
380520764 Completed and validated 5,405.68 1,276.01 500.00 Binary Radio Pulsar Search (Arecibo) v1.33 (BRP4cuda32nv301)
380502449 Completed and validated 5,680.81 794.90 500.00 Binary Radio Pulsar Search (Arecibo) v1.33 (BRP4cuda32nv301)
GPU load with the Arecibo WUs was about 75%, and with the Perseus WUs about 65%. The mix of CPU projects was the same on this host throughout (but I don't know exactly what was running alongside the Einstein WUs at any given time).
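A quick check of the ratio implied by the four tasks above, using the run times straight from the table:

    perseus = [56_188.13, 56_758.23]   # BRP5 (Perseus Arm Survey) run times, s
    arecibo = [5_405.68, 5_680.81]     # BRP4 (Arecibo) run times, s
    ratio = (sum(perseus) / len(perseus)) / (sum(arecibo) / len(arecibo))
    print(f"{ratio:.2f}")              # ~10.19, consistent with "about 10x"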
RE: But I did it with a full audit trail, with clickable links to the source data - tasks and machine specifications.
I was in error when I agreed that 8:1 (4,000/WU) would be just about parity across BRP4 and BRP5 work units.
What follows is as representative as I can muster without giving up my day job. The difficulty for me is that so many of my machines have different "versions", clocks, etc. of the same card in them.
As you noted, they tend to be almost oddly uniform if you look at the same card, etc.
Each of the below is "2 at a time", rounded, and picked apart to compare the "same device" when I know that the machines were doing only this work. In other words, my sampling attempts to control for times when other work was running, or when I was running out of work (which produces 1-at-a-time numbers). Call this "a preponderance of the evidence" rather than "all of the evidence there is."
Host / BRP5 s / BRP4 s = ratio / GPUs (OS)
Computer 6112121 23800/2250 = 10.57 Twin 560Ti Galaxy, standard (WinXP)
Computer 6101762 27900/2500 = 11.16 Triple GTX 560 standard Galaxy + 2 OC EVGA (Win7)
Computer 5891285 23300/2400 = 9.7 Dual 560Ti - NVIDIA reference (Win7)
Computer 6109868 30000/2800 = 10.7 Triple 670 - EVGA 2 SC, 1 reference (Win7)
Computer 5870046 21300/2200 = 9.68 Dual 470, EVGA standard clock (Win7)
Computer 6109845 25400/2300 = 11.04 Dual GTX 660Ti (WinXP)
My research into bus and GPU usage makes me believe the 600-series cards may improve with "3 at a time" crunching.
So what do we compare? Best case to best case? One at a time crunching? Two at a time on everything?
I reluctantly withdraw my support for 4,000/WU. It looks like 5,000 would be fairer to the greatest number of machines.
Yes, I think I can design an experiment to prove that more convincingly, but I do not think it will materially change the results.
Now I need to rejoin real life, which is already in progress.
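For reference, the straight average of the six per-host ratios listed above works out as:

    ratios = [10.57, 11.16, 9.7, 10.7, 9.68, 11.04]  # per-host BRP5:BRP4 ratios
    avg = sum(ratios) / len(ratios)
    print(avg, avg * 500)   # ~10.5 -> 5237.5 credits at BRP4 = 500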
BRP5 / BRP4 =
24,147.34 / 2,389.97 = 10.10361636
23,319.87 / 2,398.42 = 9.723013484
23,602.20 / 2,369.89 = 9.959196418
23,329.16 / 2,357.59 = 9.895342277
23,314.24 / 2,371.30 = 9.831839076
23,496.48 / 2,363.84 = 9.939962096
24,429.69 / 2,386.51 = 10.23657559
20,460.41 / 2,377.87 = 8.604511601
20,696.03 / 2,386.54 = 8.671981195
24,070.60 / 2,438.75 = 9.870056381
23,584.48 / 2,147.45 = 10.9825514
24,862.60 / 2,390.28 = 10.40154292
Two NVIDIA reference 560Ti GPUs, 2 at a time.
AMD 1090T Phenom II CPU; I would expect times to shorten with an Intel CPU.
One BRP5 averages 9.851682399 BRP4s: 9.851682399 * 500 = 4925.84.
If you pair the lists sorted largest-to-smallest against smallest-to-largest, you still get 9.861196937.
If you pair them in the opposite direction, you still get 9.837926114.
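A sketch of those pairing checks, using the twelve time pairs listed above (the exact sort pairing the poster used is an assumption):

    brp5 = [24147.34, 23319.87, 23602.20, 23329.16, 23314.24, 23496.48,
            24429.69, 20460.41, 20696.03, 24070.60, 23584.48, 24862.60]
    brp4 = [2389.97, 2398.42, 2369.89, 2357.59, 2371.30, 2363.84,
            2386.51, 2377.87, 2386.54, 2438.75, 2147.45, 2390.28]

    def mean_ratio(fives, fours):
        # Average of the per-pair BRP5/BRP4 ratios
        return sum(f / g for f, g in zip(fives, fours)) / len(fives)

    print(mean_ratio(brp5, brp4))                                # as listed: ~9.85
    print(mean_ratio(sorted(brp5), sorted(brp4, reverse=True)))  # opposed sort order
    print(mean_ratio(sorted(brp5), sorted(brp4)))                # matched sort order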
5,000 "credits" might be a tiny fraction high, but better that than to ask people to accept less credit AND delayed gratification AND the project needs less firepower on BRP4 than BRP5.
I'm going to go see if Perseus has a pulse in his Arm. Hang the credits.
RE: And the answer turned out to be between 7.6-1 and 7.7-1 (message 124753).
In my opinion it's unfortunate to point to that post. Sure, the math is correct, but it doesn't reflect the fact that there is a big variance in the runtimes of BRP4 and BRP5 WUs.
In that post the math was done with the runtimes of my finished WUs. No one right now can really explain why BRP4 runtimes range between ~1,200 s and ~6,300 s.
Likewise, BRP5 runtimes vary from ~13,000 s to ~41,000 s.
So the factor calculated in the named message doesn't reflect this.
If you take the shortest runtime of each, the factor is >10. Too bad that most of the validated results have by now been deleted from the database.
Because of my broad range of runtimes, I think the factor calculated in message 124753 is not representative.
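For illustration, the extremes of those quoted ranges give:

    brp4_min, brp4_max = 1_200, 6_300      # BRP4 runtime range quoted above, s
    brp5_min, brp5_max = 13_000, 41_000    # BRP5 runtime range quoted above, s
    print(brp5_min / brp4_min)   # ~10.8: shortest vs shortest, the ">10" factor
    print(brp5_max / brp4_max)   # ~6.5: longest vs longest, well under 8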
Just my 2 cents
Dual reference GTX 470s running 2 at a time
BRP5 / BRP4
21,275.48 / 2,212.35 = 9.616688137
21,360.70 / 2,209.62 = 9.667137336
21,358.64 / 2,210.99 = 9.66021556
21,382.37 / 2,205.16 = 9.696516353
21,326.00 / 2,207.75 = 9.659608198
21,347.48 / 2,217.87 = 9.625216987
21,336.51 / 2,201.60 = 9.691365371
average = 9.65953542 * 500 = 4829.77
Edit: These GPUs are more heavily loaded than the 560Tis. They stay in the 92%+ range, while the 560Tis are often in the 70s and 80s.
One final word:
I'm out of this discussion. A big thank you to everyone who gave me hints and tips that might solve the problem with the broad range of runtimes of my BRP WUs.
I've supported Einstein for different reasons. One of those reasons was fun, and I've come to the point where the fun ends. I'm not justifying myself any longer. This also means that I'm cancelling the already-fetched WUs at home and quitting Einstein for a while (however long that might be), or for good.
I regret that I posted here in this forum.
Good luck & happy crunching
Eric
On my system, BRP5 tasks seem to run 7.8 times longer than BRP4 tasks; CPU runtime is about 8.3 times higher. For me it seems that 4,000 credits for BRP5 is a good choice. However, I have only completed two WUs, so that is not representative.
My system is a Lenovo laptop (Win7) with an NVIDIA 610M and an Intel i5-3210M. I run two GPU tasks and four CPU tasks.
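As a last bit of arithmetic on why 4,000 looks about right for a host like this:

    brp4_credit = 500
    gpu_ratio = 7.8                   # this host's BRP5:BRP4 GPU runtime ratio
    print(brp4_credit * gpu_ratio)    # 3900.0 -> 4,000 credits is near parity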