Well, it is the way it is supposed to work. You use less CPU time, therefore you should get less credit...
- Peter
RE: Well, it is the way it
No, I think one should get roughly equal credits per WU; otherwise my Celeron 433 would claim many more credits than 44.x/WU after crunching for more than 117 ksec.
So if a host suddenly needs half the time to finish a WU, it did twice the work and should get twice as many credits per hour as before.
There shouldn't be a difference between overclocking to double the clock speed and an optimized application that produces the same result.
Anything else would be really senseless. Or do you think 3 slower hosts that together crunch 30 WUs/day should claim more credit than 1 faster host that does the same amount of work alone?
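To put rough numbers on it (the 30 WUs/day and the ~44 credits/WU figures come from the posts above; everything else is made up purely for illustration):

# Illustrative only: credit should follow the work done, not the time spent.
credits_per_wu = 44.0        # roughly the per-WU claim mentioned above
wus_per_day_total = 30

# Three slow hosts splitting the load vs. one fast host doing it all:
slow_hosts = 3
slow_credit_per_day = slow_hosts * (wus_per_day_total / slow_hosts) * credits_per_wu
fast_credit_per_day = wus_per_day_total * credits_per_wu

print(slow_credit_per_day)   # 1320.0
print(fast_credit_per_day)   # 1320.0 -> same work done, same total credit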
cu,
Michael
This is probably the most
This is probably the most difficult issue Dr. Anderson and the BOINC development team have to address with the BOINC framework itself.
The problem is that not only do you have to consider how different hosts compare to each other on a given project, you also have to consider how different projects compare to each other.
To be a fair and equitable framework for multiple projects, one must start with the premise that the value of the work being done for all projects is the same. IOW, X hours of crunching for EAH should be worth the same from a credit viewpoint as crunching X hours for ANY other project, for any given host. Otherwise, from a participant's viewpoint, it wouldn't make any sense to crunch for any project except the one that "paid" the most.
Then comes the problem of different hosts. For any given WU, it stands to reason that all hosts have to perform the same amount of "effort" to complete it, regardless of the actual time it takes to do that. IOW, the intrinsic value of a given WU is constant.
Finally, you need to come up with a method to ensure credit equivalency between the various projects.
I don't intend to get into the debate about whether the real point of BOINC is to contribute to science or not. It should be pretty obvious the various projects themselves are in it for the science, but the fact is that BOINC itself is all about the credits.
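As a rough sketch of that premise, here is an illustrative benchmark-times-time credit claim, in the spirit of the Cobblestone idea of about 100 credits per day of work on a reference host benchmarking at 1 GFLOPS Whetstone and 1 GIPS Dhrystone. This is only a sketch, not the actual BOINC server code:

SECONDS_PER_DAY = 86400.0
CREDITS_PER_REFERENCE_DAY = 100.0   # the "Cobblestone" rate (illustrative)

def claimed_credit(cpu_seconds, whetstone_flops, dhrystone_ips):
    # The claim scales with benchmarked speed times CPU time, so a host that
    # is twice as fast and finishes in half the time claims the same credit,
    # regardless of which project the WU came from.
    avg_benchmark = (whetstone_flops + dhrystone_ips) / 2.0
    return CREDITS_PER_REFERENCE_DAY * (cpu_seconds / SECONDS_PER_DAY) * (avg_benchmark / 1e9)

print(claimed_credit(86400, 1e9, 1e9))   # 100.0 (reference host, one full day)
print(claimed_credit(43200, 2e9, 2e9))   # 100.0 (twice as fast, half the time)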
Alinator
Did a complete test on a
Did a complete test on a second Prescott P4: http://einsteinathome.org/host/570081/tasks
==> C37 is about 5 percent faster than A36 on this computer.
==> S37a is a blink of an eye faster than C37 (1 percent; maybe within measurement tolerance)
All in all the optimized apps double the output of this machine. :)
Again: Great work akosf :))
CU HiNuN
RE: Did a complete test on
I got slightly different results with my 3.4 GHz Prescott (with HT enabled).
C37 took an average of 3:03 per unit
S37a took an average of 3:12 per unit
me-[at]-rescam.org
RE: Did a complete test on
I have had the same experience, but on my AMD Athlon64. Last night I tried S37a, and the difference between S37a and C37 is 1.4% for short WUs and 2.1% for long WUs.
Results of my Athlon are here
RE: RE: Did a complete
Hmm. Maybe an effect of using HT or not? In my test the P4 processed the work with HT off.
CU HiNuN
Comparison of A36 and C37 on
Comparison of A36 and C37 on my Pentium Xeon (Prestonia)
A36:
Min time: 8416.38 sec
Max time: 11855.38 sec
Avg time: 10548.23 sec
No of results: 33
C37:
Min time: 9421.44 sec
Max time: 10636.72 sec
Avg time: 10092.57 sec
No of results: 13
All of you can make your own conclusions.
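For what it's worth, a quick calculation from the averages above (ignoring the spread and the much smaller C37 sample):

a36_avg = 10548.23   # sec, average over 33 results
c37_avg = 10092.57   # sec, average over 13 results
print((a36_avg - c37_avg) / a36_avg * 100)   # ~4.3 -> C37 averages ~4.3% less CPU time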
RE: I got slightly
I've observed similar differences on a P4 Northwood 2.4 GHz (not HT capable), as I already mentioned.
S37a is 9% slower than C37.
Prescotts have a larger cache and a 31-stage pipeline (21 stages in Northwood), so it's possible that Prescotts perform better at executing SSE2 on multiple data with low latency.
But perhaps the use of HT reduces this advantage.
These are only my own suppositions, but I think it makes sense...
RE: RE: I got slightly
One observation that no one seems to mention: on my HT machine, if 2 Albert WUs are running simultaneously, each runs (sometimes significantly) longer than when 1 Albert is running alongside a different WU like Rosetta. Whether it is resource contention or whatever, it does affect the run time on my machine.
RE: One observation that
It's normal for WUs to take longer when HT is enabled, but since you're doing 2 simultaneously it is actually quicker overall. Example (not real times, just an example):
with HT disabled you might do 1 WU in 1 hour, but
with HT enabled you should see something like 2 done simultaneously in 1h:20m
It might also depend very slightly on using Trux's client with processor affinity enabled, which can sometimes give a tiny fraction of extra speed.
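In other words, each WU takes longer but throughput still goes up. With the example times above (illustrative numbers, not measurements):

wus_per_hour_ht_off = 1 / 1.0          # 1 WU per hour, one at a time
wus_per_hour_ht_on  = 2 / (4.0 / 3.0)  # 2 WUs finishing together in 1h:20m
print(wus_per_hour_ht_off)   # 1.0
print(wus_per_hour_ht_on)    # 1.5 -> roughly 50% more WUs per hour overall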
[EDIT]
Or did you mean:
1 Einstein and 1 Rosetta - both normal
2 Einstein units - the Einstein units take longer than in the above situation
I think I read in another project's thread that it is something like what you said: with HT the two threads have to share the cache or some other resource, whereas a dual core has its own for each core.
[EDIT]