Both R and Z units come in short and long varieties. I've got all 4 variants in my queue at present. Unfortunately, most are the short (<20 min) type. I'm in danger of starvation in a day or two if I stay 100% E@H.
The wu's from a particular skygrid; are they roughly the same size? Do R & Z units come from the same skygrid?
[pre]
Processor        S39L/S39   S39L/dist.   Gain
P4 3.4 GHz HT    0.89       0.22         (-11%)
P4 1400 MHz      0.97       0.27         (- 3%)
PIII 863 MHz     0.98       0.19         (- 2%)
[/pre]
Maybe a slightly surprising result: the biggest gain is for the most advanced proc (now doing two WUs in 5900 seconds!).
Akosf, did you notice that the CPU time vs. date dependence of your optimizations looks very much like an object approaching a black hole? Exponential behavior, like time dilation or length contraction.
So GO for the singularity - zero seconds for WU is the goal! :-))
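For anyone who wants to redo the arithmetic behind the table, here is a minimal sketch. It assumes the Gain column is simply the S39L/S39 time ratio minus one, which matches the three rows; the function names are made up for illustration.

```python
# Ratios copied from the table above: (S39L/S39, S39L/dist.)
ratios = {
    "P4 3.4 GHz HT": (0.89, 0.22),
    "P4 1400 MHz": (0.97, 0.27),
    "PIII 863 MHz": (0.98, 0.19),
}


def gain_percent(ratio):
    """Change in run time implied by a new/old time ratio, in whole percent.

    Negative means the new app needs less time. Assumption: this is how the
    Gain column was derived.
    """
    return round((ratio - 1.0) * 100)


def speedup(ratio):
    """How many times faster the new app is, given new_time / old_time."""
    return 1.0 / ratio


for cpu, (vs_s39, vs_dist) in ratios.items():
    print(f"{cpu:15s} gain vs S39: {gain_percent(vs_s39):+d}%  "
          f"speedup vs dist.: {speedup(vs_dist):.1f}x")
```

The second figure per row is just the reciprocal of the S39L/dist. ratio, so a ratio of 0.22 means the WU takes a bit under a quarter of the time it took with the distributed app.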
The wu's from a particular skygrid; are they roughly the same size? Do R & Z units come from the same skygrid?
Nearly sequential units seem generally to be highly correlated in execution time to their neighbors, but I have seen gradual drift of more than 30% over fewer than a hundred sequence numbers, and I've seen outliers of 50% or more even in pretty close sequence.
Just today my Gallatin PC has been processing results near z1_1402.0__2524_S4R2a_1. While my execution time for nine of them has been in a tight band between 124 minutes and 131 minutes, I had two require 198 and 200 minutes right in the middle of the sequence.
When we post execution time comparisons in the few percent range, we are risking noise error larger than the measured difference.
On the other hand, sometimes well over a dozen wu's will have extremely close execution requirements.
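To put rough numbers on the noise argument, here is a small sketch. It uses only the band edges (124 and 131 minutes) and the two outliers (198 and 200 minutes) quoted above, since the nine individual times were not posted. The helper names and the max-minus-min spread measure are assumptions chosen for illustration, not anything the project uses.

```python
def relative_spread(times):
    """(max - min) / min for a list of run times, as a fraction."""
    lo, hi = min(times), max(times)
    return (hi - lo) / lo


def is_resolvable(claimed_diff, times):
    """True if a claimed fractional difference exceeds the observed spread."""
    return abs(claimed_diff) > relative_spread(times)


# Only the figures quoted in the post (minutes); the individual values
# inside the tight band are unknown, so just its edges are used.
tight_band = [124, 131]
with_outliers = tight_band + [198, 200]

print(f"tight band spread:    {relative_spread(tight_band):.1%}")
print(f"spread with outliers: {relative_spread(with_outliers):.1%}")

# Differences of the size reported elsewhere in this thread (2%, 3%, 11%).
for diff in (0.02, 0.03, 0.11):
    print(f"{diff:.0%} difference larger than tight-band spread? "
          f"{is_resolvable(diff, tight_band)}")
```

Even the tight band spans more than 5%, so a single-result comparison showing a 2-3% change says very little; averaging over many neighboring results is the safer way to compare.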
Ok, thanks for the answer, archae86. On the second question, to clarify: is it possible to have a r1_1074_XXX_S4Ra_(x) and a z1_1074_XXX_S4Ra_(x) from the same skygrid, with XXX being equal? Sorry about getting away from the observation detail of the thread.
Just upgraded to S39 and from the little orange target absolutely racing across the sky, I'd say about 10% improvement on S38 on my system.
Akosf, you are a genius.
Have you ever thought about offering your services to other projects? Not that I don't appreciate all you have done for Einstein, but a 75% decrease in model crunch times at, say, CPDN, like you have achieved here, now that would be a godly effort. :)
I've got to say, this is amazing; akosf, well done!
Perhaps my two or three results so far have been flukes, but I have gone from 5 hours (18900 seconds) to under 2 hours a WU (in most cases 1 hr 20 min). Dell XPS 400, Pentium D 2.8 GHz ...
The two returned so far are of the same batch recently downloaded. I will install on my other two computers and hopefully give you a better assessment of speed gain on the P3 mobile and AMD.
Akosf, I cannot stress enough how much this seems to have improved my speed.
-Mike Molzahn
"The most incomprehensible thing about the universe is that it is comprehensible"
-Albert Einstein
[pre]
P4 3.4 GHz HT    0.89       0.22         (-11%)
P4 1400 MHz      0.97       0.27         (- 3%)
[/pre]
Maybe a slightly surprising result: the biggest gain is for the most advanced proc
Do you know the L1 cache sizes of your two P4s? Willamette/Northwood/Gallatin have an 8 kbyte L1 data cache, whereas Prescott and its derivatives have a 16 kbyte L1 data cache.
Have you ever thought about offering your services to other projects? Not that I don't appreciate all you have done for Einstein, but a 75% decrease in model crunch times at, say, CPDN, like you have achieved here, now that would be a godly effort. :)
Well, I examined some project applications some months ago. But my favourite is Einstein@Home.
The CPDN code is worthless. (just my opinion)
I think a better compiler would be able to increase its speed by about 30-50%. (careful estimate)
Hand-optimizing it is a bit hopeless because of the very big code and the very slow feedback. (validation)
SETI has some very well optimised code.
I spent an afternoon optimising it, and I could get just a 10% improvement. (Crunch3r's Athlon XP SSE 2.09)
The SZTAKI code is very slow.
I tried to optimise it (11th dimension) and my code was about 5 times faster, but I found a security problem, like in SETI@home classic. My proposal was deleted from the message board.
PrimeGrid: I spent just a few seconds optimizing it and got a ~250% improvement. But there are much faster algorithms (my binary-based searcher was about 1000 times faster, which means that my Duron would be able to find the solution in about 1 billion years).
So, this project is a planet heater... (just my opinion)
Famous Akosf results with S39L, per WU:
original      S39        S39L
8-10 hours    2h07min    1h27min
*******************************************
PC: Pentium 4, 2.6 GHz, 1 GB RAM, Windows XP Home, Service Pack 2
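As a rough sketch of what those three times mean as speedup factors, the snippet below parses the durations exactly as posted and compares them. The `to_minutes` helper is hypothetical and only handles the two notations that appear here ("8 hours" and "2h07min").

```python
import re


def to_minutes(text):
    """Parse strings like '2h07min' or '8 hours' into whole minutes."""
    m = re.fullmatch(r"\s*(\d+)\s*h(?:ours?)?\s*(?:(\d+)\s*min)?\s*", text)
    if not m:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = int(m.group(1)), int(m.group(2) or 0)
    return hours * 60 + minutes


# Times as posted; the original app is given as a range of 8-10 hours.
original_low, original_high = to_minutes("8 hours"), to_minutes("10 hours")
s39 = to_minutes("2h07min")
s39l = to_minutes("1h27min")

print(f"S39L vs original: {original_low / s39l:.1f}x to "
      f"{original_high / s39l:.1f}x faster")
print(f"S39L vs S39: {1 - s39l / s39:.0%} less time per WU")
```

Keep in mind these are single-machine figures, so run-to-run variation between WUs still applies.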
BOINC.SK team
Now we are cooking with gas.
Hey all,
On my 3.4 Prescott (HT enabled) 2 units of S39L finished 2 minutes faster than 2 units of S39.
me-[at]-rescam.org
:-)
Akosf, excellent work !!!
seti_britta