I did not do a controlled rerun, but the first five results returned running S40.12 on my Gallatin (Northwood-descended P4 EE 8k L1, 512k L2, 2M L3 cache) are definitely slower than most recent results from the same two major datafiles.
I'll add to my previous comment that hyperthreading is turned on on this machine, and as it is running a 95% Einstein resource share, the "other job" is nearly always another Einstein.
Perhaps if other P4 users document HT vs. non-HT results we can unravel whether the S40.12 vs. S40.04 speedup/slowdown on P4s is HT related.
As to non-P4, non-hyperthreaded machines, I have initial results:
Pentium III 700 MHz, probably faster, ratio estimate roughly 97%
Pentium III 933 MHz, equal time, well within noise level
Banias (Pentium M) 1.4 GHz definitely slower, CPU time ratio about 104%
I confess I'm tempted to pull S40.12 off the Gallatin HT and Banias machines.
As to credit, I think my rate of zero credits on S40.04 was low enough that I'd get more credit with that than S40.12.
As to science accuracy, if I understand akosf right, the S40.12 "improvement" may be to degrade accuracy to more closely match the distributed science app.
As to science productivity, so long as two members of my quorum closely agree, no extra result is sent out, a canonical result is declared, and so science productivity would actually be higher until so many distinct application variations appear that a quorum is meaningfully likely to fail. So even my zero credit results are currently not an actual loss of science productivity.
I'd be happy to learn how the above assessments may be wrong, and will switch back my Banias and Gallatin tomorrow to check if the speed change is reproduced on reversal.
Initial results on my x64x2 2.6 are all on the very low end of the S40.04 range, but I don't have enough to say whether it's definitely a speedup or not. Granted, at the moment I'm getting a bunch of 48m's, and I very rarely saw them with .04, but that could just be the work I'm getting.
Pentium 4 2.8 GHz HT (Prescott), HT active
WU type 1358.5 Hz
S40.04 7514 s
S40.12 9230 s
Such an extreme increase in run time is not acceptable to me, so I am falling back to S40.04.
BTW: with S40.04 I received NO errors with "0" credits in more than 100 WUs!
Chris
Still waiting for a Prescott SSE3 optimized client.
*For technical reasons, the signature is located on the back of this post!*
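For comparison with the "CPU time ratio" percentages quoted for other machines in this thread, the two Prescott times above can be turned into the same kind of figure. This is only a sketch: the function name `time_ratio_percent` is made up for illustration, and it assumes the ratio convention is new time divided by old time, which the thread does not spell out.

```python
def time_ratio_percent(new_seconds: float, old_seconds: float) -> float:
    """Return the new run time as a percentage of the old one (assumed convention)."""
    return 100.0 * new_seconds / old_seconds


# Times quoted above for the 1358.5 Hz WU type on the Prescott with HT active.
s40_04_seconds = 7514
s40_12_seconds = 9230

ratio = time_ratio_percent(s40_12_seconds, s40_04_seconds)
extra_seconds = s40_12_seconds - s40_04_seconds

print(f"S40.12 / S40.04 = {ratio:.1f}%  (+{extra_seconds} s per WU)")
```

With only one time per version this says nothing about run-to-run noise, so it is best read as a rough indication rather than a measurement.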
Agree, definitely slower on my P4 2.8C with HT enabled.
I have found S40.12 to be more than 4 min faster than S40.03. I did not try S40.04 because I didn't get any errors from S40.03. I have been crunching one long and one short data set for the last week and the processing times have been very consistent with each other. While not a controlled test, every WU crunched by S40.12 so far has been faster than S40.03. I am running an OCed 3700+ with 1MB of L2 cache. I think the results speak for themselves.
S40.03:
z1_1357.0__1975_S4R2a_0 time: 3212s
z1_1357.0__1974_S4R2a_0 time: 3209s
z1_1357.0__1973_S4R2a_0 time: 3209s
z1_1357.0__1972_S4R2a_0 time: 3213s
S40.03/S40.12:
z1_1357.0__1971_S4R2a_0 time: 3002s
S40.12:
z1_1357.0__1970_S4R2a_0 time: 2934s
z1_1357.0__1969_S4R2a_0 time: 2935s
z1_1357.0__1968_S4R2a_0 time: 2934s
z1_1357.0__1967_S4R2a_0 time: 2931s
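The "more than 4 min" figure can be checked against the times listed above. A minimal sketch, leaving out the one WU labelled S40.03/S40.12 (presumably crunched partly by each version); the variable names are only for illustration.

```python
from statistics import mean

# Run times in seconds, copied from the list above.
s40_03 = [3212, 3209, 3209, 3213]
s40_12 = [2934, 2935, 2934, 2931]

# Average time saved per WU when going from S40.03 to S40.12.
saved_seconds = mean(s40_03) - mean(s40_12)
saved_percent = 100.0 * saved_seconds / mean(s40_03)

print(f"saved per WU: {saved_seconds:.2f} s "
      f"({saved_seconds / 60:.1f} min, {saved_percent:.1f}%)")
```

Four results per version is a small sample, but the times within each group differ by only a few seconds while the groups are more than four minutes apart.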
Like Zeigenmelker said, the odds of the time improvements being sheer coincidence are pretty low.
Once again I take my hat off to you, akosf.
There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers. - Richard Feynman
My computer: Pentium 4, 3.0 GHz, using one CPU of HT for the project.
Min. result for S40.04 = 4335 s (from more than 60 WUs)
First result for S40.12 = 4160 s
And now a new WU is starting :(
But I think 175 s faster is a good result.
Akosf, thank you.
Same with my P4 with HT on.
My other P4 without HT is faster with S40.12.
All work done so far with S40.12 is without errors.
Anders n
I've just seen a 46:17 which is my fastest large result ever. At 50+/day I don't think it's just an outlier.
I got some results from my Athlon 2400+(oc):
WU -------------------- Time ------- Albert
z1_1373.5__2276_S4R2a_1 --- 3431 --- S40.04
z1_1373.5__2275_S4R2a_2 --- 3431 --- S40.04
z1_1373.5__2274_S4R2a_2 --- 3431 --- S40.04
z1_1373.5__2273_S4R2a_0 --- 3429 --- S40.04
z1_1373.5__2272_S4R2a_0 --- 3430 --- S40.04
z1_1373.5__2271_S4R2a_0 --- 3430 --- S40.04
z1_1373.5__2270_S4R2a_0 --- 3432 --- S40.04
z1_1373.5__2269_S4R2a_0 --- 3431 --- S40.04
z1_1373.5__2268_S4R2a_0 --- 3430 --- S40.04
z1_1373.5__2267_S4R2a_0 --- 3429 --- S40.04
z1_1373.5__2266_S4R2a_0 --- 3429 --- S40.04
z1_1373.5__2265_S4R2a_0 --- 3429 --- S40.04
z1_1373.5__2264_S4R2a_0 --- 3430 --- S40.04
z1_1373.5__2263_S4R2a_0 --- 3431 --- S40.04
z1_1373.5__2262_S4R2a_0 --- 3431 --- S40.04
z1_1373.5__2261_S4R2a_0 --- 3433 --- S40.04
z1_1373.5__2260_S4R2a_0 --- 3424 --- S40.04/S40.12
z1_1373.5__2259_S4R2a_1 --- 3303 --- S40.12
z1_1373.5__2258_S4R2a_0 --- 3304 --- S40.12
z1_1373.5__2257_S4R2a_1 --- 3307 --- S40.12
z1_1373.5__2256_S4R2a_2 --- 3302 --- S40.12
z1_1373.5__2255_S4R2a_0 --- 3302 --- S40.12
z1_1373.5__2254_S4R2a_0 --- 3301 --- S40.12
z1_1373.5__2253_S4R2a_0 --- 3302 --- S40.12
z1_1373.5__2252_S4R2a_0 --- 3300 --- S40.12
z1_1373.5__2251_S4R2a_0 --- 3302 --- S40.12
z1_1373.5__2250_S4R2a_0 --- 3301 --- S40.12
z1_1373.5__2249_S4R2a_0 --- 3301 --- S40.12
z1_1373.5__2248_S4R2a_0 --- 3301 --- S40.12
z1_1373.5__2247_S4R2a_0 --- 3302 --- S40.12
z1_1373.5__2246_S4R2a_0 --- 3300 --- S40.12
z1_1373.5__2245_S4R2a_0 --- 3299 --- S40.12
z1_1373.5__2244_S4R2a_0 --- 3301 --- S40.12
To my surprise this one is faster too. :)
Great work, akos!
cu,
Michael
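To see how the gap between the two versions compares with the run-to-run scatter in the table above, here is a rough sketch. The mixed S40.04/S40.12 result is left out, and the variable names are only illustrative.

```python
from statistics import mean

# Run times in seconds, copied from the table above (16 results per version).
s40_04 = [3431, 3431, 3431, 3429, 3430, 3430, 3432, 3431,
          3430, 3429, 3429, 3429, 3430, 3431, 3431, 3433]
s40_12 = [3303, 3304, 3307, 3302, 3302, 3301, 3302, 3300,
          3302, 3301, 3301, 3301, 3302, 3300, 3299, 3301]

# Average gain per WU, and the full spread (max - min) within each version.
gain_seconds = mean(s40_04) - mean(s40_12)
spread_04 = max(s40_04) - min(s40_04)
spread_12 = max(s40_12) - min(s40_12)

# True if even the slowest S40.12 run beats the fastest S40.04 run.
no_overlap = max(s40_12) < min(s40_04)

print(f"gain: {gain_seconds:.1f} s per WU "
      f"({100.0 * gain_seconds / mean(s40_04):.2f}%)")
print(f"spread: S40.04 {spread_04} s, S40.12 {spread_12} s, "
      f"no overlap: {no_overlap}")
```

Comparing the gain with the spread is a crude stand-in for a proper significance test, but with scatter of a few seconds and a gain of about two minutes the two groups do not overlap at all.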
Wow, those results are really consistent in time. Even my dedicated crunching box, a 1.5 GHz Athlon, has noise in the ±5 minute range.