Akos, looks like you solved the invalid result problem nice and quick!
Excellent work.
There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers. - Richard Feynman
It would seem that Intel's SSE3 library takes to these opts very poorly compared to the improvements the AMDs are reporting. Is it possible, akosf, that something in the setup of the Intel is missing that AMD has and needs to be compensated for?
I have often found that AMD's floating-point unit is more flexible than Intel's. I hope that will change with the 'Core'.
I will probably tune S41.07 on a Northwood to get higher speed. -> S41.07i
(Sorry, I don't have an SSE3-capable Intel processor, but I will buy a Conroe.)
Thank you, akosf. My 2.8 Northwood is the slowest compared to everything else I have: it takes 8-9k secs in HT mode using S39L. My P3 using S41.06 does them in 11k secs. So it would appear there is much room for improvement for SSE2. Thanks again, akosf, for all you do.
report on U41.02/03/04:
valid:17
invalid:0
U41.02/.03/.04 (more done with .02/.03 than .04)
valid: 51
invalid: 0
My result with a "Pentium 4 - 630":
U41.04 = 9,838.84 sec
S40 = 8,634.41 sec
Best version is S40 ;)
I'm waiting for a new version... :/
Sorry for the double post... lag :/
Stats for my Intel P4 3.0 GHz Prescott (with SSE3), Hyper-Threaded:
S41.06 => 5273s (mean of 12 results) => 1.00
U41.04 => 4868s (mean of 4 results so far) => 0.93 (7% gain)
S41.07 => To test soon!
One thread on Einstein and the other on Seti to reduce overhead and bottlenecks... it works!! All valid results!
Should I also try S39L, in your opinion?
Thanks for your valuable work, Akos!
And see you later for more stats!
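For anyone who wants to normalise their own timings the same way, here is a minimal sketch of the relative-runtime calculation used in the stats above. The function name is made up for illustration, and the inputs are simply the two mean times quoted in the post. With those figures the ratio works out to roughly 0.92, close to the 0.93 quoted, so the posted value may have been rounded differently or based on unrounded means.

```python
def relative_runtime(baseline_s, candidate_s):
    """Runtime of the candidate app as a fraction of the baseline runtime."""
    return candidate_s / baseline_s

# Mean times quoted in the post (seconds per workunit).
s41_06 = 5273  # baseline, defined as 1.00
u41_04 = 4868

ratio = relative_runtime(s41_06, u41_04)
time_saved_pct = (1 - ratio) * 100

print(f"U41.04 relative runtime: {ratio:.3f}")
print(f"Time saved per workunit: {time_saved_pct:.1f}%")
```

A ratio below 1.00 means the candidate finishes a workunit in less time than the baseline.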
So, I did a speed test on my Pentium4-M Northwood 1.8 GHz with the same workunit. The result didn't surprise me.
S39L: 7493 sec
S40.4: 6675 sec
S41.07: 5351 sec (~40% faster than S39L)
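A quick sketch of how the "~40% faster" figure can be reproduced from the three times above. "Faster" is taken here to mean throughput (old time divided by new time), which is an assumption about how the poster computed it; the helper names are illustrative only. Note that the same pair of numbers gives a smaller percentage when expressed as time saved per workunit.

```python
def speedup(old_s, new_s):
    """Throughput speedup: how many times more work is done per unit time."""
    return old_s / new_s

def time_saved_pct(old_s, new_s):
    """Percentage reduction in runtime per workunit."""
    return (1 - new_s / old_s) * 100

# Times from the post, same workunit on the Pentium4-M Northwood (seconds).
times = {"S39L": 7493, "S40.4": 6675, "S41.07": 5351}
baseline = times["S39L"]

for app, secs in times.items():
    faster_pct = (speedup(baseline, secs) - 1) * 100
    saved_pct = time_saved_pct(baseline, secs)
    print(f"{app}: {faster_pct:.1f}% faster, {saved_pct:.1f}% less time")
```

The two percentages answer different questions, so it helps to say which one is meant when comparing results between hosts.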
It would be really good to clarify the speedups (or speeddowns) on Prescotts. Thanks.
Hi akosf
As stated earlier by others, and tested by me, it's an HT penalty. Running HT on my Prescott 3.0 GHz, I found that S39L and S40.04 were the fastest ones. ALL other bins were slower.
So Mr. Pernod suggested that I update my einstein prefs to say "use only 1 cpu" and in that way allocate all resources (cpu cycles, L1 and L2 cache) to a single exe. I did that and at once I had amazing speedups. Later I upgraded from S40.04 to S41.06 and had even greater speedups.
Running S40.04 in HT mode, a wu took approx. 2.30 hours. After "disabling" HT mode, S41.06 took 55 mins.
Since two wus run at the same time in HT mode, this means that in HT mode a wu effectively took approx. 1.15 hours, now 55 mins.
Hope this clarifies my situation.
BTW, I'm currently testing 41.04 on this Prescott, and it seems faster than S41.06 in this configuration (a single exe running).
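The HT comparison above can be sketched as a small calculation. Two assumptions are made here that the post does not state outright: "2.30 hours" is read as 2 h 30 min, and HT mode is taken to run two workunits concurrently. The function name is hypothetical.

```python
def effective_minutes_per_wu(wall_minutes, concurrent_wus):
    """Wall-clock minutes divided by the number of workunits finished in that time."""
    return wall_minutes / concurrent_wus

# Assumption: "2.30 hours" means 2 h 30 min, with two workunits running at once under HT.
ht_mode = effective_minutes_per_wu(150, 2)

# Single exe ("use only 1 cpu"), S41.06, as reported in the post.
single_exe = effective_minutes_per_wu(55, 1)

print(f"HT mode:    {ht_mode:.0f} min per wu")
print(f"Single exe: {single_exe:.0f} min per wu")
print(f"Throughput gain: {(ht_mode / single_exe - 1) * 100:.0f}%")
```

Keep in mind that the two runs also used different app versions (S40.04 vs S41.06), so the gain is not due to the HT setting alone.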
It would be a pleasure!! ;-)
The wu running at the moment is being crunched with U41.04. Next I'll try a run of at least 5 wus with S41.07. This evening I'll post the results!