RE: What is this "r"
It's a flaw. I made a mistake while hex-editing, but I will correct it.
Not.
OK. I have found the reason of the bad U41 results.
It's my fault. :-(
So, one of my computers is crunching with the latest U41.xx version.
Version: U41.03
Processor: Sempron E6 1600MHz (strongly believes he is Opteron 842)
I will put it to the download thread after 10-15 valid results.
It means about 3-4 days.
edit: This code needs an SSE3-compatible processor, because it uses one SSE3 instruction (FISTTP, to be exact).
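For readers who don't know FISTTP: it stores a floating-point value as an integer by truncating toward zero, no matter which rounding mode the FPU is set to, while the older FIST/FISTP follow the current rounding mode (round to nearest, ties to even, by default). Below is a minimal sketch of that difference in plain Python. The function names are made up for illustration, this is not the U41.xx code, and the integer range limits of the real instructions are ignored.

```python
import math


def fisttp_like(x: float) -> int:
    """Truncate toward zero, like FISTTP (hypothetical helper name)."""
    return math.trunc(x)


def fistp_default_like(x: float) -> int:
    """Round to nearest, ties to even, like FIST/FISTP in the default
    x87 rounding mode (hypothetical helper name)."""
    return round(x)


if __name__ == "__main__":
    # Compare both conversions on a few values, including exact ties.
    for value in (2.5, 2.7, -2.5, -2.7, 3.5):
        print(value, fisttp_like(value), fistp_default_like(value))
```

The two only disagree when the fractional part is above one half, or exactly one half next to an odd integer, which is why switching conversion instructions can change results in subtle ways.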
RE: So, one of my computer
I don't get very many pendings because of a big cache, so the first result will be here in about 20 min.
cu,
Michael
[edit]
VALID! :-)))
I got my first valid result with a U-albert. :)
It seems to be faster, but it's a mixed WU, so more precise information will follow later.
[/edit]
VALID, but mixed!
It looks really good, but further investigation must wait because the bread in the oven is crunching really fast.
I think it'd be better to work on an SSE2 version until SSE3 is more widely used, unless there isn't much benefit from SSE2 over SSE(1).
What optimized version should I use for my SSE2 Pentium M (dothan) notebook?
RE: i think it'd be better
That sounds good. Could you tell me more about your idea?
I'm thinking about SSE2(3) / 3DNow! overlapped code, but it needs some new programming techniques to eliminate data conversions.
e.g.: I'm testing a new rounding method.
RE: What optimized version
I think S41.06 is the fastest version for Dothan at the moment.
Here are the first comparable results:
z1_1188.0__1107 --- 1,994.08 sec --- S41.06
z1_1188.0__1108 --- 1,979.03 sec --- S41.06
z1_1188.0__1109 --- 1,981.02 sec --- S41.06
z1_1188.0__1110 --- 1,981.75 sec --- S41.06
z1_1188.0__1111 --- 1,976.23 sec --- S41.06
z1_1188.0__1106 --- 1,847.13 sec --- U41.02
z1_1188.0__1105 --- 1,852.06 sec --- U41.02
Other results:
z1_1322.0__1721 --- 1,813.16 sec --- U41.02
z1_1322.0__1720 --- 1,811.72 sec --- U41.02
z1_1322.0__1719 --- 1,813.22 sec --- U41.02
z1_1322.0__1718 --- 1,812.44 sec --- U41.02
z1_1322.0__1717 --- 1,814.41 sec --- U41.02
z1_1322.0__1716 --- 1,811.67 sec --- U41.02
A few results are pending, all others are valid. :-)
Congrats to akos!
cu,
Michael
[edit]All results are from my A64 X2 (Toledo core).[/edit]
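To put a number on the comparable z1_1188.0 results above, here is a small sketch that averages the run times per version and derives the speedup. The timings are copied from the list; the function and variable names are just for illustration. With only two U41.02 results the sample is tiny, so treat the figure as a rough indication.

```python
from statistics import mean

# CPU times in seconds for the z1_1188.0 results listed in the post.
s41_06 = [1994.08, 1979.03, 1981.02, 1981.75, 1976.23]
u41_02 = [1847.13, 1852.06]


def speedup(baseline, candidate):
    """Ratio of mean run times; values above 1 mean the candidate is faster."""
    return mean(baseline) / mean(candidate)


def time_saved_percent(baseline, candidate):
    """Reduction in mean run time relative to the baseline, in percent."""
    return 100.0 * (1.0 - mean(candidate) / mean(baseline))


if __name__ == "__main__":
    print(f"speedup: {speedup(s41_06, u41_02):.3f}x")
    print(f"time saved: {time_saved_percent(s41_06, u41_02):.1f} %")
```

On these numbers U41.02 needs roughly 6.7 % less time per result than S41.06 on the same series of work units.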