DCF of 1.55 means the estimated time for newcomers will be too short by a factor of 1.55. For S5R3, my C2D had a DCF of 0.14 (!), meaning that the predicted runtime for newcomers with this computer would be too long by a factor of almost 7! So it's much closer now. Is it really that different for Windows?
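In other words (a quick sketch of the arithmetic, as I understand BOINC's DCF to work - the function here is mine, not BOINC's):

    # BOINC scales the raw runtime estimate by the duration correction
    # factor (DCF): shown_estimate = raw_estimate * dcf. So DCF > 1 means
    # the raw estimate was too short; DCF < 1 means it was too long.

    def estimate_error(dcf: float) -> str:
        """Describe how far off the raw estimate is for a given DCF."""
        if dcf > 1:
            return f"too short by a factor of {dcf:.2f}"
        return f"too long by a factor of {1 / dcf:.2f}"

    print(estimate_error(1.55))  # S5R4 here: too short by a factor of 1.55
    print(estimate_error(0.14))  # S5R3 on my C2D: too long by ~7.14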
RE: DCF of 1.55 means the
Ah... Well, I suppose you could argue that the over-estimation for a new host attaching during R3 and getting the Windows 4.36/4.46 application is greater than the under-estimation of R4, but as you can see from the figures I posted for my Pentium 4, it's still a bit on the high side...
Someone did validate that the Windows "1" executable really is using SSE, right?
Of my four Windows hosts, the
Of my four Windows hosts, the two Conroes are at 1.47 and 1.52 (having been about 0.21 on S5R3 running the 4.36 app).
The Banias is at 1.95, and the Coppermine is at 3.1. I'm much less sure of the S5R3 values for these than for the Conroes, but I think they may have been about 0.25 and 0.5.
Speaking geometrically, I agree that the Conroes are very much closer to 1 than before, and the Banias somewhat closer. I think the Coppermine is no better, and probably somewhat worse.
Dare I dream that a way will be found, meeting the project's needs, to compile code for the Windows platform with efficiency near the Linux app level? Then these would become pretty good, save for the Coppermine, which is too old and too small a part of the user contribution to be worth any concern on this point.
RE: Someone did validate
Yes, [AF>Futura Sciences]click has compiled this diagram that shows relative app performance using a benchmark reference unit:
http://img151.imageshack.us/my.php?image=appevalplotv3yz8.png
CU
Bikeman
RE: RE: Someone did
The S5R4 Windows SSE app is listed as 6.02. The reference unit should be rerun on the current production release, 6.04.
RE: RE: RE: Someone did
I think we can take that as simply a typo on the part of the person drawing the plot. Bernd doesn't release applications with the same version number for different platforms, and according to the applications page, 6.02 for Linux and 6.04 for Windows were released within a second of each other.
If I'm working on a machine at a convenient time (i.e. one that's only running one Einstein task, and is finishing one task/starting another), I'll try an app_info.xml which calls 6.04_0 directly (bypassing the SSE despatcher), and see whether - as expected - the _0 version is slower on an SSE-capable host.
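For reference, roughly what I have in mind - a minimal app_info.xml sketch that points <main_program/> straight at the _0 binary instead of the stub launcher. The app name and executable filename below are illustrative guesses, so check the project directory for the real ones:

    <app_info>
        <app>
            <name>einstein_S5R4</name>
        </app>
        <file_info>
            <!-- hypothetical filename: use the actual _0 executable
                 name from your project directory -->
            <name>einstein_S5R4_6.04_windows_intelx86_0.exe</name>
            <executable/>
        </file_info>
        <app_version>
            <app_name>einstein_S5R4</app_name>
            <version_num>604</version_num>
            <file_ref>
                <file_name>einstein_S5R4_6.04_windows_intelx86_0.exe</file_name>
                <main_program/>
            </file_ref>
        </app_version>
    </app_info>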
RE: RE: RE: RE: Someo
Could be (re: typo). Needs clarification though...
As for the other: along with that testing, I haven't been motivated enough to migrate to R4 on my AMD, but perhaps the person doing the reference chart could rerun with an edited binary to get rid of the "AuthenticAMD" string and see if it matters, since they are running the tests on an AMD anyway... If not, I guess I'll get around to going exclusively to R4 sometime over the next two weeks...
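Not that I've tried this on the Einstein binaries - purely as an illustration, a few lines of Python that report whether (and where) the CPUID vendor strings appear in an executable. Whether patching them actually changes any code path is a separate question:

    # Scan an executable for the CPUID vendor strings that some
    # dispatch code compares against.
    from pathlib import Path

    def find_vendor_strings(exe_path: str) -> None:
        data = Path(exe_path).read_bytes()
        for vendor in (b"AuthenticAMD", b"GenuineIntel"):
            offset = data.find(vendor)
            if offset >= 0:
                print(f"{vendor.decode()} found at offset {offset:#x}")
            else:
                print(f"{vendor.decode()} not present")

    find_vendor_strings("einstein_S5R4_6.04_windows_intelx86.exe")  # hypothetical filename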
Mine are all over the map,
Mine are all over the map, but anything with a Core2 seems to be ~1.7, P4Ds ~2.0, and the worst are Athlon 64s at ~2.2-2.3 (though one got stuck with a ton of long units (>30% longer than the first batch I got) and is reporting 2.70 now). My poor P3-generation Celeron is at 2.83 now (the P4-generation Celeron is at 2.50).
Interesting reading about the subject though. :)
RE: RE: Someone did
OK, test completed. Host 1226365 (a Q9300, so it should have all the SSE extensions going) currently has 26 S5R4 results showing, with an average task duration of 34,223 seconds. Range is 31,447 to 38,127 seconds.
Task 104413177 was done using the _0 (presumed) non-SSE app: all I did was switch the executable reference in app_info.xml from the stub launcher to the _0 executable.
Thus de-optimised, the task took 59,306 seconds - so far outside the range of values for the _1 app (confirmed as running normally by Task Manager) that I feel no urge to repeat the experiment.
Of course, the acid test would be to run _1 on a non-SSE CPU, but since my only candidate - the 400MHz MMX Celeron - is two days into a WU, with five days to run, I'll leave that to another volunteer.
RE: Thus de-optimised, the
Wimp... ;-)
Thanks...
RE: Task 104413177 was done
With that task running sequence number 786 for frequency 940.90 Hz, it was very close to my estimated location of the third peak run time at sequence number 789.
As my current estimate of the cycle length for the 346.95 Hz work this host ran with the good application is 36, result 194425992 at sequence number 37 may serve as a rough comparison reference. It suggests that near the cycle peak, the _1 application reduces execution time on this host to about 63% of what it would otherwise have been. While the uncertainty in that estimate is substantial, I agree that equality can confidently be excluded.
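To make the arithmetic explicit (the _1 reference run time isn't quoted above, so I'm backing it out of the 63% figure):

    # Numbers quoted earlier in the thread: the _0 task took 59,306 s,
    # and the estimated _1 benefit near the cycle peak is ~63%.
    non_sse_time = 59_306      # s, task 104413177 with the _0 app
    peak_ratio = 0.63          # estimated _1/_0 run-time ratio near the peak

    implied_sse_peak = peak_ratio * non_sse_time
    print(f"implied _1 peak run time: {implied_sse_peak:,.0f} s")  # ~37,363 s
    # That sits inside the 31,447-38,127 s range reported for _1 on this host.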