Windows S5R3 "power users" App 4.26 available

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4355

Credit: 254087477

RAC: 36303

21 Jan 2008 17:42:44 UTC

Topic 193453

From the 4.25 App Thread:

Quote:

I'll try to build an App with the old Visual Studio of 2003 (instead of VS2005). At least the /G7 optimization should work there. Let's see if it helps...

It can be found on the Power User's Apps page. This is definitely not a release candidate, just something to see in which direction to proceed.

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

Windows S5R3 "power users" App 4.26 available

21 Jan 2008 17:58:22 UTC

Message 77479

Quote:

From the 4.25 App Thread:
Quote:
I'll try to build an App with the old Visual Studio of 2003 (instead of VS2005). At least the /G7 optimization should work there. Let's see if it helps...

It can be found on the Power User's Apps page. This is definitely not a release candidate, just something to see in which direction to proceed.

BM

Do you feel it is ok to switch with a result in progress? I just fired up the last one I had due to being away from the computer for 8-10 hours later on today...

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

RE: RE: From the 4.25 App

21 Jan 2008 18:01:30 UTC

Message 77480 in response to message 77479

Quote:

Quote:
From the 4.25 App Thread:
Quote:
I'll try to build an App with the old Visual Studio of 2003 (instead of VS2005). At least the /G7 optimization should work there. Let's see if it helps...

It can be found on the Power User's Apps page. This is definitely not a release candidate, just something to see in which direction to proceed.

BM

Do you feel it is ok to switch with a result in progress? I just fired up the last one I had due to being away from the computer for 8-10 hours later on today...

Eh, no pain no gain... I'm going to try it and post the results...

Edit: It has restarted with the new application without crashing, so that's a good sign, I guess...

Edit2: The "AuthenticAMD" string is back in the app. Does this mean that AMD processors may be at a disadvantage in certain segments of code?

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

Preliminary mixed-result

21 Jan 2008 18:56:30 UTC

Message 77481

Preliminary mixed-result performance seems to be nothing short of amazing, but as others have pointed out, it is hard to get a feel for the actual performance without doing some sampling.

My current estimated runtime for h1_0712.50_S5R2__37_S5R3a is only 33,000 seconds

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 3068858528

RAC: 2052680

Switched two XP machines a

21 Jan 2008 20:36:50 UTC

Message 77482

Switched two XP machines a couple of hours ago - running smoothly, but haven't been monitoring the speed.

More to the point, I've now switched the Vista32 box. Task 91267641 is mixed-mode (first 12% with 4.25, remainder with 4.26): anything later on host 831490 will be pure 4.26. This is the machine where I first reported the SETI optimised incompatibility with Vista, and did subsequent testing of what turned into viable apps. NB Vista didn't trash every WU, but when it did fail, it happened at the beginning of a run - so the current one is going to be OK (touch wood).

Svenie25

Joined: 21 Mar 05

Posts: 139

Credit: 2436862

RAC: 0

I just deleted the -lines

21 Jan 2008 20:41:16 UTC

Message 77483

I just deleted the -lines from the app_info, just as was said in the Linux thread. Now the error message I got has gone away. So let's see what happens. ;)

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

RE: Preliminary

21 Jan 2008 21:16:04 UTC

Message 77484 in response to message 77481

Quote:

Preliminary mixed-result performance seems to be nothing short of amazing, but as others have pointed out, it is hard to get a feel for the actual performance without doing some sampling.

My current estimated runtime for h1_0712.50_S5R2__37_S5R3a is only 33,000 seconds

Up to 34,000 now, but I've seen this behavior before, where it runs slower during the middle portions of the result than it does at the beginning and end...

...that and I don't have some pow(x)/log(y)^2 - |(3.14159 - atan(x))| formula to guide me...

BTW, I still give props to you math nerds... ;-)

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 840071679

RAC: 965214

Hi! In theory (from

21 Jan 2008 22:42:57 UTC

Message 77485

Hi!

In theory (from disassembly, profiling, and early extrapolation of runtime), this version should be about 15 to 20% faster than the previous one (for workunits around the minimum runtime within a frequency range). There's only one store-forwarding stall left in the critical code (in the part that converts a double to a 64-bit int), and I think this one could be eliminated in future versions as well. Looks good to me. I don't think the AMD punishment stuff does any harm in this app, but I will try later to replace "AuthenticAMD" with "GenuineIntel" and see what happens to performance :-)
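For anyone who wants to repeat the vendor-string experiment, here is a minimal sketch of how the swap could be done on a copy of the executable. It is only an illustration: the function name and the file names are made up, and it assumes the string sits in the binary as plain ASCII. The two strings are both 12 bytes long, so the file size and all offsets stay the same. Whether the swap changes anything depends on how the app actually uses the string, which this sketch does not know.

```python
from pathlib import Path

def swap_vendor_string(src, dst, old=b"AuthenticAMD", new=b"GenuineIntel"):
    """Write a copy of src to dst with every occurrence of old replaced by new.

    Hypothetical helper for illustration. The replacement must have the
    same length as the original so that no offsets in the file shift.
    Returns the number of occurrences that were replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the original")
    data = Path(src).read_bytes()
    count = data.count(old)
    # The original file is left untouched; only the copy is modified.
    Path(dst).write_bytes(data.replace(old, new))
    return count

# Example call (file names are placeholders, not the real app names):
# swap_vendor_string("app_original.exe", "app_patched.exe")
```

Working on a copy keeps the unmodified app available for the comparison run.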

The output of the compiler looks so much better when compared to that of the newer (!) compiler version that I wonder what has happened to the MS compiler. Did they completely change the underlying compiler engine?? It is rather radical that MS dropped the CPU specific optimization switches in the newer compiler version.

CU
Bikeman

archae86

Joined: 6 Dec 05

Posts: 3165

Credit: 7415721687

RAC: 1860223

RE: this version should be

21 Jan 2008 23:26:02 UTC

Message 77486 in response to message 77485

Quote:

this version should be about 15% faster than the previous one...

Is that a comparison to 4.25, or to 4.15?

My in-process results look pretty clearly faster than 4.15, so a fortiori faster than 4.25. I won't guess by how much; some real answers will be available in a few hours.

I do have a completion and validation on one mixed-ap result. It started on 4.15 for somewhat less than 2 CPU hours, then finished on 4.26. The total time, 25,563 seconds, is quite plainly faster than expected for this host on 4.15.

I should be able to post a pure 4.26 result, with an attempted speedup estimate that accounts for the periodicity effect, within about two hours.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 840071679

RAC: 965214

RE: RE: this version

21 Jan 2008 23:32:53 UTC

Message 77487 in response to message 77486

Quote:

Quote:
this version should be about 15% faster than the previous one...

Is that a comparison to 4.25, or to 4.15?

Compared to 4.25, on a Core 2. Will know more tomorrow. I happen to have some workunits that are quite close to the minimum runtime per frequency, where the slope of the runtime variation is small, so it's quite possible to make comparisons by comparing runtimes from consecutive results.

CU
Bikeman

EDIT: From what I read here, it seems that the poor performance of the VS 2005 compiler compared to VS 2003 in this particular case (generating code full of store-forwarding stalls) might be related to a bug acknowledged by Microsoft and fixed only in Visual Studio 2008.

BRM

archae86

Joined: 6 Dec 05

Posts: 3165

Credit: 7415721687

RAC: 1860223

My first pure 4.26 result is

22 Jan 2008 1:12:21 UTC

Message 77488

My first pure 4.26 result is complete, but awaits quorum partner return for validation.

The execution time is very encouraging indeed:

23960 seconds, which is 86% of the value I'd expect for this host using the 4.15 ap for sequence number 69 at frequency 719.80.

As I lack samples from nearby sequence numbers, I've relied on the cycle period estimate to choose a comparable number from the next cycle higher. Plausible errors in that estimate and random variation from activity on the host put a little uncertainty on this number, but it is a big speedup beyond any doubt. About 27000 CPU seconds was the minimum for two higher cycles on this host, and sequence number 69 is not at a cycle minimum, nor even close.
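For reference, the arithmetic behind those figures can be sketched as follows. This is only a rough check: the 86% ratio is a rounded value, so the implied 4.15 baseline is approximate, and the function names are invented for illustration.

```python
def expected_runtime(observed_s, ratio):
    """Old-app runtime implied when observed_s is `ratio` of it."""
    return observed_s / ratio

def throughput_gain(ratio):
    """Fractional increase in results per unit time for a runtime ratio."""
    return 1.0 / ratio - 1.0

observed = 23960   # seconds, pure 4.26 result
ratio = 0.86       # 4.26 runtime as a fraction of the expected 4.15 runtime

baseline = expected_runtime(observed, ratio)
print(f"implied 4.15 runtime: about {baseline:.0f} s")
print(f"runtime saved: {100 * (1 - ratio):.0f}%")
print(f"throughput gain: {100 * throughput_gain(ratio):.1f}%")
```

Note that a 14% shorter runtime corresponds to roughly 16% more work done per unit time; the two ways of quoting the speedup are not interchangeable.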

On the secondary indicator of power, I again forgot to get comparative readings, but the indirect die temperature indicator strongly hinted that stalling was much less prevalent on 4.26 than on 4.25. 4.26 matched 4.15 die temperatures on my Q6600 closely, while 4.25 ran appreciably cooler (3 or 4 degrees C).

My Q6600 has completed two mixed ap 4.15/4.26 results. Both are clearly faster than 4.15 expectation, but validation awaits quorum partner returns.
