Why the short due date with long workunits?

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 690776380

RAC: 271517

RE: I think all PI-MMX's

25 May 2007 23:43:47 UTC

Message 67006 in response to message 67004

Quote:

I think all PI-MMXs are going to be hard pressed to handle any but the lowest template-frequency WUs at this point, based on what my K6s have done, even taking into account the stronger FPUs in them. Assuming the project team can roughly halve the runtime once they optimize, that should open them up to a wider range of frequencies, but if the deadline stays at two weeks, EAH will be a very tight-deadline project for them.

The expected optimization may halve the runtime per workunit, but only for CPUs that support at least SSE, so anything older than a P III or Athlon XP won't benefit from the SSE code paths. However, the current Windows app runs quite slowly on those clients compared to the Linux app, so a speedup of roughly 30% might be possible for those older CPUs as well under Windows.
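To put rough numbers on that, here is a minimal sketch of the deadline arithmetic. Everything in it is an assumption for illustration: the 400 CPU-hour workunit, the hours crunched per day, and the function names are hypothetical and not taken from any project code. It reads "30% faster" as dividing the runtime by 1.3 and "halving the runtime" as a speedup of 2.0.

```python
DEADLINE_DAYS = 14  # the two-week deadline discussed in this thread


def days_to_finish(cpu_hours, hours_per_day=24.0, speedup=1.0):
    """Wall-clock days needed for one workunit.

    cpu_hours     -- current runtime of the workunit on this host (hypothetical)
    hours_per_day -- how many hours per day the host actually crunches
    speedup       -- 1.0 = today's app, 1.3 = about 30% faster, 2.0 = runtime halved
    """
    return cpu_hours / speedup / hours_per_day


def meets_deadline(cpu_hours, hours_per_day=24.0, speedup=1.0,
                   deadline_days=DEADLINE_DAYS):
    """True if the workunit would finish within the deadline."""
    return days_to_finish(cpu_hours, hours_per_day, speedup) <= deadline_days


if __name__ == "__main__":
    # Hypothetical old host: 400 CPU-hours per workunit with the current app.
    for label, speedup in [("current app", 1.0),
                           ("~30% faster", 1.3),
                           ("runtime halved", 2.0)]:
        days = days_to_finish(400, speedup=speedup)
        verdict = "makes it" if meets_deadline(400, speedup=speedup) else "misses"
        print(f"{label}: {days:.1f} days, {verdict}")
```

Under these made-up numbers, a host running around the clock would go from missing the deadline to meeting it with even the smaller speedup, but a host that only crunches part of the day could still miss it with the runtime halved.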

BRM

Alinator

Joined: 8 May 05

Posts: 927

Credit: 9352143

RAC: 0

RE: RE: I think all

28 May 2007 15:02:26 UTC

Message 67007 in response to message 67006

Quote:

Quote:
I think all PI-MMXs are going to be hard pressed to handle any but the lowest template-frequency WUs at this point, based on what my K6s have done, even taking into account the stronger FPUs in them. Assuming the project team can roughly halve the runtime once they optimize, that should open them up to a wider range of frequencies, but if the deadline stays at two weeks, EAH will be a very tight-deadline project for them.

The expected optimization may halve the runtime per workunit, but only for CPUs that support at least SSE, so anything older than a P III or Athlon XP won't benefit from the SSE code paths. However, the current Windows app runs quite slowly on those clients compared to the Linux app, so a speedup of roughly 30% might be possible for those older CPUs as well under Windows.

CU

BRM

Well, I don't know where you got that from. When Akos worked over the S5R1 apps, he didn't leave the old-timers (i.e. non-SSE CPUs) out; performance improved by a factor of 2 for them. I haven't seen anything said about not doing anything for them this time around, so I would expect comparable gains, all other things being equal.

Alinator

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 690776380

RAC: 271517

RE: Well I don't know

28 May 2007 20:10:53 UTC

Message 67008 in response to message 67007

Quote:

Well, I don't know where you got that from. When Akos worked over the S5R1 apps, he didn't leave the old-timers (i.e. non-SSE CPUs) out; performance improved by a factor of 2 for them. I haven't seen anything said about not doing anything for them this time around, so I would expect comparable gains, all other things being equal.

Alinator

Akos already helped to improve the C source code of the current app's hot loop, so it's not completely unoptimized ;-). Any further improvements (short of using SSE(n) instructions) would have to come from hand-coding the algorithm in assembly language, and modern compilers are not so bad that you can expect a speedup by a factor of 2 when using the same instruction set.

I've taken a look at the compiler output, and it's actually not all that bad, except for one thing that will be corrected soon and will hopefully bring performance parity between Windows and Linux.

Maybe Akos can do magic again; I just think it's unfair to expect that a factor of 2 can be achieved with every iteration of optimization.

BRM
