AMD reveals plans for "SSE5"

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

RE: I'm not paranoid enough

12 Sep 2007 15:33:13 UTC

Message 72749 in response to message 72748

Quote:

I'm not paranoid enough to think Intel deliberately crippled the compiler for AMD. I mean, how much work would THAT be, and could it even be done without losing lots of optimization for their own processors?

Simple, the compiler checks for "GenuineIntel" and runs the best optimized code available. For anything else, it runs the same code it would for a Pentium II.

th3

Joined: 24 Aug 06

Posts: 208

Credit: 2208434

RAC: 0

At compile time? Should

12 Sep 2007 16:14:47 UTC

Message 72750

At compile time? That should depend on the compiler flags set. And at run-time, when you get an app with the binaries already compiled, I doubt they could pull off a significant slowdown on AMDs without a trade-off where Intels would also run slower than optimal.

Team Philippines

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

RE: At compile time? Should

12 Sep 2007 16:28:47 UTC

Message 72751 in response to message 72750

Quote:

At compile time? That should depend on the compiler flags set. And at run-time, when you get an app with the binaries already compiled, I doubt they could pull off a significant slowdown on AMDs without a trade-off where Intels would also run slower than optimal.

At run-time, when the program starts, the processor is probed. The startup code then sets up the entries in a jump table to select which key math routines are run.

It doesn't cost much, as it already probes the processor for support of MMX, SSE, SSE2, etc., and then uses the best routine for a given processor. Of course, Intel considers the "best" routine for non-Intel processors to be the worst it can choose.

Just read the write-up linked to above and it's crystal clear what Intel did. And it's not news either, as you can see by its date. It's been around since version 8 of their compiler and is very well known in the industry.
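
To make the jump-table idea concrete, here is a minimal sketch in Python. It is not Intel's actual code: the function names, the feature set and the vendor test are all made up for illustration, and both "routines" compute the same thing so that only the selection logic matters.

```python
# Hypothetical sketch of vendor-gated run-time dispatch (not Intel's real code).

def dot_baseline(a, b):
    """Stand-in for the plain, unvectorized routine."""
    return sum(x * y for x, y in zip(a, b))

def dot_sse2(a, b):
    """Stand-in for an SSE2-optimized routine (same result, assumed faster)."""
    return sum(x * y for x, y in zip(a, b))

def build_jump_table(vendor, features):
    """Fill the jump table once at startup from the probed CPU info."""
    table = {"dot": dot_baseline}
    # The vendor test is the contested part: the feature check alone
    # would be enough to pick a routine the CPU can actually run.
    if vendor == "GenuineIntel" and "sse2" in features:
        table["dot"] = dot_sse2
    return table

# Two CPUs reporting identical features end up on different routines.
intel = build_jump_table("GenuineIntel", {"mmx", "sse", "sse2"})
amd = build_jump_table("AuthenticAMD", {"mmx", "sse", "sse2"})
print(intel["dot"].__name__)  # dot_sse2
print(amd["dot"].__name__)    # dot_baseline
```

The probe runs only once, so the per-call cost is a single table lookup, which is why this kind of check is cheap to add.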

th3

Joined: 24 Aug 06

Posts: 208

Credit: 2208434

RAC: 0

Look at newer stuff, "P4 1.7"

12 Sep 2007 17:05:42 UTC

Message 72752

Look at newer stuff, "P4 1.7" sounds like socket 423, not even Northwood gen.

Intel's compiler is faster for AMD too in many cases:
http://techreport.com/articles.x/8369/10
http://techreport.com/articles.x/8369/13 (bottom of the page)
(Lower is better)

The part about the Fortran compiler is interesting enough, but considering how old that article is, I don't know how much weight to give it for current-gen CPUs.

Team Philippines

ebahapo

Joined: 22 Jan 05

Posts: 47

Credit: 755276

RAC: 0

RE: Intel's compiler is

12 Sep 2007 17:29:03 UTC

Message 72753 in response to message 72752

Quote:

Intel's compiler is faster for AMD too in many cases:
http://techreport.com/articles.x/8369/10
http://techreport.com/articles.x/8369/13 (bottom of the page)
(Lower is better)

Being faster than the MS compiler is easy, as it doesn't support vectorization. But you're right, sometimes using the Intel compiler, even though it cripples non-Intel processors, is better than nothing.

ML1

Joined: 20 Feb 05

Posts: 347

Credit: 86563414

RAC: 51

RE: At compile time? Should

12 Sep 2007 17:48:07 UTC

Message 72754 in response to message 72750

Quote:

At compile time? That should depend on the compiler flags set. And at run-time, when you get an app with the binaries already compiled, I doubt they could pull off a significant slowdown on AMDs without a trade-off where Intels would also run slower than optimal.

As mentioned, it is trivial for the Intel compiler to add extra code (code the programmer never specified) that checks the CPU manufacturer's name at run-time and then runs 'whatever code'.

Note that the compiler can insert any code the compiler writers wish, regardless of what the program being compiled might specify. You only find out what has really been coded if you look at the machine code the compiler has generated...

I guess inserting a loop of NOPs would be a little too obvious if anyone ever took the trouble to look...

Meanwhile, a LOT of time and effort can be (and I'm sure has been) needlessly wasted.

All very 'silly'.

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Klimax

Joined: 27 Apr 07

Posts: 87

Credit: 1370205

RAC: 0

Somewhere here on this forum

13 Sep 2007 5:37:11 UTC

Message 72755

Somewhere here on this forum is a post explaining a certain slowdown for AMD CPUs with the Intel math lib. It is caused by misdetection. The attempt was, I think, to rule out the first Athlons because of a wrong implementation of SSE. However, the misdetection took out every OTHER AMD and left the wrong ones. By the way, it affected even some Celerons at least... part of the discussion is in the S2R2 thread.

Maybe Bikeman could know?

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 756963739

RAC: 1162337

RE: Somewhere here on this

13 Sep 2007 11:12:28 UTC

Message 72756 in response to message 72755

Quote:

Somewhere here on this forum is a post explaining a certain slowdown for AMD CPUs with the Intel math lib. It is caused by misdetection. The attempt was, I think, to rule out the first Athlons because of a wrong implementation of SSE. However, the misdetection took out every OTHER AMD and left the wrong ones. By the way, it affected even some Celerons at least... part of the discussion is in the S2R2 thread.

Maybe Bikeman could know?

Yup, we (Annika, Ziegenmelker and I) detected this problem a few months ago. I think Akos discovered the same thing around that time.

On paper, this looked exactly like the "naughty" Intel mathlib runtime check:

At startup of the app, the math library that was linked by the MS Visual C++ compiler version used for E@H at that time would use the CPUID instruction to find out whether the cruncher's CPU supports SSE2 or not, to toggle between different code paths for some math library functions. But then the detection code went on to check whether the vendor string reported by the CPU was equal to "AuthenticAMD" and whether the "family" of the CPU was 15 (ALL (!) AMD K8). In that case, it would disable the SSE2 code paths even if CPUID reported that SSE2 was supported!
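
Roughly, the check described above amounts to the following. This is only a sketch reconstructed from the description, not the library's real code: the function names are made up, and the EAX values in the example are illustrative (chosen to have the right family field, not taken from a specific chip). The bit positions are the standard CPUID leaf 1 layout: SSE2 is EDX bit 26, the base family is EAX bits 8-11 and the extended family is EAX bits 20-27.

```python
# Sketch of the described startup check; names are hypothetical.

SSE2_BIT = 1 << 26  # CPUID leaf 1, EDX bit 26

def decode_family(cpuid_eax):
    """Decode the CPU family from CPUID leaf 1 EAX."""
    base = (cpuid_eax >> 8) & 0xF
    ext = (cpuid_eax >> 20) & 0xFF
    # The extended family field only counts when the base family is 15.
    return base + ext if base == 0xF else base

def use_sse2_paths(vendor, cpuid_eax, cpuid_edx):
    """Return True if the SSE2 code paths would be enabled."""
    if not cpuid_edx & SSE2_BIT:
        return False  # CPU does not report SSE2 at all
    if vendor == "AuthenticAMD" and decode_family(cpuid_eax) == 15:
        return False  # the override: every family 15 AMD loses SSE2
    return True

K8_LIKE_EAX = 0x00000F40     # illustrative: base family 15, extended family 0
OTHER_EAX = 0x000006F0       # illustrative: base family 6

print(use_sse2_paths("AuthenticAMD", K8_LIKE_EAX, SSE2_BIT))  # False
print(use_sse2_paths("GenuineIntel", OTHER_EAX, SSE2_BIT))    # True
```

The second `if` is the whole problem: it keys on the family alone, so it cannot tell an early K8 from a later one.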

So if you hacked the app to change the comparison string constant "AuthenticAMD" into "AuthenticABC", the AMD detection would fail to recognize that it was running on an AMD K8, and voilà, the app at that time ran ca. 20% faster on SSE2-enabled AMDs (because the non-SSE code paths included a really slow implementation of a specific function, modf).
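
For the curious, a hack like that can be sketched in a few lines of Python. This assumes the constant sits in the executable as one contiguous 12-byte ASCII string, which is what "comparison string constant" suggests; if a binary compared the three CPUID registers against separate 4-byte values instead, a simple search like this would not find it. The function name and the fake "image" below are made up for the example.

```python
# Sketch of a same-length in-place patch of the vendor constant.

def patch_vendor_string(image, old=b"AuthenticAMD", new=b"AuthenticABC"):
    """Return a copy of `image` with the first `old` overwritten by `new`."""
    if len(old) != len(new):
        # A different length would shift every later offset in the file.
        raise ValueError("replacement must keep the same length")
    offset = image.find(old)
    if offset < 0:
        raise ValueError("vendor string not found")
    patched = bytearray(image)
    patched[offset:offset + len(new)] = new
    return bytes(patched)

# Fake stand-in for an executable's bytes, just to exercise the function.
image = b"\x90\x90GenuineIntel\x00AuthenticAMD\x00\xc3"
patched = patch_vendor_string(image)
print(patched)
```

Keeping the replacement the same length is what makes this safe to do on a compiled binary: nothing else in the file moves. On a real file you would read the bytes, patch a copy, and keep the original as a backup.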

I guess the real intention behind this CPU detection code was to exclude only first-generation, 130nm K8s, because their SSE2 implementation was so slow that the standard x87 FPU code paths would indeed have been faster. Instead, all K8s were slowed down :-(.

Since then, E@H has switched to a newer version of the Visual C++ compiler, so there's no need for hacks like this anymore.

H-BE

ML1

Joined: 20 Feb 05

Posts: 347

Credit: 86563414

RAC: 51

RE: ... Since then, a newer

13 Sep 2007 22:06:37 UTC

Message 72757 in response to message 72756

Quote:

... Since then, a newer version of the Visual C++ compiler was used by E@H so there's no need anymore for hacks like this.

'Fixed' as in you are no longer using the Intel compiler?

Regards,
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 756963739

RAC: 1162337

RE: RE: ... Since then, a

13 Sep 2007 22:36:35 UTC

Message 72758 in response to message 72757

Quote:

Quote:
... Since then, a newer version of the Visual C++ compiler was used by E@H so there's no need anymore for hacks like this.

'Fixed' as in you are no longer using the Intel compiler?

Regards,
Martin

The problem occurred with a Microsoft compiler, not the Intel compiler. Since then, a newer version of the Microsoft compiler has been used, and it no longer checks for the "AuthenticAMD" vendor string.

Bikeman
