MS Windows Beta Test App 4.24 available

Harvey Allen

Joined: 8 Jul 05

Posts: 12

Credit: 1228079

RAC: 0

RE: How many of them have

16 Aug 2006 10:10:31 UTC

Message 43507 in response to message 43500

Quote:

How many of them have you met? How many compilers have you written on your own?
BM

Good questions.

None this generation. No compiler, but I once wrote an OS. In the course of reviewing the official OS I saw that many of the instructions used for time-critical operations were not the most efficient. The authors claimed exhaustion and deadlines. Probably true. Management could have spent the resources to tighten things up later but preferred otherwise.

In a compiler you can review the code generated by a particular statement and judge if there is a better choice. This makes checking the quality of the compiler easy.

An assembler coder knows the state before and after the statement and can use that to select better code, but the change in 402 vs 424 is so dramatic it makes you wonder.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250561529

RAC: 34604

I've looked into the

16 Aug 2006 12:13:00 UTC

Message 43508

I've looked into the functions provided by IPP or MKL, but they're not worth it. They are mostly way too complex for what we are doing in our code.

More than 90% (99% before optimization) of the run time is spent within a single loop that once had about a dozen lines of C code, containing only simple multiplications and additions (and the very few divisions we couldn't avoid - I think there's actually only one of them left - hm, gives me another idea...). We have parallelized/vectorized as much as we could.
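As a rough illustration of the kind of division avoidance mentioned above (this is not the project's actual code; the function names and data are hypothetical), a loop that divides by a loop-invariant value can compute the reciprocal once and multiply inside the loop instead:

```c
#include <stddef.h>

/* Hypothetical example: one division per element inside the hot loop. */
float sum_scaled_div(const float *x, size_t n, float d)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] / d;
    return sum;
}

/* Same computation with the division hoisted out of the loop:
   one division in total, then only multiplications and additions. */
float sum_scaled_mul(const float *x, size_t n, float d)
{
    const float inv = 1.0f / d;   /* computed once, outside the loop */
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * inv;
    return sum;
}
```

In general the two versions can differ in the last bits of the result, because multiplying by a rounded reciprocal is not exactly the same as dividing, so whether this is acceptable depends on the precision the computation needs.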

You can map the operations we perform to matrix operations, but because of the overhead this requires, even a highly optimized library will not be faster than doing just the necessary low-level operations.

The latest speedup came mostly from avoiding type conversions (these require setting the rounding mode, which is slow), a bit of optimizing the interface to the assembler-coded parts, and some playing with the C code. It's always a tradeoff you have to make - global static variables are the fastest in many cases, but with too many of them your code becomes unreadable and unmaintainable.

I also found that apparently due to caching effects some compiler switches that worked well for previous versions weren't optimal for this code (e.g. unrolling loops).

Finally I used the same compiler and settings we use for the Linux version (a gcc-4.1) now for Windows, too (at least for this critical module), which saved some interfacing, quite some maintenance, and brought the Windows App to the speed our Linux App had before.

So - the libraries don't help us, as we don't perform standard operations (e.g. FFT), and the latest speedup of the Windows App was due to a combination of things, roughly half of which didn't come from the assembler coding.

I'll have another try with the Intel compiler, but I think all critical parts by now have been taken out of the hands of the compiler by our assembler coding anyway, so I don't expect much of it.

Harvey Allen

Joined: 8 Jul 05

Posts: 12

Credit: 1228079

RAC: 0

RE: More than 90% (99%

16 Aug 2006 14:01:09 UTC

Message 43509 in response to message 43508

Quote:

More than 90% (99% before optimization) of the run time is spent within a single loop

Sounds like everything fits into L1 cache. At least on AMD.

Too bad they only make 64k/64k. A 32k/32k would show if there was any room left.

Can you split a wu in the middle and work the halves with two threads and see if you escape thrashing?

Harvey Allen

Joined: 8 Jul 05

Posts: 12

Credit: 1228079

RAC: 0

I just saw that Conroe has L1

17 Aug 2006 3:41:17 UTC

Message 43510

I just saw that Conroe has a 32 KB instruction / 32 KB data L1 cache. That would be interesting to play around with.

Stick

Joined: 24 Feb 05

Posts: 790

Credit: 33131015

RAC: 1144

Since v4.24 is now the

17 Aug 2006 8:43:27 UTC

Message 43511

Since v4.24 is now the "official" Windows application, shouldn't it be removed from the Beta Test page?

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250561529

RAC: 34604

RE: Since v4.24 is now the

17 Aug 2006 9:46:02 UTC

Message 43512 in response to message 43511

Quote:

Since v4.24 is now the "official" Windows application, shouldn't it be removed from the Beta Test page?

Thanks for the hint. For the moment it may help people with problems automatically downloading the official App to have this at hand for manual installation. It will be removed from the Beta App page at the next update.

Mats Nilsson

Joined: 10 Dec 05

Posts: 94

Credit: 15011147

RAC: 0

If I am running the Beta Test

20 Aug 2006 16:01:29 UTC

Message 43513

If I am running the Beta Test App 4.24, what should I do now to switch to the official one?

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

just delete the app_info.xml

20 Aug 2006 16:34:50 UTC

Message 43514

Just delete the app_info.xml file. The executables should be identical; otherwise the version numbers would have been different.

Mats Nilsson

Joined: 10 Dec 05

Posts: 94

Credit: 15011147

RAC: 0

Thanks will do that but will

20 Aug 2006 18:41:02 UTC

Message 43515

Thanks, will do that, but I will let my work cache run dry first by setting "no new work".

Misfit

Joined: 11 Feb 05

Posts: 470

Credit: 100000

RAC: 0

RE: Thanks will do that but

22 Aug 2006 1:18:27 UTC

Message 43516 in response to message 43515

Quote:

Thanks, will do that, but I will let my work cache run dry first by setting "no new work".

That isn't necessary. Just delete the app_info.xml file from your einstein project folder. That is all you need to do.

me-[at]-rescam.org
