MS Windows Beta Test App 4.24 available

Harvey Allen
Joined: 8 Jul 05
Posts: 12
Credit: 1228079
RAC: 0

Message 43507 in response to message 43500

Quote:
How many of them have you met? How many compilers have you written on your own?
BM

Good questions.

None this generation. No compiler, but I once wrote an OS. While reviewing the official OS, I saw that many of the instructions used for time-critical operations were not the most efficient. The authors claimed exhaustion and deadlines. Probably true. Management could have spent the resources to tighten things up later but preferred otherwise.

In a compiler you can review the code generated by a particular statement and judge if there is a better choice. This makes checking the quality of the compiler easy.

An assembler coder knows the state before and after the statement and can use that to select better code, but the change in 402 vs 424 is so dramatic it makes you wonder.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250561529
RAC: 34604

I've looked into the functions provided by IPP or MKL, but they're not worth it. They are mostly far too complex for what we are doing in our code.

More than 90% (99% before optimization) of the run time is spent within a single loop that once had about a dozen lines of C code, containing only simple multiplications and additions (and the very few divisions we couldn't avoid - I think there's actually only one of them left - hm, gives me another idea...). We have parallelized/vectorized as much as we could.
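To make the remark about divisions concrete, here is a minimal sketch of one common way to get a division out of a hot loop: divide once before the loop and multiply inside it. This is not the project's code; the function names `scale_div` and `scale_recip` and the loop body are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hot loop: one division per element. */
void scale_div(const float *in, float *out, size_t n, float d)
{
    assert(d != 0.0f);
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] / d;
}

/* Same loop with the division hoisted: one division in total,
   then only multiplications inside the loop. */
void scale_recip(const float *in, float *out, size_t n, float d)
{
    assert(d != 0.0f);
    const float r = 1.0f / d;   /* the only division left */
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * r;
}
```

The two variants agree exactly only when the reciprocal is exactly representable (for example a power of two); otherwise the results can differ in the last bit, which is one reason a compiler will not make this change on its own without relaxed floating-point settings.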

You can map the operations we perform to matrix operations, but because of the overhead this requires, executing them even with a highly optimized library will not be faster than doing just the necessary low-level operations.
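A rough sketch of what that mapping looks like, under stated assumptions: `generic_gemv` below is a hand-written stand-in for a general matrix-vector routine (it is not an IPP or MKL function), and `direct_dot` is the single special case a caller might actually need. Both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a general library routine: y = alpha*A*x + beta*y,
   with A stored row-major and a leading dimension lda. */
void generic_gemv(size_t rows, size_t cols, float alpha,
                  const float *A, size_t lda, const float *x,
                  float beta, float *y)
{
    assert(lda >= cols);
    for (size_t r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (size_t c = 0; c < cols; c++)
            acc += A[r * lda + c] * x[c];
        y[r] = alpha * acc + beta * y[r];
    }
}

/* The one case needed here: a single row, alpha = 1, beta = 0.
   Only the necessary multiply-adds remain. */
float direct_dot(const float *a, const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * x[i];
    return acc;
}
```

The general routine has to handle scaling factors, strides and an output vector even when the caller wants none of them, so for a short inner loop the call and setup can cost as much as the arithmetic itself.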

The latest speedup came mostly from avoiding type conversions (which require setting the rounding mode, and that is slow), a bit of optimizing the interface to the assembler-coded parts, and some playing with the C code. It's always a tradeoff you have to make - global static variables are the fastest in many cases, but with too many of them your code becomes unreadable and unmaintainable.
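For readers wondering how a conversion can be avoided at all, here is one possible technique, sketched under assumptions: a float-to-integer cast in C truncates, and on x87 code that typically means switching the rounding mode around each conversion. Keeping the value in fixed point moves the conversion out of the loop. The table size, the function names and the idea of a "phase" index are all invented for this example and are not taken from the app.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUT_BITS 6u                          /* hypothetical 64-entry table */
#define LUT_MASK ((1u << LUT_BITS) - 1u)

/* Variant with a float-to-integer conversion in every iteration.
   Phase is measured in full turns and must not be negative. */
void lut_index_cast(float start, float step, size_t n, uint32_t *idx)
{
    assert(start >= 0.0f && step >= 0.0f);
    float phase = start;
    for (size_t i = 0; i < n; i++) {
        idx[i] = (uint32_t)(phase * (float)(1u << LUT_BITS)) & LUT_MASK;
        phase += step;
    }
}

/* Variant with a 32-bit fixed-point phase: two conversions before
   the loop, none inside it. */
void lut_index_fixed(float start, float step, size_t n, uint32_t *idx)
{
    assert(start >= 0.0f && start < 1.0f && step >= 0.0f && step < 1.0f);
    uint32_t phase = (uint32_t)((double)start * 4294967296.0);
    const uint32_t inc = (uint32_t)((double)step * 4294967296.0);
    for (size_t i = 0; i < n; i++) {
        idx[i] = phase >> (32u - LUT_BITS);  /* top bits are the index */
        phase += inc;                        /* wraps modulo one full turn */
    }
}
```

The two variants produce the same indices when the start and step are exactly representable in both formats; with arbitrary values the float accumulation and the fixed-point accumulation round differently, so occasional off-by-one indices are possible.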

I also found that, apparently due to caching effects, some compiler switches that worked well for previous versions (e.g. unrolling loops) weren't optimal for this code.

Finally, I now use the same compiler and settings for Windows that we use for the Linux version (a gcc-4.1), at least for this critical module. This saved some interfacing and quite some maintenance, and brought the Windows App up to the speed our Linux App had before.

So - the libraries don't help us, as we don't perform standard operations (like e.g. FFT), and the latest speedup of the Windows App was due to a combination of things, roughly half of which didn't come from the assembler coding.

I'll have another try with the Intel compiler, but I think all critical parts have by now been taken out of the hands of the compiler by our assembler coding anyway, so I don't expect much of it.

BM

Harvey Allen
Joined: 8 Jul 05
Posts: 12
Credit: 1228079
RAC: 0

Message 43509 in response to message 43508

Quote:
More than 90% (99% before optimization) of the run time is spent within a single loop

Sounds like everything fits into L1 cache. At least on AMD.

Too bad they only make 64k/64k. A 32k/32k would show if there was any room left.

Can you split a WU in the middle, work the halves with two threads, and see if you escape thrashing?
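A minimal sketch of what such a split could look like, assuming the work is a plain array reduction (the real WU processing is not shown anywhere in this thread). The struct `half_job` and both function names are hypothetical, and whether this avoids cache thrashing would have to be measured.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One half of the work: a range of the input and its partial result. */
typedef struct {
    const float *data;
    size_t begin, end;
    double sum;
} half_job;

static void *sum_half(void *arg)
{
    half_job *job = arg;
    double acc = 0.0;
    for (size_t i = job->begin; i < job->end; i++)
        acc += (double)job->data[i] * job->data[i];
    job->sum = acc;
    return NULL;
}

/* Split the array in the middle: the upper half runs in a second
   thread, the lower half in the calling thread. */
double sum_squares_two_threads(const float *data, size_t n)
{
    half_job lo = { data, 0, n / 2, 0.0 };
    half_job hi = { data, n / 2, n, 0.0 };
    pthread_t worker;
    int rc = pthread_create(&worker, NULL, sum_half, &hi);
    assert(rc == 0);
    sum_half(&lo);
    rc = pthread_join(worker, NULL);
    assert(rc == 0);
    (void)rc;
    return lo.sum + hi.sum;
}
```

Each thread touches only its own half of the data, so on a machine with separate L1 caches per core the working set per cache is halved.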

Harvey Allen
Joined: 8 Jul 05
Posts: 12
Credit: 1228079
RAC: 0

I just saw that Conroe has an L1 of 32 KB instruction / 32 KB data. That would be interesting to play around with.

Stick
Joined: 24 Feb 05
Posts: 790
Credit: 33131015
RAC: 1144

Since v4.24 is now the "official" Windows application, shouldn't it be removed from the Beta Test page?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250561529
RAC: 34604

Message 43512 in response to message 43511

Quote:
Since v4.24 is now the "official" Windows application, shouldn't it be removed from the Beta Test page?


Thanks for the hint. For the moment it may help people with problems automatically downloading the official App to have this at hand for manual installation. It will be removed from the Beta App page at the next update.

BM

Mats Nilsson
Joined: 10 Dec 05
Posts: 94
Credit: 15011147
RAC: 0

If I am running the Beta Test App 4.24, what should I do now to switch to the official one?

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

Just delete the app_info.xml file. The executables should be identical; otherwise the version numbers would have been different.

Mats Nilsson
Joined: 10 Dec 05
Posts: 94
Credit: 15011147
RAC: 0

Thanks, will do that, but I will let my work cache run dry first by setting no new work.

Misfit
Joined: 11 Feb 05
Posts: 470
Credit: 100000
RAC: 0

Message 43516 in response to message 43515

Quote:
Thanks, will do that, but I will let my work cache run dry first by setting no new work.


That isn't necessary. Just delete the app_info.xml file from your einstein project folder. That is all you need to do.

me-[at]-rescam.org
