Windows S5R3 App 4.25 available for Beta Test

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: Brian, I PMed Richard 3

Message 77359 in response to message 77358

Quote:
Brian, I PMed Richard 3 days ago asking about what the numbers mean. Perhaps you can tell me?

The only parts I know about are the first 3 and the final number. The first 3 are indeed the frequency being checked. The last number(s), not including the 0-based index number of your result (at the end), are the sequence number within that particular frequency range. Apparently the .nn plays some role, though I'm not sure what... I too would like to know... :-) Perhaps someone over in Germany who seems to be awake will come along and help...

Quote:
Is the x axis the second number?

Nope, that's the last one, the sequence number.

Quote:

Yes, Dual boot for all my systems (except the wife's laptop).
The Linux WUs had 101 and 102 as the second set of numbers. Windows WUs were 375, 376, 378, 379, and 380. Hmm, is there a range for these numbers?

Good, then I'll keep watching your 6000+ system. I think the range is 0-400, but I could be mistaken...

Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1000560
RAC: 0

I attached my linux hosts 5

I attached my Linux hosts 5 days ago, but only attached the Windows persona the morning before they released 4.25. Since I don't know beans, I decided to keep running 4.15 on Windows to get a "baseline". Now, if I need to collect one of every x axis number possible to show a decent baseline, then I might as well dump 4.15 and use 4.25 instead. Would that be the right thing to do? Let's see: 2 WUs/day, 400 samples... If properly distributed, that'd take 200 days to get them all, and that's if I never rebooted to Linux, AND if I wasn't attached to 2 other projects. I wonder if 4.15 would even be around long enough to get a decent sample size, or one that could be "comparable"?
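The back-of-the-envelope estimate above can be written as a tiny helper. This is only a sketch: the function name `days_to_collect` and the `share` parameter (the fraction of crunching time this project gets) are made up for illustration, and it assumes every task received carries a task number not seen before, which is the "properly distributed" best case.

```python
def days_to_collect(samples_needed: int, tasks_per_day: float, share: float = 1.0) -> float:
    """Best-case days needed to collect `samples_needed` distinct task numbers.

    `share` is the assumed fraction of host time spent on this project
    (1.0 = the host crunches nothing else and is never rebooted elsewhere).
    """
    if tasks_per_day <= 0 or not 0 < share <= 1:
        raise ValueError("tasks_per_day must be > 0 and share must be in (0, 1]")
    return samples_needed / (tasks_per_day * share)


# 400 samples at 2 tasks per day, full time on this project
print(days_to_collect(400, 2))
# Same, assuming an equal split with two other projects
print(days_to_collect(400, 2, share=1 / 3))
```

With an equal three-way split the best case stretches to roughly 600 days, so a full per-task baseline from a single host looks impractical either way.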

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2974597955
RAC: 797223

RE: Brian, I PMed Richard 3

Message 77361 in response to message 77358

Quote:

Brian, I PMed Richard 3 days ago asking about what the numbers mean. Perhaps you can tell me? He pointed to two sets of numbers within the WU name: one which has a decimal (the first numbers, which appear to be some sort of frequency), and a second set which I have NO idea about. I've seen the first number referred to as a "template" in an Archae86 chart. Basically, I don't know the terminology. I see the nifty inverted "full wave" DC-chart-like graphs, but don't know what the bottom line (x axis) represents. Is the x axis the second number? If I were to create a column for each in my script, what titles should I give them?

Yes, Dual boot for all my systems (except the wife's laptop).
The Linux WUs had 101 and 102 from 0715.25 as the second set of numbers. Windows WUs were 375, 376, 378, 379, and 380 from 0794.15. Hmm, is there a range for these numbers?


Sorry I didn't reply to the PM, but I'm not a gravity wave scientist and, frankly, I don't know. Like everyone else, I can guess, speculate, read the news archive, and search the message boards. I've also still (just) got a couple of memory (brain) cells for emergency use when I can't think of a search phrase.

h1_0679.30_S5R2__97_S5R3a_2

h1 ... Once upon a time, we had separate data from Hanford and Livingston, so we had 'h' tasks and 'l' tasks. The S5 hierarchical search looks at data from both sites together, so this is redundant.

0679.30 ... A frequency. Expressed to two decimal places.

S5R2 ... When this data format came into use, I think.

97 ... Task number within the frequency band. Issued in decreasing order, down to base 0

S5R3a ... Major version number of the Science App intended to do the search.

2 ... Replication number (base 0). Somebody's wingman must have thrown a wobbly.
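For anyone wanting one script column per field, the breakdown above could be turned into a small parser along these lines. This is a sketch, not the project's own naming code: the field names are invented here (following the descriptions in this post), and the pattern is only known to fit the single example name shown, so other name layouts may need adjusting.

```python
import re
from typing import NamedTuple


class TaskName(NamedTuple):
    detector: str       # "h1" (redundant for the S5 hierarchical search)
    frequency: float    # search frequency, two decimal places in the name
    data_run: str       # e.g. "S5R2" (data format, as guessed above)
    task_number: int    # task number within the frequency band (x axis of the graphs)
    app_run: str        # e.g. "S5R3a"
    replication: int    # 0-based replication number


# Assumed layout: <detector>_<frequency>_<data_run>__<task>_<app_run>_<replication>
_PATTERN = re.compile(
    r"^(?P<detector>[a-z]\d)_"
    r"(?P<frequency>\d+\.\d+)_"
    r"(?P<data_run>[A-Za-z0-9]+)__"
    r"(?P<task>\d+)_"
    r"(?P<app_run>[A-Za-z0-9]+)_"
    r"(?P<replication>\d+)$"
)


def parse_task_name(name: str) -> TaskName:
    """Split a result name into its fields, or raise ValueError if it does not fit."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognised task name: {name!r}")
    return TaskName(
        detector=m["detector"],
        frequency=float(m["frequency"]),
        data_run=m["data_run"],
        task_number=int(m["task"]),
        app_run=m["app_run"],
        replication=int(m["replication"]),
    )


print(parse_task_name("h1_0679.30_S5R2__97_S5R3a_2"))
```

Keeping `frequency` and `task_number` as separate columns makes it easy to group results by frequency before plotting runtime against task number.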

The graphs Peter and I have been posting have the field I have called "Task number" along the X-axis. I don't think it's worth combining tasks from multiple (data) frequencies in any sort of crunch-time analysis: as we saw very early in the S5R3 analysis, runtimes oscillate, but the frequency of oscillation is different at different search (data) frequencies. You might like to add that whole thread to your reading list.

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: Sorry I didn't reply

Message 77362 in response to message 77361

Quote:

Sorry I didn't reply to the PM, but I'm not a gravity wave scientist and, frankly, I don't know.

...nor are you German... (I was referring to Bikeman)...

Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1000560
RAC: 0

still...two remaining

still...two remaining questions:

Is the range of task numbers finite? I.e., never higher than X, and never negative?

How is the oscillation frequency determined (60 Hz, 45 Hz, etc.)? Is it a function of the base frequency? Is it calculated from the peaks/troughs in the resulting chart data? Is it measured P-P (peak to peak), or at every other P, as if the second half of the wave weren't rectified? I.e., does a peak occur every 45 tasks?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2974597955
RAC: 797223

RE: still...two remaining

Message 77364 in response to message 77363

Quote:

still...two remaining questions:

Is the range of task numbers finite? I.e., never higher than X, and never negative?

How is the oscillation frequency determined (60 Hz, 45 Hz, etc.)? Is it a function of the base frequency? Is it calculated from the peaks/troughs in the resulting chart data? Is it measured P-P (peak to peak), or at every other P, as if the second half of the wave weren't rectified? I.e., does a peak occur every 45 tasks?


Read the thread I linked. At the time I posted, it was a completely unknown - even to Bernd - artefact of the search process (unplanned and unexpected). But perhaps he knows by now.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 753130486
RAC: 1191037

Hi! You'll find some infos

Hi!

You'll find some information in the "Visualization" thread that was also recommended to you.

The task number is directly related to the region of the sky that the respective WU deals with. The zero-numbered WU always seems to start at a pole of the star sphere coordinate system, so to speak. The following tasks traverse the sky towards the equator, and then towards the opposite pole. Then it starts all over with other parameters.

Tasks that look at sky regions near the poles are slower than those near the equator. Because the "space" to look at gets smaller at the pole (the circumference of the ring of points that are investigated gets smaller), the "angular speed" of the search increases near the poles, and I guess that's what produces the steep slope of the graphs near the maxima.

For different search frequencies (the first number in the WU name), the algorithm traverses the sky at different speeds, so it takes more tasks to cover the whole sky at higher frequencies ==> longer period of the runtime oscillation.
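To check this against your own results, one rough way to measure the oscillation period is to look at the spacing between runtime peaks, separately for each search frequency. The sketch below is only an illustration: the function names are made up, it uses a naive "higher than both neighbours" peak test, and it assumes a reasonably complete and smooth set of task numbers per frequency. Real data with gaps or noise would need smoothing first.

```python
from collections import defaultdict
from statistics import mean


def peak_task_numbers(points):
    """Return task numbers whose runtime is a strict local maximum.

    `points` is a list of (task_number, runtime) pairs for ONE frequency.
    """
    ordered = sorted(points)
    return [
        ordered[i][0]
        for i in range(1, len(ordered) - 1)
        if ordered[i - 1][1] < ordered[i][1] > ordered[i + 1][1]
    ]


def period_by_frequency(results):
    """Map each frequency to the mean spacing (in tasks) between runtime peaks.

    `results` is an iterable of (frequency, task_number, runtime) tuples.
    Frequencies with fewer than two detected peaks are left out.
    """
    by_freq = defaultdict(list)
    for freq, task, runtime in results:
        by_freq[freq].append((task, runtime))

    periods = {}
    for freq, points in by_freq.items():
        peaks = peak_task_numbers(points)
        if len(peaks) >= 2:
            gaps = [b - a for a, b in zip(peaks, peaks[1:])]
            periods[freq] = mean(gaps)
    return periods
```

Grouping by frequency first matters because, as noted earlier in the thread, the oscillation period differs from one search frequency to the next, so mixing frequencies would blur the peaks.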

Anyway, I guess we should continue this discussion in the S5R3 sticky thread, because we are getting a bit off topic here.

CU

Bikeman

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 753130486
RAC: 1191037

As to the Win beta

As to the Win beta app:

Graphics work like a charm, even under Windows Vista.

As to the speed, I did some profiling and disassembling, and as Bernd has already mentioned, the Microsoft compiler just ruined the hot loop, even worse than gcc did for the latest beta app :-(

In addition, the compiler emits code that is really not so hot when it comes to copying or initializing double-precision floating-point data:

For those who are familiar with assembly language programming:

Quote:


....
mov esi, DWORD PTR [eax]
mov eax, DWORD PTR [eax+04h]
mov DWORD PTR [esp+0b4h], eax
mov DWORD PTR [esp+0b0h], esi
fld QWORD PTR [esp+0b0h]
mov DWORD PTR [esp+090h], ecx
mov DWORD PTR [esp+094h], ecx
fld QWORD PTR [esp+090h]
mov DWORD PTR [esp+070h], ecx
mov DWORD PTR [esp+074h], ecx
fld QWORD PTR [esp+070h]
....


Here we see a stunning series of three consecutive instances of what is (IIRC) called a "store forwarding stall": each 64-bit fld reads a value that was just written by two separate 32-bit mov stores, so the load cannot be forwarded from a single store. Very expensive. I really don't know what the compiler had on its mind when it was writing that :-), but these little instructions alone might be responsible for an overall 4-5% performance loss. And there are more spots like this. I wonder if there's a compiler switch to prevent this.

CU
Bikeman

Akos Fekete
Joined: 13 Nov 05
Posts: 561
Credit: 4527270
RAC: 0

RE: mov esi, DWORD

Message 77367 in response to message 77366

Quote:

mov esi, DWORD PTR [eax]
mov eax, DWORD PTR [eax+04h]
mov DWORD PTR [esp+0b4h], eax
mov DWORD PTR [esp+0b0h], esi
fld QWORD PTR [esp+0b0h]
mov DWORD PTR [esp+090h], ecx
mov DWORD PTR [esp+094h], ecx
fld QWORD PTR [esp+090h]
mov DWORD PTR [esp+070h], ecx
mov DWORD PTR [esp+074h], ecx
fld QWORD PTR [esp+070h]

I really don't know what the compiler had on its mind when it was writing that :-)


Each compiler has to follow a scheme, because compilers don't have any intuition.

1, init local variables

Quote:
mov esi, DWORD PTR [eax]
mov eax, DWORD PTR [eax+04h]
mov DWORD PTR [esp+0b4h], eax
mov DWORD PTR [esp+0b0h], esi
mov DWORD PTR [esp+090h], ecx
mov DWORD PTR [esp+094h], ecx
mov DWORD PTR [esp+070h], ecx
mov DWORD PTR [esp+074h], ecx


2, compile calculations

Quote:
fld QWORD PTR [esp+0b0h]
fld QWORD PTR [esp+090h]
fld QWORD PTR [esp+070h]


3, optimization (interleaving of integer/FPU operations)

Quote:
mov esi, DWORD PTR [eax]
mov eax, DWORD PTR [eax+04h]
mov DWORD PTR [esp+0b4h], eax
mov DWORD PTR [esp+0b0h], esi
fld QWORD PTR [esp+0b0h]
mov DWORD PTR [esp+090h], ecx
mov DWORD PTR [esp+094h], ecx
fld QWORD PTR [esp+090h]
mov DWORD PTR [esp+070h], ecx
mov DWORD PTR [esp+074h], ecx
fld QWORD PTR [esp+070h]


So, this is an optimized result.
But there is always a faster solution, of course.

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282700
RAC: 0

RE: But, always is a

Message 77368 in response to message 77367

Quote:

But there is always a faster solution, of course.

Let's hope so...

My "0" result finished in 50,579.88. I'm guessing that translates to a 10-12% drop for me, although it is only a guess, as I don't have all these data plots to go on... It could be less than that, perhaps even around the 7% that you stated happened on a Sempron...

Anyway, tough call on whether or not to make the app official. Seems completely stable, but the performance drop could increase the incidence of systems missing the deadline. Could you please ask Bernd to consider increasing the deadline up to 18 days temporarily?

Thanks....
