upcoming GW run S6Bucket (was S6GC1)

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,935
Credit: 198,728,265
RAC: 45,029
Topic 195672

Planning for the next GW run is ongoing, but some decisions have already been made and can be publicized:

* We will analyze data from S6 using the same semi-coherent method that we used in past runs, and essentially the same application code. The run will therefore be named "S6GC1".

* As there is less usable data from S6 than there was from S5, the data volume per workunit will decrease a lot. This will mainly affect the download volume at the beginning of a run, and the data volume to be stored on the client's disk.

* To increase the sensitivity, we will go for fewer but longer coherent segments. (This is very technical, but the new run will be strongly dominated by the coherent computation (FStat); the incoherent combination will be 20% or less. For comparison, in S5GC1HF it's roughly 50/50.)

* With the current application code (which does a refinement only in one dimension (spindown) of the parameter space) the memory requirement will also be lower with the reduced number of stacks. A refinement in three dimensions (spindown and sky position - for comparison: the old incoherent "Hough" code did a refinement only in sky position) would be possible and would further increase sensitivity. It is currently under investigation whether we could implement this in the remaining time, how much sensitivity we would gain, and how much memory that would cost.

* Also under investigation is whether and how to possibly add a "vetoing" already in the application that could filter out e.g. known instrumental artifacts. Currently this is done later in post-processing. Since a limiting factor on the sensitivity of the search is the number of candidates we can send back from the clients, eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.
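To make the idea of an instrumental-artifact veto concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the line frequencies, the tolerance, and the function `passes_veto` are assumptions, not the actual statistic computed by the Einstein@Home application.

```python
# Hypothetical sketch of a client-side veto: discard candidates whose
# frequency falls within a small band around a known instrumental line.
# The line list and tolerance below are illustrative values only.

KNOWN_LINES_HZ = [60.0, 120.0, 180.0]  # e.g. mains-power harmonics
TOLERANCE_HZ = 0.01

def passes_veto(freq_hz):
    """Return True if the candidate frequency is not near a known line."""
    return all(abs(freq_hz - line) > TOLERANCE_HZ for line in KNOWN_LINES_HZ)

candidate_freqs = [59.995, 100.123, 120.005, 345.678]
kept = [f for f in candidate_freqs if passes_veto(f)]
# kept == [100.123, 345.678]: the two candidates near 60 Hz and 120 Hz
# are filtered out before being reported back.
```

Because the number of candidates that survive (and thus the amount of follow-up work) depends on the data in each workunit, such a veto makes the run-time data-dependent, which is exactly the predictability issue mentioned above.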

I intend to continue posting to this thread as the preparation for S6GC1 progresses.

BM

astro-marwil
Joined: 28 May 05
Posts: 438
Credit: 156,903,308
RAC: 33,385

Hallo BM !
Many thanks for keeping us informed! I'm somewhat upset about the little response to your thread. I don't look through the forums every day, but I will be very pleased to hear more soon about the progress and plans for this project.

Kind regards
Martin

pabliedung
Joined: 5 Dec 10
Posts: 1
Credit: 10,493,957
RAC: 0

Thanks a lot for the updates! They are much appreciated.

telegd
Joined: 17 Apr 07
Posts: 91
Credit: 10,212,522
RAC: 0

Yes, thanks for the update.

Quote:
...eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

Does this mean the work-unit would abort mid-task when a known artifact is encountered? I assume that this would only reduce the time taken on any given task, not increase it?

If so, would this only be an issue for calculating credit? Sorry if I am misinterpreting your explanation!

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,124
Credit: 126,356,367
RAC: 21,297

Quote:

Yes, thanks for the update.

Quote:
...eliminating candidates before sending them back might therefore increase the sensitivity of the search, too. One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

Does this mean the work-unit would abort mid-task when a known artifact is encountered? I assume that this would only reduce the time taken on any given task, not increase it?


No, it's just some processing on the client side that would otherwise have been done on the server side. So that will, if anything, increase the WU time, and that in turn will depend on whether said artifacts turned up in the particular WU. The increased search sensitivity is achieved by having more 'real' candidates returned on a fixed-size 'bus back to town', i.e. eliminating known dead wood (so it's possible that without this extra veto work, 'quieter' astronomical candidates didn't previously get on the list because of room taken by 'noisier' instrument sources).

I hope I got that right Bernd! :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,935
Credit: 198,728,265
RAC: 45,029

Quote:
One unwanted side-effect of this vetoing would be that the run-time of a given task becomes data-dependent and less predictable.

I have to admit that this is too short for an explanation of a very technical detail.

The GW application currently reports the 10,000 most promising "candidates" found in every workunit. This list ("toplist") is built and maintained during computation of a task. Whether or not a new candidate makes it into this list depends on what's already in the list. Replacing a candidate doesn't take much time; although this time is data-dependent, it is negligible in the current application.
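A bounded toplist like the one described can be sketched with a min-heap. This is a generic illustration of the data structure, not the project's actual implementation, and the toplist size is shrunk for readability.

```python
import heapq

TOPLIST_SIZE = 3  # the real application keeps 10,000; tiny here for clarity

def update_toplist(toplist, strength):
    """Keep the TOPLIST_SIZE strongest candidates seen so far.

    The weakest entry sits at toplist[0] (min-heap property), so a new
    candidate only enters the list if it beats that weakest entry.
    """
    if len(toplist) < TOPLIST_SIZE:
        heapq.heappush(toplist, strength)
    elif strength > toplist[0]:
        heapq.heapreplace(toplist, strength)  # drop weakest, insert new

toplist = []
for strength in [5.0, 1.0, 7.0, 3.0, 9.0]:
    update_toplist(toplist, strength)
# sorted(toplist) == [5.0, 7.0, 9.0]
```

This also shows why replacement cost is data-dependent: how often a candidate actually displaces a toplist entry depends on the order and values of the detection statistics in the data.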

Calculating the veto statistics for a given candidate takes a significant computational effort. However information from the "normal" (F-Statistic) computation could be re-used when still available in memory, reducing the additionally required computation time.

For implementing the veto we thought about three options:

1. Calculate the veto statistics for each candidate before entering it into the toplist. (+) allows re-use of FStat values, easiest implementation. (-) highest computational effort (millions of candidates)

2. Calculate the veto statistics only for those candidates that would make it into the toplist. (+) lower computational effort than 1., FStat values can be re-used. (-) data-dependent, thus unpredictable computational effort (you don't know in advance how often a candidate is replaced in the toplist during a task)

3. Calculate the veto statistics at the end only for the candidates in the toplist. (+) lowest computational effort, depends only on length of toplist, not on number of templates searched. (-) FStat values have to be re-computed for the candidates (old values aren't stored due to memory limitations), most effort required for implementation, veto shortens the toplist (in worst case to empty list).

At the time of writing the initial post we intended to go for option 2. However, we found that we won't be able to gain enough confidence in the veto algorithm and implementation in the remaining time of the current run to actually let it automatically filter out candidates.

So we'll go for a (modified) option 3: we'll calculate the veto statistics for each candidate of the toplist at the end and add the result to the candidate list (in an additional column) that is reported back. In S6GC1, no candidate will actually be "vetoed" (i.e. removed from the list), but calculating the statistics on the client computers will save time during post-processing and help gain confidence in the veto.
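The modified option 3 amounts to one extra pass over the finished toplist, appending a veto value per candidate. The sketch below assumes candidates are (frequency, strength) pairs and uses a made-up placeholder statistic; the real veto computation is of course far more expensive and entirely different.

```python
def veto_statistic(freq, strength):
    """Placeholder for the real (recomputed) veto statistic."""
    return strength / (1.0 + freq)  # purely illustrative formula

def finalize_toplist(toplist):
    """Append a veto column to every toplist entry; nothing is removed."""
    return [(freq, strength, veto_statistic(freq, strength))
            for freq, strength in toplist]

rows = finalize_toplist([(100.0, 9.0), (200.0, 7.5)])
# Each reported row now carries (frequency, strength, veto); the veto
# value is only recorded for post-processing, no candidate is dropped.
```

The cost of this pass depends only on the (fixed) toplist length, which is why this variant keeps the task run-time predictable.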

Bottom line: at the current state of planning, the point "data-dependent computation time" will not apply to S6GC1.

BM

telegd
Joined: 17 Apr 07
Posts: 91
Credit: 10,212,522
RAC: 0

Thanks Bernd for the technical explanation. It is much appreciated.

Quote:
having more 'real' candidates returned on a fixed size 'bus back to town'


Thanks Mike, I liked this metaphor!

So, going with modified Option 3 means that the riders on the bus don't change. It just pre-tags them with a warning if it seems like they contain artifacts? How much of a savings on the server side will you be able to realize? I am guessing you are hoping to eventually get something like Option 2 in a future application?

I assume that differences in run time, and therefore credit calculations, makes life more complex for the project.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,935
Credit: 198,728,265
RAC: 45,029

Einstein@home is currently distributing the S6 data to the download mirrors all over the world. This should not affect the downloads of S5GC1HF data files by the clients too much, but you never know.

BM

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,935
Credit: 198,728,265
RAC: 45,029

The new S6 GW analysis run will have the internal name "S6Bucket" (because of the bucket-shape of the sensitivity curve used to select the data from S6).

I hope to send out the first few hundred test workunits of the new run in the next hours.

BM

Rechenkuenstler
Joined: 22 Aug 10
Posts: 138
Credit: 101,411,803
RAC: 0

Does anybody have an example, how to incorporate this search into the app_info.xml file?

[AF>EDLS] Polynesia
Joined: 1 Apr 09
Posts: 24
Credit: 2,273,003
RAC: 0

Hello,

Could you explain simply, for the uninitiated, what this new application is? Thank you!
