Binary Radio Pulsar Search (Arecibo) "BRP4" - new data!

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250444187
RAC: 35219

Quote:
In the coming weeks, what's the relative number of BRP4 to BRP5?

While we have 'inner galaxy' beams to process, we intend to dedicate half of the project's GPU power to them. Given the run-time ratio, this means that during this time we'll send out 3-4 BRP4G tasks for every BRP5 task. This should give us a processing rate of around 40 beams/d, meaning that the current data will last about a month. After that, the BRP4G tasks will vanish again and BRP5 will take over until we get fresh data from Arecibo.
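
For anyone who wants to check the numbers, here is a rough back-of-the-envelope sketch in Python. It assumes that "half of the GPU power" means half of the total GPU run time, that a BRP5 task takes 3 to 4 times as long as a BRP4G task (inferred from the 3-4 task ratio above, not a measured value), and that the current batch is about 1000 beams (the ">1000" figure mentioned later in this thread). The function names are made up for illustration.

```python
def tasks_per_brp5(runtime_ratio: float, gpu_share: float = 0.5) -> float:
    """BRP4G tasks sent per BRP5 task.

    runtime_ratio: (BRP5 task run time) / (BRP4G task run time), an assumed value.
    gpu_share:     fraction of total GPU run time given to BRP4G.

    From n4 * t4 = gpu_share * T and n5 * t5 = (1 - gpu_share) * T it follows
    that n4 / n5 = (t5 / t4) * gpu_share / (1 - gpu_share).
    """
    return runtime_ratio * gpu_share / (1.0 - gpu_share)


def days_to_finish(beams: float, beams_per_day: float) -> float:
    """Days needed to work through a batch at a constant processing rate."""
    return beams / beams_per_day


if __name__ == "__main__":
    for ratio in (3.0, 4.0):  # assumed run-time ratios
        print(f"run-time ratio {ratio}: {tasks_per_brp5(ratio)} BRP4G tasks per BRP5 task")
    # Assumed batch size of 1000 beams at the quoted 40 beams/d
    print(f"batch lasts about {days_to_finish(1000, 40)} days")
```

With an even 50/50 split the task ratio is simply the run-time ratio, and 1000 beams at 40 beams/d comes out at 25 days, which fits "about a month" for a batch somewhat larger than 1000 beams.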

BM

HenkM
Joined: 29 Sep 09
Posts: 32
Credit: 279008202
RAC: 0

I think mixing BRP4 GPU and BRP5 GPU tasks is not a good idea at all.

The pool of “to do” WUs is not a logical FIFO pool any more. This is caused by sending out BRP4 WUs with a 7 day expiration time. There is no logical technical reason for sending them out that way to the hosts.

The effect of the 7 day expiration time is, as I have seen:
Tasks are stopped just 3 minutes before termination and another fresh task is started.
Tasks are completely unnecessarily started at high priority.

This way crunching Einstein needs much attention.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Quote:

Tasks are completely unnecessarily started at high priority.

This way crunching Einstein needs much attention.


Or you could lower your cache settings a bit, and neither you nor BOINC will have to panic. =)
This is a very reliable project that hardly ever goes down, and on the rare occasions when it does, it's at most for a few days over a weekend, so something like a 3-day cache should bridge more or less any unexpected outage.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250444187
RAC: 35219

Quote:
I think mixing BRP4 GPU and BRP5 GPU tasks is not a good idea at all.

Well, you don't have to mix these. You may opt out of either application (in your Einstein@Home preferences) at any time.

Quote:
This is caused by sending out BRP4 WUs with a 7 day expiration time. There is no logical technical reason for sending them out that way to the hosts.

Indeed there is no technical reason, but a scientific one: once we have data for BRP4G, we need to get this processed as fast as feasible, while we don't care that much how fast we go through BRP5 - this search will take several months anyway.

BM

jon b.
Joined: 30 Jun 09
Posts: 7
Credit: 368423856
RAC: 1042778

Quote:


The pool of “to do” WUs is not a logical FIFO pool any more. This is caused by sending out BRP4 WUs with a 7 day expiration time. There is no logical technical reason for sending them out that way to the hosts.

The effect of the 7 day expiration time is, as I have seen:
Tasks are stopped just 3 minutes before termination and another fresh task is started.
Tasks are completely unnecessarily started at high priority.

As far as I know, the deadline for these tasks is 14 days. The problem you are experiencing is most likely caused by having a large work buffer.
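
To illustrate why a large work buffer combined with a short deadline can push the BOINC client into running tasks at high priority, here is a deliberately simplified sketch. It is not BOINC's actual scheduling code: it just assumes a single GPU working through its queue first-in-first-out and flags the tasks that would finish after the deadline. The function name, the 2-hour task length, and the 12 hours of crunching per day are all hypothetical.

```python
def tasks_at_risk(task_hours: list[float], deadline_days: float,
                  crunch_hours_per_day: float = 24.0) -> list[int]:
    """Indices of queued tasks that would miss the deadline.

    Simplified model (an assumption, not BOINC's real simulation):
    one device, strict first-in-first-out order, every task received
    at the same moment and sharing the same deadline.
    """
    available_hours = deadline_days * crunch_hours_per_day
    at_risk = []
    elapsed = 0.0
    for index, hours in enumerate(task_hours):
        elapsed += hours  # time at which this task would finish
        if elapsed > available_hours:
            at_risk.append(index)
    return at_risk


if __name__ == "__main__":
    # Hypothetical 5-day buffer: 60 tasks of 2 hours each (120 hours of work)
    queue = [2.0] * 60
    for deadline in (7, 14):
        for hours_per_day in (24.0, 12.0):
            late = tasks_at_risk(queue, deadline, hours_per_day)
            print(f"deadline {deadline} d, {hours_per_day} h/day: "
                  f"{len(late)} task(s) at risk")
```

In this toy model a full 5-day buffer is safe under both deadlines if the GPU crunches around the clock, but with only 12 hours of crunching per day the tail of the queue would miss a 7-day deadline, which is the kind of situation where the client starts reordering work.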

HenkM
Joined: 29 Sep 09
Posts: 32
Credit: 279008202
RAC: 0

Because there is a hurry, I opted out of BRP5.

@jon b and others:
I know the problems of a big cache, but in my case this is not the problem: my cache is set to 5 days.

Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 429609269
RAC: 76174

I opted out of BRP5 too for a short time, to concentrate on BRP4G.
How much data is planned to be crunched in the BRP4G project? How many more batches are expected from Arecibo?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117547113428
RAC: 35332494

Let me offer some wild speculation to answer your questions :-).

Quote:
How much data is planned to be crunched in the BRP4G project?


As much as Arecibo can produce ;-).
It's all very recent data and I would think that these small batches will continue to be made available from time to time well into the future as new data is collected.

Quote:
How many more batches are expected from Arecibo?


Hopefully lots :-).
I imagine it all depends on how long the scientific project that's collecting the data in the first place keeps running.

My guess is that the rapid turnaround last time (when the batch of data was crunched in around 18 days) would have really been appreciated by the people who sent it - to the extent that they are likely to keep sending new data as they collect it.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250444187
RAC: 35219

Quote:
How much data is planned to be crunched in the BRP4G project? How many more batches are expected from Arecibo?

The short answer is: we don't know. The BRP4G pipeline will be fed the "interesting" data from Arecibo that we find we need to go through quickly; we don't know how much there will be beyond what we have got so far. Due to the nature of how Arecibo is used and what data is produced, how that data is transferred, and the varying capacity of the pipelines that can process it, it's hard to plan very far ahead on our end.

The long story follows.

Arecibo can take about 50 beams per day. How much of the "survey data" that we use on Einstein@Home is actually taken per day depends on various conditions. Sometimes the telescope is not in operation at all, and sometimes it is used to observe specific (known) pulsars or regions in the sky that we don't expect to harbor unknown pulsars.

The survey data also covers different regions in the sky that are of different importance (to us). Due to the lower "pulsar density", "outer galaxy" beams are less interesting. Another factor is how old the data is, and how many other pipelines of the PALFA collaboration have already processed and grabbed the pulsars out of it. Most interesting is data from the "inner galaxy" region that no other PALFA pipeline has looked at yet.

The data flow from Arecibo to the Einstein@Home pipeline is not continuous. Data is recorded on disks that are physically shipped to Cornell when they are full. At Cornell the data is fed into a data storage system, managed by a database system that has proven not to be as reliable as we would wish. From Germany we periodically query the system for data that's new (to us), and occasionally we get some. At our end the data arrives in quite a "bursty" fashion. Sometimes we get thousands of beams in a few days, and sometimes there's nothing for months.

Our original plan was to process the arriving Arecibo data with our CPUs (BRP4). This was under the assumption of a more or less continuous inflow of data. However, we found that the capacity of this pipeline is not sufficient to dig through the batches of "interesting" data that we got recently as fast as we wish. Therefore we set up another pipeline for GPUs (BRP4G).

After the data storage system at Cornell was stuck for a couple of months, we initially got ~700 beams of "interesting" data that we managed to process with the CPUs, as we threw essentially all of our CPU power at it (the GW search had no work at that time). We have now got another chunk of >1000 "interesting" beams that we feed to our GPUs. Our CPUs (including Android devices) are fed from the pile of less interesting data that we don't need to go through so quickly. The process counting, e.g. on the server status page, however, doesn't know about "interesting" vs "less interesting" data, CPUs vs GPUs or BRP4 vs BRP4G, so as long as we process interesting data with GPUs the numbers shown there might well be misleading.

BM

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 405957412
RAC: 397623

Bernd,

Really well explained. Thanks for keeping us all this well informed.

Filipe