Postprocessing of E@H data..but how?

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 734317350

RAC: 1295636

19 Dec 2007 13:21:37 UTC

Topic 193383

Just curious: What happens to the returned workunits once they are validated?

E@H awards about 7 million credits each day, which should mean about 7e06 credits / 240 (credits per result) / 2 ≈ 15,000 validated results per day.

At an average (compressed) size of more than 75 kB, this would mean roughly 1 GB of zip-compressed ASCII data each day (15,000 × 75 kB ≈ 1.1 GB). I guess storing the data in a DB would require roughly the same amount of space, or only slightly less.

For the whole S5R3 run this means a couple of hundred GB of raw data contributed by clients.
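A quick check of the arithmetic, taking the figures from the post at face value (7 million credits per day, 240 credits per result, the factor of 2, and 75 kB per compressed result). With these inputs the product comes out on the order of a gigabyte per day. The variable names are made up for illustration, and 75 kB is treated as 75,000 bytes.

```python
# Back-of-the-envelope estimate using only the numbers quoted in the post.
CREDITS_PER_DAY = 7e6       # "about 7 million credits each day"
CREDITS_PER_RESULT = 240    # credits per result, as quoted
DIVISOR = 2                 # the extra factor of 2 used in the post
AVG_RESULT_BYTES = 75e3     # "more than 75 kB", taken as 75,000 bytes

results_per_day = CREDITS_PER_DAY / CREDITS_PER_RESULT / DIVISOR
bytes_per_day = results_per_day * AVG_RESULT_BYTES

print(f"validated results per day: {results_per_day:,.0f}")
print(f"compressed data per day:   {bytes_per_day / 1e9:.2f} GB")
```

Since 75 kB is given as a lower bound, the real daily volume would be somewhat higher than this estimate.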

Do these results pile up in a giant storage array until the run is completed or will postprocessing begin immediately after validation? Or after all results for a certain frequency band are in?? As I said, I'm just curious.

CU
Bikeman

JLDun

Joined: 22 Apr 06

Posts: 10

Credit: 283166

RAC: 467

Postprocessing of E@H data..but how?

25 Dec 2007 6:19:35 UTC

Message 76475

"I have this question too."

peanut

Joined: 4 May 07

Posts: 162

Credit: 9644812

RAC: 0

I'm curious too. The volume

15 Jan 2008 3:06:52 UTC

Message 76476

I'm curious too. The volume of data for this project is truly mind-boggling. I wonder how the servers keep up with it all.

If you think of the universe as a computer cluster of untold numbers of particle "processors", each with a program to follow the rules of physics, then you really have a super duper powerful computer. How many terabytes of storage does the universe have? I'm just rambling now.

Odysseus

Joined: 17 Dec 05

Posts: 372

Credit: 20592566

RAC: 6484

RE: If you think of the

19 Jan 2008 2:45:12 UTC

Message 76477 in response to message 76476

Quote:

If you think of the universe as a computer cluster of untold numbers of particle "processors", each with a program to follow the rules of physics, then you really have a super duper powerful computer. How many terabytes of storage does the universe have? I'm just rambling now.

ISTR reading (in Asimov?) many years ago an estimate of around 10^70 for the number of particles in the observable universe. Sum over the number of bits required to specify the quantum state of each one …

Donald A. Tevault

Joined: 17 Feb 06

Posts: 439

Credit: 73516529

RAC: 0

RE: Just curious: What

20 Jan 2008 20:34:02 UTC

Message 76478

Quote:

Just curious: What happens to the returned workunits once they are validated?

E@H awards about 7 million credits each day, which should mean about 7e06 credits / 240 (credits per result) / 2 ≈ 15,000 validated results per day.

At an average (compressed) size of more than 75 kB, this would mean roughly 1 GB of zip-compressed ASCII data each day (15,000 × 75 kB ≈ 1.1 GB). I guess storing the data in a DB would require roughly the same amount of space, or only slightly less.

For the whole S5R3 run this means a couple of hundred GB of raw data contributed by clients.

Do these results pile up in a giant storage array until the run is completed or will postprocessing begin immediately after validation? Or after all results for a certain frequency band are in?? As I said, I'm just curious.

CU
Bikeman

Here's another question along the same lines...

Now that the S5R3 apps include some of what used to be post-processing functions, can we now extract any meaningful data from the S5R3 units that have been completed, or do we still have to wait until all work units have completed?

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0

RE: RE: If you think of

21 Jan 2008 5:37:50 UTC

Message 76479 in response to message 76477

Quote:

Quote:
If you think of the universe as a computer cluster of untold numbers of particle "processors", each with a program to follow the rules of physics, then you really have a super duper powerful computer. How many terabytes of storage does the universe have? I'm just rambling now.

ISTR reading (in Asimov?) many years ago an estimate of around 10^70 for the number of particles in the observable universe. Sum over the number of bits required to specify the quantum state of each one …

Yes, but some of them may be "entangled", so that a measurement on the state of one particle also gives the state of the other particle. This is the idea on which quantum computing is based (see www.qubit.org).
Tullio

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6591

Credit: 319723114

RAC: 429198

Well, I've been doing a bit

21 Jan 2008 10:08:58 UTC

Message 76480

Well, I've been doing a bit of digging at the LIGO Document Control Center, using the phrase "Data Analysis" in the Keyword field (no entries in other fields). I got many hits, but in particular G060539-00.pdf, titled LIGO Data Analysis Systems (Data Management and Analysis) - Annual NSF Review (LDAS), presented on 23/10/2006. [NB. LDAS took 22 man-years of software construction and undergoes upgrades, i.e. versioning.] Pages 3 through 12 outline the sorts of numbers that are emitted by the detectors, and the hardware systems that handle them. The remainder deals with 'in house' analysis of both realtime and offline character, not involving E@H particularly, but also see here and here. Heavy metal is probably an understatement.

However, page 7 indicates ~470 terabytes per year generated. This is about 3 MB/second per interferometer, with the differential arm ('gravity') signal, sampled at 16384 Hz, being ~2% of that, since what is also included is a raft of 'state of the IFO' time-aligned data channels like servo settings, seismometers etc. This is all continuous and un-triggered readout with GPS timestamps. So much for data production!!!
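The "~2%" figure for the gravity channel can be sanity-checked from the numbers quoted above. This is only a rough sketch: the sample rate and the 3 MB/second figure come from the post, while the 4 bytes per sample is an assumption (32-bit samples) that the post does not state.

```python
# Rough check of the share of the per-interferometer data rate taken
# by the differential arm channel.
SAMPLE_RATE_HZ = 16384        # from the post
BYTES_PER_SAMPLE = 4          # ASSUMPTION: 32-bit samples, not stated in the post
IFO_RATE_BYTES_PER_S = 3e6    # "about 3 MB/second per interferometer"

strain_rate = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
fraction = strain_rate / IFO_RATE_BYTES_PER_S

print(f"gravity channel: {strain_rate / 1e3:.1f} kB/s")
print(f"share of total:  {fraction:.1%}")
```

Under that assumption the channel works out to about 65 kB/s, which is consistent with the quoted ~2%.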

There are four major data analysis working groups - inspiral, burst, continuous wave and stochastic. These have different requirements for their searches/algorithms/waveforms due to the different astrophysical signal origins/types - though a typical approach uses matched filtering. It looks like those that are suitably permitted access/share the data via the LSC DataGrid - again not directly involving E@H.

Essentially E@H is simply looking for excess power ( significantly above noise ) in the data. Einstein @ Home is characterised as 'Off-line large scale computing power' and 'distributed data analysis system' - we're in the data analysis pipeline for LIGO - not the only ( but a major ) player. LDAS can perform 'data pre-processing, conditioning, reduction' and stores the outputs of the analyses within a relational database system.
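To give a feel for what "excess power significantly above noise" means, here is a toy sketch. It is not the actual E@H search code; the function names, the threshold factor of 20, and the synthetic data are all made up for illustration. It hides a sinusoid in Gaussian noise of comparable size, computes a plain power spectrum, and flags frequency bins whose power stands far above the median noise level.

```python
import cmath
import math
import random
import statistics


def power_spectrum(samples):
    """Return {bin: |DFT|^2 / N} for bins 1 .. N/2 - 1 (naive O(N^2) DFT)."""
    n = len(samples)
    powers = {}
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        powers[k] = abs(acc) ** 2 / n
    return powers


def excess_power_bins(samples, factor=20.0):
    """Flag bins whose power exceeds `factor` times the median bin power."""
    powers = power_spectrum(samples)
    noise_floor = statistics.median(powers.values())
    return [k for k, p in powers.items() if p > factor * noise_floor]


# Synthetic data: unit-amplitude sinusoid in bin 20 plus unit-variance noise.
random.seed(1)
N, SIGNAL_BIN = 256, 20
data = [math.sin(2 * math.pi * SIGNAL_BIN * i / N) + random.gauss(0.0, 1.0)
        for i in range(N)]

candidates = excess_power_bins(data)
print("candidate bins:", candidates)
```

In the time series the sinusoid is no larger than the noise, but in the spectrum its power piles up in a single bin while the noise stays spread over all of them, which is why the signal bin stands out.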

So I'd say the results of our processing at E@H will find their way to that repository, and a further guess would be that the timing of post-processing would depend upon a given strategy. I would expect that stepping through the phase space of whichever search/problem is in question lends itself to the pipelined structure of this enterprise. There are some early conceptual design/requirements documents here, here, here, here, here, and here - which don't quite directly answer the questions asked in this thread but give a background flavour of the task at hand.

Cheers, Mike.

NB. Oh, and from memory, the cosmologists who do inflation/big-bang modelling (from the 'slow roll' of an 'inflaton' field) routinely talk of state/entropy generation to a googol (10^100) or so magnitude. If you like, this is ~ the quantity of states that the universe starts with - the initial 'clock winding up' - and it's downhill from there. But heck, there must be a bucket or three of assumptions in such guesstimates. :-)

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

jowr

Joined: 19 Feb 05

Posts: 55

Credit: 1947636

RAC: 0

That's interesting. Would

1 Feb 2008 5:41:54 UTC

Message 76481

That's interesting.

Would it be reasonable to assume that E@H will find everything there is to find since we are processing _all_ of the data on _all_ of the sky while looking for anything poking out of the noise?
