Just curious: What happens to the returned workunits once they are validated?
E@H awards about 7 million credits each day; at 240 credits per result, that should mean about 7e6 / 240 / 2 ≈ 15,000 validated results per day (the factor of 2 for the quorum of two results per workunit).
At an average (compressed) size of roughly 75 kB per result, this would mean on the order of 1 GB of zip-compressed ASCII data each day (15,000 × 75 kB ≈ 1.1 GB). I guess storing the data in a DB would require roughly the same amount of space, or only slightly less.
For the whole S5R3 run this means several hundred GB of raw data contributed by clients.
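The back-of-envelope arithmetic is easy to check in a few lines (all figures are this thread's own estimates, not official E@H statistics); note that 15,000 × 75 kB lands near 1 GB per day:

```python
# Sanity check of the estimate above, using only the numbers quoted
# in this thread (daily credits, credit per result, quorum, result size).
credits_per_day = 7e6          # ~7 million credits awarded daily
credits_per_result = 240
quorum = 2                     # two results per validated workunit

validated_per_day = credits_per_day / credits_per_result / quorum
result_size_bytes = 75e3       # ~75 kB zip-compressed per result (estimate)

daily_volume_gb = validated_per_day * result_size_bytes / 1e9
print(f"{validated_per_day:.0f} validated results/day")  # ≈ 14583
print(f"{daily_volume_gb:.2f} GB/day")                   # ≈ 1.09 GB/day
```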
Do these results pile up in a giant storage array until the run is completed, or will postprocessing begin immediately after validation? Or only after all results for a certain frequency band are in? As I said, I'm just curious.
CU
Bikeman
Postprocessing of E@H data... but how?
I'm curious too. The volume of data for this project is truly mind-boggling. I wonder how the servers keep up with it all.
If you think of the universe as a computer cluster of untold numbers of particle "processors", each with a program to follow the rules of physics, then you really have a super duper powerful computer. How many terabytes of storage does the universe have? I'm just rambling now.
RE: If you think of the
ISTR reading (in Asimov?) many years ago an estimate of around 10^70 for the number of particles in the observable universe. Sum over the number of bits required to specify the quantum state of each one …
RE: Just curious: What
Here's another question along the same line...
Now that the S5R3 apps include some of what used to be post-processing functions, can we now extract any meaningful data from the S5R3 units that have been completed, or do we still have to wait until all work units have completed?
RE: RE: If you think of
Yes, but some of them may be "entangled", so that a measurement of the state of one particle also gives the state of the other. This is the idea on which quantum computing is based (see www.qubit.org).
Tullio
Well, I've been doing a bit of digging at the LIGO Document Control Center, using the phrase "Data Analysis" entered into the Keyword field (no entries in other fields). I got many hits, but in particular G060539-00.pdf, titled LIGO Data Analysis Systems (Data Management and Analysis) - Annual NSF Review (LDAS), presented on 23/10/2006. [NB: LDAS took 22 man-years of software construction, and undergoes upgrades, i.e. versioning.]
Pages 3 through 12 outline the sorts of numbers that are emitted by the detectors, and the hardware systems that handle them. The remainder deals with 'in house' analysis of both realtime and offline character, not involving E@H particularly - but also see here and here. 'Heavy metal' is probably an understatement.
Page 7 indicates ~470 TB generated per year. That is about 3 MB/s per interferometer, with the 16384 Hz sampling of the differential arm ('gravity') signal making up only ~2% of it - also included is a raft of 'state of the IFO' time-aligned data channels like servo settings, seismometers etc. This is all continuous, un-triggered readout with GPS timestamps. So much for data production!
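Those LDAS figures can be cross-checked roughly. The 4-byte sample size below is my assumption (single-precision floats); the slides quote only the 16384 Hz rate, the ~3 MB/s per-IFO rate, and the ~2% share:

```python
# Rough consistency check of the quoted LIGO data rates.
ifo_rate = 3e6                  # ~3 MB/s per interferometer (quoted)
strain_rate_hz = 16384          # differential-arm channel sampling rate
bytes_per_sample = 4            # ASSUMPTION: single-precision floats

strain_bytes_per_s = strain_rate_hz * bytes_per_sample   # 65536 B/s
fraction = strain_bytes_per_s / ifo_rate                 # ≈ 0.022, i.e. ~2%

seconds_per_year = 365.25 * 86400
per_ifo_tb_per_year = ifo_rate * seconds_per_year / 1e12  # ≈ 95 TB/year
```

At ~3 MB/s each, the three S5-era interferometers alone would generate on the order of 280 TB of channel data per year, the same order of magnitude as the quoted ~470 TB/year total (which presumably also covers overheads and derived products).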
There are four major data analysis working groups - inspiral, burst, continuous wave and stochastic. These have different requirements for their searches/algorithms/waveforms due to the different astrophysical signal origins/types - though a typical approach uses matched filtering. It looks like those who are suitably permitted access and share the data via the LSC DataGrid - again not directly involving E@H.
Essentially E@H is simply looking for excess power ( significantly above noise ) in the data. Einstein @ Home is characterised as 'Off-line large scale computing power' and 'distributed data analysis system' - we're in the data analysis pipeline for LIGO - not the only ( but a major ) player. LDAS can perform 'data pre-processing, conditioning, reduction' and stores the outputs of the analyses within a relational database system.
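As a toy illustration of an "excess power" search: inject a weak sine wave into white noise and flag the frequency bins whose power stands well above the noise floor. This is only a sketch - the real E@H search is far more elaborate (F-statistic, Doppler demodulation, etc.) - but it shows the basic idea of looking for power significantly above noise:

```python
# Toy "excess power" detection: a weak periodic signal buried in noise
# shows up as a frequency bin whose power greatly exceeds the noise floor.
import numpy as np

rng = np.random.default_rng(0)
fs, n = 1024, 4096                            # sample rate (Hz), samples
t = np.arange(n) / fs
signal = 0.3 * np.sin(2 * np.pi * 100.0 * t)  # weak line at 100 Hz
data = rng.normal(0.0, 1.0, n) + signal       # buried in unit white noise

power = np.abs(np.fft.rfft(data)) ** 2
freqs = np.fft.rfftfreq(n, 1 / fs)

noise_floor = np.median(power)                # robust noise estimate
threshold = 20 * noise_floor                  # crude significance cut
candidates = freqs[power > threshold]         # the 100 Hz line should stand out
```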
So I'd say the results of our processing at E@H will find their way to that repository, and a further guess would be that the timing of post-processing depends upon the chosen strategy. I would expect that stepping through the phase space of whichever search/problem is in question lends itself to the pipelined structure of this enterprise. There are some early conceptual design/requirements documents here, here, here, here, here, and here - which don't quite directly answer the questions asked in this thread, but give a background flavour of the task at hand.
Cheers, Mike.
NB. Oh, and from memory, the cosmologists who do inflation/big-bang modelling ( from the 'slow roll' of an 'inflaton' field ) routinely talk of state/entropy generation to a googol ( 10^100 ) or so magnitude. If you like this is ~ the quantity of states that the universe starts with - the initial 'clock winding up' - and it's downhill from there. But heck, there must be a bucket or three assumptions in such guesstimates. :-)
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
That's interesting.
Would it be reasonable to assume that E@H will find everything there is to find since we are processing _all_ of the data on _all_ of the sky while looking for anything poking out of the noise?