8 Mar 2011 1:27:31 UTC

Topic 195693

(moderation:

We now continue on from a prior thread ....

Let's look at a version of the Fourier analytic equation for calculating the coefficients :

which is for a periodic function f having some period T. Try to imagine ( at least in your mind's eye ) what happens if/as we let T get larger and larger, so that in the limit T becomes infinite :

- firstly, this will mean that f has only one cycle, because the period becomes sooooo long that it can never happen twice! :-) :-)

- secondly, the variable combination of k/T becomes a single continuous variable. Why? Because when T was fixed k simply marked off the frequency axis in integral jumps ( with some minimum gap/jump ), whereas if T can get arbitrarily large then I can choose any arbitrary k so as to form a quotient k/T which is as close to any real number as I please. As T gets larger 1/T allows more 'fine graining' of frequency space.

- thirdly, the normalising 1/T factor out front is no longer needed as it was only present to account for different functions recurring at different rates. This is no longer an issue.

In any case, despite or because of such heuristic arguments, the following defines the Fourier Transform of a non-periodic function :

where Ff could be read as denoting 'that function which is the Fourier transform of the function fâ€™. Ff(s) is this function evaluated at the value s, where s is some frequency value chosen. So Fg is the Fourier transform of the function g etc. Ff and f are functions that are related by the Fourier transform process although are actually defined on the same mathematical domain*, that being all the real numbers. We just happen to distinguish a special meaning per the problem context - as a number is just a number with the units of physical measurement being a separate business. We are using the different symbols â€˜sâ€™ and â€˜tâ€™ to name the index variable for each respective function, but both indicate ( that's what indices do ) real numbers. We use the juxtaposition of the symbols F and f together as Ff to remind us of that relationship between functions via the Fourier process. One viewpoint is that F is an operator on the function f having the result of producing another function denoted as Ff. Thus F alone does not represent a function on numbers yielding other numbers, and so F(s) would make no sense.**

This is a somewhat more abstract approach â€“ an operator turning one function into another, thus it is a â€˜function upon functionsâ€™ allied in concept to an â€˜ordinaryâ€™ function translating one number to another â€“ but I think is a necessary step to avoid a tangle of notation/concepts. This still has an â€˜index look-upâ€™ idiom. Hence you could run your finger down a list of functions to reach some desired one and then slide your finger across to the next column to see what function is associated with it via the property of being itâ€™s Fourier transform. Indeed one of my Fourier textbooks has precisely that as an appendix or three that goes on for pages and pages. As with any notation one ought focus on the results of using it as and when applied. We will lean even more heavily on this idea later â€“ defining an operation and itâ€™s notation by results â€“ when I talk of the â€˜deltaâ€™ and similiar â€˜awkwardâ€™ functions.

To restate/summarise/flog-to-death :

- f is a function with domain the real numbers and the range ( or more precisely the image ) of this function is ( some subset of ) the real numbers. You can evaluate the function at some domain point â€˜tâ€™ and call that range value obtained f(t).

- Ff is likewise a function defined upon the real numbers, and also available to be evaluated ( ie. look up the number thatâ€™s associated with a given choice of index ) as Ff(s).

- F is an operator upon functions. Itâ€™s domain is the set of all functions for which the Fourier analysis process makes sense, and the range is the set of all functions which can be a result of that process.***

Opinions differ and you will find other views. What Iâ€™ve chosen is not perfect but follows many authors who seem to have avoided much symbolic/semantic mess.****

Now one can ask the question "why can't we use the formula for periodic functions?" Short answer : one can use the formula but the integral won't converge. As f oscillates forever, and so does the complex exponential, then unless there is some especial cancellation or collusion there will not be a single value that a Reimmann sum will become arbitrarily close to. Or say the 'box' function, if you stretch/extend it to infinity you will get a constant horizontal line at level of 'one' - the integral formula above also won't converge. But in a certain sense ( see later under the topic of 'generalised functions' ) one can get the Fourier Transform of those periodic ones - and the tricky ones too.

In the context of non-repeating functions then what is the meaning of the transform? Mathematically we have determined coefficients which are spaced infinitesimally apart in frequency rather than with discrete values. With the Fourier Series one could always define some least frequency ( the fundamental ) : as the period was finite ( bounded above ) then the inverse of the period had a lower bound.

Not so with the Transform. This makes sense as with any arbitrarily long period we need to be able to represent really slow changing trends, and possibly as close to zero rate of change as needed. On the other hand we might have some curve/signal which changes quite quickly, thus there is no a priori upper limit on the frequencies to be included in a synthetic sum.

In a real sense the derivative of the time signal correlates with the maximum frequency to use in the transform, which ( via the Fundamental Theorem of Calculus ) is explicable because the frequencies are calculated by integrating the time series!! There's a 'complementarity' here that we will work on some more, when we look at the 'normal' or Gaussian function next time! :-)

Cheers, Mike.

* Note that I am distinguishing a function f as an entity defined by the totality of all itâ€™s pairings of domain/range numbers from a specific range value f(s) which is the result of a particular index look-up. With many texts, and my earlier explanations, this is somewhat blurred but we need to be more precise now.

** Maybe a reasonably good analogy ( if not pushed too far ) is with bank accounts, say the type that return an interest payment for amounts deposited over time. A function f upon the numbers in that account is represented by the formula that derives the interest obtained. This would be evaluated when appropriate to give a number f($), with $ being the account balance that applies when that action is performed. Many banks offer this and you might choose to change banks, or to a different type of account with the same bank, to your advantage. Thus there will be a â€˜transferâ€™ function or operator that in effect withdraws amounts from such an account and deposits into some new account. There will likely be a different contractual arrangement on that new account as well, differing from the prior deal in specific terms like the rate of interest & method of calculation, minimum balance, investment periods, time delay on any â€˜callâ€™ etc. So one could consider a transfer operator T that converts one financial contract/account into another so that some f becomes Tf. Hence T is acting on an account entity and not the balance contained within, so that T($) has no meaning. Tf($) does have meaning : it is interest computed for the Tf account as per the amount $ within. You could have labelled the account balances distinctly, as say $_A and $_B, to indicate which account a balance refers to even though itâ€™s all the same currency.

*** To be a spoiler these pretty well turn out to be the same set of functions for the domain and range! Meaning that a Fourier analysable function may be analysed yet again, but no matter for now ... :-)

**** I'd have to point out that other definitions of the transform abound. I chose to scale the argument of the complex exponential by 2*PI and that in effect makes 't' a pure time variable. Others don't, hence either 's' and/or 't' will scale differently, however the 2*PI will still appear somewhere - usually dividing out front of the integral. For that matter my minus sign on the complex exponential phase could be a plus, so that makes the propellors rotate the other way and the analysed coefficients thus flip their phase sign. From what I've read it seems the former sign convention is more common in engineering, the latter more so in physics. None of this is as big an issue as it might first seem, except when translating one version to another ( you'd substitute '2*PI*s' for 's', or whatever, in the various formulae and then simplify ). The 2*PI is always there and just shifts around the place, as one can't escape the circularity of the base functions. Any chosen representation remains internally consistent and all general lessons remain sound. We will be looking at shifting and scaling of arguments later, and derive some interesting results.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

Language

Copyright © 2021 Einstein@Home. All rights reserved.

## Fourier Stuff - Part 2

)

Let's look at the formulae for use with continuously indexed domains then :

Where these formulae only make sense and are mathematically legitimate if f is not too rowdy ( square integrable, finite power, Lebesgue integrable etc ). Actually if f is nice then Ff will be too ( and/or vice versa ).

The first is the analysis formula : to decide the frequency-by-frequency contributions of the base functions to our signal function.

The second is the synthesis formula : instead of a sum to infinity of discrete frequency terms it's an integral over a continous range.

Here'll you note some similarities or symmetries. If I swap the signs in the exponential and the relevant indices to sum over, then one equation looks pretty much like the other. This is a deep and key point in fact. Here's the wide-angle-lens view on that. Go back to the vector space analogy : each function is a vector that can be expressed in terms of a sum of base vectors. The coefficients at one per base vector also form a vector, being an ordered list of number components. Don't worry if that is an infinite number of components. This other vector is not necessarily in the same 'concept space' as the original vector though ( even though the indices are the same algebraic type ). What we have is procedures generically called Fourier operations that relate a vector in one space to that in another, but the 'forward' and 'reverse' operations are almost indentical in concept and execution. Obviously we'd want the Fourier Transform of f when giving Ff to inverse/back/synthesise transform to f. If either leg can happen at all, then that loop f -> Ff -> f exists. So that implies Ff -> f -> Ff for that matter.

Apart from a few annoying details of scaling, ranging, naming and notation the same holds regardless of the 'algebraic type' of the domains : integers or real numbers. So there are really four Fourier processes :

integers -> integers

integers -> continuous

continuous -> integers

continuous -> continuous

which is why I've been somewhat deliberately vague in my choice of thread title, using 'stuff' to throw a blanket over them all. We will do 'integer -> integer' later on ......

Now with many transforms or conversions of one vector/function to another there arises the issue of 'what doesn't change?', which implies there might be something that doesn't vary when one applies some transform/operator. This arises so often in modelling lots of physical situations that a special word is used, of German origin I believe, and that is the word 'eigen'. Which means 'characteristic' or 'typical' in the sense that the basic behaviours of some physical system relate primarily to an 'eigenstate'. It turns out that one can express other, non-eigenstate, circumstances in terms of combinations of eigenstates. There is a whole forest of terminology based upon the 'eigen' word like : 'eigenvalue', 'eigenvector', 'eigenfunction', 'eigenspace', 'eigendecompostion', 'eigenbasis' etc. I've even seen the word 'eigenface' applied where any human's facial features can be modelled as variously weighted sums of basic features/patterns. It all comes back to this idea :

Operation on (something) = constant times (something)

If the (something) can be thought of as a vector then call it an 'eigenvector' if it can satisfy that equation. Generally (something) = 'zero' is not considered a particularly illuminating or interesting case : for one thing the constant won't be uniquely specified, for another operating on nothing and getting nothing is not usually edifying. So an eigenvector = 0 solution is pretty well always excluded from consideration even if trivially true. If the (something) is considered as a function ( and in a way, what can't be? ) then label it an 'eigenfunction'. The constant is typically called an 'eigenvalue', and it could be equal to zero without any worries ( though that does have special significance to the problem at hand ). So roughly speaking if I, standing up, turn around 360 degrees then I have 'operated' or turned myself into myself, with an 'eigenvalue' of one. The eigenvalue doesn't have to be one, it could be negative say or even imaginary! Suppose the operation is differentiation and the function is one of our complex exponentials :

and we could phrase that as saying that the complex exponential is an eigenfunction of the derivative operator/operation. Generally the eigenvectors/functions will each be matched to their own specific eigenvalues. In the above case you can see that the eigenvalue depends upon s, so if perhaps s = -2 then the eigenfunction is exp^[-2*PI*i*(-2)*t] = exp^[+4*PI*i*t] and the eigenvalue is +4*PI*i.

So if you apply some operation to a set of things, and at least for some of those things you get the same thing as a result ( within a multiplicative constant, so the thing is still 'pointing in the same direction' ) then that tells you alot about the system you're dealing with. Much analysis of dynamical systems involves what I would call 'hunting the eigens'. Heck I'll call it 'eigenhunting'. Why not? :-)

Particularly important is the magnitude and the sign/sense of the eigenvalue with respect to one*. If I kept taking derivatives of those Fourier base functions I will get them back again, but the eigenvalues will involve progressively higher powers in the variable s ( squared, cubed .... ). Ah, I'll have to stop going too far down this detour alas. But perhaps wind up a bit by saying that this eigenbusiness can also disclose underlying symmetries.

So what of the Fourier Transform then? It's an operation upon a set of functions/vectors. Are there any eigens about? Yup, you betcha. The Gaussian function is an eigenfunction of the continuous Fourier Transform.

The Gaussian This is not the only Gaussian form/formula about, but it's the one I'll use :

named after Johann Carl Friedrich Gauss of course. What a guy. One of my faves right up there with Galileo, Newton, Einstein and Feynman. The classic/archetypal 'polymath', sort of an intellectual 'midas' who made gold out of anything he touched. Even the students he supervised are a Who's Who of 19th century academia. You may well know him from magnetism ( units of field strength ) as well as maths. Look him up somewhere and have a boggle! :-)

Now :

ignoring what we might think the variables 's' and 't' might mean for some problem, they are both real numbers varying over the same sets. This transform is the same function yet again, as it has the exact same association of any chosen real number in it's domain with the respective real number in it's range. The totality of domain/range functional pairings has not altered one whit. It is it's own Fourier Transform, and thus is an eigenfunction of this Fourier operation with the eigenvalue as one. Next up : scaling and shifting with transforms.

Cheers, Mike.

* There's nearly always an identity function/operator in a system, which has the effect of turning something into itself, for every something, thus everything is an eigen-something with eigenvalue one. You might want to guess what the eigenvalues would be for a system within which signals die away, or amplify without bound, or oscillate. I've assumed/implied the eigenfunctions/vectors here are mutually orthogonal too .... that doesn't have to generally hold ie. eigen-whatever doesn't necessarily imply either orthogonal or basis behaviour. But very frequently they are. A linear combination of eigen-somethings ( eg. Fourier transforms ) doesn't have to be an eigen-something.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## Scaling is where we multiply

)

Scaling is where we multiply or divide the indices in a function's domain by some constant and see what happens. In this instance with time measurements :

... if one clock ticks faster than another because it was constructed/scaled to do so, then a given actual signal will appear to evolve slower on the faster clock! Meaning that if you take two signal values - so & so volts here, and such & such volts there - at certain times on the original clock then those values occur with a wider time interval between them as measured by the faster clock.

If you think of the converse scaling from t' to t then 1/a 1, the one out front ( in red ) indicates that the amplitude of the transform of the signal reduces at any given frequency. The second ( in gold ) means that the transform stretches on the frequency scale.

- if a < 1, the amplitude of the transform increases and the transform compresses on the frequency scale.

Obviously a = 1 is no change. Plus we can't let a = 0**. Negative values of a ( a < 0 ) express symmetry ( or even-ness ) of the transform with time reversal - you may recall that we could identify negative frequencies with waves running backwards in time. Sharp punters will begin to notice a very important principle emerging :

now the Gaussian function ( others too ) can be assigned some algebraic measure of spread like variance or Full_Width_At_Half_Maximum or whatever. The key point is that these widths in the domain and the alternate domain are inversely related and the product of the two is at least some minimum non-zero number aka Heisenberg's Uncertainty Principle including Planck's constant in Fourier language.

This is a bedrock of quantum mechanics, where so-called conjugate quantities do have such an inverse relationship*** eg. time and energy ( proportional to frequency ), distance and momentum ( proportional to wave-number or cycles per distance unit ), angle and angular momentum ..... so I guess this says something quite deep**** about the universe. Or at least how one can measure it, which 'logical positivism' states is the same thing. Basic physical quantities relate via the patterns of Fourier transforms. There is an irreducible lower bound to our ignorance, it can't be absolutely zero! I don't think Fourier would have predicted that! :-)

Now time shifting seems pretty boring by comparison :

... the phases of the Fourier components ( which we generally modeled as complex numbers ) in the frequency domain are slid along/around the cycle on a per frequency basis. That is, while the shift is constant ( the beta in purple ) what fraction of a cycle that represents depends upon the frequency - higher frequencies give a greater relative phase shift. Nothing apparently stunning, but the amplitude of the signal for any given frequency is not altered ( the absolute value of the red exponential term is one ), nor is any frequency shifted. Believe it or not this is Conservation of Energy in disguise! A time shift leaves energies unchanged. One can likewise consider distance and inverse distance and get Conservation of Momentum emerging ( a displacement leaves momentum unchanged ), or angle/inverse-angle yields Conservation of Angular Momentum ( a rotation leaves angular momentum unaltered ). Again, we have key features of the universe which Fourier stuff neatly encompasses! :-)

Emmy ( Emily? ) Noether was a contemporary collaborator and correspondent of Einstein. She established a theorem linking symmetries to conservation principles. Symmetry here means what doesn't change when some operation is performed. Hence if the time/distance/angle shifts mentioned above leave the transform's frequency and amplitude unchanged, then something is conserved. Fourier Stuff doesn't prove that per se but is certainly consistent with it. It's also a very deep idea that one can describe the world in some 'primary' domain like time or distance or angle, and also legitimately describe it via the inverse time or inverse distance or inverse angle domain.

OK, we have to start moving towards convolution now. :-)

( edit ) Sorry, I should have emphasised that when we squeeze the Gaussian in the time domain it's slope/derivative increases ( you'd ski down the hill faster! ) and this has the effect that the transform widens implying that more higher frequency components need including in the sum that back transforms ( frequency to time ) to produce that steeper shape. Conversely having a more gentle slope in the time domain allows fewer higher frequency components, and bigger doses of the lower frequency components, that is the frequency curve squeezes.

( edit ) Regarding the time reversal effect on the transform the 1/a out front strictly should be written as 1/|a| - keeping in mind that the Fourier integral will reverse it's limits ( minus infinity swaps with plus infinity ) when s changes sign.

Cheers, Mike.

* I very much like Einstein's approach : 'time is that which is measured by clocks' .... :-)

** Though the ( limiting ) case where a approaches zero is interesting. See the discussion on the 'delta' function later.

*** The areas under the curves are preserved too, this in QM ultimately becomes a constancy of total probability. So 'something' must happen but we can change the odds amongst the various possible outcomes, effectively shifting probabilities around. But you have to rob Peter to pay Paul. :-)

**** Elsewhere in science this emerges in many different disguises. For instance Shannon's theory regarding information transmitted over a fixed bandwidth communication setup ( bit lengths vs error rates ). Or it'll crop up at our gravitational wave interferometers such that if we want to look at some finer level of frequency change then we'll need to observe for longer intervals to be sure of that ( delta_frequency down means delta_time up ).

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## Convolution Part 1 Sooooo

)

Convolution Part 1 Sooooo ....... let's think of how you might go about finding or matching a guess at some signal shape within some sequence of data points that you have. Try this shape, but call it a signal template to be fancy :

yup it's the square wave or box function that we know and love. Here's our signal record :

and you can see just by eye that our template is in there. This would be a fine method for, say, finding the first pulsar ever discovered. Yup indeedy. Jocelyn Bell did it that way on hundreds of yards of paper tape! In our minds we'd probably be doing this :

that is, sliding the template along the record ( or the record past the template ) and saying 'Ah ha!' when they co-incide. But this is way too simple for real work ( though the Stardust@Home project recruited people to do what software probably couldn't ). We need a mathematical, and hence computational/algorithmic technique to (i) slide the template along and (ii) know when a match occurs, or at least assess some degree of match if not exact. That's our motivation. Let's slide the template along but at each offset* note what is the common area under both curves :

here the little gold ball is tracking the leading edge of the moving box. This ball goes up and down according to how much area is simultaneously contained ( green shading ) within both boxes. It hits a peak when they exactly overlap, rises and falls to either side, and there is no common area when the boxes separate completely. Mathematically we would write this as follows :

where I designate x as the offset amount and this is designated as the ( cross ) correlation** ( product ) of the function f with the function g. Note carefully that even though the first named function is included in the product as a complex conjugate ( designated by the trailing asterisk ie. f* ), this will not make a difference for real valued functions. We are accumulating a pointwise arithmetic product for each choice of x. For reasons which will shortly become clearer, I can do what is called a substitution of variable and convert this to :

this operation is called the convolution of f with g. The substitution is a way of adopting a different measurement scale, like feet to metres or gallons to litres, where most importantly the x now appears as a -x. Note that x has units of time in this discussion.

Now I'll do a sidestep ( but not really ). Let's suppose I have the Fourier transforms of f(t) and g(t) respectively, meaning that I have two functions in the frequency domain Ff(s) and Fg(s), and in that domain I'll multiply the functions together as Ff(s)Fg(s) :

this gives a new shape in the frequency domain that is composed of overlapping amounts of each. Note that where either function is zero then the product is zero. I've applied either one of the functions to be the filter of the other one. If I do the (back) transform to the time domain - thus discovering the function that when Fourier transformed gives Ff(s)Fg(s) in the frequency domain - I will obtain :

This is a marvelous result and indeed works both ways. The product of the functions in one domain is the convolution of the corresponding (back) transforms in the other domain. Correlation is equivalent to a convolution, which in turn can be expressed as a multiplication of functions. Most importantly a match/not as judged by correlation in the time domain thus becomes a match/not as judged by the result of multiplication in the frequency domain.

Take an extreme case where f(t) = 0 for all time ie. no signal at all. Ever. Then the transform Ff(s) is zero for all frequencies ( as you'd naturally guess ) because the following integral always returns zero :

... for any frequency choice. [ This is also a case where the Fourier transform gives you the same functional form, like the Gaussian did, but recall that we agreed with eigenvectors to exclude the 'zero eigenvector' instance from eigen-discussions ]. If you multiply this function by Fg(s) which is non-zero in some frequency range ( because g(t) was non-zero for some range in time ) then that gives Ff(s)Fg(s) = 0 regardless of whatever g was up to. This means there is no correlation b/w f and g in the time domain. There is hence no correlation b/w 'no signal' and 'some signal'. While this is intuitively pleasing, there would be no correlation b/w 'no signal' and 'no signal' too ... :-)

Now let's take the other extreme where the template is identically the signal. This should not just be a good match, but in fact the best possible match ( well, there's possibly a normalisation issue ..... ) :

The left hand side of the graphic shows the time domain, the right side the frequency domain. This suggests the general idea a signal match somewhere along the time co-ordinate ought yield some sort of spike within our transform manipulations in the frequency domain. Incidentally for this example we have also 'auto-correlated', giving a measure of total signal presence.

We have now arrived at the point to generally*** describe the core of the E@H 'analysis engine'. Take some stretch of data from Aricebo, Parkes or LIGO IFO's. Do their Fourier transform. You have templates for the signals, and so we do their transforms. Multiply the signal transform by a/the template transform(s) in the frequency domain. Obtain some numerical estimate of match ( ie. to what degree is that function multiple non-zero ) which will hence reflect the degree of match in the time domain. Remember that shifting the signal in time ( slide the template along ..... ) does not affect the magnitude of the Fourier transform at any given frequency.

So why would one bother to do things this round-a-bout way? Why not just convolve ( ie. do the integrals directly in the time domain ) for each choice of template with each stretch of signal data, and at each possible choice of time offset? Efficiency of course is the answer. The Fourier Transform in it's discrete form can be enacted by the FFT**** very quickly, and multiplying two functions together across a single sweep of a common frequency band is not too bad ( and you could integrate/running-total their product as you go ), plus you don't have to transform back from the frequency domain to the time domain to know how good the match is! What one effectively/eventually winds up with goes under the generic heading of 'matched filter' ( matched I think means the filter is specific to the template ), a sort of mathematical gadget that you feed some signal into and out comes a number that reflects the success of the match. It is such numbers that are returned to the E@H servers upon completion of a work unit, in a ranked list of most to least significant 'hits'. This is the 'main crankshaft' of the project.

This makes good intuitive sense. If two time indexed functions are close in shape then they ought have similiar frequency components, and that also implies similiar derivatives. So the degree to which their amounts-per-frequency agree should be close. If you like consider the candidate template as a filter that emphasises the frequencies that match it's own pattern, while diminishing those that don't. An opera tenor singing near a wine glass ( or a grand piano with the lid open .... ) will cause it to vibrate according to the sound energies at the frequencies it's construction will cause it to respond to. The glass will barely vibrate for frequencies off the right pitch. The glass is 'filtering' the tenor's song, and we are mimicking such a physical 'analogue' device but by using numbers within our computers! :-)

Next up : a closer view of convolution in a discrete/integer sense.

Cheers, Mike.

* If I write x' = x + a then f(x) = f(x' - a), so whatever happened at x = 0 now happens at x' = a ( meaning the functional values are the same ). Depending upon your preference this is either shifting the function to the right or shifting the horizontal axis to the left. For x' = x - a then one gets get f(x' + a) and converse shifts.

** Again you may not find all texts agree on nomenclature and exact definitions .....

*** I haven't indicated exactly who is doing what here : server side, client side, pre-processing or post-processing etc. But something akin to the sequence I describe happens somewhere & sometime, no doubt with shortcuts and other cleverness. What also gets rolled into the filter setup/usage is how an interferometer would/should respond to a signal from so-and-so direction in the sky, dopplering due to the Earth's motions, the putative source system's motions, change in frequency at source ...... in addition to what some waveform would 'look like' if just chuffing past you in free space. As I've previously alluded I'm not knowledgeable enough to give further detail on this.

**** We'll see the Discrete Fourier Transform and it's speedy mate the Fast Fourier Transform later on, but here's a quick glance : compare ( back of the envelope ) estimated computational loads via the 'usual' number of 'basic' instructions to execute : a typical DFT transform for N points goes like N^2, but goes like (1/2)N*log_base_2(N) for the FFT. So if one has 'only' 2^10 = 1024 data points, say, then 1024^2 ~ 1 million instructions for DFT but (1/2)*1024*(10) ~ 5000 instructions for the FFT. A factor of around 200 times faster. The relative ratio (DFT divided by FFT) goes like (2N/log_base_2(N)) and hence that benefit increases even more as N increases. If you consider just one second of data taking at a LIGO IFO photodetector that's 16384 = 2^14 points. That translates to a factor of about 8000 times faster with FFT vs DFT!!

So the FFT really does deserve the FAST adjective ..... so much so that one might not even consider doing certain analyses without it .... :-)

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## RE: If you consider just

)

Whoops, slipped a cog. I think that factor is wrong : 2 * 16384 / 14 ~ 2340

Ah well, an exceptional magnitude of improvement. I think there have been further tricks and shortcuts too, since the days of Cooley & Tukey - who made the FFT popular in mid to late 1960's - even though later historical research trace the essence of the FFT technique through various guys right back to Gauss. Probably no fluke that interest re-surged after the invention of the hardware ( digital computers ) that could properly take advantage of it. Before that 5000 instructions was as equally impractical as a million, effectively infinite or impossible either way! :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## Matrix Meandering After some

)

Matrix Meandering After some looooong thoughts I've decided to postpone the explanation of integer convolution and discuss ( more ) fully integral/integer indexing of functional values and thus expose the topic of matrices : which we will later see is the 'natural' notation to use when handling the DFT and FFT. Performing a DFT/FFT is really the same as solving a system of simultaneous equations and matrices just rock for that purpose. Somewhat like the complex number business the matrix approach/notation allows simple representation of high level ideas and parks the messy detail under the hood. Plus I reckon it's really cool area of mathematics! The messy detail will turn out to be the inner or dot product, and exactly how one might multiply one matrix by another ( plus whether/why/how/if that operation can be uniquely reverted/inverted ). But first things first. Let's have a look at what restricting stuff to integers as indices looks like on a graph :

so it's a pattern of disconnected dots on a two dimensional plane. The functional values ( vertical distances ) I'm illustrating as integers to be simple, but it's really the fact that the horizontal axis is stepped along in jumps of one unit that matters. We don't actually have a measurement for the in-between bits, and strictly speaking there could be any number and magnitude of functional values within the gaps, for some real world problem. Later we will discuss this very important interpolation issue - or what is the function doing ( probably ) when we aren't measuring it? So this isn't 'smooth' or 'continuous' or 'differentiable' or 'integrable' or any of those concepts discussed earlier. One could write this in tabular form :

which looks like a spreadsheet excerpt. Because it is! That's a good image to have in mind, a matrix is like some part of a spreadsheet. So here's a notation to use to represent these integer situations, called a matrix, starting with a 1-dimensional case or vector :

which one can think of as like a computer language array syntax. A particular number within an array is usually called an array element, with some sort of sub/super-script ( hard to BBCode ). So a_1_2 would denote the element of A in the first row and second column, or a_i_j for the generic i'th row and j'th column element. All the row and column integer indices are positive and traditionally start at one and not zero*. Thus they are the 'counting' or Natural Numbers = {1, 2, 3, 4, 5 .... }. If the values form a single line then the word vector is also applied. I've adopted the convention/assumption that vectors or 1-dimensional matrices have a default vertical column listing of their entries { why I fuss about this detail will become apparent later with matrix multiplication, inner products and ordering of matrices in a matrix product }. But can be written for convenience horizontally by using a little 'T' superscript which stands for ( matrix ) transpose like so :

... or exchange the rows for the columns, a bit like ( ignoring pagination ) English document structure where one goes left to right along a line of glyphs and then go down the lines in order vs. some Asian languages ( I think ) which have consecutive characters going down the columns, upon which finishing a column one moves to the next one on the right ( ? ). Mind you a 1-dimensional matrix can be though of as 2-dimensional : we just have only one row or column, either of which can be any length. As you can see matrices typically have a name or a symbol to know them by and often we will state their dimensions if not immediately apparent. So I can say 'A is an m by n matrix' with the understanding that I mention the number of rows first ( m ) and then the number of columns ( n ). As in : B is a 5 by 3 matrix - so it is called 'B' and has five rows of three columns each. You don't have a different number of columns for each row within a particular matrix. A general matrix is 'rectangular' if m and n are different, and 'square' if m = n ( same number of rows as columns ).

there's a couple of 'obvious' operations with matrices which aren't too surprising :

which works both ways either multiplying in or factoring out a scalar from each and every element. Don't forget we could legitimately use complex numbers, and indeed we will! One can add and subtract matrices but only if each is of exactly the same dimensions, and this is done per each matching row and column entry thus :

..... however matrix multiplication is not similiarly defined ie. multiply the respective elements. Instead it is something that initially looks rather odd. Remember the inner product? For two vectors ( represented as 1-D matrices in this context ) it looks like this :

where I have taken the inner product of a vector with itself to show what that indicates. It is the square of the 'length' of the vector. Good old ( multi-dimensional ) Pythagoras again. The length is also known as the 'norm' or 'magnitude'. We can only sensibly form such a number ( that running total over all index positions ) when the sum makes sense - hence the rule about the number columns on the left equaling the number of rows on the right - which is easily satisfied for a vector and it's own transpose. Next up we will do the biggie ..... an m by n matrix multiplying an n by p matrix ( noting yet again that n is both the number of columns of the first and the number of rows of the second ) which will give an m by p matrix as a result! :-)

Cheers, Mike.

* so the joke goes : hence we are now in the Engineering Department and not the Physics Faculty ..... :-) :-)

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## More Matrix Muddling : I'll

)

More Matrix Muddling : I'll just throw this pseudo-code at you, then discuss. It's an algorithm to form the result matrix C ( m by p ) of multiplying two matrices together ie. AB = C. The left-side matrix A ( m by n ) has the same number of columns as the right-side matrix B ( n by p ) has rows - call this the 'row/column rule' for discussion purposes. You can calculate the entries of C in any order you like actually, as long as you (a) do them all and (b) that each is the inner product of a specified row from A and a specified column from B, as follows :

[pre]For every row i of the first matrix ( 1 thru m such rows ):

For every column j of the second matrix ( 1 thru p such columns ):

Form the inner product between row i and column j

Place this single number result in the i,j entry/position of the result matrix

Next column

Next row[/pre]

... this works for scalars, which are just a 1 x 1 matrices, so as I can write 3 and [3] interchangeably then 12 = 3 x 4 means the same as [12] = [3] x [4] !! The inner product of two vectors ( viewed as 1-D matrices ) is covered here too - consider the left side matrix as a transpose 'lying down' and the right side matrix in column form, so a 1 x n matrix multiplying an n x 1 matrix gives a 1 x 1 matrix !! However when we write matrix equations - where the symbols within are placeholders for matrices - it looks like the 'ordinary' arithmetic rules apply. BUT Most Importantly one cannot rely on AB = BA !! In general that will not hold for several reasons :

(a) if the matrices are rectangular then (i) while the row/column rule may apply one way it won't necessarily when the order of multiplication is reversed ( say, a 3 x 2 matrix and a 2 x 4 matrix ) and (ii) the row column rule may apply both ways but the resultant arrays will be of different sizes ( so if A is 3 x 2 and B is 2 x 3 then AB is 3 x 3 but BA is 2 x 2 ) :

(b) even if square matrices are involved and the reversal does makes row/column rule sense there is still no reason to expect the numbers will be the same for each position in the result matrix C. It can happen that AB = BA for some examples, and is interesting if so, but it mustn't be relied upon. Thus matrix multiplication is usually non-commutative. This is of crucial relevance when we come later to factorising matrices - expressing a given matrix as a product of others. The following is sort of what I do in my mind's eye when multiplying matrices by hand :

..... which I do so hope that I haven't stuffed up! :-) :-)

Remember that each entry in the result array requires the evaluation of a inner product, so the calculation cost can be significant. Roughly : there are m x p entries to evaluate, but each of those entries has n multiplications and about n additions ( being an inner product ). Mind you multiplications are the greater cost to be concerned with. We will mainly consider this situation :

... here an n by n matrix multiplies an n by 1 vector, yielding another n by 1 vector. The F will be our 'Fourier' matrix ( which has a special pattern of entries within, that the FFT algorithm takes huge advantage of ), the x is a vector of to-be-determined Fourier co-efficients and b is a vector of signal values at discrete times. We will want to 'invert' that equation and write :

... thus encapsulating the idea/mechanism of the Discrete Fourier Transform : grab your measurements into a vector, multiply that vector by the appropriately sized 'F inverse' matrix and a-bing/a-badda/a-boom out comes a vector of Fourier co-efficients. The Fast Fourier Transform is indeedy the greedy and speedy way to do that matrix multiplication. Now 'F inverse' has the property that :

so I've introduced the 'identity matrix' or I. It is the '1' of the matrix world, but it comes in many sizes - one for each needed situation/problem - while bearing in mind our row/column rule :

But any given matrix does not necessarily have an inverse*. IF it does then it would be called 'invertible' or 'non-singular'. IF NOT then the terms 'non-invertible' and 'singular' apply. For instance a matrix may have a whole row or column of zeroes, in which case any inner product formed from that row or column must also be zero, thus I can't form some non-zero entry in a product using it - specifically not a completely/properly formed identity matrix as it will be missing a '1' somewhere along it's leading diagonal ( the line of values from top left to bottom right ).

This behaviour has a semi-analogy** ( ? ) in ordinary arithmetic : the number '2' has a multiplicative inverse = 1/2, but 0 doesn't as 1/0 is 'undefined'. Zero is the only number for which that is true. Matrices however can be non-zero ( the 'zero' matrix of whatever size has all entries equal to zero ) and yet still not have an inverse. The reason is the complexity, if you like, of the matrix multiplicative operation. It has many reasons for not being able to run the other way, and the zero matrix is not the only one that doesn't invert.

What does 'singular' mean here? I like to think in terms of reversibility. If I say the number 'two' times 'some number' equals 'six', then you can invert the problem and deduce that the 'some number' must be 'three', AND UNIQUELY SO. However if I say the number 'zero' times 'some number' equals 'zero', then what choices do you have? Any 'some number' will do. Thus once you multiply by zero one can't uniquely determine the other number in the original equation. Thus very many 'some numbers' lead to the one result. Whereas if I say that I took the reciprocal of a number and got 1/2 then you immediately and un-ambiguously know that I began with 2. So zero is sort of the 'black hole of multiplication', once you get there you can't exit by any multiplicative operation and you lose information as to your origin!

Some of you may spotted a problem thus far ..... I've left out how to handle complex numbers properly! With the inner product where the vector or matrix contains complex numbers it's not good enough to just transpose. You have to take the complex conjugate*** too ...

A chap named Hermite thought this up, in his own language and times the 'H' was silent when pronouncing his surname ( but these days it isn't ). You see if we don't create the complex conjugate when we transpose a vector to then form an inner product with itself, the result may not be purely real - which we want as real, in order that it will represent the distance ( squared ) to the origin in whatever dimensional space we're operating within. If we didn't adopt that method then for, say A = [1 i], then A_T.A = 1 - 1 = 0 and we would not only have the length of a non-zero vector equaling zero - whoops - we'd also have the vector orthogonal to itself - whoops on steroids !! :-)

For real values the complex conjugate is that same real value again ( no imaginary component ) so the Hermitian is the same as the transpose. Note the similiarity with the continuous function case in forming an inner product :

.... where we added/summated/integrated over some domain the product of one function times the complex conjugate of the other. We are doing that with the discrete case in much the same fashion. I didn't really justify earlier that inner product definition for continuous functions. I hope you can further see the analogy of those functions with vectors now. :-)

Now where the vector/matrix contains complex numbers that means we are in some multi-dimensional space of complex numbers, and this is not the same as a space of the same dimensions ( or even double the dimensions! ) indexed by real numbers along the co-ordinates. They just ain't the same type of cow, and never will be. We won't need to really worry about that too much. But let's simply remember that they are different.

Next up : what's a system of simultaneous equations when it's at home anyway? For that matter, what does it mean to solve just one equation?

Cheers, Mike.

* I am going to consider only inverses of square matrices and thus avoid the topic, however interesting, of 'left inverses' and 'right inverses'.

** I'm a math/science grad so I really didn't grasp that whole analogy/simile/metaphor thingy. :-)

*** so we're in the School of Mathematics now ..... :-)

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## Solving An Equation What does

)

Solving An Equation What does it mean to solve an equation? Let us start with what an equation is :

something = another_thing

a statement or declaration that two things are equal. So 5 + 3 = 8 is true but 5 + 3 = 7 is not. All along we have been using the idea of a variable, a named placeholder for some quantity chosen from a certain set or domain. So now I write :

[pre]5 + x = 8 { where x is a real number }[/pre]

and can immediately deduce that x has to be the number 3 for the equation to be true. But if I wrote :

[pre]5 + x = 7 { where x is a real number }[/pre]

then x equals 2 is correct for that. Whereas :

[pre]5 + x = 7.5 { where x is an integer }

or

5 + x = -34 { where x is a natural number }[/pre]

don't have solutions given those choices of domain set. Often we leave out the '{ where x is a real number }' type of qualifying comment as the context has or will define that. But never forget that something like that must be there or is implied to be so. Often a train of logic leading up to an equation being evaluated/solved will have restrictions in place, including the base set from which one draws variable values. Try :

x^2 + 1 = 0

now do we have a value of x which will make this truly an equation ie. left side equals the right? If { where x is a real number } applies then no, but if { where x is a complex number } is applicable then yes. Here is a troublesome 'equation' :

0 = 2

or 'uh oh!' as this is somewhat of a dead end and backtracking is appropriate. No clever choice of domain set will save us here. Often we do reach such cul-de-sac's and have to say 'this is a bad road'. The method of 'proof by contradiction' uses this : assume some road is 'good' -> inevitably reach some statement like 0 = 2 -> thus deduce the road was actually 'bad' -> therefore some other road was the 'good' one ( at least in scenarios where one can enumerate logical outcomes/paths and say 'if this, then not that' ). In any case for '0 = 2' we will say no solution for whatever proposition we were examining that led to that. This is not quite so troublesome :

0 = 0

as in

0 * x = 0

this is not a solution. It is however many the base domain set for x will allow, and that could be an infinite number. This means 'any old x will do' or 'this line of thinking didn't narrow down our possibilities for this variable'. What we did for those uniquely soluble cases above was probably something like this :

5 + x = 8

-5 + 5 + x = -5 + 8

x = 3

that is, we manipulated the equation without doing anything that might break it's equality or truth ( assuming it had those qualities originally ). What we 'actually' did was add these two equations together :

5 + x = 8

-5 = - 5

( I'll often use the word 'equation' without qualification or certainty of truth from now on, but remember the above points of logic ). Thus clearly if :

something_1 = another_thing_1

and also

something_2 = another_thing_2

then

something_1 + something_2

can be evaluated by inserting or substituting the alternate value/expression for 'something_1' and 'something_2'. Hence :

something_1 + something_2 = another_thing_1 + another_thing_2

which I can legitimately do because the equalities pronounced them as interchangeable ie. 'something_1' may swap for 'another_thing_1' and 'something_2' could swap for 'another_thing_2'. So I can in effect add two equations together and not alter their truth*! Similiar logic gives subtraction and also multiplication. Just remember to do to the right side the same operation as suffered by the left**. Division is fine also, but not by zero as that is an invalid operation to perform on either side regardless of their equality or not. This can lead to a subtle trap/error :

[pre]x * y = 10 { x is a real number }

y = 10 / x { x is a non-zero real number }[/pre]

which is generally OK but I have to either know otherwise or make a side note that x = 0 is excluded because of that division. In fact you could have said neither x nor y can be zero from inspecting x * y = 10, by remembering that as 0 is the 'black hole of multiplication' one couldn't get 10 as a product if either x or y was zero. This leads to a more general principle of equations with variables : if one is going to allow the value of the variable to range widely across some set - meaning that any one number in the set can be inserted/substituted into the equation where the variable is mentioned - then it is one's responsibility to check for 'potholes' in the domain set, either existing at the start or introduced during some process. Failing to account for such difficulties/assumptions has been the commonest cause of my math problems anyway. The worst assumption is the one you don't know of ...... :-)

What we aim to achieve is separation. We start with :

something_containing_x = another_thing_possibly_also_containing_x

and get

x = something_not_involving_x

with hopefully the final RHS able to be evaluated without too much trouble like :

x = 4 + ( 7/3 )^4

whatever that turns out to be. Our Fourier Stuff is fortunately all linear equations, indeed I have-been/will-be outlining steps involved in the area of Linear Algebra - an incredibly important topic of maths for E@H and much of science generally. It is when things are non-linear that difficulties can really set in, perhaps intractably. Try this :

cube_root[x*y] = secant_of[PI^y - log(x)]

... best of luck separating that one out for either x or y***! While this equation certainly implies that x and y are related it's not my idea of pleasant mathematical company. :-)

So that's a precis of 'solving an equation'. Next time let's look at a whole gaggle of linear equations. A key point will be how many variables vs how many independent equations do we have in our system.

Cheers, Mike.

* this doesn't apply in Aussie politics .... possibly worldwide politics methinks :-) :-)

** Tread with especial care when taking roots ( and generally inverting functions that are not one-to-one ) as eg.

x^2 + 1 = 0

has two solutions x = i and x = -i

*** I just thought that one up randomly, without any check or analysis, thus I will run the risk of someone actually bothering to attempt to evaluate it .... thus maybe popping out a neat solution! The history of maths is absolutely chock full of guys/gals gnawing on these type of problems for fame or fun or fortune! Or who knows what motivation? If you ever get to read about Donald Coxeter - sadly not with us now - or Andrew Wiles or dear Sir Isaac Newton then you will see what I mean about 'gnawing'. Where would we be without them ? :-)

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## I have stared at this off and

)

I have stared at this off and on for a long time now. I can follow some of it and know what to do to follow the rest because I have heard of it back when I was in the sonar business and have no inclination to do so. That is not the way to say it but bear with me.

A long time ago and far, far away I was one of the first to realize you could take the very new CCD chips, chirp-Z the clock and get an FFT out. The hardware cost reduction was about a factor of 100. However it did not "sell" because it was not a computer.

So where are you going with this? Are you doing a monograph? A chapter of a textbook? Graduate notes?

What is your audience? What kind of response are you looking for? Why?

I have no problem with just imagining an audience as a mode of focusing thought if that is what it is. Done it myself.

But if there is a type of response you are looking for why not give us a hint? If I can I will.

## Well I'm doing what I

)

Well I'm doing what I originally promised : 'as E@H is essentially a part of the LIGO and pulsar signal processing pipelines then I thought I'd explain some of the deeper detail about the mathematical basis of such things ( a sequence of mini-tutes, in fact )'. The audience is whoever wants to read it and the response, if any, is whatever people make of it. Or not. Their choice. I'm not after fame or fortune or testing a textbook chapter I'm writing on anyone. Nothing like that at all. Just for the pleasure of sharing really. And it's a rich vein of knowledge over many centuries that give us what we have today - older guys like me tend to think in such terms. If anything I'm attempting to fill a intellectual void that might be in the mind of some E@H contributors - and those who read the message boards are the more curious minded end of that set - about what those work units are actually doing on their machines. Plus that emphasises why their involvement is such a valuable contribution/gift to the project. Perhaps look upon this as a return to them, other than credits.

But you will note that I do appear to zoom in & out of detail and difficulty level of description, and that is in deference to those who may have done some later high school math but never went on further. So my approach may help to bring them at least some nodding acquaintance with what they would have sampled had they continued. I discovered a while ago that there are some very smart people in the world, way smarter than me ( a guy with four tertiary degrees ), who for better or worse haven't had a 'formal education' for whatever reasons. Maybe I'll meet some more of them by doing this !!

[ More specifically in my experience : dairy farmers. Don't assume that because they look at the back ends of the same bunch of cows twice a day, that they are stupid. The best farmers are terrific integrators of facts. They know what cows are thinking! Try sussing that out with quantum mechanics! Google the gag about 'spherical cows' .... :-) ]

[ I construct these snippets progressively during intervals available between my real/offline life activities. It's a pleasant distraction/contrast. Also I'm enthused, obviously, by this project and seek to diffuse that around. :-) :-) ]

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

## RE: [...] [ More

)

Just as in a baby's cry, there's much more to a moo than just "moo".

Quite an alternative thread for a very detailed subject. I hope the audience can get to see the 'physical/analytical' significance behind the mathematical hieroglyphs!

An interesting distraction/contrast.

Thanks,

Martin

See new freedom: Mageia Linux

Take a look for yourself: Linux Format

The Future is what We all make IT (GPLv3)