15 Jan 2009 3:32:02 UTC

Topic 194140


My Windows XP host 1226365 has completed three S5R5 tasks, and is still showing a good selection of S5R4 v6.10 work. More S5R5 will be available for scrutiny in the morning. I'll try and write up some logs tomorrow, but you can start looking at the raw data now (and maybe save me a job!).

Edit - don't worry about the compute errors. A data file download went bad (I think I suspended networking while the transfer was still active), and I had to delete the file and let it download again. No computation problems, nothing that would affect task timing.

Cheers,

Gary.


## S5R5 Performance Analysis


Bikeman's rule says you have tasks right at a cycle peak: 121 skypoints, so a peak at about 1210, and you have sequence numbers 1213 down to 1207 so far. Let us hope the server deals you most of the next sixty tasks below 1207, so we can see how low "low" really goes and get a confirmation of the shape. If you get most of them, we might even get to see the current form of the wiggles.

## Here is a thread specifically


Here is a thread specifically for those who want to discuss peaks and troughs, wiggles and waggles, and any other strange phenomena reputed to contribute to the cyclic nature of crunch times. This discussion is specifically for the new R5 tasks which are predicted to have larger variations than was the case for R4.

The usual suspects and culprits are invited to attend :-)

Cheers,

Gary.

## Well, I have only gotten a


Well, I have only gotten a couple of R5 tasks and only one completed at this time (EaH is down the priority list for a month or so, sigh) but am waiting for the wingman to report ...

The time to process looked about the same to me ... but, then again, I don't look all that hard ...

Mid February I should be ramping back up on other projects and EaH should start to get more attention then ...

## Another suspect reporting in


Another suspect reporting in for duty.

Host 252515 has tasks from h1_0082.65_S5R4__11_S5R5a_1 down to sequence number 7, after which the frequency changes to 0082.70.

h1_0082.65_S5R4__11_S5R5a_1 took 16435.98s; skypoints = 112, so Bikeman's formula gives =~0.1, therefore just about max.

h1_0082.65_S5R4__10_S5R5a_1 took 13629.25s, a steep dive of about 2800s.

## RE: Another suspect


Not quite... it's the number of skypartitions, not skypoints, that has to be looked up.

Just to keep everything in one thread, here's the formula again:

BTW, as the command line arguments to the app are now printed into the debugging output of the results, it's much easier to check after a WU has finished whether its runtime is near the expected minimum or maximum.

Look at the output in the result, and find the argument:

--numSkyPartitions=xxx

e.g. "--numSkyPartitions=339"

Now look up the sequence number in the name of the result, following the double underscore; e.g. for WU h1_0709.40_S5R4__677_S5R5a that number would be 677.

Now divide the second number by the first, so here:

677 / 339 ≈ 1.997

If the fractional part of that quotient is close to 0 or 1, you are near a runtime maximum. If it's close to 0.5, you are near a runtime minimum.

This will help to put the first runtime results into perspective a bit.
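
The check described above can be sketched in a few lines of Python (not the Octave used elsewhere in this thread). The helper names and the 0.125 / 0.375 cut-offs used to label a phase are illustrative assumptions, not part of any project tool.

```python
import math
import re


def sequence_number(result_name: str) -> int:
    """Return the number that follows the double underscore in a result name."""
    match = re.search(r"__(\d+)_", result_name)
    if match is None:
        raise ValueError(f"no sequence number found in {result_name!r}")
    return int(match.group(1))


def sky_partitions(app_output: str) -> int:
    """Return the value of --numSkyPartitions=... found in the app output."""
    match = re.search(r"--numSkyPartitions=(\d+)", app_output)
    if match is None:
        raise ValueError("no --numSkyPartitions argument found")
    return int(match.group(1))


def phase(seq: int, partitions: int) -> float:
    """Fractional part of seq / partitions."""
    quotient = seq / partitions
    return quotient - math.floor(quotient)


def describe(p: float) -> str:
    """Label a phase; the cut-offs are arbitrary choices for illustration."""
    distance_from_min = abs(p - 0.5)
    if distance_from_min > 0.375:
        return "near maximum"
    if distance_from_min < 0.125:
        return "near minimum"
    return "in between"


seq = sequence_number("h1_0709.40_S5R4__677_S5R5a")
parts = sky_partitions("--numSkyPartitions=339")
print(seq, parts, round(phase(seq, parts), 3), describe(phase(seq, parts)))
```

For the 677 / 339 example the fractional part is very close to 1, so the sketch labels it as near a runtime maximum.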

For the WU in question, the number of partitions is 6, so 11/6 ~ 1.8 and much closer to the minimum. Because of symmetry, there should only be 3 different classes of WU (runtime-wise) for this frequency range, so a complete sequence would look like this:

slow->not-so-slow->fastest->fastest->not-so-slow->slow->slow,....

CU

Bikeman

## RE: Not quite... it's the


Oops. mea maxima culpa

Sorry about that.

Edit: looking again at the host of Richard Haselgrove mentioned before:

`numSkyPartitions=607`

So the second peak should be at sequence number 1214.

He actually has in hand 1213 down to 1207, so one could hope for a rather steep downhill.

The three actual results to date don't show that, so either there is a bit of noise in the data, error in the peak location, or I've once again mistaken it.
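
The arithmetic behind that peak estimate, as a small Python sketch. It assumes only what the formula states, namely that maxima fall where the sequence number is a whole multiple of `numSkyPartitions`; the function name `nearest_peak` is made up for illustration.

```python
def nearest_peak(seq: int, partitions: int) -> int:
    """Sequence number of the predicted runtime maximum closest to seq.

    Uses integer arithmetic to round seq / partitions to the nearest whole
    number, then scales back up to a sequence number.
    """
    return (2 * seq + partitions) // (2 * partitions) * partitions


partitions = 607
for seq in range(1213, 1206, -1):
    peak = nearest_peak(seq, partitions)
    # Print the task, the nearest predicted peak, and how far below it the task sits.
    print(seq, peak, peak - seq)
```

Every task from 1213 down to 1207 sits within seven sequence numbers of the predicted peak, which is a small fraction of the 607-task period.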

## Whoops, sorry about that,


Whoops, sorry about that, I'll use the excuse it was before 1st coffee had been fully drunk.

So correction should read:

skypartitions = 6 for Bikeman formula,

h1_0082.65_S5R4__11_S5R5a_1 took 16,435.98s, Bikeman formula = 11 / 6 = 1.833,

h1_0082.65_S5R4__10_S5R5a_1 took 13,629.25s, 10 / 6 = 1.67

h1_0082.65_S5R4__9_S5R5a_1 took 11,918s, 9 / 6 = 1.5 (min?)
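
To see whether these three results follow the expected shape, one can tabulate the fractional part next to the runtime. The Python sketch below uses only the three runtimes quoted in this post; the variable names and the monotonicity check are illustrative assumptions.

```python
import math

PARTITIONS = 6
# (sequence number, runtime in seconds) for the three tasks quoted above
results = [(11, 16435.98), (10, 13629.25), (9, 11918.0)]


def phase(seq: int, partitions: int) -> float:
    """Fractional part of seq / partitions."""
    quotient = seq / partitions
    return quotient - math.floor(quotient)


rows = []
for seq, runtime in results:
    p = phase(seq, PARTITIONS)
    rows.append((seq, p, abs(p - 0.5), runtime))

# Sort by distance from the predicted minimum at phase 0.5.
rows.sort(key=lambda row: row[2])

# If the rule holds, runtime should not decrease as that distance grows.
runtimes = [row[3] for row in rows]
shape_ok = all(a <= b for a, b in zip(runtimes, runtimes[1:]))

for seq, p, dist, runtime in rows:
    print(f"seq={seq:2d} phase={p:.3f} |phase-0.5|={dist:.3f} runtime={runtime:.2f}s")
print("runtime grows with distance from 0.5:", shape_ok)
```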

## So that doesn't look too bad,


So that doesn't look too bad, does it, as far as predicting runtime is concerned.

Next question is how well the awarded credits (which should be proportional to claimed credits) fit in there. I've got a feeling that the credits overcompensate for the runtime variations, but more data is needed to re-calibrate them.

The problem here is that it's a "one-formula-fits-all" approach:

* must fit all kinds of CPUs (AMD X2, Intel PIII, i7, P4 ....) because credits are fixed for a WU independent of the crunching host, so there's nothing that can be done about it

* must fit all kinds of frequency ranges (to keep it simple, I assumed that the relative variation in runtime over all skypoints is the same for all frequency ranges)

CU

Bikeman

## Ok, here's the first diagram.


Ok, here's the first diagram. Crude but should be useful.

After some searching around I found a host that has crunched some S5R5 WUs with a wider range of sequence numbers and therefore significant runtime variation.

It's this one: host 1794053, a Core2 Xeon

Crunching WU near 100 Hz under Linux.

If p = fraction(sequenceNo / period)

and you transform p' = (2p - 1)^2, then a plot of runtime vs p' should give a more or less straight line, as runtime, at least in the last run, was roughly a linear function of (2p - 1)^2.

So you can do linear regression on the points you get this way and predict the minimum and maximum runtime. I used octave (a free Matlab clone) to plot the actual runtime measurements (circles) and perform the linear regression (solid line). As you can see the fit is not too bad.

This predicts a minimum runtime of ca 11712 seconds and a maximum of 18113 seconds, so a variation by a factor of 1.55, much higher than what we saw in S5R4, as predicted.
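
For readers without Octave, the same linearise-then-fit idea can be sketched in plain Python. The data points below are synthetic: they are generated from the straight line through the two predictions quoted above (11712 s and 18113 s), so they are not measurements from host 1794053. The names `fit_line` and `predict_min_max` are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_min_max(phases, runtimes):
    """Fit runtime against p' = (2p - 1)^2 and evaluate the line at p' = 0 and 1."""
    xs = [(2 * p - 1) ** 2 for p in phases]
    slope, intercept = fit_line(xs, runtimes)
    return intercept, slope + intercept


# Synthetic example data (assumption, see lead-in): points lying exactly on the
# line from 11712 s at p' = 0 to 18113 s at p' = 1.
LO, HI = 11712.0, 18113.0
phases = [0.05, 0.2, 0.35, 0.5, 0.7, 0.9]
runtimes = [LO + (HI - LO) * (2 * p - 1) ** 2 for p in phases]

lo, hi = predict_min_max(phases, runtimes)
print(round(lo), round(hi), round(hi / lo, 2))
```

With real measurements the points scatter around the line, and the two evaluated ends of the fit are the estimates of minimum and maximum runtime.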

You'll get much nicer plots with Mike's Ready Reckoner, using a constant of 0.00067195 where 0.000291 was used before (but I still have to adapt the awk and shell scripts to extract the data automatically into the format used by the RR).

CU

Bikeman

EDIT: Here's the octave/matlab code for those interested, input file format should be compatible with Mike's RR. Sorry, the web forum software strips all the indentation from the source :-(. Also I'm a matlab beginner so this might not be the most elegant way to do it, e.g. plot should be optional etc. :-)

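    # Estimate minimum and maximum runtime for one host.
    # The function header below is reconstructed: it was missing from the post,
    # but the closing "endfunction" and the use of filename/res imply it, and
    # the function is referred to as runtime_estimate in this thread.
    function res = runtime_estimate (filename)
    # load data for host; columns are frequency, sequence number, runtime
    Tstat = load("-ascii",filename);
    # compute phase for each datapoint
    P=WUph(Tstat(:,1)',Tstat(:,2)');
    # do linear regression on linearized runtime statistic
    P_lin = (2 * P - 1).^2;
    hold off;
    plot(P_lin,Tstat(:,3)',"o");
    hold on;
    fit = polyfit(P_lin,Tstat(:,3)',1);
    plot(0:1,polyval(fit,0:1));
    # evaluate guess for min and max
    lo = polyval(fit,0);
    hi = polyval(fit,1);
    res(1)=lo;
    res(2) =hi;
    endfunction

    # phase = fractional part of (sequence number / period)
    function phase = WUph (freq, num)
    p = WUp (freq);
    r = num ./ p;
    # ".-" is not MATLAB syntax; plain "-" is already element-wise
    phase = r - floor (r);
    endfunction

    # period (number of sky partitions) as a function of frequency
    function period = WUp (freq)
    period = ceil (0.00067195 * (ceil ((0.11501 + freq * (1. + 1.05e-4)) ./ 10) .* 10) .^ 2);
    endfunction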
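
A line-by-line Python port of `WUp` and `WUph`, offered as a sketch for anyone who wants to check the period formula without Octave. The Python names `period` and `phase` are chosen here for illustration; the constants are copied from the code above. As a sanity check, the formula gives 6 for the 82.65 Hz tasks and 339 for the 709.40 Hz example discussed earlier in the thread.

```python
import math


def period(freq: float) -> int:
    """Port of WUp: number of sky partitions for a task frequency in Hz."""
    # Round the (slightly shifted) frequency up to the next multiple of 10 Hz.
    band = math.ceil((0.11501 + freq * (1.0 + 1.05e-4)) / 10) * 10
    return math.ceil(0.00067195 * band ** 2)


def phase(freq: float, seq: int) -> float:
    """Port of WUph: fractional part of seq / period(freq)."""
    r = seq / period(freq)
    return r - math.floor(r)


for freq, seq in [(82.65, 11), (82.65, 9), (709.40, 677)]:
    print(freq, seq, period(freq), round(phase(freq, seq), 3))
```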

## RE: Ok, here's the first


That's a pretty good fit. You're a good bloke, Bikeman! :-)

Since I actually have Matlab (I'm also a beginner with it), I'll give this analysis a whirl on it. This gives me ideas ..... I'll get back.

Cheers, Mike.

( edit ) Ah, there are implicit foreach's occurring in the runtime_estimate function. Everything is a matrix in MATLAB, doh! A bit hard to get used to the variables being typed by context, rather than explicit declaration. Sort of Perl-ish .....

( edit ) By experiment I have determined that MATLAB will load a comma separated list of numbers just the same as a space separated list of numbers, without any programmatic need to distinguish - it'll just pick up & parse either format. Each file line becomes a row in the matrix that you assign to. Also from MATLAB help:

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal