Here are the latest results for Richard's machine.
I said I'd only send the ones that hadn't been listed previously, but I decided to send everything that has been crunched with 4.32. I've left out the mixed ones and all the tasks crunched with 4.26.
Just want to mention that it's nice and all to be refining the calculations, but you may want to consider having examples in there, like the earlier versions had, so as to make this a tool for all users, not just people who already know what to do. The input via CSV is great, but if I were to refer someone to your page, I'd first have to provide a tutorial on what the frequency is, what the task number is, etc, etc, etc...
Guess it depends on who you view as your primary audience...
Err ... well, you'd use whichever version takes your fancy, I guess; that's why I wrote them each as standalone versions. :-)
Cheers, Mike.
Well, just making the suggestion, since there are quite a few people who come along on the message boards who don't know about the cyclic nature and get all bent out of shape over the difference in cr/sec... Keeping track of which version is "newbie-friendly", as well as whether it is accurate, can be difficult for someone who is not intimately involved.
If I were doing this, I'd give a front-end "welcome" page that gave a choice between "standard" and "advanced", and let the user pick. From there you can make sure that the versions available are up to date. Direct URL access would also be available, so you could keep on posting direct links to refined pages as they are developed and people could bookmark them or whatever...
Alternatively, you can say I'm crazy and keep going as you have been. It won't be the first time, nor the last, that someone thinks I'm nuts... :-)
Please keep suggesting! :-)
You're right in that I'm not sure who the primary audience is, or what use may come of this RR thing. I'm just dicking around really and following the kind of directions that people suggest here.
Now that you mention it: a combo of frames and pop-ups might do the trick here for the approach you mention -> in the pipe for RR8! :-)
It'll be a committee camel yet ..... :-)
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Well, that's a nice, well-behaved data set! The distinct RR6 and RR7 prediction algorithms differ by only 1 second at peak and 47 seconds at trough for runtimes.
Hmmm .... this suggests a new direction of analysis, a mythical/reference/fiducial host. For hosts at the same frequency and app version ( say quorum or farm buddies ) one could scale appropriately - effectively making peak and trough times identical - and view the two ( or more ) hosts within the same plot. The CSV input format can expand to a possible fifth column of hostID or somesuch, and I could buttonise an aggregating functionality ( MERGE/SPLIT ). You wouldn't quote runtimes on the left end of those peak, average and trough horizontal lines, but you'd keep the sequence number markings for the extremes as is. Why do this? To conveniently amalgamate in the one plot behaviours from a group of machines and hence more easily visualise/analyse/summarise performance changes with app version upgrades. Is there no bottom to this rabbit hole? :-)
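The scaling idea above can be sketched quickly. The function and the two example hosts here are purely hypothetical illustrations, not data from the thread: each host's runtimes are mapped so that its trough lands at 0.0 and its peak at 1.0, which is one simple way of "effectively making peak and trough times identical" so that several hosts share one plot.

```python
# Sketch of the "fiducial host" scaling idea: map each host's runtimes onto a
# common scale so peak and trough coincide, letting several hosts share a plot.
# The hosts and numbers below are hypothetical, for illustration only.

def normalise(runtime, trough, peak):
    """Scale a runtime so that trough -> 0.0 and peak -> 1.0 for that host."""
    return (runtime - trough) / (peak - trough)

# Two hypothetical hosts at the same frequency and app version:
host_a = {"trough": 30000.0, "peak": 38000.0}   # runtimes in seconds
host_b = {"trough": 21000.0, "peak": 26500.0}

# A runtime halfway up host A's cycle and halfway up host B's cycle
# land on the same point of the merged plot:
a_mid = normalise(34000.0, **host_a)
b_mid = normalise(23750.0, **host_b)
```

With this in place, the MERGE view would plot normalised runtimes against sequence number, and the SPLIT view would undo the scaling per host.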
Come on Gary, I just know you'd want that! :-)
Cheers, Mike.
Mike, for various reasons that I'll explain later, if you think it worthwhile, make these two changes and rerun the RH data and see if you like the better fit in the higher cycles.
1. Change the constant 0.000206 to 0.0002042
2. Change the frequency in the data from 464.35 to 470.45
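Just the arithmetic behind those two changes, assuming the cycle model period = const × freq² (in sequence numbers) used elsewhere in the thread — presumably the slightly longer modelled cycle is what improves the fit in the higher cycles:

```python
# Effect of the two suggested changes on the modelled cycle period,
# assuming period = const * freq^2 as used in the thread.

def period(const, freq):
    return const * freq ** 2

before = period(0.000206, 464.35)    # original constant and frequency
after  = period(0.0002042, 470.45)   # suggested constant and frequency
# "before" comes out around 44.4 sequence numbers and "after" around 45.2,
# so the suggested values stretch the modelled cycle slightly.
```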
I've done that and to me it looks a lot better. What do you think?
While you are trying it out, I'll work on the explanation :).
Second, yes it does look better! Do tell ....
Mike, I'm sorry I've been a bit busy this weekend so it's taken me a while to get back to this.
If you review Bikeman's initial post in the big thread, he came up with these points, which I'll summarise:
* SkyGrid size is probably a crucial factor influencing the period of the cycle
* You get the GridSize by counting lines in the SkyGrid file
* The relation should be GridSize = Freq^2 * const, where the Bikeman constant (BC) = 0.2453
* Cycle period (P) = GridSize / SkyPoints
* SkyPoints is ~1200 but in any case is always shown in stderr.out
* Substituting for GridSize gives P = Freq^2 * BC / SkyPoints
* So final equation is Period = Freq^2 * 0.0002044
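The chain of relations in those bullet points can be written out as one short sketch (the helper names here are mine, not from the thread):

```python
# Bikeman's chain of relations from the bullet points above.
# BC is the Bikeman constant and 1200 the typical SkyPoints value
# (the actual value is always reported in stderr.out).

BC = 0.2453          # GridSize = BC * Freq^2
SKYPOINTS = 1200

def grid_size(freq):
    return BC * freq ** 2

def cycle_period(freq, skypoints=SKYPOINTS):
    # P = GridSize / SkyPoints = Freq^2 * BC / SkyPoints
    return grid_size(freq) / skypoints

# Collapsing BC / SkyPoints gives the combined constant of ~0.0002044:
combined = BC / SKYPOINTS
```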
Bikeman illustrated this by giving data for 3 closely spaced SkyGrid frequencies of 700, 740, 760.
Two posts later, I followed up with an expanded set of GridSizes
which could also be used to verify that 0.0002044 was approximately correct; the expanded values put the constant at about 0.0002044 to 0.0002046.
There was something that didn't quite feel right about the variations I was seeing, so a few posts later I followed up again with some further thoughts about the whole deal. I don't think I was very clear, so I'm going to ask you to first read that post again and then consider the following.
* SkyGrid files cover a 10Hz frequency range which goes slightly above the frequency listed in the filename.
* We don't know exactly what frequency in that range should be used to calculate the cycle period.
* Why don't we play with frequencies throughout the range and see what gives the best fit?
* Surprisingly enough, the top of the range frequency (almost) gives a quite amazing fit over a wide range of frequencies.
We can invert the formula for period to give
Const = Period / Freq^2
which leads to
Const = GridSize / SkyPoints / Freq^2
If we make the assumption that Freq is (almost) the top of the range implied by the SkyGrid filename (i.e. nnn.433, where nnn is the value in the filename), we get an extremely consistent value of the archae86 constant over a very wide range of frequencies. Please consider the following:
I think you'll agree that for the wide frequency range from 380 to 930, the use of the frequency nnn.433 gives a remarkably consistent value of the archae86 constant. I came up with the .433 bit by theorising that the range started and ended at .45 and then experimented with frequencies at the bottom, midpoint, and top of the range. The bottom and midpoint seemed not to work all that well compared to the top which worked very well. Refining the top value ever so slightly gave me an absolutely superb fit if I used .433 to 3 decimal places.
I have no theoretical basis for any of the above. As has been pointed out by others, the frequency in the data filename which goes in .05 increments is quite different to the -freq flag value fed to the program on startup. So who knows what's really going on.
Observationally, however, the use of nnn.45, or better still nnn.433, together with an archae86 constant of 0.0002042, or better still 0.00020417, does seem to give a much cleaner fit of the model line to the data points.
I hadn't looked before, but I seem to be sharing this dataset with host 997488 (Peanut's 8-core X5365 Darwin @ 3.00GHz), and host 1094135 (an anonymous 8-core X5355 Linux @ 2.66GHz). No wonder I can't keep up! But it's a nice little cross-platform comparison we've got running here - Gary, you might like to add those to your monitoring list.
Richard, Sorry for the delayed reply but I've been a bit busy.
I think a lot of people might be interested in the matchup between your machine and peanut's. Mike Hewson has kindly allowed me to use some of his webspace and since these files aren't really that big, I think I'll put a daily update of the two hosts - yours (1001562) and peanut's (997488) on the website with a link here. The link should stay the same - all I'll do is replace the files approximately each day. The files will keep growing as results are added. I'll put a note in this thread each time I update the files.
Fresh results have just been put on the website now. These results are from RH's machine and these are from peanut's.
As this is the first time I've actually done this, please let me know if I've stuffed anything up.
No stuff-up, the files are visible just fine.
And in fact they explain something which I'd been too lazy to follow up. I've started getting 'pendings' on my reports, which seemed odd, as Peanut's machine is so much faster than mine. Now I've seen that he's been moved off onto slightly higher frequencies. So, with Peanut out of the way, I seem to have control of the 909.15 dataset, and I'm getting a consecutive run of sequence numbers.
There must be a "sweet spot" speed for requesting new work. Too slow, and other people nip in and steal your sequence numbers. Too fast, and the work generator can't keep up, and switches you to another band. My 2.4GHz stock quaddie seems to be just right - I'll have to rechristen her Goldilocks.
Gary, this is very interesting. I have two comments initially:
1. As the Skygrid file gets used over a 10 Hz range, and is labelled for something near the top of that range, and you are adding .433 to that label, there is a built-in reason for the resulting value of the "archae86 constant" to be a bit lower than one based on observation of the cycle behavior of specific results. When I did some refit checking using the latter approach with a couple of relatively high-frequency data series, I felt the fit was definitely for a slightly higher value than the one you find, but those happened to be frequencies somewhat low in their 10 Hz rung of the ladder, so that may explain the (slight) discrepancy between my most recent .0002055 and your .00020417.
2. As you've focused on the skygrid relation, you've not mentioned the 10 Hz step function. When I first read your post, I thought you might have found that the true transition point is .433 Hz above the labeled frequency, but you actually seem to be finding that as a relationship fit.
Assuming this is all right, the implied revision seems to be to use a 10-Hz step quantization of the period estimate. There does, so far as I know, remain a need to establish just where that step should be. From someone's (you? Bikeman?) previous post I vaguely recall that the step to use of the next higher skygrid file is about .5 Hz above its label. If it is actually .433, that would unify things even more.
so instead of:
period = .000206*frequency^2
we should use:
period = .00020417*stepped_frequency^2
We still don't know how to determine stepped_frequency, but expect it to climb in steps at 10-Hz intervals "about1" .5 Hz above the more obvious 10-Hz integer multiple points, and to take the numeric value of "about2" .433 Hz above the round number top end of that 10-Hz interval which is in the actual skygrid file.
With my usual desire to back theory with observation, I'd like to backcheck this against actual cyclic behavior from my archives if we can get a specific function agreed for review. It seems to me that you have been specific indeed about everything save "about1".
So possibly, for an Excel formula representation, and for freq meaning the value obtained from the WU name:
period = .00020417*(ceiling(freq - .5,10) + .433)^2
(where, for the sake of specificity, I used about1 = .5 and about2 = .433)
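For checking outside Excel, that formula transcribes directly; `math.ceil(x / 10) * 10` plays the role of Excel's CEILING(x, 10), i.e. rounding up to the next multiple of 10. The function name is mine, for illustration:

```python
import math

# Transcription of the Excel formula above:
#   period = .00020417 * (CEILING(freq - .5, 10) + .433)^2
# where CEILING(x, 10) rounds x up to the next multiple of 10.

def period(freq):
    stepped = math.ceil((freq - 0.5) / 10) * 10 + 0.433
    return 0.00020417 * stepped ** 2
```

Running this at 900 and 901 Hz reproduces the ~165.5 to ~169.2 jump across the step that archae86 quotes just below.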
For those who might be tempted to think the 10-Hz step thing, if real, is modest enough to ignore, I'll point out that for a frequency going from 900 to 901 Hz, this implies a "jump" in cycle length from 165.5 to 169.2. For sequence numbers out near the third peak, this is a peak-point shift of almost twelve sequence numbers.
At the current time, I think improving this cycle estimate has more impact than the potential change from the sin to the quadratic function for the wave shape, and may best be prioritized higher. I would, of course, like to convert to the quadratic function if appropriate observation supports Bikeman's algorithmic understanding that it "ought" to be the proper function form (whereas the sin thing was just a guessed-at generating function to approximate the observed shape).
Gary, Bikeman, am I comprehending what you are saying? More specifically, does anyone know from any source the "correct" value of about1? For frequencies not very near the boundary it won't matter, but right at the boundary for high-sequence number Work Units, it makes a big difference.
Before I invest time in backchecking this possible revision, I'd like comments on whether you think it probably the right way forward.
Might be worth keeping an eye on Peanut's host. Since the list Gary posted this morning (tasks up to 0909.30), I see he's been issued work at 0909.45 and 0909.50: with any luck, he'll carry on and step right over any 0910.xx transition point.
In order to fine-tune the period estimate, it would be possible to actually check the period length by visualizing the result, if you are lucky enough to have results with sequence numbers that are multiples of the supposed period length. I've got a database of a couple of hundred results; I'll check if I have any of those candidates.
CU
Bikeman
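Bikeman's candidate search above can be sketched as a simple filter: flag results whose sequence number sits close to an integer multiple of the predicted period. The period model (const × freq², with the thread's working constant) and the tolerance are assumptions for illustration, not established values:

```python
# Flag results whose sequence number is close to an integer multiple of the
# predicted cycle period, as candidates for checking the period length directly.
# The constant and tolerance are the thread's working assumptions.

def predicted_period(freq, const=0.00020417):
    return const * freq ** 2

def is_period_multiple(freq, seq, tol=0.05):
    cycles = seq / predicted_period(freq)
    return abs(cycles - round(cycles)) < tol

# e.g. at 900 Hz the predicted period is ~165.5, so a result with sequence
# number 331 sits almost exactly two cycles in: a good candidate.
```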