Paul's Nonsense Test

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0
Topic 194207

Well, I thought that I would take up that challenge.

Though with some caveats ... in essence I will return my systems to "production" mode with my normal load of projects (just about all of them) running on my 5 top systems allowing the systems to run as I "normally" do ...

Though I absolutely will not promise not to intervene, because contrary to Gary's belief I can prove (and already have proved) that there are significant issues with work fetch and scheduling (most specifically with non-CPU-intensive projects) and because the GPU portion of BOINC is still immature ...

Because of the time it takes for BOINC to stabilize in this kind of environment, and because my systems were heavily biased toward the PoM (Rosetta) and prior work on ABC, over the next week I will let the systems "settle in" and then on the 1st I will capture the start numbers ... From there I will gather numbers ...

The downside is that this is going to be more "messy" than more tailored tests, but in another sense it will be more "pure", in that it will reflect the other end of the spectrum, where a person might have a larger number of projects on their systems.

Other advantages will be that we can also look at more projects, and also the specific systems can be looked at with Willy's CSPS data as pointed out by a note on MW's boards...

Comments welcome ...

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5885
Credit: 119089507792
RAC: 23718441

Paul's Nonsense Test

Quote:
Well, I thought that I would take up that challenge.


Exactly what "challenge" are you referring to?

Quote:
... with my normal load of projects (just about all of them) ...


That's more than 50 projects, right??

Quote:
... contrary to Gary's belief I can prove (and already have proved) that there are significant issues with work fetch and scheduling ...


Just to be fair about this, my comments were very specifically directed at a situation where there were 4 reasonably reliable projects only with each one having exactly a 25% resource share. You are drawing a very long bow indeed by extrapolating those comments to apply to 50+ projects, some having dubious reliability and all having less than 2% resource share each. Unless you are prepared to average over about the next 10 years, you are setting BOINC a virtually impossible playing field.

Actually, I have a different challenge for you to think about. Select your 20 most important and reliable projects - your choice (but don't include "low availability" projects like LHC). Set up two comparable (but relatively modern - e.g. at least quad core) hosts with that set of projects - equal resource shares (5%) for each project. On one host, set the two preferences that control work fetch to 0.0 and 0.1 respectively and allow BOINC to make all decisions without user intervention. On the other, set the cache sizes however you wish and micro-manage the host to your heart's content. At the end of a minimum of 1 month (preferably more), do an accounting to see who has achieved the best "performance", you or BOINC, where performance is defined as "best honouring your stated resource shares while maximising the total credit awarded (granted + pending) and minimising the user intervention time required". I'm not really suggesting that you do the above (unless you want to), but I am interested to hear if you think you could significantly out-perform BOINC if you were to take up such a challenge :-).

In addition, we would have to agree on which particular version of BOINC would be used for the challenge. There seem to be quite significant differences in work fetch arrangements with different BOINC versions.

If you wish to continue the discussion, it might be best to start a new thread and I'll shift our messages there. I don't particularly want to hijack kenzieB's thread any more by continuing here.

Cheers,
Gary.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: RE: Well, I thought

Message 90617 in response to message 90616

Quote:
Quote:
Well, I thought that I would take up that challenge.

Exactly what "challenge" are you referring to?


Cross project earnings ... and your insistence that BOINC's, well, I can't use the Navy-ism, but it has to do with waste products and odors ... :)

Quote:
Quote:
... with my normal load of projects (just about all of them) ...

That's more than 50 projects, right??


Sure, but that is *MY* normal operational mode ... Why should we restrict our tests to confined and constrained systems that are not operating as the person uses them?

Just as a side note, I have also maintained that running systems in the wild in unconstrained ways undermines the science and I have been told repeatedly that I don't know what I am talking about ... NOW, all of a sudden, running systems in ways that are unconstrained is a huge problem ... :)

Quote:
Quote:
... contrary to Gary's belief I can prove (and already have proved) that there are significant issues with work fetch and scheduling ...

Just to be fair about this, my comments were very specifically directed at a situation where there were 4 reasonably reliable projects only with each one having exactly a 25% resource share. You are drawing a very long bow indeed by extrapolating those comments to apply to 50+ projects, some having dubious reliability and all having less than 2% resource share each. Unless you are prepared to average over about the next 10 years, you are setting BOINC a virtually impossible playing field.

Hmmm, well, BOINC is supposed to be able to manage multiple projects. There is no limit specified in the documentation. I find it interesting that I have never covered up that I support a large number of projects and from that experience have pointed to a number of situations where BOINC performs less well to the point at times of being non-functional.

When I make those comments I get told BOINC works wonderfully ... until situations like this where all of a sudden BOINC can't be trusted to manage more than 4 projects correctly or well ...

And, for the record, the flaws I have noted are related to low-reliability projects and multiple classes of projects (non-CPU-intensive, CUDA, ATI GPU, CPU, etc.). So, why would this not be a "fair" test?

I will grant that the resource shares ranging from 200 to 5 means that the share on the system ranges from 12.46% to 0.31% ... but, over a month that should average out if BOINC works as well as you say it does, and much less well if it works the way that I think it does...
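As a rough illustration of the arithmetic above, the sketch below converts a raw resource share into a percentage of the host. The total of 1605 is an assumption back-calculated from the quoted figures (200 divided by 12.46% is about 1605); the actual sum of all resource shares is not stated in the thread, and `share_percent` is just an illustrative name.

```python
# Convert BOINC-style resource shares into percentages of a host.
# TOTAL_SHARE is an assumption inferred from the figures quoted in the
# post (a share of 200 corresponding to 12.46%); the real total is not given.
TOTAL_SHARE = 1605


def share_percent(share, total=TOTAL_SHARE):
    """Return the percentage of the host that a resource share represents."""
    return 100.0 * share / total


for share in (200, 5):
    print(f"share {share:>3} -> {share_percent(share):.2f}%")
```

With that assumed total, the two extremes come out at roughly the 12.46% and 0.31% quoted above.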

And I will also note that I have been complaining/commenting about the credit earnings varying depending on the OS used, the version of BOINC used, etc. ... were BOINC properly designed these would not be issues.

Quote:
Actually, I have a different challenge for you to think about. Select your 20 most important and reliable projects - your choice (but don't include "low availability" projects like LHC). Set up two comparable (but relatively modern - e.g. at least quad core) hosts with that set of projects - equal resource shares (5%) for each project. On one host, set the two preferences that control work fetch to 0.0 and 0.1 respectively and allow BOINC to make all decisions without user intervention. On the other, set the cache sizes however you wish and micro-manage the host to your heart's content. At the end of a minimum of 1 month (preferably more), do an accounting to see who has achieved the best "performance", you or BOINC, where performance is defined as "best honouring your stated resource shares while maximising the total credit awarded (granted + pending) and minimising the user intervention time required". I'm not really suggesting that you do the above (unless you want to), but I am interested to hear if you think you could significantly out-perform BOINC if you were to take up such a challenge :-).


Sadly, with only 6 systems, none of them alike, I cannot do what you suggest. You can look in my account and see most of them right now ... Einstein is one of my "Choice" projects as it is in production and is doing serious science.

Quote:
In addition, we would have to agree on which particular version of BOINC would be used for the challenge. There seem to be quite significant differences in work fetch arrangements with different BOINC versions.

More problems not of my making ... But I am using 6.5.0 on all the Windows systems, as they are running CUDA or ATI GPU, and 5.10.45 on the Power Mac, because I cannot use the 6.x versions there: a number of projects STILL don't recognize the ID string (also a possible problem on XP 64-bit, as I noted that one project, Almere Test Grid, does not recognize 64-bit Windows either).

One point in favor of this methodology is that most projects that are issuing work will continue to issue work, and you will be free to ignore data you don't like ... :)

Though I am going to do this test, and am interested in the other tests, I also have confidence in the history that says that we are just wasting our time proving that hitting your thumb with a hammer hurts ... once again ... :)

Quote:
If you wish to continue the discussion, it might be best to start a new thread and I'll shift our messages there. I don't particularly want to hijack kenzieB's thread any more by continuing here.

Cool, I thought about that, but was soliciting comment to see if there were suggestions. There might be a couple posts below you will want to grab too ...

And I am not trying to pick a fight ... or to be difficult ... but ...

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

Move to here ...

Move to here ...

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3587984553
RAC: 992317

What? to where?

What? to where?

mikey
Joined: 22 Jan 05
Posts: 12866
Credit: 1884361078
RAC: 223379

RE: In addition, we would

Message 90620 in response to message 90616

Quote:
In addition, we would have to agree on which particular version of BOINC would be used for the challenge. There seem to be quite significant differences in work fetch arrangements with different BOINC versions.

Maybe this is why we long-term crunchers think self-managing BOINC is a good idea. Old dogs do not like to learn new tricks! And when BOINC has new problems with each new version, which is normal for programs in general, those of us who like micro-managing continue to do so. Personally, I have also found that Linux seems to be a better foundation than Windows, meaning that BOINC just seems to run for a long time without the decrease in credits over time that, in Windows, is often fixed with a reboot.

As for your test, I think over the very long term your hands-off approach may work better, but over the short term I think it could go either way. In some cases BOINC does a bad job and micro-managing is better; in other cases micro-managing can make things worse. I think in the end, though, if a person "feels" better micro-managing BOINC, then there are few, if any, arguments that can change that.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5885
Credit: 119089507792
RAC: 23718441

Paul, I've shifted the

Message 90621 in response to message 90617

Paul,

I've shifted the messages that were disturbing kenzieB's thread to this new one you have created.

KenzieB's thread was all about measuring credit and looking at cross project credit parity for just four specific projects. I was posting my favourable comments about BOINC in that thread because my personal experience tells me that BOINC does handle, in a fair manner, four equally resourced projects. If you wish to comment on BOINC's inability to handle 50+ projects, a topic that extremely few people would be qualified to discuss (myself included), please continue to debate the topic here to your heart's content.

Whilst I don't have the relevant experience to debate the 50+ projects matter with you, let me just pick out a couple of points you raise.

Quote:
I will grant that the resource shares ranging from 200 to 5 means that the share on the system ranges from 12.46% to 0.31% ... but, over a month that should average out if BOINC works as well as you say it does, and much less well if it works the way that I think it does...


Over a month, 0.31% translates to just over two hours. If a project with that resource share had 5 hour tasks with a 2 week deadline, it would be impossible to have an entirely "fair" (ie exactly as per resource share) distribution of time in just one month.
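As a rough check of that figure, here is a minimal sketch. It assumes a 30-day month and a single core running around the clock; both function names are made up for illustration and are not part of BOINC.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month, one core, always on


def hours_for_share(percent, hours=HOURS_PER_MONTH):
    """Hours of run time that a resource share (in percent) is entitled to."""
    return hours * percent / 100.0


def whole_tasks(percent, task_hours, hours=HOURS_PER_MONTH):
    """How many complete tasks of a given length fit into that entitlement."""
    return int(hours_for_share(percent, hours) // task_hours)


print(f"{hours_for_share(0.31):.2f} hours per month at 0.31%")
print(f"{whole_tasks(0.31, 5)} complete 5-hour tasks fit into that time")
```

Under those assumptions the entitlement is shorter than a single 5-hour task, so running even one task to completion already overshoots the stated share for the month.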

Quote:
And I will also note that I have been complaining/commenting about the credit earnings varying depending on the OS used, the version of BOINC used, etc. ... were BOINC properly designed these would not be issues.


You seem to be blaming BOINC for the way a science app might perform on different OSes. Isn't that a matter to take up with the Project Admins about why their apps are performing differently? How do you "design BOINC better" to make up for variability in the performance of the science app?

Quote:
And I am not trying to pick a fight ... or to be difficult ... but ...


Actually, that's exactly what you seem to be doing and I'm not going to get involved in that. I will simply wish you all the best in your adventures with BOINC and trust that as BOINC continues to evolve, it may improve in some of those areas where you find it lacking.

Cheers,
Gary.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: Paul, I've shifted the

Message 90622 in response to message 90621

Quote:

Paul,

I've shifted the messages that were disturbing kenzieB's thread to this new one you have created.


Thank You.

Quote:
KenzieB's thread was all about measuring credit and looking at cross project credit parity for just four specific projects. I was posting my favourable comments about BOINC in that thread because my personal experience tells me that BOINC does handle, in a fair manner, four equally resourced projects. If you wish to comment on BOINC's inability to handle 50+ projects, a topic that extremely few people would be qualified to discuss (myself included), please continue to debate the topic here to your heart's content.


And I agree that the test is a valid test. I just do not think that the test is as complete as it could be. KenzieB is doing a specific test, mine is more of a survey.

Quote:

Whilst I don't have the relevant experience to debate the 50+ projects matter with you, let me just pick out a couple of points you raise.

Quote:
I will grant that the resource shares ranging from 200 to 5 means that the share on the system ranges from 12.46% to 0.31% ... but, over a month that should average out if BOINC works as well as you say it does, and much less well if it works the way that I think it does...

Over a month, 0.31% translates to just over two hours. If a project with that resource share had 5 hour tasks with a 2 week deadline, it would be impossible to have an entirely "fair" (ie exactly as per resource share) distribution of time in just one month.


Since I run with a very small cache, the two-week deadlines rarely become an issue. I also run with a long switch time, so tasks that are started generally run to completion. I will note that even without all systems fully on-line, in the hours since I started the preliminaries 35 projects (out of 77 I track, and 62 that I have done work for) have registered a delta in earnings. The number you disregard is the number of cores: though there are only 6 computers, there is a total of 28 cores/CPUs.
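To put the core count into numbers, a small sketch follows. It assumes the 0.31% share is honoured evenly across all 28 cores for a 30-day month, which is an idealisation (not every project runs on every machine); `core_hours` is a hypothetical helper, not anything BOINC provides.

```python
def core_hours(percent, cores, days=30):
    """Core-hours a share (in percent) is entitled to across a whole farm."""
    return cores * days * 24 * percent / 100.0


budget = core_hours(0.31, cores=28)
print(f"{budget:.1f} core-hours per month at 0.31% of 28 cores")
print(f"room for {int(budget // 5)} complete 5-hour tasks")
```

Seen across the whole farm rather than one core, the smallest share leaves room for several complete tasks in a month, at least under the even-spread assumption.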

With two non-CPU-intensive projects and 2 more GPU projects, this means that 30-some CPU-class projects have already gotten their "fair" share in just 24 hours or so ... on at least one system.

Quote:
Quote:
And I will also note that I have been complaining/commenting about the credit earnings varying depending on the OS used, the version of BOINC used, etc. ... were BOINC properly designed these would not be issues.

You seem to be blaming BOINC for the way a science app might perform on different OSes. Isn't that a matter to take up with the Project Admins about why their apps are performing differently? How do you "design BOINC better" to make up for variability in the performance of the science app?


Many projects still use the original benchmark-based system, and this system has had a historical bias that ranked systems from Linux to OS-X to Windows as far as earning power goes. With more projects using other systems (fixed award, proportional award, etc.) this is less obvious. The problem is not in the science application; it is in the BOINC architecture.

Quote:
Quote:
And I am not trying to pick a fight ... or to be difficult ... but ...

Actually, that's exactly what you seem to be doing and I'm not going to get involved in that. I will simply wish you all the best in your adventures with BOINC and trust that as BOINC continues to evolve, it may improve in some of those areas where you find it lacking.


I figured you would think that, hence the comment.

BOINC has evolved, slowly and painfully, and I too hope that it will improve and live up to its promise. One of the problems that I see is that there are a number of different operating models and yet only one gets an official "blessing" and recognition. I for one don't understand the people that select one project and only run that one project. But, I know that is the way quite a few people operate BOINC and that you can kind-of bend BOINC to operate reasonably well in this mode ... I operate at a different end, the other extreme, and BOINC does not always work well at this end either.

You have operated BOINC in the happy middle and yes, it works reasonably well in this model. But it is not the only model. And that was the point I was trying to make.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

After 8 days of running I

After 8 days of running I have accumulated another 1,039,977 CS across 37 projects with 108,690 being earned on non-GPU type projects. This number should rise as several projects that have been completely quiet these last couple months have started to issue work randomly (Hydrogen).

Almost every project has had an outage at one time or another, and there was the day when Comcast decided to interrupt service. So I changed my queue from 0.1 days to 1.0 days as soon as I noticed. No machine ran dry. Cosmology and Ralph seem to be the only projects where the outage may be affecting the earnings more than expected.

The Milky Way GPU application has matured a little, and the 19d version has allowed some customization so that it will run more as you desire. I am not sure I have it under control yet, and the project scheduler seems reluctant to issue me work up to the available limits (for unknown reasons), so I run into short periods where work is not available on the one machine running the ATI GPU application. The other machines running MW are running the appropriate optimized application. Since others are doing more detailed tests of the earnings of MW (primarily as compared to SaH and Einstein), I never intended to get very far into that comparison.

The last couple of days QCN has stopped issuing me double tasks on the Mac Pro (not sure if that will last or not) so I no longer seem to have to abort tasks every now and again.

There have been task failures on Hydrogen, FreeHAL, Rosetta, and Aqua, plus a couple of tasks that seemed to have "hung", which I shot.

ABC seems to be the leader on my spreadsheet but that may be a lingering holdover from earlier pending tasks finally being awarded. Pending is now down to about 2K and change which should be close to a sustaining level.

Over the next week I am hoping that my get-up-and-go won't have gotten-up-and-went and I can create a spreadsheet to begin to capture the CS per S data from Willy's data.

Anyway, the rough data after 8 days shows (for selected projects),

Project    Raw CS    Share    CS / Share
ABC        11,047       50        220.94
QMC         1,673       10        167.30
CPDN       15,394      100        153.94
EaH        12,298      100        122.98
POEM        2,955       25        118.20
Rosetta     5,166       50        103.32
I can see that my simplistic analysis to this point does not adequately take into account earning over time so I will need to ponder that. The good news is that gathering the raw data is the most annoying part and I can fiddle with the data later.

It is obvious that some of the projects are penalized because of the intermittency of their work, and others (QCN) because resource share is meaningless with regard to their execution.
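For anyone who wants to reproduce the CS / Share column from the table above, a minimal sketch follows. The per-day figure is only one possible way of bringing time into the comparison and is an assumption of this sketch, not the analysis used in the post; the function names are illustrative.

```python
# Raw credit (CS) and resource share per project, copied from the table above.
RESULTS = {
    "ABC": (11047, 50),
    "QMC": (1673, 10),
    "CPDN": (15394, 100),
    "EaH": (12298, 100),
    "POEM": (2955, 25),
    "Rosetta": (5166, 50),
}
DAYS = 8  # length of the run reported in the post


def credit_per_share(raw_cs, share):
    """Credit earned per unit of resource share."""
    return raw_cs / share


def credit_per_share_per_day(raw_cs, share, days=DAYS):
    """Same ratio normalised by elapsed days (an assumed refinement)."""
    return credit_per_share(raw_cs, share) / days


# Rank projects from highest to lowest credit per unit of share.
ranked = sorted(RESULTS, key=lambda name: credit_per_share(*RESULTS[name]), reverse=True)

for name in ranked:
    raw_cs, share = RESULTS[name]
    print(f"{name:<8} {credit_per_share(raw_cs, share):8.2f} "
          f"{credit_per_share_per_day(raw_cs, share):7.2f}")
```

Dividing by days does not change the ranking within a single run, but it would let runs of different lengths be compared once more weeks of data are in.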

mikey
Joined: 22 Jan 05
Posts: 12866
Credit: 1884361078
RAC: 223379

RE: After 8 days of running

Message 90624 in response to message 90623

Quote:

After 8 days of running I have accumulated another 1,039,977 CS across 37 projects with 108,690 being earned on non-GPU type projects. This number should rise as several projects that have been completely quiet these last couple months have started to issue work randomly (Hydrogen).

Anyway, the rough data after 8 days shows (for selected projects),

Project    Raw CS    Share    CS / Share
ABC        11,047       50        220.94
QMC         1,673       10        167.30
CPDN       15,394      100        153.94
EaH        12,298      100        122.98
POEM        2,955       25        118.20
Rosetta     5,166       50        103.32

I can see that my simplistic analysis to this point does not adequately take into account earning over time so I will need to ponder that. The good news is that gathering the raw data is the most annoying part and I can fiddle with the data later.

It is obvious that some of the projects are penalized because of the intermittency of their work, and others (QCN) because resource share is meaningless with regard to their execution.

Another thing you are doing is figuring out who gives the most CS/Share. Right now, from your chart, ABC is way ahead.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: Another thing you are

Message 90625 in response to message 90624

Quote:
Another thing you are doing is figuring out who gives the most CS/Share. Right now, from your chart, ABC is way ahead.


It may be "uglier" than that ...

Though I did try to run down the pending, there was not much I could do about it except delay the test another month. Another part of the problem is that not all projects run on all my machines (or I don't run them on a given machine) ...

For example, ABC runs on the Mac Pro, but QMC does not. Docking runs on all systems, but is not issuing work for the Mac Pro at this time (or I never get any for other reasons)...

Anyway, if ABC falls, then it was the lingering pending that inflated the numbers. We should know by next week if this was the case, as this week's earnings were 11K, and if they are not in that neighborhood next week ...

Anyway, I could not sleep, so I have been playing with my spreadsheet and trying to work out the shares vs. computers vs. cores and how to make something make sense that way ...

I suspect that the more telling number will be the CS per second number from Willy ...
