new BOINC recommended version 4.45

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9352143
RAC: 0

RE: I couldn't edit after

Message 12313 in response to message 12309

Quote:

I couldn't edit after an hour, but I wanted to add another plus.

The ability to select "no more work" for a project is nice too. I have two machines that are pretty slow (300MHz K6 and a 650MHz PIII laptop). Einstein causes those machines to go nuts since a workunit takes about half of the time until its deadline. So, I don't normally run Einstein on those machines, but when all of the projects are down, I can just unclick "no more work" and it downloads an Einstein WU; I then tell it "no more work" and all is well for about 3 days. Much easier than detaching and reattaching. Good if you are going to retire a machine (the K6 goes when UPS delivers the new Dell, and the old machines get repurposed).

So you're the other cruncher running a K6/300! Funny thing here is I'm running NT Server 6a on mine as well. What are the odds on that! :-)

Only difference is I don't have any plans to retire him until he suffers a motherboard burnout, since I have him doing a number of backend chores for me and he performs well in this role.

Out of curiosity, I'm assuming you've run SAH as well as PAH on it, so what are your average run times like? I only run SAH on mine and couldn't compare since you have your SETI boxes hidden, although I can see it looks like I could run PAH as well if I drop back to a short CI.

One thing though, once you retire yours I may have to consider putting my other 300 back online so they don't drop off Willy's stats. ;-)

Alinator

kami4ligo
Joined: 15 Mar 05
Posts: 48
Credit: 16105651
RAC: 0

RE: RE: I couldn't edit

Message 12314 in response to message 12313

Quote:
Quote:

I couldn't edit after an hour, but I wanted to add another plus.

The ability to select "no more work" for a project is nice too. I have two machines that are pretty slow (300MHz K6 and a 650MHz PIII laptop). Einstein causes those machines to go nuts since a workunit takes about half of the time until its deadline. So, I don't normally run Einstein on those machines, but when all of the projects are down, I can just unclick "no more work" and it downloads an Einstein WU; I then tell it "no more work" and all is well for about 3 days. Much easier than detaching and reattaching. Good if you are going to retire a machine (the K6 goes when UPS delivers the new Dell, and the old machines get repurposed).

So you're the other cruncher running a K6/300! Funny thing here is I'm running NT Server 6a on mine as well. What are the odds on that! :-)

Only difference is I don't have any plans to retire him until he suffers a motherboard burnout, since I have him doing a number of backend chores for me and he performs well in this role.

Out of curiosity, I'm assuming you've run SAH as well as PAH on it, so what are your average run times like? I only run SAH on mine and couldn't compare since you have your SETI boxes hidden, although I can see it looks like I could run PAH as well if I drop back to a short CI.

One thing though, once you retire yours I may have to consider putting my other 300 back online so they don't drop off Willy's stats. ;-)

Alinator

I could bring in my K6/450 ;)

-rg-

Bill Hepburn
Joined: 16 Feb 05
Posts: 7
Credit: 77524752
RAC: 0

RE: So you're the other

Message 12315 in response to message 12313

Quote:

So you're the other cruncher running a K6/300! Funny thing here is I'm running NT Server 6a on mine as well. What are the odds on that! :-)

Only difference is I don't have any plans to retire him until he suffers a motherboard burnout, since I have him doing a number of backend chores for me and he performs well in this role.

Out of curiosity, I'm assuming you've run SAH as well as PAH on it, so what are your average run times like? I only run SAH on mine and couldn't compare since you have your SETI boxes hidden, although I can see it looks like I could run PAH as well if I drop back to a short CI.

One thing though, once you retire yours I may have to consider putting my other 300 back online so they don't drop off Willy's stats. ;-)

Alinator

It seems to scream through a SETI unit in about 28 hours (the 4 W/Us still shown all range between 102-118k seconds, one "short run" at a bit over 1k). I normally run SETI and PAH on it; if it runs dry from those, I let it grab an Einstein, but it struggles, so I haven't done that for quite a while now.

A few years ago, I taught an NT Server class, so I bought NT4 (I think it was SP4 at the time). It still works fine for what it needs to do, so I still use it. The power supply fan has gotten pretty noisy, and I really don't need any more heat in my office, so I'm planning on taking Dell up on their "free recycling" when I get done playing the "shuffle the computers around" game.

The net effect of the computer shuffle is that I will be replacing the K6-300 with a Pentium D Processor 840 with Dual Core Technology (3.20GHz, 800FSB). Wonder if I'll see my RAC go up??

Cheers.

Bill

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0

RE: The net effect of the

Message 12316 in response to message 12315

Quote:
The net effect of the computer shuffle is that I will be replacing the K6-300 with a Pentium D Processor 840 with Dual Core Technology (3.20GHz, 800FSB). Wonder if I'll see my RAC go up??

Doubt it ... the old computer was fast enough ... :)

tekwyzrd
Joined: 25 Feb 05
Posts: 49
Credit: 2922090
RAC: 0

I'm considering abandoning

I'm considering abandoning einstein altogether. I have two einstein units (one of them is about 43% completed, which means a total of about 38 hours of work left on both) and one seti unit with about 3 hours of work left on it. The einstein units are due 4 days from now. The seti unit is due 12 days from now.

6/16/2005 11:57:37 AM||Suspending work fetch because computer is overcommitted.
6/16/2005 11:57:37 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
6/16/2005 11:57:37 AM|Einstein@Home|Restarting result H1_1147.5__1147.7_0.1_T16_Fin1_2 using einstein version 4.79
6/16/2005 11:57:37 AM|SETI@home|Restarting result 05no03aa.15866.31776.790908.127_1 using setiathome version 4.11

One einstein unit is suspended until the one running is completed. This is the only way I can force the current version to share resources as I have decided. If I want each project to use 50%, I want it to do just that. I want to give each of the two projects use of 1 processor.

Overcommitted? 38 hours worth of work for einstein, 95 hours to do it, and my computer's overcommitted?

Consistently einstein takes over. When I download work BOINC immediately suspends the running seti unit and wants to run only einstein. Why? The shorter deadline.

Once the orbit@home project is up and running I'll give them my processor time instead of this resource hogging einstein.

Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)

Bill Hepburn
Joined: 16 Feb 05
Posts: 7
Credit: 77524752
RAC: 0

RE: I'm considering

Message 12318 in response to message 12317

Quote:

I'm considering abandoning einstein altogether. I have two einstein units (one of them is about 43% completed, which means a total of about 38 hours of work left on both) and one seti unit with about 3 hours of work left on it. The einstein units are due 4 days from now. The seti unit is due 12 days from now.

6/16/2005 11:57:37 AM||Suspending work fetch because computer is overcommitted.
6/16/2005 11:57:37 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
6/16/2005 11:57:37 AM|Einstein@Home|Restarting result H1_1147.5__1147.7_0.1_T16_Fin1_2 using einstein version 4.79
6/16/2005 11:57:37 AM|SETI@home|Restarting result 05no03aa.15866.31776.790908.127_1 using setiathome version 4.11

One einstein unit is suspended until the one running is completed. This is the only way I can force the current version to share resources as I have decided. If I want each project to use 50%, I want it to do just that. I want to give each of the two projects use of 1 processor.

Overcommitted? 38 hours worth of work for einstein, 95 hours to do it, and my computer's overcommitted?

Consistently einstein takes over. When I download work BOINC immediately suspends the running seti unit and wants to run only einstein. Why? The shorter deadline.

Once the orbit@home project is up and running I'll give them my processor time instead of this resource hogging einstein.

One problem with the scheduler appears to be that it relies on the "Connect Interval" to figure out how much work to keep on your machine. If you have the "Connect Interval" set to a value like 3 days, it thinks that it has to get the work done to return on one of those every 3 day connections. So, all it's going to want to do is Einstein until it gets caught up.
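A minimal Python sketch of that reasoning, using the figures from the quoted post (38 hours of work, 95 hours to the deadline) and a 3-day connect interval. This is only an illustration of the check as described here, not the actual BOINC client code; the function name and the single-CPU simplification are assumptions.

```python
def looks_overcommitted(work_hours, deadline_hours, connect_interval_hours):
    """Assumed rule: work must finish one connect interval before the
    deadline, because results can only be reported on a connection."""
    usable_hours = deadline_hours - connect_interval_hours
    return work_hours > usable_hours


# 38 hours of Einstein work, 95 hours to the deadline.
# With a 3-day (72 hour) connect interval only 23 hours are usable.
print(looks_overcommitted(38, 95, 72))

# With a 1-day (24 hour) connect interval 71 hours are usable.
print(looks_overcommitted(38, 95, 24))
```

Under these assumptions the same workload counts as overcommitted with a 3-day interval and not with a 1-day interval, which is why shortening the "Connect Interval" tends to calm things down.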

After it catches up, the debt thing, if it works as advertised, will remember that it "owes" Seti some CPU time, and it will only ask for Seti units until Seti "catches up". Then, it should start to rotate between the two projects pretty much as you expect. Of course, if one (or both) projects go down...
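The "debt" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions (equal 50/50 resource shares, one CPU, hypothetical function and variable names), not how the client actually keeps its books.

```python
def update_debts(debts, shares, ran, hours):
    """Credit each project with its share of the elapsed hours, then
    charge the full time to the project that actually ran."""
    total = sum(shares.values())
    for name in debts:
        debts[name] += hours * shares[name] / total
    debts[ran] -= hours
    return debts


debts = {"Einstein": 0.0, "SETI": 0.0}
shares = {"Einstein": 50, "SETI": 50}

# Einstein runs alone for 38 hours to beat its deadline.
update_debts(debts, shares, "Einstein", 38)

# The project with the largest debt is the one that gets work next.
next_project = max(debts, key=debts.get)
print(debts, next_project)
```

In this toy model SETI ends up owed 19 hours and Einstein is 19 hours ahead, so SETI is favored until the two even out.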

I think one of the reasons that 4.4x got a "bum rap" (including in my mind) was that when you switch to 4.4x, it "inherits" whatever combination of work is on your machine at the time. It looks at the "Connect Interval" and "Time to Completion" to figure out what has to be done to meet the deadlines by when it thinks it can report work. While it is doing that, it racks up a pile of "debt" to all of the other projects that will have to be repaid. So the mix of projects running will seem odd for quite a long while (my observation was about a week). This was made worse since if it couldn't get work from a project that was down, it felt that it "owed" that project the time it wasn't using (I believe this no longer happens). Since Seti had so many problems, it messed up perceptions and extended the time to balance out so that you could understand what is going on.

Again, pretty long-winded. Hope it sheds some light.

JoeB
Joined: 24 Feb 05
Posts: 124
Credit: 85294970
RAC: 10284

Seems to me that BOINC 4.4x

Seems to me that BOINC 4.4x should be listed as a BETA version until it is thoroughly checked out. When I first joined 4.19 was the "old" version and 4.25 was the "new and improved". Looks to me that 4.19 is still the old proven version, even though it still has warts. Why don't the "powers that be" label these new versions as "beta" until the details are worked out? Some folks want to compute WUs and others want to help code development.

Joe B

Divide Overflow
Joined: 9 Feb 05
Posts: 91
Credit: 183220
RAC: 0

IMHO, the scheduler in 4.45

IMHO, the scheduler in 4.45 is working well enough to be considered proven. It simply behaves so differently than the scheduler in 4.19 that many people assume that it's not working properly. Of course there are places where it can be fine-tuned and there's always room for improvement, but that can be said for all of the previous core client versions. It definitely meets the fundamental goal of managing project applications and allowing science to get done.

Fortunately, the project team has left 4.19 available to continue using for those who do not wish to use the new version. To each their own. The devs have taken steps to limit the public use and resulting confusion from so many development versions running around.

tekwyzrd
Joined: 25 Feb 05
Posts: 49
Credit: 2922090
RAC: 0

The problem that everyone

The problem that everyone seems to misunderstand is the fact that some users choose to run their computers a certain way for a reason. I run a dual p3 700. I want to contribute what I can to projects I see value in. I picked two projects. I'm not one to try to run more than my computer is capable of handling.

I had a problem shortly after switching to boinc and joining the einstein project. I admit I had a few units go unreported. It was a situation that no scheduler, no matter how complex, could fix. I thought my video card went bad. No display at boot. I had to wait for a new card to arrive. Once I installed it the problem was still there.

After going through the books that came with the motherboard and following the flow chart, according to the manufacturer's information my motherboard had to be replaced. Fortunately I decided to diverge from the suggested procedure. The real problem turned out to be something that the manufacturer information didn't even mention. One of my ppga to slot 1 cards went bad.

This said, it seems to me that there's a similar problem here. The "only one solution" approach doesn't always work.

A problem that I encounter on my computer is that on occasion when BOINC is running it refuses to allow me to switch between other programs I have running, forcing the three finger salute. It goes unresponsive. I can't even suspend BOINC. Once I do the old alt-ctrl-del to bring up the task manager I can suspend it via the tray. It's worst when there's two einstein units running.

Every time I bring up the scheduler entering panic mode and giving einstein priority people say how great the new scheduler is doing. They say it just takes getting used to. They say running one project for four days straight then running another for four days to equalize the debt is the same as running both projects simultaneously for eight days.

I say it's not. There has to be a better solution.

I don't have an excessively long connect interval. Three days (to have enough work on hand for common network problems). My recent run-in with the scheduler's panic mode was due to the fact that on June 9th I was given 5 work units, and two more on the 13th when I still had one partially done and one left to run. It takes about 24.5 hours to run one unit.

I have since made it a habit to set einstein to no new work once I have a couple units on hand (to prevent it from getting new work I don't want when I'm sending results to and getting work from seti).

Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)

The Gas Giant
Joined: 18 Jan 05
Posts: 72
Credit: 3109569
RAC: 0

David, When BOINC goes

David,

BOINC goes into deadline mode when it has more work than half of the time remaining to the wu with the earliest deadline. This is horribly wrong. It then crunches only that project until the problem is cleared. So you cannot hold more than 3.47 days (10,000 min divided by 2) of work in a cache, or have a wu that is within half of its deadline, unless you want BOINC to operate in deadline mode. After that it will not download more work for the project that was in deadline mode because of debt issues; once the debt issues are resolved it downloads a bucket load again and immediately crunches in deadline mode once more.
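Working the numbers in Python: 10,000 minutes is about 6.94 days, and half of that is about 3.47 days. The rule coded here is only the one stated in this post (more work than half the time to the earliest deadline), taken as an assumption rather than the client's real logic; the names are hypothetical.

```python
DEADLINE_MINUTES = 10_000  # deadline length quoted in this post


def in_deadline_mode(work_days, days_to_earliest_deadline):
    """Assumed rule: queued work exceeding half the time remaining to
    the earliest deadline triggers deadline mode."""
    return work_days > days_to_earliest_deadline / 2


deadline_days = DEADLINE_MINUTES / (60 * 24)
max_cache_days = deadline_days / 2
print(round(deadline_days, 2), round(max_cache_days, 2))

# A 4-day cache, as needed to cover a weekend offline, trips the rule.
print(in_deadline_mode(4, deadline_days))
```

So any "connect to network about" setting of 4 days or more would sit above the threshold under this rule.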

My specific problem is that my laptop does not connect to a network from 5pm Friday (TGIF btw) to 8:30am Monday and therefore needs a "connect to network about" preference of 4 days or more to ensure it gets sufficient work to last the weekend (and then wu estimated completion times cause further problems). Once Monday comes around BOINC is horribly messed up and refuses to download work on any project with a 6.94 day deadline and so by Monday night I run out of work (since my laptop does not connect from 5:30pm to 8:30am weekdays).

I did not experience these issues with 4.19.

The only fix for this is to have 2 preferences.
- hold x days of work.
- connect every y days.

This is basically how seti queue used to work and it worked a treat.

I would have the following settings
- hold 5 days of work.
- connect every 1 days.

In this way you could hold sufficient work to last through most outages, but keep returning work so that it didn't get close to a deadline. The amount of work actually requested would then be based on resource share and computer up time and boinc on time. The only thing left to fix would be the estimated wu completion times, which cause major shortages in cached work. Or in the case of einstein it can cause a computer to become overcommitted since it severely underestimates the completion time.
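A rough Python sketch of how the two preferences could be kept separate. Everything here is hypothetical (the function names, the formula, and the use of a single "on fraction" to stand in for computer up time and boinc on time); it only illustrates the proposal, not any existing client behavior.

```python
def work_to_request(hold_days, queued_days, resource_share, on_fraction):
    """Days of work to ask one project for (hypothetical formula).

    The target is the project's share of the buffer, scaled by the
    fraction of time the machine is actually crunching.
    """
    target = hold_days * resource_share * on_fraction
    return max(0.0, target - queued_days)


def should_connect(days_since_last_contact, connect_every_days):
    """Report finished work on the connect schedule, independent of
    how much work is being held."""
    return days_since_last_contact >= connect_every_days


# Hold 5 days of work, connect every 1 day, two projects at 50% each,
# machine crunching all the time, 1 day of this project's work queued.
print(work_to_request(5, 1.0, 0.5, 1.0))
print(should_connect(1.0, 1))
```

The point of the split is that the buffer size decides how much to fetch while the connect setting decides how often to report, so a large buffer no longer implies a long gap before results go back.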

Live long and crunch.

Paul
