new BOINC recommended version 4.45

Ed and Harriet Griffith
Joined: 18 Jan 05
Posts: 30
Credit: 1668218
RAC: 4855
Topic 189317

So far it seems to be running all right.


JohnB175
Joined: 22 Jan 05
Posts: 4
Credit: 232838
RAC: 0


OK, I have to ask whether I should make the switch. My small farm has been running 4.19 with no problems for quite some time now. I've read a lot of people rave about, and also bash, the new scheduler. So the question is: should I upgrade all my machines to 4.45 at this point? What major benefits does it offer? I am running a mix of Windows 2000, XP and Linux. Any comments or suggestions are welcome.

Liberto [Valencia]
Joined: 11 Nov 04
Posts: 38
Credit: 26927
RAC: 0


Contrary to what Ed & Harriet Griffith say, I have been using 4.45 for close to 39 hours now and, apart from one Climate unit and 4 units from Einstein (now down to one), I cannot get units from any other project.

Even more... I have a unit that has been reported back as finished:
10/06/2005 14:13:08|Einstein@Home|Computation for result H1_1206.0__1206.4_0.1_T13_Fin1_0 finished
10/06/2005 14:13:08|Einstein@Home|Starting result H1_1206.0__1206.0_0.1_T14_Fin1_0 using einstein version 4.79
10/06/2005 14:13:09|Einstein@Home|Started upload of H1_1206.0__1206.4_0.1_T13_Fin1_0_0
10/06/2005 14:13:16|Einstein@Home|Finished upload of H1_1206.0__1206.4_0.1_T13_Fin1_0_0

10/06/2005 14:13:16|Einstein@Home|Throughput 13063 bytes/sec
10/06/2005 14:13:17|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
10/06/2005 14:13:17|Einstein@Home|Requesting 0 seconds of work, returning 1 results
10/06/2005 14:13:19|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

However, it is not shown as reported on my pending results page, and it has been more than 25 minutes since this happened.
I think 4.45 still needs a lot of refinement before it can be called official, as SETI calls it.
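
For anyone who wants to check a log like the one above without reading every line, here is a minimal Python sketch. It assumes only what the excerpt shows: each line is `date time|project|message`. The day/month date order and the function names (`parse_line`, `scheduler_events`) are my assumptions, not anything defined by BOINC.

```python
from datetime import datetime

# Three lines copied from the log excerpt in this post.
LOG = """\
10/06/2005 14:13:16|Einstein@Home|Finished upload of H1_1206.0__1206.4_0.1_T13_Fin1_0_0
10/06/2005 14:13:17|Einstein@Home|Requesting 0 seconds of work, returning 1 results
10/06/2005 14:13:19|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
"""


def parse_line(line):
    """Split one 'date time|project|message' line into its three fields."""
    stamp, project, message = line.split("|", 2)
    # Assumption: the date is written day/month/year.
    when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return when, project, message


def scheduler_events(text):
    """Return only the lines that talk about scheduler requests."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines in a pasted log
        when, project, message = parse_line(line)
        lowered = message.lower()
        if "scheduler request" in lowered or lowered.startswith("requesting"):
            events.append((when, project, message))
    return events


for when, project, message in scheduler_events(LOG):
    print(when.time(), project, message)
```

A "succeeded" line only says the client reached the scheduler; whether the result then shows up on the web page is a server-side matter that the client log cannot confirm.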


Patience is a virtue

Purple Rabbit
Joined: 15 Feb 05
Posts: 12
Credit: 82491580
RAC: 149601


Don't upgrade your Linux machine(s) yet. Linux is currently at BOINC 4.43. There are some real problems with this release. I had the ghost WU problem so I downgraded my Linux machine back to 4.30.

BOINC 4.45 is doing all right for me on Windows. You will have to get used to the new scheduler. You won't like it for the first few days :-) It's different. I like it now. I don't have to watch for missed deadlines anymore. Note that I have a broadband connection and set my connect interval to 0.25 days. Many people have an issue with it going into panic mode "prematurely" with connect intervals larger than 2 days.

Blank Reg
Joined: 18 Jan 05
Posts: 228
Credit: 40599
RAC: 0


Message 12306 in response to message 12304

Quote:

Contrary to what Ed & Harriet Griffith say, I have been using 4.45 for close to 39 hours now and, apart from one Climate unit and 4 units from Einstein (now down to one), I cannot get units from any other project.

Even more... I have a unit that has been reported back as finished:
10/06/2005 14:13:08|Einstein@Home|Computation for result H1_1206.0__1206.4_0.1_T13_Fin1_0 finished
10/06/2005 14:13:08|Einstein@Home|Starting result H1_1206.0__1206.0_0.1_T14_Fin1_0 using einstein version 4.79
10/06/2005 14:13:09|Einstein@Home|Started upload of H1_1206.0__1206.4_0.1_T13_Fin1_0_0
10/06/2005 14:13:16|Einstein@Home|Finished upload of H1_1206.0__1206.4_0.1_T13_Fin1_0_0

10/06/2005 14:13:16|Einstein@Home|Throughput 13063 bytes/sec
10/06/2005 14:13:17|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
10/06/2005 14:13:17|Einstein@Home|Requesting 0 seconds of work, returning 1 results
10/06/2005 14:13:19|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

However, it is not shown as reported on my pending results page, and it has been more than 25 minutes since this happened.
I think 4.45 still needs a lot of refinement before it can be called official, as SETI calls it.

It is not 4.45 that is the problem. You cannot get SETI WUs because the server is down again or is dropping connections because of the backlog.

Liberto [Valencia]
Joined: 11 Nov 04
Posts: 38
Credit: 26927
RAC: 0


OK Fuzzy Logic, got the message! Sorry!

Patience is a virtue

Bill Hepburn
Joined: 16 Feb 05
Posts: 7
Credit: 77524752
RAC: 0


Message 12308 in response to message 12303

This gives me the opportunity to go on record with my sense of the new version. The Seti@Home forums have degenerated over this whole thing to the point that I scarcely look there any longer, and I wouldn't expose myself to the ad hominem attacks that seem to be far too frequent there.

There is the old saying "if it works, don't mess with it." There is a lot of merit to that in general.

I can't comment on Linux versions, but here is my opinion based on my experience with Windows boxes (NT, ME, XP Home, and XP Pro, one each). I am connected to some combination of 5 projects.

The short version is that I am using 4.45 on all of my boxes, and it is working fine. I think it got a bum rap, for a lot of reasons not necessarily related to 4.45 itself (more Seti related I believe), and it will take quite a long while before it gets over that.

The much longer version follows.

The ability to install and run as a service is a big plus (especially for NT and XP Pro). When the machine is booted up, it is crunching; no need to be logged in. I believe this came about since 4.19.

The ability to sort a display window is gone. I remember reading somewhere that Dr. Anderson doesn't like it, so it's gone. I don't know who said that, or if they knew what they were talking about. I hope that they were wrong and/or didn't know what they were talking about. Every other Windows application allows it, it can be useful, and the developer can set up the desired "default" sort order. IMO, a small minus.

The scheduler was an attempt to "fix" scheduling. If you had a rational mix of programs, on a reasonably fast machine, that is turned on long enough, you didn't have problems that needed fixing. Now, the scheduler intervenes for you, and not necessarily in the way you want it to. Computers aren't as smart as people! (However, that is not the same as "some people aren't dumber than computers.") IMHO, this is a significant minus. Reasons follow:

This scheduling thing is a less than perfect process. It relies on the "Connect Interval" to figure out how much work to keep on your machine. I have used 0.5 days for ages, and it works fine for me (I'm on a 24/7 connection). Some people have their "Connect Interval" set to a much larger number (5 days or more) for various reasons (mostly to keep a large cache of work on hand), and that makes the scheduler go nuts. It thinks that it has to get work done and returned on one of those every-5-days connections, so it goes nuts. If you wade through JM7's descriptions in the forums or Mr. Buck's "Boinc-Wiki", it makes sense in a weird computer way, but it isn't intuitive. There is talk of reintroducing in a future build the "Keep x days work on hand" in addition to "Connect every x days". IMHO, a good idea. "Keep x days of work, but connect whenever you want to" seems like a reasonable thing to me, and I can't do that now.
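
To make the connect-interval effect concrete, here is a rough Python sketch of the reasoning as I understand it. It is not the actual BOINC code: the rule "finish one connect interval before the deadline" is my reading of the explanations, and the function names and the 7-day/3-day numbers are made up for illustration.

```python
def must_finish_by(deadline_days, connect_interval_days):
    """Days from now by which the work should be done.

    Assumption: the scheduler expects to report results only at a
    scheduled connection, so it wants the work finished one connect
    interval before the real deadline.
    """
    return deadline_days - connect_interval_days


def in_panic(remaining_cpu_days, deadline_days, connect_interval_days):
    """True if the remaining crunch time does not fit before that cutoff."""
    return remaining_cpu_days > must_finish_by(deadline_days, connect_interval_days)


# Hypothetical work unit: deadline in 7 days, 3 days of crunching left.
print(in_panic(3.0, 7.0, 0.5))  # False: 3 days fits inside 6.5 days
print(in_panic(3.0, 7.0, 5.0))  # True: 3 days does not fit inside 2 days
```

The same work unit on the same machine is fine with a 0.5-day interval and looks late with a 5-day interval, which matches the "premature" panic people complain about.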

This short term/long term debt thing is confusing at best. You will read "leave it alone for a couple of days and it will follow your resource share." I believe that, to be correct, this really needs to say "Leave it alone for several days, when all of the programs you are connected to are up and running normally, and it will follow your resource share." I haven't seen all programs running normally for more than a couple of days since the beginning of BOINC.

The concept seems to be that it tries to honor your "resource share" over time (and tries to make sure you don't blow deadlines). It won't download work unless the cache is almost empty or the work will go toward honoring the "resource share" (keeping debt balanced)... so after an outage where you couldn't crunch program A for a few days, it only wants to download and crunch program A work units. It will get program B work if it can't get enough program A work units, but it really doesn't want to. If the CPU is about to "run dry", it is supposed to get work from wherever it can, and seems to do so pretty well, but the cache gets real low (a few minutes) and you can't seem to force it to download (and that makes a lot of people go crazy). There is also a problem in this area with hyperthreading/multiple CPUs: one CPU can run dry and it just sits there (I believe "the fix is in", but not until the next build; I haven't seen it in a long time anyway).
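
Here is a toy Python sketch of the debt idea as described above. It is only my illustration, not how the BOINC client actually computes short-term or long-term debt; the names `update_debts` and `pick_project` and the 50/50 shares are hypothetical.

```python
def update_debts(debts, shares, cpu_used, elapsed):
    """Credit each project its share of the elapsed time, minus what it used.

    debts    -- current debt per project (same time units as 'elapsed')
    shares   -- resource share per project
    cpu_used -- CPU time each project actually got during 'elapsed'
    """
    total_share = sum(shares.values())
    new_debts = {}
    for project, debt in debts.items():
        owed = elapsed * shares[project] / total_share
        new_debts[project] = debt + owed - cpu_used.get(project, 0.0)
    return new_debts


def pick_project(debts):
    """The project owed the most CPU time is the one it wants work from."""
    return max(debts, key=debts.get)


# Hypothetical case: A and B share 50/50, but A was down for 10 hours,
# so B got all of the CPU.
shares = {"A": 50, "B": 50}
debts = update_debts({"A": 0.0, "B": 0.0}, shares, {"B": 10.0}, elapsed=10.0)
print(debts)                # {'A': 5.0, 'B': -5.0}
print(pick_project(debts))  # A
```

After the outage A is "owed" 5 hours and B is 5 hours ahead, which is why the client only wants program A work until the balance evens out.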

This scheduler thing also takes into account the estimated time to complete a work unit. I understand that those are computed from the "benchmark" computations on the individual machine, and estimates from the individual projects. Those estimates are notoriously inaccurate. So it uses questionable data to do its scheduling thing.
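
As a back-of-the-envelope illustration of why a bad estimate matters, here is a small Python sketch. The formula (project's operation estimate divided by the machine's benchmark speed) is my simplified reading of the description above, and the parameter names and numbers are hypothetical, not BOINC's own fields.

```python
def estimated_seconds(project_ops_estimate, benchmark_ops_per_sec):
    """Estimated run time = operations the project claims / benchmarked speed."""
    return project_ops_estimate / benchmark_ops_per_sec


# Hypothetical work unit: the project says 1e13 operations,
# and the machine benchmarks at 1e9 operations per second.
estimate = estimated_seconds(1e13, 1e9)
print(estimate)  # 10000.0 seconds

# If the science application really runs at half the benchmark speed,
# the work unit takes twice as long as the scheduler planned for.
actual = estimated_seconds(1e13, 0.5e9)
print(actual / estimate)  # 2.0
```

Either input being off (the project's operation count or the benchmark) throws the scheduling decision off by the same factor.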

I believe that the problem of "Ghost Units" is fixed (although CPDN had a database corruption problem and they are/were passing out work units that are/were not being credited properly, so they kind of look like ghosts). I also believe that it is important to "uninstall" the 4.19 version before installing 4.45 to help avoid ghosts (and it doesn't hurt anyway).

So a lot of words. Some of them may (or may not) make any sense. I hope this sheds some light.

Bill Hepburn
Joined: 16 Feb 05
Posts: 7
Credit: 77524752
RAC: 0


Message 12309 in response to message 12308

I couldn't edit after an hour, but I wanted to add another plus.

The ability to select "no more work" for a project is nice too. I have two machines that are pretty slow (300MHz K6 and a 650MHz PIII laptop). Einstein causes those machines to go nuts since a workunit takes about half of the time until its deadline. So, I don't normally run Einstein on those machines, but when all of the projects are down, I can just unclick "no more work" and it downloads an Einstein WU; I then tell it "no more work" and all is well for about 3 days. Much easier than detaching and reattaching. Good if you are going to retire a machine (the K6 goes when UPS delivers the new Dell, and the old machines get repurposed).

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0


Bill,

Well written ...

And as far as I can see ... accurate.

I don't know about the column sorting thing, I am not sure that was not an artifact of the selection of the WX widgets ... I doubt that Dr. Anderson goes around pulling out features he does not like for the heck of it ... so, I think that is an urban legend ...

But, I have been wrong before ... :)

The connection setting is up for debate and it looks like it is possible we will see a pair of variables ... also, Dr. Anderson put in a fix to try to mitigate the inaccurate estimates, especially when a WU starts up ...

Bill Hepburn
Joined: 16 Feb 05
Posts: 7
Credit: 77524752
RAC: 0


Message 12311 in response to message 12310

Quote:

I don't know about the column sorting thing, I am not sure that was not an artifact of the selection of the WX widgets ... I doubt that Dr. Anderson goes around pulling out features he does not like for the heck of it ... so, I think that is an urban legend ...

I certainly hope you are right. But as I said, I remember that somebody said it in one of the forums. My memory may be faulty, or somebody may have been misinformed or didn't know what they were talking about. I'll give you even money, you pick 'em.

I was grumbling earlier today while I was trying to look through a whole bunch of red "cannot connect" messages on my laptop to see what was going on. Would have been nice to sort them by project. Turned out somebody forgot to plug in the network connection when they got home last night.

Quote:

But, I have been wrong before ... :)

But from what I have seen, not very often.

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5385205
RAC: 0


Message 12312 in response to message 12311

Quote:
I was grumbling earlier today while I was trying to look through a whole bunch of red "cannot connect" messages on my laptop to see what was going on. Would have been nice to sort them by project. Turned out somebody forgot to plug in the network connection when they got home last night.

You can also use BOINC View as an alternate monitor. Even if I were only running one application on one computer, it would be nice to have, as it does things like logging the work units completed. And you can sort the columns any way you would like ...
