Downloading new work

F. Prefect
Joined: 7 Nov 05
Posts: 135
Credit: 1016868
RAC: 0
Topic 190187

I've been running Einstein for several days on 5 machines that are not networked. For some reason 2 of my machines are not receiving new work. One is a 1.2 GHz Athlon (T-bird core); the other is an Athlon 64 3400+. I had my pref. set to connect to network every 2 days and have just changed it to 1 day, but no results. I'll try to attach the messages from the 1.2 GHz machine, as they include messages such as "computer overcommitted" and "work fetch suspended". If anyone could help I would appreciate it very much.

11/20/2005 5:42:12 AM||Starting BOINC client version 5.2.7 for windows_intelx86
11/20/2005 5:42:12 AM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
11/20/2005 5:42:12 AM||Data directory: C:\Program Files\BOINC
11/20/2005 5:42:13 AM||Processor: 1 AuthenticAMD AMD Athlon(tm) Processor
11/20/2005 5:42:13 AM||Memory: 255.42 MB physical, 617.48 MB virtual
11/20/2005 5:42:13 AM||Disk: 18.63 GB total, 14.84 GB free
11/20/2005 5:42:13 AM|rosetta@home|Computer ID: 59981; location: home; project prefs: default
11/20/2005 5:42:13 AM|Einstein@Home|Computer ID: 438562; location: home; project prefs: default
11/20/2005 5:42:13 AM||General prefs: from Einstein@Home (last modified 2005-11-19 16:17:15)
11/20/2005 5:42:13 AM||General prefs: no separate prefs for home; using your defaults
11/20/2005 5:42:13 AM||Remote control not allowed; using loopback address
11/20/2005 5:42:13 AM|rosetta@home|Resuming computation for result 1hz6A_abrelaxmode_random_length20_jitter02_omega_sim_aneal_39523_0 using rosetta version 479
11/20/2005 5:42:13 AM||Suspending work fetch because computer is overcommitted.
11/20/2005 5:42:13 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
11/20/2005 5:42:42 AM||request_reschedule_cpus: project op
11/20/2005 5:42:44 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
11/20/2005 5:42:44 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:42:44 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 5:42:54 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
11/20/2005 5:51:05 AM||request_reschedule_cpus: project op
11/20/2005 5:51:10 AM||request_reschedule_cpus: project op
11/20/2005 5:51:21 AM||request_reschedule_cpus: project op
11/20/2005 5:51:25 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
11/20/2005 5:51:25 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:51:25 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 5:51:28 AM||request_reschedule_cpus: project op
11/20/2005 5:51:29 AM||request_reschedule_cpus: project op
11/20/2005 5:51:31 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
11/20/2005 5:51:38 AM||request_reschedule_cpus: project op
11/20/2005 5:51:40 AM||request_reschedule_cpus: project op
11/20/2005 6:01:21 AM||request_reschedule_cpus: project op
11/20/2005 6:01:23 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
11/20/2005 6:01:23 AM|Einstein@Home|Reason: Requested by user
11/20/2005 6:01:23 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 6:01:28 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded
11/20/2005 6:01:28 AM|Einstein@Home|General preferences have been updated
11/20/2005 6:01:28 AM||General prefs: from Einstein@Home (last modified 2005-11-20 06:01:05)
11/20/2005 6:01:28 AM||General prefs: no separate prefs for home; using your defaults

F. Prefect

In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.....Douglas Adams

J D K
Joined: 27 Aug 05
Posts: 86
Credit: 103878
RAC: 0

Downloading new work

BOINC thinks it has too much work, so just let it run and it should even out in time.

Take a look here

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5869
Credit: 113330675422
RAC: 36782552

RE: I've been running

Quote:
I've been running Einstein for several days on 5 machines that are not networked.

What exactly do you mean? Not connected to each other (ie not on a LAN) or not connected to the internet?? Can you please elaborate on how each machine manages to upload results or download work from anywhere?

Quote:
For some reason 2 of my machines are not receiving new work.

Please read your messages. BOINC is telling you that it thinks you have too much already. This is a temporary thing. Once the excess work is safely crunched, BOINC will then allow you to get more work. Your resource shares will be honoured in time if you leave BOINC alone.

Quote:
I had my pref. set to connect to network every 2 days and have just changed it to 1 day, but no results.

If you keep changing your "connect" setting you will make it more difficult for BOINC to settle into a nice efficient routine of always having an appropriate amount of work from each project. To get your computers overcommitted, you have probably had it higher than 2 days at some stage. Now that you have it to 1 day please do NOT increase it again until you complete all the work that is at risk of blowing past a deadline. In fact if you want BOINC to get back on an even keel faster, then reduce it further, say to 0.5 days or less. When BOINC is happy that no deadlines are at risk, it will drop out of EDF mode and start normal crunching of all your projects. Please stop fiddling and allow BOINC to do this for you. Please note that it may take BOINC a week or more to restore sanity to your computers. Please be patient.
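For anyone wondering what "overcommitted" and EDF mode amount to, here is a rough Python sketch. It is not BOINC's actual scheduler; the task names, hours and deadlines are made-up illustration values, and it assumes a single CPU running 24/7. It simply runs the queue in earliest-deadline-first order and reports which results would still finish late.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    hours_left: float      # estimated CPU hours still needed (assumed value)
    deadline_hours: float  # hours from now until the deadline (assumed value)


def edf_order(tasks):
    """Return the tasks sorted by deadline, soonest first."""
    return sorted(tasks, key=lambda t: t.deadline_hours)


def missed_deadlines(tasks):
    """Names of tasks that would finish late on one CPU, even under EDF."""
    elapsed = 0.0
    late = []
    for task in edf_order(tasks):
        elapsed += task.hours_left
        if elapsed > task.deadline_hours:
            late.append(task.name)
    return late


# Hypothetical queue: two long results with close deadlines, one with plenty of time.
queue = [
    Task("rosetta_a", hours_left=30, deadline_hours=40),
    Task("rosetta_b", hours_left=30, deadline_hours=50),
    Task("einstein_a", hours_left=12, deadline_hours=168),
]

print(missed_deadlines(queue))  # ['rosetta_b']
```

In this toy model a non-empty list is the situation where fetching more work would only make things worse, which is why a smaller "connect" setting helps the queue drain.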

Quote:

11/20/2005 5:42:13 AM||Suspending work fetch because computer is overcommitted.
11/20/2005 5:42:13 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
11/20/2005 5:42:42 AM||request_reschedule_cpus: project op
11/20/2005 5:42:44 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
11/20/2005 5:42:44 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:42:44 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 5:42:54 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

At a minimum you need to read and understand what BOINC is saying to you in the above snippet. This stuff is explained in the Wiki and you need to go and read about it there. Here is a summary:-

  * BOINC tells you your machine has too much work (ie your "connect" setting has been far too large at some stage) and it won't allow more work.
  * BOINC tells you that it's going into EDF mode to protect you against possibly overrunning deadlines.
  * The next few lines are what you see when you try to force BOINC by issuing an update.
  * BOINC patiently tries to reason with you by telling you that it didn't need to contact the server so you are just "fiddling" for no good reason.
  * BOINC tells you that it completed your "request" successfully and nothing happened because nothing needed to happen.

If you read your messages you will see you actually forced this update three times and BOINC resisted you three times. You will see the same note from BOINC that nothing was needed. You can almost hear the exasperation in BOINC's voice :).
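If reading the message tab by eye is a chore, the same summary can be pulled out with a few lines of Python. This is only a sketch: it assumes the `timestamp|project|message` layout seen in the log posted above, the sample lines are copied from that log, and the function names are my own invention rather than anything BOINC provides.

```python
# Lines copied from the message log posted earlier in this thread.
SAMPLE_LOG = """\
11/20/2005 5:42:13 AM||Suspending work fetch because computer is overcommitted.
11/20/2005 5:42:13 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
11/20/2005 5:42:44 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:42:44 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 5:51:25 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:51:25 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 6:01:23 AM|Einstein@Home|Reason: Requested by user
11/20/2005 6:01:23 AM|Einstein@Home|Note: not requesting new work or reporting results
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    timestamp, project, message = line.split("|", 2)
    return timestamp, project, message


def summarize(log_text):
    """Count the messages discussed above: overcommitment and forced updates."""
    summary = {"overcommitted": False, "user_updates": 0, "no_work_requested": 0}
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # not a log line in the assumed layout
        _, _, message = parse_line(line)
        if "computer is overcommitted" in message:
            summary["overcommitted"] = True
        if message.startswith("Reason: Requested by user"):
            summary["user_updates"] += 1
        if message.startswith("Note: not requesting new work"):
            summary["no_work_requested"] += 1
    return summary


print(summarize(SAMPLE_LOG))
```

On the sample it reports the machine as overcommitted, with three user-requested updates and three "not requesting new work" notes.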

If you actually walk away and leave it alone, BOINC will sort it out for you.

Cheers,
Gary.

F. Prefect
Joined: 7 Nov 05
Posts: 135
Credit: 1016868
RAC: 0

RE: RE: I've been running

Message 19552 in response to message 19551

Quote:
Quote:
I've been running Einstein for several days on 5 machines that are not networked.

What exactly do you mean? Not connected to each other (ie not on a LAN) or not connected to the internet?? Can you please elaborate on how each machine manages to upload results or download work from anywhere?

I have 5 "standalone" computers not connected in any way.

Quote:
For some reason 2 of my machines are not receiving new work.

Please read your messages. BOINC is telling you that it thinks you have too much already. This is a temporary thing. Once the excess work is safely crunched, BOINC will then allow you to get more work. Your resource shares will be honoured in time if you leave BOINC alone.

That's what I couldn't figure out. I had no Einstein jobs remaining.

Quote:
I had my pref. set to connect to network every 2 days and have just changed it to 1 day, but no results.

If you keep changing your "connect" setting you will make it more difficult for BOINC to settle into a nice efficient routine of always having an appropriate amount of work from each project. To get your computers overcommitted, you have probably had it higher than 2 days at some stage. Now that you have it to 1 day please do NOT increase it again until you complete all the work that is at risk of blowing past a deadline. In fact if you want BOINC to get back on an even keel faster, then reduce it further, say to 0.5 days or less. When BOINC is happy that no deadlines are at risk, it will drop out of EDF mode and start normal crunching of all your projects. Please stop fiddling and allow BOINC to do this for you. Please note that it may take BOINC a week or more to restore sanity to your computers. Please be patient.

The only 2 settings I have tried have been 1 day and 2 days, but I'll try .5 for both Einstein and Rosetta and see what kind of results that produces.

Quote:

11/20/2005 5:42:13 AM||Suspending work fetch because computer is overcommitted.
11/20/2005 5:42:13 AM||Using earliest-deadline-first scheduling because computer is overcommitted.
11/20/2005 5:42:42 AM||request_reschedule_cpus: project op
11/20/2005 5:42:44 AM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
11/20/2005 5:42:44 AM|Einstein@Home|Reason: Requested by user
11/20/2005 5:42:44 AM|Einstein@Home|Note: not requesting new work or reporting results
11/20/2005 5:42:54 AM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

At a minimum you need to read and understand what BOINC is saying to you in the above snippet. This stuff is explained in the Wiki and you need to go and read about it there. Here is a summary:-

  * BOINC tells you your machine has too much work (ie your "connect" setting has been far too large at some stage) and it won't allow more work.
  * BOINC tells you that it's going into EDF mode to protect you against possibly overrunning deadlines.
  * The next few lines are what you see when you try to force BOINC by issuing an update.
  * BOINC patiently tries to reason with you by telling you that it didn't need to contact the server so you are just "fiddling" for no good reason.
  * BOINC tells you that it completed your "request" successfully and nothing happened because nothing needed to happen.

If you read your messages you will see you actually forced this update three times and BOINC resisted you three times. You will see the same note from BOINC that nothing was needed. You can almost hear the exasperation in BOINC's voice :).

If you actually walk away and leave it alone, BOINC will sort it out for you.

I'm beginning to realize that. After running FaD for the past 2 1/2 years and SETI for a couple of years before that, I got used to having to do more "maintenance", so to speak.

But I wanted to let you know what finally worked, quite possibly by accident. Since all of my Einstein jobs were completed and uploaded I was about to try the "reset program" option, but before I did, I tried one more shot in the dark and suspended Rosetta, which is the other project running on that machine. Almost immediately I began to receive a 7.9 MB download from Einstein.

Regardless, I appreciate your reply, have hopefully learned a few things that will be of use in the future, and will read the material you recommended.

thanks again,
F. Prefect


In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.....Douglas Adams

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 25302

RE: I tried one more shot

Message 19553 in response to message 19552

Quote:
I tried one more shot in the dark and suspended Rosetta, which is the other project running on that machine. Almost immediately I began to receive a 7.9 MB download from Einstein.

BOINC Manager will keep "x" days worth of work (the 'connect to every' setting) on your machine. And it will base which project to get work from on your resource share settings and how much work you have done for the various projects so far (Long Term Debt). But... if you have a project suspended, then it isn't included in the calculations, so even if you "owed" a lot of CPU time to Rosetta, and thus it was not allowing you to get Einstein work, as soon as you suspended Rosetta, it was out of the equation. So more Einstein work was downloaded. However - the "debt" is still owed, so whenever you _unsuspend_ Rosetta, Einstein will just sit, until that debt is paid. With a one or two day cache, this won't be too bad - with a larger cache, you can wind up in deadline trouble by doing this. ("Overcommitted" messages, but not able to compensate even running Earliest Deadline First...)
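To make the "out of the equation" point concrete, here is a deliberately simplified Python toy. It is not how BOINC really computes work fetch; the debt numbers are invented, and the rule "ask the non-suspended project with the largest long term debt" is an assumption made purely for illustration.

```python
def pick_project_for_work(projects):
    """Return the non-suspended project that is owed the most CPU time, or None.

    projects maps a project name to {"debt": float, "suspended": bool}.
    Toy rule only: real BOINC work fetch considers much more than this.
    """
    candidates = {
        name: info for name, info in projects.items() if not info["suspended"]
    }
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name]["debt"])


# Hypothetical state: Rosetta is owed CPU time, Einstein has had more than its share.
projects = {
    "rosetta@home": {"debt": 5000.0, "suspended": False},
    "Einstein@Home": {"debt": -5000.0, "suspended": False},
}
print(pick_project_for_work(projects))  # rosetta@home

# Suspending Rosetta removes it from the calculation, but its debt is unchanged.
projects["rosetta@home"]["suspended"] = True
print(pick_project_for_work(projects))  # Einstein@Home
```

Note that the debt value is untouched by the suspend, which is why Rosetta goes straight back to the front of the queue once it is resumed.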

If you are on Windows, you may want to download BOINCView. It's a "BOINC Manager replacement" with some features added and others missing. I don't use it to replace BOINC Manager, but I do keep it around as an easy way to look at the Long Term Debt for my various projects. Generally though, if you just trust the Manager and don't change things very often, it'll do its job and keep you out of trouble. It's almost the opposite of things like SETIspy, etc... the LESS you do, the better it can do its job! My most-stable running machine is my daughter's laptop; I installed BOINC on it months ago and haven't touched it since. The machines I sit in front of (and micromanage...) every day tend to get out of balance...

John McLeod VII
Moderator
Joined: 10 Nov 04
Posts: 547
Credit: 632255
RAC: 0

RE: RE: I tried one more

Message 19554 in response to message 19553

Quote:
Quote:
I tried one more shot in the dark and suspended Rosetta, which is the other project running on that machine. Almost immediately I began to receive a 7.9 MB download from Einstein.

BOINC Manager will keep "x" days worth of work (the 'connect to every' setting) on your machine. And it will base which project to get work from on your resource share settings and how much work you have done for the various projects so far (Long Term Debt). But... if you have a project suspended, then it isn't included in the calculations, so even if you "owed" a lot of CPU time to Rosetta, and thus it was not allowing you to get Einstein work, as soon as you suspended Rosetta, it was out of the equation. So more Einstein work was downloaded. However - the "debt" is still owed, so whenever you _unsuspend_ Rosetta, Einstein will just sit, until that debt is paid. With a one or two day cache, this won't be too bad - with a larger cache, you can wind up in deadline trouble by doing this. ("Overcommitted" messages, but not able to compensate even running Earliest Deadline First...)

If you are on Windows, you may want to download BOINCView. It's a "BOINC Manager replacement" with some features added and others missing. I don't use it to replace BOINC Manager, but I do keep it around as an easy way to look at the Long Term Debt for my various projects. Generally though, if you just trust the Manager and don't change things very often, it'll do its job and keep you out of trouble. It's almost the opposite of things like SETIspy, etc... the LESS you do, the better it can do its job! My most-stable running machine is my daughter's laptop; I installed BOINC on it months ago and haven't touched it since. The machines I sit in front of (and micromanage...) every day tend to get out of balance...


Actually, in this case, BOINC has at least one result that is in time trouble. If a project is suspended, and more work is downloaded, SOME WORK WILL BE RETURNED AFTER DEADLINE AND MAY BE WORTHLESS.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5869
Credit: 113330675422
RAC: 36782552

RE: But I wanted to let you

Quote:
But I wanted to let you know what finally worked, quite possibly by accident. Since all of my Einstein jobs were completed and uploaded I was about to try the "reset program" option, but before I did, I tried one more shot in the dark and suspended Rosetta, which is the other project running on that machine. Almost immediately I began to receive a 7.9 MB download from Einstein.

Whilst Bill has explained things admirably, I just want to mention a couple of other things that you really need to latch onto. Firstly, I'm very glad you didn't do the reset thing. That should be regarded as absolutely last resort when there really is a problem.

Secondly, you only mention EAH and rosetta but are these really the only two projects you are running? It's hard to imagine how you became overcommitted if you only have two projects and your "connect" setting has never been higher than 2. (Update: I've just gone back through all your previous messages and found one from Nov 08 where you seem to be saying that you are actually running 5 projects?)

Thirdly, you need to fully understand that having no EAH work on your computer was NOT a problem. BOINC was doing its level best to honour your chosen resource shares. You don't seem to understand that. Consider this example. Let's imagine that on your machine, an EAH result takes 12 hours and a rosetta result takes 2 hours. These are just example numbers as I've never even visited the rosetta website, let alone crunched any. I have no idea of an actual rosetta time but just bear with the numbers I have given. Let's also assume that you crunch 24/7 and that your resource share is 50/50. Ideally at the end of a day BOINC will have allowed 12 hours of EAH crunching and 12 hours of rosetta crunching. It may not happen exactly that way each day but over the longer term it will average out to exactly that. Now I'm almost willing to wager that when you see only one EAH unit but SIX rosetta units being done in a day that you think that BOINC is favouring rosetta to the exclusion of EAH? Is that a fair assessment? Now if you just happen to get too much work and you get a whole bunch of rosetta work that is at risk of missing a deadline, BOINC will TEMPORARILY stop EAH from getting new work just to clear the excess rosetta. However BOINC will faithfully keep track of this (through LTD - long term debt) and will "pay back" that debt to EAH when it can.
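The arithmetic in that example can be checked with a few lines of Python. The 12 hour and 2 hour run times are the same made-up example numbers as above, not real figures for either project, and the function name is just something chosen for this sketch.

```python
def results_per_day(shares, hours_per_result, hours_per_day=24.0):
    """Results completed per day per project, given resource shares.

    shares: project name -> resource share (any consistent units).
    hours_per_result: project name -> CPU hours needed for one result.
    """
    total_share = sum(shares.values())
    completed = {}
    for project, share in shares.items():
        cpu_hours = hours_per_day * share / total_share  # this project's slice of the day
        completed[project] = cpu_hours / hours_per_result[project]
    return completed


# Example numbers from the post: 50/50 share, crunching 24/7.
shares = {"EAH": 50, "rosetta": 50}
hours_per_result = {"EAH": 12.0, "rosetta": 2.0}

print(results_per_day(shares, hours_per_result))
# {'EAH': 1.0, 'rosetta': 6.0}
```

Each project gets the same 12 CPU hours; the counts differ only because the results differ in size.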

Fourthly, you don't seem prepared to answer questions or tell us information about your setup that would allow us to give better advice. For example, you keep saying that none of your machines are networked, but I think you really mean that they are not permanently on a LAN. They have to be networked at some stage in order to download and upload. I was just trying to determine if you have broadband or dialup and how you get each machine to talk to the server when it needs to. Do you have a phone line that you manually switch from machine to machine?? I'm only trying to find out because these sort of little details make a huge difference to the advice that will be given to you.

You shouldn't take any of these comments personally. I'm hoping you can appreciate that it is difficult to offer advice when we don't have all the information. I'm trying to help you understand how important it is to describe in as much detail as possible, all of the factors, no matter how trivial the details might seem to you. The devil really is in the detail. If we have to guess too much (or have to remember details that were posted weeks ago in a whole range of different threads) you are going to get faulty advice which then makes us look like idiots when it doesn't solve your problem.

I guess the key things I'd like to be sure about now are how you achieve your connections and exactly how many projects you are running on each of your boxes that are showing any signs of distress.

Cheers,
Gary.
