How do I get E@H to recognize my PC and start giving me W.U. again.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Gary Roberts wrote:
George wrote:
....
6/16/2019 9:43:50 AM | Einstein@Home | Fetching scheduler list

but there is no evidence of any reply showing in the log.  There should be a line which basically says something to the effect that, "...Master file fetch succeeded." - but there isn't.  I don't really know why but my guess is that there is something wrong with the way the request is being made and the server is refusing to respond.  This seems to tie in with Floyd's observation of an empty master file for Einstein.  In the 3 master file sizes you provided near the end of your post you can see the Einstein one is zero bytes.

I'm wondering if this could be related to some sort of connectivity problem with Einstein@home. Perhaps a firewall, proxy server or antivirus program blocking something.
When Boinc starts communication it should always be followed by at least one line telling the result of that communication, success or failure.
Maybe try enabling some more log options in cc_config.xml (sched_ops and http_debug come to mind) and try to connect to the project again. Then post the log of that attempt if you can't decipher it yourself.
To enable the options, either edit the file in a plain text editor, changing the relevant tags to 1 instead of 0, or go to Options -> Event Log options... in BOINC's advanced view, tick the relevant boxes, then apply and save.
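
If hand-editing feels risky, the same change can be scripted. What follows is only a rough sketch in Python, assuming cc_config.xml is well-formed XML with a <cc_config> root and a <log_flags> section; the function name enable_log_flags is made up for illustration and is not part of BOINC.

```python
import xml.etree.ElementTree as ET


def enable_log_flags(cc_config_text, flags=("sched_ops", "http_debug")):
    """Return cc_config.xml text with the given log flags set to 1.

    Hypothetical helper: it only rewrites the <log_flags> section and
    leaves every other option in the file untouched.
    """
    root = ET.fromstring(cc_config_text)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")

    # Create <log_flags> if the file does not have one yet.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    # Set each requested flag to 1, adding the tag when it is missing.
    for name in flags:
        flag = log_flags.find(name)
        if flag is None:
            flag = ET.SubElement(log_flags, name)
        flag.text = "1"

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<cc_config><log_flags><sched_ops>0</sched_ops></log_flags></cc_config>"
    print(enable_log_flags(sample))
```

Work on a copy of the file and keep the original, since the script drops any comments and original indentation when it writes the XML back out.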

Quote:
Then look a few lines earlier for the value of <rpc_seqno> - the remote procedure call sequence number.  If that number doesn't have the 'correct' value, that might be why Einstein is refusing to talk to your computer.  It's part of a security measure to detect if a computer is trying to pretend to be a different computer.

My understanding of <rpc_seqno> is that as long as it's higher than the value on the server all is fine and accepted, but if it's the same or lower then the server will think you're trying to cheat and issue a new hostID. It shouldn't have any effect on the host's ability to communicate with the project.
Thinking about it, BOINC will have to increase the number by one before contacting the project, and if that communication fails the server might or might not know anything about the increase, depending on when the failure occurred. Then before the next communication attempt BOINC will probably increase the number again to be on the safe side, and if that succeeds then all will be fine.
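
To make that reasoning concrete, here is a toy model of the rule as I understand it. It is not the real BOINC server code; server_accepts and run_attempts are hypothetical names, and the "higher than the server's value" test is just my reading described above.

```python
def server_accepts(client_seqno, server_seqno):
    """Assumed rule: the request is fine only if the client's number is higher."""
    return client_seqno > server_seqno


def run_attempts(reached_server, start=100):
    """Simulate a series of contact attempts.

    reached_server is a list of booleans, one per attempt, saying whether
    that attempt actually got through to the server.
    Returns the final (client, server) sequence numbers.
    """
    client = server = start
    for reached in reached_server:
        client += 1  # the client bumps its number before every attempt
        if reached and server_accepts(client, server):
            server = client  # the server records the number it just saw
    return client, server


if __name__ == "__main__":
    # Two failed attempts followed by one that gets through.
    print(run_attempts([False, False, True]))
```

In this model the failed attempts leave a gap between the two numbers, and the first successful contact closes it, which is why a run of failures should not by itself lock a host out.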

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117566639966
RAC: 35289788

George wrote:
I'm not sure when I'll get back to this post as I am in the process of moving, but I will be back for sure.

George, please don't put this on hold again :-).  You've gone a few miles already and you probably just have a few inches to go! :-).  It's hard to get back up to speed after a time delay!  Too much else happens in between :-).

If you could just give some information about what is in the state file of your 'problem' machine, this might all get resolved very quickly, and right now.  Then I won't have to remember the details at a later date :-).  I tend to forget stuff quite quickly.  My brain doesn't work as well as it used to, sadly :-(.

Below is an example of what I'd like to see.  It's taken from the state file (client_state.xml) of one of my machines, after searching for the string 'einstein' as previously described.  I've started at the previous line which has the Einstein project opening tag.  I've highlighted in blue the things that I'm particularly interested in if you don't want to post it all.  If you are prepared to post a similar block as shown below, that would be much appreciated.  Don't expose any potentially sensitive stuff (*****) and perhaps not the real <rpc_seqno> value.  Mine is a bogus value, as is my hostID since it's not relevant here.  The main thing to know about your <rpc_seqno> is its relationship to what you see on the 'details' page on the website.  And for your hostID, is it also the same as what's on the website?

<project>
    <master_url>http://einstein.phys.uwm.edu/</master_url>
    <project_name>Einstein@Home</project_name>
    <symstore></symstore>
    <user_name>Gary Roberts</user_name>
    <team_name>Ellison Crunchers</team_name>
    <host_venue>school</host_venue>
    <email_hash>*****</email_hash>
    <cross_project_id>*****</cross_project_id>
    <external_cpid></external_cpid>
    <cpid_time>1107977728.000000</cpid_time>
    <user_total_credit>25001675204.252983</user_total_credit>
    <user_expavg_credit>34358528.284019</user_expavg_credit>
    <user_create_time>1107977728.000000</user_create_time>
    <rpc_seqno>123</rpc_seqno>
    <userid>12521</userid>
    <teamid>2879</teamid>
    <hostid>1234</hostid>
    <host_total_credit>10747885.114181</host_total_credit>
    <host_expavg_credit>128139.007655</host_expavg_credit>
    <host_create_time>1148653210.000000</host_create_time>
    <nrpc_failures>0</nrpc_failures>
    <master_fetch_failures>0</master_fetch_failures>
    <min_rpc_time>1560815975.373631</min_rpc_time>
    <next_rpc_time>0.000000</next_rpc_time>
    <rec>43303.090867</rec>
    <rec_time>1560815886.229267</rec_time>
    <resource_share>900.000000</resource_share>
    <desired_disk_usage>0.000000</desired_disk_usage>
    <duration_correction_factor>0.507304</duration_correction_factor>
    <sched_rpc_pending>0</sched_rpc_pending>
    <send_time_stats_log>0</send_time_stats_log>
    <send_job_log>0</send_job_log>
    <njobs_success>669</njobs_success>
    <njobs_error>0</njobs_error>
    <elapsed_time>2278562.243342</elapsed_time>
    <last_rpc_time>1560815915.373631</last_rpc_time>
    <verify_files_on_app_start/>
    <rsc_backoff_time>
        <name>CPU</name>
        <value>0.000000</value>
    </rsc_backoff_time>
    <rsc_backoff_interval>
        <name>CPU</name>
        <value>0.000000</value>
    </rsc_backoff_interval>
    <no_rsc_pref>CPU</no_rsc_pref>
    <rsc_backoff_time>
        <name>ATI</name>
        <value>0.000000</value>
    </rsc_backoff_time>
    <rsc_backoff_interval>
        <name>ATI</name>
        <value>0.000000</value>
    </rsc_backoff_interval>
    <scheduler_url>https://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi</scheduler_url>
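
For anyone who would rather not hunt through the file by hand, the handful of values in question could be pulled out with a short script. This is only a sketch: it assumes the state file parses as well-formed XML and that the project is identified by 'einstein' appearing in its <master_url>. The function name project_fields and the field list are my own choices for illustration.

```python
import xml.etree.ElementTree as ET

# The values worth comparing against the host's details page on the website.
FIELDS = ("rpc_seqno", "hostid", "userid", "user_name")


def project_fields(state_text, url_fragment="einstein"):
    """Return the chosen fields of the first matching <project> block.

    Returns None when no project's <master_url> contains url_fragment.
    Empty tags such as <user_name></user_name> come back as "".
    """
    root = ET.fromstring(state_text)
    for project in root.iter("project"):
        master_url = project.findtext("master_url", default="")
        if url_fragment in master_url:
            return {name: project.findtext(name, default="") for name in FIELDS}
    return None


if __name__ == "__main__":
    # Run this against a COPY of client_state.xml, never the live file.
    with open("client_state.xml", encoding="utf-8") as handle:
        print(project_fields(handle.read()))
```

It only reads the file, so it is safe to run on a copy while BOINC is stopped, and the output avoids posting the email hash or cross-project ID.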

 I hope to hear back from you shortly :-).

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117566639966
RAC: 35289788

Holmis wrote:
I'm wondering if this could be related to some sort of connectivity problem with Einstein@home. Perhaps a firewall, proxy server or antivirus program blocking something.
When Boinc starts communication it should always be followed by at least one line telling the result of that communication, success or failure.


That troubled me as well.  You do expect to see a failure message rather than no message at all.  I guess you may well be right - particularly if it turns out that what is in the machine's state file agrees with what is on the website for that particular hostID.  If there is nothing in the state file to explain the problem, extra logging options should be tried.  I don't know much about that stuff so I'm hoping it mightn't come to that :-).  I've tended to assume that since the client can talk to the other two projects, then it's not being blocked but maybe that's not true.  I don't know enough about those sorts of issues.

Holmis wrote:
My understanding of <rpc_seqno> is that as long as it's higher than the value on the server all is fine and accepted, but if it's the same or lower then the server will think you're trying to cheat and issue a new hostID. It shouldn't have any effect on the host's ability to communicate with the project.
Thinking about it, BOINC will have to increase the number by one before contacting the project, and if that communication fails the server might or might not know anything about the increase, depending on when the failure occurred. Then before the next communication attempt BOINC will probably increase the number again to be on the safe side, and if that succeeds then all will be fine.


Yes, that's essentially my understanding as well.  Take the situation of a project that is down, or otherwise incommunicado.  A client might make repeated attempts to make contact and if each one potentially increments the sequence number, the server, eventually when back on line, might see a value significantly larger than what it knows about and would need to accept it despite the difference.

My biggest problem in trying to get a correct picture of this issue is that I don't use Windows and I don't use standard install procedures.  So I've been trying to use the approach of getting agreement between what the host has and what the server expects.  I was hoping a discrepancy there might shed some light.

I understand that removing the project and adding it back again might work (even though I can't remember ever doing that myself so don't have the experience to judge), or that resetting the project might do the same, but I was interested to use this case to get a better understanding of what caused it in the first place and if it could be fixed with a simple 'correction' to the state file.

Cheers,
Gary.

GWGeorge007
Joined: 8 Jan 18
Posts: 3061
Credit: 4965957686
RAC: 1413454

Gary Roberts wrote:
George, please don't put this on hold again :-).  You've gone a few miles already and you probably just have a few inches to go! :-).  It's hard to get back up to speed after a time delay!  Too much else happens in between :-).

Okay, I'm hooked!  I don't want to leave this unfinished anymore than you do... regardless of what else I need to do.

Gary Roberts wrote:
My brain doesn't work as well as it used to, sadly :-(.

I'm with you on that!  ;*)

Gary Roberts wrote:
The main thing to know about your <rpc_seqno> is its relationship to what you see on the 'details' page on the website.  And for your hostID, is it also the same as what's on the website?

Here is a full copy of my <project> sequence for Einstein@Home from my client_state.xml file.  What you colored blue in yours, I also colored blue in mine and I added the color red to the <master_url_fetch_pending/> and <sched_rpc_pending>6</sched_rpc_pending> lines.

<project>
     <master_url>http://einstein.phys.uwm.edu/</master_url>
     <project_name>Einstein@Home</project_name>
     <symstore></symstore>
     <user_name></user_name>
     <team_name></team_name>
     <host_venue></host_venue>
     <email_hash></email_hash>
     <cross_project_id></cross_project_id>
     <external_cpid></external_cpid>
     <cpid_time>0.000000</cpid_time>
     <user_total_credit>0.000000</user_total_credit>
     <user_expavg_credit>0.000000</user_expavg_credit>
     <user_create_time>0.000000</user_create_time>
     <rpc_seqno>0</rpc_seqno>
     <userid>0</userid>
     <teamid>0</teamid>
     <hostid>0</hostid>
     <host_total_credit>0.000000</host_total_credit>
     <host_expavg_credit>0.000000</host_expavg_credit>
     <host_create_time>0.000000</host_create_time>
     <nrpc_failures>2</nrpc_failures>
     <master_fetch_failures>7</master_fetch_failures>
     <min_rpc_time>1560904358.021746</min_rpc_time>
     <next_rpc_time>0.000000</next_rpc_time>
     <rec>0.000000</rec>
     <rec_time>1560823942.748280</rec_time>
     <resource_share>100.000000</resource_share>
     <desired_disk_usage>0.000000</desired_disk_usage>
     <duration_correction_factor>0.963984</duration_correction_factor>
     <sched_rpc_pending>6</sched_rpc_pending>
     <send_time_stats_log>0</send_time_stats_log>
     <send_job_log>0</send_job_log>
     <njobs_success>0</njobs_success>
     <njobs_error>0</njobs_error>
     <elapsed_time>0.000000</elapsed_time>
     <last_rpc_time>0.000000</last_rpc_time>
     <master_url_fetch_pending/>
     <rsc_backoff_time>
         <name>CPU</name>
         <value>0.000000</value>
     </rsc_backoff_time>
     <rsc_backoff_interval>
         <name>CPU</name>
         <value>0.000000</value>
     </rsc_backoff_interval>
     <rsc_backoff_time>
         <name>NVIDIA</name>
         <value>0.000000</value>
     </rsc_backoff_time>
     <rsc_backoff_interval>
         <name>NVIDIA</name>
         <value>0.000000</value>
     </rsc_backoff_interval>
     <cpu_ec>0.000000</cpu_ec>
     <cpu_time>0.000000</cpu_time>
     <gpu_ec>0.000000</gpu_ec>
     <gpu_time>0.000000</gpu_time>
     <disk_usage>0.000000</disk_usage>
     <disk_share>3937053354.666667</disk_share>
 </project>

As you can see I have either "0" or "nothing" in the line items you were curious about, and most other lines too.

The <rpc_seqno>0</rpc_seqno> obviously won't match what's on the details page on the website.  My "Number of times client has contacted server" is 4 digits (XXXX).  So...

What do I do now?

Cheers back atcha,

George

 

George

Proud member of the Old Farts Association

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117566639966
RAC: 35289788

George wrote:

As you can see I have either "0" or "nothing" in the line items you were curious about, and most other lines too.

The <rpc_seqno>0</rpc_seqno> obviously won't match what's on the details page on the website.  My "Number of times client has contacted server" is 4 digits (XXXX).  So...

What do I do now?


OK, here's what I'm hoping will do the job.  If after reading, it all seems too much, there are safer (but much less satisfying) alternatives that unfortunately won't preserve your current stats for that host.  If you take your time, the stuff below is a lot easier than it first might appear.  Read carefully - ask if not sure.

Before you do any actual editing, check your preferences on the website for two things.  First, that you have selected (in the Einstein project preferences) a search that you'd like to get work for (perhaps just one to start with).  Second, that you have (at least temporarily) set computing preferences to give a fairly minimal initial work cache size.  Stuff that I've mentioned before :-).  Then, have a think about the following steps.

  1. Stop BOINC completely so your other projects won't be trying to write into the state file.
  2. When stopped, take a copy of the state file and put it somewhere (like the desktop) where you can work on it without the risk of damaging the original.
  3. You need to edit the state file copy with a PLAIN TEXT editor.  This is the most important part.  Windows notepad (or something better of your choice that you are familiar with) must be used. DO NOT USE A WORD PROCESSOR!!!  If you have any doubts about what might be suitable, please ASK before proceeding!!
  4. Open the copy of the state file with your plain text editor of choice and be very careful not to change anything outside the Einstein project area that we have been talking about.
  5. Go to the line that shows the master_url for Einstein.  Look at the final result that I've shown below.  Remove all the lines from your state file copy that aren't in the example.  You will be leaving a few lines where data is wrong or missing.  Correct the value or add it in if it's completely missing.
  6. Add in the <dont_request_more_work/> line that isn't already in what you showed.  Hopefully, your text editor will be smart enough to indent it to match the other lines.  Hopefully indenting doesn't really matter but I like to preserve it.
  7. When you have finished editing, carefully double check everything and then save it to the desktop, overwriting the previous version.  Make sure its full name is exactly client_state.xml.
  8. Go to your BOINC data folder and rename the old client_state.xml file there to save it - just in case.  Perhaps you could call it something like client_state.old or client_state.bak - any name you like as long as it's not already in use.
  9. Make a new copy of the edited state file from your desktop and install it in the BOINC data directory.
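
Steps 8 and 9 above (rename the old file, then drop in the edited copy) could also be done with a few lines of Python rather than by hand. Treat this as a sketch only: install_edited_state is a made-up name, the data folder location differs between installations so it is passed in rather than guessed, and BOINC must be fully stopped first, as in step 1.

```python
import shutil
from pathlib import Path


def install_edited_state(data_dir, edited_file, backup_name="client_state.bak"):
    """Keep the old state file under backup_name, then install the edited copy.

    data_dir    -- the BOINC data folder (location varies; supply your own)
    edited_file -- the edited client_state.xml, e.g. the copy on the desktop
    Returns the path of the backup so it can be restored if needed.
    """
    data_dir = Path(data_dir)
    original = data_dir / "client_state.xml"
    backup = data_dir / backup_name

    # Refuse to clobber an earlier backup, matching the advice in step 8.
    if backup.exists():
        raise FileExistsError(f"{backup} already exists; pick another backup name")

    original.rename(backup)              # step 8: save the old file
    shutil.copy2(edited_file, original)  # step 9: install the edited copy
    return backup
```

If anything goes wrong after the restart, stopping BOINC and renaming the backup to client_state.xml puts things back as they were.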

Some comments:

The plan is this.  We are giving the scheduler just enough information (and nothing more) so that it recognises your Einstein hostID.  With that achieved, it should be able to send back all the missing stuff and that is why we can leave out a lot of the wrong or missing information currently there.  We are assuming that your account file for Einstein that you kindly showed in your earlier message has a correct authenticator 32 character string.  I can't do anything about that and hopefully it will be correct.

In case BOINC is unhappy with the edits in some way (or anything else), I suggest removing the network cable before launching BOINC.  If BOINC seems happy enough and only complains about no internet connection, you can plug the cable back in.  First just browse the event log for any more serious issues.  Make sure there are no unexpected complaints and one thing to look for is the Einstein hostID that we've hopefully allowed the client to find :-).

After things settle from the startup flurry of normal messages, you can restore the network cable.  When the machine establishes an internet connection, BOINC should attempt to download the master file from Einstein.  If that happens (you might have to prompt it with an 'update') you should see the "succeeded" message that was previously missing.  At that point you can think about "allowing new work".  We deliberately prevented that in the state file with that extra added line.  Don't do the allow until the master file has successfully downloaded because you want the client to be told about website preference changes before you go asking for work.

With the edited state file in place, when you re-start BOINC, the other two projects should start working as normal. If anything goes haywire with them (it shouldn't) don't plug the network back in because we can always retrieve the situation from the backup state file that we saved earlier.  Hopefully we won't need it.

So here is exactly what I would like you to create for the Einstein project portion of your state file copy residing on your desktop.  I'm not adding any coloration.

<project>
    <master_url>http://einstein.phys.uwm.edu/</master_url>
    <project_name>Einstein@Home</project_name>
    <user_name>George</user_name>
    <rpc_seqno>3396</rpc_seqno>
    <userid>983881</userid>
    <hostid>12623039</hostid>
    <dont_request_more_work/>
    <scheduler_url>https://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi</scheduler_url>
</project>

It's rather late here and I'm going home.  Have a good think about the above and I'll answer any questions (or correct any mistakes that get pointed out :-). ) first thing in the morning - about 7 hours from now :-).

EDIT:  I'm back on deck now and having perused last night's effort, I realise I should have added a comment about the <scheduler_url> line as well.  I chose to include that along with the <dont_request_more_work/> line because it will be needed to get work.  My understanding is that this url will be sent when the master file is downloaded but, just in case my understanding isn't correct, I felt it wouldn't do any harm to include it up front.

Cheers,
Gary.

GWGeorge007
Joined: 8 Jan 18
Posts: 3061
Credit: 4965957686
RAC: 1413454

Hi Gary,

I'm sorry I took so long, now I'm not moving until Tuesday (ugh!) so I have a few days to set things right (hopefully).

I do have a question.

Gary Roberts wrote:
Go to the line that shows the master_url for Einstein.  Look at the final result that I've shown below.  Remove all the lines from your state file copy that aren't in the example.  You will be leaving a few lines where data is wrong or missing.  Correct the value or add it in if it's completely missing.

I have looked and verified everything as you have written it, and it all appears fine to me.  I'm not sure which lines, if any, have wrong or missing data.  To me, they do appear correct.

<project>
    <master_url>http://einstein.phys.uwm.edu/</master_url>
    <project_name>Einstein@Home</project_name>
    <user_name>George</user_name>
    <rpc_seqno>3396</rpc_seqno>
    <userid>983881</userid>
    <hostid>12623039</hostid>
    <dont_request_more_work/>
    <scheduler_url>https://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi</scheduler_url>
</project>

Gary Roberts wrote:
EDIT:  I'm back on deck now and having perused last night's effort, I realise I should have added a comment about the <scheduler_url> line as well.  I chose to include that along with the <dont_request_more_work/> line because it will be needed to get work.  My understanding is that this url will be sent when the master file is downloaded but, just in case my understanding isn't correct, I felt it wouldn't do any harm to include it up front.

So... I should leave the <scheduler_url> line as is and let it go as you have presented it?

And as for your being inexperienced with Windows, yes, I am using Windows Notepad to write the client_state.xml file.

To be clear, I replace all of my <project> info for Einstein with what you have with no spaces, leaving the rest of the file as is.  Save the original file with a suffix that makes sense to me and place the edited file in its place with the original name.  Do I have that correct?

I know I sound paranoid but I just want to be absolutely clear before hitting the key to reboot.  :>)

And if I haven't said so yet, thank you very much for all you have helped me with.

George

Proud member of the Old Farts Association

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117566639966
RAC: 35289788

George wrote:

I do have a question.

Gary Roberts wrote:
Go to the line that shows the master_url for Einstein.  Look at the final result that I've shown below.  Remove all the lines from your state file copy that aren't in the example.  You will be leaving a few lines where data is wrong or missing.  Correct the value or add it in if it's completely missing.

I have looked and verified everything as you have written it, and it all appears fine to me.  I'm not sure which lines, if any, have wrong or missing data.  To me, they do appear correct.

<project>
    <master_url>http://einstein.phys.uwm.edu/</master_url>
    <project_name>Einstein@Home</project_name>
    <user_name>George</user_name>
    <rpc_seqno>3396</rpc_seqno>
    <userid>983881</userid>
    <hostid>12623039</hostid>
    <dont_request_more_work/>
    <scheduler_url>https://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi</scheduler_url>
</project>

Sorry, apparently I was doing my usual trick of over-explaining things :-).  What I was trying to say was that you would be removing lots of lines but leaving some.   For example, in what you originally showed, <user_name></user_name> was a line with a missing value and <userid>0</userid> was a line with a wrong value.  I was trying to make things easier for you by telling you not to delete everything (and start from scratch) but to retain those lines which just needed a small fix to make them useful.

If the Einstein <project> section of your full state file looks like the above and if your file is called exactly "client_state.xml" then you are ready to give it a try.

George wrote:
So... I should leave the <scheduler_url> line as is and let it go as you have presented it?

Yes.  In your earlier message, there was no scheduler_url showing.  I assume it would be sent with the master file download and if so it wouldn't be a problem not having it to start with.  However, even if it isn't needed right when you first restart BOINC, it wouldn't create any problem by being there so I decided it wouldn't hurt to put it in with the other lines.

George wrote:
To be clear, I replace all of my <project> info for Einstein with what you have with no spaces, leaving the rest of the file as is.  Save the original file with a suffix that makes sense to me and place the edited file in its place with the original name.  Do I have that correct?

If, by "spaces" you mean "blank lines", then absolutely.  You don't want unneeded blank lines.

George wrote:
I know I sound paranoid but I just want to be absolutely clear before hitting the key to reboot.

It's good to be paranoid about this sort of 'surgery'.  I'll stop being paranoid when you tell me it's all working again :-).  You've said "thanks" many times so there's no need to be paranoid about that :-).  I'll be very satisfied if it just works without any other dramas :-).  Fingers crossed! :-).

Cheers,
Gary.

GWGeorge007
Joined: 8 Jan 18
Posts: 3061
Credit: 4965957686
RAC: 1413454

EINSTEIN@HOME IS WORKING AGAIN!!!  :^)

I am so happy that it's working.  I did as you said, edited a few lines that were missing information, left others alone, restarted BOINC and it refreshed the file to the point of not getting work.  I then deleted the line of <dont_request_more_work/> , restarted BOINC again, and it's working.

Thank you once again, Gary.  You're a miracle worker.

 

George

Proud member of the Old Farts Association

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117566639966
RAC: 35289788

Congratulations on a job well done.  Give yourself a big pat on the back.  There are probably not too many people in your position who would have the determination to continue on as you have done.  It takes some courage - and a certain amount of blind faith :-) - to jump in and hack the state file.  You have done very well!!

Just one little point you missed.  You didn't need to stop BOINC and manually delete the <dont_request_more_work/> line.  I probably didn't explain well enough what the intention was.  You were supposed to open BOINC Manager (advanced view) and select the Einstein project on the projects tab.  You would then have seen the status of the project showing as something like "Won't get new tasks" and in the commands area to the left, a button reading "Allow new tasks" (or something like that - exact wording tends to change with BOINC version).  So all you needed was that one click.

As I mentioned previously, do yourself a favour and start using BOINC Manager so that you become familiar with all the menu items.  If you want to read up a bit first, here is a good place to start.

Cheers,
Gary.

GWGeorge007
Joined: 8 Jan 18
Posts: 3061
Credit: 4965957686
RAC: 1413454

Thank you again!  I've been using BOINC Manager since I first joined the SETI project, and I've delved into BAM and Grid Iron a little.  I've had much success with the HELP services, specifically (can I say his name?), and he has convinced me that I should just use BOINC Manager.  So... a long story short, I've been using it since.

I didn't realize that I should have looked to the left in B.M., but that's behind us now.

You do great work!

George

Proud member of the Old Farts Association
