Project declared up, but no reporting for me

archae86
Joined: 6 Dec 05
Posts: 3146
Credit: 7059734931
RAC: 1136759
Topic 198312

The maintenance outage for November 18 is declared successfully completed in the Technical News thread, but update requests from my hosts with work awaiting report still fail this way:
11/18/2015 7:14:35 AM update requested by user
11/18/2015 7:14:36 AM Sending scheduler request: Requested by user.
11/18/2015 7:14:36 AM Reporting 14 completed tasks
11/18/2015 7:14:36 AM Requesting new tasks for NVIDIA GPU
11/18/2015 7:14:37 AM Scheduler request failed: Server returned nothing (no headers, no data)
11/18/2015 7:14:40 AM Project communication failed: attempting access to reference site
11/18/2015 7:14:41 AM Internet access OK - project servers may be temporarily down.

Do I need to do something to get my BOINC client to communicate with the new scheduler location (scheduler.einsteinathome.org), or is patience all that is needed?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2774211346
RAC: 851229

Mostly patience, I think - and maybe an update or two.

Quote:
18/11/2015 12:16:24 | Einstein@Home | Sending scheduler request: To fetch work.
18/11/2015 12:16:24 | Einstein@Home | [sched_op] Intel GPU work request: 19807.02 seconds; 0.00 devices
18/11/2015 12:16:25 | Einstein@Home | Scheduler request completed: got 1 new tasks
18/11/2015 12:16:31 | Einstein@Home | Started download of PM0068_01121_180.bin4
18/11/2015 12:16:31 | Einstein@Home | Started download of PM0068_01121_181.bin4
18/11/2015 12:16:33 | Einstein@Home | Finished download of PM0068_01121_180.bin4
18/11/2015 12:16:33 | Einstein@Home | Finished download of PM0068_01121_181.bin4
18/11/2015 12:16:33 | Einstein@Home | Started download of PM0068_01121.zap
18/11/2015 12:16:34 | Einstein@Home | Finished download of PM0068_01121.zap
...
18/11/2015 13:22:30 | Einstein@Home | Sending scheduler request: Requested by user.
18/11/2015 13:22:30 | Einstein@Home | Reporting 1 completed tasks
18/11/2015 13:22:32 | Einstein@Home | Scheduler request completed
18/11/2015 13:22:32 | Einstein@Home | [sched_op] handle_scheduler_reply(): got ack for task PM0068_00761_32_0


(times are UTC)

Edit - my scheduler url is still einstein5.aei.uni-hannover.de

Bikermatt
Joined: 9 May 10
Posts: 4
Credit: 1188994085
RAC: 0

My systems cannot complete scheduler requests either. Uploads are working.

11/18/2015 6:36:26 AM | Einstein@Home | Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates

Edit: This is just on Windows boxes. Linux machines are working.

Christian Beer
Joined: 9 Feb 05
Posts: 595
Credit: 127711258
RAC: 347639

Let's see if we have two separate problems here or just two symptoms of the same problem.

@archae86: Can you please look in your client_state.xml file and tell me what scheduler URL is used for Einstein@Home? Just look for scheduler_url in the file. There should be at least one entry for each project.

@Bikermatt: Does this happen for both versions 7.0 and 7.2?

Bikermatt

Quote:

Let's see if we have two separate problems here or just two symptoms of the same problem.

@Bikermatt: Does this happen for both versions 7.0 and 7.2?

It was both. I rebooted my network and all is working now.

archae86

Quote:

Let's see if we have two separate problems here or just two symptoms of the same problem.

@archae86: Can you please look in your client_state.xml file and tell me what scheduler URL is used for Einstein@Home? Just look for scheduler_url in the file. There should be at least one entry for each project.

Just looking for the string scheduler_url in a current client_state.xml I find these:

http://albert.phys.uwm.edu/EinsteinAtHome_cgi/cgi

http://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi

http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi

That is all, and I imagine the albert and seti ones are of no interest, and quite possibly out-of-date, as I've not run work at either for some time.

[edited a minute after first posting to add that I just now commanded updates on all three of my main hosts, all of which gave responses like the lines I posted before. I have not yet restarted BOINC nor rebooted, but as I have a couple of days cache, it may be better for me to let things run as a possible example of unattended machine behavior. Of course I'm happy to do either if that would be useful diagnostically.]
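The string search described in this post can also be scripted. The following is only a minimal sketch, assuming each scheduler address sits between `<scheduler_url>` and `</scheduler_url>` tags in client_state.xml, as the search above suggests. The function name and the sample text are made up for illustration; the sample reuses the three URLs listed in the post and is not a real state file.

```python
import re

# Assumption: each scheduler address is wrapped in a <scheduler_url> element.
SCHEDULER_URL_RE = re.compile(
    r"<scheduler_url>\s*(.*?)\s*</scheduler_url>", re.DOTALL
)


def find_scheduler_urls(state_text):
    """Return every scheduler URL found in the given client_state.xml text."""
    return SCHEDULER_URL_RE.findall(state_text)


# Illustrative sample only; a real client_state.xml contains much more.
SAMPLE = """
<scheduler_url>http://albert.phys.uwm.edu/EinsteinAtHome_cgi/cgi</scheduler_url>
<scheduler_url>http://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi</scheduler_url>
<scheduler_url>http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi</scheduler_url>
"""

if __name__ == "__main__":
    for url in find_scheduler_urls(SAMPLE):
        print(url)
```

To run it against a real file, read the file's text and pass it to `find_scheduler_urls`; the file's location depends on the installation.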

Christian Beer

Quote:
Quote:

Let's see if we have two separate problems here or just two symptoms of the same problem.

@Bikermatt: Does this happen for both versions 7.0 and 7.2?

It was both. I rebooted my network and all is working now.

Yes, we have disabled SSL for the scheduler for the moment (it seems to be a problem for older clients). I just verified that our certificate is accepted by client versions 7.2.4 and later, so you should think about upgrading those 7.0 clients. We are going to switch to full SSL support at some point in the future.

Richard Haselgrove

Quote:
Yes, we have disabled SSL for the scheduler for the moment (it seems to be a problem for older clients). I just verified that our certificate is accepted by client versions 7.2.4 and later, so you should think about upgrading those 7.0 clients. We are going to switch to full SSL support at some point in the future.


Coincidentally, there's a discussion going on at BOINC and WCG about using newer certificates with older clients.

BOINC: https://boinc.berkeley.edu/dev/forum_thread.php?id=10630

WCG: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,38551_offset,0#505472

Christian Beer

Quote:
[edited a minute after first posting to add that I just now commanded updates on all three of my main hosts, all of which gave responses like the lines I posted before. I have not yet restarted BOINC nor rebooted, but as I have a couple of days cache, it may be better for me to let things run as a possible example of unattended machine behavior. Of course I'm happy to do either if that would be useful diagnostically.]


Your machine seems to have picked up the new scheduler url but now can't connect to it. Can you try to flush your DNS cache by executing ipconfig /flushdns from a command prompt? You can also try to visit the new scheduler URL in a browser and see if you get a response. This would at least prove that your computer knows where scheduler.einsteinathome.org is. If it is still not solved and your cache is running low try a reboot. That should at least clear the (no header, no data) part.
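The two checks suggested here (does the name resolve, and does the server answer) can be approximated in a few lines. This is only a sketch under stated assumptions: the function name is hypothetical, port 80 is assumed because the scheduler URL reported earlier in the thread starts with http://, and a successful TCP connection shows only that something is listening, not that the scheduler itself is healthy.

```python
import socket


def check_scheduler_host(host, port=80, timeout=5.0,
                         resolve=socket.gethostbyname,
                         connect=socket.create_connection):
    """Resolve host, then try to open a TCP connection to it.

    Returns (ip_address, reachable). ip_address is None when the name
    does not resolve. resolve and connect can be swapped out for testing.
    """
    try:
        ip_address = resolve(host)
    except OSError:
        # socket.gaierror (name lookup failure) is a subclass of OSError.
        return None, False
    try:
        conn = connect((ip_address, port), timeout)
    except OSError:
        # Covers refused connections and timeouts.
        return ip_address, False
    conn.close()
    return ip_address, True
```

Calling `check_scheduler_host("scheduler.einsteinathome.org")` returns the resolved address and whether port 80 accepted a connection. A result with an address but no connection would point away from DNS and toward the network path or the server.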

archae86

Quote:
Can you try to flush your DNS cache by executing ipconfig /flushdns from a command prompt? You can also try to visit the new scheduler URL in a browser and see if you get a response. This would at least prove that your computer knows where scheduler.einsteinathome.org is. If it is still not solved and your cache is running low try a reboot. That should at least clear the (no header, no data) part.

Sorry for slow response, I was away from home for a few hours.

On return, I first attempted another update. That got the same result as I've documented.

I then tried a flushdns, and received confirmation.

A subsequent update on that host got the same result as before.

I tried a command line ping which came out this way:

ping -n 1 scheduler.einsteinathome.org

Pinging scheduler.einsteinathome.org [130.75.116.40] with 32 bytes of data:
Request timed out.

Ping statistics for 130.75.116.40:
Packets: Sent = 1, Received = 0, Lost = 1 (100% loss)

I then tried a traceroute, the results of which show multiple points in Germany, including Hannover:

C:\Users\Peter Stoll>tracert scheduler.einsteinathome.org

Tracing route to scheduler.einsteinathome.org [130.75.116.40]
over a maximum of 30 hops:

1 <1 ms <1 ms <1 ms TP-SHARE [192.168.0.1]
2 7 ms 9 ms 9 ms 96.120.0.13
3 10 ms 10 ms 8 ms te-0-3-0-11-sur02.sandia.nm.albuq.comcast.net [68.86.183.81]
4 9 ms 10 ms 9 ms be-3-ar02.albuquerque.nm.albuq.comcast.net [68.86.182.9]
5 8 ms 9 ms 9 ms be-100-ar01.albuquerque.nm.albuq.comcast.net [68.86.182.37]
6 16 ms 17 ms 21 ms be-33654-cr01.1601milehigh.co.ibone.comcast.net [68.86.95.237]
7 20 ms 20 ms 18 ms be-11719-cr02.denver.co.ibone.comcast.net [68.86.86.77]
8 20 ms 20 ms 17 ms ae14.edge3.Denver1.Level3.net [4.68.127.129]
9 142 ms * * ae-2-70.edge5.Frankfurt1.Level3.net [4.69.154.73]
10 * * * Request timed out.
11 142 ms 140 ms 142 ms 212.162.4.6
12 145 ms 145 ms 146 ms cr-han2-be6-7.x-win.dfn.de [188.1.144.222]
13 147 ms 147 ms 148 ms xr-han1-pc1.x-win.dfn.de [188.1.145.249]
14 147 ms 145 ms 145 ms kr-han68.x-win.dfn.de [188.1.232.206]
15 152 ms 145 ms 151 ms BWINgate-Vl9.netz.uni-hannover.de [130.75.11.217]
16 147 ms 148 ms 147 ms gate-w1-0.netz.uni-hannover.de [130.75.1.213]
17 147 ms 147 ms 149 ms einstein10.aei.uni-hannover.de [130.75.116.40]

So it appears that the juice cans and string connect from me to "over there", but something simple blocks useful response.

I tried just pasting scheduler.einsteinathome.org into a Chrome URL window, and found myself looking at a page describing itself as an Apache2 Debian Default page. The text on that page suggests a normal user should not see it if things are running as intended.

I currently have the adapter configuration of the TCP/IPv4 properties set to automatic regarding both IP and DNS.

I'll stop here in case there is something I might try which might be useful to you. I've still got a couple of days cache, so if there is a chance my situation applies to a significant number of others, just possibly there is something useful to learn from my troubles.

Confession: DHCP services for my network are currently coming from a new router I installed just yesterday. While I currently have it configured almost entirely on TP-Link defaults, it remains possible that in my network fiddling I've somehow set the conditions leading to this problem. However, other things I do on the computers, including extensive web browsing and frequent transmission of weather data to a server in the Netherlands, all seem to be working properly so far as I can tell.

archae86

I had the thought of pasting the IP address (130.75.116.40), which my ping output showed scheduler.einsteinathome.org resolving to on my host, into a browser URL box. Somewhat to my surprise, I found myself looking at the Einstein@Home home page, in an up-to-date copy, as B was user of the day.
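Getting one page by name and a different page by IP address from the same server is consistent with name-based virtual hosting, where the web server picks the site from the HTTP Host header. Nobody in this thread confirms that this is the cause, so treat it as one possible reading. The sketch below, with a hypothetical function name, sends a plain HTTP GET to one address while presenting a chosen Host header, so the two cases can be compared side by side.

```python
import http.client


def fetch_with_host(ip_address, host_header, port=80, path="/", timeout=10.0):
    """GET path from ip_address while presenting host_header as the Host.

    Returns (status_code, first 200 characters of the response body).
    """
    conn = http.client.HTTPConnection(ip_address, port, timeout=timeout)
    try:
        # Supplying Host explicitly replaces the header http.client
        # would otherwise derive from the connection address.
        conn.request("GET", path, headers={"Host": host_header})
        response = conn.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body[:200]
    finally:
        conn.close()
```

For the situation described above, one would compare `fetch_with_host("130.75.116.40", "scheduler.einsteinathome.org")` with `fetch_with_host("130.75.116.40", "130.75.116.40")`, the latter being what a browser sends when given the bare IP address.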

The other thing I tried, which also changed nothing, was to stop using automatic DNS service and instead enter two Google DNS servers:
8.8.8.8 as primary
8.8.4.4 as secondary
followed by another flushdns.
I tried these last things on a secondary host, but they appear to have changed nothing in its behavior.

I'm puzzled, but I'm pretty ignorant on this particular part of the vast computer kingdom.
