Scheduler server not reachable for over 24 hours... what's up?

Dirk Sadowski
Joined: 26 Oct 09
Posts: 24
Credit: 103412693
RAC: 0
Topic 218511

Hello.

My BOINC client has been unable to connect to the Einstein@Home scheduler server for over 24 hours...

I only get error messages like these:

Einstein@Home    27.03.19 | 16:46:34    Scheduler request failed: Error 408    
Einstein@Home    27.03.19 | 17:02:44    Scheduler request failed: HTTP internal server error

 

Where is the problem, on Einstein's side or mine?

Thanks.

EDIT: BTW, rebooting the PC didn't help.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117783791940
RAC: 34690837

Sutaru Tsureku wrote:
Where is the problem, on Einstein's side or mine?

I haven't noticed any particular problem in the last 24 hours.  I haven't seen other reports of this type so I'm guessing it might be something at your end or somewhere on the route to the project.  If you enable appropriate logging flags in cc_config.xml, you should be able to get a better idea of what/where the problem is.  I don't have relevant experience but perhaps something like <http_debug> and/or <http_xfer_debug> as documented here.

 

Cheers,
Gary.

Dirk Sadowski

I rebooted the DSL router and the PC.

I inserted:

<sched_ops>1</sched_ops>
<file_xfer_debug>1</file_xfer_debug>
<http_debug>1</http_debug>
<http_xfer_debug>1</http_xfer_debug>
<network_status_debug>1</network_status_debug>
<sched_op_debug>1</sched_op_debug>

- to the cc_config.xml file.
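
For anyone following along: these flags normally go inside the <log_flags> element of cc_config.xml, which lives in the BOINC data directory. Below is a minimal sketch of a complete file, assuming no other options are set. The flag names are the ones listed above; the surrounding layout is the usual cc_config.xml structure and was not copied from my actual file, and the comments are only a rough description of what each flag logs.

```xml
<cc_config>
  <log_flags>
    <!-- summary lines for scheduler requests (usually on by default) -->
    <sched_ops>1</sched_ops>
    <!-- extra detail about scheduler requests, e.g. the work request sizes -->
    <sched_op_debug>1</sched_op_debug>
    <!-- extra detail about file uploads and downloads -->
    <file_xfer_debug>1</file_xfer_debug>
    <!-- HTTP operations, including the request and reply headers -->
    <http_debug>1</http_debug>
    <!-- lower-level detail about the data being transferred -->
    <http_xfer_debug>1</http_xfer_debug>
    <!-- the client's view of whether the network is available -->
    <network_status_debug>1</network_status_debug>
  </log_flags>
</cc_config>
```

The client only picks up the change after it re-reads its config files (or after a restart); the extra output then shows up in the event log. Once the problem is found, setting the debug flags back to 0 keeps the log readable.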

 

[Here and there I replaced text with "******", in case that information shouldn't be public.]

 

- - - - - - - - - -

HTTP_OP::init_get(): https://einsteinathome.org/rss_main.php
HTTP_OP::libcurl_exec(): ca-bundle set
[ID#0] Info: Connection 17 seems to be dead!
[ID#0] Info: Closing connection 17
[ID#0] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
[ID#0] Info: Trying 129.89.61.70...
[ID#0] Info: Connected to einsteinathome.org (129.89.61.70) port ****** (#19)
[ID#0] Info: ALPN, offering http/1.1
[ID#0] Info: Cipher selection: ******
[ID#0] Info: successfully set certificate verify locations:
[ID#0] Info: CAfile: C:\******
[ID#0] Info: CApath: none
[ID#0] Info: TLSv1.2 (OUT), TLS header, Certificate Status (22):
[ID#0] Info: TLSv1.2 (OUT), TLS handshake, Client hello (1):
[ID#0] Info: TLSv1.2 (IN), TLS handshake, Server hello (2):
[ID#0] Info: TLSv1.2 (IN), TLS handshake, Certificate (11):
[ID#0] Info: TLSv1.2 (IN), TLS handshake, Server key exchange (12):
[ID#0] Info: TLSv1.2 (IN), TLS handshake, Server finished (14):
[ID#0] Info: TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
[ID#0] Info: TLSv1.2 (OUT), TLS change cipher, Client hello (1):
[ID#0] Info: TLSv1.2 (OUT), TLS handshake, Finished (20):
[ID#0] Info: TLSv1.2 (IN), TLS change cipher, Client hello (1):
[ID#0] Info: TLSv1.2 (IN), TLS handshake, Finished (20):
[ID#0] Info: SSL connection using TLSv1.2 / ******
[ID#0] Info: ALPN, server did not agree to a protocol
[ID#0] Info: Server certificate:
[ID#0] Info: subject: C=US; postalCode=53201; ST=Wisconsin; L=Milwaukee; street=P.O. Box 413; street=2200 E. Kenwood Blvd.; O=University of Wisconsin-Milwaukee; OU=Physics; CN=einsteinathome.org
[ID#0] Info: start date: Nov 19 00:00:00 2018 GMT
[ID#0] Info: expire date: Nov 19 23:59:59 2019 GMT
[ID#0] Info: subjectAltName: einsteinathome.org matched
[ID#0] Info: issuer: C=US; ST=MI; L=Ann Arbor; O=Internet2; OU=InCommon; CN=InCommon RSA Server CA
[ID#0] Info: SSL certificate verify ok.
[ID#0] Sent header to server: GET /rss_main.php HTTP/1.1

[ID#0] Sent header to server: Host: einsteinathome.org

[ID#0] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.33)

[ID#0] Sent header to server: Accept: */*

[ID#0] Sent header to server: Accept-Encoding: deflate, gzip

[ID#0] Sent header to server: Content-Type: application/x-www-form-urlencoded

[ID#0] Sent header to server: Accept-Language: de_DE

[ID#0] Sent header to server:

[ID#0] Received header from server: HTTP/1.1 ****** OK

[ID#0] Received header from server: Date: Thu, 28 Mar 2019 10:48:38 GMT

[ID#0] Received header from server: Server: Apache

[ID#0] Received header from server: Last-Modified: Thu, 28 Mar 2019 10:00:04 GMT

[ID#0] Received header from server: ETag: "******"

[ID#0] Received header from server: Expires: Sun, 19 Nov 1978 05:00:00 GMT

[ID#0] Received header from server: Cache-Control: must-revalidate

[ID#0] Received header from server: X-Content-Type-Options: nosniff

[ID#0] Received header from server: X-Frame-Options: sameorigin

[ID#0] Received header from server: Strict-Transport-Security: max-age=31536000; includeSubDomains; preload

[ID#0] Received header from server: Content-Type: application/rss+xml; charset=utf-8

[ID#0] Received header from server: Set-Cookie: ******; expires=Sat, 20-Apr-2019 14:21:58 GMT; Max-Age=2000000; path=/; domain=einsteinathome.org; secure

[ID#0] Received header from server: Vary: Accept-Encoding

[ID#0] Received header from server: Content-Encoding: gzip

[ID#0] Received header from server: Content-Length: 1539

[ID#0] Received header from server:

[ID#0] HTTP: wrote 4100 bytes
[ID#0] Info: Connection #19 to host einsteinathome.org left intact
sched RPC pending: Requested by user
Starting scheduler request
Sending scheduler request: Requested by user.
Reporting 10 completed tasks
Requesting new tasks for CPU
CPU work request: 1037520.00 seconds; 4.00 devices
NVIDIA GPU work request: 0.00 seconds; 0.00 devices
HTTP_OP::init_post(): https://scheduler.einsteinathome.org/EinsteinAtHome_cgi/cgi
HTTP_OP::libcurl_exec(): ca-bundle set
[ID#1] Info: Trying 130.75.116.40...
status: online
[ID#1] Info: Connected to scheduler.einsteinathome.org (130.75.116.40) port ****** (#20)
[ID#1] Info: ALPN, offering http/1.1
[ID#1] Info: Cipher selection: ******
[ID#1] Info: successfully set certificate verify locations:
[ID#1] Info: CAfile: C:\******
[ID#1] Info: CApath: none
[ID#1] Info: TLSv1.2 (OUT), TLS header, Certificate Status (22):
[ID#1] Info: TLSv1.2 (OUT), TLS handshake, Client hello (1):
[ID#1] Info: TLSv1.2 (IN), TLS handshake, Server hello (2):
[ID#1] Info: TLSv1.2 (IN), TLS handshake, Certificate (11):
[ID#1] Info: TLSv1.2 (IN), TLS handshake, Server key exchange (12):
[ID#1] Info: TLSv1.2 (IN), TLS handshake, Server finished (14):
[ID#1] Info: TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
[ID#1] Info: TLSv1.2 (OUT), TLS change cipher, Client hello (1):
[ID#1] Info: TLSv1.2 (OUT), TLS handshake, Finished (20):
[ID#1] Info: TLSv1.2 (IN), TLS change cipher, Client hello (1):
[ID#1] Info: TLSv1.2 (IN), TLS handshake, Finished (20):
[ID#1] Info: SSL connection using TLSv1.2 / ******
[ID#1] Info: ALPN, server did not agree to a protocol
[ID#1] Info: Server certificate:
[ID#1] Info: subject: C=US; postalCode=53201; ST=Wisconsin; L=Milwaukee; street=P.O. Box 413; street=2200 E. Kenwood Blvd.; O=University of Wisconsin-Milwaukee; OU=Physics; CN=einsteinathome.org
[ID#1] Info: start date: Nov 19 00:00:00 2018 GMT
[ID#1] Info: expire date: Nov 19 23:59:59 2019 GMT
[ID#1] Info: subjectAltName: scheduler.einsteinathome.org matched
[ID#1] Info: issuer: C=US; ST=MI; L=Ann Arbor; O=Internet2; OU=InCommon; CN=InCommon RSA Server CA
[ID#1] Info: SSL certificate verify ok.
[ID#1] Sent header to server: POST /EinsteinAtHome_cgi/cgi HTTP/1.1

[ID#1] Sent header to server: Host: scheduler.einsteinathome.org

[ID#1] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.6.33)

[ID#1] Sent header to server: Accept: */*

[ID#1] Sent header to server: Accept-Encoding: deflate, gzip

[ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded

[ID#1] Sent header to server: Accept-Language: de_DE

[ID#1] Sent header to server: Content-Length: 510853

[ID#1] Sent header to server: Expect: 100-continue

[ID#1] Sent header to server:

[ID#1] Received header from server: HTTP/1.1 100 Continue

[ID#1] Info: We are completely uploaded and fine
[ID#1] Received header from server: HTTP/1.1 500 Internal Server Error

[ID#1] Received header from server: Date: Thu, 28 Mar 2019 10:48:40 GMT

[ID#1] Received header from server: Server: Apache

[ID#1] Received header from server: Content-Length: 531

[ID#1] Received header from server: Connection: close

[ID#1] Received header from server: Content-Type: text/html; charset=iso-8859-1

[ID#1] Received header from server:

[ID#1] HTTP: wrote 531 bytes
[ID#1] Info: Closing connection 20
[ID#1] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
Scheduler request failed: HTTP internal server error
Deferring communication for 01:27:45
Reason: Scheduler request failed
status: online

- - - - - - - - - -

 

Uploaded results are waiting to be reported.

Uploads work.

The scheduler server connection doesn't work.

The CPU has no WUs to crunch... :-(

 

I have no idea why my BOINC can't connect to the Einstein scheduler server. :-(

 

Please help.

 

Thanks.

N30dG-ARM
Joined: 20 Oct 17
Posts: 23
Credit: 22094059
RAC: 0

Exactly the same problem here. I was on a business trip for the last 4 days, and now nearly all my devices are out of work.

The last ones still running will be out of work in the next few hours.

Strangely, it seems that I'm not able to upload my finished FGRP WUs. BRP WUs seem to upload just fine.

But in both cases I do not get any new WUs:

28/03/2019 16:43:54 | Einstein@Home | Scheduler request failed: HTTP internal server error

N30dG-ARM

It seems to affect only hosts running the FGRP app as anonymous platform.

You can simply remove your app_info.xml from the project folder and restart your BOINC client, and everything should work just fine.

Sadly, this is not an option for me. :-(

 

Dirk Sadowski

I deleted the app_info.xml file in the project folder...

and BOINC can connect to the Einstein scheduler server again.

That's fine, but BOINC couldn't report the already crunched and uploaded WUs.

It looks like the server re-sent them to BOINC, so my PC is crunching the same WUs a second time...

I use an app_info.xml file to make the estimated times more realistic.

I couldn't imagine that this could confuse the Einstein server...

A pity. Does this mean I can't use the app_info.xml file any more? Is it because the project admins would rather not support this use of the file? Or because none of the admins has noticed that there is currently a problem?

 

Thanks.

Gary Roberts

The admins are very busy at the moment with setting up new GW searches, including one for GPUs.  There would be lots of server-side changes going on and I suspect the difficulty you are having with the anonymous platform mechanism could easily be just a little bit of 'collateral damage' :-).  I would be surprised if there was any deliberate attempt to disable it.

Where there is a standard platform available which is compatible with your hardware, you really should just allow it to be used automatically in the normal way.  The anonymous platform mechanism is special and really just for those (quite small number of people) that are not supported by any of the standard platforms.

It is unlikely that the people who could fix this will be reading the boards at the moment.  If you really need this mechanism, I would suggest sending a PM to Bernd Machenschalk.  You could also send a copy to Shawn Kwang and ask him to pass on to the proper people since Bernd may not see the PM in a timely manner.  Make sure you point them to the discussion in this thread.

Cheers,
Gary.

N30dG-ARM

Sutaru Tsureku wrote:

That's fine, but BOINC couldn't report the already crunched and uploaded WUs.

It looks like the server re-sent them to BOINC, so my PC is crunching the same WUs a second time...

I'm sorry, I knew about that and just forgot to mention it. When you remove the app_info.xml file, or even just change the version number in it, BOINC aborts all tasks.

 

Sutaru Tsureku wrote:

A pity. Does this mean I can't use the app_info.xml file any more? Is it because the project admins would rather not support this use of the file? Or because none of the admins has noticed that there is currently a problem?

It seems to be the case only for FGRP. My hosts running BRP as anonymous platform are running just fine.

I think they didn't notice it because it affects only a very small number of users.

Anyway, I've sent Bernd an email.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250682743
RAC: 35161

Sorry, this was caused by a bug in a scheduler modification that we made for the "O1OD1E" application. It should be fixed now. Please post here if it isn't.

BM

Dirk Sadowski

It looks like it works again with the app_info.xml file...

@admins

BTW, would it be possible to add the per-app 'average processing rate' function to the Einstein server software?

I mean, other BOINC projects no longer use the fixed DCF system; there, every app has its own 'average processing rate'. The estimated calculation times and the WU cache are more realistic, and the estimated times for CPU and GPU WUs don't go 'crazy' when the real calculation time varies.

This would be very helpful, then I wouldn't need to use the app_info.xml file... ;-)

 

Thanks.

Dirk Sadowski

Sutaru Tsureku wrote:

Hello.

My BOINC client has been unable to connect to the Einstein@Home scheduler server for over 24 hours...

I only get error messages like these:

Einstein@Home    27.03.19 | 16:46:34    Scheduler request failed: Error 408
Einstein@Home    27.03.19 | 17:02:44    Scheduler request failed: HTTP internal server error

Where is the problem, on Einstein's side or mine?

Thanks.

EDIT: BTW, rebooting the PC didn't help.

FYI

OK, it looks like it's happening again...

Scheduler request failed: Error 408

Scheduler request failed: HTTP internal server error

Last successful contact with the scheduler server:

4 Jun 2019 21:15:06 UTC

I'll send a PM to Bernd Machenschalk; we'll see what happens...

 
