Project downtime tomorrow

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4335

Credit: 252494281

RAC: 35894

7 Oct 2014 11:19:49 UTC

Topic 197746


Einstein@Home will be shut down tomorrow morning (Wednesday, Oct 8, CEST) for urgently needed database work. We expect this to take a couple of hours.

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2997605654

RAC: 713047


8 Oct 2014 15:34:46 UTC

Message 124072


Server seems to be back up (I can post here!), but I'm getting a connection error when I try to report completed tasks.

Quote:

08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: Connected to einstein.phys.uwm.edu (129.89.61.70) port 80 (#5142)
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: Adding handle: conn: 0x37dfe80
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: Adding handle: send: 0
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: Adding handle: recv: 0
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: Curl_addHandleToPipeline: length: 1
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Info: - Conn 5142 (0x37dfe80) send_pipe: 1, recv_pipe: 0
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: POST /EinsteinAtHome_cgi/cgi HTTP/1.1
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.4.22)
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Host: einstein.phys.uwm.edu
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Accept: */*
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Accept-Encoding: deflate, gzip
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Content-Type: application/x-www-form-urlencoded
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Accept-Language: en_GB
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Content-Length: 190700
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server: Expect: 100-continue
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Sent header to server:
08/10/2014 16:33:57 | Einstein@Home | [http] [ID#1] Received header from server: HTTP/1.1 100 Continue
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server: HTTP/1.1 404 Not Found
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server: Date: Wed, 08 Oct 2014 15:29:15 GMT
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Info: Server Apache/2.2.3 (CentOS) is not blacklisted
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server: Server: Apache/2.2.3 (CentOS)
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server: Content-Length: 306
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server: Content-Type: text/html; charset=iso-8859-1
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Info: HTTP error before end of send, stop sending
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Received header from server:
08/10/2014 16:33:59 | Einstein@Home | [http] [ID#1] Info: Closing connection 5142
08/10/2014 16:34:00 | Einstein@Home | Scheduler request failed: HTTP file not found

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4335

Credit: 252494281

RAC: 35894


8 Oct 2014 15:38:00 UTC

Message 124073 in response to message 124072


Yep, the scheduler URL was changed.

Do you happen to know how we can instruct the clients from the project side to read the new URL from the "Master URL" (i.e. the index page)?

According to the client code the client should do this automatically after 10 consecutive failures, which may take a while.
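A minimal sketch of the behaviour described above, assuming only what is stated here: the client counts consecutive scheduler failures and re-reads the master URL once the count reaches 10. The class, field, and method names are hypothetical and are not taken from the BOINC client source.

```python
from dataclasses import dataclass

# Threshold quoted in the post above; the constant name is illustrative.
MASTER_FETCH_THRESHOLD = 10


@dataclass
class ProjectState:
    """Hypothetical per-project bookkeeping, not the real BOINC structure."""
    scheduler_url: str
    consecutive_failures: int = 0
    master_fetch_pending: bool = False

    def record_scheduler_result(self, success: bool) -> None:
        """Update the failure counter after one scheduler request."""
        if success:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        # Assumed rule: once the threshold is reached, flag a master file fetch.
        if self.consecutive_failures >= MASTER_FETCH_THRESHOLD:
            self.master_fetch_pending = True

    def apply_master_file(self, new_scheduler_url: str) -> None:
        """Adopt the scheduler URL found in the master file and start over."""
        self.scheduler_url = new_scheduler_url
        self.consecutive_failures = 0
        self.master_fetch_pending = False


# Old URL as seen in the log earlier in this thread.
state = ProjectState("http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi")
for _ in range(MASTER_FETCH_THRESHOLD):
    state.record_scheduler_result(success=False)
print(state.master_fetch_pending)
```

In this simplified model the project side has nothing to push: the counter advances only when the client itself retries, whether on its own schedule or through a manual update.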

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2997605654

RAC: 713047


8 Oct 2014 15:41:04 UTC

Message 124074 in response to message 124073


Yes, that worked. After a few manual updates (bypassing the 4-hour backoff each time), it found the new scheduler URL, http://einstein5.aei.uni-hannover.de/EinsteinAtHome_cgi/cgi, and we're back in business, with new work downloaded and running.

Edit - I don't think you can 'instruct' the client to do anything without it contacting the scheduler first - and once that's happened, you don't need to tell it to do anything else. Just wait, and let time (and itchy trigger fingers) do the rest.

Mumak

Joined: 26 Feb 13

Posts: 335

Credit: 3587871583

RAC: 1420009


8 Oct 2014 15:54:49 UTC

Message 124075


Yep, for me too - I needed to do about 4-5 Update requests.

Tom*

Joined: 9 Oct 11

Posts: 54

Credit: 366729484

RAC: 0


8 Oct 2014 16:00:38 UTC

Message 124076


Very painless, as it doesn't need to time out; it just gets a "file not found".
5th update gets the master file.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4335

Credit: 252494281

RAC: 35894


8 Oct 2014 16:06:15 UTC

Message 124077


That was a pretty long day for us. The basic things should be working again. Some minor things (stats export, scheduler log publishing, db purging) don't work yet, but we'll take care of those after getting some sleep. Tomorrow I may also give a more extensive report on what we actually did.

archae86

Joined: 6 Dec 05

Posts: 3163

Credit: 7330801687

RAC: 2321167


8 Oct 2014 17:09:01 UTC

Message 124078 in response to message 124076


Quote:

5th update gets the master file.

Perhaps there is a difference depending on whether work is being requested.

Three of my PCs that wanted work seemed to take about 5 update requests each, but my laptop, which was off all night and had work to report but none to request, logged eleven "Scheduler request failed: HTTP file not found" entries before finally doing the "Fetching scheduler list, Master file download succeeded" pair, after which the next update request succeeded.

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 2997605654

RAC: 713047


8 Oct 2014 17:32:13 UTC

Message 124079 in response to message 124078


Quote:

Quote:
5th update gets the master file.

Perhaps there is a difference depending on whether work is being requested.

Three of my PCs that wanted work seemed to take about 5 update requests each, but my laptop, which was off all night and had work to report but none to request, logged eleven "Scheduler request failed: HTTP file not found" entries before finally doing the "Fetching scheduler list, Master file download succeeded" pair, after which the next update request succeeded.

Computers that were active during the downtime (European day / American night) probably got through their first few attempts during the 'down for maintenance' period, so fewer manual updates were needed to reach the "after 10 consecutive failures" trigger that Bernd mentioned. If the machine has been off, you need to do them all yourself.
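As a rough illustration of that arithmetic, assuming the trigger is exactly 10 consecutive failures and that every failed attempt counts equally (both are assumptions, and the function name is made up for this example):

```python
def manual_updates_needed(failures_so_far: int, threshold: int = 10) -> int:
    """Failed updates still required before the master file is re-read."""
    return max(0, threshold - failures_so_far)


# Host that was running and, say, already failed 5 times during the downtime.
print(manual_updates_needed(5))  # prints 5
# Host that was switched off and has no failures recorded yet.
print(manual_updates_needed(0))  # prints 10
```

The eleven failures reported for the laptop suggest the real trigger may differ slightly from this simple model.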

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1937

Credit: 1475966144

RAC: 1254278


8 Oct 2014 19:27:42 UTC

Message 124080


I had several tasks waiting to be sent back. After a few tries it started working again, and tasks are being sent and received once more.

(and I am back to having all 7 hosts running again)

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6591

Credit: 329384025

RAC: 284057


8 Oct 2014 22:23:34 UTC

Message 124081


Minor problem with thread marking: just reading a thread didn't mark it as read, but the "Mark all threads as read" button fixed it.

But having just tested again by reading a thread, it's fine now. Oh well ... :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
