Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250444187
RAC: 35219
7 Oct 2014 11:19:49 UTC
Topic 197746
Einstein@Home will be shut down tomorrow (Wednesday Oct 8) morning (CEST) to perform some urgently necessary database work. We expect this to take a couple of hours.
Project downtime tomorrow
Server seems to be back up (I can post here!), but I'm getting a connection error when I try to report completed tasks.
Yep, the scheduler URL was changed.
Do you happen to know how we can, from the project side, instruct the clients to read the new URL from the "Master URL" (i.e. the index page)?
According to the client code the client should do this automatically after 10 consecutive failures, which may take a while.
BM
Yes, that worked. After a few manual updates (bypassing the 4-hour backoff each time), it found the new
http://einstein5.aei.uni-hannover.de/EinsteinAtHome_cgi/cgi
and we're back in business, with new work downloaded and running.
Edit - I don't think you can 'instruct' the client to do anything without it contacting the scheduler first - and once that's happened, you don't need to tell it to do anything else. Just wait, and let time (and itchy trigger fingers) do the rest.
Yep, for me too - I needed to do about 4-5 Update requests.
-----
Very painless, as it doesn't need to time out; it just gets a "file not found".
The 5th update gets the master file.
That was a pretty long day for us. The basic things should be working again. Some minor things (stats export, scheduler log publishing, db purging) don't work yet, but we'll do this after getting some sleep. Tomorrow I may also give a more extensive report on what we actually did.
BM
Perhaps there is a difference depending on whether work is being requested.
Three of my PCs that wanted work seemed to take about 5 update requests each, but my laptop, which was off all night and had work to report but none to request, logged eleven "Scheduler request failed: HTTP file not found" entries before finally doing the "Fetching scheduler list, Master file download succeeded" pair, after which the next update request succeeded.
Computers which were active during the European day / American night probably got through their first few attempts during the 'down for maintenance' period, so fewer manual updates were needed to reach the "after 10 consecutive failures" trigger that Bernd mentioned. If the machine has been off, you need to do them all yourself.
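For anyone who wants to see the counting argument spelled out, here is a toy model in Python. It is only a sketch of the behaviour described in this thread, not the actual BOINC client code: the class and function names are made up, the old scheduler URL is a placeholder, and the only facts taken from the thread are the new scheduler URL and the "10 consecutive failures" trigger. The real client evidently counts a little differently (the laptop above logged eleven failures before the master fetch), so treat the numbers as illustrative.

```python
# Toy model of the "re-read the master file after N consecutive failures" rule.
# Everything here is hypothetical except the threshold and the new URL,
# which are quoted from the thread.

MASTER_FETCH_THRESHOLD = 10  # "after 10 consecutive failures"
OLD_URL = "http://old-scheduler.invalid/cgi"  # placeholder, not the real old URL
NEW_URL = "http://einstein5.aei.uni-hannover.de/EinsteinAtHome_cgi/cgi"


class ClientSketch:
    def __init__(self, scheduler_url, failures=0):
        self.scheduler_url = scheduler_url
        # Failures already accumulated, e.g. during the maintenance window.
        self.consecutive_failures = failures

    def update(self, live_scheduler_url):
        """Simulate one scheduler request and return a log-style message."""
        if self.scheduler_url == live_scheduler_url:
            self.consecutive_failures = 0
            return "Scheduler request completed"
        self.consecutive_failures += 1
        if self.consecutive_failures >= MASTER_FETCH_THRESHOLD:
            # Model the master file fetch: pick up the current scheduler URL.
            self.scheduler_url = live_scheduler_url
            self.consecutive_failures = 0
            return "Master file download succeeded"
        return "Scheduler request failed: HTTP file not found"


def updates_until_success(client, live_url):
    """Count how many update requests are needed, including the successful one."""
    count = 0
    while True:
        count += 1
        if client.update(live_url) == "Scheduler request completed":
            return count


# A host that already failed 6 times while the project was down:
active_host = ClientSketch(OLD_URL, failures=6)
# A host that was switched off and starts counting from zero:
idle_host = ClientSketch(OLD_URL, failures=0)

print(updates_until_success(active_host, NEW_URL))
print(updates_until_success(idle_host, NEW_URL))
```

In this model, a host that banked some failures during the downtime needs only a handful of manual updates, while a host that was off has to work through the whole count itself.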
I had several tasks waiting to be sent back and after a few tries it started to work again and sent and received once again.
(and I am back to having all 7 hosts running again)
Minor problem with thread marking: just reading a thread didn't mark it as read, but the "Mark all threads as read" button fixed it.
But now having just tested again via reading, it's fine now. Oh well .... :-)
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal