"Project is down" for 19 hours now

Richard Schumacher
Joined: 8 Aug 06
Posts: 32
Credit: 14212314
RAC: 0
Topic 194303

But there are no messages about it. What's up?

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686042413
RAC: 597943

"Project is down" for 19 hours now

Quote:
But there are no messages about it. What's up?

The database performance problems that are mentioned in the latest news item on the project home page are not completely resolved yet. The project admins are working on it, please stay tuned.

CU
Bikeman

mikey
Joined: 22 Jan 05
Posts: 11888
Credit: 1828038366
RAC: 206751


Quote:
But there are no messages about it. What's up?

Seems to be a weekly weekend thing! Hopefully it will be fixed, or someone assigned to check it on weekends, soon.

Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 417491205
RAC: 33539


I've been watching the status page for about 2 weeks and see that nothing has changed yet. The ABP assimilator has not been working for a long time, and the file deleter hasn't worked all this time either. Sooner or later this will bring the server to its knees because of "no more free space available".

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921643
RAC: 16886


We still have serious problems with our database and are working on it. For the moment we disabled the scheduler (which was getting mostly DB connection errors anyway) to let the rest of the daemons catch up.

ABP1 validator and assimilator are running, however they will never again run on the machine the server status page has direct access to, so I removed the status signs from there.

Also due to DB connection problems the server status page isn't updated, but still shows an old status of 10:37 UTC.

BM


adrianxw
Joined: 21 Feb 05
Posts: 242
Credit: 322654862
RAC: 0


I can neither get nor report wu's at the moment. I assume this to be a known effect of the known outage and only mention it in case it was not!

Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686042413
RAC: 597943


Quote:
I can neither get nor report wu's at the moment. I assume this to be a known effect of the known outage and only mention it in case it was not!

Yes, this is to be expected whenever the scheduler is turned off.

CU
Bikeman

BarryAZ
Joined: 8 May 05
Posts: 190
Credit: 320740540
RAC: 9548


Looks like there was an update to the home page earlier today.

Quote:
But there are no messages about it. What's up?


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4265
Credit: 244921643
RAC: 16886


Ok, the daemons have worked through the backlogs and the DB looks responsive again. I started the scheduler, let's see how things go.

BM


Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0


Quote:

Ok, the daemons have worked through the backlogs and the DB looks responsive again. I started the scheduler, let's see how things go.

BM


Still (or again?) at UTC+2:

21/04/2009 11:45:44|Einstein@Home|Message from server: Project is temporarily shut down for maintenance

And the server status hangs at 8:04 AM UTC on Tuesday, 21 April 2009.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686042413
RAC: 597943


Quote:
Quote:

Ok, the daemons have worked through the backlogs and the DB looks responsive again. I started the scheduler, let's see how things go.

BM


Still (or again?) at UTC+2:

21/04/2009 11:45:44|Einstein@Home|Message from server: Project is temporarily shut down for maintenance

And the server status hangs at 8:04 AM UTC on Tuesday, 21 April 2009.

Regards,
Gundolf

Some of my hosts did get fresh work today. Until the problem is resolved (which will be announced on the home page), I'm afraid there will be phases where the scheduler is turned off and on repeatedly to see how the system reacts to different countermeasures, until performance is acceptable again. Please ignore the status page for the time being: generating it puts extra load on the database, which would interfere with the troubleshooting right now.

CU
Bikeman
