UWM power outage Sat Nov 17

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244926081
RAC: 16571
Topic 196600

There is a planned power outage on the UWM campus that will affect the physics building on Sat Nov 17, from 7 AM to 2 PM local time (which should be 1 PM to 8 PM UTC).
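For anyone who wants to double-check the UTC window, here is a minimal sketch of the conversion. It assumes Milwaukee is on Central Standard Time (UTC-6) in mid-November, and it assumes the year is 2012 (a year in which Nov 17 falls on a Saturday); neither is stated in the post itself, and the helper name `to_utc` is just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Central Standard Time (UTC-6) applies in Milwaukee in mid-November.
CST = timezone(timedelta(hours=-6), "CST")


def to_utc(year, month, day, hour):
    """Convert a local (CST) wall-clock hour to a UTC datetime."""
    local = datetime(year, month, day, hour, tzinfo=CST)
    return local.astimezone(timezone.utc)


# Assumption: the year is 2012; the post only gives "Sat Nov 17".
start = to_utc(2012, 11, 17, 7)   # 7 AM local
end = to_utc(2012, 11, 17, 14)    # 2 PM local

print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"), "UTC")
print("planned duration:", end - start)
```

Under those assumptions the window comes out as 13:00 to 20:00 UTC, i.e. seven hours.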

According to current plans, the network connection to the outside world will remain available, so we'll keep the core project servers running on UPS for that time, and ideally you here on Einstein@Home shouldn't notice anything (Albert@Home will be shut down, however). Not everything may actually go according to plan, though.

BM


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244926081
RAC: 16571


As you may have noticed, not everything went according to plan. The UPS failed and we had to shut down the project servers.

We're currently bringing the project up bit by bit. By tomorrow, all should be running again.

BM


robertmiles
Joined: 8 Oct 09
Posts: 127
Credit: 20889670
RAC: 72902


For two of my computers, the UPS has a feature you might find useful in such situations. Shortly before it runs out of battery power, it tells the computer to pause all programs and then write the current state of memory to the hard drive. The computer then powers off (in effect, it hibernates); when restarted, it reloads memory from that section of the hard drive instead of doing a normal reboot, and lets all programs resume. Some programs fail during this, probably those that have timeouts running and are not set up to recover when those timeouts expire while the machine is off.

You may want to check if the model of UPS you're using has that feature, and if so, whether suitable software for it is available for your operating system.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244926081
RAC: 16571


The main project servers were shut down cleanly (apart from a problem shutting down the DB on Albert). The Einstein@Home rack does have a local UPS with enough capacity to shut down the machines cleanly (and automatically). The problem was with the UPS of the data center (Nemo cluster): it should have lasted long enough to keep the Einstein@Home machines running for the full (planned) seven hours, but in reality it lasted for only 1.5 hours. So we shut down the machines manually.

BM


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244926081
RAC: 16571


One remnant of the power outage is that, since then, the replica DB that serves the web pages (including the server status page) seems terribly slow at times, and we still don't know why. It is configured identically to the master DB, which runs quite fine. We're still investigating.

BM

