Major Problem - Logging In at Startup

RiversideCityCampus
Joined: 30 Jan 06
Posts: 6
Credit: 8016145
RAC: 0
Topic 191067

A number of machines in the lab are experiencing a problem at startup.

boinc.exe initially runs at close to 100% CPU, at normal rather than low priority. It takes 10 to 20 minutes to reach the Windows XP startup screen while logging into the network.

boinc.exe keeps running for quite a while, making the machine extremely sluggish.

When albert_4.37 finally kicks in, everything performs at normal operating speed.

What is causing boinc.exe to hang everything up?

Mark E. Lehr

RiversideCityCampus
Joined: 30 Jan 06
Posts: 6
Credit: 8016145
RAC: 0

Further information:

The lease time on the IP address is 1 hour in the future.
Probably some kind of time change problem.

boinc.exe is waiting for this time.

When I set the clock one hour ahead, albert starts immediately.

The machine then responds very quickly.

Is boinc.exe checking time before starting albert?

Mark E. Lehr

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955429872
RAC: 721671

Nothing to do with BOINC/Einstein, but....

A quick look at your computer list shows a couple of servers, and lots of workstations. Is one of your servers a Domain Controller, and are the workstations joined as members of that domain?

I've come across this 'very slow login' for domain members, when the server DNS is badly configured or misbehaving: a Windows Active Directory domain seems to rely very heavily on DNS to communicate with the local server while starting up.

So my suggestion would be: disable BOINC on a test machine, and see if it starts normally. If it's still slow to login, you've got an underlying network problem, and you've got to address/resolve that before moving on to BOINC issues.

Hope that helps.

RiversideCityCampus
Joined: 30 Jan 06
Posts: 6
Credit: 8016145
RAC: 0

Message 27974 in response to message 27973

We run deepfreeze on all our machines. Nothing is saved to the hard drive when a machine shuts down. This is necessary in a lab environment, or we would be re-imaging the machines weekly.

When the time changed, the new Dell machines saved some of their BIOS information to the hard drive. Unfortunately, this caused a major hiccup.

Daylight saving time caused the system clock and the lease time for the IP to differ by 1 hour.
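For anyone wanting to check for this kind of mismatch, here is a minimal sketch in Python of the comparison described above. It only tests whether two timestamps differ by roughly one hour. The function name, the 5-minute tolerance, and the sample timestamps are all hypothetical; how you actually read the system clock and the lease time on your machines is not covered here.

```python
from datetime import datetime, timedelta


def looks_like_dst_skew(system_time, reference_time,
                        tolerance=timedelta(minutes=5)):
    """Return True if the two clocks differ by roughly one hour.

    Hypothetical helper: a one-hour gap (within `tolerance`) is treated
    as a likely daylight-saving mismatch, not as proof of one.
    """
    offset = abs(reference_time - system_time)
    return abs(offset - timedelta(hours=1)) <= tolerance


if __name__ == "__main__":
    # Made-up sample values standing in for the local clock and the
    # time implied by the IP lease.
    system_clock = datetime(2006, 4, 2, 9, 0, 0)
    lease_clock = datetime(2006, 4, 2, 10, 0, 30)
    print(looks_like_dst_skew(system_clock, lease_clock))
```

A check like this could run at startup and log a warning before BOINC is launched, but that is a suggestion and not something we have tested in the lab.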

For some reason, BOINC started but would not start Albert. BOINC runs at a higher priority than Albert, so the new Dells took 20 minutes to log in. BOINC would sit there and wait until the hour had passed before starting Albert. Albert runs at a lower priority, and once the hour had passed the machines became responsive.

The solution was to disable and re-enable deepfreeze. The clocks aligned and the machines behaved normally.

Mark E. Lehr

PS Due to deepfreeze we have also had to write a script that will merge machines daily. Nothing is saved to the hard drive. All work is lost at night when the machines shut down.

PPS We also installed 4 machines around campus for Wake-on-LAN, so if students turn the machines off we can wake them up automatically.

PPPS When the Einstein site goes down our network goes berserk. We have about 1000 machines, and many of them try uploading and downloading info but can't reach the site. It calms down after a while, but it can be a little hair-raising.
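One common way to keep a large number of clients from hammering an unreachable server all at once is randomized exponential backoff. The sketch below illustrates that general technique in Python. It is not taken from the BOINC source, and the function name, the 60-second base delay, and the 4-hour cap are assumptions chosen only for illustration.

```python
import random


def backoff_delay(attempt, base=60.0, cap=4 * 3600.0, rng=random):
    """Return a randomized delay in seconds before retry number `attempt`.

    Hypothetical helper: the ceiling doubles with each failed attempt,
    is limited to `cap`, and the actual delay is drawn uniformly between
    0 and that ceiling so that many machines do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


if __name__ == "__main__":
    # Show the delay chosen for the first few consecutive failures.
    for attempt in range(6):
        print(attempt, round(backoff_delay(attempt), 1))
```

With delays spread out like this, the retry traffic from a thousand machines thins out over time, which would be consistent with the network calming down after a while.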

RiversideCityCampus
Joined: 30 Jan 06
Posts: 6
Credit: 8016145
RAC: 0

Message 27975 in response to message 27973

Quote:

Nothing to do with BOINC/Einstein, but....

A quick look at your computer list shows a couple of servers, and lots of workstations. Is one of your servers a Domain Controller, and are the workstations joined as members of that domain?

I've come across this 'very slow login' for domain members, when the server DNS is badly configured or misbehaving: a Windows Active Directory domain seems to rely very heavily on DNS to communicate with the local server while starting up.

So my suggestion would be: disable BOINC on a test machine, and see if it starts normally. If it's still slow to login, you've got an underlying network problem, and you've got to address/resolve that before moving on to BOINC issues.

Hope that helps.

We did disable BOINC. The cause was the interaction between the order in which software was installed, BOINC, the Windows DNS, deepfreeze, and the Dells saving to the hard drive. Disabling and re-enabling deepfreeze solved the problem. However, a year from now, when the time changes in this direction again, it will cause the same problem.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

Mark E. Lehr:
Do you realize you have over 6,000 clients registered? Most of them appear to be new instances of previously registered CPUs. A lot of the time you are abandoning 2 or 3 work units. This leaves your quorum cohorts to wait two weeks for the dead WUs to time out before they can get credit.

It is great to see such a large CPU farm join E@H. If you are having problems, may I suggest that you work them out with just a couple of systems before you turn on the rest of your 1000?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955429872
RAC: 721671

Ooooh, nasty.

I see what MarkF means about the abandoned WUs - I'd been wondering why my 'pending credit' list had been growing so much recently!

Could you spare any server drive space for the BOINC installations? I understand you want to keep the local drives clean, but if you run BOINC as a service with its own account, and only allow that account to access the server share, you should be fairly secure. Then the BOINC data could be preserved across sessions - the network drive will be less efficient, of course, but you should win that back by not dropping half-finished WUs at the end of the session.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

Can someone get in touch with Mark E. Lehr? He has added about 600 new clients today.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955429872
RAC: 721671

Message 27979 in response to message 27978

Quote:
Can someone get in touch with Mark E. Lehr? He has added about 600 new clients today.

There's an email address in his profile. I've mailed him.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955429872
RAC: 721671

Mission accomplished. A configuration change broke the script, but it's sorted now.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

Great, many happy comp cycles to you both!
