My Apologies To The Einstein Crunchers

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7059464931
RAC: 1206156

Quote:
My feeling is that most of the symptoms you're seeing, including the problem the BOINC Manager has communicating with the client, are due to this start-up disk access for Einstein tasks. And the disk access demands of your 8 i7 cores at startup will be even more severe than the demands of my 8 E5320 cores.


This seemingly unproductive period during Einstein execution has troubled me for a long time. I just timed a startup on my E5620 host (four cores running HT, so 8 virtual CPUs). It took almost three minutes from the time it started running the apps before it reached the "useful" phase, when the enormous I/O, appreciable idle time, and significant CPU time charged to System went away and normal computing began. With the flavor of WU currently on my host, Process Explorer showed just over 3 Gigabytes of I/O Read to that point. It appears that this number goes up as the frequency explored by the WU goes up, so this effect keeps getting worse as we press on to higher and higher frequencies.
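For a rough sense of scale, the figures above can be turned into an average read rate. This is only a back-of-the-envelope sketch in Python: it treats "just over 3 Gigabytes" as 3 GiB and "almost three minutes" as 180 seconds, and the function name is made up for illustration. The post does not say whether the 3 GB figure covers one task or all eight together, so the result is best read as an order of magnitude.

```python
def average_read_rate_mib_s(bytes_read, seconds):
    """Average read rate in MiB/s over the whole start-up phase."""
    return bytes_read / seconds / 1024**2

# Approximate inputs taken from the timing described above (assumptions, not exact values).
BYTES_READ = 3 * 1024**3     # "just over 3 Gigabytes" of I/O Read
STARTUP_SECONDS = 3 * 60     # "almost three minutes"

rate = average_read_rate_mib_s(BYTES_READ, STARTUP_SECONDS)
print(f"average read rate during start-up: {rate:.1f} MiB/s")
```

Under those assumptions the average comes out at roughly 17 MiB/s sustained over the whole start-up phase.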

I believe I've noticed that the actual wall-clock time required can be materially increased by one's antivirus application. If your AV makes it easy to pause protection, it might amuse you to compare this startup time with and without it. I'm running Kaspersky on this host, and when I tried the comparison just now there was little difference: perhaps ten or twenty percent faster without, and even that may be an artifact of longer time since boot rather than a real AV side effect. Still, on another host in the past I think I've seen a much larger effect with other AV programs, probably ESET NOD32 and possibly Norton of some years ago.

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings everyone,

*** UPDATE ***

After I changed my boot.ini file, I kept reading the page that Wumpus posted a URL to, located here. Evidently, because of the trouble the /3GB switch was causing (it requires PAE-aware drivers, and there is the matter of what PAE sends to drivers that are not PAE-aware), Micro$oft effectively disabled PAE. Adding the /3GB switch does virtually nothing now. I have no idea whether PAE-aware drivers even exist now, since that article was written. I have no ambition to go scouring the universe in the hope of finding them when it won't do me any good anyway, according to that article.

I noticed the "nothing" that it causes last night when I tried to start my ftp client, FileZilla. I instantly got an error message saying, roughly, "FileZilla is not a valid Windows 32 application". I got rid of that alert and tried to do something in Firefox and received the same alert for Firefox. So, I closed all my running apps and reversed the modification I made to boot.ini and rebooted. I then started all the apps mentioned in my post last night. FileZilla started no problem this time and no further problems with Firefox.

The i7 has been running now for just over 10 hours with no adverse effects, so far. Task Manager is reporting 1.1GB of page file usage and 1.96GB of available physical memory, up and down just slightly, respectively, from last night. As I have mentioned before, my problem would manifest itself after about 8 to 10 hours of continued operation, whether or not I was doing anything on the i7.

I have a short day at work today, don't have to be there until 1pm my time, so until I leave I will continue to do my regular routine on the i7 and monitor it for any abnormalities. When I get home from work, I will see how it is doing then, without me doing anything on it.

@Aaron (Ascholten):

Quote:
another thing you can try as you attempt to ramp your boinc work back up is to set it for low processor use and ramp it up slowly so you observe and hopefully intervene before anything goes south on you too badly.


Good idea! :) I will do that. Another thing I did was to change the cache size in my BOINC (local) preferences to 0 (zero) days and changed those in my preferences on Einstein and VP. I'm assuming that putting a 0 (zero) in that preference in BOINC will cause BOINC to use the value set in my account preferences. Would that be a correct assumption?

Have a great day everyone! :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

I am currently running 4 BOINC projects on my Linux box, which has 32-bit SuSE Linux 11.1 pae, 5 GB RAM, and two 160 GB disks plus a 1.4 TB external disk as backup. My running projects are AQUA (multithreading), Einstein, QMC and CPDN Famous with an extended deadline. All switch regularly every 60 minutes on my Opteron 1210 CPU with two cores. AQUA takes both cores, the others only one.
Tullio

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings everyone,

*** UPDATE ***

Well, I do believe I have narrowed my problem down to Windoze or one of 2 apps, and the one related to this and other projects has the higher percentage of suspicion between the apps. I believe that my problem is caused by BOINC and/or Firefox. But then, I had also suspected that my Windoze installation was hosed, which also has a high probability.

After 11 hours of continuous operation, the dog doo hit the fan. I was researching an ISO file called Ultimate Boot CD. It was recommended by a user over on the SETI forum. I was attempting to find a suitable mirror to download the ISO when Firefox locked up tight. I had to use brute force and enlist Task Manager's help in removing the app. Once it was gone, I shut down BOINC. Everything appeared to be back to normal; notice I said "appeared".

I was checking things out in Task manager when I noticed that my BOINC tray icon was still in the tray. I also watched as it and the icon for a utility called TThrottle switched from normal to all black, to normal, to all black, etc. (NOTE: those 2 are related in that TThrottle is a regulator for BOINC to help keep CPU temps to a minimum). Firefox would not restart, I kept getting a few different error messages. One was something about a .DRV file not being a "valid image" and another was a cryptic message that the "application failed to initialize properly...".

Also, after a few minutes, while in Task manager, I noticed that when switching to the performance tab, the previous tab display was under the graphs. I shut down Task manager and it would no longer restart, giving me that same cryptic message about not initializing properly.

I re-booted the i7...

Now, one thing I have failed to mention in all of this concerns Windoze booting up. I didn't mention this because, well, now that I think about it, I have, but I haven't mentioned this concern because doing a Windoze repair install sux in the extreme. And doing a clean install sux even more.

When the Windoze cylon splash screen is doing its thing, it gets to the point of checking for a mouse and keyboard when a tiny little graphic artifact appears on screen. It's the same colors as the logo. It's about 1 or 2 pixels high and about 10 or 12 pixels wide. The last time I did a repair install, I was seeing that same artifact when the cylon splash was on screen. I call it "cylon" because of the blue progress indicator. ;)

Now, in my update yesterday, I stated that the i7 ran for 26+ hours continuously without any apparent "side effect", without BOINC running. I have given my reason to suspect Windoze, and the event this morning gives me cause to suspect BOINC and/or Firefox. Or could it be some conflict among all 3 running concurrently?

This is really starting to get to me. And now that I think about it, it is just so suspicious that this all started when I re-attached to Einstein...

Keep on BOINCing...! :) (only my Linux box is fulfilling that... something to be said about running Linux...)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

Quote:
Another thing I did was to change the cache size in my BOINC (local) preferences to 0 (zero) days and changed those in my preferences on Einstein and VP. I'm assuming that putting a 0 (zero) in that preference in BOINC will cause BOINC to use the value set in my account preferences. Would that be a correct assumption?


Nope. The local zero will override the online values.

You'll either have to set them locally too or manually edit your global_prefs_override.xml file and delete the offending lines. With the next change of your local preferences, BOINC will enter the online values into your local preferences, though.
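
A minimal Python sketch of that manual edit, for anyone who would rather script it. It assumes the cache settings are stored under the tag names work_buf_min_days and work_buf_additional_days, and it works on a made-up sample string rather than a real file, so check your own global_prefs_override.xml before trusting either assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical override file contents; a real file may hold other settings too.
SAMPLE_OVERRIDE = """<global_preferences>
   <work_buf_min_days>0</work_buf_min_days>
   <work_buf_additional_days>0</work_buf_additional_days>
   <max_ncpus_pct>100</max_ncpus_pct>
</global_preferences>"""

# Assumed tag names for the work cache settings.
CACHE_TAGS = ("work_buf_min_days", "work_buf_additional_days")

def strip_cache_overrides(xml_text):
    """Return the XML with the cache-size override elements removed."""
    root = ET.fromstring(xml_text)
    for tag in CACHE_TAGS:
        # findall returns a list, so removing while looping over it is safe.
        for element in root.findall(tag):
            root.remove(element)
    return ET.tostring(root, encoding="unicode")

cleaned = strip_cache_overrides(SAMPLE_OVERRIDE)
print(cleaned)
```

On a real host you would read the file from the BOINC data directory, write the result back, and have the client re-read its preferences (or restart it) before the change takes effect.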

Gruß,
Gundolf

Computer sind nicht alles im Leben. (Kleiner Scherz)

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86314215
RAC: 213

Quote:

Greetings everyone,

*** UPDATE ***

Well, I do believe I have narrowed my problem down to Windoze or one of 2 apps, ... But then, I had also suspected that my Windoze installation was hosed, which also has a high probability.

After 11 hours of continuous operation, the dog doo hit the fan.

... utility called TThrottle switched from normal to all black, to normal, to all black, etc. (NOTE: those 2 are related in that TThrottle is a regulator for BOINC to help keep CPU temps to a minimum). ...

All things in running computers should be completely deterministic and certain. There should be no need for any uncertainty and 'probability'. Unless, that is, you go quantum... :-)

Are you completely sure you don't have a hardware failure or marginal operation?

Cooling fan problem or cooling fan control problem?

Can you run Memtest86+ overnight without error?

Can you run a full disk check including surface scan without error?

(And have you got a backup of all your work?!)

Quote:

I re-booted the i7...

Now, one thing I have failed to mention in all of this concerns Windoze booting up. I didn't mention this because, well, now that I think about it, I have, but I haven't mentioned this concern because doing a Windoze repair install sux in the extreme. And doing a clean install sux even more.

I gave up with that silliness long ago. A partial fix was using ghosting to quickly roll back to an earlier known good state. A more comprehensive fix was in moving off Windows in the first place!

Quote:
When the Windoze cylon splash screen is doing its thing, ... I call it "cylon" because of the blue progress indicator. ;)

Good giggle. Never drew that association before! :-)

Wait, didn't they wipe us out...

Quote:
... the i7 ran for 26+ hours continuously without any apparent "side effect", without BOINC running. I have given my reason to suspect Windoze, and the event this morning gives me cause to suspect BOINC and/or Firefox. Or could it be some conflict among all 3 running concurrently?...

If you have marginal hardware operation, they could be falsely implicated just by merely being the most often present applications. You must eliminate all possibility that you might be suffering hardware problems. (Oooops... Mustn't fall into the Cylon-speak!)

The next suspect is then whether you are suffering any malware or anti-virus side effects...

Quote:
Keep on BOINCing...! :) (only my Linux box is fulfilling that... something to be said about running Linux...)

Linux can run well even on old hardware, but even Linux cannot always survive hardware failure.

Hope that helps.

Happy crunchin',
And Good Luck,

Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

mikey
Joined: 22 Jan 05
Posts: 11964
Credit: 1833700212
RAC: 224746

Quote:

Greetings everyone,

*** UPDATE ***

I was checking things out in Task manager when I noticed that my BOINC tray icon was still in the tray. I also watched as it and the icon for a utility called TThrottle switched from normal to all black, to normal, to all black, etc. (NOTE: those 2 are related in that TThrottle is a regulator for BOINC to help keep CPU temps to a minimum). Firefox would not restart, I kept getting a few different error messages. One was something about a .DRV file not being a "valid image" and another was a cryptic message that the "application failed to initialize properly...".

Stop running the TThrottle program during your testing; it could be the culprit and you wouldn't even know it. Run just the basic stuff, then add the additional programs back one by one. I have several i7 machines that run Firefox and BOINC with Einstein, and I do not see any of your problems. All of my machines are home-built, mostly from used components with the exception of the CPU, memory and motherboard.

I think you need to get back to basics and start over from there. Does that mean a fresh Windows install? No, not right now, but it does mean stopping all add-on programs for now and seeing if things go okay. Do the testing that ML1 suggests, and if everything tests out OK then get rid of TThrottle (or just stop it from running), go back to IE for now, etc. ALSO MAKE SURE you have done ALL the Windows updates! I know some are a pain and seem to do more harm than good, but right now you have nothing to lose: your current setup won't work and you are stuck in drydock!!

Siran d'Vel'nahr
Joined: 15 Feb 05
Posts: 104
Credit: 1538869
RAC: 0

Greetings everyone,

@ Mikey and Martin: I am in the process of running some hardware tests. I downloaded an ISO called Ultimate Boot CD and burned it to disc. It has a lot of great stuff on it. It runs on a Linux kernel and looks a LOT like the old M$-DOG screen displays of yesteryear.

To be honest with you guys, and I know you're just trying to help and I really appreciate it, but I seriously do not believe that I am having any kind of hardware problem. But, I will do some hardware testing anyway.

*** UPDATE ***
I already ran some diagnostics on my boot HDD. It came back clean as a whistle. It had better; everything in the i7 except the case and PSU is barely 6 months old. The CD has a few CPU stress testing utilities that I will use to make sure the CPU is fine. I ran memtest86 on the i7 overnight, 9 minutes shy of 11 hours total. No errors at all. :) I didn't have a lot of time to do more exploring on the CD since I needed to get ready for work, but I'm sure there are more hardware diagnostic/testing programs on that CD.

I still seriously believe that when I re-attached to Einstein, something got whacked upside the head causing me these headaches, and it wasn't me getting whacked. ;) And, I do not believe in coincidence. I do not believe that some piece of hardware decided to start heading south at the same time I re-attached to Einstein. The probability of that happening is just too danged low, in my opinion.

Some reasoning behind my beliefs:
Before all this started happening, the only hardware problem I knew of was my video card. It has since been changed and the problems no longer exist. The i7 ran for 30+ days, the longest I have ever seen a Windoze PC go without a re-boot, until I received an update from MS that required a re-boot. We know, most re-boots are caused by MS and their updates. SETI was crunching 24/7. I maintained an average 200+ WUs in my cache at all times. NOT a single problem.

BOINC has not been doing any crunching for a few days now and every time the i7 runs with it running, after 8 to 10 hours the crap starts happening. When BOINC is NOT running at all, the i7 runs without a problem, for longer than 24 hours. It's only when BOINC runs that the problem appears. I didn't have any problems with BOINC or in general until I re-attached to Einstein.

As mentioned yesterday, I suspect that something in Windoze has been corrupted. I also mentioned the symptom that is the cause of my suspicion. We'll see what happens when I run more hardware tests this weekend.

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1707
Credit: 1074745646
RAC: 1217621

My 2 machines with the 9-year-old XP Pro OS wanted to give you all a "BAH" and an "Oy Vey". They said they don't mind sharing this room with the W7, but their RAC speaks for itself.

mikey
Joined: 22 Jan 05
Posts: 11964
Credit: 1833700212
RAC: 224746

Quote:

Greetings everyone,

@ Mikey and Martin: I am in the process of running some hardware tests. I downloaded an ISO called Ultimate Boot CD and burned it to disc. It has a lot of great stuff on it. It runs on a Linux kernel and looks a LOT like the old M$-DOG screen displays of yesteryear.

To be honest with you guys, and I know you're just trying to help and I really appreciate it, but I seriously do not believe that I am having any kind of hardware problem. But, I will do some hardware testing anyway.

*** UPDATE ***
I already ran some diagnostics on my boot HDD. It came back clean as a whistle. It had better; everything in the i7 except the case and PSU is barely 6 months old. The CD has a few CPU stress testing utilities that I will use to make sure the CPU is fine. I ran memtest86 on the i7 overnight, 9 minutes shy of 11 hours total. No errors at all. :) I didn't have a lot of time to do more exploring on the CD since I needed to get ready for work, but I'm sure there are more hardware diagnostic/testing programs on that CD.

I still seriously believe that when I re-attached to Einstein, something got whacked upside the head causing me these headaches, and it wasn't me getting whacked. ;) And, I do not believe in coincidence. I do not believe that some piece of hardware decided to start heading south at the same time I re-attached to Einstein. The probability of that happening is just too danged low, in my opinion.

Some reasoning behind my beliefs:
Before all this started happening, the only hardware problem I knew of was my video card. It has since been changed and the problems no longer exist. The i7 ran for 30+ days, the longest I have ever seen a Windoze PC go without a re-boot, until I received an update from MS that required a re-boot. We know, most re-boots are caused by MS and their updates. SETI was crunching 24/7. I maintained an average 200+ WUs in my cache at all times. NOT a single problem.

BOINC has not been doing any crunching for a few days now and every time the i7 runs with it running, after 8 to 10 hours the crap starts happening. When BOINC is NOT running at all, the i7 runs without a problem, for longer than 24 hours. It's only when BOINC runs that the problem appears. I didn't have any problems with BOINC or in general until I re-attached to Einstein.

As mentioned yesterday, I suspect that something in Windoze has been corrupted. I also mentioned the symptom that is the cause of my suspicion. We'll see what happens when I run more hardware tests this weekend.

Keep on BOINCing...! :)

I do not think it is hardware related either, but if you don't check and it is, you will have wasted a lot of time. It is kind of like the doctor who uses his stethoscope to see if your heart is still beating when he can see that you are breathing: he KNOWS your heart is working because it stops AFTER your breathing as you die, but he does it anyway.

I just checked your latest problem units and the error code is -120. I checked a wiki, and it says the problem is:
ERR_RSA_FAILED (-120): RSA key check failed for file

Now I have no idea what that means, but do you run this under your Admin login account or a different account? If a different account, maybe the permissions aren't what Einstein is looking for? When you installed BOINC, did you choose the default options or did you customize them? When I do it I always uncheck 'use the screensaver' and then check 'let anyone manage BOINC', but other than that I use the defaults. You did not choose a 'service' install by chance, did you?
