is it einstein or my hardware?

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

ok, most of the tests i've run didn't cause errors. that's the annoying part. for example, it ran fine for 12 days up until i ran the Intel Burn Test (more on that in a second). other tests make the system non-responsive, or require sole use of the system.

i'm in a predicament where i'd like to use a Linux standalone but can't. i need access to the system throughout the day. if i had another system that worked, different story.

and Dell insists on "their" tests only, which is convenient for them since theirs don't run nonstop; they do one pass and they're done.

i ran the Intel Burn Test last night. it did fine....

Processor: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
Clock Speed: 3.39 GHz
Active Physical Cores: 8
Total System Memory: 12239 MB

Stress Level: Very High (4096 MB)
Testing started on 6/5/2014 12:25:51 PM
Time (s) Speed (GFlops) Result
[12:27:37] 77.634 99.2829 3.414758e-002
[12:29:22] 78.512 98.1727 3.414758e-002
[12:31:06] 78.005 98.8104 3.414758e-002
[12:32:50] 77.721 99.1721 3.414758e-002
[12:34:34] 77.694 99.2062 3.414758e-002
[12:36:18] 77.703 99.1944 3.414758e-002

etc, UNTIL i set it to extreme and maximum. i was watching task manager after hitting start and saw the memory fill up until there were about 500 MB left, and i could hear the CPU fan revving up slowly. then about 10 seconds after it got going, the monitor cut off. not just a blackout, but "sleep mode": it lost the video feed. i knew at that point the system had shut down (the power light only blinks for a split second on restart, so unless you know to expect it and are watching, you're not really sure). the Dell logo popped up and the system booted into Windows.

no memory dump, and the only thing in Event Viewer was the usual "unexpected shut down", with nothing leading up to it.

i think in this case it got too hot and the BIOS just forced the system off. while that points me in the right direction (i think), it doesn't really give me anything i can give to them.

i'm just going to have to compile a letter and a DVD of the memory dumps and send it to Dell legal and HOPE someone there can understand the problem instead of just handing it off to "support" again.
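
For anyone who wants something more concrete than "it did fine" to hand over, here is a rough Python sketch that summarizes the six "Very High" passes from the log above. It is not part of IntelBurnTest; the function name `summarize` is made up for illustration, and it rests on the assumption (as I understand the tool) that a stable run should report the same Result value on every pass.

```python
from statistics import mean

# Rows copied from the posted log: (seconds per pass, GFlops, result string).
RUNS = [
    (77.634, 99.2829, "3.414758e-002"),
    (78.512, 98.1727, "3.414758e-002"),
    (78.005, 98.8104, "3.414758e-002"),
    (77.721, 99.1721, "3.414758e-002"),
    (77.694, 99.2062, "3.414758e-002"),
    (77.703, 99.1944, "3.414758e-002"),
]


def summarize(runs):
    """Summarize speed consistency and check that every pass gave the same result."""
    speeds = [gflops for _, gflops, _ in runs]
    results = {result for _, _, result in runs}
    return {
        "passes": len(runs),
        "mean_gflops": mean(speeds),
        "spread_gflops": max(speeds) - min(speeds),
        # Assumption: differing result strings between passes would hint at
        # a computation error on at least one pass.
        "results_match": len(results) == 1,
    }


summary = summarize(RUNS)
print(summary)
```

A summary like this only covers the passes that completed. It says nothing about the later shutdown at the maximum setting, which left no log rows at all.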

seeing without seeing is something the blind learn to do, and seeing beyond vision can be a gift.

mikey
Joined: 22 Jan 05
Posts: 12699
Credit: 1839101536
RAC: 3702

NOT defending Dell, but they will probably say that setting it to "extreme and maximum" is causing the problems, that the system's specs aren't designed for that, and that you are out of luck. NOT that you should be, but they will probably say you 'should' have bought all the upgraded top-of-the-line components if you want to run your PC at "extreme and maximum", that their PCs run just fine under 'normal' conditions, and that your problems are not their problem.

One thing I did notice is that Nvidia now offers the GeForce 337.88 driver, as opposed to your version, 335.23. This MAY help: since you say the monitor goes off pretty quickly during the shutdown process, it could be a driver problem.

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1889
Credit: 1413021148
RAC: 1198373

I just had to read through this again Paul,

Do you watch the temps. on the CPU and GPU to see if either gets up too high while it is running our GPU tasks?

I have all of mine running 24/7 so I tend to check them all once per day especially now when the weather is getting hotter here.

That is the only thing that Einstein tasks can do to that one you have.

I still have my older 3-core hosts with XP Pro running 24/7, doing tasks X3 and also running T4T tasks with VB, so if your newer one can't do these tasks, it either runs hot or there is an OS problem, and I would just do a factory reinstall of the OS (I have done that myself).

Yours has plenty of memory and a good CPU as long as they don't fail.

Now I must admit your Geforce GTX 645 with only 144 CUDA cores would be the first thing I would replace since it is not really a good combination with the nice CPU and RAM yours has.......even a 550Ti would be better for about the same price. (and overclocked)

Good luck (which is more likely than a warranty)

WB8ILI
Joined: 20 Feb 05
Posts: 45
Credit: 1105754788
RAC: 2816798

Paul -

This won't help you except to let you know I had a similar issue a while ago. I don't have the exact date written down but I would guess about a year ago.

I have a desktop, Windows XP, AMD X4 945 processor, 4GB RAM, GTX 660 card.

Einstein apps would consistently crash the computer. I think it was the black screen of death but would have to repeat the experiment to be sure.

I eventually just gave up. I have been running SETI, Climate Prediction, and Milkyway now for a year(?) with no problems.

So is it the Einstein application causing the problem, or some fault in my H/W or S/W that only Einstein exposes? I don't know.

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

Quote:


NOT defending Dell, but they will probably say that setting it to "extreme and maximum" is causing the problems, that the system's specs aren't designed for that, and that you are out of luck. NOT that you should be, but they will probably say you 'should' have bought all the upgraded top-of-the-line components if you want to run your PC at "extreme and maximum", that their PCs run just fine under 'normal' conditions, and that your problems are not their problem.

One thing I did notice is that Nvidia now offers the GeForce 337.88 driver, as opposed to your version, 335.23. This MAY help: since you say the monitor goes off pretty quickly during the shutdown process, it could be a driver problem.

and i would point out to Dell that this is an XPS 8700, marketed as an "extreme performance system" :) the monitor went off quickly during that test, not during normal use or the BSOD. thanks for letting me know about the driver though; NVIDIA's "experience" software didn't let me know about the update. i'll install that now, but i know it won't do any good (11 months into the same problem).

Quote:

I just had to read through this again Paul,

Do you watch the temps. on the CPU and GPU to see if either gets up too high while it is running our GPU tasks?

I have all of mine running 24/7 so I tend to check them all once per day especially now when the weather is getting hotter here.

That is the only thing that Einstein tasks can do to that one you have.

I still have my older 3-core hosts with XP Pro running 24/7, doing tasks X3 and also running T4T tasks with VB, so if your newer one can't do these tasks, it either runs hot or there is an OS problem, and I would just do a factory reinstall of the OS (I have done that myself).

Yours has plenty of memory and a good CPU as long as they don't fail.

Now I must admit your Geforce GTX 645 with only 144 CUDA cores would be the first thing I would replace since it is not really a good combination with the nice CPU and RAM yours has.......even a 550Ti would be better for about the same price. (and overclocked)

Good luck (which is more likely than a warranty)

i was actually coming over here to ask how accurate SpeedFan is at temps compared to Core Temp. i was looking up the max this CPU could handle and found that, but there's a big discrepancy between what they report (about a 40°F difference). at idle the CPU is at 140°F, which "should" be fine, but i'm wondering if this is actually a fan curve issue (which would still be Dell's fault, since they are the only ones that can adjust that). SpeedFan shows the CPU fan at 12%. when i attempt to increase this, there's a 30 second lag before it starts to increase, and it only goes up in steps. and SpeedFan can "not" lower the fan speeds; only a system restart resets them.

if i increase them to 60%, the temps drop down to 110-115°F. as for the GPU, it shows that running at 153°F with the CPU at 131°F.
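
Since most CPU and GPU temperature specs are published in Celsius, it may help to convert the Fahrenheit readings quoted here before comparing them with anything. A minimal sketch; the labels and the helper name `f_to_c` are just for illustration, and the numbers are the ones from this post:

```python
def f_to_c(deg_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (deg_f - 32) * 5 / 9


# Readings quoted in the post, in degrees Fahrenheit.
READINGS_F = {
    "CPU at idle, fan at 12%": 140,
    "CPU with fans at 60% (low end)": 110,
    "CPU with fans at 60% (high end)": 115,
    "GPU": 153,
    "CPU at the time of the GPU reading": 131,
}

for label, deg_f in READINGS_F.items():
    print(f"{label}: {deg_f} F = {f_to_c(deg_f):.1f} C")
```

This only changes units. It does not settle which of the two monitoring tools is reporting the right number.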

i've thought about heat before, but it doesn't add up. the crashes are 0x124 (general hardware fault), and they usually happen every 3 days or so. in fact it crashed just last night, an hour after a reboot.

as for the factory reinstall, let me clarify what's been done to this system so far :)

they replaced the GPU after the first crash. after that we've done..
reinstall from OS recovery partition
reinstall from flash drive
replaced memory modules
replaced motherboard, memory modules, PSU
replaced hard drive (for a "fresh" install of the OS), memory modules, and motherboard.

the "only" parts not replaced are the CPU and HSF.

i've lived with a system that crashed regularly before. the thing is, this system's supposed to have an mSATA SSD installed, and those don't take kindly to random periodic crashes (which is why it's sitting in a drawer collecting dust).

i would LOVE to upgrade the GPU, but not until i pin this down. and i have disabled GPU processing in the past, thinking that was the issue, but it still crashed.

if i have to live with this, what i'll probably end up doing is replacing the heat sink and thermal compound with something better, and hot-wiring the CPU fan to run at 100%. i had to do that with an old Pavilion when i changed out the case, because of the fan curve: it was one of those designed with a cooling tunnel that didn't exist in the new case. so a fan adapter and some, errr, creative wiring fixed that problem.

now, i thought i had ruled out heat, because if i disable Einstein and just let the system sit at idle, guess what? it still crashes with that exact same error. that's what led me to the CPU. hey, maybe i'm wrong, and i can admit that if i am. but this problem's driving me nuts.

WB8ILI
Joined: 20 Feb 05
Posts: 45
Credit: 1105754788
RAC: 2816798

Paul -

One suggestion that was made was to try other applications. I didn't read in this thread where you tried that.

Load up a bunch of SETI and/or Milkyway tasks (or some other project(s)) making sure you are using the GPU to its fullest (run multiple GPU tasks) and see what happens.

Edit:

Paul - I looked through my notes and only Einstein GPU tasks crashed my system. CPU only tasks were OK.

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

anybody got some ketchup for this shoe? at least, partly.

so, i shut down Einstein and let the system idle for an hour while i did other things. came back to find SpeedFan showing the CPU at 65°F. wellll, i know that was wrong. Core Temp was showing 80°F. adjusted the offset accordingly.
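
The offset adjustment described above comes down to a subtraction, sketched below in Python. It assumes Core Temp is the trustworthy reference, which is the poster's working assumption rather than something verified here, and the helper names are made up for illustration. The one subtle point is that a temperature difference converts to Celsius without the 32-degree shift, which matters if the tool expects its offset in Celsius.

```python
def offset_f(reference_f, reading_f):
    """Offset to add to the suspect reading so it matches the reference."""
    return reference_f - reading_f


def delta_f_to_delta_c(delta_f):
    """Convert a temperature difference (not an absolute temperature)."""
    return delta_f * 5 / 9


# Idle readings from the post: Core Temp 80 F (reference), SpeedFan 65 F.
offset = offset_f(reference_f=80, reading_f=65)
print(offset)                                  # 15
print(round(delta_f_to_delta_c(offset), 1))    # 8.3
```

A single idle comparison gives one calibration point only. If the two tools read different sensors, the gap may not stay constant under load.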

i then used SpeedFan to max the fans (had to wait 10 minutes for them to SLOWLY step up; i still say that's a fan curve glitch in the BIOS), then ran the Intel Burn Test at maximum settings again. this time the system didn't crash, but it did get very, very unstable: programs unresponsive, even programs vanishing from the screen for a few seconds and coming back. it does seem that Core Temp was still able to record data though. the CPU hit a max of 165°F with fans at 100%. i don't have the nerve to try that again with the fans "as is".

i still don't think that caused the 0x124, unless something other than Einstein was severely stressing the CPU at random times. suppose that's possible.

i'm considering a ZALMAN CNPS9900MAX-B "if" it'll fit. any suggestions on thermal compound? or will the ZM-STG2 that comes with it do? it's been so long; the last stuff i used was Arctic Silver Ceramique. i'm guessing there's better stuff around by now.

Quote:


Paul -

One suggestion that was made was to try other applications. I didn't read in this thread where you tried that.

Load up a bunch of SETI and/or Milkyway tasks (or some other project(s)) making sure you are using the GPU to its fullest (run multiple GPU tasks) and see what happens.

Edit:

Paul - I looked through my notes and only Einstein GPU tasks crashed my system. CPU only tasks were OK.

i haven't done that because it crashed with or without the GPU, though now that i know SpeedFan's temps were off by a mile, i need to find something else to monitor GPU temp.
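
One possible second opinion on GPU temperature is NVIDIA's own `nvidia-smi` command-line tool, which ships with the driver. The Python sketch below wraps its temperature query. It assumes `nvidia-smi` is on the PATH and that the driver reports a temperature for this particular card, which is not guaranteed on every GeForce model; the function names are hypothetical.

```python
import subprocess

# Query only the GPU core temperature, as bare numbers in degrees Celsius.
QUERY = [
    "nvidia-smi",
    "--query-gpu=temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(output):
    """Return one integer (degrees C) per GPU line, skipping non-numeric lines."""
    temps = []
    for line in output.splitlines():
        text = line.strip()
        if text.isdigit():
            temps.append(int(text))
    return temps


def read_gpu_temps_c():
    """Run nvidia-smi once and return the reported temperatures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temps(result.stdout)


def c_to_f(deg_c):
    """Convert Celsius to Fahrenheit, to compare with the readings in this thread."""
    return deg_c * 9 / 5 + 32
```

Calling `read_gpu_temps_c()` every few seconds and appending the values to a file would leave a record that survives a sudden shutdown. An empty list means the tool did not return a numeric temperature for the card.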

mikey
Joined: 22 Jan 05
Posts: 12699
Credit: 1839101536
RAC: 3702

Quote:

i'm considering a ZALMAN CNPS9900MAX-B "if" it'll fit. any suggestions on thermal compound? or will the ZM-STG2 that comes with it do? it's been so long; the last stuff i used was Arctic Silver Ceramique. i'm guessing there's better stuff around by now.

I love those fans, and I still use the Arctic Silver paste when I build/rebuild systems. The fans are big though, but honestly the newer Hyper 212 is just as big. http://www.newegg.com/Product/Product.aspx?Item=N82E16835103099

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

Quote:

I love those fans, and I still use the Arctic Silver paste when I build/rebuild systems. The fans are big though, but honestly the newer Hyper 212 is just as big. http://www.newegg.com/Product/Product.aspx?Item=N82E16835103099

i actually looked at that one. the 600 RPM on the low side put me off though. as weird as it sounds, since i can't adjust the fan curve i was looking for a high minimum RPM. i need to crack open this case and break out a ruler. i'll end up doing that tomorrow; right now my allergies have me amazingly tired and worn out, and i definitely don't want to deal with dust bunnies.

aside: i've had amazingly bad luck with fans over the years. happen to know about how long the fan on the Zalman might last? i've had fans die after 6 months or less (i was beginning to think i had some sort of curse on me or something lol)

and yeah, i spent the better part of the morning researching TIMs. too many mixed results; even the stuff that comes with it has mixed reviews, with some folks saying it had "chunks" in the mix. think i'll just stick to what i know and trust, and will be adding a new tube of AS5 to the shopping cart.

right now i have NNW set for Einstein. i've left Core Temp on since yesterday; it peaked at 171°F and is currently at 155°F.

WB8ILI
Joined: 20 Feb 05
Posts: 45
Credit: 1105754788
RAC: 2816798

Remember that the thermal paste is not there to be the primary path for heat from the CPU to the heat sink. Most of the heat is transferred from metal to metal contact. The paste is there just to fill in the microscopic air gaps with something better than air.

Putting on a thin layer, rather than slopping the paste on, is probably more important than the thermal properties of the paste.
