Hello, I have a system with a 7950X3D that crashes occasionally when running CPU tasks, and for the life of me I can't figure out why. The system is running completely stock; PBO and CO are completely disabled. The memory is running the default JEDEC profile at 4800 MT/s. All drivers are up to date, as is the BIOS (the motherboard is an Asus B650E-I, if that is helpful). I have tested the system with three different power supplies (800 W, 850 W and 1000 W), all with the same result. I have also run both Windows Memory Diagnostic and memtest86+, both of which give the memory a pass.
The Windows event log only shows an Event 41 Kernel-Power entry, with no details as to what happened. I also attempted to log many of the system values with the logger in HWiNFO64; however, I do not have much confidence that it was able to record what happened as the crash occurred. When the crashes occur, the system tends to lock up for a few seconds, often going to a completely green screen before rebooting. I would like to think it is not a power issue because a) the system does not switch off immediately when it crashes; it stays on but locked up for a couple of moments before it reboots, and b) I changed the BIOS setting for what to do upon loss of AC power, so if it loses AC it is not supposed to reboot on its own.
It also does not appear to matter how much of the CPU I allow the project to use; I have set it anywhere from 10% to 100% and it still crashes, although the crashes seem to occur faster at higher percentages. Ultimately my question is this: is there some other test that anyone can think of to narrow down the cause of these crashes? Or is anyone else running these applications on the same processor, and is it behaving correctly?
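On the worry that a log may not capture the last moments before a crash: one general approach is to force every sample to disk as soon as it is written. The sketch below is only an illustration of that idea in Python and is not how HWiNFO64 itself logs. The function name `log_samples`, the `read_sample` callable and the column name `cpu_temp_c` are all hypothetical; you would have to supply your own way of reading sensor values.

```python
import csv
import os
import time
from datetime import datetime


def log_samples(path, read_sample, fieldnames, count, interval_s=1.0):
    """Append `count` samples to a CSV file, committing each row to disk.

    `read_sample` is a hypothetical callable that returns a dict whose keys
    match `fieldnames`; how it obtains sensor values is left to the caller.
    """
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", *fieldnames])
        if new_file:
            writer.writeheader()
        for i in range(count):
            row = {"timestamp": datetime.now().isoformat(timespec="milliseconds")}
            row.update(read_sample())
            writer.writerow(row)
            f.flush()             # push Python's buffer to the operating system
            os.fsync(f.fileno())  # ask the OS to commit the data to the drive
            if i + 1 < count:
                time.sleep(interval_s)


if __name__ == "__main__":
    # Placeholder reading; a real sensor source is assumed, not provided here.
    log_samples("crash_log.csv", lambda: {"cpu_temp_c": 89.0}, ["cpu_temp_c"], count=3)
```

Syncing after every row costs some write performance, but it narrows the window in which a sudden lock-up can lose data to roughly one sampling interval.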
Copyright © 2024 Einstein@Home. All rights reserved.
Hi CYRIX,
Just for grins, try swapping out one of your GPUs and see whether the problem follows the GPU or stays with the motherboard and the rest of the system. That would at least be a big help.
Proud member of the Old Farts Association
Hi Cyrix, you didn't mention system temps either. You might find a temperature-monitoring app that you can watch to see what the temps are when it crashes. If it happens quicker at 100% than at 10%, as you said, to me that's a clue. It could also be a hard drive, but that's not as likely with today's SSD and NVMe drives.
In reply to GWGeorge007:
I ran it without the 4090 for some time earlier this year and it had the same problem. The ATI/AMD GPU is the integrated GPU. I usually use that one for display out but I could try disabling it and running it again and seeing if the behaviour changes.
In reply to mikey:
I have been monitoring and logging temps with HWiNFO64 and none of them have been particularly out of line. The CPU sits at the boost limit temperature of 89 °C pretty much the entire time; however, I am told that is pretty much by design for this processor. The memory tends to sit in the low 60s, as does the main M.2 SSD. I have long suspected that one of the PMICs on the memory modules is either overheating or otherwise dropping out; however, I have no evidence of this. HWiNFO does keep track of whether the PMIC reports an over-temperature or overload condition. That being said, I don't know if it would be able to save that to the log correctly before the computer crashes. I have also run memory stress test programs that got the memory much hotter, 75+ °C, and they kept going.
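For checking what a sensor log recorded just before a crash, a small script can pull the last few rows of a CSV file and flag any values above chosen limits. This is a rough Python sketch under assumptions: the column names (`timestamp`, `cpu_temp_c`, `dimm_temp_c`) and the limits are hypothetical and would need to be replaced with whatever headers your exported log actually uses.

```python
import csv
from collections import deque


def tail_rows(path, n=5):
    """Return the last `n` data rows of a CSV file as dicts."""
    with open(path, newline="", errors="replace") as f:
        return list(deque(csv.DictReader(f), maxlen=n))


def over_limit(rows, limits):
    """List (timestamp, column, value) for every reading above its limit.

    `limits` maps a column name to a numeric threshold. Cells that are
    missing or not numeric are skipped rather than treated as errors.
    """
    hits = []
    for row in rows:
        for column, limit in limits.items():
            try:
                value = float(row.get(column, ""))
            except (TypeError, ValueError):
                continue
            if value > limit:
                hits.append((row.get("timestamp", ""), column, value))
    return hits


if __name__ == "__main__":
    # Hypothetical file name, column names and thresholds.
    last = tail_rows("crash_log.csv", n=10)
    for hit in over_limit(last, {"cpu_temp_c": 89.0, "dimm_temp_c": 70.0}):
        print(hit)
```

If the final rows look normal, that does not rule out a fault; it may only mean the last sample was taken before the condition appeared.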
I don't want to celebrate prematurely; however, I have been running the system for several hours under conditions that would previously have caused a crash within an hour, and it has not crashed yet. Disabling the iGPU seems to have worked. It even seems to have solved some benign graphical glitches that I otherwise had. I will continue testing, of course, and see if I can get several days without a crash.
In reply to Cyrix:
WOO HOO!!!
If you really need a fancy screen output, you can always add a second standalone GPU and just exclude it from all the projects so it doesn't crunch.
In reply to Cyrix:
Okay, that's great news. Keep us posted on your progress.
Proud member of the Old Farts Association
Celebration premature; I managed to get it to crash while running a bunch of other stuff at the same time.
In reply to mikey:
I can't really add any more parts since this is a very small form factor build. The only reason I was using the built-in GPU for display out is that it reduces the idle power draw by a couple of watts. I never allowed work to be assigned to it in the first place, and whenever I was having problems, it was when I was running CPU-only tasks.