Since installing my new video card, I have been having problems with the GPU overheating. I run speedfan, and every time I would restart after the screensaver had been running, the temp of my GPU would be sky high. The icon in the program was a ball of flame! As soon as the system was up, this temperature would start to go back down towards normal.
I tried all kinds of changes to the settings, both enabling/disabling the GPU, changing the amount of time before the screen blanks, everything else I could think of, but nothing made any difference.
Unless I can find a way around this, then my days with BOINC are pretty much over - after close to 6 million units spread over 4 projects...
I don't understand why it overheats more then, rather than when the PC is running and displaying all the normal stuff. The video card fan is working, and I also have working fans on the CPU and the memory. There is a good-quality case fan blowing air out the back of the case constantly. The video card is not jammed between 2 other cards - the fan side of it is open to the air.
This is not a BOINC issue alone, since it also behaves like this if I just run one of the normal windows screensavers for a couple of minutes before blanking the screen. Currently, I don't even run a screen saver, just blank the screen after a couple of minutes.
I have good airflow around the case - I rearranged everything after my last 2 PCs melted their motherboards... The PC sits up on top of the desk, rather than down in the 'proper' shelf. I try to stop the cat climbing on it, too...
Does anyone have any ideas? I hate to quit working on this stuff after so long - about 15 years now, I think. Ever since my 486 box...
Try just running the GPU fan at 100% regardless of temperature, i.e. irrespective of SpeedFan smarts. My reading of the SpeedFan blurb indicates that it is the user's responsibility to deduce the thermal behaviour of a given system : so maybe yours just needs that full setting. If that doesn't work then you need other cooling apparatus ?
Cheers, Mike.
( edit ) Also you have a Windows 7 machine and the SpeedFan site indicates a problem with that target system with a version of their software. Do you have SpeedFan v4.48 ?
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
No, I have speedfan 4.49. Also, I should have explained, but I don't control fan speeds with speedfan. Everything runs 100%, and I just use the program to monitor temperatures is all.
I don't really know what else I can do, without some kind of 'heroic' measures. There is plenty of airspace in the case, and all the fans run constantly. I clean out dust pretty regularly. I don't run graphic-intensive games or anything like that. It's just a 'standard' PC, really. I have upgraded the cooling, but I am not sure what else I can do to it. I think that the case fan took the last fan connector available on the MB. There may be adaptors available to power a fan from a 'normal' power connection - I think that is how I power the fans over the memory DIMMS - but there isn't really anywhere to connect another fan physically...
As I write this, the CPU fan is running at 1675 RPM, the chassis fan is running at 1155 RPM, and the only other fan registering, which I assume is the GPU, is running at 1150 RPM. All the fan readings fluctuate by a few degrees every few seconds, but not very much. The GPU temperature is currently 108 degrees F, the Core temp is 74 degrees, and everything else is somewhere in the 90-degree range.
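Since GPU temperatures are usually quoted in Celsius, it may help to convert the Fahrenheit readings above before comparing them with other figures. This is only a minimal sketch of the standard conversion; the function name and the dictionary of readings are made up for illustration and are not part of SpeedFan.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Readings taken from the post above (labels are illustrative only).
readings_f = {"GPU": 108.0, "Core": 74.0}

for name, deg_f in readings_f.items():
    deg_c = fahrenheit_to_celsius(deg_f)
    print(f"{name}: {deg_f:.0f} F is about {deg_c:.1f} C")
```

So the 108 degrees F reported for the GPU is roughly 42 degrees C.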
Is there anything else readily available that can monitor temperatures for this stuff? I don't want to just disregard speedfan warnings, especially given my history of melting PCs! It really does not seem logical, though. The machine is running now, while I do email and general stuff - pretty much how it is used most of the time if I am not working on my music stuff. All the temperatures are completely stable - I don't think there has been a fluctuation of even a degree while I have been writing this. Why on Earth should the temperature of *any* component start to rise when I *stop* working on the machine? Earlier today, I ran the Malwarebytes program while I was doing other stuff. Their program is throttled back to not use more than 50% of my CPU, whereas at the moment the CPU usage is a whopping 4%. The machine was not running any hotter at that time than it is now...
I saw an ad recently in a UK magazine for a 'deluxe gaming' case. This thing had about 5 doors down the front, each of which can hold a large case fan... If I had a few hundred bucks to spend, maybe I would invest in something like that. But I don't...
(A) Test the apparent behaviour of the GPU temp sensor. If you can roughly find out where it physically is on the video card then one can, say, spray coolant on it ( or not ) and note whether the figures are different. By default just spray the GPU's heatsink. There's a couple of non-conducting hydrocarbon based sprays about that will evaporate pretty rapidly on contact with hotness and thus produce coolness. DownUnda you can get these for around $15 AUD and they generally come with a narrow drinking-straw tube for precise placement of the spray.
(B) What's the threshold for SpeedFan displaying a ball-of-flame icon ?
(C) Get a non-contact IR thermometer, hopefully with a narrow field of view, and do (A) with that. DownUnda you can get these for around $50 AUD ....
Cheers, Mike.
Ok, I can try the first one, at least. I need to be careful, though, because spraying cold compressed air out of a can to try and blow dust out of the cooling fan on my *last* card did not go well. Hence, the new card! The sudden cold apparently seized something up, and the fan never ran well or quietly again.
I never thought about trying the spray while monitoring temperature, though...
The thermometer idea is also a good one, but I don't see me being able to do that for quite some time, unless I can borrow one from somewhere. It's not that it would cost a fortune, but I drive a school bus for my sins. We have 13 days of school left before breaking for the summer. While I *love* having 3 months off, if we don't drive, we don't earn... I have spent most of my spare cash for months on setting myself up with music/software/hardware additions to keep me occupied for the duration.
Phil.
Quote:
The sudden cold apparently seized something up, and the fan never ran well or quietly again.
Maybe the cold caused a contraction of the metal in a soldered joint and thus exposed/enhanced a crack in said joint. The other possibility is the ( allegedly silicone ) lubricant in some fan bearing designs - quite easy to disturb alas, so once it's off the correct surfaces it don't return.
Cheers, Mike.
A common reason for fan problems like this is that a small piece of magnetic material got blown or fell into the fan motor (they aren't well sealed), where it ends up between the stator and the magnets. Depending on the fan, removing the rotor and cleaning off the surface of the magnets will fix it.
I have installed the EVGA Precision monitoring software, and tried just a few experiments. Turning on the 'mystify' screensaver, and running the preview for less than a minute resulted in the GPU temp going up from 45 degrees C to almost 60 degrees. This is with the fan speed set to 100%. This is 2 different programs reporting the same effect, which suggests that there is *something* going on...
Phil.
I have juggled some parameters around to keep the card running at no more than the low 80s while running BOINC - and the company assures me that this is not a problem.
So, I'm baaa-aaak!
Now, I don't know if I have altered something during my struggles, but when I update my projects, I get
5/24/2013 8:00:58 AM | Einstein@Home | Not requesting tasks: project is not highest priority
As far as I can tell, the program is set to maintain 5 days of work. At this point, I have a single Einstein task being processed, with an estimate of less than 2 hours remaining, and no more waiting in the wings.
Anyone know how I can give these projects some priority?
Thanks,
Phil.
If your host is attached to several projects, then it's BOINC that adjusts their priority, based on the resource share of each project...
If the project is not at highest priority, it means that your host has done relatively a lot of work for Einstein (or at least BOINC thinks so) and is now focusing on other projects...
You can raise the resource share value for Einstein if you want your host to work more often for this project... Anyway, this is a long-term thing, so you won't see instant changes in the behaviour... Let it run for a couple of weeks after changing anything to allow it to settle...
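To make the resource share idea concrete, here is a rough sketch of how shares translate into a long-term target fraction of the host's work: each project's share divided by the sum of all shares. It only illustrates that ratio and is not BOINC's actual scheduling code. The three project names other than Einstein@Home and all the share values are hypothetical, chosen to match a host attached to 4 projects.

```python
def share_fractions(resource_shares):
    """Return each project's long-term target fraction: share / sum of shares."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("resource shares must sum to a positive number")
    return {name: share / total for name, share in resource_shares.items()}


# Hypothetical setup: four projects, all left at an equal share of 100.
shares = {
    "Einstein@Home": 100,
    "Project B": 100,  # hypothetical project name
    "Project C": 100,  # hypothetical project name
    "Project D": 100,  # hypothetical project name
}
before = share_fractions(shares)

# Raising Einstein's share to 300 moves its target from a quarter to a half.
shares["Einstein@Home"] = 300
after = share_fractions(shares)

print(f"Einstein@Home before: {before['Einstein@Home']:.0%}")
print(f"Einstein@Home after:  {after['Einstein@Home']:.0%}")
```

The fractions are targets over the long run, which is why a change takes a while to show up in which projects actually request tasks.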