GPU Not Working

Brent
Joined: 10 Aug 08
Posts: 28
Credit: 6641989
RAC: 0
Topic 226633

Hello, all

I need some help, please. I am running a Dell XPS 8900 computer with Windows 10 Home 64-bit, an Intel Core i7-6700 CPU @ 3.40GHz (4 cores, 8 logical processors), 64 GB RAM, and an NVIDIA GeForce GT 730 2GB DDR3 video card. I believe everything is up to date. However, lately I am having trouble running GPU projects and getting credit listed for these tasks. In the past Collatz, Einstein and GPU Grid did work, but no more. Asteroids@home is currently down and Rosetta@home never did work right. I need help getting these projects working on my system, or suggestions for other GPU projects that will work on it. I don't have the money at this time to upgrade my PC or video card (and hopefully shouldn't need to). At this point, my only options are to run only CPU projects or give up on BOINC completely.

mikey
Joined: 22 Jan 05
Posts: 11888
Credit: 1828033366
RAC: 208115

This part of the error message says "Network access is denied". Do you perhaps have your antivirus set to block Einstein? An easy setup for BOINC and a/v's is to allow the whole group of BOINC folders (in Windows, C:\ProgramData\BOINC). That way every BOINC project can connect as needed and most false-positive notifications are excluded, but any real virus will still be caught as it tries to infect other folders.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6910071637
RAC: 21942229

@BRENT:

What a/v have you installed? Have you done a "full scan"?

What tool are you using to monitor GPU heat status? Maybe it's just overheating and dying!

Your GT 730 GPU should be doing fine on ALL projects if not overloaded or overheating!

Relax and have a beer on me ...

Brent
Joined: 10 Aug 08
Posts: 28
Credit: 6641989
RAC: 0

I am using Avast Free (main) plus Malwarebytes and Super Anti Spyware. I have used all 3 for some time without any problems. When this problem first started I ran all 3 scans.

I use Open Hardware Monitor to monitor temperatures. Most recent peaks are GPU 67 and CPU 80.

This morning I also cleaned out the dust with a vacuum cleaner. I am currently running this project and will see if any of this helped (fingers crossed).

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

If your motherboard has another "GPU ready" PCIe slot free, you could also try moving the GPU into a different slot to see whether that makes any difference.

Brent
Joined: 10 Aug 08
Posts: 28
Credit: 6641989
RAC: 0

Nope!! I got another Computation Error. I am sure I did not adjust my antivirus to block Einstein, nor do I believe that moving the GPU to a different slot will make any difference. Until we find something else that works for Einstein, can someone suggest another GPU application I can try for BOINC?

Thanks, Brent

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Brent wrote:
I am using Avast Free (main) plus Malwarebytes and Super Anti Spyware. I have used all 3 for some time without any problems. When this problem first started I ran all 3 scans.

Looks like your computer is running Microsoft Windows 10 Core x64. If that edition includes Microsoft Defender then I would absolutely uninstall and completely remove Avast Free and Super Anti Spyware.

One more scenario to try out:

Adjust your project settings so that your computer would download for example 10 more Einstein FGRPB1G GPU tasks. Then disconnect your computer from internet. Exit all those antivirus softwares completely and make sure they are not running any kind of background services. Then let BOINC just crunch those GPU tasks and see how it goes.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109378496202
RAC: 35980788

Brent wrote:
I need some help please.

I seem to recall some pearl of wisdom along the lines of "The {deity of choice} helps those who help themselves" :-).  Sorry - couldn't resist :-).

A good place to start is the tasks list for the computer having the problem.  That list is dynamic with older entries dropping off and newer entries being added.  At the moment, the oldest two entries are compute errors whilst the next one is a successfully completed task.  I would expect older tasks may drop off relatively soon so what I see now may not last for very long.

If you click on the task ID link for the oldest task, you get to see the stderr output returned to the project. It indeed says (right near the start) that "Network access is denied".  The people suggesting that this is the reason should really know better than to trust this.  They are just giving you bogus recommendations.

The science app does indeed produce an error code when tasks fail.  The code value is specific to the app and only of any use to the app developer (or someone who actually has a list of the code meanings).  The app is designed for all platforms and codes used are NOT Windows codes.  Stupid Windows thinks they are and so gives rubbish reasons.  You just ignore these Windows messages and scan down to near the end to find the point where the task actually failed.

In the case of this compute error I looked at, the error message near the bottom was,

error in opencl_qsort
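
The "scan down to near the end" advice can be sketched in a few lines of Python. This is only an illustration, assuming the stderr output of a task has been copied into a string; the function name `last_error_line` and the keyword pattern are hypothetical, and the sample text contains a placeholder line that stands in for normal progress output.

```python
import re

# Hypothetical keyword pattern; real app messages may use other wording.
ERROR_PATTERN = re.compile(r"\b(error|failed)\b", re.IGNORECASE)


def last_error_line(stderr_text):
    """Return the last non-empty line that mentions an error, or None."""
    for line in reversed(stderr_text.splitlines()):
        stripped = line.strip()
        if stripped and ERROR_PATTERN.search(stripped):
            return stripped
    return None


# The first and last lines are quoted from the thread; the middle line is a
# made-up placeholder for everything the app prints in between.
sample = """\
Network access is denied.
(placeholder for normal progress output)
error in opencl_qsort
"""

print(last_error_line(sample))
```

Searching from the bottom up means the misleading Windows text near the top is never considered unless nothing else matches.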

In the full tasks list I linked to, notice that the oldest 8 tasks had names starting with "LATeah4013L....".  Notice also that of these 8, only four completed successfully (around 18K secs) whilst the other 4 all failed at much shorter times.  The GT730 is an old, low end GPU that is not really good enough for the difficulty of these tasks so there is quite a high failure rate.

The tasks that follow the older 8 have a different name - starting "LATeah3011L....".  Notice they ALL fail around 12.5K secs.  Now here's the thing.  We have been running the eah4 series of tasks for quite a long time.  They seem to have now finished (quite a few observations about this in recent posts) and have been replaced by the eah3 series which compute quite differently and in significantly shorter time.  Both series are crunched by the same FGRPB1G app.
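
The comparison between the two series can be tallied with a short Python sketch. The task records below are hypothetical: only the `LATeah4013L` and `LATeah3011L` prefixes and the 4 good / 4 failed split for the older series come from the description above, while the name suffixes and the number of eah3 tasks are made up for illustration.

```python
import re
from collections import Counter


def series_of(task_name):
    """Return 'eah3', 'eah4', ... from a name like 'LATeah4013L...', else None."""
    match = re.match(r"LAT(eah\d)", task_name)
    return match.group(1) if match else None


def tally(tasks):
    """Count (series, outcome) pairs for an iterable of (name, outcome) tuples."""
    return Counter((series_of(name), outcome) for name, outcome in tasks)


# Hypothetical records: the suffixes and the eah3 count are invented.
tasks = (
    [(f"LATeah4013L_example_{i}", "ok") for i in range(4)]
    + [(f"LATeah4013L_example_{i}", "error") for i in range(4, 8)]
    + [(f"LATeah3011L_example_{i}", "error") for i in range(3)]
)

for (series, outcome), count in sorted(tally(tasks).items()):
    print(series, outcome, count)
```

A split like this makes it easy to see whether failures are spread across all work or concentrated in one series.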

For many years, FGRPB1G tasks have had two stages of computation - the main stage up to 89.997% completed and a followup stage where all the candidate signals from the main stage are assembled into a "top ten" list for sending back to the project.  For eah4 series, this took an extremely short length of time (just grab the 10 best with no additional processing).  For the eah3 series, there is a significant amount of extra time needed.

There has been no announcement about this but my guess is that there is extra double precision GPU processing being done on the top ten candidates before they are being returned.  No doubt there is also back and forth data transfers between CPU and GPU to support this so the time taken is not just a factor of the speed/capability of the GPU's double precision hardware.  If there is a slow CPU and slow data transfer, this would also add to the total time the whole procedure takes.

So here's my guess.  For your failed eah3 series tasks, the ~12.5K secs time is probably just for the main stage of computation.  The error message for one of these failed tasks was:-

ERROR: /home/bema/source/fermilat/src/bridge_fft_clfft.c:1073: clFinish failed. status=-36

The failure at this point would seem to indicate that your GPU is too weak to even start the much more intensive double precision work in the followup stage.  You need a better GPU to complete these tasks.
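
For reference, the number in that message can be looked up by name. The sketch below assumes the `status` value is the raw OpenCL return code from `clFinish` (the message does not say so explicitly); the table is a small subset of the codes defined in the Khronos `CL/cl.h` header, and the function name `describe_status` is hypothetical. The name only identifies the code and says nothing by itself about the underlying cause.

```python
import re

# Subset of status codes from the Khronos OpenCL header (CL/cl.h).
OPENCL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -36: "CL_INVALID_COMMAND_QUEUE",
}


def describe_status(error_line):
    """Extract 'status=<n>' from a log line and map it to an OpenCL name."""
    match = re.search(r"status=(-?\d+)", error_line)
    if match is None:
        return None
    code = int(match.group(1))
    return OPENCL_STATUS_NAMES.get(code, f"unlisted status {code}")


line = (
    "ERROR: /home/bema/source/fermilat/src/bridge_fft_clfft.c:1073: "
    "clFinish failed. status=-36"
)
print(describe_status(line))
```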

Cheers,
Gary.

Brent
Joined: 10 Aug 08
Posts: 28
Credit: 6641989
RAC: 0

Richie wrote:

Brent wrote:
I am using Avast Free (main) plus Malwarebytes and Super Anti Spyware. I have used all 3 for some time without any problems. When this problem first started I ran all 3 scans.

Looks like your computer is running Microsoft Windows 10 Core x64. If that edition includes Microsoft Defender then I would absolutely uninstall and completely remove Avast Free and Super Anti Spyware.

One more scenario to try out:

Adjust your project settings so that your computer would download for example 10 more Einstein FGRPB1G GPU tasks. Then disconnect your computer from internet. Exit all those antivirus softwares completely and make sure they are not running any kind of background services. Then let BOINC just crunch those GPU tasks and see how it goes.

I am not going to erase my virus protection without something to convince me this will fix everything. I would rather delete Microsoft Defender!

Also, for what it is worth, from 10-29-21 to 12-18-21 Einstein was running OK. During this time I did not add or change any HW or SW except for software security updates.

Then things went bad with Einstein from 12-19-21 to the present. Einstein also went bad from 10-23-21 to 10-28-21 (that's as far back as my data goes). During this entire 60-day time-frame, everything was working fine on my system except for Einstein and other BOINC GPU projects. And you really want me to wipe out what is protecting my system! I will drop out of BOINC completely before I would do this! I think Einstein really needs to look at what they changed during these bad time-frames and fix their end of this issue.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Brent wrote:
And you really want me to wipe out what is protecting my system!

No, I said "If that edition includes Microsoft Defender". If your Windows edition is running Defender then in 2021 Q4 you won't gain any kind of advantage from running those two additional "antivirus" softwares. All they could do is cause problems while interfering with the system. Malwarebytes still might be useful in some circumstances though.

In addition, I was suggesting to do it temporarily while your computer wasn't connected to internet and was crunching only.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3681
Credit: 33814281182
RAC: 37823679

Running multiple AV softwares is a recipe for conflict and headaches.

You should only run one application for this. Not 3.
