Still receiving AMD tasks despite being disabled

Bill
Joined: 2 Jun 17
Posts: 38
Credit: 249,301,674
RAC: 279,063

Gary Roberts wrote:
Perhaps you should report your observations over on the BOINC boards to see if someone (like Richard Haselgrove for example) might take a look in the BOINC code to see how BOINC comes up with its 'available memory' figure.  If BOINC reported a 2GB limit rather than the much larger figure, then the Einstein scheduler (hopefully) would know not to supply tasks that need more memory.

Someone explained the memory situation to me in a post on another forum long ago.  I don't remember how it worked.  I don't think BOINC is doing something wrong, but I can ask.

Gary Roberts wrote:
Are you able to disable the internal GPU and would that solve the problem for you?

I didn't think that was an option; I don't recall seeing that in BIOS.  Regardless, I am not interested in disabling the integrated GPU.  First, it used to work fine for E@H.  Second, it works just fine crunching MW@H.  I'd rather have it crunch on any project than no projects...it actually does pretty well for such a small GPU.

Gary Roberts wrote:
Direct question - did you ever see these listed on the tasks tab of BOINC Manager?  Or was it just a list of errors you saw on the website?

Funny enough, after I read this question, I went and double-checked BOINC Manager.  I currently have a slug (20+) of AMD GPU tasks in the queue for E@H.  They have downloaded, but have not been crunched yet.  I tried forcing them to crunch by suspending all other projects; they didn't start (that is to say, the AMD APU was idle).  I then suspended all other tasks in E@H manually, and still, none of the AMD GPU tasks ran.  I also disabled the only other project running AMD GPU tasks (MW@H), closed BOINC Manager & Client, and after restarting BOINC, it still didn't run the AMD GPU tasks.

Ultimately, all I'm really wondering is if the "Resource Settings" for this project is not working correctly.  It says

Use AMD GPU:
Request AMD GPU tasks from this project.

And I have selected "No", but I am still getting AMD GPU tasks.  Either this setting is misleading, or broken, or I'm doing something wrong.  If something is broken, I'm not sure if it is something that the BOINC developers need to fix, or if it is project specific.

Thank you for the help, Gary.  We're making some small progress, but we are not there yet.

Edit: One more thing to clarify:  non-preferred applications is set to NO.

Edit #2:  Ah...another rub...since I have restarted, these tasks say "GPU missing, Ready to start (1 CPU + 1 AMD/ATI GPU)"

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,845
Credit: 109,864,905,420
RAC: 30,376,765

Bill wrote:
... I went and double-checked BOINC Manager.  I currently have a slug (20+) of AMD GPU tasks in the queue for E@H.  They have downloaded, but have not been crunched yet.  I tried forcing them to crunch by suspending all other projects; they didn't start (that is to say, the AMD APU was idle).

I've checked what the website says about the new tasks (there are 36 showing) and by clicking on the task ID link for the very first one it showed

Application: Gamma-ray pulsar binary search #1 on GPUs v1.22 (FGRPopencl-nvidia) windows_x86_64

In other words, the app name contains 'nvidia' so these tasks were intended for your nvidia GPU and NOT the AMD APU.  This is the same as what I saw back when I first started looking at this issue.  I don't understand why BOINC thinks these are for the internal GPU because the scheduler certainly didn't send them for that device.

It's true that the same task could be processed by either type of GPU.  However, it seems the scheduler was responding to an nvidia work request.  What application name do you see listed on the tasks tab in BOINC Manager?  There will be either 'ati' or 'nvidia' in the app name which will tell you what the client thinks.
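
For anyone who would rather not eyeball each task, the 'ati' or 'nvidia' check can be sketched in a few lines of Python. This is only an illustration: the function name `gpu_vendor_from_app` is made up for this post, and it assumes the application line has the same shape as the ones shown on the website, with the plan class in parentheses and the vendor as its last hyphen-separated part.

```python
import re


def gpu_vendor_from_app(app_line):
    """Return 'nvidia' or 'ati' based on the plan class in parentheses, else None."""
    groups = re.findall(r"\(([^()]*)\)", app_line)
    if not groups:
        return None
    # Only look at the last part of the plan class, e.g. 'GW-opencl-ati' -> 'ati'.
    # Searching the whole line would be wrong: 'Gravitational' contains 'ati'.
    vendor = groups[-1].lower().rsplit("-", 1)[-1]
    return vendor if vendor in ("nvidia", "ati") else None


if __name__ == "__main__":
    lines = [
        "Gamma-ray pulsar binary search #1 on GPUs v1.22 (FGRPopencl-nvidia) windows_x86_64",
        "Gravitational Wave search O2 Multi-Directional GPU v2.09 (GW-opencl-ati) windows_x86_64",
    ]
    for line in lines:
        print(gpu_vendor_from_app(line), "<-", line)
```

The point of parsing only the parenthesised part is that a plain substring search for 'ati' would match every Gravitational Wave task regardless of the device it was sent for.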

Bill wrote:
Ultimately, all I'm really wondering is if the "Resource Settings" for this project is not working correctly.  It says "Use AMD GPU: Request AMD GPU tasks from this project."  And I have selected "No", but I am still getting AMD GPU tasks.  Either this setting is misleading, or broken, or I'm doing something wrong.  If something is broken, I'm not sure if it is something that the BOINC developers need to fix, or if it is project specific.

There is nothing obviously wrong with disabling the APU because the scheduler isn't sending you "ati" branded tasks.  It would appear that if your BOINC Manager is saying (in the app name) that the tasks are "ati" type and not "nvidia" type, that's something I've never experienced before and have no idea how or why that could have happened.

The scheduler quite clearly says it sent you nvidia tasks.  If they've somehow changed along the way, I have no clue.

Bill wrote:
Edit #2:  Ah...another rub...since I have restarted, these tasks say "GPU missing, Ready to start (1 CPU + 1 AMD/ATI GPU)"

I can (sort of) understand this bit.  Your APU is truly 'not available to BOINC' so I guess that type of GPU should be 'MISSING' :-).  Jokes aside, all I can suggest is to go through your full suite of local and website preferences very carefully to see if you can spot something that might be doing this.  Maybe someone familiar with the BOINC code could see how or why this might have happened.  It's beyond me.

I have looked at your website tasks list and have seen all I picked (about 5) showing 'nvidia'.  Do you have 36 of these on your host and do they all show "ati" on the App tab and "GPU MISSING" for status on the Tasks tab?

Cheers,
Gary.

Bill
Joined: 2 Jun 17
Posts: 38
Credit: 249,301,674
RAC: 279,063

Gary, I think you are mistaken. I'm definitely receiving AMD GPU tasks for E@H. Task 1042881150 is one of them:

Task 1042881150
Name: h1_0416.55_O2C02Cl4In0__O2MDFS2_Spotlight_416.75Hz_244_0
Workunit ID: 509350749
Created: 11 Dec 2020 8:59:49 UTC
Sent: 12 Dec 2020 2:12:54 UTC
Report deadline: 18 Dec 2020 8:59:50 UTC
Received: 1 Jan 1970 0:00:00 UTC
Server state: In progress
Outcome: ---
Client state: New
Exit status: 0 (0x00000000)
Computer: 12767141
Run time (sec): 0.00
CPU time (sec): 0.00
Peak working set size (MB): 0
Peak swap size (MB): 0
Peak disk usage (MB): 0
Validation state: Initial
Granted credit: 0
Application: Gravitational Wave search O2 Multi-Directional GPU v2.09 (GW-opencl-ati) windows_x86_64

Wait a minute...how did I receive this file 1 Jan 1970?? The clock looks right on this computer, and I haven't played with it.

A few other things to add. First, I realized I did have something in my cc_config to exclude E@H AMD GPU work. I must have had this in there a long time ago and forgot about it:

<exclude_gpu>
<url>http://einstein.phys.uwm.edu/</url>
<type>ATI</type>
<app>einstein_O2MDF</app>
</exclude_gpu>

Second, I suspect I will ultimately need to send in a bug report on Github. I'm just trying to first confirm I'm not messing up anything obvious. If it isn't me, I am trying to determine if this is an E@H or BOINC bug so the appropriate people know.

I have had the sched_op_debug flag turned on since this afternoon. I can share that information with anyone interested, although I am not sure what information is needed.

Bill

Bill
Joined: 2 Jun 17
Posts: 38
Credit: 249,301,674
RAC: 279,063

Okay, I think I see the problem now.  When I load up BOINC Manager, it says that the computer has preferences set to 'work'.  When I check the location on the website, it says it is set to 'home'.  Somehow, these two are different.  Hitting update in the manager doesn't seem to change it.

I'll have to troubleshoot this later.  I have a laptop running E@H exclusively, so I want to see how that is set up, and how any changes may affect it.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,845
Credit: 109,864,905,420
RAC: 30,376,765

Bill wrote:
Gary, I think you are mistaken.

Quite likely.

Your message mentioned a 'slug' of AMD GPU tasks.  I interpreted that to mean they were all for AMD since you didn't make a comment about nvidia.  I mentioned that I looked at the very first one and found nvidia.  The website loads pretty slow for me so I wasn't too keen to look at a whole bunch.  It would appear then that you had a mixture of new task types.  I just again looked at the topmost task in your "In progress" group (159 tasks) and again it shows 'nvidia' - so you still must have both.

It should be much easier for you to scan the tasks tab in BOINC Manager to see how many of each type you have.  It's a pain to try to look at them all, one at a time, through the website.  I can't imagine why you still get AMD tasks.  However, I have no experience with trying to use the exclude mechanism but others seem to use it OK.

Bill wrote:
Wait a minute...how did I receive this file 1 Jan 1970??

It's nothing to do with your clock.  The label for that field is "Received:" which is not when you received it (that's the field labeled "Sent:") but rather the value is a 'placeholder' for when the project receives the completed result back (in the future).  The field should be blank (or perhaps "--") for now, but it obviously contains a zero.  In Unix time format, time zero is the 'epoch' which is midnight on Jan 1st 1970 - I remember it well, so well that I know exactly what I was doing at that time :-) ;-) :-).  Jokes aside, it's just something reading that field as zero and changing the value to a human readable time and date - just like all the other fields which have also been converted from numeric seconds from the epoch to the date they now show.
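
To illustrate the epoch point, here is a minimal Python sketch that turns a seconds-since-epoch value into the style of date the website shows. The function name `format_boinc_time` and the month table are invented for this example and are not taken from the BOINC web code; the only assumption is that the stored value is a plain count of seconds in UTC.

```python
from datetime import datetime, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_boinc_time(seconds):
    """Format seconds since the Unix epoch like '11 Dec 2020 8:59:49 UTC'."""
    t = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return (f"{t.day} {MONTHS[t.month - 1]} {t.year} "
            f"{t.hour}:{t.minute:02d}:{t.second:02d} UTC")


if __name__ == "__main__":
    # A 'Received' field that has not been filled in yet holds zero.
    print(format_boinc_time(0))           # 1 Jan 1970 0:00:00 UTC
    # The 'Created' value from task 1042881150 quoted earlier in the thread.
    print(format_boinc_time(1607677189))  # 11 Dec 2020 8:59:49 UTC
```

So a zero in that field comes out as exactly the date Bill saw, whatever the clock on the host says.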

Bill wrote:
... I realized I did have something in my cc_config to exclude E@H AMD GPU work.

What you have listed looks fine but there must be something wrong somewhere, and perhaps it's been like that for quite a while.  The first thing I would do is to have your output with the extra logging flags included, showing more AMD tasks arriving.  Then create a post over on the BOINC boards with all the evidence and see if someone like Richard can work out why the cc_config exclude option isn't stopping the AMD tasks from being requested.  You should probably save the sched_request and sched_reply files as extra evidence for what was requested and what was delivered when you get a cycle that delivers AMD GPU tasks.  Maybe that will be an easier route than going straight to Github.  Richard will know what's best to do.
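
If it helps, saving those files can be scripted so nothing gets overwritten by the next scheduler contact. The following is a rough Python sketch, not an official BOINC tool. It assumes the client keeps files named like sched_request_<project>.xml and sched_reply_<project>.xml in its data directory; the function name `save_scheduler_files` and both directory arguments are hypothetical, so point them at wherever your own installation lives.

```python
import shutil
import time
from pathlib import Path


def save_scheduler_files(data_dir, evidence_dir):
    """Copy scheduler request/reply files into evidence_dir with a timestamp prefix."""
    data_dir = Path(data_dir)
    evidence_dir = Path(evidence_dir)
    evidence_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = []
    # Assumed file name patterns; adjust if your client names them differently.
    for pattern in ("sched_request_*.xml", "sched_reply_*.xml"):
        for src in sorted(data_dir.glob(pattern)):
            dst = evidence_dir / f"{stamp}_{src.name}"
            shutil.copy2(src, dst)  # copy2 also keeps the modification time
            saved.append(dst)
    return saved
```

Run it straight after a scheduler contact that delivers the unwanted tasks, so the saved copies match the event log lines you post.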

Cheers,
Gary.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,140
Credit: 2,767,845,852
RAC: 990,907

Hi guys. Bill pinged me overnight and asked me to come and have a look at this thread - an extra pair of eyes.

I've skimmed through the whole thread, but not read it in great detail. Here are my initial responses, based on similar problems which have arisen in the past.

"Einstein sent ..." This always worries me. Can't SEND unless you REQUEST - what did you ask for? <sched_op_debug> logs are small, but very helpful - they show which device wanted work, and how much. I don't think that's the problem here.

Error lists. I see that Einstein still hasn't implemented the full set of filters. Inconclusive, anyone? Show count by task type? Show device type with application name? But I found task 1041068244 - we can use that.

Aborted by user. David Anderson introduced a bunch of new error codes in 2012, but forgot to handle them properly in the BOINC php web sources. 201 is one of those - it should be 'Aborted by client'.

Application. The one I picked says "Gravitational Wave search O2 Multi-Directional GPU v2.07 (GW-opencl-ati)". Note both 'opencl' and 'ati'.

Host. It has "AMD AMD Radeon(TM) Vega 8 Graphics (7204MB)", and no more. BOINC websites would show details of the OpenCL version supported, if any.

EXIT_MISSING_COPROC. I'm going to stick my neck out here, and suggest that the real problem might be 'error_missing_opencl_driver' - Windows 10 had a bad habit (at least in the early days) of supplying incomplete driver packages. What do BOINC's startup messages say about the GPUs detected? The server should check that both the device, and the driver, reach minimum spec. Otherwise, 'ATI present, but driver absent' would have the effect you're seeing.

Tasks sent to unselected device. In generic BOINC, people have commented that 'Run test applications?' can over-ride preferences for specific devices. But a mis-matched venue setting would have the same effect.

exclude_gpu. This works - taking local control overcomes most anomalies. Always refer to the user manual when suggesting or making changes. The startup event log will confirm that you've got it right (or not...)
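
As a way of double-checking a cc_config.xml before restarting the client, here is a small Python sketch that lists the exclude_gpu entries and reports whether sched_op_debug is switched on. It is an illustration only. The sample file is a guess at what a complete file might look like around the fragment Bill posted, on the understanding from the BOINC client configuration documentation that exclude_gpu belongs inside the options section and log flags inside log_flags. The function name `summarize_cc_config` is made up, and the startup event log remains the real confirmation.

```python
import xml.etree.ElementTree as ET

# Assumed layout: Bill's fragment placed inside <options>, plus one log flag.
SAMPLE = """<cc_config>
  <log_flags>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
  <options>
    <exclude_gpu>
      <url>http://einstein.phys.uwm.edu/</url>
      <type>ATI</type>
      <app>einstein_O2MDF</app>
    </exclude_gpu>
  </options>
</cc_config>"""


def summarize_cc_config(xml_text):
    """Return the exclude_gpu entries and the sched_op_debug state found in xml_text."""
    root = ET.fromstring(xml_text)
    excludes = []
    # Only entries nested under <options> are collected; a stray
    # <exclude_gpu> elsewhere in the file would not show up here.
    for node in root.findall("./options/exclude_gpu"):
        excludes.append({child.tag: (child.text or "").strip() for child in node})
    flag = root.findtext("./log_flags/sched_op_debug", default="0").strip()
    return {"exclude_gpu": excludes, "sched_op_debug": flag == "1"}


if __name__ == "__main__":
    print(summarize_cc_config(SAMPLE))
```

An empty exclude_gpu list from a file you expected to contain one is a hint that the block is sitting in the wrong place or the file is not well-formed.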

I think that'll do to get you started... :-)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,845
Credit: 109,864,905,420
RAC: 30,376,765

Bill wrote:
Okay, I think I see the problem now.  When I load up BOINC Manager, it says that the computer has preferences set to 'work'.

When I posted my previous message, I didn't hang around.  I was already late.  I hadn't seen this latest message of yours (or Richard's later comments) until this morning.

Back on 8th Dec, in this message, you clearly stated that the 'location' wasn't the problem.  I have been relying on the accuracy of that since you seemed very definite in what you said - ie "Yes, the computer is set to home, and that is the location where I have the AMD GPUs disabled."  More fool me for not insisting that you check it again.  Nobody but you can see the location so we rely on what you say.  When we look at a particular host, that information isn't shown to us.

For future reference, just go to your list of computers on the website and click the 'details' link for a host whose location you want to check or change.  At the bottom of that details page there is a drop down menu where you can set the location you want. If you change it to the desired setting, that whole page will update with a highlighted message at the top telling you that the new location will be in force when the host in question next contacts the scheduler.  Use an 'update' on your host to force the process if you desire.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,845
Credit: 109,864,905,420
RAC: 30,376,765

Richard Haselgrove wrote:
Hi guys. Bill pinged me overnight and asked me to come and have a look at this thread - an extra pair of eyes.

Hi Richard, I'm very sorry that your time has been wasted.

It's really my fault for not insisting on more checks that the location of the host and the location of the preference set being modified truly were the same.

Cheers,
Gary.

DanNeely
Joined: 4 Sep 05
Posts: 1,364
Credit: 3,562,358,667
RAC: 241

Gary Roberts wrote:

Bill wrote:
Okay, I think I see the problem now.  When I load up BOINC Manager, it says that the computer has preferences set to 'work'.

When I posted my previous message, I didn't hang around.  I was already late.  I hadn't seen this latest message of yours (or Richard's later comments) until this morning.

Back on 8th Dec, in this message, you clearly stated that the 'location' wasn't the problem.  I have been relying on the accuracy of that since you seemed very definite in what you said - ie "Yes, the computer is set to home, and that is the location where I have the AMD GPUs disabled."  More fool me for not insisting that you check it again.  Nobody but you can see the location so we rely on what you say.  When we look at a particular host, that information isn't shown to us.

For future reference, just go to your list of computers on the website and click the 'details' link for a host whose location you want to check or change.  At the bottom of that details page there is a drop down menu where you can set the location you want. If you change it to the desired setting, that whole page will update with a highlighted message at the top telling you that the new location will be in force when the host in question next contacts the scheduler.  Use an 'update' on your host to force the process if you desire.

Stuff like this is ultimately why I stopped trying to control anything by location preferences years ago.  It might be 'easier' than setting something up in BAM or editing XML config files locally; but my lifetime record is 0-lots for remembering I screwed around with anything that way when trying to figure out why some change to my default settings isn't sticking on one of my systems.

If I could rename them to something like "CPU only", "NVidia Boxes", "AMD Boxes", etc maybe; but home, work, etc just fade into the memory hole because they have no relation to actual settings.

Bill
Joined: 2 Jun 17
Posts: 38
Credit: 249,301,674
RAC: 279,063

Sorry about that, everyone.  I didn't catch that the location setting was displayed in the event log, my bad.  I don't understand why there was a discrepancy in the locations for the same computer.  I don't recall if I had ever changed the location, but if I did, it would have been months ago.  Since then there have been plenty of reboots, updates, refreshes, etc., so I would have thought the setting on the server would have dictated the location when the client reached out.

On the plus side, I have not had a problem since!  I'm considering this issue closed.  I hope there are no ill feelings all around.
