Correlate WUs on server status page to GPU types

robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454479971
RAC: 8894
Topic 216440

When I look at the WUs available on the server status page, how do I tell which WUs are available for AMD GPUs, NVIDIA GPUs, etc.? Right now I have suddenly stopped processing AMD GPU WUs, but my interpretation of the server status page is that there are WUs available for AMD GPUs.

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1563236631
RAC: 50741

AFAIK, the server doesn't even have to differentiate between GPU brands, as it only sends text files with data (.dat). An operating-system-specific application that tells the GPU how to handle that data has been on your system since you joined the project (or since a new version was published). The driver then arranges the run. So the server status page just lists overall numbers. If my assessment is right, might you have changed your settings accidentally?

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

robl wrote:
When I look at the WUs available on the server status page, how do I tell which WUs are available for AMD GPUs, NVIDIA GPUs, etc.? Right now I have suddenly stopped processing AMD GPU WUs, but my interpretation of the server status page is that there are WUs available for AMD GPUs.

The direct answer would be something like 'Compare the server status page and the Applications page', and that's best done if one has the translation key between each application name and its abbreviation shown on the server status page.

A better way to track down the problem would be to ask oneself what has changed.
A good starting point would be the last server contact log for the computer in question, in this case host #12619116.
The log can be viewed by going to one's list of computers and clicking on the last contact time for the host.

From the above linked log one can ascertain the reason why no work for the GPU is sent (trimmed a bit for readability):

2018-09-24 16:26:43.1224 [version] parsed project prefs setting 'gpu_util_fgrp': 0.330000
2018-09-24 16:26:43.1224 [version] OpenCL GPU RAM required min: 803209216.000000, supplied: 51097600
2018-09-24 16:26:43.1225 [version] Checking plan class 'FGRPopencl1K-ati'
2018-09-24 16:26:43.1225 [version] parsed project prefs setting 'gpu_util_fgrp': 0.330000
2018-09-24 16:26:43.1225 [version] OpenCL GPU RAM required min: 1048576000.000000, supplied: 51097600

This part of the log indicates that BOINC reports the GPU as having about 48 MB of RAM (that's not a typo!), far below the minimum the plan classes require.
The list of computers claims the same.
I would start by either updating or reinstalling the GPU driver to see whether this corrects the memory detection fault.
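
For anyone who wants to check these numbers rather than eyeball them, here is a minimal Python sketch that pulls the "required min" and "supplied" values out of the log lines quoted above and converts them to MiB. The regular expression and the function name ram_checks are my own, written against the lines shown in this post only; other scheduler log formats may differ.

```python
import re

# Three lines copied from the server contact log quoted above.
LOG = """\
2018-09-24 16:26:43.1224 [version] OpenCL GPU RAM required min: 803209216.000000, supplied: 51097600
2018-09-24 16:26:43.1225 [version] Checking plan class 'FGRPopencl1K-ati'
2018-09-24 16:26:43.1225 [version] OpenCL GPU RAM required min: 1048576000.000000, supplied: 51097600
"""

# Assumption: the pattern is modelled only on the lines shown in this thread.
RAM_LINE = re.compile(r"GPU RAM required min: ([\d.]+), supplied: (\d+)")

MIB = 2 ** 20  # bytes per MiB


def ram_checks(log_text):
    """Return (required_mib, supplied_mib, enough) for each RAM check line."""
    results = []
    for line in log_text.splitlines():
        match = RAM_LINE.search(line)
        if match:
            required = float(match.group(1))
            supplied = int(match.group(2))
            results.append((required / MIB, supplied / MIB, supplied >= required))
    return results


for required_mib, supplied_mib, enough in ram_checks(LOG):
    print(f"required {required_mib:.0f} MiB, "
          f"supplied {supplied_mib:.1f} MiB, enough: {enough}")
```

Both checks come out as "not enough", so the scheduler is refusing GPU work because of the reported memory size, not because the project has run out of AMD tasks.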

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109379606188
RAC: 35975079

The last time your computer fetched GPU work was on 13th September.  Perhaps there was an update for your computer just after that time that has caused problems.  Definitely looks like a driver/OpenCL libs issue.  The BOINC startup messages might contain a clue, particularly if you note any difference compared to previous startup messages.  You may be able to find an older message in stdoutdae.old if there isn't one in stdoutdae.txt.
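
A rough Python sketch of that comparison, for what it's worth: it collects every line mentioning OpenCL from stdoutdae.txt and stdoutdae.old so the current and older startup messages can be put side by side. The data directory /var/lib/boinc-client is an assumption (it is where the Ubuntu package usually keeps these files; adjust it for your install), and the function names are made up for this example.

```python
from pathlib import Path

# Assumption: default data directory of the Ubuntu boinc-client package.
DATA_DIR = "/var/lib/boinc-client"


def opencl_lines(text):
    """Return the lines of a log that mention OpenCL (case-insensitive)."""
    return [line for line in text.splitlines() if "opencl" in line.lower()]


def scan(data_dir):
    """Map each log file found in data_dir to its OpenCL-related lines."""
    found = {}
    for name in ("stdoutdae.txt", "stdoutdae.old"):
        path = Path(data_dir) / name
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # file missing or unreadable: skip it
        found[name] = opencl_lines(text)
    return found


for name, lines in scan(DATA_DIR).items():
    print(f"== {name}: {len(lines)} OpenCL line(s)")
    for line in lines:
        print(line)
```

If the GPU memory size or the OpenCL version reported in the newer file differs from the older one, that points at the driver or OpenCL libraries rather than at the project.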

 

Cheers,
Gary.

robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454479971
RAC: 8894

Gary Roberts wrote:

The last time your computer fetched GPU work was on 13th September.  Perhaps there was an update for your computer just after that time that has caused problems.  Definitely looks like a driver/OpenCL libs issue.  The BOINC startup messages might contain a clue, particularly if you note any difference compared to previous startup messages.  You may be able to find an older message in stdoutdae.old if there isn't one in stdoutdae.txt.

 

Late today I thought that maybe the problem was driver related (I had not noticed your driver comment above, Gary), so I rebooted the PC and in came the GPU jobs. I am not sure why this happened. I am running Ubuntu 16 and have not done any updates lately. Anyway, it is working once again after 6 lost days of crunching.

Thanks to all who replied.  
