All things Nvidia GPU

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,751
Credit: 35,766,146,090
RAC: 39,717,125

coolbits=28 should get you what you need.

I always use the following command for this.

sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus
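Since nvidia-xconfig rewrites xorg.conf, it can be worth confirming afterwards that the option actually landed in the file. Below is a minimal sketch of such a check, assuming the option is written in the usual `Option "Coolbits" "28"` form; the function name `find_coolbits` and the sample text are hypothetical and only for illustration.

```python
import re

# Matches uncommented lines such as:  Option "Coolbits" "28"
# (assumed form; lines starting with '#' are not matched)
COOLBITS_RE = re.compile(
    r'^\s*Option\s+"Coolbits"\s+"(\d+)"',
    re.IGNORECASE | re.MULTILINE,
)


def find_coolbits(xorg_conf_text):
    """Return every Coolbits value found in the given xorg.conf text."""
    return [int(value) for value in COOLBITS_RE.findall(xorg_conf_text)]


# Illustrative sample, not taken from a real system.
sample = '''
Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Option         "Coolbits" "28"
EndSection
'''

print(find_coolbits(sample))
```

To check a real system, read the file first (typically /etc/X11/xorg.conf) and pass its text to `find_coolbits`; an empty list means no active Coolbits option was found.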

_________________________________________________________________________

Joseph Stateson
Joined: 7 May 07
Posts: 173
Credit: 2,985,188,821
RAC: 1,166,587

Ian&Steve C. wrote:

coolbits=28 should get you what you need.

I always use the following command for this.

sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus

 

That worked!  Unfortunately I had to re-edit my xorg.conf file to add Intel back in.

I'm going to try lowering the clock speed on that board, which always runs at 84C.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,751
Credit: 35,766,146,090
RAC: 39,717,125

does it have good airflow?

what fan speeds are you running? (I run mine at 75% constant, 80% on GPUs in a case with less airflow)

you can also try power limiting.

use nvidia-smi to enforce a power limit with the -pl argument.

for example, to power limit GPU#0 (first) to 225W, you would do the following:

sudo nvidia-smi -i 0 -pl 225

 

reducing power does wonders for reducing temperatures, and often you can overclock on top of the reduced power limit to bring clock speeds and performance very near the unaltered state, but with a lot less stress and lower temps. I do this on all my systems.

my 225W 2080Tis run at ~60-67C under full load on a mining rig. (22C ambient)
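For a host with several GPUs, the same `-i` / `-pl` command can be generated per card. This is only a sketch: the helper names `power_limit_command` and `apply_power_limits` and the wattages in the usage note are hypothetical, and by default it only prints the commands instead of running them. As far as I know the limit does not survive a reboot, so something like this is usually run at startup.

```python
import subprocess


def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi command that power limits one GPU."""
    if gpu_index < 0:
        raise ValueError("gpu_index must be 0 or greater")
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ["sudo", "nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def apply_power_limits(limits, dry_run=True):
    """Apply a {gpu_index: watts} mapping, or just print it when dry_run is set."""
    commands = [power_limit_command(i, w) for i, w in sorted(limits.items())]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Raises CalledProcessError if nvidia-smi rejects the limit.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `apply_power_limits({0: 225, 1: 225})` prints the two commands; pass `dry_run=False` to actually run them. nvidia-smi will refuse a value outside the range the card supports.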

_________________________________________________________________________

Keith Myers
Joined: 11 Feb 11
Posts: 4,780
Credit: 17,808,808,709
RAC: 4,002,332

Coolbits=4 only allows fan control.  Coolbits=28 allows fan control AND core clock and memory clock control.
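Coolbits is a bitmask, so 28 is just 4 + 8 + 16. A small sketch that decodes a value into its parts; the descriptions are my paraphrase of the NVIDIA driver README rather than exact wording, and `decode_coolbits` is a hypothetical helper name.

```python
# Bit values and paraphrased meanings (assumed from the NVIDIA driver README).
COOLBITS_FLAGS = {
    1: "overclocking on older (pre-Fermi) GPUs",
    2: "attempt SLI with mismatched video memory",
    4: "manual fan control",
    8: "clock offsets (core and memory)",
    16: "overvoltage",
}


def decode_coolbits(value):
    """List the features enabled by a Coolbits value."""
    return [desc for bit, desc in COOLBITS_FLAGS.items() if value & bit]


for coolbits in (4, 28):
    print(coolbits, decode_coolbits(coolbits))
```

So 4 gives fan control only, while 28 adds the clock offsets (and overvoltage) on top of it.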

Also, you can just delete the multi-display layout since you are not using multi-display-capable cards.  That prevents the mouse cursor from traveling off the right side of the display onto the non-existent displays.

The section you posted should look like this.

Section "ServerLayout"
    Identifier     "Layout0"
    Screen         "intel"
    Screen      0  "Screen0"
    Screen      1  "Screen1"
    Screen      2  "Screen2"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

 

Joseph Stateson
Joined: 7 May 07
Posts: 173
Credit: 2,985,188,821
RAC: 1,166,587

Changing the clock speed had no effect, but that power limit worked fine!  I also changed that layout.

+-------------------------------+----------------------+----------------------+
|   2  NVIDIA P102-100     Off  | 00000000:0B:00.0 Off |                  N/A |
|  0%   78C    P0   225W / 225W |    794MiB /  5059MiB |     95%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

Two of the boards have external fans, so no fan speed is reported for them (it shows "0%").

Just discovered that the desktop does not work if I remove the monitor from the motherboard.  Cannot even use VNC to get in.  Have to reconnect the monitor to run nvidia-settings under VNC.

The motherboard takes a DVI connector.  I will have to rig an adapter to HDMI and then use a dummy HDMI load.

 

Is there an xorg setting to fake having a monitor?

Keith Myers
Joined: 11 Feb 11
Posts: 4,780
Credit: 17,808,808,709
RAC: 4,002,332

You can get an HDMI or VGA "dummy" plug to put on the video card output to load the card and make Xorg think you have a monitor connected.  This is similar to what I put on my RPi that I connect to headless.  It gives me the standard desktop through VNC.

HDMI dummy plug for headless

And yes, you can fake a monitor or create a virtual desktop.  That is what the "right of monitor 0" and the lines following it were for in your original config file. You can create separate desktop spaces so that, for example, you can have one set of applications open and running on one desktop and a different set running on another, which reduces the clutter on each desktop.

If you set up the desktop environment that way, you can just move your mouse cursor over the right edge of the primary desktop and it will cross over to the other desktop, and the display will switch to it.

 

 

 

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645,067,679
RAC: 0

 . . . And now it's back again.  I have reached a compromise theory.  Ian was mainly correct in that the absence of the graph was because I had only recently set statistics sharing to yes, but also any sudden spike in the numbers can cause the graph to temporarily vanish again.

Stephen

Sorry if my musings bore anyone ....  <shrug>

Stephen

 

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 216
Credit: 8,485,089,045
RAC: 3,573,742

Question (and I tried looking this up but could not find the answer). Why does it report our GPUs with the following memory?

NVIDIA NVIDIA RTX A6000 (4095MB)

when each has 48 GB?

Keith Myers
Joined: 11 Feb 11
Posts: 4,780
Credit: 17,808,808,709
RAC: 4,002,332

Because BOINC only uses a 32-bit routine to interrogate Nvidia cards, the max it can report is 4GB.  There is an issue raised in the BOINC github repository but it has never been acted upon.  The fix is available but was never applied to the code. Blame David Anderson, the main BOINC developer, for dragging his heels for years.

https://github.com/BOINC/boinc/issues/1773

I have applied the fix to my client and it reports my Nvidia cards correctly.
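To see where 4095MB comes from, here is a rough sketch of the arithmetic. It assumes the memory size gets clamped to the largest value a 32-bit unsigned integer can hold; that is my reading of the symptom, not a quote of the BOINC code, and `reported_mb_32bit` is a hypothetical name.

```python
UINT32_MAX = 2**32 - 1   # largest value a 32-bit unsigned integer can hold
MIB = 1024 * 1024


def reported_mb_32bit(total_bytes):
    """Memory in MB as seen through a counter assumed to saturate at 32 bits."""
    return min(total_bytes, UINT32_MAX) // MIB


print(reported_mb_32bit(48 * 1024**3))  # 48 GB card -> 4095
print(reported_mb_32bit(2 * 1024**3))   # 2 GB card  -> 2048
```

Any card with 4GB or more ends up at the same 4095MB figure, which matches what the event log shows for the A6000.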

 

Betreger
Joined: 25 Feb 05
Posts: 987
Credit: 1,454,687,331
RAC: 674,010

Boca Raton Community HS wrote:

Question (and I tried looking this up but could not find the answer). Why does it report our GPUs with the following memory?

NVIDIA NVIDIA RTX A6000 (4095MB)

when each has 48 GB?

As I understand it, BOINC sees no more than 4GB of GPU memory.
