have you tried swapping the GPUs between these systems to see if the clock speed anomalies or performance issues follow the card or stay with the system?
Tom M wrote:
Different GPU driver versions?
Sometimes AMD has motherboard drivers for Windows.
I like Ian&Steve's suggestions: either clean-reset the new TR GPUs, or clean-swap the GPUs.
Are the PCIe slots set to "auto", or to Gen4 or Gen3?
Are the cards showing how hard they are loaded? Any differences between systems?
Is the TR motherboard BIOS current?
Hth,
Tom M
This is a good idea. I will actually have an interesting comparison within the next few days: we will have our other two new TR systems (identical to the TR workstation with the issue, except they have the 24-core TR CPU and half the memory, though all DIMM slots are still populated). If these ALSO have the same issue, then I think we can say it is not the actual GPU. If they are faster, then it might be the GPU, and I would try the swap idea to confirm.
Same driver version, but I am wondering if there is some odd Dell driver activity happening. In the past I have seen that Dell likes to release specific Nvidia drivers via its own distribution platform. I downloaded the newest Nvidia drivers directly from the Nvidia website, so maybe there is an odd conflict going on. I know there is software out there that completely wipes GPU drivers (anyone remember the name?), so I might start fresh with drivers, as you all recommend.
I will check the BIOS to see what the PCIe slots are set to.
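For what it is worth, the negotiated PCIe link generation (and how hard the cards are being loaded) can also be read from the driver itself, without rebooting into the BIOS. A minimal sketch using nvidia-smi's standard query fields (nvidia-smi ships with the driver on both Windows and Linux):
# current vs. maximum PCIe generation, link width, and GPU load for each card
nvidia-smi --query-gpu=index,name,pcie.link.gen.current,pcie.link.gen.max,pcie.link.width.current,utilization.gpu --format=csv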
There IS something odd about the Windows Task Manager: CUDA does not show up. My Windows 11 laptop also does this, so I was not surprised, but all of the computation is shown under "3D", sustained at 80% with a few spikes to 100% here and there. The Nvidia control panel, on the other hand, shows CUDA usage at 100%.
I will be back in front of the workstation tomorrow, but I am curious whether "hardware-accelerated GPU scheduling" is on or off. Either way, I will play around with it.
Thank you all for your suggestions!
Boca Raton Community HS
DDU (Display Driver Uninstaller).
From guru3d, I think.
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Tom M wrote:
From guru3d I think.
YES, I have them on a tab in my browser because I trust their downloads to be clean and virus-free.
Installed the 4070 Ti and ran a few tasks under the Win10 stock app, plus several Linux FGRP tasks on the custom app. Single tasks at a time.
The W10 tasks take a little better than half the time of the 1660 Ti:
https://einsteinathome.org/host/12799118/tasks/2/0
On the Linux side they are also a little quicker than half the time, around 2 minutes 25 seconds:
https://einsteinathome.org/host/12880857/tasks/2/0?sort=desc&order=Sent
Sorry for the clunky way of linking.
A small data sample but FWIW.
Fred
What I am looking at is command-line scripting under Linux for Nvidia.
I want to set all GPUs to a power limit of, say, 300 watts.
I want to set the graphics memory overclock to 1500.
I don't want to change the graphics clock.
I have enabled "coolbits" and can do this in the Nvidia X Server Settings GUI.
So how do you set up sudo in the script?
And why doesn't the Nvidia X Server Settings GUI display the changes I made on the command line?
sudo nvidia-smi -pl 300
sudo nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[3]=0' -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=1500'
sudo nvidia-settings -a '[gpu:1]/GPUGraphicsClockOffset[3]=0' -a '[gpu:1]/GPUMemoryTransferRateOffset[3]=1500'
========================
tommiller@Ryzen-Charon:~$ nvidia-smi
Sat Jan 28 22:27:03 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13 Driver Version: 525.60.13 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:09:00.0 On | N/A |
|100% 72C P2 299W / 300W | 4036MiB / 12288MiB | 100% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:0A:00.0 Off | N/A |
| 78% 63C P2 298W / 300W | 3767MiB / 12288MiB | 100% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 982 G /usr/lib/xorg/Xorg 21MiB |
| 0 N/A N/A 1773 G /usr/lib/xorg/Xorg 88MiB |
| 0 N/A N/A 1977 G /usr/bin/gnome-shell 37MiB |
| 0 N/A N/A 1347777 G /usr/lib/firefox/firefox 121MiB |
| 0 N/A N/A 1368585 C ...-pc-linux-gnu-opencl_v1.0 1874MiB |
| 0 N/A N/A 1369086 C ...-pc-linux-gnu-opencl_v1.0 1874MiB |
| 1 N/A N/A 982 G /usr/lib/xorg/Xorg 5MiB |
| 1 N/A N/A 1773 G /usr/lib/xorg/Xorg 6MiB |
| 1 N/A N/A 1368618 C ...-pc-linux-gnu-opencl_v1.0 1874MiB |
| 1 N/A N/A 1369320 C ...-pc-linux-gnu-opencl_v1.0 1874MiB |
+-----------------------------------------------------------------------------+
tommiller@Ryzen-Charon:~$
=========================
I have spent an hour plunking around with this, and I doubt I will ever become a highly skilled Linux professional. I seem to have gotten beyond "novice" and reached "beginner", where I am likely to stay.
Thank you.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Put all the commands in a bash script, without sudo in the individual commands,
then run the script itself with sudo.
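A minimal sketch of what that looks like, using the commands from the earlier post (the file name set_gpu.sh is just an example):
#!/bin/bash
# set_gpu.sh - the same commands as above, with sudo removed from each line
nvidia-smi -pl 300
nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[3]=0' -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=1500'
nvidia-settings -a '[gpu:1]/GPUGraphicsClockOffset[3]=0' -a '[gpu:1]/GPUMemoryTransferRateOffset[3]=1500'
Then make it executable and run the whole thing under sudo:
chmod +x set_gpu.sh
sudo ./set_gpu.sh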
_________________________________________________________________________
Here is my script, named gpuoverclock.sh, that sits on the Desktop. It is one of the first things I run when I first boot the PC. It sets up the cards for each host. The only differences between hosts are the number of cards, the power levels I set for them, and the clocks, depending on what kind of cards are in the host.
#!/bin/bash
# Enable persistence mode (requires root)
/usr/bin/nvidia-smi -pm 1
# Set a 200 W power limit on each card
nvidia-smi -i 0 -pl 200
nvidia-smi -i 1 -pl 200
nvidia-smi -i 2 -pl 200
# PowerMizer mode 1 = prefer maximum performance
/usr/bin/nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUPowerMizerMode=1"
# Take manual control of the fans and set target fan speeds
/usr/bin/nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=85"
/usr/bin/nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=90"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:2]/GPUTargetFanSpeed=85"
/usr/bin/nvidia-settings -a "[fan:3]/GPUTargetFanSpeed=90"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:4]/GPUTargetFanSpeed=85"
/usr/bin/nvidia-settings -a "[fan:5]/GPUTargetFanSpeed=90"
# Apply memory transfer rate and graphics clock offsets at performance level 4
/usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=800" -a "[gpu:0]/GPUGraphicsClockOffset[4]=60"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUMemoryTransferRateOffset[4]=800" -a "[gpu:1]/GPUGraphicsClockOffset[4]=60"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUMemoryTransferRateOffset[4]=800" -a "[gpu:2]/GPUGraphicsClockOffset[4]=60"
As you can see from the shebang at the top, it is a bash script. It is meant to be run as root, since you need root access to set the persistence mode.
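One way to confirm the settings actually took effect (and to see them even if the nvidia-settings GUI has not refreshed) is to query the driver afterwards; a small sketch using standard nvidia-smi query fields:
# persistence mode, power limit, and current clocks for each card
nvidia-smi --query-gpu=index,persistence_mode,power.limit,clocks.sm,clocks.mem --format=csv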
Thank you.
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Thank you. I am certain I will have some questions after I edit it for two- and one-GPU setups.
For example: what performance gain does persistence mode provide?
If the GPUs seem to be cooling fine, do I need the manual fan settings?
PowerMizer mode "automatic" seems to be most productive for my systems. Is that mode zero?
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
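For what it is worth: persistence mode mainly keeps the driver initialized between tasks rather than raising clocks, so any gain is in avoiding re-initialization delays, not raw compute speed. And rather than guessing which number "automatic" corresponds to, the current mode can be queried directly (gpu:0 here is just an example):
# print the numeric PowerMizer mode currently set on the first GPU
nvidia-settings -q "[gpu:0]/GPUPowerMizerMode"
# check whether persistence mode is enabled on each card
nvidia-smi --query-gpu=index,persistence_mode --format=csv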