All things Nvidia GPU

GWGeorge007
GWGeorge007
Joined: 8 Jan 18
Posts: 2,997
Credit: 4,926,034,438
RAC: 132,503

Skip Da Shu wrote: I'm sure

Skip Da Shu wrote:

I'm sure we've been down this path before but I decided to give it another shot and am about to run out of steam again..

Running dual Nvidia (30x0) cards under Linux Mint BOINC with coolbits.   If anybody wants to read the most recent foray...

https://forums.linuxmint.com/viewtopic.php?t=404389

I can actually get the xorg part working but then it breaks Cinnamon :-(

Skip

What version of Cinnamon are you running?  Linux Mint Cinnamon v20.3 through v21.2?

Cinnamon is the 3D accelerated version, which should normally be used. If you experience problems with your video driver (e.g. artifacts or crashing), try the Cinnamon (Software Rendering) session, which disables 3D acceleration. 

I had tried Cinnamon before I switched to Ubuntu.  Yes, I did like it too.  But I also had problems running BOINC with many options, like Coolbits.  They do have several versions of Cinnamon out, which leads me to think that they are trying to make it work the way you and many others want it to.  See versions:

Also, take a look at this site.  It has many tips on making 'changes' that can help you get Cinnamon to work for you.

https://wiki.archlinux.org/title/Cinnamon  (last edited: 15 June 2023, at 16:47)

George

Proud member of the Old Farts Association

Skip Da Shu
Skip Da Shu
Joined: 18 Jan 05
Posts: 150
Credit: 997,993,698
RAC: 789,962

Thanx guys. It looks like

Thanx guys. 

It looks like I have some reading to do and maybe try to get back to a raw xorg file via the nvidia xorg generator... which months ago led me to a black screen and is how I ended up with the 20-nvidia.conf that I have now.  But to be honest, I don't remember what was removed from the generated xorg.conf that got this working (with coolbits on GPU0 only).

On this desktop box I have Mint v20.3 with Cinnamon 5.2.7 and the 5.15.0-84 kernel.  The following will get coolbits on both cards, but with a crashed Cinnamon in "fallback" mode.  If I comment out just the Screen1 line in the ServerLayout section, Cinnamon doesn't crash, but then coolbits gets reported as "not used" for the 2nd GPU (3060 Ti).

From /usr/share/X11/xorg.conf.d/20-nvidia.conf (well, actually a copy of the coolbits-working version; the one I'm running at this minute is the 'Cinnamon works' version w/o Screen1):

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    Screen      1  "Screen1"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "InputDevice"
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "AOC"
    ModelName      "32G2WG3"
    Option         "DPMS"
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    Option         "DPMS"
EndSection

# this will become device 0, lower card w/ monitor
Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce RTX 3070"
    BusID          "PCI:04:0:0"
    Option         "ConnectedMonitor" "DFP"
    Option         "Coolbits" "12"    
EndSection

# this will become device 1, upper card w/o monitor
Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BoardName      "NVIDIA GeForce RTX 3060 Ti"
    BusID          "PCI:43:0:0"
    Option         "Coolbits" "12"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    Monitor        "Monitor1"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection
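
For anyone following along, a rough way to check from a running session whether Coolbits actually took on each GPU (assuming nvidia-settings is installed and the attribute names haven't changed across driver versions):

    nvidia-settings -q gpus                          # list the GPUs the X server knows about
    nvidia-settings -q '[gpu:0]/GPUFanControlState'  # only readable/settable if Coolbits took on that GPU
    nvidia-settings -q '[gpu:1]/GPUFanControlState'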

I'm really beginning to wonder why I've spent so very much time on 60-75MHz of sclk on that 3060ti.

Skip

Skip Da Shu
Skip Da Shu
Joined: 18 Jan 05
Posts: 150
Credit: 997,993,698
RAC: 789,962

George, The strange part here

George, the strange part here is that it's ONLY this dual-card box that is giving me trouble with it.  I have other boxes (single GPU, AMD and Nvidia) that are working fine.

Skip

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,258
Credit: 8,911,203,658
RAC: 10,319,282

Skip,I am ignorant as

Skip,

I am ignorant as well as away from my computer(s), but what happens if you clone the second Monitor section from the first, leave everything else in place, and just change the monitor name?

I also wonder what happens if you put the ConnectedMonitor option on the 2nd video card but null the second parameter with "".
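
Something like this, maybe (just an untested sketch, reusing the BusID from your config above):

Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    BusID          "PCI:43:0:0"
    Option         "ConnectedMonitor" ""
    Option         "Coolbits" "12"
EndSection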

Hth,

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

GWGeorge007
GWGeorge007
Joined: 8 Jan 18
Posts: 2,997
Credit: 4,926,034,438
RAC: 132,503

Skip Da Shu wrote:George,

Skip Da Shu wrote:

George, the strange part here is that it's ONLY this dual-card box that is giving me trouble with it.  I have other boxes (single GPU, AMD and Nvidia) that are working fine.

Skip

Did you try this:

 sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus

See if your computer crashes, or if you get both (or all) NVIDIA GPUs working correctly.

I use this on my two desktop computers, both of which have AMD Ryzen CPUs.  My 3950X computer has two different NVIDIA GPUs, a 3080 and a 2070 Super, and it works well.  The other is a 5950X with two 3080 Tis, also working well.

George

Proud member of the Old Farts Association

Skip Da Shu
Skip Da Shu
Joined: 18 Jan 05
Posts: 150
Credit: 997,993,698
RAC: 789,962

Tom M wrote:Skip,I am

Tom M wrote:

Skip,

I am ignorant as well as away from my computer(s), but what happens if you clone the second Monitor section from the first, leave everything else in place, and just change the monitor name?

I also wonder what happens if you put the ConnectedMonitor option on the 2nd video card but null the second parameter with "".

Hth,

Tom M

I used to have "unknown" in vendor and model and found that it made no difference, so I removed those 2 lines.  What's left in Monitor1 is the same as in Monitor0.

Lemme save this and go look to see... On the 2nd GPU it was reporting "not used" for both the ConnectedMonitor and Coolbits options in the Xorg.0.log file.
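
For reference, a quick way to pull those lines back out of the log (assuming it lives at the usual /var/log/Xorg.0.log; on some setups it ends up under ~/.local/share/xorg/ instead):

    grep -iE 'not used|coolbits|connectedmonitor' /var/log/Xorg.0.log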

Skip Da Shu
Skip Da Shu
Joined: 18 Jan 05
Posts: 150
Credit: 997,993,698
RAC: 789,962

GWGeorge007 wrote: Did you

GWGeorge007 wrote:

Did you try this:

 sudo nvidia-xconfig --thermal-configuration-check --cool-bits=28 --enable-all-gpus

See if your computer crashes, or if you get both (or all) NVIDIA GPUs working correctly.

I use this on my two desktop computers, both of which have AMD Ryzen CPUs.  My 3950X computer has two different NVIDIA GPUs, a 3080 and a 2070 Super, and it works well.  The other is a 5950X with two 3080 Tis, also working well.

I'll rename 20-nvidia.conf and try to build an xorg.conf via nvidia-xconfig again to see what I get, as I don't remember using --enable-all-gpus in earlier attempts.
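
Roughly this sequence, I think (a sketch, assuming nvidia-xconfig still writes /etc/X11/xorg.conf by default):

    sudo mv /usr/share/X11/xorg.conf.d/20-nvidia.conf ~/20-nvidia.conf.bak   # get the old snippet out of the way
    sudo nvidia-xconfig --cool-bits=28 --enable-all-gpus                     # writes /etc/X11/xorg.conf
    # then log out / restart the display manager and check Xorg.0.log again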

Skip

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,258
Credit: 8,911,203,658
RAC: 10,319,282

https://www.servethehome.com/

https://www.servethehome.com/nvidia-dgx-versus-nvidia-hgx-what-is-the-difference/

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,258
Credit: 8,911,203,658
RAC: 10,319,282

I now get why "NVIDIA Tesla

I now get why "NVIDIA Tesla P100 16GB PCI-e 3.0" GPUs have been popping up all over the Top 50.

https://www.techpowerup.com/gpu-specs/tesla-p100-pcie-16-gb.c2888

They are faster than an RTX 2060 and pretty cheap.

Tom M

 

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)

Ian&Steve C.
Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,911
Credit: 43,728,245,976
RAC: 63,214,366

they are decent. again since

They are decent, again since they have OK FP64 performance (BRP7) and a good amount of VRAM (GW and other projects). Most of the people running them now are folks that came over from Milkyway, where they were great bang for the buck. I'm not sure many people bought them specifically to use on Einstein.

However, they do not have any fan, only passive heatsinks, so extra cooling is necessary: either put them in a ducted high-airflow server case, or attach some super loud blower fans. Personally I won't go for Pascal and older anymore, but if you already have them, it's not bad to use them on BRP7.

Titan V is a better buy and more plug-and-play on the Nvidia side of things IMO, since it has a fan already. Running Petri's app, my Titan Vs only use about 120 W (BRP7); incredible power efficiency. If you also contribute to Folding@home, Primegrid, and/or Asteroids, 40-series cards might be more worth looking into.
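
For anyone wanting to compare power draw on their own cards, a plain nvidia-smi query does it (standard flags, nothing exotic):

    nvidia-smi --query-gpu=name,power.draw,power.limit --format=csv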
