Skip, I don't know. But the more conversation, the better for me.
I have explored extreme MTM overclocking and tried out the graphics overclocking suggested in gamers' reports.
They aren't stable for BOINC tasks.
It is also clear that different settings work better on different applications.
Earlier today I bumped up to +11/+101.
I think Keith Myers once explained that graphics OCs increment in steps of 10, so you need +10 or more to move to the next step up.
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Tom, This is a script that runs at login on this box. The other RTX 3080 (same brand/model) I have has a different sclk & mclk table. I'll have to pull that one over to show the diff.
As I'd noted, I bought both of these used and I strongly suspect the BIOS was reflashed on at least one of them. I don't know if one is stock or if neither is stock. This is the more complex & documented one. Ignore any reference to "top" or "bottom" card; those are from when this was a 2-card box.
Most of these lines are documentation for me, but as you can see this card does not follow a strict +/- 10 to move the sclk around. Also be aware that sclk will drop as total wattage increases, so the power you test at will change the results; this is also how Nvidia keeps you under the max PL setting. I attempted to do my testing to see sclk results at sub-200 W.
As an example: a sclk table offset of 114, 107 or 105 will get a 2085 sclk @ ~160 W, but sclk will drop to 2070 at about 183 W with the 105 offset, whereas the 107 offset will carry that to >200 W and a 114 offset will carry it to ~300 W. For me, on this card the 105 offset was more stable across BRP7 / GPUGRID / Prime WUs (as I recall it was GPUGRID that caused me to reduce sclk, not BRP7).
mclk seems to stay at what it's initially set to.
#!/bin/bash
#
# set_nvidia_config, plagiarized from Gord xxxxx ~5/20/2023
#
thisHost=$(hostname)
echo "setting nvidia config for $thisHost..."
# PowerMizer mode 1 = "Prefer Maximum Performance"
/usr/bin/nvidia-settings -a GPUPowerMizerMode=1
#-------------------------------------------------------------------------------------------#
# card on top w/ monitor GPU:0(Device0), Gigabyte RTX 3080 10GB,
# out of box: 1950 / 9251(18502), PL370 (GDDR6X eff mclk = mclk x2 x4 x2)
/usr/bin/nvidia-smi -i 0 -pm 1
/usr/bin/nvidia-smi -i 0 -pl 350
#sclk 2085 - 10/14/23
#sclk 2070 - 2/7/24, back to 2085 2/11/24
#
/usr/bin/nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[4]=90" #- 2085
# /usr/bin/nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[4]=89" #- 2070
# /usr/bin/nvidia-settings -a "[gpu:0]/GPUGraphicsClockOffset[4]=85" #- 2055
#mclk 9285 - 10/14/23
#mclk 9283(18566) - 2/11/24
#
# /usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=70" #- 9285
/usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=66" #- 9283
# /usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=62" #- 9281
# /usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=58" #- 9279
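The power dependence described above (sclk dropping as wattage rises) can be checked by logging clock and power together and summarizing them. The following is only a rough sketch: `max_watts_per_clock` is a hypothetical helper name, and it assumes input lines in the `clock, watts` form that `nvidia-smi --query-gpu=clocks.sm,power.draw --format=csv,noheader,nounits` prints.

```shell
#!/bin/bash
# Hypothetical helper: reads "sclk_MHz, power_W" CSV lines on stdin and, for each
# distinct clock value, prints the highest power draw at which it was observed.
max_watts_per_clock() {
    awk -F', *' '
        { w = $2 + 0; if (!($1 in max) || w > max[$1]) max[$1] = w }
        END { for (c in max) printf "%s MHz seen up to %.0f W\n", c, max[c] }
    ' | sort -rn
}

# Live use (needs an NVIDIA GPU); sample once a second for 60 seconds:
#   timeout 60 nvidia-smi -i 0 --query-gpu=clocks.sm,power.draw \
#       --format=csv,noheader,nounits -l 1 | max_watts_per_clock
```

Running this once per candidate offset while a work unit is crunching should show roughly where each offset stops holding the higher clock, though the result will depend on the task mix.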
PS: This box only runs BOINC GPU tasks & does Monero XMR on the CPU.
PPS: If anybody has this "Gigabyte RTX3080 Gaming OC" bought new in a Linux box maybe you can tell me if either of these have the "stock" offset tables?
The clock bins on the old Turing family used to be every 13 MHz. I don't know if they are similar for Ampere or not.
You'd have to test. Keep bumping the clock offset up and watching the reported clocks. You will see no movement of the clock from bumping every 5 MHz successively, for example, and then all of a sudden the clock will jump up into the next bin and stay stuck there until the offset has been bumped by the required increment again.
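The test procedure above boils down to recording (offset, reported clock) pairs and looking for the offsets where the clock changes. A minimal sketch, assuming the pairs have already been collected by hand (offset set with nvidia-settings, clock read with something like `nvidia-smi --query-gpu=clocks.sm --format=csv,noheader,nounits`); `find_bin_edges` is a made-up helper name, and the sample pairs are the ones from the script comments in this thread:

```shell
#!/bin/bash
# Hypothetical helper: reads "offset clock" pairs (one per line, ascending offset)
# and prints every offset at which the reported clock differs from the previous line.
find_bin_edges() {
    awk 'NR > 1 && $2 != prev { print $1 " -> " $2 " MHz" } { prev = $2 }'
}

# Offsets and clocks taken from the script comments (85 -> 2055, 89 -> 2070, 90 -> 2085).
printf '%s\n' "85 2055" "89 2070" "90 2085" | find_bin_edges
```

With finer-grained input (every offset from 0 upward) the printed lines would be the bin edges for that particular card and power level.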
TechPowerUp usually has a listing of GPU BIOSes available for download. Might take a look and compare your version # to theirs.
As shown in the commented lines... sclk freq jump is always 15MHz but the offset to get that bump varies on these two.
mclk freq jump is always 2MHz for each offset change of 4.
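As a worked version of that mclk rule (2 MHz per offset change of 4), here is a small sketch. It assumes the relationship is linear around the values listed in the script comments and anchors on the 58 -> 9279 entry; the function names are made up for illustration, and other cards or BIOS versions may well behave differently.

```shell
#!/bin/bash
# Assumed anchor point taken from the script comments: offset 58 -> 9279 MHz.
BASE_OFFSET=58
BASE_MCLK=9279

# Predict mclk for a memory offset, assuming 2 MHz per 4 offset steps.
mclk_for_offset() {
    local offset=$1
    echo $(( BASE_MCLK + (offset - BASE_OFFSET) / 4 * 2 ))
}

# The doubled figure shown in parentheses in the script comments, e.g. 9283(18566).
doubled_rate() {
    echo $(( $1 * 2 ))
}

for off in 58 62 66 70; do
    m=$(mclk_for_offset "$off")
    echo "offset $off -> mclk $m ($(doubled_rate "$m"))"
done
```

The integer division means offsets that are not a multiple of 4 away from the anchor land in the lower bin, which matches the "no change until the next step" behaviour only as an assumption.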
Are there any other strict two-slot-wide Nvidia GPUs?
Titan V
I am especially interested in RTX 3080 Ti or faster GPUs.
So far the RTX 3080 Tis that claim 2 slots are still wider than a Titan V.
Tom M
I use the INNO3D 2-slot 4070 Ti. I bought it specifically because it's a 2-slot card. The 4070 Ti Super is the same. It's a great card and I've had no issues, although I'm struggling to get any good results out of it with E@H currently.
There aren't many 2-slot cards these days.
Best to look at my 3900X system, as it's solely for crunching, vs my 3950X which I pause from time to time.
Not sure it's faster than a 3080 Ti (bandwidth issue?)
How many tasks do you run with MeerKAT work? I just checked and I was running 4 WUs at 334 sec/task, which doesn't look great imo. I'm rechecking the best throughput now.
Hope this mess is decipherable.
Skip
Tom,
Here's the guts of the script from the other box. As you can see, it has different offset tables but had the same out-of-the-box default clocks.
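Tom M wrote: TechPowerUp usually has a listing of GPU BIOSes available for download. Might take a look and compare your version # to theirs.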
Thanx, I'll take a look.
https://www.techradar.com/pro/want-to-shove-30-gpus-in-a-computer-system-heres-an-ai-solution-that-will-work-as-long-as-you-are-using-dell-liqid-allows-one-r760-server-to-connect-to-a-whopping-30-nvidia-gpus-for-now-with-amd-and-intel-likely-soon
Another beyond my means high GPU count server. :)
Chooka,
Thank you for the slot information.
When I tried 2x on my Windows RTX 3080 Ti running BRP7/MeerKAT, the individual tasks took more than twice as long to run.
However, 3x on the All Sky GW task took no longer than 1x(?)
That has been my rule of thumb. I also watch the CUDA load on the GPU under Windows. If it is running below 90 percent I can usually gain by adding one more task.
I looked at your Ryzen 3900X system. It might be interesting to retest the baseline 1x for a few hours and see if it is faster than the 4x you're running. It may very well be that the memory width is not a choke point.
I believe All Sky GW at 3x or possibly higher will raise your system RAC.
Respectfully.
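For comparing 1x against Nx, the useful number is the effective time per task: elapsed time divided by the number of tasks running at once. A small sketch of that arithmetic, with made-up function names and assuming "334 sec/task" means 334 seconds of elapsed time for each of the 4 concurrent tasks:

```shell
#!/bin/bash
# Effective seconds per task when running N tasks at once, each taking T seconds.
effective_secs() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.1f\n", t / n }'
}

# Succeeds (exit 0) if n tasks at t_multi seconds each beat one task at t_single seconds.
multi_is_faster() {
    local t_single=$1 t_multi=$2 n=$3
    awk -v a="$t_single" -v b="$t_multi" -v n="$n" 'BEGIN { exit !(b / n < a) }'
}

effective_secs 334 4   # 4 tasks at 334 s each
```

So 4x is only a gain if a single task run alone takes longer than that effective figure; a 1x baseline run would supply the number to pass to `multi_is_faster`.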