Brp7/MeerKat 1x vs 2x crunching speeds

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3965
Credit: 47204632642
RAC: 65467903

it's built into the driver. you don't *need* to do anything beyond setting the CUDA_VISIBLE_DEVICES environment variable and running the command to start the MPS daemon. when it's running you will see nvidia-cuda-mps-server listed as a process on your GPU in nvidia-smi. it's best to start the daemon before you start BOINC, as it won't take effect on any in-progress tasks and will only pick up subsequent tasks as they spin up.

you can also play with active_thread_percentage to tune things further.
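a minimal sketch of the start/stop steps described above, written as a small Python wrapper. the helper names (mps_start_plan, start_mps, stop_mps) are hypothetical and only for illustration. it assumes the nvidia-cuda-mps-control binary is on the PATH, and that setting CUDA_MPS_ACTIVE_THREAD_PERCENTAGE in the daemon's environment is how you want to apply the thread percentage. dry_run defaults to True so nothing is launched by accident.

```python
import os
import subprocess


def mps_start_plan(gpu_ids, active_thread_percentage=None, base_env=None):
    """Build the environment and command line for starting the MPS daemon.

    gpu_ids: GPU indices to expose through CUDA_VISIBLE_DEVICES.
    active_thread_percentage: optional default thread percentage (assumed
        here to be applied through the daemon's environment).
    base_env: environment to extend; defaults to a copy of os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    if active_thread_percentage is not None:
        if not 0 < active_thread_percentage <= 100:
            raise ValueError("active_thread_percentage must be in (0, 100]")
        env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(active_thread_percentage)
    # "-d" starts the control daemon in the background.
    return env, ["nvidia-cuda-mps-control", "-d"]


def start_mps(gpu_ids, active_thread_percentage=None, dry_run=True):
    """Start the MPS daemon, or only return the command when dry_run is True."""
    env, argv = mps_start_plan(gpu_ids, active_thread_percentage)
    if not dry_run:
        subprocess.run(argv, env=env, check=True)
    return argv


def stop_mps(dry_run=True):
    """Stop the MPS daemon by sending "quit" to the control program."""
    argv = ["nvidia-cuda-mps-control"]
    if not dry_run:
        subprocess.run(argv, input="quit\n", text=True, check=True)
    return argv
```

typical order would be start_mps([0], dry_run=False) first, then start BOINC, and stop_mps(dry_run=False) before switching back to anything that is not CUDA.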

beware of the major caveat with MPS. it is CUDA ONLY. you cannot run OpenCL tasks while MPS is running. that means you won't be able to run OpenCL work from other projects, or even the GW app from Einstein. you will need to stop the MPS server before you can run OpenCL tasks again.

 

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 6471
Credit: 9594945305
RAC: 6390508

I have a question about the interaction between a non-overclocked RTX 3080 Ti, a gen1 / 1X PCIe bus, a 2-core CPU, AND Invalids.

I am wondering if an "over-committed" cpu would tend to lead to more Invalids?

My Mining-Rig is trying to run 4 RTX 3080 Tis under the above description.  More often than not it is not only processing slower than "average", e.g. taking more than 240s (4 minutes) per task, but it is also "throwing" enough Invalids to stop or slow down any increase in RAC towards the nominal 4M it should end up at.

Some days it is running 20+% Invalids with a flat RAC.  Dec 22 it appears to have run about 18% Invalids.

This is ALL in the context of Brp7/MeerKat (optimized/anonymous platform) tasks running 1 task per GPU.

If this supposition is correct, one way to test it is to start running a single RTX 3080 Ti for at least a week on that machine.  If the % of Invalids the machine has drops to below or near 5% then I think I may have a case for the "over-committed" CPU.  Possibly.
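As a rough sketch of how that comparison could be made less of an eyeball judgment: the Python below computes the Invalid rate from task counts and a Wilson score interval around it, so you can see whether 5% falls inside the plausible range for a given week. The function names are hypothetical, and any counts you feed it are your own tallies from the task list, nothing here comes from the project.

```python
import math


def invalid_rate(invalid, total):
    """Fraction of validated tasks that came back Invalid."""
    if total <= 0:
        raise ValueError("total must be positive")
    return invalid / total


def wilson_interval(invalid, total, z=1.96):
    """Wilson score interval for the Invalid rate (z=1.96 is about 95%)."""
    p = invalid_rate(invalid, total)
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def consistent_with(target, invalid, total):
    """True if the target rate lies inside the Wilson interval."""
    low, high = wilson_interval(invalid, total)
    return low <= target <= high
```

For example, consistent_with(0.05, invalid, total) with one week of single-GPU counts tells you whether that week is compatible with the "typical" 5% rate. With only a few hundred tasks the interval is wide, so a longer run gives a firmer answer.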

I know that I am getting quite high bandwidth use reported per GPU, up to 27% on one GPU.  So the alternative supposition of not enough bandwidth cannot be ruled out.

If I can confirm that 1 gpu on this system runs with the typical Invalid rate of near 5% then the only way I see to test the over-committed cpu would be to upgrade the cpu and try running 4-5 gpus on it again?

Or just stop "nibbling around the edges" and give up this particular configuration?

So I suppose the issue is how much "good money" I want to potentially throw after the bad money.  To date I have spent less than the cost of a new ASRock EPYCD8 motherboard on this experiment.

And I have an Epycd8 waiting in the wings if/when this experiment dies.

Tom M

 

 

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!
