I see Ian's fingers are faster.
You can always add the run_manager script, or just point to the client directly, by adding the application to the Startup Applications list. That way the client will run automatically at boot.
BRILLIANT!
Success!
https://einsteinathome.org/host/13125618/tasks/0/0
That did it. Thank you all for getting us up and running with this (these) new system(s).
Do you think we should be running 2 or 3 concurrently on this hardware? I know you all suggest two, but wanted to know your thoughts.
I'd do at least 2 on the 4090s; likely 3 will give better production. Look at the GPU utilization in nvidia-smi to see how many fit, and watch for when the total time per task starts going higher than the time of a 1x run.
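The stopping rule above can be made concrete with made-up numbers (these timings are hypothetical, just to illustrate the comparison, not measurements from this host):

```shell
# Hypothetical timings: one task alone finishes in 600 s wall-clock.
# Running 2x, each task takes 1080 s wall-clock, so two tasks complete
# every 1080 s. Effective seconds per task at 2x:
echo $((1080 / 2))   # 540 s/task, which beats 600 s/task at 1x, so 2x wins
```

If adding a third task pushed the effective time per task above the 1x figure, you'd back off to the lower concurrency.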
And the way to do this is with the app config file? I have it set to run 3 concurrent on the project settings, but I know that is probably trumped by the modifications done with this special app?
you have more control using an app_config.xml file.
when you have tasks running, can you run a command for me and report the output? I've always had a hunch that the 4090's performance is limited by its memory bandwidth: it has too many cores and not enough memory bandwidth to use them all to their full potential.
run this command while the tasks are about halfway through:
nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
Done. I tried it under a few settings, all running 1 work unit.
Baseline (not running E@H, using “adaptive” power setting)
$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 0 %, 1 %, 210 MHz, 33.28 W, 175 MiB, 1, 16
While running (1 work unit, using “adaptive” power setting)
$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 84 %, 91 %, 2850 MHz, 281.55 W, 2282 MiB, 3, 16
While running (1 work unit, using “prefer maximum performance” power setting)
$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 87 %, 94 %, 2850 MHz, 286.23 W, 2282 MiB, 3, 16
My brain is all jumbled up from today- what would be the name of the special version of the app for the app_config.xml file? Would the rest of this be correct? I want to try two concurrent first:
Load up 2x tasks per GPU and repeat please.
but this is supporting my theory. Look how high the memory controller load is compared to the GPU load. Running 1 task already has the memory bus almost maxed at only ~85% GPU utilization. Running 2x will probably put the memory at 100% with the core still short of 100%.
what is your power limit set to? You can see that with the default “nvidia-smi” command.
name should be ‘hsgamma_FGRPB1G’
set cpu_usage to 1.0. That's more in line with what is actually used.
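Putting those two corrections together, a minimal app_config.xml for two concurrent tasks might look like the sketch below (gpu_usage of 0.5 means each task claims half a GPU, so two run at once; 0.33 would allow three). It goes in the Einstein@Home project folder under the BOINC data directory, and the Manager's "Read config files" option applies it without a restart:

```xml
<!-- app_config.xml sketch: run 2 Einstein@Home FGRPB1G tasks per GPU.
     gpu_usage 0.5 = half a GPU per task; use 0.33 for 3 concurrent. -->
<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```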
Updated to run two:
nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 98 %, 97 %, 2850 MHz, 295.30 W, 4238 MiB, 3, 16
Power setting when not running two tasks (baseline):
$ nvidia-smi
Fri Feb 17 14:29:27 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01 Driver Version: 525.78.01 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:41:00.0 On | Off |
| 0% 37C P8 42W / 480W | 213MiB / 24564MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1522 G /usr/lib/xorg/Xorg 77MiB |
| 0 N/A N/A 3825 G /usr/lib/firefox/firefox 133MiB |
+-----------------------------------------------------------------------------+
Power setting when running two:
nvidia-smi
Fri Feb 17 14:31:01 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01 Driver Version: 525.78.01 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:41:00.0 On | Off |
| 0% 51C P2 298W / 480W | 4238MiB / 24564MiB | 100% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1522 G /usr/lib/xorg/Xorg 77MiB |
| 0 N/A N/A 3825 G /usr/lib/firefox/firefox 133MiB |
| 0 N/A N/A 4740 C ...-pc-linux-gnu-opencl_v1.0 2010MiB |
| 0 N/A N/A 4744 C ...-pc-linux-gnu-opencl_v1.0 2010MiB |
+-----------------------------------------------------------------------------+
So, basically, there is still A LOT of compute power left on the table that cannot even be used? Even the wattage under load is way less than I would have thought. Can all of it be traced back to the memory bus?
Running three concurrently:
nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 100 %, 100 %, 2850 MHz, 300.80 W, 6260 MiB, 3, 16
nvidia-smi
Fri Feb 17 14:49:11 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01 Driver Version: 525.78.01 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:41:00.0 On | Off |
| 0% 55C P2 303W / 480W | 6260MiB / 24564MiB | 100% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1522 G /usr/lib/xorg/Xorg 89MiB |
| 0 N/A N/A 3825 G /usr/lib/firefox/firefox 133MiB |
| 0 N/A N/A 4959 G /usr/bin/nvidia-settings 0MiB |
| 0 N/A N/A 5400 C ...-pc-linux-gnu-opencl_v1.0 2010MiB |
| 0 N/A N/A 5404 C ...-pc-linux-gnu-opencl_v1.0 2010MiB |
| 0 N/A N/A 5436 C ...-pc-linux-gnu-opencl_v1.0 2010MiB |