Einstein FGRPB1G Linux/Nvidia Special app "AIO"

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18715996062
RAC: 6371902

I see Ian's fingers are faster.

You can always add the run_manager script, or just point to the client, in the Startup Applications app. That way the client will run automatically at boot.

 

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 240
Credit: 10555205586
RAC: 25267491

BRILLIANT!

Success!

https://einsteinathome.org/host/13125618/tasks/0/0

That did it. Thank you all for getting us up and running with this (these) new system(s).

Do you think we should be running 2 or 3 concurrently on this hardware? I know you all suggest two, but wanted to know your thoughts.

Keith Myers

I'd do at least 2 on the 4090's, likely 3 will be better production. Look at the gpu utilization in nvidia-smi to see how many fit and when the total time per task starts going higher than the time of a 1X run.
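
One way to read the advice above: running N tasks at once only pays off while the tasks finished per hour keep rising. A minimal sketch of that arithmetic follows. The function name and the timings are made up for illustration; nobody in this thread measured them.

```shell
#!/usr/bin/env bash
# tasks_per_hour N SECONDS
#   N       = number of tasks running concurrently on one GPU
#   SECONDS = wall-clock run time of each task at that concurrency
tasks_per_hour() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n * 3600 / t }'
}

# Hypothetical timings, for illustration only:
tasks_per_hour 1 300   # 1X: 12.0 tasks/hour
tasks_per_hour 2 500   # 2X: 14.4 tasks/hour, better than 1X
tasks_per_hour 3 900   # 3X: 12.0 tasks/hour, no gain over 1X
```

With these invented numbers 2X would be the sweet spot, because at 3X each task takes three times as long as a 1X run. Plug in your own run times from the task list to compare.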

 

Boca Raton Community HS

Keith Myers wrote:

I'd do at least 2 on the 4090's, likely 3 will be better production. Look at the gpu utilization in nvidia-smi to see how many fit and when the total time per task starts going higher than the time of a 1X run.

 

And the way to do this is with the app config file? I have it set to run 3 concurrent on the project settings, but I know that is probably trumped by the modifications done with this special app?

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46760812642
RAC: 64101529

you have more control using an app_config.xml file.

when you have tasks running, can you run a command for me and report the output? I've always had the hunch that the 4090 performance was limited by the GPU memory bandwidth. it has too many cores and not enough memory bandwidth to use them all to their full potential.

run this command while the tasks are about halfway through running a task.

nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv

Boca Raton Community HS

Ian&Steve C. wrote:

you have more control using an app_config.xml file.

when you have tasks running, can you run a command for me and report the output? I've always had the hunch that the 4090 performance was limited by the GPU memory bandwidth. it has too many cores and not enough memory bandwidth to use them all to their full potential.

run this command while the tasks are about halfway through running a task.

nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv

 

Done. I tried it under a few settings, all running 1 work unit.

Baseline (not running E@H, using “adaptive” power setting)

$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv

name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current

NVIDIA GeForce RTX 4090, 00000000:41:00.0, 0 %, 1 %, 210 MHz, 33.28 W, 175 MiB, 1, 16

 

 

While running (1 work unit, using “adaptive” power setting)

$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv

name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current

NVIDIA GeForce RTX 4090, 00000000:41:00.0, 84 %, 91 %, 2850 MHz, 281.55 W, 2282 MiB, 3, 16

 

While running (1 work unit, using “prefer maximum performance” power setting)

$ nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv

name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current

NVIDIA GeForce RTX 4090, 00000000:41:00.0, 87 %, 94 %, 2850 MHz, 286.23 W, 2282 MiB, 3, 16

Boca Raton Community HS

Ian&Steve C. wrote:

you have more control using an app_config.xml file.

 

My brain is all jumbled up from today- what would be the name of the special version of the app for the app_config.xml file? Would the rest of this be correct? I want to try two concurrent first: 

<app_config>
   <app>
      <name>name</name>
      <gpu_versions>
          <gpu_usage>.5</gpu_usage>
          <cpu_usage>.1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

 

 

Ian&Steve C.

Load up 2x tasks per GPU and repeat please. 
 

but this is supporting my theory. Look how high the memory controller load is compared to GPU load. Running 1 task has the memory bus almost maxed already with only 85% GPU utilization. Running 2x will probably have the memory at 100% and the core still not at 100% 
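
For anyone who wants to make the same comparison on their own card, here is a rough sketch that pulls the two utilization columns out of one row of the CSV query used in this thread. The function name is hypothetical. As far as I understand it, utilization.memory reports how busy the memory controller was over the sample period, not how full the VRAM is.

```shell
#!/usr/bin/env bash
# compare_util: reads one CSV data row (as printed by the nvidia-smi query
# in this thread) on stdin and compares utilization.gpu (field 3)
# with utilization.memory (field 4).
compare_util() {
    awk -F', *' '{
        gpu = $3 + 0; mem = $4 + 0   # "84 %" becomes 84
        verdict = (mem > gpu) ? "memory controller busier than cores" : "cores at least as busy as memory controller"
        printf "gpu=%d%% mem=%d%% -> %s\n", gpu, mem, verdict
    }'
}

# 1X sample row copied from this thread:
echo "NVIDIA GeForce RTX 4090, 00000000:41:00.0, 84 %, 91 %, 2850 MHz, 281.55 W, 2282 MiB, 3, 16" | compare_util
```

On a live system you could pipe the query through `tail -n 1` into the function instead of echoing a saved row.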

 

what is your power limit set to? You can see that with the default “nvidia-smi” command. 
 

 

Ian&Steve C.

Boca Raton Community HS wrote:

Ian&Steve C. wrote:

you have more control using an app_config.xml file.

 

My brain is all jumbled up from today- what would be the name of the special version of the app for the app_config.xml file? Would the rest of this be correct? I want to try two concurrent first: 

<app_config>
   <app>
      <name>name</name>
      <gpu_versions>
          <gpu_usage>.5</gpu_usage>
          <cpu_usage>.1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

 

 

name should be ‘hsgamma_FGRPB1G’ 

set cpu_usage to 1.0. That’s more in line with what is actually used. 
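
Putting the two corrections above together, the file for two tasks per GPU would look roughly like the sketch below. The shell function and its directory argument are only for illustration (a hypothetical helper, not part of the AIO package). The app name and the usage values come from the posts above.

```shell
#!/usr/bin/env bash
# write_app_config DIR: writes app_config.xml into DIR, which should be the
# Einstein@Home project directory of your BOINC client.
write_app_config() {
    cat > "$1/app_config.xml" <<'EOF'
<app_config>
   <app>
      <name>hsgamma_FGRPB1G</name>
      <gpu_versions>
          <gpu_usage>0.5</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
EOF
}

# Example call. The path is an assumption; it depends on where your client lives:
# write_app_config /path/to/BOINC/projects/einstein.phys.uwm.edu
```

A gpu_usage of 0.5 means each task claims half a GPU, so two run at once; 0.33 is the usual value for three. The client has to re-read its config files (or be restarted) before the change takes effect.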

Boca Raton Community HS

Updated to run two:

nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 98 %, 97 %, 2850 MHz, 295.30 W, 4238 MiB, 3, 16

 

Power setting when not running two tasks (baseline):

$ nvidia-smi
Fri Feb 17 14:29:27 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:41:00.0  On |                  Off |
|  0%   37C    P8    42W / 480W |    213MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1522      G   /usr/lib/xorg/Xorg                 77MiB |
|    0   N/A  N/A      3825      G   /usr/lib/firefox/firefox          133MiB |
+-----------------------------------------------------------------------------+

 

Power setting when running two:

nvidia-smi
Fri Feb 17 14:31:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:41:00.0  On |                  Off |
|  0%   51C    P2   298W / 480W |   4238MiB / 24564MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1522      G   /usr/lib/xorg/Xorg                 77MiB |
|    0   N/A  N/A      3825      G   /usr/lib/firefox/firefox          133MiB |
|    0   N/A  N/A      4740      C   ...-pc-linux-gnu-opencl_v1.0     2010MiB |
|    0   N/A  N/A      4744      C   ...-pc-linux-gnu-opencl_v1.0     2010MiB |
+-----------------------------------------------------------------------------+

 

So, basically, there is still A LOT of compute power left on the table that cannot even be used? Even the wattage under load is way less than I would have thought. Can it all be traced back to the memory bus?
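
To put a number on the power question, here is a small sketch that expresses the draw as a share of the cap. The function name is hypothetical. The 298 W draw and 480 W cap are taken from the nvidia-smi output above.

```shell
#!/usr/bin/env bash
# power_pct DRAW_W CAP_W: prints power draw as a percentage of the power cap.
power_pct() {
    awk -v d="$1" -v c="$2" 'BEGIN { printf "%.1f\n", 100 * d / c }'
}

power_pct 298 480   # two tasks running: 62.1
```

So with two tasks the card sits at roughly 62% of its cap. To get both numbers in CSV form, `nvidia-smi --query-gpu=power.draw,power.limit --format=csv` should work.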

 

Running three concurrently:

nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv
name, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], power.draw [W], memory.used [MiB], pcie.link.gen.current, pcie.link.width.current
NVIDIA GeForce RTX 4090, 00000000:41:00.0, 100 %, 100 %, 2850 MHz, 300.80 W, 6260 MiB, 3, 16

nvidia-smi
Fri Feb 17 14:49:11 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.78.01    Driver Version: 525.78.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:41:00.0  On |                  Off |
|  0%   55C    P2   303W / 480W |   6260MiB / 24564MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1522      G   /usr/lib/xorg/Xorg                 89MiB |
|    0   N/A  N/A      3825      G   /usr/lib/firefox/firefox          133MiB |
|    0   N/A  N/A      4959      G   /usr/bin/nvidia-settings            0MiB |
|    0   N/A  N/A      5400      C   ...-pc-linux-gnu-opencl_v1.0     2010MiB |
|    0   N/A  N/A      5404      C   ...-pc-linux-gnu-opencl_v1.0     2010MiB |
|    0   N/A  N/A      5436      C   ...-pc-linux-gnu-opencl_v1.0     2010MiB |
+-----------------------------------------------------------------------------+

 

 

 
