All things AMD GPU

mountkidd

Joined: 14 Jun 12

Posts: 176

Credit: 12547962555

RAC: 8036792

26 Apr 2023 20:12:41 UTC

Message 211569

Hi Martin,

Looking at the task log for one of your GR tasks from earlier today, I see a huge problem:

03:32:25 (24556): [debug]: Set up communication with graphics process.
...
% C 0 67
% C 0 135
% C 0 203
...         checkpoints ~70s apart
% C 0 611
% C 0 679
% C 0 747   (12:27 -> 03:44:52)
...
04:36:14 (24556): [normal]: done. calling boinc_finish(0).

Start time to the last checkpoint was 12m27s. Last checkpoint to 'done' was another 51m22s. That final stretch is (mostly) the remaining 11% of the task that gets processed on the CPU, which typically takes 15-20s of CPU time.
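The two intervals can be re-checked from the timestamps in the log excerpt. A minimal Python sketch, assuming all three timestamps fall on the same day and taking the 03:44:52 annotation next to the last checkpoint line at face value (the variable names are only for illustration):

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps taken from the task log excerpt in this post.
# 03:44:52 is the annotation written next to the last checkpoint line.
start = datetime.strptime("03:32:25", FMT)
last_checkpoint = datetime.strptime("03:44:52", FMT)
done = datetime.strptime("04:36:14", FMT)

gpu_phase = last_checkpoint - start    # start -> last checkpoint
tail_phase = done - last_checkpoint    # last checkpoint -> boinc_finish
total = done - start                   # whole task

print("start -> last checkpoint:", gpu_phase)
print("last checkpoint -> done: ", tail_phase)
print("total run time:          ", total)
```

The total comes out at a little under 64 minutes, so under these assumptions most of the wall-clock time falls after the last checkpoint.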

What is the cpu doing that prevents it from dealing with this? Too much cpu work from BRP/GW? Priorities favoring cpu work?

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422

27 Apr 2023 7:52:58 UTC

Message 211585 in response to message 211569

mountkidd wrote:

% C 0 203

...         checkpoints ~70s apart
% C 0 611
% C 0 679
% C 0 747   (12:27 -> 03:44:52)

These aren't seconds but some kind of subroutine or progress counter, so 747 isn't 12m27s. The progress between checkpoints varies with the CPU speed (relevant for an iGPU), the GPU speed, BOINC's CPU throttle configuration and the CPU load (other science apps consuming memory bandwidth). There are 11 checkpoints within this task, and end time minus start time is ~64 minutes. I assume a checkpoint interval of 300 seconds (BOINC's client configuration), so maybe 55-60 minutes passed in total until the last checkpoint was written. Afterwards the final toplist calculation starts, which is often done on the CPU because many or most GPUs don't support FP64 (64-bit 'double precision' floating-point arithmetic).
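The figure of 11 checkpoints can be read off the counter values in the quoted log. A rough Python sketch, assuming the counter advances by a constant step between checkpoints and that the checkpoint interval really is the 300 seconds guessed above (neither is confirmed by the log):

```python
# Counter values visible in the quoted log lines ("% C 0 <n>").
# The lines elided with "..." are deliberately not filled in.
head = [67, 135, 203]
tail = [611, 679, 747]

# Step between neighbouring counters in each visible run.
steps = {b - a for run in (head, tail) for a, b in zip(run, run[1:])}
assert len(steps) == 1, "visible counters are not evenly spaced"
step = steps.pop()

# With constant spacing, the last counter gives the checkpoint count.
checkpoints = round(tail[-1] / step)

ASSUMED_INTERVAL_S = 300  # assumption: checkpoint interval in seconds
estimate_min = checkpoints * ASSUMED_INTERVAL_S / 60

print(f"step={step}, checkpoints={checkpoints}, "
      f"estimated time to last checkpoint ~{estimate_min:.0f} min")
```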

There's no obvious problem here. The task runs for 64 minutes on the iGPU (Radeon) of this Ryzen 7 desktop CPU. Is that too slow? I don't know. On my old Core i7, such FGRPB1G tasks take 5 to 12 hours on the iGPU, depending on CPU load (memory bandwidth limits).

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422

27 Apr 2023 12:29:47 UTC

Message 211592 in response to message 211585

mountkidd wrote:

[...] This is (mostly) the remaining 11% part that gets processed on the cpu, typically 15-20s cpu processing.

What is the cpu doing that prevents it from dealing with this? Too much cpu work from BRP/GW? Priorities favoring cpu work?

The AMD Ryzen 7 7700X iGPU (AMD Radeon(TM) Graphics (12284MB)), device ID "gfx1036", is FP64 capable (see the task log for both). There's no need to use the CPU for the final toplist computation.

Scrooge McDuck wrote:

...Afterwards final toplist calculation starts which is often done on the CPU as many/most GPU don't support FP64 (64bit... 'double precision'... floating point arithmetics).

So, I was wrong too...

But Martin (astro-marwil) doesn't want to use this iGPU for BOINC. So how can the discrete AMD GPU card be activated for BOINC, and the iGPU disabled for BOINC, without being forced to switch it off in the BIOS? I have no idea, but as Ian&Steve C., Mikey and Skip Da Shu already suggested, the solution is a customized app_info.xml and/or cc_config.xml file.

project specific exclusion:

<exclude_gpu>
   <url>project_URL</url>
   [<device_num>N</device_num>]
   [<type>NVIDIA|ATI|intel_gpu</type>]
   [<app>appname</app>]
</exclude_gpu>

or for all projects...

<ignore_ati_dev>N</ignore_ati_dev>

astro-marwil

Joined: 28 May 05

Posts: 531

Credit: 641936543

RAC: 1106301

27 Apr 2023 13:57:37 UTC

Message 211602

Hello!

Many, many thanks to all of you, for all of your assistance!

Unfortunately, I have had no luck with cc_config.xml. It now looks like this:

<cc_config>
   <exclude_gpu>
      <url>https://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
   </exclude_gpu>
</cc_config>

In the notifications, the first line then reads: unrecognized tag in cc_config.xml: <exclude_gpu>. I tried several variants, also with square brackets, without success. Is the URL correct? Or should it be the address of the server from which the tasks are downloaded, which I couldn't find? What else could it be? I assume it is a minor mistake, like a missing space or so, but I have already tested so much.

Also with

<cc_config>
<use_all_gpus>1</use_all_gpus>
</cc_config>

I get a corresponding remark in the notifications. Besides, it does not work the way I want.

Kind regards and happy crunching

Martin

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46604042642

RAC: 64212794

2 May 2023 14:13:26 UTC

Message 211606 in response to message 211602

You need to put these tags (<use_all_gpus> and <exclude_gpu>) inside an <options> element.

<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <exclude_gpu>
         <url>https://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
      </exclude_gpu>
   </options>
</cc_config>
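
A misplaced tag like this can be caught before restarting the client. The following is a minimal Python sketch, not part of BOINC, that flags any element sitting directly under <cc_config> other than <log_flags> and <options>. The function name find_misplaced_tags is made up for this example.

```python
import xml.etree.ElementTree as ET

# Top-level sections that cc_config.xml is expected to contain.
ALLOWED_TOP_LEVEL = {"log_flags", "options"}


def find_misplaced_tags(xml_text):
    """Return tags that sit directly under <cc_config> but belong elsewhere."""
    root = ET.fromstring(xml_text)
    if root.tag != "cc_config":
        raise ValueError(f"root element is <{root.tag}>, expected <cc_config>")
    return [child.tag for child in root if child.tag not in ALLOWED_TOP_LEVEL]


# The configuration that triggered the "unrecognized tag" notice.
broken = """
<cc_config>
   <exclude_gpu>
      <url>https://einstein.phys.uwm.edu/</url>
      <device_num>0</device_num>
   </exclude_gpu>
</cc_config>
"""

# The same settings wrapped in <options>.
fixed = """
<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <exclude_gpu>
         <url>https://einstein.phys.uwm.edu/</url>
         <device_num>0</device_num>
      </exclude_gpu>
   </options>
</cc_config>
"""

print("broken:", find_misplaced_tags(broken))
print("fixed: ", find_misplaced_tags(fixed))
```

As a side effect, ET.fromstring raises xml.etree.ElementTree.ParseError if the file is not well-formed, for example when a closing tag is missing.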

Skip Da Shu

Joined: 18 Jan 05

Posts: 151

Credit: 1040066313

RAC: 755947

27 Apr 2023 16:33:21 UTC

Message 211608 in response to message 211602

Here's a working cc_config.xml from a Ryzen 7 5700G with the iGPU excluded from all BOINC projects but still driving my monitor on that box. The discrete card (RX 6600) is platform "0". The iGPU is platform "1".

<cc_config>
    <log_flags>
        <file_xfer>1</file_xfer>
        <sched_ops>1</sched_ops>
        <task>1</task>
    </log_flags>
    <options>
        <ignore_ati_dev>1</ignore_ati_dev>
        <max_file_xfers>8</max_file_xfers>
        <max_file_xfers_per_project>5</max_file_xfers_per_project>
        <ncpus>-1</ncpus>
    </options>
</cc_config>

Skip

PS: I'd remove the max_file_xfers lines... left over from trying to do something in the past.

mountkidd

Joined: 14 Jun 12

Posts: 176

Credit: 12547962555

RAC: 8036792

27 Apr 2023 17:34:32 UTC

Message 211614 in response to message 211585

Scrooge McDuck wrote:

There's no obvious problem here. Task runs for 64 minutes on the CPUs' iGPU (Radeon) of this Ryzen 7 Desktop CPU. That's too slow? I don't know.

The obvious problem here is that the task was running on the iGPU/APU when it was expected to run on the RX 6600. An RX 6600 will crunch a GR task in 800-1200s, so taking 3800s is a big red flag.

We can revisit the timing of the checkpoints once the GR tasks are running on the 6600.

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869501

RAC: 12422

29 Apr 2023 13:41:18 UTC

Message 211679

So... Martin (astro-marwil) now runs FGRPB1G tasks on iGPU and discrete GPU card in parallel:

discrete GPU task

Received: 29 Apr 2023 10:02:47 UTC
Run time (sec): 665.24
CPU time (sec): 74.00
Using OpenCL device "gfx1032" by: Advanced Micro Devices, Inc.

iGPU task

Received: 29 Apr 2023 12:05:50 UTC
Run time (sec): 8,340.26
CPU time (sec): 160.02
Using OpenCL device "gfx1036" by: Advanced Micro Devices, Inc.

The iGPU seems to be slowed down by the many parallel O3MD1 CPU tasks consuming memory bandwidth. Maybe Martin hasn't yet had the time to try out different client configurations to disable the iGPU. But I'd assume that the discrete GPU runs fast enough now: about 11 minutes per task, and there are tasks that run in less than 9 minutes too.
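
To put the two run times side by side, here is a small Python sketch using only the values from the two task summaries. It compares a single task on each device, so it is an illustration, not a benchmark.

```python
# Run times in seconds, copied from the two task summaries in this post.
discrete_run_s = 665.24   # OpenCL device "gfx1032" (discrete card)
igpu_run_s = 8340.26      # OpenCL device "gfx1036" (iGPU)

slowdown = igpu_run_s / discrete_run_s

print(f"discrete GPU: {discrete_run_s / 60:.1f} min")
print(f"iGPU:         {igpu_run_s / 60:.1f} min")
print(f"iGPU takes about {slowdown:.1f}x as long")
```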

Happy crunching...

Exard3k

Joined: 25 Jul 21

Posts: 66

Credit: 56155179

RAC: 0

29 Apr 2023 17:19:32 UTC

Message 211693 in response to message 211679

Scrooge McDuck wrote:

So... Martin (astro-marwil) now runs FGRPB1G tasks on iGPU and discrete GPU card in parallel:

iGPU seems to be slowed down by many parallel O3MD1 CPU tasks consuming memory bandwidth.

That's a big problem of consumer boards. I see memory bottlenecks even without using the iGPU. Even though it's DDR5, two channels just don't cut it, especially not if an iGPU wants its share too. And there is no nice value or rate limit for memory :)

And you don't get an iGPU on EPYC or W-3400.

astro-marwil

Joined: 28 May 05

Posts: 531

Credit: 641936543

RAC: 1106301

2 May 2023 12:09:24 UTC

Message 211828

Hello!

It now works as wanted!!!

Last night I found the error in the code that I had copied directly from the end of Ian&Steve C.'s Message 211606: the closing tag in the next-to-last line was missing. I hadn't seen it, even after looking a hundred times. Before that, I had deleted the <exclude_gpu> block. That also gave error remarks, but it worked for a prolonged time with both GPUs, until yesterday, when an update for .NET Framework was installed. From then on, only the iGPU was in charge, as before.

Nevertheless, it was great fun for me to write some lines of code after 30 to 40 years.

From mid-May on, I'll start to optimize the workload for maximum Cobblestones/Wh. At the moment, 7 threads of O3 (CPU) and one of FGRPB1G are running in parallel. That gives 85 to 90% CPU load and 75% RAM load, averaged over 1 minute. All cores are about equally loaded. Don't forget, I'm using Process Lasso, with the priority set to below normal for O3 (CPU) and to real-time for FGRPB1G. Crunching more threads in parallel increases everything: CPU load, RAM load and the running times of the threads.

Many thanks to all of you!

Kind regards and happy crunching

Martin
