Improvements in the code of the clients

DF1DX

Joined: 14 Aug 10

Posts: 105

Credit: 3840066854

RAC: 4868834

28 Aug 2021 10:13:05 UTC

Message 188577

Interesting. No crash here but about 10% slower on my Radeon VII with 3 WUs:

1.28: ~522 s,
1.18: ~475 s

clinfo shows:
Platform Version OpenCL 2.1 AMD-APP (3075.10)
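
As a quick check of the "about 10%" figure, the relative slowdown can be worked out from the two run times quoted above. This is only an illustrative sketch: the function name is made up for this example, and the inputs are the approximate per-task times from the post.

```python
def relative_slowdown(new_seconds: float, old_seconds: float) -> float:
    """Return how much slower the new run is, as a fraction of the old run time."""
    return (new_seconds - old_seconds) / old_seconds


# Approximate per-task run times quoted in the post (3 WUs at once on a Radeon VII).
slowdown = relative_slowdown(522.0, 475.0)
print(f"1.28 is about {slowdown:.1%} slower than 1.18")  # 1.28 is about 9.9% slower than 1.18
```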

bozz4science

Joined: 4 May 20

Posts: 15

Credit: 67643923

RAC: 3894

28 Aug 2021 10:46:56 UTC

Message 188579 in response to message 188537

Thanks Ian for the detailed reply. Much appreciated :) I applaud your effort in helping to bring these changes to fruition and thank all other volunteers that cooperated in testing & development, especially petri33, the GPU Users team and Bernd for working with you on the deployment of the incremental changes on the platform here!

Petri sounds like a hell of a guy to be able to test code on the fly in real time!!

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7218704931

RAC: 974809

28 Aug 2021 16:10:11 UTC

Message 188590 in response to message 188579

On my 6800 running on a Windows 10 system with the latest AMD driver at 2X with moderate clock limitation:

The beta test 1.28 FGRP gave 4% higher production than the production release 1.22. I've seen no abnormal terminations on 1.28, and at this point already have several validations.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46607972642

RAC: 64209488

28 Aug 2021 16:52:54 UTC

Message 188592 in response to message 188577

DF1DX wrote:

Interesting. No crash here but about 10% slower on my Radeon VII with 3 WUs:

1.28: ~522 s,
1.18: ~475 s

clinfo shows:
Platform Version OpenCL 2.1 AMD-APP (3075.10)

hmm I wonder how you got a proper app on linux. my schedule requests are still trying for the incorrectly named FGRPopencl2ati, while you were able to get the right FGRPopencl2-ati.

_________________________________________________________________________

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117468520582

RAC: 35484092

28 Aug 2021 19:55:45 UTC

Message 188604 in response to message 188592

Ian&Steve C. wrote:

hmm I wonder how you got a proper app on linux. my schedule requests are still trying for the incorrectly named FGRPopencl2ati, while you were able to get the right FGRPopencl2-ati.

It looks like Bernd has not removed/disabled (or whatever) the incorrect plan class but simply added the correct one as well.

For an RX 570 with no 'tricks' applied, I get entries for both in the scheduler log. The old one still gives the [CRITICAL] response. For the correct one, the message is quite clear now as to why no work is being sent:-

[version] OpenCL device version required min: 200, supplied: 102

Obviously, if I were to fudge the device version in coproc_info.xml, my setup would pass this test. The platform version is already noted there as 2.1 so it would be a very simple edit. Not much point in watching a bunch of test tasks fail though, so I won't be doing that :-). In DF1DX's case, the platform version is 2.1 so if there was a similar device version, the test app and tasks for it would be sent.
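
The scheduler message above compares two integers. Assuming the encoding it suggests, where version X.Y is stored as X*100 + Y (so 102 means OpenCL 1.2 and 200 means 2.0), the test can be sketched roughly as below. Both function names are hypothetical stand-ins, not the actual BOINC scheduler code, and the "OpenCL 1.2" device string is an illustrative input rather than output taken from a real host.

```python
import re


def encode_opencl_version(version_string: str) -> int:
    """Turn a string such as 'OpenCL 2.1 AMD-APP (3075.10)' into 201.

    Assumes the major*100 + minor encoding suggested by the log line
    'required min: 200, supplied: 102'.
    """
    match = re.search(r"OpenCL\s+(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"no OpenCL version found in {version_string!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return major * 100 + minor


def passes_min_version(supplied: int, required_min: int) -> bool:
    """Hypothetical stand-in for the scheduler's minimum-version test."""
    return supplied >= required_min


# A device reporting OpenCL 1.2 (illustrative string) fails the 2.0 requirement.
print(passes_min_version(encode_opencl_version("OpenCL 1.2 AMD-APP"), 200))  # False
# The platform version string quoted earlier in the thread would pass it.
print(passes_min_version(encode_opencl_version("OpenCL 2.1 AMD-APP (3075.10)"), 200))  # True
```

Note that the check in the log is against the device version, not the platform version, which is why a host can report a 2.1 platform and still be refused.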

A bit strange that there seems to be a loss of performance though.

Cheers,
Gary.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46607972642

RAC: 64209488

28 Aug 2021 23:48:32 UTC

Message 188610 in response to message 188604

mine never checks for the "good" one. only the bad one, nvidia ones, and normal ones.

<pre>
2021-08-28 23:43:24.7156 [PID=14422]    [send] [HOST#12830576] will accept beta work.  Scanning for beta work.
2021-08-28 23:43:24.7256 [PID=14422]    [version] Checking plan class 'FGRPopencl-ati'
2021-08-28 23:43:24.7285 [PID=14422]    [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2021-08-28 23:43:24.7285 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7286 [PID=14422]    [version] GPU RAM calculated: min: 766 MB, use: 600 MB, WU#570813978 CPU: 429 MB
2021-08-28 23:43:24.7286 [PID=14422]    [version] Peak flops supplied: 5e+10
2021-08-28 23:43:24.7286 [PID=14422]    [version] plan class ok
2021-08-28 23:43:24.7286 [PID=14422]    [version] Checking plan class 'FGRPopencl-nvidia'
2021-08-28 23:43:24.7286 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7286 [PID=14422]    [version] No CUDA devices found
2021-08-28 23:43:24.7286 [PID=14422]    [version] Checking plan class 'FGRPopencl1K-ati'
2021-08-28 23:43:24.7286 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7286 [PID=14422]    [version] GPU RAM calculated: min: 1000 MB, use: 600 MB, WU#570813978 CPU: 429 MB
2021-08-28 23:43:24.7286 [PID=14422]    [version] Peak flops supplied: 5e+10
2021-08-28 23:43:24.7286 [PID=14422]    [version] plan class ok
2021-08-28 23:43:24.7286 [PID=14422]    [version] Checking plan class 'FGRPopencl1K-nvidia'
2021-08-28 23:43:24.7286 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7286 [PID=14422]    [version] No CUDA devices found
2021-08-28 23:43:24.7287 [PID=14422]    [version] Checking plan class 'FGRPopenclTV-nvidia'
2021-08-28 23:43:24.7287 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7287 [PID=14422]    [version] No CUDA devices found
2021-08-28 23:43:24.7287 [PID=14422]    [version] Checking plan class 'FGRPopencl2ati'
2021-08-28 23:43:24.7287 [PID=14422] [CRITICAL]   Unknown plan class: FGRPopencl2ati
2021-08-28 23:43:24.7287 [PID=14422]    [version] Checking plan class 'FGRPopencl2Pup-nvidia'
2021-08-28 23:43:24.7287 [PID=14422]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2021-08-28 23:43:24.7287 [PID=14422]    [version] No CUDA devices found
2021-08-28 23:43:24.7288 [PID=14422]    [version] Best version of app hsgamma_FGRPB1G is 1.18 ID 945 FGRPopencl1K-ati (150.76 GFLOPS)</pre>

_________________________________________________________________________

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117468520582

RAC: 35484092

29 Aug 2021 8:32:00 UTC

Message 188617

That's weird! Here's the same section of the scheduler log that I noticed fairly soon after Bernd made his mea culpa announcement :-). I had a feeling that a fix might occur fairly quickly so was on the lookout for it.

<pre>
2021-08-28 09:57:59.0606 [PID=29432]    [send] [HOST#506163] will accept beta work.  Scanning for beta work.
2021-08-28 09:57:59.0708 [PID=29432]    [version] Checking plan class 'FGRPopencl-ati'
2021-08-28 09:57:59.0736 [PID=29432]    [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2021-08-28 09:57:59.0736 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0736 [PID=29432]    [version] GPU RAM calculated: min: 766 MB, use: 600 MB, WU#568320566 CPU: 429 MB
2021-08-28 09:57:59.0736 [PID=29432]    [version] Peak flops supplied: 5e+10
2021-08-28 09:57:59.0736 [PID=29432]    [version] plan class ok
2021-08-28 09:57:59.0736 [PID=29432]    [version] Checking plan class 'FGRPopencl-nvidia'
2021-08-28 09:57:59.0736 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0736 [PID=29432]    [version] No CUDA devices found
2021-08-28 09:57:59.0737 [PID=29432]    [version] Checking plan class 'FGRPopencl1K-ati'
2021-08-28 09:57:59.0737 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0737 [PID=29432]    [version] GPU RAM calculated: min: 1000 MB, use: 600 MB, WU#568320566 CPU: 429 MB
2021-08-28 09:57:59.0737 [PID=29432]    [version] Peak flops supplied: 5e+10
2021-08-28 09:57:59.0737 [PID=29432]    [version] plan class ok
2021-08-28 09:57:59.0737 [PID=29432]    [version] Checking plan class 'FGRPopencl1K-nvidia'
2021-08-28 09:57:59.0737 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0737 [PID=29432]    [version] No CUDA devices found
2021-08-28 09:57:59.0737 [PID=29432]    [version] Checking plan class 'FGRPopenclTV-nvidia'
2021-08-28 09:57:59.0737 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0737 [PID=29432]    [version] No CUDA devices found
2021-08-28 09:57:59.0738 [PID=29432]    [version] Checking plan class 'FGRPopencl2-ati'
2021-08-28 09:57:59.0738 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0738 [PID=29432]    [version] GPU RAM calculated: min: 1000 MB, use: 750 MB, WU#568320566 CPU: 429 MB
2021-08-28 09:57:59.0738 [PID=29432]    [version] OpenCL device version required min: 200, supplied: 102
2021-08-28 09:57:59.0738 [PID=29432]    [version] Checking plan class 'FGRPopencl2ati'
2021-08-28 09:57:59.0738 [PID=29432] [CRITICAL]   Unknown plan class: FGRPopencl2ati
2021-08-28 09:57:59.0738 [PID=29432]    [version] Checking plan class 'FGRPopencl2Pup-nvidia'
2021-08-28 09:57:59.0738 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0738 [PID=29432]    [version] No CUDA devices found
2021-08-28 09:57:59.0739 [PID=29432]    [version] Best version of app hsgamma_FGRPB1G is 1.18 ID 945 FGRPopencl1K-ati (195.12 GFLOPS)</pre>

Notice that 2-ati is checked and disallowed for OpenCL version reasons before it gets to the wrong 2ati check that gives the error response.
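
For anyone comparing their own scheduler log against these excerpts, the per-plan-class outcome can be pulled out with a short script. The following is a minimal sketch that assumes the line layout seen in the logs in this thread (timestamp, `[PID=...]`, a bracketed tag, then the message). The function name and regular expressions are made up for this example and are not part of BOINC.

```python
import re

# Matches "[PID=29432]    [version] message" and "[PID=29432] [CRITICAL]   message".
LINE_RE = re.compile(r"\[PID=\d+\]\s+\[(\w+)\]\s+(.*)$")
CHECK_RE = re.compile(r"Checking plan class '([^']+)'")


def plan_class_outcomes(log_text: str) -> dict:
    """Map each checked plan class to the last message logged before the next check."""
    outcomes = {}
    current = None
    for line in log_text.splitlines():
        parsed = LINE_RE.search(line)
        if parsed is None:
            continue
        tag, message = parsed.groups()
        check = CHECK_RE.search(message)
        if check:
            current = check.group(1)
            outcomes[current] = "(no message)"
        elif message.startswith("Best version of app"):
            current = None  # the summary line belongs to no plan class
        elif current is not None:
            outcomes[current] = f"[{tag}] {message}"
    return outcomes


# Trimmed excerpt of the log quoted above.
sample = """\
2021-08-28 09:57:59.0738 [PID=29432]    [version] Checking plan class 'FGRPopencl2-ati'
2021-08-28 09:57:59.0738 [PID=29432]    [version] parsed project prefs setting 'gpu_util_fgrp': 0.500000
2021-08-28 09:57:59.0738 [PID=29432]    [version] GPU RAM calculated: min: 1000 MB, use: 750 MB, WU#568320566 CPU: 429 MB
2021-08-28 09:57:59.0738 [PID=29432]    [version] OpenCL device version required min: 200, supplied: 102
2021-08-28 09:57:59.0738 [PID=29432]    [version] Checking plan class 'FGRPopencl2ati'
2021-08-28 09:57:59.0738 [PID=29432] [CRITICAL]   Unknown plan class: FGRPopencl2ati
"""

outcomes = plan_class_outcomes(sample)
for plan_class, outcome in outcomes.items():
    print(f"{plan_class}: {outcome}")
```

A plan class that appears in one host's output but not in another's, such as FGRPopencl2-ati here, is then easy to spot by comparing the keys of the two results.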

You can see the timestamp - just before 10am UTC - 8PM my time. Your timestamp is quite a bit later so I'm wondering if further changes might have been made during that interval which have caused the correct plan class to somehow be overlooked/non-functional for you. Hopefully Bernd will notice these comments and check the situation.

My host no longer has beta work allowed and I'm not going to revert it yet again. I'm always short of locations (venues) and I had to jump through a few hoops to get beta enabled without risking other hosts as well during the exercise. I tend to avoid beta like the plague unless I can easily set up a single machine. I have no appetite for risking a whole bunch. Fortunately, the group was small enough and I was able to suspend network access on other members for the duration without too much effort.

Cheers,
Gary.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46607972642

RAC: 64209488

29 Aug 2021 16:06:59 UTC

Message 188628

yeah I guess I'll have to wait for Bernd to address whatever is happening there. maybe he just needs to remove the bad plan class so it stops checking it?

I also noticed that the nvidia 1.28 app is now out of beta. it's a production app now. everyone should be able to get it.

_________________________________________________________________________

Wedge009

Joined: 5 Mar 05

Posts: 122

Credit: 17364054280

RAC: 7178117

30 Aug 2021 0:47:52 UTC

Message 188650

I was told this is where the beta applications for FGRPB1G are primarily being discussed. I'm reporting zero success on v1.28 with my older AMD hardware, and I have only received the beta applications on Windows, none for Linux.

https://einsteinathome.org/content/fgrpopencl2-ati-beta-test-application-broken

I'll see if I can get anything for my old Pascal-based GPUs, since I've been advised the application changes are supposed to benefit NV GPUs more.

Edit: Looks like I already had some from the weekend - it's completing in around 5 minutes and my recollection is they used to complete at close to 7 minutes, so that's a good improvement. Still requires a lot of wait time on the CPU, by the looks of it.

Soli Deo Gloria

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46607972642

RAC: 64209488

30 Aug 2021 16:21:07 UTC

Message 188678

looks like Bernd got around to cleaning up the scheduler issue with the incorrect plan class. I wanted to try this app out because I know this system has the right drivers and wanted to see if it is indeed a problem with the app itself.

RX 570 4GB (Polaris)

Ubuntu 20.04.3 LTS, 5.11.0-27 kernel

ROCm 4.2 drivers

my Linux/RX570 picked up a handful of 1.28 tasks now. they are processing normally. maybe 2-3% slower than the 1.18 tasks, but no errors.

https://einsteinathome.org/task/1161439011

host: https://einsteinathome.org/host/12830576

will re-test with my code applied over top of this. It does seem likely that the issues people are having come down to the drivers and not the app itself. I would highly recommend that anyone having issues on Linux at least try the latest ROCm drivers instead of the AMDGPU-Pro drivers, which have a more limited ROCr or PAL implementation that I've never found to work with these new features on older GPUs (I never got PAL drivers to work properly in newer kernels, and ROCr doesn't have proper 2.0 support for old GPUs). Vega "should" work with ROCr in the AMDGPU-Pro package based on the information I've been given, but as always with AMD drivers, what should work and what actually works can often be two totally different things.

_________________________________________________________________________
