1.28 Gamma-ray pulsar binary search #1 on GPUs (FGRPopencl2-ati)

Jim1348
Joined: 19 Jan 06
Posts: 449
Credit: 208,736,569
RAC: 31,781
Topic 225945

I am now running the new 1.28 GR work units on an RX 570 (Ubuntu 20.04.3).

They are taking around 10 minutes, supported by six cores of a Ryzen 3600.

https://einsteinathome.org/host/12878436/tasks/2/0?sort=desc&order=Sent

 

This is about the same as the 1.18.

https://einsteinathome.org/content/what-do-you-expect-crunching-einsteinhome#comment-188429

catavalon21
Joined: 5 Nov 11
Posts: 3
Credit: 26,388,814
RAC: 186,730

On my 1660 Ti, also running Ubuntu, my times have gone from the 750-second range to the 450-second range.

 

 

solling2
Joined: 20 Nov 14
Posts: 197
Credit: 513,956,711
RAC: 738,616

For my Polaris20/i5 test system I noticed the following times (x2, no power/undervolting tool applied):

Mint 20.2 [5.4.0], Boinc 7.16.6:

960 sec: coproc file in original status, so app version 1.18

960 sec: coproc file in trick status, so app version 1.28  

Xubuntu 18.04.5 [5.4.0], Boinc 7.9.3:

1015 sec: driver amdgpu-pro 20.40, so app version 1.18

solling2
Joined: 20 Nov 14
Posts: 197
Credit: 513,956,711
RAC: 738,616

Oddly, however, for every 1.28 workunit that I crunched after a pause, I got messages like the following, and the tasks disappeared after uploading without earning any credit. (The ones before the pause were all fine.)

2021-09-02 12:08:19.6748 [PID=2287 ] [CRITICAL]   [HOST#12900320] [RESULT#1162302446] [WU#571500842] result already over [outcome=1 validate_state=0]: result already reported as success
2021-09-02 12:08:19.6748 [PID=2287 ]    [handle] [HOST#12900320] [RESULT#1162312911] [WU#571505665] got result (DB: server_state=5 outcome=1 client_state=5 validate_state=1 delete_state=2)

2021-09-02 12:08:19.6793 [PID=2287 ] [debug]   [HOST#12900320] MSG(high) Completed result LATeah4012L00_948.0_0_0.0_1994655_0 refused: result already reported as success
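In case it helps anyone matching these messages against their task list, here is a minimal Python sketch that pulls the host, result and workunit IDs out of one such line. The regular expression and the `extract_ids` helper are my own assumptions based only on the excerpt above, not on any documented log format.

```python
import re

# One line copied verbatim from the log excerpt above.
line = (
    "2021-09-02 12:08:19.6748 [PID=2287 ] [CRITICAL]   [HOST#12900320] "
    "[RESULT#1162302446] [WU#571500842] result already over "
    "[outcome=1 validate_state=0]: result already reported as success"
)

# Assumed pattern: bracketed tags of the form [HOST#n], [RESULT#n], [WU#n].
ID_PATTERN = re.compile(r"\[(HOST|RESULT|WU)#(\d+)\]")


def extract_ids(log_line):
    """Return a dict mapping each tag found in the line to its numeric ID."""
    return {kind: int(number) for kind, number in ID_PATTERN.findall(log_line)}


ids = extract_ids(line)
print(ids)
```

Running the same helper over the second line shows a different RESULT and WU number, so the two messages refer to different tasks on the same host.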


After that, I aborted the few remaining ones. They also disappeared from the dashboard instead of appearing in the error column. Odd!

 

archae86
Joined: 6 Dec 05
Posts: 3,020
Credit: 4,981,167,045
RAC: 3,061,949

By now I have run many hundreds of 1.28 tasks on four different GPUs residing on three different hosts:
Two 5700s
one 6800 XT
one 6800

While I have not seen the failures that others have reported, I have also seen extremely little performance benefit. Tentatively, I think I see about a 1% benefit in the two cases where the GPU was running tasks at 2X multiplicity, while the cases running at 3X or 4X show no discernible benefit at all.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 912
Credit: 6,646,165,501
RAC: 28,158,745

Thanks arch. Could you run a few tests on both a Navi and a Big Navi system doing 1x w/1.22 and 1x w/1.28? That's where I observed the reported 20% increase in pre-release, with 1x. I do think the app still needs some tuning, though, since it's not performing the same as the pre-release code, so something else might need to be added.

 

The failures that most people have seen seem to be driver related.

Windows drivers do not seem to have proper OpenCL support for Vega or older cards (even though they report that they do), so they produce errors. I have not seen a single Windows-based host with Vega or Polaris generation cards complete a successful run. I took my Polaris RX 570 GPU and the entire platform that worked on Linux, and it would not work on Windows.

I do not believe it to be a problem with the Windows application itself, since you (arch) have been successful in running Windows drivers with Navi and Big Navi. Wedge had issues with one of his hosts on Linux with ROCm drivers and a Vega GPU, but I believe his issue to be hardware incompatibility (a very old CPU, which seems to matter to the ROCm drivers), and his failure mode was different from everyone else's (no error; tasks just didn't run, with no progress and no GPU utilization).

The new app has been shown to work on several Linux hosts with ROCm drivers, with Polaris and Vega cards (no change in speed, or slightly slower) and with Navi cards as well (small speedup).

_____________________________________________

DF1DX
Joined: 14 Aug 10
Posts: 72
Credit: 2,104,906,754
RAC: 1,815,184

Linux Mint 19.3 Xfce, amdgpu-pro 20.30, Radeon VII, x 3:

1.18: 472 s

1.28: 520 s

archae86
Joined: 6 Dec 05
Posts: 3,020
Credit: 4,981,167,045
RAC: 3,061,949

Ian&Steve C. wrote:
run a few tests on both a Navi and a Big Navi system doing 1x w/1.22 and 1x w/1.28? That's where I observed the reported 20% increase in pre-release, with 1x.

I wondered if you perhaps had run the tests at 1X.  I don't think any serious Einstein Navi user runs FGRP that way, so if it turns out to be true that the benefit is strongly multiplicity dependent, that blows a big hole in the case for any AMD benefit worth having.

I've started work toward generating 1X comparisons, first on a 5700 system, then on a 6800 system.  I should have at least one comparison before the end of the day.

I, in turn, have a request of you.  Please don't refer to me by a diminutive.  I'm fine with you calling me archae86 or Peter, but don't wish to be called arch or Pete.  Thank you.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 912
Credit: 6,646,165,501
RAC: 28,158,745

Tom's gains on his 5700 continued to show improvement on 2x also. It wasn't just on 1x.

Edit:

To be more accurate, here are the results we observed with Tom's 5700:

  • 1.18app (default) 1x tasks = 400s run time.
  • 1.18app (with patch) 1x tasks = 324s run time.
  • 1.18app (default) 2x tasks = 610s (305s effective)
  • 1.18app (with patch) 2x tasks = 565s (282s effective)
  • 1.28app (default) 2x tasks = 595s (297s effective)

So there was roughly a 7% advantage for 2x operation with the test code (305s vs 282s effective), but only a ~2-3% advantage with the 1.28 app.
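For anyone who wants to redo the arithmetic on the figures listed above, here is a minimal Python sketch. It assumes "effective" time is simply the run time divided by the number of tasks running concurrently; the labels and helper names are mine, purely for illustration.

```python
# Run times (seconds) as quoted in the list above for Tom's 5700.
# Each entry: (label, concurrent tasks, wall-clock seconds per task).
runs = [
    ("1.18 default, 1x", 1, 400),
    ("1.18 patched, 1x", 1, 324),
    ("1.18 default, 2x", 2, 610),
    ("1.18 patched, 2x", 2, 565),
    ("1.28 default, 2x", 2, 595),
]


def effective_seconds(multiplicity, run_time):
    """Wall-clock time per task divided by the number of concurrent tasks."""
    return run_time / multiplicity


def percent_gain(baseline, candidate):
    """Reduction in effective time relative to the baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


effective = {label: effective_seconds(n, t) for label, n, t in runs}

for label, secs in effective.items():
    print(f"{label}: {secs:.1f} s effective")

gain_1x_patch = percent_gain(effective["1.18 default, 1x"], effective["1.18 patched, 1x"])
gain_2x_patch = percent_gain(effective["1.18 default, 2x"], effective["1.18 patched, 2x"])
gain_2x_128 = percent_gain(effective["1.18 default, 2x"], effective["1.28 default, 2x"])

print(f"patched vs default at 1x: {gain_1x_patch:.1f}%")
print(f"patched vs default at 2x: {gain_2x_patch:.1f}%")
print(f"1.28 vs 1.18 default at 2x: {gain_2x_128:.1f}%")
```

The list above shows the halved times as whole seconds, so the script's 282.5 and 297.5 correspond to the 282 and 297 quoted there.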

 

I was just looking for a more apples-to-apples comparison, to be sure that there are still only marginal gains when running 1x.

 

The new 1.28 AMD app is more than likely still missing some of the key optimizations Petri made in our test code, and Bernd has informed me that he will be away for most of September, so we are unlikely to see any update until next month.

_____________________________________________

catavalon21
Joined: 5 Nov 11
Posts: 3
Credit: 26,388,814
RAC: 186,730

I completely missed that this was an AMD/ATI thread; sorry about that.  I was so smitten with "Petri33 was here" ... well,

#idiot

 

 

Jim1348
Joined: 19 Jan 06
Posts: 449
Credit: 208,736,569
RAC: 31,781

catavalon21 wrote:

I completely missed that this was an AMD/ATI thread; sorry about that.  I was so smitten with "Petri33 was here" ... well,

I am perfectly happy to see Nvidia results.  I just happen to have my RX 570 on Einstein most of the time, since that is what it does best.  Why don't you start a separate thread for Nvidia, to keep them straight?
