Discussion Thread for the Continuous GW Search known as O2MD1 (now O2MDF - GPUs only)

robl
Joined: 2 Jan 13
Posts: 1,633
Credit: 1,102,670,332
RAC: 692,292

Curious. On my Linux host with an AMD GPU I am seeing run time / CPU time of 1081/525, but on another member's Windows PC it's 1346/1061. The WUs are both "Gravitational Wave search O2 Multi-Directional v1.10 () x86_64-pc-linux-gnu". The Windows PC is utilizing an AMD Radeon (TM) R9 390 Series (8192MB) while I am running an AMD Radeon (TM) RX 480 Graphics (8097MB). I suppose the CPU time difference of 525 vs 1061 could be attributed to the GPUs.

Richie
Joined: 7 Mar 14
Posts: 579
Credit: 1,684,170,539
RAC: 63,040

I think it could be a 'Linux vs Windows' thing. I'm running 3x with an RX 580 on another host. Total run times may fluctuate somewhat (and occasionally there are some strange black sheep included), but the CPU time / run time factor seems to be very consistently 0.7 for that RX 580 + Windows. I checked my two R9 390 + Windows hosts and the same factor for them is consistently 0.8. None of those systems is currently set up for dual-boot; it would've been nice to find out what the CPU times would be under Linux. Hmm, I have a faint memory that something similar about the CPU time of a GPU application in Linux vs. Windows has been discussed earlier on this forum.

edit: I see you have a Ryzen CPU in that host. Perhaps different types of CPUs and systems may have an effect on the CPU times in general... with this new GW GPU app. I don't remember how it was with the previous GPU applications.
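
The CPU time / run time factor discussed above is easy to compute from a host's task stats. A minimal sketch, using the two run time / CPU time pairs quoted earlier in this thread as sample values:

```python
# CPU time / run time factor per host, using the (run_time_s, cpu_time_s)
# pairs quoted above in this thread as sample data.
tasks = {
    "RX 480 + Linux":   (1081, 525),
    "R9 390 + Windows": (1346, 1061),
}

for host, (run_s, cpu_s) in tasks.items():
    factor = cpu_s / run_s
    print(f"{host}: cpu/run factor = {factor:.2f}")
```

The Windows pair works out to about 0.79, which lines up with the ~0.8 observed for the R9 390 + Windows hosts.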

Zalster
Joined: 26 Nov 13
Posts: 3,059
Credit: 3,341,604,897
RAC: 0

Wow, these GPU O2MD1 tasks run really fast compared to their CPU counterparts. 190 seconds compared to 25K seconds.

 

Edit..  Also looks like the app got updated to 1.10 from 1.09

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,210
Credit: 43,575,393,583
RAC: 44,343,833

In this message in the O2AS discussion thread, I posted data for crunch times for the V1.09 GPU app when running on 4 different CPU/GPU combinations for task multiplicities up to 4x.  The results showed significant improvement in output (ie reduction in secs/task) in all cases when using the higher multiplicities.  With the advent of the O2MD1 search using GPUs, I'm keen to get similar information for the new V1.10 app.  It's a different type of search (directed at specific targets rather than covering the whole sky) so performance is likely to be quite different.

I decided to use 2 hosts that were used in the previous tests, the Q6600/RX 460 and the i5-3470/RX 570 which were the 1st and 4th from the previous list.  Both hosts got work for frequencies around the 215Hz mark so well above the low end values that were reported by others earlier on.

I found these tasks were able to crunch very quickly.  There are already enough returned results to provide some information about expected crunch times so I'll give details here using the same format (columns, abbreviations, etc) as previously.  Each concurrent GPU task had access to the support of a full CPU core.


CPU / GPU (Cores / Threads / GHz)    Tsks   Multi     Pnd   Val   Inc   Inv   Err   Productivity values (secs/task)
=================================    ====   =======   ===   ===   ===   ===   ===   ===============================
Q6600 / RX 460 (4C / 4T / 2.4 GHz)     20   1,2,3      20     0     0     0     0   1300s,  975s,  712s
i5-3470/RX 570 (4C / 4T / 3.2 GHz)     28   1,2,3,4    28     0     0     0     0    586s,  380s,  330s,  312s


Only small numbers were crunched at the lowest multiplicities - just enough to get a basic value for the crunch time.  The bulk of the results were at 3x for the RX 460 and 4x for the RX 570.  The crunch times seemed to become rather more variable for the 570 at 4x so I didn't try 4x for the 460 or anything higher than 4x for the 570.  There was good consistency in the times for both hosts up to 3x.

The CPU time component for each task was surprisingly constant irrespective of multiplicity.  I guess that suggests a fairly constant amount of CPU work per task which shows as a relatively uniform time if there's always a full core available.  The slower CPU will use more time to provide that constant amount of work.  Here is a small table to show the typical values of elapsed time/CPU time for both hosts at the multiplicities used.


GPU Type    Multi    Elapsed    CPU    Tsks
========    =====    =======    ===    ====
RX 460        1x        1300    496       1
RX 460        2x        1950    509       4
RX 460        3x        2135    452      15
RX 570        1x         586    278       1
RX 570        2x         760    278       4
RX 570        3x         990    286       3
RX 570        4x        1246    310      20
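
The productivity values (secs/task) in the first table follow directly from these elapsed times: elapsed time divided by task multiplicity. A quick sketch using the figures above:

```python
# Per-task productivity (secs/task) = elapsed time / multiplicity,
# using the (gpu, multiplicity, elapsed_s) values from the table above.
runs = [
    ("RX 460", 1, 1300),
    ("RX 460", 2, 1950),
    ("RX 460", 3, 2135),
    ("RX 570", 1,  586),
    ("RX 570", 2,  760),
    ("RX 570", 3,  990),
    ("RX 570", 4, 1246),
]

for gpu, multi, elapsed in runs:
    print(f"{gpu} at {multi}x: {elapsed / multi:.0f} secs/task")
```

For example, the RX 460 at 3x gives 2135 / 3 ≈ 712 secs/task, matching the productivity column in the first table.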

All results so far are pending.  Since it may be a while before any validations are performed, I've switched the hosts back to FGRPB1G until it becomes clear that validation is OK.  I don't see much point in crunching more until we see how validation goes.

Cheers,
Gary.

Richie
Joined: 7 Mar 14
Posts: 579
Credit: 1,684,170,539
RAC: 63,040

Zalster wrote:
Wow, these GPU O2MD1 run really fast compared to their CPU counterparts.  190 seconds compared to 25K seconds.

Looks like that particular GPU can finish its duty cycle before the CPU gets its own workload done... run times are smaller than CPU times. That's computational speed metal!

cecht
Joined: 7 Mar 18
Posts: 709
Credit: 777,531,688
RAC: 346,271

For the v1.10 app on my two RX 570s, running at 3X, I have 10 valids, with 280 pending and no errors or invalids yet. That's looking hopeful!

Ideas are not fixed, nor should they be; we live in model-dependent reality.

robl
Joined: 2 Jan 13
Posts: 1,633
Credit: 1,102,670,332
RAC: 692,292

cecht wrote:
For the v1.10 app on my two RX570s, running at 3X, I have 10 valids, with 280 pending and no errors or invalids, yet. That's looking hopeful!

Yes, except for:  https://einsteinathome.org/goto/comment/173777

cecht
Joined: 7 Mar 18
Posts: 709
Credit: 777,531,688
RAC: 346,271

robl wrote:
cecht wrote:
For the v1.10 app on my two RX570s, running at 3X, I have 10 valids, with 280 pending and no errors or invalids, yet. That's looking hopeful!

Yes, except for:  https://einsteinathome.org/goto/comment/173777

Ahh, the joy of beta testing. :/

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,210
Credit: 43,575,393,583
RAC: 44,343,833

cecht wrote:
Ahh, the joy of beta testing. :/

After I did some initial testing yesterday, I got quite a few more tasks but then decided not to run them. I was tempted to think all would be well but, .... so back to FGRPB1G they went for the overnight run. As I survey the scene this morning, I'm sure glad I was cautious :-).

Unfortunately, a new app with more sensitivity sounds like longer crunch times ....  I guess that dramatic speed increase we were seeing may just be too good to be true :-).

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,938
Credit: 200,208,828
RAC: 43,704

Our internal tests showed a runtime increase of about 20% (both CPU and GPU). We thought this to be justified.

 

BM
