New Improved Gravitational Wave App - Discussion

Tom M

Joined: 2 Feb 06

Posts: 6439

Credit: 9567406445

RAC: 8695848

6 Apr 2024 4:22:01 UTC

Message 223881

My understanding is that the faster the CPU processes an All Sky GW task, the less time the whole task takes.

So I have shut off the SMT on this system.

The top end of the MHz on the CPU hasn't gone up much, if at all.

I am expecting the total runtime average of the tasks to drop.

And the predicted RAC to increase.

Tom M

A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18714511339

RAC: 6363128

6 Apr 2024 15:35:58 UTC

Message 223892 in response to message 223881

Tom, your understanding of SMT is a bit lacking. In modern processors, SMT for the most part does not cause a loss of performance from contention for CPU architecture resources.

As you noticed, you did not gain any clock frequency when you turned SMT off. The faster the clocks, the faster the computation completes. Only then would work returned go up along with the RAC.

If anything, your RAC will drop because you are only doing half the work you were doing before.
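Keith's argument comes down to throughput arithmetic. The sketch below assumes one task per hardware thread and uses made-up numbers (a hypothetical 32-thread host and a 3600 s runtime); it only shows how much faster each task would have to run with SMT off before the work returned stops dropping.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(concurrent_tasks: int, runtime_s: float) -> float:
    """Tasks finished per day when `concurrent_tasks` run side by side."""
    return concurrent_tasks * SECONDS_PER_DAY / runtime_s


def breakeven_runtime(runtime_smt_on_s: float, threads_on: int, threads_off: int) -> float:
    """Runtime that SMT-off tasks must beat for throughput to match SMT on."""
    return runtime_smt_on_s * threads_off / threads_on


# Hypothetical host: 32 threads with SMT on, 16 with SMT off,
# and an invented 3600 s per-task runtime with SMT on.
smt_on = tasks_per_day(32, 3600.0)
limit = breakeven_runtime(3600.0, 32, 16)
print(f"SMT on: {smt_on:.0f} tasks/day; SMT off must finish a task in under {limit:.0f} s")
```

With half the threads, each task has to finish in less than half the time to come out ahead, which is unlikely if the clock frequency did not rise.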

Tom M

Joined: 2 Feb 06

Posts: 6439

Credit: 9567406445

RAC: 8695848

6 Apr 2024 16:55:07 UTC

Message 223894 in response to message 223892

Keith,

I understand your point. I believe that trying this will not slow the individual task processing down. I also believe the CPU portion of the tasks is not coded to be multi-threaded.

I will note that the 7601 HPC recommendations also suggest turning off SMT.

My All Sky GW tasks appear to spend about 50 percent of their clock time running purely on the CPU.

Tom M

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46747262642

RAC: 64158571

6 Apr 2024 17:06:01 UTC

Message 223896

Tom you should switch to the 1.08/1.15 CUDA app. It does much better than the opencl app in my opinion.

1.07 is the Nvidia-OpenCL app. this uses CPU for the recalc sections
1.08 was the first release of the Nvidia-CUDA app. this uses CPU for the recalc also
1.14 is the working release of the Nvidia-CUDA app with GPU based recalc, does not rely on the CPU so much. this app is the default now (non-beta)
1.15 is the same binary file as 1.08. 1.15 is in the beta channel so if you select beta, you will get this one. identical to 1.08 in every way.
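The version list above can be kept as a small lookup table. This is only a sketch of the information in the post: the `AppVersion` type and its field names are invented for illustration, and the channel is left as `None` where the post does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppVersion:
    api: str                # "OpenCL" or "CUDA"
    recalc_on: str          # where the recalc stages run: "CPU" or "GPU"
    channel: Optional[str]  # "default", "beta", or None if not stated in the post


APPS = {
    "1.07": AppVersion("OpenCL", "CPU", None),
    "1.08": AppVersion("CUDA", "CPU", None),
    "1.14": AppVersion("CUDA", "GPU", "default"),
    "1.15": AppVersion("CUDA", "CPU", "beta"),  # same binary as 1.08
}


def gpu_recalc_versions() -> list:
    """Versions whose recalc stages run on the GPU."""
    return [v for v, app in APPS.items() if app.recalc_on == "GPU"]


print(gpu_recalc_versions())
```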

I like the 1.14 app (CUDA, GPU based recalc) for my Volta cards. maybe it's the wide memory bus that makes this app faster than 1.08/1.15 here.

i like the 1.08/1.15 app for my 3080Tis. they are consistently faster than 1.14 on that system.

but either app should be faster than the 1.07 app you're using now.

_________________________________________________________________________

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18714511339

RAC: 6363128

6 Apr 2024 20:21:06 UTC

Message 223897 in response to message 223896

Ian, how did you persuade the scheduler to send you the 1.14 app? Setting beta gets you the 1.15 app.

I don't remember ever testing for the gpu-recalc 1.14 app and wanted to see how it compares to 1.08/1.15.

[Edit] Never mind, I found your post about the team package containing it.

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18714511339

RAC: 6363128

6 Apr 2024 21:07:35 UTC

Message 223898

I looked at the output of both the 1.08 and 1.14 apps and they both state that recalc takes place on the cpu.

I thought the 1.14 app was supposed to do the recalc stages on the gpu?

Did the devs not change the code that produces the output log to indicate the recalcs are taking place on the gpu?

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46747262642

RAC: 64158571

6 Apr 2024 21:24:39 UTC

Message 223899 in response to message 223897

Keith Myers wrote:

Ian, how did you persuade the scheduler to send you the 1.14 app? Setting beta gets you the 1.15 app.

I don't remember ever testing for the gpu-recalc 1.14 app and wanted to see how it compares to 1.08/1.15.

[Edit] Never mind, I found your post about the team package containing it.

with beta/test applications selected, you should be sent 1.15.

with no beta/test selected, you should be sent 1.14. this is the default now. but yeah you can just edit your app_info and move the new file over.

_________________________________________________________________________

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46747262642

RAC: 64158571

6 Apr 2024 21:35:00 UTC

Message 223900 in response to message 223898

Keith Myers wrote:

I looked at the output of both the 1.08 and 1.14 apps and they both state that recalc takes place on the cpu.

I thought the 1.14 app was supposed to do the recalc stages on the gpu?

Did the devs not change the code that produces the output log to indicate the recalcs are taking place on the gpu?

1.08 app output: https://einsteinathome.org/task/1598957736

Quote:

...
2024-04-06 17:13:56.0437 (3474657) [normal]: Search FstatMethod used: 'ResampGPU'
2024-04-06 17:13:56.0437 (3474657) [normal]: Recalc FstatMethod used: 'DemodSSE'
...

1.14 output: https://einsteinathome.org/task/1598820483

Quote:

...
2024-04-06 21:10:16.7971 (362925) [normal]: Search FstatMethod used: 'ResampGPU'
2024-04-06 21:10:16.7971 (362925) [normal]: Recalc FstatMethod used: 'DemodGPU'
...

1.14 is definitely using the GPU. there's still a bit of GPU activity during the recalc portions [37.5-50.0%] and [87.5-100%]. the outputs are different between them IMO.
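The check Ian did by eye can also be scripted. The sketch below pulls the two FstatMethod lines out of a task's output text; the regular expression is inferred only from the lines quoted above, and treating a method name that ends in "GPU" as "ran on the GPU" is an assumption based on those same four names.

```python
import re

# Pattern inferred from the quoted log lines only; other log variants may differ.
FSTAT_RE = re.compile(r"(Search|Recalc) FstatMethod used: '([^']+)'")


def fstat_methods(log_text: str) -> dict:
    """Map each stage ('Search', 'Recalc') to the method name found in the log."""
    return dict(FSTAT_RE.findall(log_text))


def recalc_on_gpu(log_text: str) -> bool:
    # Assumption: a method name ending in 'GPU' means that stage ran on the GPU.
    return fstat_methods(log_text).get("Recalc", "").endswith("GPU")


LOG_1_08 = """\
2024-04-06 17:13:56.0437 (3474657) [normal]: Search FstatMethod used: 'ResampGPU'
2024-04-06 17:13:56.0437 (3474657) [normal]: Recalc FstatMethod used: 'DemodSSE'
"""
LOG_1_14 = """\
2024-04-06 21:10:16.7971 (362925) [normal]: Search FstatMethod used: 'ResampGPU'
2024-04-06 21:10:16.7971 (362925) [normal]: Recalc FstatMethod used: 'DemodGPU'
"""

print("1.08 recalc on GPU:", recalc_on_gpu(LOG_1_08))
print("1.14 recalc on GPU:", recalc_on_gpu(LOG_1_14))
```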

_________________________________________________________________________

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18714511339

RAC: 6363128

6 Apr 2024 22:31:36 UTC

Message 223902 in response to message 223900

I guess I was looking at the text stating how much memory was used on the cpu. Didn't notice the part you quoted.

I saw the wattage on the gpu go down to about half of what it normally uses in the beginning and end of the task.

Maybe a tad slower on the gpu than the cpu on my 3090. Not enough tasks completed yet to definitively state it's slower. Maybe 20-30 seconds slower on average. Hard to tell because of the task variability.
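One way to judge whether a 20-30 second gap stands out from the task-to-task variability is to compare the difference in mean runtime with its standard error (Welch's t statistic). The runtimes below are invented placeholders, not measurements from any host; this is a sketch of the comparison, not a result.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Difference in means (b - a) divided by its standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / standard_error


# Hypothetical runtimes in seconds, for illustration only.
cpu_recalc = [600.0, 620.0, 640.0]
gpu_recalc = [630.0, 650.0, 670.0]

diff = mean(gpu_recalc) - mean(cpu_recalc)
t = welch_t(cpu_recalc, gpu_recalc)
print(f"mean difference: {diff:.0f} s, t statistic: {t:.2f}")
```

A t value well below about 2 with only a handful of tasks means the gap could easily be noise, so more completed tasks are needed before calling one app slower.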

I just edited the 1.14 app into my app_info to change over. Only trying it out on this daily driver. Not seeing enough benefit so far to justify swapping every host over yet.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46747262642

RAC: 64158571

6 Apr 2024 23:27:38 UTC

Message 223904

yeah it probably depends a lot on what your specific GPU/CPU combo is.

my theory is that overall memory bus width and/or bandwidth might be a factor for which is better. I don't have much real evidence for that other than some passing comments from Bernd about random memory access patterns and looking at the specs of things.

the EPYC DDR4 8-ch mem is a wide pipe (512-bit). but the 3072-bit and 4096-bit HBM links of the Titan V and V100 (resp.) are wider, and on these systems the GPU recalc app (1.14) is better. the 3080Tis only have a relatively small 384-bit bus, and my host with 3080Tis (same 64-core EPYC CPU as the Titan V systems) does better with the CPU recalc 1.08 app. and it wasn't even close: from what i remember, 1.14 was considerably slower on that host, like 30% slower.
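The bus-width figures in that comparison follow from simple arithmetic: a DDR4 channel carries 64 data bits, so eight channels give 512 bits. The sketch below lines those numbers up against the GPU figures quoted in the post. It compares width only, not bandwidth (which also depends on data rates the post does not give), and the function names are illustrative.

```python
DDR4_CHANNEL_BITS = 64  # data width of one DDR4 channel, ECC bits not counted


def ddr4_bus_bits(channels: int) -> int:
    """Total data-bus width of a multi-channel DDR4 memory system."""
    return channels * DDR4_CHANNEL_BITS


HOST_BUS_BITS = ddr4_bus_bits(8)  # 8-channel EPYC host from the post

# GPU memory bus widths as quoted in the post.
GPU_BUS_BITS = {"Titan V": 3072, "V100": 4096, "3080 Ti": 384}


def gpu_bus_wider_than_host(gpu: str) -> bool:
    return GPU_BUS_BITS[gpu] > HOST_BUS_BITS


for gpu in GPU_BUS_BITS:
    print(gpu, GPU_BUS_BITS[gpu], "bit; wider than host:", gpu_bus_wider_than_host(gpu))
```

The pattern matches the hosts described (GPU recalc ahead where the GPU bus is wider than the host's), but that is three data points and a theory, not proof.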

might be interesting to see how the 1.14 app responds vs the 1.08 app with similar cards on a platform with a more consumer-grade CPU with only dual channel memory. i forget which of your hosts is the daily driver. if it's one of the 7950x systems, the fast DDR5 with the fast 7950x might be closing the gap.

_________________________________________________________________________
