The host configuration doesn't change on my host.
Or please explain how this could happen.
Maybe I'm not aware of that.
sfv
I can see that the new tasks run 75% longer than the old ones ON AVERAGE. It seems that high power GPUs (like the ones we tested on) are affected less than older or integrated ones.
The validator needs some more adjustments; that will have to wait until tomorrow's office hours (CET).
BM
Curious to know where that 75% figure comes from. Is it across the entire project or what the development team is using?
Looking forward to seeing if validation rates are good as well. That will help inform us on where to commit limited resources. Thanks.
Soli Deo Gloria
So close to reaching 10k hours with the app. I'm not sure if the tasks I have left on one PC will get me there.
My tasks went from:
h1_1683.60_O3aC01Cl1In0__O3ASHF1d_1684.00Hz_625_1
to
h1_0162.80_O3aLC01Cl1In0__O3ASBu_163.00Hz_51642_0
and almost tripled in run time. Some are at 160, 167, and 198 Hz.
Wedge009 wrote: Curious to know where that 75% figure comes from. Is it across the entire project or what the development team is using?
The run times of all (successful) new tasks on a host are averaged, and then the run times of all old tasks on that host (in the DB) are averaged. Then the ratio avg(new)/avg(old) is taken per host, and finally this ratio is averaged over all hosts of the project that successfully completed both old and new tasks. The resulting overall ratio is 1.75, which means +75%.
BM
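For reference, a minimal sketch of the per-host averaging described above (the host IDs and runtimes are made-up placeholders, not real project data, and this is not the project's actual DB query):

    # Sketch of the per-host averaging: avg(new)/avg(old) per host, then averaged over hosts.
    # runtimes_old / runtimes_new map host id -> runtimes (s) of successful tasks; values are placeholders.
    from statistics import mean

    runtimes_old = {101: [1700, 1850, 1800], 102: [5200, 5400]}
    runtimes_new = {101: [3900, 4000], 102: [7800, 8100, 7900]}

    ratios = []
    for host in runtimes_old.keys() & runtimes_new.keys():  # hosts that completed both old and new tasks
        ratios.append(mean(runtimes_new[host]) / mean(runtimes_old[host]))

    overall = mean(ratios)  # e.g. 1.75 means +75%
    print(f"average slowdown: {overall:.2f}x ({(overall - 1) * 100:+.0f}%)")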
Bernd Machenschalk wrote: [...]
Did the host with "high power GPUs" that you were using run a single task per GPU? What was the ratio for that single host alone? I previously ran O3AS with the higher-frequency batch with more than one task per GPU to improve GPU utilization.
comparing the average "tasks completed per day" is probably a more useful metric that will account for any changes to host configuration with old and new tasks.
on one of my hosts, my old runtimes were like ~1800s, and new runtimes are ~3900s. on its face that looks like only a 2x slowdown, but the missing context is that I was running 5 tasks at a time to get that 1800s runtime, and only 3 tasks at a time to come to the 3900s runtime. these are two very optimized configurations that get the most out of each kind of task.
86400 / 1800 = 48, and 48 × 5 = 240 tasks per day
86400 / 3900 ≈ 22.15, and 22.15 × 3 ≈ 66.5 tasks per day
that puts the new tasks about 3.6x less productive than the old tasks.
and with the credit adjustment from increasing the estimated flops (which means credit should now be 20,000), that's a reduction of about 1.8x in effective PPD (credit per day).
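A quick sketch of that throughput and credit arithmetic (the 10,000-credit figure for the old tasks is an assumption implied by the stated ~1.8x PPD ratio, not a value confirmed in this thread):

    # Tasks/day and credit/day from the runtimes and concurrency quoted above.
    def tasks_per_day(runtime_s, concurrent_tasks):
        return 86400 / runtime_s * concurrent_tasks

    old_tpd = tasks_per_day(1800, 5)  # ~240 tasks/day
    new_tpd = tasks_per_day(3900, 3)  # ~66.5 tasks/day
    print(f"productivity ratio: {old_tpd / new_tpd:.1f}x")  # ~3.6x

    old_ppd = old_tpd * 10000  # assumed credit per old task
    new_ppd = new_tpd * 20000  # credit per new task, per the post above
    print(f"effective PPD ratio: {old_ppd / new_ppd:.1f}x")  # ~1.8x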
Ian&Steve C. wrote: I was running 5 tasks at a time to get that 1800s runtime, and only 3 tasks at a time to come to the 3900s runtime.
I am also seeing that 3, maybe 4, simultaneous tasks are the "limit" even on high-end GPUs. The cores/memory bus stay fully saturated 95% of the time running 3x. Running 4x will keep the cores/bus fully saturated, but we have yet to determine which is better (it might be 3x). This general behavior was seen across 4 hosts (RTX A4500, RTX A6000, and both 4090 systems).
General observation: it seems that the number of concurrent work units will be cut in half on most of our systems. Or, that is where we will start and then further optimize. I would suggest others "start" optimization by cutting concurrency in half and then adjusting from there. This suggestion would only apply if your system was already optimized.
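For anyone re-tuning from that starting point, here is a rough helper for comparing candidate concurrency settings; the runtimes in the table are placeholders to be replaced with your own measurements:

    # Compare throughput for a few concurrency settings measured on your own host.
    # The runtimes below are placeholders, not measurements reported in this thread.
    measured = {1: 1500.0, 2: 2600.0, 3: 3900.0, 4: 5300.0}  # tasks per GPU -> avg runtime (s)

    for n, runtime in sorted(measured.items()):
        throughput = 86400.0 / runtime * n  # tasks per day at this setting
        print(f"{n} per GPU: {throughput:.0f} tasks/day")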
Ian&Steve C. wrote: comparing the average "tasks completed per day" is probably a more useful metric that will account for any changes to host configuration with old and new tasks.
Yes, this is the right way to compute it, since it cancels out concurrency configurations.
I have a script recording all finished tasks. On my 7950X+4070Ti host, I was finishing ~200-220 tasks per day with two tasks splayed to run concurrently. With the new tasks having very little idle GPU time, I configured it to run one per GPU and it finished 55 tasks yesterday. I tried running two tasks per GPU with the new tasks too, but that pretty much just doubled runtime.
Because of the significant reduction in CPU-only periods and increase in GPU compute, the slowdown ratio depends a lot on how under-powered one's CPU is relative to GPU. The weaker the CPU relatively, the less slowdown one would notice from the new tasks.
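For anyone without such a script, a rough sketch of a tasks-per-day counter follows. It assumes the standard per-project BOINC job_log line format (epoch completion time first, task name after the "nm" field); the log path and the O3AS name filter are assumptions to adjust for your own BOINC data directory and project:

    # Count completed Einstein@Home tasks per day from BOINC's per-project job log.
    from collections import Counter
    from datetime import datetime

    LOG = "/var/lib/boinc-client/job_log_einstein.phys.uwm.edu.txt"  # placeholder path

    per_day = Counter()
    with open(LOG) as f:
        for line in f:
            fields = line.split()
            if not fields or "nm" not in fields:
                continue
            name = fields[fields.index("nm") + 1]
            if "O3AS" not in name:  # keep only the O3AS GPU tasks discussed here
                continue
            finished = datetime.fromtimestamp(int(fields[0]))
            per_day[finished.date()] += 1

    for day in sorted(per_day):
        print(day, per_day[day])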
When I was running O3AS on Radeon VII, I was running 6 tasks per GPU to keep the average GPU utilization in the high 90% range most of the time. At least now the low-frequency tasks will let me free up some CPU cores. Also, did anyone notice that the CPU time for the stat recalculation appears to have dropped by about half (at least for those with fast CPUs)?