All-Sky Gravitational Wave Search on O3 data (O3ASHF1)

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 459
Credit: 10386411151
RAC: 12138152

The host configuration doesn't change on my host.

Or please explain how this could happen.

Maybe I'm not aware of that.

sfv

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4330
Credit: 251201407
RAC: 41073

I can see that the new tasks run 75% longer than the old ones ON AVERAGE. It seems that high power GPUs (like the ones we tested on) are affected less than older or integrated ones.

The validator needs some more adjustments; it will have to wait until tomorrow's office hours (CET).

BM

Wedge009
Joined: 5 Mar 05
Posts: 128
Credit: 17502976208
RAC: 6429333

Curious to know where that 75% figure comes from. Is it across the entire project or what the development team is using?

Looking forward to seeing if validation rates are good as well. That will help inform us on where to commit limited resources. Thanks.

Soli Deo Gloria

mmonnin
Joined: 29 May 16
Posts: 292
Credit: 3444696540
RAC: 1881334

So close to reaching 10k hours with the app. I'm not sure if the tasks I have left on one PC will get me there.

 

My tasks went from:

h1_1683.60_O3aC01Cl1In0__O3ASHF1d_1684.00Hz_625_1 
    
to 

h1_0162.80_O3aLC01Cl1In0__O3ASBu_163.00Hz_51642_0 
 

and almost tripled in run time. Some are at 160, 167, and 198 Hz.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4330
Credit: 251201407
RAC: 41073

Wedge009 wrote:

Curious to know where that 75% figure comes from. Is it across the entire project or what the development team is using?

The run time of all (successful) new tasks on a host is averaged, and the run times of all old tasks on that host (in the DB) are averaged. Then the ratio avg(new)/avg(old) is taken per host, and finally this ratio is averaged over all hosts of the project that successfully completed both old and new tasks. The resulting overall ratio is 1.75, which means +75%.
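A minimal sketch of that per-host averaging, with hypothetical runtimes (the host names and numbers below are illustrative, not from the project database):

```python
# Hypothetical per-host runtimes (seconds) for old and new tasks.
host_runtimes = {
    "host_a": {"old": [1800, 1850], "new": [3900, 3800]},
    "host_b": {"old": [3600], "new": [5400, 5500]},
}

def mean(xs):
    return sum(xs) / len(xs)

# Ratio avg(new)/avg(old) per host, then averaged over all hosts
# that completed both kinds of tasks.
ratios = [mean(r["new"]) / mean(r["old"]) for r in host_runtimes.values()]
overall = mean(ratios)
print(f"overall ratio: {overall:.2f}")  # >1 means new tasks run longer
```

Note that averaging per-host ratios (rather than pooling all runtimes) keeps a single fast host with many tasks from dominating the figure.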

BM

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653417392
RAC: 1

Bernd Machenschalk wrote:

Wedge009 wrote:

Curious to know where that 75% figure comes from. Is it across the entire project or what the development team is using?

The run time of all (successful) new tasks on a host is averaged, and the run times of all old tasks on that host (in the DB) are averaged. Then the ratio avg(new)/avg(old) is taken per host, and finally this ratio is averaged over all hosts of the project that successfully completed both old and new tasks. The resulting overall ratio is 1.75, which means +75%.

Did the host with "high power GPUs" that you were using run a single task per GPU? What was the ratio for that single host alone? I previously ran O3AS with the higher-frequency batch and more than one task per GPU to improve GPU utilization.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4016
Credit: 47655847867
RAC: 43453727

comparing the average "tasks completed per day" is probably a more useful metric that will account for any changes to host configuration with old and new tasks.

on one of my hosts, my old runtimes were like ~1800s, and new runtimes are ~3900s. on its face that looks like only a 2x slowdown, but the missing context is that I was running 5 tasks at a time to get that 1800s runtime, and only 3 tasks at a time to come to the 3900s runtime. these are two very optimized configurations that get the most out of each kind of task.

86400/1800 = 48, 48x5 = 240 tasks per day
86400/3900 = 22.15, 22.15*3 = 66.5 tasks per day

that puts the new tasks about 3.6x less productive than the old tasks.

and with the credit adjustment from increasing the estimated flops (which means credit should now be 20,000), that's a reduction of about 1.8x in effective ppd (credit per day).
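The arithmetic above can be written as a small helper (numbers are from the post; the function name is just for illustration):

```python
SECONDS_PER_DAY = 86400

def tasks_per_day(runtime_s: float, concurrency: int) -> float:
    """Daily throughput for tasks of a given runtime run N at a time."""
    return SECONDS_PER_DAY / runtime_s * concurrency

old = tasks_per_day(1800, 5)  # old tasks, 5 concurrent -> 240 per day
new = tasks_per_day(3900, 3)  # new tasks, 3 concurrent -> ~66.5 per day
print(f"throughput ratio: {old / new:.2f}x")  # ~3.6x fewer tasks per day
```

Comparing throughput this way folds both the longer runtimes and the lower usable concurrency into a single number.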

 

_________________________________________________________________________

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 258
Credit: 10743806666
RAC: 12020965

Ian&Steve C. wrote:

I was running 5 tasks at a time to get that 1800s runtime, and only 3 tasks at a time to come to the 3900s runtime. 

 

I am also seeing that 3, maybe 4, simultaneous tasks is the "limit" even on high-end GPUs. The cores/memory bus stay fully saturated 95% of the time running 3x. Running 4x will also keep the cores/bus fully saturated, but we have yet to determine which is better (it might be 3x). This general behavior was seen across 4 hosts (RTX A4500, RTX A6000, and both 4090 systems).

General observation: it seems that the number of concurrent work units will be cut in half on most of our systems. Or, at least, that is where we will start and then optimize further. I would suggest others start optimizing by cutting concurrency in half and then adjusting from there. This suggestion only applies if your system was already optimized.

wujj123456
Joined: 16 Sep 08
Posts: 20
Credit: 2028768793
RAC: 2092243

Ian&Steve C. wrote:
comparing the average "tasks completed per day" is probably a more useful metric that will account for any changes to host configuration with old an new tasks.

Yes, this is the right metric to compute, since it cancels out differences in concurrency configuration.

I have a script recording all finished tasks. On my 7950X+4070Ti host, I was finishing ~200-220 tasks per day with two tasks splayed to run concurrently. With the new tasks having very little idle GPU time, I configured it to run one per GPU and it finished 55 tasks yesterday. I tried running two tasks per GPU with the new tasks too, but that pretty much just doubled runtime.
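A minimal sketch of that kind of per-day bookkeeping, assuming finish timestamps are logged one per line (the log format here is hypothetical, not the actual script from the post):

```python
from collections import Counter
from datetime import datetime

# Hypothetical log: ISO timestamp then task name, one line per finished task.
log_lines = [
    "2024-01-01T03:15:00 h1_0162.80_O3aLC01Cl1In0__O3ASBu_163.00Hz_51642_0",
    "2024-01-01T09:42:00 h1_0162.80_O3aLC01Cl1In0__O3ASBu_163.00Hz_51643_0",
    "2024-01-02T01:05:00 h1_0162.80_O3aLC01Cl1In0__O3ASBu_163.00Hz_51644_0",
]

# Count finished tasks per calendar day.
per_day = Counter(
    datetime.fromisoformat(line.split()[0]).date() for line in log_lines
)
for day, count in sorted(per_day.items()):
    print(day, count)
```

Counting completions per day like this stays meaningful even when the number of concurrent tasks changes between batches.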

Because of the significant reduction in CPU-only periods and increase in GPU compute, the slowdown ratio depends a lot on how under-powered one's CPU is relative to GPU. The weaker the CPU relatively, the less slowdown one would notice from the new tasks.

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653417392
RAC: 1

When I was running O3AS on a Radeon VII, I ran 6 tasks per GPU to keep the average GPU utilization in the high 90s most of the time. At least now the low-frequency tasks will let me free up some CPU cores. Also, did anyone notice that the CPU time for the stat recalculation appears to have dropped by about half (at least for those with fast CPUs)?
