The O2-All Sky Gravitational Wave Search on GPUs - discussion thread.

Matt White

Joined: 9 Jul 19

Posts: 120

Credit: 280798376

RAC: 0

13 Aug 2019 12:29:53 UTC

Message 172608

I saw these pop up on my task list this morning, on both my Linux/AMD and Win 7/NVIDIA boxes. As soon as they make it through the queue, which should be late today, I'll post results.

Clear skies,

Matt

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7224274931

RAC: 1015772

13 Aug 2019 13:38:22 UTC

Message 172609 in response to message 172606

Arif Mert Kapicioglu wrote:

Currently running one, though the GPU load is fluctuating between 21% and 27%. Win 10 x64, Vega 64, GPU temp 47 Celsius.

I left my previous Einstein settings and Process Lasso settings intact from the Einstein GPU GW application of a few weeks ago. These new ones are running on a Windows 10 host with an AMD RX 570 and a modern but slow six-core Intel i5-9400F CPU.

Running at 4X, the application is keeping the GPU reasonably busy, with GPU-Z reporting an average GPU load of 82%. I'm working to spread out the percent complete of the four tasks, in hopes of keeping the GPU pretty busy through the end/beginning transition periods.

Each task is reported as using roughly 90% of a CPU, and as the actual CPU has cheap TIM and the (gasp!) Intel-provided cooler, my CPU temperature has risen from about 35C when running 2X on Gamma-Ray pulsar to about 62C.

One project-level gripe: when I look at the workunit link for one of these which is in progress, I get the rather uninformative "Tasks are pending for this workunit." notation rather than the helpful display of individual issued and returned tasks provided for current Gamma-Ray pulsar work.

As I don't think these will mix well with the GRP tasks, I'm currently planning to deplete my GRP queue by running a couple of hours pure GRP alternating with a couple of hours pure GW. Perhaps by the time I get my GRP work to zero I may actually get some initial validation successes or failures on the 1.07 GW work.

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

13 Aug 2019 14:58:44 UTC

Message 172615

My first work unit, running singly on an RX 570, finished in 82 minutes (Win7 64-bit, 19.5.2 drivers). This card is running at the default speed of 1244 MHz, but undervolted to 1.010 volts. The GPU-Z averages are:

GPU Temp: 51C
GPU Load: 42%
GPU Power: 51.5 watts
CPU %: 104.05 (from BoincTasks) on an i7-4771

So I think it might help a little to leave two cores free, unless you have a faster CPU than a Haswell.

cecht

Joined: 7 Mar 18

Posts: 1534

Credit: 2907162109

RAC: 2157887

14 Aug 2019 1:30:08 UTC

Message 172636

My first six tasks had an average completion time of ~35 min when running app 1.07 on a Linux host with two RX 570s, each running 3x tasks. My Pentium CPU has 2 cores/4 threads, so I set the app_config CPU usage to 0.5 CPU, giving an average CPU task time of ~17 min. During most of a run, CPU usage was steady at ~80% for two threads and ~90% for the other two threads, but when tasks on the two GPUs hit the 99% mark, CPU usage held at ~95% and 100% for about three minutes until the tasks completed. GPU usage averaged ~60% throughout, bouncing around mostly between 30% and 80%. (Each GPU was set to run at a P-state of 6, which corresponds to a steady shader clock speed of 1071 MHz @ 906 mV.)

It's nice to see that cutting back on CPU allocation doesn't seem to have any major downside to task completion time, but I haven't compared different app_config settings yet.
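For reference, a setup like this one (3 tasks per GPU, 0.5 CPU per task) could be written out with a small Python sketch along the lines below. The element names (`app_config`, `app`, `name`, `gpu_versions`, `gpu_usage`, `cpu_usage`, `project_max_concurrent`) are standard BOINC app_config.xml elements, but the helper `build_app_config` and the app name string are assumptions for illustration only; check client_state.xml on your own host for the real app name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpu_usage, project_max_concurrent=None):
    """Return app_config.xml text for a single GPU application.

    gpu_usage is the fraction of a GPU each task claims, so running
    N tasks per GPU means gpu_usage = 1/N (rounded down to 2 decimals here).
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")

    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = f"{1.0 / tasks_per_gpu:.2f}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_usage:.2f}"

    if project_max_concurrent is not None:
        ET.SubElement(root, "project_max_concurrent").text = str(project_max_concurrent)

    # ET.indent only exists on Python 3.9+; skip pretty-printing on older versions.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "einstein_O2AS20-500" is an ASSUMED app name, not confirmed in this thread.
    print(build_app_config("einstein_O2AS20-500", tasks_per_gpu=3,
                           cpu_usage=0.5, project_max_concurrent=6))
```

The printed text would go into app_config.xml in the Einstein project folder of the BOINC data directory, after which the client needs to re-read its config files. Treat the numbers as a starting point for experiments, not a recommendation.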

I was getting ~6.5 hr task completion times for O2AS20-500 runs with the 1.01 app (CPU only) when running two tasks, each at 0.9 CPU allocation, alongside six concurrent FGRPB1G GPU tasks (0.2 CPU each). So this new 1.07 GPU GW app offers a nice, roughly 11-fold bump in per-task speed (~390 min down to ~35 min).
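For anyone who wants to redo that comparison with their own numbers, here is a rough Python sketch. It takes the ~6.5 hr and ~35 min figures at face value and also looks at tasks finished per hour, since the GPU host runs six tasks at once versus two on the CPU. The function names are made up for this example, and the comparison is not like-for-like because the CPU runs were sharing the host with FGRPB1G tasks.

```python
def per_task_speedup(old_minutes, new_minutes):
    """Ratio of old to new wall-clock time for a single task."""
    return old_minutes / new_minutes


def throughput_per_hour(concurrent_tasks, minutes_per_task):
    """Tasks completed per hour when several tasks run side by side."""
    return concurrent_tasks * 60.0 / minutes_per_task


if __name__ == "__main__":
    cpu_minutes = 6.5 * 60   # ~6.5 hr per task with the 1.01 CPU app
    gpu_minutes = 35.0       # ~35 min per task with the 1.07 GPU app

    speedup = per_task_speedup(cpu_minutes, gpu_minutes)
    cpu_rate = throughput_per_hour(2, cpu_minutes)   # two CPU tasks at once
    gpu_rate = throughput_per_hour(6, gpu_minutes)   # 3 tasks on each of 2 GPUs

    print(f"per-task speedup: {speedup:.1f}x")
    print(f"CPU app: {cpu_rate:.2f} tasks/hr, GPU app: {gpu_rate:.2f} tasks/hr")
    print(f"throughput ratio: {gpu_rate / cpu_rate:.1f}x")
```

On those inputs the per-task ratio comes out a little above 11, and the whole-host throughput ratio is about three times larger because of the extra concurrency.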

Now let's see how the validations go!

Ideas are not fixed, nor should they be; we live in model-dependent reality.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7224274931

RAC: 1015772

14 Aug 2019 4:39:12 UTC

Message 172637 in response to message 172636

I wonder if there is a problem? When my host tried to get work a little while ago, there was none available to it. I notice on the Einstein server status page that O2AS20-500 shows with zero tasks to send and the work generator "not running" (red).

I imagine that by tomorrow at least someone here should have gotten a validation if this is working. My twelve are all still pending, but I'll add over another dozen overnight.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117649579451

RAC: 35199432

14 Aug 2019 6:44:46 UTC

Message 172638 in response to message 172636

cecht wrote:

... It's nice to see that cutting back on CPU allocation doesn't seem to have any major downside to task completion time, but I haven't compared different app_config settings yet.

I'm guessing that the 6 concurrent GPU tasks (3 per GPU) are the only tasks running? In other words, you are not allowing any V1.01 CPU tasks to run on any CPU cores?

Your setting of 0.5 for cpu_usage would cause BOINC to budget 3 CPU threads for GPU support duties. Theoretically, you could possibly be running a single V1.01 CPU task on the last thread if BOINC was allowed to use 100% of your threads.

If you are not running any CPU tasks, the cpu_usage setting is irrelevant since the 6 GPU tasks will 'get in the queue' with each other to receive whatever support they actually require, when it becomes available, irrespective of the setting you use. For example if you set cpu_usage to 0.1 and also made sure no V1.01 CPU tasks could run, only the V1.07 GPU tasks, it wouldn't make any difference to your GPU run times. In other words, the setting doesn't restrict a GPU task from accessing any available CPU resources if nothing else is using that resource.
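The budgeting arithmetic above can be sketched in a few lines of Python. This is only a simplified model of the bookkeeping (number of GPU tasks times cpu_usage), not the BOINC client's actual scheduler code, and `cpu_budget` is a name invented for this example. How the client rounds a fractional remainder is deliberately not modeled.

```python
def cpu_budget(gpu_tasks, cpu_usage, total_threads):
    """Return (threads budgeted for GPU support, threads left over).

    Simplified model: each running GPU task reserves `cpu_usage` of a thread.
    This is a scheduling budget only; it does not cap the CPU time a GPU
    task's support thread can actually consume.
    """
    reserved = gpu_tasks * cpu_usage
    remaining = max(0.0, total_threads - reserved)
    return reserved, remaining


if __name__ == "__main__":
    # 6 GPU tasks on a 2-core/4-thread CPU, with two candidate cpu_usage values.
    for usage in (0.5, 0.1):
        reserved, remaining = cpu_budget(6, usage, 4)
        print(f"cpu_usage={usage}: {reserved:.1f} threads budgeted, "
              f"{remaining:.1f} left for CPU tasks")
```

With cpu_usage at 0.5 the six tasks budget 3 of the 4 threads, leaving one for a CPU task. With 0.1 the budget drops to 0.6 threads, but if no CPU tasks are allowed to run, the GPU tasks' real CPU consumption is unchanged.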

What archae86 mentioned some time ago about the GW GPU app not behaving well on slower, older setups with fewer CPU cores, like the ones I've been using, certainly appears to be correct based on the CPU secs being consumed for GPU support. In your case, I think the next experiment you should try (still with no CPU tasks running) is to run 8 GPU tasks (4 per GPU) and see if you get any further improvement. Since the CPU secs for each task (as shown in your tasks list) are less than 50% of the elapsed time, you might get a further improvement.

Because there are no 'ready-to-send' tasks (as Peter reported), I'm guessing Bernd may have disabled task generation until he sees whether the first lot result in good validations. Of course, it could be some sort of failure, but I suspect it is just a limited release for 'overnight' in Hannover and that, if things look good in the morning, more tasks might be released after inspection for any 'problems' :-).

Cheers,
Gary.

cecht

Joined: 7 Mar 18

Posts: 1534

Credit: 2907162109

RAC: 2157887

14 Aug 2019 11:19:14 UTC

Message 172641 in response to message 172637

archae86 wrote:

I imagine that by tomorrow at least someone here should have gotten a validation if this is working. My twelve are all still pending, but I'll add over another dozen overnight.

I've had 2 V1.07 tasks validate overnight, with 36 pending and no invalids.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Matt White

Joined: 9 Jul 19

Posts: 120

Credit: 280798376

RAC: 0

14 Aug 2019 12:36:41 UTC

Message 172642

I'm still waiting for the V1.07s to work their way through the queue. The NVIDIA tasks might start processing today. I have my GPU task allocation set to 0.5, so it will be interesting to see how they run. This morning, I have 15 V1.07 GW tasks, along with 14 FGRP V1.22 Binary Pulsar tasks. The latter are taking about 2 hours and 12 minutes each. I figure another 12 hours before I start seeing crunching on the new GW tasks.

Clear skies,

Matt

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7224274931

RAC: 1015772

14 Aug 2019 16:21:06 UTC

Message 172649 in response to message 172641

cecht wrote:

I've had 2 V1.07 tasks validate overnight, with 36 pending and no invalids.

And since you wrote that, your valid count has moved up to 4. Better yet, those four include a variety of cross-platform pairings.

My situation is less encouraging. With 21 pending, I now have two inconclusives and zero valid. I think this means that two of the 21 had a quorum partner available at some recent time when the validator ran (I suspect it is being hand-flown intermittently so far), got past the basic sanity check, but did not match the partner closely enough to avoid a rematch. As the project is not letting us see the other tasks dispatched for workunits that are in progress or pending, I can't tell what platform the hosts I've miscompared with are running. (Edited to say that an hour later I don't see any inconclusives; I'm not sure whether I got confused or something really changed. Still zero validations on 1.07, with 24 currently pending, and currently running at 2X.)

For a more favorable indicator, the work generator currently shows as running, with 6071 tasks to send. I just now unsuspended all units and tweaked my cache level directive, and promptly got 17 more tasks.

Possibly the fact I have been running 4X may harm my validity prospects, though I note that cecht has gotten one validation with a long enough elapsed time possibly to have run at 4X. His is a Linux PC, which might change that answer. So while I was writing this I fetched some more tasks as part of switching to 2X running.

cecht

Joined: 7 Mar 18

Posts: 1534

Credit: 2907162109

RAC: 2157887

14 Aug 2019 16:20:27 UTC

Message 172651

Gary Roberts wrote:

I'm guessing that the 6 concurrent GPU tasks (3 per GPU) are the only tasks running? In other words, you are not allowing any V1.01 CPU tasks to run on any CPU cores?

Right, I was running only V1.07 tasks with no V1.01 CPU tasks. (I set project_max_concurrent to 6, thus limiting runs of queued V1.01 tasks.) But when V1.07 tasks stopped downloading and those in-queue completed, in-queue V1.01 CPU tasks picked up the slack when BOINC dipped below 6 concurrent tasks. Having both V1.01 & V1.07 GW tasks running together really slowed things down, so I've suspended the queued V1.01 CPU tasks. (I had previously set Project preferences to not send any new CPU-only tasks.)

Gary Roberts wrote:

Your setting of 0.5 for cpu_usage would cause BOINC to budget 3 CPU threads for GPU support duties. Theoretically, you could possibly be running a single V1.01 CPU task on the last thread if BOINC was allowed to use 100% of your threads.

If you are not running any CPU tasks, the cpu_usage setting is irrelevant since the 6 GPU tasks will 'get in the queue' with each other to receive whatever support they actually require, when it becomes available, irrespective of the setting you use. For example if you set cpu_usage to 0.1 and also made sure no V1.01 CPU tasks could run, only the V1.07 GPU tasks, it wouldn't make any difference to your GPU run times. In other words, the setting doesn't restrict a GPU task from accessing any available CPU resources if nothing else is using that resource.

Ah, I see. Okay, thanks. I was thinking that cpu_usage was what determined how much CPU time was spent running a GPU task.

Gary Roberts wrote:

What archae86 mentioned some time ago about the GW GPU app not behaving well on slower, older setups with fewer CPU cores, like the ones I've been using, certainly appears to be correct based on the CPU secs being consumed for GPU support. In your case, I think the next experiment you should try (still with no CPU tasks running) is to run 8 GPU tasks (4 per GPU) and see if you get any further improvement. Since the CPU secs for each task (as shown in your tasks list) are less than 50% of the elapsed time, you might get a further improvement.

Yes, I'll try 4 tasks/GPU (only) and see if CPU times and completion times change much. I have noticed, while trying to juggle a fresh download of multiple V1.07 tasks so that they all don't hit the 99% mark at the same time on a GPU (using the clunky method of suspending and resuming individual tasks), that CPU usage across all 4 threads climbs about 10% with each additional task added. I will see whether I can squeeze in 8 concurrent tasks without tapping out the CPU.
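The suspend-and-resume juggling could in principle be scripted. Below is a rough Python sketch that builds a staggered resume schedule using `boinccmd --task URL task_name resume`, which is the real boinccmd syntax for operating on a single task. Everything else is an assumption for illustration: the function names, the project URL, the task names, and the 10-minute spacing are all placeholders, and the tasks are assumed to have been suspended beforehand.

```python
import subprocess
import time


def stagger_plan(project_url, task_names, delay_minutes):
    """Return a list of (offset_seconds, command) pairs.

    Task i is resumed i * delay_minutes after the first one, so that
    tasks sharing a GPU do not all reach the 99% mark together.
    """
    plan = []
    for index, name in enumerate(task_names):
        command = ["boinccmd", "--task", project_url, name, "resume"]
        plan.append((index * delay_minutes * 60, command))
    return plan


def run_plan(plan, dry_run=True):
    """Execute the plan in order; with dry_run=True only print the commands."""
    elapsed = 0
    for offset, command in plan:
        if dry_run:
            print(f"t+{offset:>5}s: {' '.join(command)}")
            continue
        time.sleep(offset - elapsed)   # wait until this task's start offset
        elapsed = offset
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # PLACEHOLDERS: use the project URL and task names your own client reports.
    url = "https://example.invalid/"
    tasks = ["hypothetical_task_0", "hypothetical_task_1", "hypothetical_task_2"]
    run_plan(stagger_plan(url, tasks, delay_minutes=10), dry_run=True)
```

Running it as-is only prints the commands; switching to `dry_run=False` would actually call boinccmd, which must be on the PATH and able to reach the running client.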

Ideas are not fixed, nor should they be; we live in model-dependent reality.
