Gravitational Wave Engineering run on LIGO O1 Open Data

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3576311001

RAC: 815675

Gary Roberts wrote:Richie

16 Apr 2019 10:37:20 UTC

Message 170736 in response to message 170725

Gary Roberts wrote:

Richie wrote:
DanNeely wrote:
Is there a way to opt out of the GPU tasks from the engineering run until they're able to perform better while still running CPU work from it?

'ON' for CPU and 'OFF' for all GPUs (AMD, Nvidia, Intel).

I suspect Dan would just want to exclude O1OD1E GPU tasks and not FGRPB1G tasks as well. Your suggestion excludes all types of GPU crunching. Off the top of my head (I've never tried it) a possible way would be to use the app_config.xml mechanism and use both the name and plan class tags to identify just the GPU version. Perhaps setting the cpu_usage and gpu_usage (or maybe the max_concurrent) for that combination to zero might effectively exclude those tasks without affecting anything else. It would be worth experimenting.

Correct about wanting to keep FGRPB1G tasks running. I'm willing to screw around with the app config file to see if I can make something work. I'd need the names of the apps to call out within the file before I can make the attempt though.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5878

Credit: 118831126133

RAC: 22439083

If you look in the state file

16 Apr 2019 10:59:35 UTC

Message 170737

If you look in the state file for all the <app> ..... </app> blocks for the Einstein project, you will see all the valid <name> ... </name> lines that are known. I think the one you need is einstein_O1OD1E. For plan classes, I think they're listed on the current applications page but I'm pretty sure I've seen the various ones listed in the last scheduler contact logs.

Cheers,
Gary.
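
As a rough guide to what to look for, the relevant entries in the state file (client_state.xml) have approximately the shape sketched below. This is an illustration only, not copied from a real state file: the elided values ("...") are placeholders, real blocks carry additional fields, and the app name is the one suggested above rather than a confirmed value.

```xml
<!-- Illustrative sketch only; "..." marks values not known here. -->
<app>
    <!-- The value to use in app_config.xml <name> tags -->
    <name>einstein_O1OD1E</name>
</app>

<app_version>
    <!-- Same app name, one block per version/platform/plan class -->
    <app_name>einstein_O1OD1E</app_name>
    <version_num>...</version_num>
    <!-- Present on GPU versions; identifies the plan class -->
    <plan_class>...</plan_class>
</app_version>
```

Matching an <app_version> block's <app_name> against the <app> block's <name> is one way to see which plan classes belong to which application.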

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

I see problems with my hosts

16 Apr 2019 13:53:25 UTC

Message 170743

I see problems with all my hosts trying to run v0.12. Tasks stop at 0:00:40 elapsed time with this message:
"Task ...(task ID shows here)..... postponed for 600 seconds: Waiting to acquire lock". This happens on both Windows 10 and Windows 7 hosts, with both AMD and Nvidia cards. Even one task at a time won't run. The task process disappears from the process list (in Windows Task Manager) while this condition persists.

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

Richie wrote:I see problems

16 Apr 2019 15:46:14 UTC

Message 170745 in response to message 170743

Richie wrote:

I see problems with all my hosts trying to run v0.12. Tasks stop at 0:00:40 elapsed time

Same here, RX 570 on Win7 64-bit.

Shadak

Joined: 3 Oct 09

Posts: 20

Credit: 20966427

RAC: 0

running v0.12 on GTX 1070 and

17 Apr 2019 10:12:04 UTC

Message 170765

running v0.12 on GTX 1070 and everything is fine

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

Shadak wrote:running v0.12 on

17 Apr 2019 10:34:36 UTC

Message 170766 in response to message 170765

Shadak wrote:

running v0.12 on GTX 1070 and everything is fine

I'm not sure which tasks you're referring to. I looked at this host, but it looks like it's been crunching only FGRPB1G v1.20 tasks on the Nvidia card. That would be a different app.

* The server currently seems to be sending v0.11 GPU tasks from this GW app to Nvidia cards.

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

v0.13 is running perfect.

17 Apr 2019 12:42:33 UTC

Message 170772

v0.13 is running perfectly. Both Nvidia and AMD cards show stable operation even with 4x configuration (like they did with v0.11).

solling2

Joined: 20 Nov 14

Posts: 219

Credit: 1578387945

RAC: 16485

Richie wrote:v0.13 is

17 Apr 2019 14:38:30 UTC

Message 170774 in response to message 170772

Richie wrote:

v0.13 is running perfectly...

Right, congrats to the devs on this step forward! :-) From my point of view one small flaw remains, though: the app's CPU core requirement comes on top of whatever is set as core utilisation in the BOINC Manager. As the app doesn't respect that setting, it's up to the user to free a core. But that has already been discussed in the Technical News section.

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

My first 0.13 completed in 62

17 Apr 2019 15:31:28 UTC

Message 170776

My first 0.13 completed in 62 minutes on an RX 570 (Win7 64-bit). The power was 40 watts (GPU-Z). That is quite nice compared to the earlier version, and about as power efficient as running a work unit on a CPU.

One curiosity is that it now uses a full CPU core. I suppose that is a result of the priority change. But the other OpenCL projects I run do not require a full core on AMD cards (only on Nvidia cards). It is not a problem if that is what is needed.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3576311001

RAC: 815675

Gary Roberts wrote:Richie

17 Apr 2019 22:21:37 UTC

Message 170781 in response to message 170725

Gary Roberts wrote:

Richie wrote:
DanNeely wrote:
Is there a way to opt out of the GPU tasks from the engineering run until they're able to perform better while still running CPU work from it?

'ON' for CPU and 'OFF' for all GPUs (AMD, Nvidia, Intel).

I suspect Dan would just want to exclude O1OD1E GPU tasks and not FGRPB1G tasks as well. Your suggestion excludes all types of GPU crunching. Off the top of my head (I've never tried it) a possible way would be to use the app_config.xml mechanism and use both the name and plan class tags to identify just the GPU version. Perhaps setting the cpu_usage and gpu_usage (or maybe the max_concurrent) for that combination to zero might effectively exclude those tasks without affecting anything else. It would be worth experimenting.

Setting cpu/gpu usage to 0 threw an error message when I tried reading the config file. Going the other direction and setting hardware requirements well in excess of what my boxes have worked great though:

<app>
  <name>einstein_O1OD1E</name>
  <gpu_versions>
    <gpu_usage>99</gpu_usage>
    <cpu_usage>99</cpu_usage>
  </gpu_versions>
</app>

After ~12 hours on each of two systems, I'm reasonably confident this is working as expected: I'm getting a mix of O1OD1E CPU tasks and Fermi GPU tasks on both, and nothing I don't want.
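
For anyone wanting to try the same thing, here is a minimal sketch of what a complete file might look like. It assumes the usual BOINC layout, in which app_config.xml lives in the Einstein@Home project directory and wraps its entries in an <app_config> root element. That wrapper is not shown in the fragment posted above, so it is an assumption here. The values of 99 are simply the ones reported above, chosen to exceed what the hosts actually have.

```xml
<!-- Sketch of a complete app_config.xml; the <app_config> wrapper is
     assumed, since only the <app> fragment was posted. -->
<app_config>
  <app>
    <!-- App name as found in the state file -->
    <name>einstein_O1OD1E</name>
    <gpu_versions>
      <!-- Request far more GPUs and CPUs per task than the host has,
           so the GPU version of this app never gets scheduled -->
      <gpu_usage>99</gpu_usage>
      <cpu_usage>99</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After saving the file, the client has to re-read its config files (or be restarted) before the change takes effect. Since <gpu_versions> applies only to GPU versions of the named app, CPU tasks from the same app and GPU tasks from other apps such as FGRPB1G should be unaffected, which matches the behaviour reported above.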
