Gravitational Wave Engineering run on LIGO O1 Open Data

DanNeely
Joined: 4 Sep 05
Posts: 1,364
Credit: 3,562,358,667
RAC: 0

DanNeely wrote:
Is there a way to opt out of the GPU tasks from the engineering run until they're able to perform better while still running CPU work from it?

Richie wrote:
'ON' for CPU and 'OFF' for all GPUs (AMD, Nvidia, Intel).

Gary Roberts wrote:
I suspect Dan would just want to exclude O1OD1E GPU tasks and not FGRPB1G tasks as well. Your suggestion excludes all types of GPU crunching. Off the top of my head (I've never tried it), a possible way would be to use the app_config.xml mechanism and use both the name and plan class tags to identify just the GPU version. Perhaps setting the cpu_usage and gpu_usage (or maybe the max_concurrent) for that combination to zero might effectively exclude those tasks without affecting anything else. It would be worth experimenting.

Correct about wanting to keep FGRPB1G tasks running. I'm willing to screw around with the app_config file to see if I can make something work, but I'd need the names of the apps to call out within the file before I can make the attempt.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 115,679,230,239
RAC: 34,673,543

If you look in the state file (client_state.xml) at all the <app> ..... </app> blocks for the Einstein project, you will see all the valid <name> ... </name> lines that are known. I think the one you need is einstein_O1OD1E. For plan classes, I think they're listed on the current applications page, and I'm pretty sure I've also seen the various ones listed in the last scheduler contact logs.
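
For anyone who would rather not scroll through the state file by hand, here is a minimal Python sketch of that lookup. It assumes the state file parses as ordinary XML and that each <app_version> block carries an <app_name> and an optional <plan_class> child; the function name list_app_versions and the sample contents (including the plan class string) are made up for illustration, not copied from a real host.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt standing in for a real state file.
# The app names and the plan class below are placeholders.
SAMPLE_STATE = """
<client_state>
  <app><name>einstein_O1OD1E</name></app>
  <app><name>hsgamma_FGRPB1G</name></app>
  <app_version>
    <app_name>einstein_O1OD1E</app_name>
    <version_num>13</version_num>
    <plan_class>example-gpu-plan</plan_class>
  </app_version>
  <app_version>
    <app_name>einstein_O1OD1E</app_name>
    <version_num>1</version_num>
  </app_version>
  <app_version>
    <app_name>hsgamma_FGRPB1G</app_name>
    <version_num>120</version_num>
    <plan_class>example-gpu-plan</plan_class>
  </app_version>
</client_state>
"""


def list_app_versions(state_xml):
    """Return sorted, de-duplicated (app_name, plan_class) pairs.

    plan_class is an empty string when the block has no <plan_class>
    child or the child is empty.
    """
    root = ET.fromstring(state_xml)
    pairs = set()
    for version in root.iter("app_version"):
        name = (version.findtext("app_name") or "").strip()
        plan = (version.findtext("plan_class") or "").strip()
        if name:
            pairs.add((name, plan))
    return sorted(pairs)


if __name__ == "__main__":
    for name, plan in list_app_versions(SAMPLE_STATE):
        print(name, plan or "(no plan class)")
```

To run it against a real host, read the state file from the BOINC data directory into a string and pass that in place of SAMPLE_STATE.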

 

Cheers,
Gary.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1,702,989,778
RAC: 0

I see problems with all my hosts trying to run v0.12. Tasks stop at 0:00:40 elapsed time with this message:
"Task ...(task ID shows here)..... postponed for 600 seconds: Waiting to acquire lock". This happens on both Windows 10 and Windows 7 hosts, and with both AMD and Nvidia cards. Even one task at a time won't run. The task process disappears from the process list (in Windows Task Manager) while this is happening.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257,957,147
RAC: 0

Richie wrote:
I see problems with all my hosts trying to run v0.12. Tasks stop at 0:00:40 elapsed time

Same here, RX 570 on Win7 64-bit.

Shadak
Joined: 3 Oct 09
Posts: 20
Credit: 20,966,427
RAC: 0

running v0.12 on GTX 1070 and everything is fine

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1,702,989,778
RAC: 0

Shadak wrote:
running v0.12 on GTX 1070 and everything is fine

I'm not sure what tasks you're referring to. I looked at this host, but it looks like it's only been crunching FGRPB1G v1.20 tasks on the Nvidia card. That would be a different app.

* The server currently seems to be sending v0.11 GPU tasks from this GW app to Nvidia cards.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1,702,989,778
RAC: 0

v0.13 is running perfectly. Both Nvidia and AMD cards show stable operation even in a 4x configuration (as they did with v0.11).

solling2
Joined: 20 Nov 14
Posts: 219
Credit: 1,575,668,167
RAC: 65,755

Richie wrote:
v0.13 is running perfectly...

Right, congrats to the devs on this step forward! :-) From my point of view one small flaw remains, though: the app's CPU core requirement comes on top of whatever core utilisation is set in the BOINC Manager. As the app doesn't respect that setting, it's up to the user to free a core. But that has already been discussed in the tech news section.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257,957,147
RAC: 0

My first 0.13 completed in 62 minutes on an RX 570 (Win7 64-bit). Power draw was 40 watts (per GPU-Z). That is quite nice compared to the earlier version, and about as power efficient as running a work unit on a CPU.

One curiosity is that it now uses a full CPU core. I suppose that is a result of the priority change. But the other OpenCL projects I run do not require a full core on AMD cards (only on Nvidia cards). It is not a problem if that is what is needed.

DanNeely
Joined: 4 Sep 05
Posts: 1,364
Credit: 3,562,358,667
RAC: 0

DanNeely wrote:
Is there a way to opt out of the GPU tasks from the engineering run until they're able to perform better while still running CPU work from it?

Richie wrote:
'ON' for CPU and 'OFF' for all GPUs (AMD, Nvidia, Intel).

Gary Roberts wrote:
I suspect Dan would just want to exclude O1OD1E GPU tasks and not FGRPB1G tasks as well. Your suggestion excludes all types of GPU crunching. Off the top of my head (I've never tried it), a possible way would be to use the app_config.xml mechanism and use both the name and plan class tags to identify just the GPU version. Perhaps setting the cpu_usage and gpu_usage (or maybe the max_concurrent) for that combination to zero might effectively exclude those tasks without affecting anything else. It would be worth experimenting.

Setting cpu/gpu usage to 0 threw an error message when I tried reading the config file. Going the other direction, and setting hardware requirements well in excess of what my boxes have, worked great though:

<app>
    <name>einstein_O1OD1E</name>
    <gpu_versions>
        <gpu_usage>99</gpu_usage>
        <cpu_usage>99</cpu_usage>
    </gpu_versions>
</app>

After ~12 hours on each of two systems, I'm reasonably confident this is working as expected. I'm getting a mix of O1OD1E CPU tasks and Fermi GPU tasks on both, but not anything I don't want.
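
The snippet above shows only the <app> block. BOINC's app_config.xml format wraps such blocks in an <app_config> root element, so a complete file would presumably look like what the following Python sketch produces. The helper name build_app_config is made up for illustration; the app name and the two values of 99 are the ones from the post. The thinking behind the values appears to be that no host has 99 GPUs or 99 free cores, so the client never finds room to start a GPU task for that app.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, gpu_usage, cpu_usage):
    """Build app_config.xml text with a single <app> entry.

    gpu_usage and cpu_usage apply to the GPU versions of the named
    app only; CPU versions of the same app are not affected.
    """
    root = ET.Element("app_config")  # required root element
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Values taken from the post: ask for far more hardware than exists.
    print(build_app_config("einstein_O1OD1E", 99, 99))
```

The resulting text would be saved as app_config.xml in the project's folder under the BOINC data directory and picked up by re-reading the config files from the BOINC Manager or restarting the client.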
