Can I use a GTX 690 (dual-GPU card) on Einstein?

Cruncher-American
Joined: 24 Mar 05
Posts: 70
Credit: 4503260497
RAC: 5262237
Topic 226477

I found an old GTX 690 on Craigslist here in Boston and (just for giggles) I am trying to get it to run Einstein on a third machine. (Note: each of the card's two GPUs has 2 GB of video RAM.) It looks like it tried to run 3 WUs on each GPU when it started up, and they all failed. Right now, one WU is happily running on it, and the machine is waiting about 4800 seconds before it makes its next request for work.

I would guess they crashed because there is not enough GPU RAM for 3 WUs on each GPU. Is that right?

What can I do, if anything, to run fewer WUs per GPU?

Thanks for any suggestions!

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7024914931
RAC: 1808801

Running 3X for Einstein Gravity Wave GPU work on a 2GB video RAM card is not close to being OK.

On the other hand, you could assign that machine to a different location (aka venue) than your other machines; four venues are available. That gives you independent preferences for it. Having done that, you could set the applications preference for the chosen venue to:

Gamma-ray pulsar binary search #1 (GPU)

And you could set what I call the multiplicity (how many GPU tasks run at once on a particular GPU) to 1X by setting "GPU utilization factor of FGRP apps" to 1.0.

That should be pretty safe so far as the raw capability of the card is concerned, while running GW at 3X most certainly is not.

Good luck.

Cruncher-American
Joined: 24 Mar 05
Posts: 70
Credit: 4503260497
RAC: 5262237

Thanks for the suggestion! I will see what I can do and report back. 

Cruncher-American
Joined: 24 Mar 05
Posts: 70
Credit: 4503260497
RAC: 5262237

I added an app_config.xml to the Einstein folder on the machine, thinking that I would start it with 2 tasks per GPU and, if that fails, back it off to 1. Here's what I used:

<app_config>
   <app>
      <name>einstein_O3AS</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>hsgamma_FGRP5</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>


It works fine.

Also, the 2 WUs running in parallel take about 38 minutes each, whereas the one I managed to run to completion before the changeover took 25 minutes. So (assuming that one was of average duration) I can now do 4 in roughly the same time as 3 run one at a time, which is about a 33% increase in work done.
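
As a rough check of that estimate, using only the two timings quoted above and assuming the single 25-minute task was typical (one sample is not much to go on):

```latex
\frac{2\ \text{tasks} / 38\ \text{min}}{1\ \text{task} / 25\ \text{min}}
  = \frac{2 \times 25}{38}
  = \frac{50}{38}
  \approx 1.32
```

That works out to roughly 32% more tasks per unit time at 2 per GPU, in line with the "about 33%" figure.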

Also, no errors yet (12 have run to completion so far). I will wait to see whether I get any invalids, too; none so far.

Cool!

Thanks again for the help!


Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109411201161
RAC: 34939744

Cruncher-American wrote:
I added an app_config.xml to the Einstein folder on the machine ...

Please be aware that the second half of the file you showed is wrong (and is probably being ignored). 'FGRP5' is a CPU-only app and there are no "gpu_versions" for it. The gamma-ray pulsar GPU app is quite different and is referred to as 'FGRPB1G' rather than 'FGRP5'. Also, if you tried to run both the GW 'O3AS' tasks and the GRP 'FGRPB1G' tasks simultaneously, you would likely see various problems. It would be wise to just stick with 'O3AS', since it seems to be working fine now.

Fortunately, your tasks list shows just one type of CPU task (FGRP5) and one type of GPU task (O3AS) so you have landed on a combination that is likely to run without too much trouble :-).

In the past, trying to run O3AS GPU tasks on a 2GB GPU would indeed cause "insufficient memory" problems as archae86 mentioned.  I'm not currently running any O3AS so I don't have recent experience but your O3AS tasks show that the current memory requirements must be lower for the latest iteration of these tasks.  This could change in the future as different pulsar spin frequency ranges are explored, so you do need to keep an eye on things.  If you start seeing tasks failing, just check the stderr output returned to the project for 'memory allocation' type error messages.

Cheers,
Gary.

Cruncher-American
Joined: 24 Mar 05
Posts: 70
Credit: 4503260497
RAC: 5262237

Yup, I was wrong about that. I didn't get any error messages at startup, so I didn't think about it.

Thanks for the heads-up.

But my other two crunchers do get occasional batches of 10-20 or so Gamma Ray WUs for the graphics cards, so I will change that part from FGRP5 to FGRPB1G and see if that works OK.
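
A sketch of what the edited file might look like after that change. The first block is copied from the file posted earlier in the thread. The full app name `hsgamma_FGRPB1G` is an assumption: the thread gives only the short form 'FGRPB1G', and the `hsgamma_` prefix is carried over from the original entry, so it is worth checking against the app names the client actually reports.

```xml
<app_config>
   <app>
      <!-- GW GPU app; name and values unchanged from the file posted earlier -->
      <name>einstein_O3AS</name>
      <gpu_versions>
         <!-- 0.5 of a GPU per task, i.e. 2 tasks per GPU; use 1.0 for 1 task per GPU -->
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <!-- ASSUMPTION: full app name; the thread only names it 'FGRPB1G' -->
      <name>hsgamma_FGRPB1G</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
```

Keep in mind the caution above about running O3AS and FGRPB1G tasks at the same time on one card; this file only sets how many tasks of each kind share a GPU, it does not stop the two types from being mixed.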
