Top Production apps OS3GW or Brp7-meerKat - Discussion

Tom M
Joined: 2 Feb 06
Posts: 6758
Credit: 9723166616
RAC: 1956383

San-Fernando-Valley wrote:

Tom M wrote:

...

Additional conversation has convinced me that 1x is the only choice for Windows and Brp7/meerKat.

...

Maybe I'm off track again.

My results (rounded up more or less) for running tasks only on one Titan V are:

    1 task    420 sec
    2 tasks   330 sec
    3 tasks   320 sec    |  more or less
    4 tasks   320 sec    |  the same time

So what am I missing out on ?

 

sfv

You may be missing out on nothing.

If those numbers hold up on your Windows/Titan V machines, go with them. 4x would work out to about 3.5M RAC per GPU. Since this is an outlier, I am going to guess that your numbers will get worse.

It looks like you may be running two 3-Titan V boxes. If that is true, I would just let them "bang heads" and see which operating system comes out on top.

I am currently claiming that the new Windows (beta) O3AS running at 2x is likely the top RAC generator for Windows-based Nvidia machines.

And if you are running MPS under Linux with the optimized O3AS, then 2x is the top RAC generator.

I don't have any data for high-end Radeon GPUs under Windows, only for my dinky Radeon iGPU.

Respectfully,

A Proud member of the O.F.A.  (Old Farts Association).

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 293
Credit: 11176459479
RAC: 10369862

AndreyOR wrote:

Here're links to screenshots of HWiNFO GPU data running BRP7 single and double for about half an hour each; doubles were staggered at about 50%. BRP7 2x, BRP7 1x

I got a big enough sample size of O3AS running doubles (started out staggered at about 50%). The average of almost 60 tasks is 1525 sec/task. That's 0.2% slower than the fastest single task I have (1522 sec) and 15.7% faster than the slowest (1809 sec). So on average it seems to be ~8.5% faster per task running doubles with this new version of O3AS tasks.
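
For anyone who wants to check the two percentages quoted above, here is a minimal sketch of the arithmetic in Python. The variable names are made up for illustration, and the three runtimes are simply the ones given in the post. The ~8.5% figure is not recomputed, because it depends on the average single-task runtime, which the post does not state.

```python
# Runtimes in seconds per task, as quoted in the post.
avg_2x = 1525.0      # average over almost 60 tasks, running doubles
fastest_1x = 1522.0  # fastest single task
slowest_1x = 1809.0  # slowest single task

# How much slower the 2x average is than the fastest single,
# relative to the fastest single.
slower_than_fastest_pct = (avg_2x - fastest_1x) / fastest_1x * 100.0

# How much faster the 2x average is than the slowest single,
# relative to the slowest single.
faster_than_slowest_pct = (slowest_1x - avg_2x) / slowest_1x * 100.0

print(f"vs fastest 1x: {slower_than_fastest_pct:.1f}% slower")
print(f"vs slowest 1x: {faster_than_slowest_pct:.1f}% faster")
```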

I am confused about these data. If both work units (on 2x) were mid-run then why is the core load and wattage lower when running 2x? The memory usage makes sense, but that seems odd to me. Ian&Steve C, is that normal for the Titan V, since you have been using them for a long time?

San-Fernando-Valley wrote:

Maybe I'm off track again.

My results (rounded up more or less) for running tasks only on one Titan V are:

    1 task    420 sec
    2 tasks   330 sec
    3 tasks   320 sec    |  more or less
    4 tasks   320 sec    |  the same time

So what am I missing out on ?

sfv

This is also what I see: 2x is better, and then there is no improvement past 2x in Windows with the A4500.

O3AS might be more productive on a given system; there is no substitute for just letting it run to see what happens in the long term.

cecht
Joined: 7 Mar 18
Posts: 1596
Credit: 2997286637
RAC: 1389306

Boca Raton Community HS wrote:

O3AS might be more productive on a given system; there is no substitute for just letting it run to see what happens in the long term.

+1

Ideas are not fixed, nor should they be; we live in model-dependent reality.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 534
Credit: 10581333693
RAC: 5400708

Tom M wrote:

...

You may be missing out on nothing.

If those numbers hold up on your Windows/Titan V machines, go with them. 4x would work out to about 3.5M RAC per GPU. Since this is an outlier, I am going to guess that your numbers will get worse.

It looks like you may be running two 3-Titan V boxes. If that is true, I would just let them "bang heads" and see which operating system comes out on top.

...

What do you mean by an "outlier" ?

I'm running way more than 2 boxes on Titans.

I was "told" that Ubuntu is way faster than Windows ... ?

So, now what ?

cheers

sfv

Tom M
Joined: 2 Feb 06
Posts: 6758
Credit: 9723166616
RAC: 1956383

San-Fernando-Valley wrote:

What do you mean by an "outlier" ?

I'm running way more than 2 boxes on Titans.

I was "told" that Ubuntu is way faster than Windows ... ?

So, now what ?

cheers

sfv

==edit==

Lots of high-falutin' discussion deleted.

==end edit===

I don't have a clue. All my experience shows that, on the same hardware, Windows crunches "slower" than Linux. But that doesn't mean it's some kind of "law". It just means we haven't had any counter-examples show up.

Respectfully,

A Proud member of the O.F.A.  (Old Farts Association).

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 293
Credit: 11176459479
RAC: 10369862

Does anyone who uses Titan V or Quadro GPUs also run them in TCC mode (Windows only, if multiple GPUs are installed)? We switched one of the A4500 GPUs to this mode and we think we see an uplift in performance from this GPU. We do not have many work units completed yet in this mode, so we will see.

It took some changes in the BIOS to make the second GPU show up after switching to TCC, but it makes sense that this mode would be superior (apparently it launches CUDA instances faster).

If we do see an improvement, we will post more info/instructions for what we had to do to make this work on our machine. 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4115
Credit: 49134095085
RAC: 32276955

San-Fernando-Valley wrote:

Tom M wrote:

...

You may be missing out on nothing.

If those numbers hold up on your Windows/Titan V machines, go with them. 4x would work out to about 3.5M RAC per GPU. Since this is an outlier, I am going to guess that your numbers will get worse.

It looks like you may be running two 3-Titan V boxes. If that is true, I would just let them "bang heads" and see which operating system comes out on top.

...

What do you mean by an "outlier" ?

I'm running way more than 2 boxes on Titans.

I was "told" that Ubuntu is way faster than Windows ... ?

So, now what ?

cheers

sfv



Linux is absolutely still faster, both because of the better, more optimized application and because of the ability to use MPS.

Tom is getting confused by your numbers. He's misreading your stated runtimes as wall-clock times instead of effective per-task runtimes. Your Titan V will not do the 3.5M ppd that Tom is claiming. I mean, you can just look at the results yourself and see it's not doing that. Your runtimes clearly show that your 3x config ran in ~1300s "wall clock", which makes perfect sense to me.

Linux optimized + MPS is closer to 200s per task "effective" on a Titan V.
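
To put the effective per-task times in this thread side by side, here is a rough Python sketch that turns them into tasks per day per GPU. It assumes sfv's table values are effective per-task times (the reading given above) and uses the approximate 200s figure for Linux + MPS; the names `tasks_per_day`, `windows_titan_v`, and `linux_mps_sec` are made up for this illustration, and no credit-per-task value is assumed.

```python
SECONDS_PER_DAY = 86_400

def tasks_per_day(effective_sec: float) -> float:
    """Tasks completed per day per GPU at a given effective per-task time."""
    return SECONDS_PER_DAY / effective_sec

# Effective sec/task on one Titan V under Windows, keyed by tasks run at once
# (values taken from sfv's table, assumed to be effective times).
windows_titan_v = {1: 420, 2: 330, 3: 320, 4: 320}

# Approximate effective sec/task quoted for the Linux optimized app + MPS.
linux_mps_sec = 200

for n, sec in windows_titan_v.items():
    print(f"Windows {n}x: {tasks_per_day(sec):.0f} tasks/day")
print(f"Linux + MPS: {tasks_per_day(linux_mps_sec):.0f} tasks/day")

# Ratio of Linux + MPS throughput to the best Windows configuration.
best_windows_sec = min(windows_titan_v.values())
speedup = best_windows_sec / linux_mps_sec
print(f"Linux + MPS vs best Windows config: {speedup:.2f}x")
```

On these numbers the gain from 2x to 3x under Windows is small, while the Linux + MPS figure corresponds to noticeably more tasks per day.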


 

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 6758
Credit: 9723166616
RAC: 1956383

Ian&Steve C. wrote:

Linux is absolutely still faster, both because of the better, more optimized application and because of the ability to use MPS.

Tom is getting confused by your numbers.

===edit===

Linux optimized + MPS is closer to 200s per task "effective" on a Titan V.

"What me confused? Nah."

Attributed to Alfred E Neuman of Mad Magazine.

A Proud member of the O.F.A.  (Old Farts Association).

AndreyOR
Joined: 28 Jul 19
Posts: 46
Credit: 746565711
RAC: 880678

Boca Raton Community HS wrote:

I am confused about these data. If both work units (on 2x) were mid-run then why is the core load and wattage lower when running 2x? The memory usage makes sense, but that seems odd to me. Ian&Steve C, is that normal for the Titan V, since you have been using them for a long time?

The GPU core load is higher at 2x (avg. 98.5% vs. 83.5%), but I did notice too that the wattage is lower at 2x. Memory controller load is lower at 2x too. Seems odd to me too.

Another thing that looks peculiar to me is the times you and another user posted. Mine are the reverse: I get 270-300 sec/task running BRP7 1x, and it slows down to 420+ sec at 2x. It seems like my 1x times are faster than you-all's 2x or 3x.

 

Keith Myers
Joined: 11 Feb 11
Posts: 5054
Credit: 19120897390
RAC: 5215869

When you run 2X or 3X, the elapsed times shown for each task need to be divided by that multiplier to get the 'effective' elapsed times.

So your 420+ second tasks are actually completing in 210 seconds each, in other words faster ("more productive") than your 270 second tasks at 1X.
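
The division described above can be written as a small Python helper. This is only a sketch: the function name `effective_runtime` is made up for illustration, and the inputs are the 420 sec (2x) and 270 sec (1x) figures from this thread.

```python
def effective_runtime(elapsed_sec: float, concurrent_tasks: int) -> float:
    """Effective per-task time when several tasks share one GPU.

    elapsed_sec is the elapsed time shown for a single task;
    concurrent_tasks is how many tasks ran on the GPU at once (1, 2, 3, ...).
    """
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return elapsed_sec / concurrent_tasks

eff_2x = effective_runtime(420, 2)  # two tasks at once, 420 sec elapsed each
eff_1x = effective_runtime(270, 1)  # one task at a time, 270 sec elapsed

print(f"2x effective: {eff_2x:.0f} sec/task")
print(f"1x effective: {eff_1x:.0f} sec/task")
print("2x is more productive" if eff_2x < eff_1x else "1x is more productive")
```

The comparison only holds if the elapsed times were measured while the GPU was actually running that many tasks for the whole run.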

 
