GPU use optimisation ?

[AF>Amis des La...

Joined: 25 Jun 13

Posts: 4

Credit: 361281496

RAC: 0

20 Oct 2014 8:28:27 UTC

Topic 197758

Hello,

Could someone be kind enough to explain to me why Einstein's WUs, at only 280000 GFLOPs, take such a long time to complete?

I am running ACERIBO GPU, and need to run 2 WUs at once to reach a GPU usage of about 90%.

Tasks are completed in about 1 hour.

I am not an IT specialist, but how come that on other projects I don't want to mention here, 13400000 GFLOPs WUs are completed in less than 30 or 45 minutes (depending on whether CUDA or OpenCL is used)?

Are "optimised apps" available?

Thank You

Kind Regards

Phil1966

Logforme

Joined: 13 Aug 10

Posts: 332

Credit: 1714373961

RAC: 0

GPU use optimisation ?

20 Oct 2014 10:33:21 UTC

Message 124172

It would help if you mentioned the name of the unmentionable project; otherwise a comparison is difficult.

Could it be because the unmentionable projects do simple integer calculations that can be performed completely on the GPU, while E@H does complicated floating point math (and thus qualifies for the FL in FLOP) that can't be done completely on the GPU and therefore requires a lot of CPU as well as GPU?

Who knows?

mikey

Joined: 22 Jan 05

Posts: 12877

Credit: 1884381453

RAC: 155946

RE: It would help if you

20 Oct 2014 11:21:46 UTC

Message 124173 in response to message 124172

Quote:

It would help if you mentioned the name of the unmentionable project; otherwise a comparison is difficult.

Could it be because the unmentionable projects do simple integer calculations that can be performed completely on the GPU, while E@H does complicated floating point math (and thus qualifies for the FL in FLOP) that can't be done completely on the GPU and therefore requires a lot of CPU as well as GPU?

Who knows?

I think the old nail took a pretty big whack right on its head with that explanation! Some projects can get a lot of their units into the much faster GPU memory to crunch; others can't, and those that can't suffer by comparison. BUT they are still much faster than projects whose units don't use the GPU at all!

[AF>Amis des La...

Joined: 25 Jun 13

Posts: 4

Credit: 361281496

RAC: 0

RE: It would help if you

21 Oct 2014 17:08:59 UTC

Message 124174 in response to message 124172

Quote:

It would help if you mentioned the name of the unmentionable project; otherwise a comparison is difficult.

Could it be because the unmentionable projects do simple integer calculations that can be performed completely on the GPU, while E@H does complicated floating point math (and thus qualifies for the FL in FLOP) that can't be done completely on the GPU and therefore requires a lot of CPU as well as GPU?

Who knows?

...

Fortunately, some teammates who have been crunching for over 10 years have given me more detailed and constructive explanations. As far as I know, there are not 25 "GPU projects". But most other projects offer optimizations, so that crunchers who invest in new hardware can increase their participation.

Thank You anyway.

NB: First and last question on this forum.

ExtraTerrestria...

Joined: 10 Nov 04

Posts: 770

Credit: 587061403

RAC: 109763

Phil, don't let a single

21 Oct 2014 19:20:56 UTC

Message 124175 in response to message 124174

Phil, don't let a single answer be representative for an entire forum or project.

The problem with GPUs is that they're not very flexible. Hence simple algorithms with few memory accesses and practically no CPU intervention achieve the best hardware utilization. By this I don't mean the percentage shown in monitoring utilities, but rather the throughput actually achieved in GFLOPS. The project you're comparing against very likely belongs in this category. The problem with such projects is that, well, not very many real-world problems can be tackled this way. That's why some of them work on completely arbitrary "problems".

Einstein, on the other hand, uses sophisticated and complex calculations. BRP GPU tasks can be limited by GPU memory bandwidth, and PCIe bandwidth and CPU performance also matter. As far as I understand, those algorithms are optimized very well. For nVidia GPUs there may be some room for improvement left with newer compilers, but those do not yet work with Einstein's cross-platform compilation scheme due to some bug(s). Currently no user-supplied apps are available, but the source code is. I don't know if anyone has already tried to achieve anything better than what the project delivers.
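To make the memory bandwidth point a bit more concrete, here is a rough roofline-style estimate in Python. It is only a sketch: the peak and bandwidth figures and the two "kernels" are made-up placeholders, not measurements of any real card or of the Einstein apps.

```python
def attainable_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_byte):
    """Roofline estimate: throughput is capped either by the compute peak
    or by how fast data can be streamed from GPU memory."""
    return min(peak_gflops, mem_bandwidth_gbs * flops_per_byte)

# Hypothetical GPU (assumed values, not a real product).
PEAK = 3000.0       # GFLOPS
BANDWIDTH = 200.0   # GB/s

# Hypothetical kernels: one streams a lot of data per operation,
# the other does many operations per byte it loads.
memory_heavy = attainable_gflops(PEAK, BANDWIDTH, flops_per_byte=0.5)
compute_heavy = attainable_gflops(PEAK, BANDWIDTH, flops_per_byte=50.0)

print(f"memory-heavy kernel:  {memory_heavy:.0f} GFLOPS")
print(f"compute-heavy kernel: {compute_heavy:.0f} GFLOPS")
```

With these placeholder numbers the memory-heavy kernel is stuck far below the card's peak no matter how busy the GPU looks in a monitoring tool, while the compute-heavy one can reach it.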

And finally, flop counting itself can be problematic. Suffice it to say that there are different ways to do it, and sometimes one has to rely on estimates (which could be way off).

Edit: does that match what your team colleague told you? If not I'm certainly interested to hear his ideas.
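As a quick sanity check on the figures quoted in this thread, the sustained throughput they imply can be worked out in a few lines of Python. This is only a sketch and assumes that each of the two concurrent Einstein tasks takes about 1 hour; the helper name is made up for illustration.

```python
def implied_gflops(wu_gflop, seconds, concurrent=1):
    """Sustained GFLOPS implied by a work unit's stated size and run time."""
    return wu_gflop * concurrent / seconds

# Figures from the first post; the concurrency interpretation is an assumption.
einstein = implied_gflops(280_000, 60 * 60, concurrent=2)
other_fast = implied_gflops(13_400_000, 30 * 60)
other_slow = implied_gflops(13_400_000, 45 * 60)

print(f"Einstein:      {einstein:.0f} GFLOPS")
print(f"Other project: {other_slow:.0f} to {other_fast:.0f} GFLOPS")
```

Taken at face value, the other project's numbers imply roughly 30 to 50 times the throughput on the same card. A gap that large may say as much about how each project estimates its flop count as about how well the apps are optimized.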

MrS

Scanning for our furry friends since Jan 2002
