Bernd Machenschalk wrote: [...]
Great! There's still a glitch in the flow, though: downloads are failing. Success rate 0/10, permanent HTTP errors.
https://einsteinathome.org/task/1069674861
Thank you for the update, Bernd.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Richie wrote: [...]
Oh dear! Thanks for the note! Shouldn't take long to fix. Sending GW work is suspended until this is fixed.
BM
Does anyone have a clue about the highest-performing system for GW GPU tasks?
I am in the process of working up a GW only system and am wondering what kind of goal I should be shooting for.
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
I just got done bumping this system up to 6 GPUs with 2 tasks per GPU. I wanted to keep as many CPUs available for additional expansion of GPU cards as I could, so I set the limit to 0.5 CPU per GPU task.
The preliminary result was that each task was taking twice as long as it had previously, which means there was no gain from running two tasks per GPU.
I have since reset the CPU reservation to the original default (0.9) per GPU task. I thought I saw a massive speed-up in processing.
==edit===
It may be that I am now processing at the same speed as with a single task per GPU.
====edit=deleted===
There is probably a hard limit on the number of GPU tasks a single system can push on GW work: no more than 1 GPU task per CPU thread. That means you likely can't run 36 GPU tasks at "full speed" on an 18-slot motherboard like the B360-F Pro, because its CPU tops out at 8c/16t.
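For reference, the per-task CPU and GPU reservations discussed above live in BOINC's app_config.xml in the project directory. A minimal sketch; the app name einstein_O2MD1 here is an assumption (check the &lt;name&gt; fields in your client_state.xml for the actual one):

```xml
<app_config>
  <app>
    <name>einstein_O2MD1</name>
    <gpu_versions>
      <!-- 0.5 GPUs per task: BOINC schedules two tasks per GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- CPU reserved per GPU task (the 0.5 vs. default 0.9 discussed above) -->
      <cpu_usage>0.9</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After editing the file, "Options → Read config files" in the BOINC Manager applies it without restarting the client.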
Tom M
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Does anyone know why there are no new work units available for O2MDF?
As far as I know, they are working on the transition to O3ASE; O2MDF will go away.
Thank you.
I look forward to O3ASE.
I sincerely hope they program O3ASE to run on GPUs with only 2 GB of RAM. I would like to be able to run them not only on my computer with an NVIDIA GeForce GTX 1070, but also on my computer with a GeForce GTX 960. There are still a lot of older GPUs out there with 2 GB of RAM that can process a lot of work units per day for Einstein. :-)
Hi Bernd,
I looked at my .nv/ComputeCache and found some new OpenCL code in the source that has Xlal_BSGL stuff and SemiCoherent code
<code>
/* ... this is just like it was before ... */
#ifndef PULSAR_MAX_DETECTORS
#define PULSAR_MAX_DETECTORS 2
#endif
/* ... and a couple of lines below is some new code, which is overridden
   by the previous definition and has no effect ... */
#ifndef PULSAR_MAX_DETECTORS
#define PULSAR_MAX_DETECTORS 10
#endif
</code>
a) Is this the way it should be?
b) Another question: is there a known hard upper limit in CopyBSGLSetup() for the UINT4 numDetectors argument?
O3ASE tasks spend about three minutes recalculating statistics on the CPU after the main analysis on the GPU has finished:
2021-05-03 10:57:57.5918 (205377) [normal]: Finished main analysis.
2021-05-03 10:57:57.5921 (205377) [normal]: Recalculating statistics for the final toplist...
2021-05-03 11:00:48.1217 (205377) [normal]: Finished recalculating toplist statistics.
O2MD did the same much faster (50 seconds):
2021-03-18 22:42:18.3670 (214468) [normal]: Finished main analysis.
2021-03-18 22:42:18.3672 (214468) [normal]: Recalculating statistics for the final toplist...
2021-03-18 22:43:11.3893 (214468) [normal]: Finished recalculating toplist statistics.
Please do not take this too seriously. I'm just wondering:
Is this recalculation just a check that the GPU has done something right? It does say "re"calculating. Is it necessary at all?
The GPU sits idle for three minutes. Looks bad: 20 s initial verification + 200 s work + 180 s idle.
I know I could run two at a time, but NVIDIA GPUs slow down a lot when doing that. Oh, how I miss the SETI mutex implementation that let one task use the GPU while the other had finished its GPU part. The pre- and post-steps on the CPU overlapped nicely with the other task's GPU phase.
Petri33