Discussion Thread for the Continuous GW Search known as O2MD1 (now O2MDF - GPUs only)

petri33
Joined: 4 Mar 20
Posts: 124
Credit: 4061995819
RAC: 6848421


Keith Myers wrote:

Quote:

"One task only, just one. -- Sir? -- One ping only, just one. It is time."

Ha ha hah. Big thanks, Petri, for the meme.

 

Hi Keith!

It is time to get some sleep now. I'll be back.

 

Petri

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3960
Credit: 47069122642
RAC: 65407114


petri33 wrote:

B.t.w. have you tried running O3AS on your RTX 2080 (vanilla or Ti), or on your other host similar to mine, to take a look at run times for reference? "One task only, just one. -- Sir? -- One ping only, just one. It is time."

 


Petri, I have run through several tasks. See the task results here: https://einsteinathome.org/host/12830750/tasks/0/55 (just ignore the aborted ones).

 

Running environment/setup:

  • CPU: Ryzen 9 5950X @ 4.45GHz
  • MEM: DDR4 3600MHz CL14 non-ECC UDIMM
  • GPU: RTX 2080 Ti @ 1980MHz; power limit 225W
  • Ubuntu 20.04.2, kernel 5.8.0-50, nvidia driver 460.73
  • all tasks were "h1_0298.40_O3aM1In1__O3ASE1_298.50Hz_xxxx_x"
  • CPU loaded with Universe@home on other threads, total CPU utilization ~92-95%
  • app_config.xml forcing 1CPU - 1GPU, only 1 task per GPU

 

Observations:

  • ~1200MB GPU memory used

  • Task Progress:


    • ~0-15%,  (110-115% CPU thread utilization, ~75% GPU utilization)
    • ~15-99%, (102-105% CPU thread utilization, ~85% GPU utilization)
    • 99-100%, (99-100% CPU thread utilization, ~1-2% GPU utilization)


  • 8 tasks, time to reach: 99% / 100% (m:ss)


    • 5:07 / 7:07
    • 5:08 / 7:09
    • 5:00 / 7:00
    • 5:02 / 7:02
    • 5:04 / 7:05
    • 5:07 / 7:02
    • 5:00 / 6:59
    • 5:07 / 7:07


 

So in my case, it always took about 2 minutes for that final 99-100% push. I'm sure CPU speed plays a big role in how long this final "recalculation" takes to complete.
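For anyone who wants to watch numbers like these on their own box: I'm not claiming this is exactly how it was measured, but on Linux something like the two commands below does the job. The "einstein" process-name match is just a guess at the app binary's name; check yours with ps first.

    watch -n 1 nvidia-smi                # GPU utilization and memory use, refreshed every second
    top -p "$(pgrep -d, -f einstein)"    # CPU use of the matching E@H processes (over 100% = more than one busy thread)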

_________________________________________________________________________

petri33
Joined: 4 Mar 20
Posts: 124
Credit: 4061995819
RAC: 6848421


Thank you Ian. You sure do have a fast CPU.

Petri

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117709822391
RAC: 35055752


petri33 wrote:

Where can I find the explanation for the three-minute computation that used to take 50 seconds?

You can find Bernd's explanation in this message.

In particular, read the paragraph starting, "Regarding the "cleanup at 99%": ...".

Cheers,
Gary.

Raistmer*
Joined: 20 Feb 05
Posts: 208
Credit: 181427561
RAC: 6840


The new is quite often the well-forgotten old :)))

The mutex approach was used in SETI apps at the beginning of the GPU era, in the AstroPulse ATI Brook+ apps if I recall correctly, before BOINC was able to support GPUs at all. It was not only an optimization need but a compatibility one: since BOINC didn't differentiate CPU and GPU apps, the setup was to run extra "CPU" instances and do inter-process communication between them to schedule work in a BOINC-independent way. One "global" mutex balanced the number of CPU vs. Brook+ GPU apps, while a second guarded the GPU parts of the Brook+ AstroPulse code (I didn't find a good enough FFT library that could avoid moving data back to system memory, which killed all the performance boost from the "Brooked" part, hence fully loading the GPU required a few instances in flight).

 

EDIT: More precisely, there was a so-called "team" version of AstroPulse, via the anonymous platform of course.

BOINC only knew about its CPU part. When the BOINC scheduler launched the CPU part of the team app, it checked the global mutex for the current number of GPU instances in flight. If the target was already reached, it continued as the CPU-optimised app; if not, it launched the Brook+ version and awaited its finish. From BOINC's point of view, there was just a CPU app associated with each particular WU.
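Just to illustrate the idea (this is a from-memory sketch, not the actual SETI code; it uses a POSIX named semaphore where the real Windows app used a named mutex, and run_gpu_app()/run_cpu_app() are placeholder stubs):

    /* Sketch of the "team" app dispatch: a named semaphore plays the
       role of the global mutex counting GPU instances in flight. */
    #include <fcntl.h>
    #include <semaphore.h>
    #include <stdio.h>

    #define MAX_GPU_INSTANCES 2   /* assumed target number of GPU apps */

    static void run_gpu_app(void) { puts("Brook+ GPU path"); }      /* placeholder */
    static void run_cpu_app(void) { puts("CPU-optimised path"); }   /* placeholder */

    int main(void) {
        /* Every instance opens the same named semaphore, created once
           with the allowed number of concurrent GPU instances. */
        sem_t *gpu_slots = sem_open("/gpu_slots", O_CREAT, 0644,
                                    MAX_GPU_INSTANCES);
        if (gpu_slots == SEM_FAILED) { perror("sem_open"); return 1; }

        if (sem_trywait(gpu_slots) == 0) {
            run_gpu_app();        /* a GPU slot was free */
            sem_post(gpu_slots);  /* release the slot when done */
        } else {
            run_cpu_app();        /* target reached: stay on the CPU path */
        }
        sem_close(gpu_slots);
        return 0;
    }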

The same approach could be used these days for devices unknown to BOINC, like FPGAs (actually, modern BOINC versions have a mechanism to describe such unknown coprocessors in the XML configuration, but I can't say how good it is - never tested it).
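If I remember the docs right, that mechanism is the <coproc> option in cc_config.xml; something like the snippet below should declare one instance of a coprocessor BOINC otherwise knows nothing about (the type name "FPGA" here is arbitrary, and again, I never tested it):

    <cc_config>
      <options>
        <coproc>
          <type>FPGA</type>
          <count>1</count>
        </coproc>
      </options>
    </cc_config>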

bozz4science
Joined: 4 May 20
Posts: 15
Credit: 69043923
RAC: 119432


lol, that 5950X is a beast. My 3700X needed roughly 6:30-7:00 min for the CPU-intensive toplist recalculation, which essentially renders the O3 GW tasks much less rewarding than both the O2-run WUs and the FGRPB1G tasks.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3960
Credit: 47069122642
RAC: 65407114


bozz4science wrote:

lol, that 5950X is a beast. My 3700X needed roughly 6:30-7:00 min for the CPU-intensive toplist recalculation, which essentially renders the O3 GW tasks much less rewarding than both the O2-run WUs and the FGRPB1G tasks.

Unless you were running the CPU at 100% load (overcommitting with too many tasks), I wouldn't expect the 3700X to be more than ~20% slower clock for clock, since this is a single-threaded process.

I took a peek at your tasks and noticed your wingmen were usually doing the final toplist in 2-3 mins too, even on arguably weaker CPUs like an i3-9100. You really need to leave spare CPU resources for the GW tasks; they are CPU bound. The faster your GPU is, the faster your CPU needs to be, and you can never run at 100% CPU load; that just slows everything down. I also recommend running an app_config.xml file to force 1 CPU - 1 GPU, so that BOINC properly accounts for free resources and doesn't try to run too many tasks.
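A minimal app_config.xml for that could look like the snippet below. The <name> value is only a placeholder; copy the real app name from the <app> entries in your client_state.xml. Drop the file into the Einstein@Home project directory and use Options -> Read config files in BOINC Manager.

    <app_config>
      <app>
        <name>einstein_O3AS</name>
        <gpu_versions>
          <gpu_usage>1.0</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>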

_________________________________________________________________________

bozz4science
Joined: 4 May 20
Posts: 15
Credit: 69043923
RAC: 119432


Thanks Ian&Steve for taking a look! This surprised me as well, but somehow all of my E@H tasks have shown some weirdness lately: 15+ errors recently on my system, without my having changed any settings, and everything ran smoothly before. I'll likely revisit this tomorrow morning after running a few tasks overnight. I do run an app_config at a setting of 1 CPU & 1 GPU and always leave one thread for system overhead, so that cannot be the issue. It might, however, be that at the time of processing, Windows spawned some stupid system processes that temporarily led to CPU overcommitment... I'll keep a closer eye on it next time I run these tasks.

Thanks again for your advice!

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3960
Credit: 47069122642
RAC: 65407114


Ran through some more O3ASE tasks on a different platform.

see the task results here: https://einsteinathome.org/host/12803486/tasks/0/55

 

Running environment/setup:

  • CPU: EPYC 7402P @ 3.30GHz
  • MEM: DDR4 3200MHz CL22 ECC RDIMM
  • GPU: RTX 2080 Ti @ 1980MHz; power limit 225W
  • Ubuntu 20.04.2, kernel 5.8.0-50, nvidia driver 460.73
  • all tasks were "h1_0398.80_O3aM1In1__O3ASE1_399.00Hz_xxxx_x"
  • CPU loaded with Universe@home on other threads, total CPU utilization ~95%
  • app_config.xml forcing 1CPU - 1GPU, only 1 task per GPU

 

Observations:

  • ~1200MB GPU memory used

  • Task Progress:


    • ~0-15%,  (115-118% CPU thread utilization, ~70% GPU utilization)
    • ~15-99%, (105-108% CPU thread utilization, ~80% GPU utilization)
    • 99-100%, (99-100% CPU thread utilization, ~0% GPU utilization)


  • 8 tasks, time to reach: 99% / 100% (m:ss)


    • 5:34 / 9:31
    • 5:24 / 9:19
    • 5:28 / 9:24
    • 5:30 / 9:32
    • 5:35 / 9:31
    • 5:27 / 9:23
    • 5:41 / 9:35
    • 5:30 / 9:27


 

So in this case, it always took around 4 minutes for that final 99-100% push. Slower CPU/mem here, but I also think the type of task (higher freq, and DF?) plays some role too. I'll have to see what the wingmen do.

_________________________________________________________________________

GWGeorge007
Joined: 8 Jan 18
Posts: 3066
Credit: 4972007686
RAC: 1430935


Ian&Steve C. wrote:

Ran through some more O3ASE tasks on a different platform. [...] So in this case, it always took around 4 minutes for that final 99-100% push. Slower CPU/mem here, but I also think the type of task (higher freq, and DF?) plays some role too. I'll have to see what the wingmen do.

Just as a point of comparison, I too have some O3ASE tasks on a 3950X platform.

Running environment/setup:

  • CPU: Ryzen 9 3950X at 4,299 MHz (OC)
  • MEM: DDR4 3200MHz CL14 Non-ECC DIMM
  • GPU: (2x) RTX 2070 Super @ ~1980 MHz and ~2025 MHz; No power limit (presently peak draw ~200W)
  • Windows 10 version 1909 (OS Build 18363.1379)
  • Tasks were "h1_0399.80_O3aM1In1__O3ASE1_399.80Hz_xxxx_x" and "h1_0399.80_O3aM1In1__O3ASE1_399.40Hz_xxxx_x"
  • CPU loaded with Universe@home and Milkyway@home on other threads, total CPU utilization 100%
  • app_config.xml forcing 1CPU - 1GPU, only 1 task per GPU

Observations:

  • ~1,120 MB to ~1,520 MB GPU memory used
  • ~1,750 MHz memory clock (both cards)

My present progress for O3ASE tasks on my 3950X (computer 12851564):

Too few to make a real observation yet. The CPU time required to complete the 5 valid tasks ranged from ~895 secs (14:54 min:sec) up to ~918 secs (15:18 min:sec).

I have no Gamma-ray Pulsar tasks currently in progress: https://einsteinathome.org/host/12851564/tasks/0/0

George

Proud member of the Old Farts Association
