Gravitational Wave Engineering run on LIGO O1 Open Data

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7282091708
RAC: 2033569

Continuing observations on my Radeon VII running 0.11.

It hit 99% reported progress after about 80 minutes elapsed time, at which point GPU usage dropped to zero.

GPU-Z reports the GPU clock at 25 MHz(!!!), memory clock at 349, temperature at 30/31, power at 20W, but memory usage still at 1541/189.

That lasted for a bit over 5 minutes (toplist statistics recalculation), after which the task completed and uploaded, with a reported elapsed time of 1:24:51.

If you care to take a look at the task it is 839004421.  As the second task to fulfill the quorum is unsent, it may be some time before we can see validation or not.

Naturally running this task smashed my DCF up by over a factor of ten.  As I've unsuspended the GRP work in my queue, that work is getting burned off in panic mode.  It may be some time before I can try 2X or 3X on the GW work.

DF1DX
Joined: 14 Aug 10
Posts: 106
Credit: 3904964432
RAC: 468916

With my Radeon VII (Win 7, AMD 19.4.1, einstein_O1OD1E_0.11_windows_x86_64__GW-opencl-ati-V1.exe) I can confirm the observation of archae86 - very low GPU load and clock frequency (~ 400 MHz) with a single WU.

When starting a second WU my PC crashed after a few minutes. So back to Gamma 1.18.

On the NV 1050Ti in my second host https://einsteinathome.org/de/host/12247194/tasks/2/0
3 WUs run in parallel, time about 7800 s. No problems at the moment.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 472
Credit: 10400537364
RAC: 2697309

The engineering 0.11 GPU WUs are running between 1 and 6 hours on fast NVIDIA!! GPU load varies between 1% and at most 10%!! Much effort for 144,000 GFLOPS.

Some WUs getting ERROR at address x'......0005'

Running WIN7 and WIN10.

I'm stopping 0.11 for now and will wait for "improvements".

tolafoph
Joined: 14 Sep 07
Posts: 122
Credit: 74659937
RAC: 0

crashtech wrote:
I'm not going to run these for now because they massively underutilize my GPUs. Maybe there is an app_config that would help?

I have just set it to run more than one task at a time. It is at 2 now, which I did by changing the value for the GW app to 0.5 in the settings on the website. https://einsteinathome.org/de/account/prefs/project

Now the GPU load is between 50 and 55%. But the clock is not at its maximum, I believe. It can run at around 1800 MHz, but it is between 1600 and 1700 MHz most of the time.

For the next tasks I will try 3 at once.

Edit: 2 tasks finished in about 7700s, compared to the 5500s for a single one.

mmonnin
Joined: 29 May 16
Posts: 292
Credit: 3444726540
RAC: 361919

Looks like these are similar to the first revisions of the current GPU tasks. Low GPU utilization due to being heavily CPU dependent. Hopefully it'll improve.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7282091708
RAC: 2033569

Running the GW 0.11 Windows AMD application at 3X changed things materially, although the Radeon VII still is very lightly used.

1X    3X    Variable
64%   73%   GPU Load
39W   53W   GPU-only Power Draw
463   880   GPU Clock (MHz)
798   839   Memory Clock (MHz)
37C   44C   GPU Temperature
39C   49C   Hot Spot GPU Temperature

The reported memory usage, meanwhile, barely budged, which seems odd.

As you might suppose, the machine is considerably more productive at 3X than 1X on this work. I've tampered with some things mid-stream, and don't have many results, but on this machine a 1X run this morning took an elapsed time of 5091 seconds, while a set of three which just finished (yes, I offset them some, but not enough) took only about 6200 seconds, so a huge productivity boost. One of those validated on completion, which is comforting.  This host has only four cores, and is not hyperthreaded, so although 4X might well work, and might be slightly more productive, I'm not tempted to try it.
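To put a number on that productivity boost, here is a minimal sketch that converts the two elapsed times quoted above into tasks per hour. It assumes all three tasks of the 3X set finished within the roughly 6200 seconds reported; the function name is made up for illustration.

```python
def tasks_per_hour(concurrent_tasks: int, elapsed_seconds: float) -> float:
    """Tasks completed per hour when `concurrent_tasks` run side by side
    and the whole set takes `elapsed_seconds` to finish."""
    return concurrent_tasks * 3600.0 / elapsed_seconds


# Elapsed times quoted in the post (Radeon VII host).
rate_1x = tasks_per_hour(1, 5091)   # one task alone
rate_3x = tasks_per_hour(3, 6200)   # three tasks sharing the GPU
speedup = rate_3x / rate_1x

print(f"1X: {rate_1x:.2f} tasks/h, 3X: {rate_3x:.2f} tasks/h, "
      f"throughput ratio: {speedup:.2f}x")
```

On those two data points the 3X configuration comes out at roughly 2.5 times the throughput of 1X, though a single run of each is a thin basis for a firm figure.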

My brand new (today) RX 570 host has six physical cores.  If I get some days of stable running out of it on simpler stuff, I might give 4X on this work a try on it.  That machine has so far run two of these tasks at 1X, with elapsed time around 3780 seconds.  The much faster 1X time may mean the 570 is better for this work than a Radeon VII, but more likely the 9th generation 6-core CPU burns through the CPU portion of the job much faster than the older CPU on my Radeon VII host.

Meanwhile, my primary host is indicated as having 29 days of work on board, as I unintentionally allowed more of the new GW work to download when a spate of running GRP had driven the completion estimates way back down.  I'm afraid a mass abort is in my future but I currently plan to run pure GW GPU for another half day.

tolafoph
Joined: 14 Sep 07
Posts: 122
Credit: 74659937
RAC: 0

archae86 wrote:
Meanwhile, my primary host is indicated as having 29 days of work on board, as I unintentionally allowed more of the new GW work to download when a spate of running GRP had driven the completion estimates way back down.  I'm afraid a mass abort is in my future but I currently plan to run pure GW GPU for another half day.

Yeah, the extremely different runtimes of 15 min vs 2 h are messing with the work I got. I almost ran out of tasks. I changed the work buffer from 0.25 to 0.5 days. But if I run only the 15 min tasks it might download way too many of the 2 h ones. So far, though, I haven't gotten any new GW units.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

mmonnin wrote:
Looks like these are similar to the first revisions of the current GPU tasks. Low GPU utilization due to being heavily CPU dependent. Hopefully it'll improve.

Yes, that is the way it was, and it will get better. 

But if they are having that much of a problem with OpenCL, I wonder what the chances are for CUDA?

crashtech
Joined: 16 Mar 17
Posts: 3
Credit: 3382978609
RAC: 4601738

I am curious to try running this app starting with 4 at a time, with each instance having its own physical core, like so:


<app_config>
  <app>
    <name>einstein_O1OD1E</name>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>2.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

Does anyone think this might work, and if so, does anyone know the right project name to place in the app_config? 

Edit: App name added, thanks to Keith Myers for the valuable information.
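As a rough check on what the values in that app_config imply, the sketch below works out how many tasks would run at once and how many CPU threads the client would budget for them. It assumes the usual reading of these settings: gpu_usage is the fraction of one GPU each task claims, and cpu_usage is the number of CPU threads budgeted per task (a scheduling budget, not a binding to a particular core). The function name and the single-GPU default are illustrative only.

```python
import math


def concurrency_budget(gpu_usage: float, cpu_usage: float, gpus: int = 1):
    """Return (tasks running at once, CPU threads budgeted for them).

    Assumes each task claims `gpu_usage` of one GPU and `cpu_usage`
    CPU threads, as described in the lead-in.
    """
    # Small epsilon guards against float rounding, e.g. 1 / 0.1.
    tasks_per_gpu = math.floor(1.0 / gpu_usage + 1e-9)
    total_tasks = tasks_per_gpu * gpus
    cpus_budgeted = total_tasks * cpu_usage
    return total_tasks, cpus_budgeted


tasks, cpus = concurrency_budget(gpu_usage=0.25, cpu_usage=2.0)
print(tasks, cpus)  # 4 8.0
```

Under those assumptions, 0.25 and 2.0 give four concurrent tasks with eight CPU threads set aside, which would match one physical core per task only on a hyperthreaded CPU with at least four cores.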

Keith Myers
Joined: 11 Feb 11
Posts: 5024
Credit: 18954976714
RAC: 6428021

The app name is listed in the client_state.xml file under the project section in the <app> <name> declaration.  That name is what you enter in your app_config file.
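For anyone who would rather not search the file by hand, here is a minimal sketch that pulls every <app> <name> value out of a client_state.xml document with Python's standard library. The sample XML is a made-up, heavily trimmed stand-in for the real file (only einstein_O1OD1E comes from this thread; example_other_app is hypothetical), and the code scans the whole document rather than assuming exactly where the <app> blocks sit.

```python
import xml.etree.ElementTree as ET


def list_app_names(client_state_xml: str) -> list:
    """Return the distinct <app><name> values found anywhere in the XML text."""
    root = ET.fromstring(client_state_xml)
    names = []
    for app in root.iter("app"):
        name = (app.findtext("name") or "").strip()
        if name and name not in names:
            names.append(name)
    return names


# Trimmed, illustrative stand-in for a real client_state.xml.
sample = """<client_state>
  <project><master_url>https://einsteinathome.org/</master_url></project>
  <app><name>einstein_O1OD1E</name></app>
  <app><name>example_other_app</name></app>
</client_state>"""

print(list_app_names(sample))
```

To run it against a real installation, read the text of client_state.xml from the BOINC data directory (its location varies by platform and install) and pass it to list_app_names.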

 
