Gravitational Wave search GPU App version

Thanks to the excellent work of our French volunteer Christophe Choquet, we finally have a working OpenCL version of the Gravitational Wave search ("S6CasA") application. Thank you, Christophe!

This App version is currently considered 'Beta' and is being tested on Einstein@Home. To participate in the Beta test, edit your Einstein@Home preferences and set "Run beta/test application versions?" to "yes".

It is currently available for Windows (32 Bit) and Linux (64 Bit) only, and you should have a card that supports double-precision floating point (FP64) in hardware.
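A rough way to check a card before opting in is to look at the extension string its OpenCL driver reports (for example in the output of a tool such as clinfo). The sketch below assumes that double-precision support shows up as the `cl_khr_fp64` or `cl_amd_fp64` extension; the helper name and the sample strings are made up for illustration, and this is not necessarily how the application itself decides.

```python
# Extension names that advertise double-precision support:
# cl_khr_fp64 is the Khronos extension, cl_amd_fp64 is AMD's vendor variant.
FP64_EXTENSIONS = frozenset({"cl_khr_fp64", "cl_amd_fp64"})


def supports_fp64(extensions: str) -> bool:
    """Return True if a space-separated OpenCL extension string lists FP64 support."""
    return any(name in FP64_EXTENSIONS for name in extensions.split())


if __name__ == "__main__":
    # Hypothetical extension strings, shortened for illustration.
    print(supports_fp64("cl_khr_byte_addressable_store cl_khr_fp64"))
    print(supports_fp64("cl_khr_byte_addressable_store cl_khr_icd"))
```

Matching whole extension names (rather than substrings) avoids false positives from longer names that merely contain "fp64".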

BM

Comments

Peciak
Joined: 16 Jun 09
Posts: 2
Credit: 74133035
RAC: 0

WIN 7 64, i7 870 ATI 7750

WIN 7 64, i7 870 ATI 7750 driver 14.2 -> Error while computing
http://einsteinathome.org/host/10071076/tasks&offset=0&show_names=1&state=5&appid=0

ahj
Joined: 25 Jul 10
Posts: 17
Credit: 4331992
RAC: 0

RE: My first wu

Quote:
My first wu successfully finished after 450 sec on an ATI R9 280X, GPU utilization showed 82%.
Now waiting for my wingman to see if it validates.

Hmmm... that's 8x more productive than my maxed-out i7 4771... looks like CPUs are pretty much redundant now for GW units. That, or an order for a 280x/270x will be made soon :)

Quote:


Edit: first finished wu from AMD A10 7700K APU is in:

http://einsteinathome.org/host/10283382/tasks
3539 sec !!!

Nice! A single APU graphics chip is *as fast* as my i7 when computing GW units! Now that's what I call progress :D

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: look's like CPUs are

Quote:
looks like CPUs are pretty much redundant now for GW units.

Just to avoid misunderstandings, I certainly don't want to leave this impression: if the CPU-only hosts were to retreat from the project, the project would lose a lot of its throughput. Currently GPUs provide roughly one third of the throughput (if measured by credits), and launching the GW search with GPU support will NOT change that (it will just shift the split between the GPU-enabled BRP, FGRP and GW searches). CPU hosts would still provide more than half of the scientific throughput.

HB

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: Might be worth to try

Quote:

Might be worth trying it on Intel GPUs?

Unfortunately, AFAIK, even the most recent Intel iGPUs do not support double precision via OpenCL. I'm not sure whether this is driven by technological or marketing reasons, but that's the way it is.

Cheers
HB

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: How often do these

Quote:

How often do these checkpoint? I had been running for over 30 minutes and exited BOINC, and the WU started over from the beginning.

Thanks.

That is strange. On a reasonably fast GPU, the code should have made it to the first checkpoint opportunity within 30 minutes. It's different for the CPU versions, where the problem you mentioned is known to happen (depends on the tasks, IIRC).

HB

astro-marwil
Joined: 28 May 05
Posts: 534
Credit: 662776543
RAC: 573931

Hello Bikeman! I got 2 tasks

Hello Bikeman!
I got only 2 tasks. Both finished successfully, and one has been validated so far. On my ATI HD7790 the GPU version is about a factor of 16 faster than the CPU-only one. That counts!!!
We hope to get more of them soon.

Kind regards and happy crunching
Martin

rbpeake
Joined: 18 Jan 05
Posts: 266
Credit: 1145212797
RAC: 623283

RE: RE: How often do

Quote:
Quote:

How often do these checkpoint? I had been running for over 30 minutes and exited BOINC, and the WU started over from the beginning.

Thanks.

That is strange. On a reasonably fast GPU, the code should have made it to the first checkpoint opportunity within 30 minutes. It's different for the CPU versions, where the problem you mentioned is known to happen (depends on the tasks, IIRC).

HB

It was on a slow GPU. That explains it!

Thanks!

rbpeake
Joined: 18 Jan 05
Posts: 266
Credit: 1145212797
RAC: 623283

I seem to get the GW CPU apps

I seem to get the GW CPU apps as well, even though I did not select that in my project preferences.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: RE: look's like CPUs

Quote:
Quote:
looks like CPUs are pretty much redundant now for GW units.

Just to avoid misunderstandings, I certainly don't want to leave this impression: if the CPU-only hosts were to retreat from the project, the project would lose a lot of its throughput. Currently GPUs provide roughly one third of the throughput (if measured by credits), and launching the GW search with GPU support will NOT change that (it will just shift the split between the GPU-enabled BRP, FGRP and GW searches). CPU hosts would still provide more than half of the scientific throughput.

HB

I'm not so sure that this is the case. On one of my hosts the most common wingman is 'Unsent', at least for the last 27 finished wu's. It looks like a big imbalance is growing.
Since gpu wu's are 8 times faster, we need 8 times the number of cpu-only hosts to validate the results, at least as long as every gpu task validates against a cpu task.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: Edit: first finished

Quote:


Edit: first finished wu from AMD A10 7700K APU is in:

http://einsteinathome.org/host/10283382/tasks
3539 sec !!!

Nice! A single APU graphics chip is *as fast* as my i7 when computing GW units! Now that's what I call progress :D

Oh yeah, and that for half the price :)))

Rob
Joined: 17 Apr 12
Posts: 8
Credit: 1781499
RAC: 0

Hi, I had one Beta

Hi,

I had one Beta workunit so far (currently computing for SETI, but I wanted to help the beta tests) with the app version 1.07. It did not end successfully.
http://einsteinathome.org/task/430286129
Using an AMD HD 7950 with Catalyst 14.4 Beta and Windows 7.
Will try again, if I have time, but currently I'm on Linux Mint (I'll report if I get an error there. Using Catalyst 14.3 Beta on that OS).

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: Hi, I had one Beta

Quote:

Hi,

I had one Beta workunit so far (currently computing for SETI, but I wanted to help the beta tests) with the app version 1.07. It did not end successfully.
http://einsteinathome.org/task/430286129
Using an AMD HD 7950 with Catalyst 14.4 Beta and Windows 7.
Will try again, if I have time, but currently I'm on Linux Mint (I'll report if I get an error there. Using Catalyst 14.3 Beta on that OS).

http://einsteinathome.org/node/197531&nowrap=true#130549

Rob
Joined: 17 Apr 12
Posts: 8
Credit: 1781499
RAC: 0

Hi Alex, thanks for

Hi Alex,

thanks for posting that link. I already saw that, I just thought I might be able to help locate the problem by posting the error. :)

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2718405
RAC: 1051

Done ~70 v1.07

Done ~70 v1.07 (GWopencl-ati-Beta) tasks on my HD7770 without problems, in a little over 1200 secs each. The GPU is currently running in a slot at PCI-E 2.0 x8 speed, on APP runtime 1348.5.

Claggy

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4330
Credit: 251470102
RAC: 36353

RE: Since gpu wu's are 8

Quote:
Since gpu wu's are 8 times faster, we need 8 times the number of cpu-only hosts to validate the results, at least as long as every gpu task validates against a cpu task.

Nope, not 8x as many hosts, just 8x as many cores.

The core problem here, however, is the "locality scheduling" criterion used when assigning tasks, which is meant to limit the download volume. In principle we should have enough CPU cores, but not necessarily in each frequency band where the clients with Beta GPU Apps are working.
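To make the "cores, not hosts" point concrete, here is a small back-of-the-envelope sketch. It assumes the roughly 8x GPU speed-up quoted in this thread and one CPU task per core; the function names and the cores-per-host figures are only examples, not project numbers.

```python
import math


def cpu_cores_to_match(gpu_tasks_in_parallel: int, gpu_speedup: float) -> int:
    """CPU cores needed to complete tasks at the same rate as the GPU side."""
    return math.ceil(gpu_tasks_in_parallel * gpu_speedup)


def hosts_needed(cores_required: int, cores_per_host: int) -> int:
    """CPU hosts needed to supply that many cores."""
    return math.ceil(cores_required / cores_per_host)


if __name__ == "__main__":
    cores = cpu_cores_to_match(gpu_tasks_in_parallel=1, gpu_speedup=8.0)
    # Example host sizes (assumed): a quad-core and an eight-core machine.
    for cores_per_host in (4, 8):
        print(cores, "cores ->", hosts_needed(cores, cores_per_host), "host(s)")
```

This only covers raw capacity; it says nothing about whether those cores are working in the same frequency band, which is the locality scheduling issue described above.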

BM


DF1DX
Joined: 14 Aug 10
Posts: 106
Credit: 3904016045
RAC: 966985

CPU only: 16 Tasks CasA with

CPU only:
16 Tasks CasA with v1.06 (Beta) on OS X 10.9 (hostid 9560887), only 1 task on this host.
No errors, no hanging app.

Runtime 1.05 ~24500s, 1.06 Beta only 15750s!

Jürgen

TJ
Joined: 11 Feb 05
Posts: 178
Credit: 21041858
RAC: 0

Two of the new beta's errored

Two of the new betas errored out:
h1_0845.30_S6Directed__S6CasAf40a_846.05Hz_387_0 187970828 3184984 15 Apr 2014 23:51:45 UTC 16 Apr 2014 0:28:35 UTC Error while computing 1,006.44 434.53 3.00 --- Gravitational Wave S6 Directed Search (CasA) v1.07 (GWopencl-ati-Beta)
h1_0845.30_S6Directed__S6CasAf40a_846.05Hz_388_0 187970824 3184984 15 Apr 2014 23:51:45 UTC 16 Apr 2014 0:11:25 UTC Error while computing 1,007.52 431.84 2.99 --- Gravitational Wave S6 Directed Search (CasA) v1.07 (GWopencl-ati-Beta)

It runs fast, reaching 99% in about 16 seconds on a 5870, then stays at 99% for about 2 minutes, and then ends with "Error while computing".

Greetings from
TJ

rbpeake
Joined: 18 Jan 05
Posts: 266
Credit: 1145212797
RAC: 623283

RE: RE: Since gpu wu's

Quote:
Quote:
Since gpu wu's are 8 times faster, we need 8 times the number of cpu-only hosts to validate the results, at least as long as every gpu task validates against a cpu task.

Nope, not 8x as many hosts, just 8x as many cores.

The core problem here, however, is the "locality scheduling" criterion used when assigning tasks, which is meant to limit the download volume. In principle we should have enough CPU cores, but not necessarily in each frequency band where the clients with Beta GPU Apps are working.

BM


I guess this explains why I am getting GW CPU tasks even when my preferences say to only get GPU tasks? I started to abort the CPU tasks but they just kept coming, so I guess that is the "price to pay" for running the GPU tasks? If that is the project's policy for GW Search, so be it, I just was not clear about it.

Thanks!

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: CPU only: 16 Tasks CasA

Quote:

CPU only:
16 Tasks CasA with v1.06 (Beta) on OS X 10.9 (hostid 9560887), only 1 task on this host.
No errors, no hanging app.

Runtime 1.05 ~24500s, 1.06 Beta only 15750s!

Jürgen

These are CPU only, and the difference in run time is quite remarkable. Once we figure out the problem with the OpenCL app, we'll resume the CPU beta test; this is encouraging.

Cheers
HB

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: Once we figure out the

Quote:

Once we figure out the problem with the OpenCL app, we'll resume the CPU beta test; this is encouraging.

Cheers
HB

I reconfigured my A10 system yesterday and added a GT 430.
This required installing the NVIDIA drivers (latest non-beta);
in addition, I upgraded the AMD drivers to the latest beta.
The result: all GW beta tasks now fail, the ATI as well as the NVIDIA ones, all after crunching to 99% and staying there for a while.

So the hw is the same, the wu's are the same, just the driver (mix) is different.

Maybe this helps a little bit to find the problem.
http://einsteinathome.org/host/10283382/tasks

Alexander

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: RE: Once we figure

Quote:
Quote:

Once we figure out the problem with the OpenCL app, we'll resume the CPU beta test; this is encouraging.

Cheers
HB

I reconfigured my A10 system yesterday and added a GT 430.
This required installing the NVIDIA drivers (latest non-beta);
in addition, I upgraded the AMD drivers to the latest beta.
The result: all GW beta tasks now fail, the ATI as well as the NVIDIA ones, all after crunching to 99% and staying there for a while.

So the hw is the same, the wu's are the same, just the driver (mix) is different.

Maybe this helps a little bit to find the problem.
http://einsteinathome.org/host/10283382/tasks

Alexander

Oh oh...that pretty much seems to have messed up your host:

Also failing now on that host are CUDA BRP tasks, we consider those stable, and they should work just fine.

Also when you look inside the log of OpenCL beta tasks that are supposed to run on the GT 430, you can see that actually BOINC picks the AMD card to run them.

Some results are weird, e.g. http://einsteinathome.org/task/431318039 shows the app started up, found a result file (without doing any work to compute it...huh?), stopped, and failed with an upload error because the result files were missing .....


Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers? A project reset is probably in order after fixing the drivers.

HB

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1606340474
RAC: 665337

RE: Maybe there is a

Quote:

Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers?


Over at SETI it has been said to install the AMD drivers first, then the NVIDIA ones.

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2718405
RAC: 1051

RE: RE: Maybe there is a

Quote:
Quote:

Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers?

Over at SETI it has been said to install the AMD drivers first, then the NVIDIA ones.


Strange, that; I always said the other way around, and the SETI observation probably came from me in the first place.
(At one point the NVIDIA and AMD drivers would both install OpenCL.dll, overwriting each other. Raistmer's AMD OpenCL MB app would produce weakly similar/inconclusive results when NVIDIA's OpenCL DLL was installed last,
so for his AMD OpenCL MB app to produce good results, the AMD drivers had to be installed last.)

Claggy

Overtonesinger
Joined: 23 Jan 06
Posts: 21
Credit: 94234203
RAC: 41691

Oh! Computational Error.My

Oh! Computational Error.
My (mobile) card apparently does NOT support Double Precision (or FP64) in hardware. That's a pity. :(

BRISINGR-II: PRIME X370-PRO,AMD'Zen 1800X 3.7/4.1, 2x8 DDR4 G.Sk.3602@2400,Asus STRIX GTX1070 DirectCUIII 8GB,*2017-04-08

BRISINGR: nb ASUS G73-JH,i7 1.73,4x2 DDR3-1333 CL7,ATi5870M 1GB,*2011-02-24

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Last night my system I setup

Last night the system I set up to run the GW Search GPU Beta full time ran into a 320-task quota limit. Is there any possibility of increasing the limit? Based on the current runtime per task, I estimate it could produce 545 tasks per day. I understand if this type of change should wait until the application goes from beta to production status. Thanks.

Regarding stability, I have so far not seen any errors or invalid tasks with the version 1.07 application.
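The quota estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the per-task runtime and the number of tasks running side by side are assumptions (the post gives only the 545 tasks/day figure, which would correspond to roughly 158.5 s per task if tasks run one at a time), and the function names are invented for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tasks_per_day(runtime_s: float, concurrent: int = 1) -> int:
    """Whole tasks a host can finish per day at a steady runtime per task."""
    return int(SECONDS_PER_DAY * concurrent / runtime_s)


def hours_until_quota(quota: int, per_day: int) -> float:
    """Hours of crunching before a daily quota is used up."""
    return 24.0 * quota / per_day


if __name__ == "__main__":
    per_day = tasks_per_day(158.5)  # assumed runtime, see lead-in
    print(per_day, "tasks/day")
    print(round(hours_until_quota(320, per_day), 1), "hours until the 320-task quota")
```

With these assumptions, the host would run dry well before the day is over, which matches the observation in the post.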

rbpeake
Joined: 18 Jan 05
Posts: 266
Credit: 1145212797
RAC: 623283

RE: Last night my system I

Quote:

Last night the system I set up to run the GW Search GPU Beta full time ran into a 320-task quota limit. Is there any possibility of increasing the limit? Based on the current runtime per task, I estimate it could produce 545 tasks per day. I understand if this type of change should wait until the application goes from beta to production status. Thanks.

Regarding stability, I have so far not seen any errors or invalid tasks with the version 1.07 application.


I have the same problem, I just saw this message in the event log:

4/16/2014 9:49:04 PM | Einstein@Home | Requesting new tasks for NVIDIA GPU
4/16/2014 9:49:09 PM | Einstein@Home | Scheduler request completed: got 0 new tasks
4/16/2014 9:49:09 PM | Einstein@Home | No work sent
4/16/2014 9:49:09 PM | Einstein@Home | No work is available for Gravitational Wave S6 Directed Search (CasA)
4/16/2014 9:49:09 PM | Einstein@Home | (reached daily quota of 128 tasks)
4/16/2014 9:49:09 PM | Einstein@Home | Project has no jobs available
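For anyone who wants to spot this condition automatically, a minimal sketch follows. It assumes the event log has been saved as plain text and that the quota message keeps the wording shown in this thread; the function name is made up for the example.

```python
import re
from typing import Optional

# Matches the scheduler message as it appears in the event log excerpts here.
QUOTA_RE = re.compile(r"\(reached daily quota of (\d+) tasks\)")


def daily_quota(event_log: str) -> Optional[int]:
    """Return the quota named in the first 'reached daily quota' message, if any."""
    match = QUOTA_RE.search(event_log)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = (
        "4/16/2014 9:49:09 PM | Einstein@Home | No work sent\n"
        "4/16/2014 9:49:09 PM | Einstein@Home | (reached daily quota of 128 tasks)\n"
    )
    print(daily_quota(sample))
```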
Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: Oh oh...that pretty

Quote:

Oh oh...that pretty much seems to have messed up your host:

Also failing now on that host are CUDA BRP tasks, we consider those stable, and they should work just fine.

Also when you look inside the log of OpenCL beta tasks that are supposed to run on the GT 430, you can see that actually BOINC picks the AMD card to run them.

Some results are weird, e.g. http://einsteinathome.org/task/431318039 shows the app started up, found a result file (without doing any work to compute it...huh?), stopped, and failed with an upload error because the result files were missing .....


Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers? A project reset is probably in order after fixing the drivers.

HB

Thanks for the diagnosis.

After a BIOS upgrade (FM2+ is quite new; I found a new BIOS which is labeled as 'improved stability') and driver reinstallation, the host is now online again.
Let's see how it works.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1606340474
RAC: 665337

RE: RE: RE: Maybe

Quote:
Quote:
Quote:

Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers?

Over at SETI it has been said to install the AMD drivers first, then the NVIDIA ones.

Strange, that; I always said the other way around, and the SETI observation probably came from me in the first place.
(At one point the NVIDIA and AMD drivers would both install OpenCL.dll, overwriting each other. Raistmer's AMD OpenCL MB app would produce weakly similar/inconclusive results when NVIDIA's OpenCL DLL was installed last,
so for his AMD OpenCL MB app to produce good results, the AMD drivers had to be installed last.)

Claggy


I would defer to your knowledge.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

OK, system is up and running

OK, system is up and running again.
The Arecibo and Perseus gpu tasks seem to run fine, but ...

the CasA beta tasks still fail.
https://dl.dropboxusercontent.com/u/50246791/synchron%20crunching.PNG
https://dl.dropboxusercontent.com/u/50246791/gt430%20load.PNG
https://dl.dropboxusercontent.com/u/50246791/R7%20GPU%20load.PNG

When running one nvidia opencl beta and one ati opencl beta, they both seem to run on the ati gpu. And they seem to run in sync.

After that, when a new set of wu's was loaded, I put the ati wu on hold.
The result was that the nvidia wu was running, but only the ati gpu showed a gpu load.
https://dl.dropboxusercontent.com/u/50246791/wrong%20gpu.PNG

Next I will try one ati opencl beta alone. If this works, I will reconfigure my system to run amd cards only.

Alexander

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

RE: When running one nvidia

Quote:

When running one nvidia opencl beta and one ati opencl beta, they both seem to run on the ati gpu. And they seem to run in sync.

After that, when a new set of wu's was loaded, I put the ati wu on hold.
The result was that the nvidia wu was running, but only the ati gpu showed a gpu load.


Running two different graphics cards, especially when they are different types, requires a clean (as in really clean) installation. That means first removing all the old drivers from the Control Panel/Programs and Features (not from "Device Manager"), rebooting after each one. Then you need a driver cleaner that can handle both AMD and Nvidia. I now use Driver Fusion, or Driver Sweeper should do, and get rid of both the AMD and Nvidia remnants.

Then, you reinstall the drivers; doing AMD last seems reasonable. That has worked for me in the past when multiple cards do strange things, especially dissimilar ones.

P.S. - Remember to not let Windows automatically download and install new drivers for you if you need particular ones; you can disable that in "System/Advanced system settings/Hardware/Device Installation Settings".

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: Running two different

Quote:


Running two different graphics cards, especially when they are different types, requires a clean (as in really clean) installation. That means first removing all the old drivers from the Control Panel/Programs and Features (not from "Device Manager"), rebooting after each one. Then you need a driver cleaner that can handle both AMD and Nvidia. I now use Driver Fusion, or Driver Sweeper should do, and get rid of both the AMD and Nvidia remnants.

Then, you reinstall the drivers; doing AMD last seems reasonable. That has worked for me in the past when multiple cards do strange things, especially dissimilar ones.

Hi Jim,

thanks for the advice, it's appreciated.

But as you can see here http://einsteinathome.org/host/6633241 I have systems running with 3 different gpu's.
I did a clean uninstall after the BIOS upgrade and installed the ati drivers first and the nvidia ones afterwards. This is because the ati is on-chip and the nvidia card had been pulled from the system.
Then I ran the known-good Arecibo gpu tasks on both gpu's together; both finished fine.
It looks like the assignment of the OpenCL tasks to the correct gpu is faulty.
I remember there was a discussion some months ago that wrong gpu tasks were picked up; I did not follow this discussion because I did not have this problem.

But, as I'm writing these lines, I can remember having seen another thing:

17.04.2014 11:16:01 | | CUDA: NVIDIA GPU 0: GeForce GT 430 (driver version 335.23, CUDA version 6.0, compute capability 2.1, 1024MB, 968MB available, 269 GFLOPS peak)
17.04.2014 11:16:01 | | OpenCL: NVIDIA GPU 0: GeForce GT 430 (driver version 335.23, device version OpenCL 1.1 CUDA, 1024MB, 968MB available, 269 GFLOPS peak)
17.04.2014 11:16:01 | | OpenCL: AMD/ATI GPU 0: Spectre (driver version 1411.4 (VM), device version OpenCL 1.2 AMD-APP (1411.4), 2048MB, 2048MB available, 346 GFLOPS peak)
17.04.2014 11:16:01 | | OpenCL CPU: AMD A10-7700K Radeon R7, 10 Compute Cores 4C+6G (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1411.4 (sse2,avx,fma4), device version OpenCL 1.2 AMD-APP (1411.4))
17.04.2014 11:16:01 | | NVIDIA library reports 1 GPU
17.04.2014 11:16:01 | | calInit() returned 1
17.04.2014 11:16:01 | | Host name: Raj
17.04.2014 11:16:01 | | Processor: 4 AuthenticAMD AMD A10-7700K Radeon R7, 10 Compute Cores 4C+6G [Family 21 Model 48 Stepping 1]
17.04.2014 11:16:01 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx svm sse4a osvw ibs xop skinit wdt lwp fma4 tce tbm topx page1gb rdtscp fsgsbase bmi1
17.04.2014 11:16:01 | | OS: Microsoft Windows 7: Professional x64 Edition, Service Pack 1, (06.01.7601.00)
17.04.2014 11:16:01 | | Memory: 6.94 GB physical, 13.88 GB virtual

Both gpu's are numbered as 0 !!

Maybe that is the source of the problem? It's BOINC 7.3.11.
If I remember correctly, we had a similar problem half a year ago?

Alexander

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

RE: Both gpu's are numbered

Quote:

Both gpu's are numbered as 0 !!

Maybe that is the source of the problem? It's BOINC 7.3.11.
If I remember correctly, we had a similar problem half a year ago?

I expect that part is OK, though it has been a while since I had both AMD and Nvidia. I think it starts numbering the AMD cards at "0", and also the Nvidia cards at "0", but keeps track of them separately by their manufacturer type.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: RE: RE: RE: Mayb

Quote:
Quote:
Quote:
Quote:

Maybe there is a particular order in which one is supposed to upgrade NVIDIA and AMD drivers?

Over at SETI it has been said to install the AMD drivers first, then the NVIDIA ones.

Strange, that; I always said the other way around, and the SETI observation probably came from me in the first place.
(At one point the NVIDIA and AMD drivers would both install OpenCL.dll, overwriting each other. Raistmer's AMD OpenCL MB app would produce weakly similar/inconclusive results when NVIDIA's OpenCL DLL was installed last,
so for his AMD OpenCL MB app to produce good results, the AMD drivers had to be installed last.)

Claggy


I would defer to your knowledge.

I've posted the problem in the BOINC forum. Let's see if someone has an idea there.

But to be sure, I will try doing it this way too. I will finish the running tasks first.

Alexander

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

RE: I have the same

Quote:

I have the same problem, I just saw this message in the event log:
4/16/2014 9:49:04 PM | Einstein@Home | Requesting new tasks for NVIDIA GPU
4/16/2014 9:49:09 PM | Einstein@Home | Scheduler request completed: got 0 new tasks
4/16/2014 9:49:09 PM | Einstein@Home | No work sent
4/16/2014 9:49:09 PM | Einstein@Home | No work is available for Gravitational Wave S6 Directed Search (CasA)
4/16/2014 9:49:09 PM | Einstein@Home | (reached daily quota of 128 tasks)
4/16/2014 9:49:09 PM | Einstein@Home | Project has no jobs available

There appears to be a combined quota between GW and FGRP Searches. After my system reached the maximum earlier today for GW Search, I re-enabled FGRP Search but was not able to receive any new work.

Thu 17 Apr 2014 08:22:20 PM MST | Einstein@Home | No work is available for Gravitational Wave S6 Directed Search (CasA)
Thu 17 Apr 2014 08:22:20 PM MST | Einstein@Home | No work is available for Gamma-ray pulsar search #3
Thu 17 Apr 2014 08:22:20 PM MST | Einstein@Home | (reached daily quota of 320 tasks)
Thu 17 Apr 2014 08:22:20 PM MST | Einstein@Home | Project has no jobs available
ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 581798152
RAC: 139027

It might be good to scale the

It might be good to scale the quota with the number of GPUs or, if this is already being done, to increase the tasks per GPU. "Per GPU" is a rough estimate since they vary widely in performance, but surely better than a static limit.

MrS

Scanning for our furry friends since Jan 2002

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

We plan to update the server

We plan to update the server software so that the quota would be per app version, which would be especially important for beta test apps (we don't want to penalize beta-testers who are more likely to suffer from massive failures of tasks). I'm not sure, though, how fast we can do this update.

Cheers
HB

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 756103647
RAC: 1149581

RE: RE: Both gpu's are

Quote:
Quote:

Both gpu's are numbered as 0 !!

Maybe that is the source of the problem? It's BOINC 7.3.11.
If I remember correctly, we had a similar problem half a year ago?

I expect that part is OK, though it has been a while since I had both AMD and Nvidia. I think it starts numbering the AMD cards at "0", and also the Nvidia cards at "0", but keeps track of them separately by their manufacturer type.

Right, in OpenCL speak, your host will have two OpenCL platforms, one by NVIDIA and one by AMD, and under each there is one GPU, indexed starting at 0. BOINC is supposed to be able to deal with this.

HB
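The platform/device layout described above can be pictured with a small sketch. The inventory is modelled on the host discussed in this thread, and the lookup function is purely hypothetical (it is not BOINC's or the application's actual code); it just shows why "device 0" is only meaningful together with a platform.

```python
from typing import Dict, List, Tuple

# Hypothetical inventory: one OpenCL platform per vendor, each with its own
# device list whose indices start at 0.
PLATFORMS: Dict[str, List[str]] = {
    "NVIDIA Corporation": ["GeForce GT 430"],
    "Advanced Micro Devices, Inc.": ["Spectre"],
}


def pick_device(platforms: Dict[str, List[str]], vendor_hint: str, index: int) -> Tuple[str, str]:
    """Select a device by (platform vendor, index within that platform)."""
    for vendor, devices in platforms.items():
        if vendor_hint.lower() in vendor.lower():
            return vendor, devices[index]
    raise LookupError(f"no OpenCL platform matching {vendor_hint!r}")


if __name__ == "__main__":
    # Both lookups use index 0 but resolve to different cards.
    print(pick_device(PLATFORMS, "nvidia", 0))
    print(pick_device(PLATFORMS, "advanced micro devices", 0))
```

If the vendor part of the lookup is dropped or mismatched, index 0 can silently resolve to the other vendor's card.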

Alex Muratov
Joined: 8 Mar 11
Posts: 3
Credit: 2223097
RAC: 0

Look

Look here:
http://einsteinathome.org/node/197540

The ATI version cannot complete computation.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

RE: RE: RE: Both gpu's

Quote:
Quote:
Quote:

Both gpu's are numbered as 0 !!

Maybe that is the source of the problem? It's BOINC 7.3.11.
If I remember correctly, we had a similar problem half a year ago?

I expect that part is OK, though it has been a while since I had both AMD and Nvidia. I think it starts numbering the AMD cards at "0", and also the Nvidia cards at "0", but keeps track of them separately by their manufacturer type.

Right, in OpenCL speak, your host will have two OpenCL platforms, one by NVIDIA and one by AMD, and under each there is one GPU, indexed starting at 0. BOINC is supposed to be able to deal with this.

HB

As Jord posted in the BOINC forum, I installed the latest BM 7.3.15.
An impressive advance regarding debugging!

I ran 2 wu's, one nvidia and one ati. The ati one finished OK, the nvidia one failed.
http://einsteinathome.org/task/431993581 the failing nvidia

From the stderr output:

2014-04-18 19:34:45.1114 (4808) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S6CasA_1.08_windows_x86_64__GWopencl-nvidia-Beta.exe'.
Activated exception handling...
command line: projects/einstein.phys.uwm.edu/einstein_S6CasA_1.08_windows_x86_64__GWopencl-nvidia-Beta.exe --skyRegion=(6.1237713,1.0264572) --refTime=960541454.5 --Freq=993.4000000000001 --FreqBand=0.05 --dFreq=5.3519e-07 --f1dot=-2.71657332393e-09 --f1dotBand=7.76163806836e-11 --df1dot=8.2281e-12 --gammaRefine=90 --f2dot=9.664e-19 --f2dotBand=2.21688997516e-17 --df2dot=1.9328e-18 --gamma2Refine=60 --computeLV --LVuseAllTerms=0 --LVrho=2.7564e+17 --LVlX=0.000168379,0.000168379 --nCand1=3000 --SortToplist=3 --recalcToplistStats=1 -o ../../projects/einstein.phys.uwm.edu/h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0_0 --printCand1 --semiCohToplist --ephemE=../../projects/einstein.phys.uwm.edu/earth_09_11 --ephemS=../../projects/einstein.phys.uwm.edu/sun_09_11 --segmentList=../../projects/einstein.phys.uwm.edu/seglist-CasAf40.dat --Dterms=8 --DataFiles1=..\..\projects\einstein.phys.uwm.edu\h1_0993.20_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.20_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.25_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.25_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.30_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.30_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.35_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.35_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.40_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.40_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.45_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.45_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.50_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.50_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.55_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.55_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.60_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0993.60_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0993.65_S6Directed;..\..\project
s\einstein.phys.uwm.edu\l1_0993.65_S6Directed --device 0
2014-04-18 19:34:45.3766 (4808) [debug]: Flags: LAL_NDEBUG, OPTIMIZE, HS_OPTIMIZATION, X64, SSE, SSE2, GNUC X86 GNUX86
2014-04-18 19:34:45.3766 (4808) [debug]: Set up communication with graphics process.
Code-version: %% LAL: 6.10.0.1 (CLEAN 14312d5a9fafa5b46fc6ccc57a08bdfab14361f1)
%% LALApps: 6.12.0.1 (CLEAN 14312d5a9fafa5b46fc6ccc57a08bdfab14361f1)

2014-04-18 19:34:45.3922 (4808) [normal]: Using OpenCL platform provided by: Advanced Micro Devices, Inc.
2014-04-18 19:34:45.3922 (4808) [normal]: Using OpenCL device "Spectre" by: Advanced Micro Devices, Inc.

2014-04-18 19:34:49.5886 (4808) [normal]: OpenCL programs sucessfully compiled !

Now this seems really incorrect!
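The mismatch in the stderr output above can be checked mechanically. The sketch below assumes the vendor can be read from the `GWopencl-<vendor>-Beta` part of the executable name and that the platform line keeps the wording shown above; the helper names and the vendor mapping are assumptions made for the example.

```python
import re
from typing import Optional

# Assumed mapping from the app name tag to text expected in the platform vendor.
VENDOR_MARKERS = {"nvidia": "nvidia", "ati": "advanced micro devices"}


def expected_vendor(app_name: str) -> Optional[str]:
    """Extract the vendor tag from a name like '...GWopencl-nvidia-Beta.exe'."""
    match = re.search(r"GWopencl-(\w+)-Beta", app_name)
    return match.group(1).lower() if match else None


def platform_used(stderr: str) -> Optional[str]:
    """Extract the vendor from the 'Using OpenCL platform provided by:' line."""
    match = re.search(r"Using OpenCL platform provided by: (.+)", stderr)
    return match.group(1).strip() if match else None


def platform_matches(app_name: str, stderr: str) -> bool:
    """True if the platform named in stderr fits the vendor tag of the app."""
    want, used = expected_vendor(app_name), platform_used(stderr)
    if want is None or used is None:
        return False
    return VENDOR_MARKERS.get(want, want) in used.lower()


if __name__ == "__main__":
    app = "einstein_S6CasA_1.08_windows_x86_64__GWopencl-nvidia-Beta.exe"
    line = "[normal]: Using OpenCL platform provided by: Advanced Micro Devices, Inc."
    print(platform_matches(app, line))
```

A `False` result here only flags the inconsistency; it does not say whether the client or the application chose the platform.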

The complete logfile is here:
https://dl.dropboxusercontent.com/u/50246791/boinc_log.txt

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

Jord answered in the BOINC

Jord answered in the BOINC forum.

However, please point out the following at Einstein. When you go further down in the log, the task on the Nvidia card seems to change without warning.
Going from

18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0
18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0

to

18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0
18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0

That's the only weird thing I can find in your log. It doesn't go far enough to see which task eventually uploads. Only the AMD task shows.

Now, here's the clincher: it shouldn't matter what hardware the tasks run on, they're both OpenCL and the OpenCL applications at Einstein are exactly the same ones, whether you run them on the Nvidia, AMD or Intel GPU.
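
The task change described above can be spotted mechanically. Below is a minimal Python sketch, assuming the log lines keep exactly the `[coproc] ... confirming ... instance for <task>` shape quoted in this post; the function name `task_switches` and the regular expression are illustrative and not part of BOINC.

```python
import re

# Matches only the "confirming" lines of the coproc debug output quoted above.
# The layout is inferred from those four sample lines, not from a BOINC spec.
COPROC_RE = re.compile(
    r"^(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \| (?P<project>[^|]+?) \| "
    r"\[coproc\] (?P<vendor>\w+) instance (?P<inst>\d+): "
    r"confirming (?P<usage>[\d.]+) instance for (?P<task>\S+)$"
)


def task_switches(lines):
    """Yield (timestamp, device, old_task, new_task) each time a device
    instance is confirmed for a different task than the previous one."""
    current = {}  # (vendor, instance number) -> last confirmed task name
    for line in lines:
        match = COPROC_RE.match(line.strip())
        if not match:
            continue  # "pending" lines and unrelated messages are skipped
        device = (match["vendor"], int(match["inst"]))
        task = match["task"]
        previous = current.get(device)
        if previous is not None and previous != task:
            yield match["ts"], device, previous, task
        current[device] = task


SAMPLE = [
    "18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0",
    "18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0",
    "18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0",
    "18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0",
]

switches = list(task_switches(SAMPLE))
for ts, device, old, new in switches:
    print(f"{ts}: {device[0]} instance {device[1]} switched from {old} to {new}")
```

A reported switch only means that the device was confirmed for a new task name. Whether the first task finished, was preempted, or simply was not logged has to be checked against the rest of the log.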

sorcrosc
Joined: 3 May 13
Posts: 8
Credit: 16046006
RAC: 2

Seems the v1.07 (GWopencl-ati-Beta) I processed will all be marked as invalid
http://einsteinathome.org/host/7803636/tasks&offset=0&show_names=1&state=0&appid=24

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

Quote:

Jord answered in the BOINC forum.

However, please point out the following at Einstein. When you go further down in the log, the task on the Nvidia card seems to change without warning.
Going from

18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0
18.04.2014 21:22:12 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.20_S6Directed__S6CasAf40a_993.4Hz_1319_0

to

18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0; 1.000000 pending for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0
18.04.2014 21:25:25 | Einstein@Home | [coproc] NVIDIA instance 0: confirming 1.000000 instance for h1_0993.25_S6Directed__S6CasAf40a_993.45Hz_1329_0

That's the only weird thing I can find in your log. It doesn't go far enough to see which task eventually uploads. Only the AMD task shows.

Now, here's the clincher: it shouldn't matter what hardware the tasks run on, they're both OpenCL and the OpenCL applications at Einstein are exactly the same ones, whether you run them on the Nvidia, AMD or Intel GPU.

OK, this seems to be simply a logging error. The other WU was the next one to start, but there is no entry for it in the logfile.

This is not Einstein related.

Alexander

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

Following the advice posted earlier, I uninstalled both the ATI and NVIDIA drivers, cleaned the system, and reinstalled them in a different order: NVIDIA first, then the latest non-beta Catalyst driver.
According to the stderr output file, the NVIDIA app was allocated to the ATI GPU, but both WUs finished and are waiting for validation.

But I still assume that both WUs are running on the same GPU. Both take about the same time, and that time is ~50% longer than running one ATI WU alone.

Alexander

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508890244
RAC: 76823

A second set finished, both waiting for validation.
That means that it matters which driver is installed first!

archae86
Joined: 6 Dec 05
Posts: 3160
Credit: 7261211904
RAC: 1544035

I enabled beta and GW for one GTX660-supplied host. The one thing I ran into that I've not seen mentioned here is that initially no work was supplied because I had (just slightly) too old a version of BOINC. The old pros mostly posting here probably are too up-to-date for this to be a trouble, and would know to look at the host_sched_logs entries to diagnose failure to obtain work, and would understand that the reference to "client" version means what most users may just think of as BOINC.

This post is not for them, but for anyone who might be helped by knowing that if your BOINC version is too old, you won't be given this type of work. I believe I was running 7.0.25, and that the message specified 7.0.27 as the boundary line.

You can get a fresher copy of BOINC here

Probably you want 7.2.42, but choose the 64-bit or 32-bit variant based on your OS.
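
For anyone who wants to check this on their own machine, here is a minimal Python sketch of the comparison. The 7.0.27 boundary is only what I recall from the scheduler message, so treat it as an assumption; `parse_version` and `client_new_enough` are made-up helper names, not part of BOINC.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.0.25' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumed minimum client version, taken from memory of the scheduler message.
MIN_CLIENT = parse_version("7.0.27")


def client_new_enough(installed, minimum=MIN_CLIENT):
    """Return True if the installed client is at or above the minimum."""
    return parse_version(installed) >= minimum


for version in ("7.0.25", "7.0.27", "7.2.42"):
    verdict = "ok" if client_new_enough(version) else "too old"
    print(version, verdict)
```

Comparing tuples of integers rather than raw strings avoids the trap where "7.0.100" would sort before "7.0.27" as text.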

Filipe
Joined: 10 Mar 05
Posts: 186
Credit: 411023058
RAC: 188707

I can't get any of this work on my 550 Ti. For some reason BOINC only sees it with 1023MB. Would a driver update resolve that?

It is supposed to have 1024MB of GDDR5.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 581798152
RAC: 139027

The amount of memory displayed won't matter (1023 is enough), but your driver 266.71 is almost ancient, considering the pace of change in the GPU world. Upgrade to the current WHQL driver and it should work.

MrS

Scanning for our furry friends since Jan 2002

exo
Joined: 11 Feb 06
Posts: 11
Credit: 133077998
RAC: 0

Hi,

it seems that results from the ATI app don't validate against results calculated on the CPU:

http://einsteinathome.org/host/5731078/tasks&offset=0&show_names=1&state=0&appid=24

A different host with an NVIDIA card doesn't have this problem:

http://einsteinathome.org/host/2207991/tasks&offset=0&show_names=1&state=0&appid=24

Both hosts are Linux, btw.

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

I also noticed three tasks on my system where the GPU task processed under Linux did not validate against the results of the task processed on CPUs. These tasks were processed with version 1.07 on my GPU.

Task 1
Task 2
Task 3

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4444
Credit: 3253872948
RAC: 1789769

What is the short name of this application? I would like to update my app_config.xml to reserve a full CPU core. The default seems to be set to 0.5 CPU + 1 NV.
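
For reference, this is roughly the file I have in mind, written as a small Python sketch that generates it. The element names follow the BOINC client's app_config.xml layout; `APP_SHORT_NAME_HERE` is a placeholder, since the real short name is exactly what I am asking for, and `build_app_config` is just an illustrative helper.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, cpu_usage=1.0, gpu_usage=1.0):
    """Return app_config.xml text reserving cpu_usage cores and gpu_usage GPUs
    per task for the application with the given short name."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# Placeholder name: replace with the application's real short name.
xml_text = build_app_config("APP_SHORT_NAME_HERE", cpu_usage=1.0, gpu_usage=1.0)
print(xml_text)
```

The output would go into app_config.xml in the project's directory, and the client has to re-read its config files (or be restarted) before the new values apply.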