First CUDA App for Windows available for Beta Test

RandyC
Joined: 18 Jan 05
Posts: 6007
Credit: 111139797
RAC: 0

Some VERY weird happenings on my system running E@H CUDA.

SETI and E@H are not playing very well with each other. Mostly, SETI is a bully and takes over the system as far as the GPU is concerned. The only way I've been able to get E@H CUDA to run is to suspend SETI completely.

Then last night I had the system clock reset itself backwards about 5 hrs. See the log segment below. I restarted BOINC, then reset the clock (probably not the best sequence to do it) and it looks OK for right now.
[edit]
Actually, I don't know exactly how far back the clock was reset. I did the update listed after 23:15 at about 5:00 am. There are NO log entries between 23:15 and that update. I looked at Task Manager before I restarted BOINC and everything looked idle (System Idle 99%), even though BoincView said things were running. However, I did not monitor the cpu% complete to see if it was increasing.
[/edit]

06-Aug-2009 19:46:12 [Einstein@Home] Restarting task p2030_53648_84220_0074_G62.52-00.55.N_0.dm_560_0 using einsteinbinary_ABP1 version 307
06-Aug-2009 20:45:51 [Einstein@Home] Resuming task h1_0805.90_S5R4__334_S5R5a_2 using einstein_S5R5 version 305
06-Aug-2009 22:07:46 [SETI@home] task 18fe09ac.4381.1708.7.10.34_0 suspended by user
06-Aug-2009 22:07:52 [SETI@home] resumed by user
06-Aug-2009 22:07:52 [Einstein@Home] Resuming task h1_0805.90_S5R4__315_S5R5a_2 using einstein_S5R5 version 305
06-Aug-2009 22:07:53 [SETI@home] Restarting task 18fe09aa.518.9479.6.10.218_1 using setiathome_enhanced version 608
06-Aug-2009 22:07:54 [SETI@home] Computation for task 18fe09aa.518.9479.6.10.218_1 finished
06-Aug-2009 22:07:54 [SETI@home] Output file 18fe09aa.518.9479.6.10.218_1_0 for task 18fe09aa.518.9479.6.10.218_1 absent
06-Aug-2009 22:07:54 [SETI@home] Starting 18fe09ab.21304.6616.11.10.86_0
06-Aug-2009 22:07:54 [SETI@home] Starting task 18fe09ab.21304.6616.11.10.86_0 using setiathome_enhanced version 608
06-Aug-2009 22:08:51 [SETI@home] task 18fe09ac.4381.1708.7.10.34_0 resumed by user
06-Aug-2009 22:08:51 [SETI@home] Restarting task 18fe09ac.4381.1708.7.10.34_0 using setiathome_enhanced version 608
06-Aug-2009 22:08:59 [SETI@home] update requested by user
06-Aug-2009 22:09:01 [SETI@home] Sending scheduler request: Requested by user.
06-Aug-2009 22:09:01 [SETI@home] Reporting 1 completed tasks, not requesting new tasks
06-Aug-2009 22:09:06 [SETI@home] Scheduler request completed: got 0 new tasks
06-Aug-2009 22:56:17 [SETI@home] Computation for task 18fe09ac.4381.1708.7.10.34_0 finished
06-Aug-2009 22:56:17 [SETI@home] Resuming task 18fe09ab.21304.6616.11.10.86_0 using setiathome_enhanced version 608
06-Aug-2009 22:56:19 [SETI@home] Started upload of 18fe09ac.4381.1708.7.10.34_0_0
06-Aug-2009 22:56:23 [SETI@home] Finished upload of 18fe09ac.4381.1708.7.10.34_0_0
06-Aug-2009 22:57:17 [SETI@home] Sending scheduler request: To fetch work.
06-Aug-2009 22:57:17 [SETI@home] Reporting 1 completed tasks, requesting new tasks
06-Aug-2009 22:57:22 [SETI@home] Scheduler request completed: got 2 new tasks
06-Aug-2009 22:57:22 [SETI@home] Message from server: No work can be sent for the applications you have selected
06-Aug-2009 22:57:22 [SETI@home] Message from server: No work is available for Astropulse v5
06-Aug-2009 22:57:22 [SETI@home] Message from server: You have selected to receive work from other applications if no work is available for the applications you selected
06-Aug-2009 22:57:22 [SETI@home] Message from server: Sending work from other applications
06-Aug-2009 22:57:24 [SETI@home] Started download of 18fe09ad.11473.154468.9.10.173
06-Aug-2009 22:57:24 [SETI@home] Started download of 17oc08ab.19925.8252.6.10.231
06-Aug-2009 22:57:28 [SETI@home] Finished download of 17oc08ab.19925.8252.6.10.231
06-Aug-2009 22:57:29 [SETI@home] Finished download of 18fe09ad.11473.154468.9.10.173
06-Aug-2009 23:06:56 [SETI@home] Resuming task 24fe09ab.21278.9070.15.10.221_1 using setiathome_enhanced version 603
06-Aug-2009 23:09:33 [SETI@home] Resuming task 18fe09ab.21304.7843.11.10.51_1 using setiathome_enhanced version 603
06-Aug-2009 23:14:59 [SETI@home] Computation for task 24fe09ab.21278.9070.15.10.221_1 finished
06-Aug-2009 23:14:59 [SETI@home] Starting 18fe09ae.15593.8253.5.10.75_0
06-Aug-2009 23:14:59 [SETI@home] Starting task 18fe09ae.15593.8253.5.10.75_0 using setiathome_enhanced version 603
06-Aug-2009 23:15:01 [SETI@home] Started upload of 24fe09ab.21278.9070.15.10.221_1_0
06-Aug-2009 23:15:05 [SETI@home] Finished upload of 24fe09ab.21278.9070.15.10.221_1_0
06-Aug-2009 19:50:51 [SETI@home] update requested by user
06-Aug-2009 19:51:12 [SETI@home] suspended by user
06-Aug-2009 19:51:13 [Einstein@Home] Resuming task h1_0805.90_S5R4__334_S5R5a_2 using einstein_S5R5 version 305
06-Aug-2009 19:54:38 [---] Exit requested by user
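
The backwards jump is easy to miss by eye in a long log. A minimal Python sketch for flagging it, assuming the messages have been saved as text lines in the format shown above and that month names are in English (`%b` is locale-dependent); the function name is made up for illustration:

```python
from datetime import datetime

def find_clock_regressions(lines):
    """Return (previous, current) timestamp pairs where log time goes backwards."""
    regressions = []
    previous = None
    for line in lines:
        # BOINC message lines start with e.g. "06-Aug-2009 23:15:05" (20 characters).
        try:
            current = datetime.strptime(line[:20], "%d-%b-%Y %H:%M:%S")
        except ValueError:
            continue  # blank line or a line without a leading timestamp
        if previous is not None and current < previous:
            regressions.append((previous, current))
        previous = current
    return regressions

log = [
    "06-Aug-2009 23:15:05 [SETI@home] Finished upload of 24fe09ab.21278.9070.15.10.221_1_0",
    "06-Aug-2009 19:50:51 [SETI@home] update requested by user",
    "06-Aug-2009 19:51:12 [SETI@home] suspended by user",
]

for before, after in find_clock_regressions(log):
    print(f"clock went back by {before - after}: {before} -> {after}")
```

On the three lines above this reports a step of 3:24:14 between consecutive entries. That is only the difference between the two logged timestamps; since nothing was logged between 23:15 and the update, the real size of the reset cannot be read from the log alone.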

Seti Classic Final Total: 11446 WU.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244933581
RAC: 16281

Does anybody happen to know or has the patience to try out how to write an app_info.xml that can run a CUDA version of an application on each CUDA device and a CPU version of the same application (here: einsteinbinary_ABP1) on the remaining CPU cores?

BM

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

Message 94091 in response to message 94090

Quote:

Does anybody happen to know or has the patience to try out how to write an app_info.xml that can run a CUDA version of an application on each CUDA device and a CPU version of the same application (here: einsteinbinary_ABP1) on the remaining CPU cores?

BM


Is that possible at all? I thought that's why there are different applications (6.03 / 6.08) at SETI for the same data to be processed. However, I have no practical experience with CUDA processing (and app_info.xml writing :-).

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2753028342
RAC: 1371490

Message 94092 in response to message 94090

Quote:

Does anybody happen to know or has the patience to try out how to write an app_info.xml that can run a CUDA version of an application on each CUDA device and a CPU version of the same application (here: einsteinbinary_ABP1) on the remaining CPU cores?

BM


No problem at all - it works exactly as you might expect provided the host is running BOINC v6.6.14 or later.

I think it's good practice, but not strictly necessary (except for compatibility with BOINC 6.4 - see below), to keep the version numbers distinct: and at the moment we're lucky with v3.07 for CUDA and v3.08 for CPU. Perhaps you could stick to the odd/even numbering throughout the Beta phase? I'll test one of those and see how it works - report later.

If an app_info like that is run under BOINC v6.4.5/7, only the higher version number will fetch work - so currently 308/CPU would be preferred. With the next release, 309/CUDA would get precedence, and so on.

Edit - Bernd, could you post a direct download link for the 3.08 CPU application, please? None of my hosts have updated themselves yet - they all seem to be busy with S5R5.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2753028342
RAC: 1371490

Never mind - found it myself. The following app_info has loaded without errors on my test host, and downloaded two new CUDA tasks. It won't fetch any CPU work at the moment, because I'm wrestling with a weird AQUA multi-threading bug (see boinc_alpha) that runs in EDF for no apparent reason. I should be freed from that millstone in about an hour and a half, and I'll let you know what downloads then.

I've added two new file infos (308.exe and 308_graphics), and a whole new app_version at the end: all that's missing is the api_version line, but frankly I've never noticed it making the slightest difference, and never known what it was for.

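<app_info>
    <app>
        <name>einstein_S5R5</name>
    </app>
    <file_info>
        <name>einstein_S5R5_3.05_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einstein_S5R5_3.05_windows_intelx86_0.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einstein_S5R5_3.05_windows_intelx86_1.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einstein_S5R5_3.05_windows_intelx86_2.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einstein_S5R5_3.05_graphics_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <app>
        <name>einsteinbinary_ABP1</name>
    </app>
    <file_info>
        <name>einsteinbinary_ABP1_3.07_graphics_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einsteinbinary_ABP1_3.07_windows_intelx86_cuda.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einsteinbinary_ABP1_3.08_graphics_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>einsteinbinary_ABP1_3.08_windows_intelx86.exe</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cudart.dll</name>
        <executable/>
    </file_info>
    <file_info>
        <name>cufft.dll</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>einstein_S5R5</app_name>
        <version_num>305</version_num>
        <api_version>6.3.0</api_version>
        <file_ref>
            <file_name>einstein_S5R5_3.05_windows_intelx86.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R5_3.05_windows_intelx86_0.exe</file_name>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R5_3.05_windows_intelx86_1.exe</file_name>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R5_3.05_windows_intelx86_2.exe</file_name>
        </file_ref>
        <file_ref>
            <file_name>einstein_S5R5_3.05_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>einsteinbinary_ABP1</app_name>
        <version_num>307</version_num>
        <plan_class>cuda</plan_class>
        <avg_ncpus>1.0</avg_ncpus>
        <max_ncpus>1.0</max_ncpus>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <api_version>6.7.0</api_version>
        <file_ref>
            <file_name>einsteinbinary_ABP1_3.07_windows_intelx86_cuda.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einsteinbinary_ABP1_3.07_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
        <file_ref>
            <file_name>cudart.dll</file_name>
        </file_ref>
        <file_ref>
            <file_name>cufft.dll</file_name>
        </file_ref>
    </app_version>
    <app_version>
        <app_name>einsteinbinary_ABP1</app_name>
        <version_num>308</version_num>
        <file_ref>
            <file_name>einsteinbinary_ABP1_3.08_windows_intelx86.exe</file_name>
            <main_program/>
        </file_ref>
        <file_ref>
            <file_name>einsteinbinary_ABP1_3.08_graphics_windows_intelx86.exe</file_name>
            <open_name>graphics_app</open_name>
        </file_ref>
    </app_version>
</app_info>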
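
Before restarting BOINC with a hand-edited app_info.xml, it can help to check which app versions it actually declares. A rough Python sketch, assuming the usual BOINC anonymous-platform tag names (`app_version`, `app_name`, `version_num`, `plan_class`, `coproc`); the embedded XML is a cut-down illustration, not the full file, and `summarize_app_versions` is a made-up helper name:

```python
import xml.etree.ElementTree as ET

# Cut-down illustration of the two ABP1 entries: 307 for CUDA, 308 for CPU.
APP_INFO = """
<app_info>
  <app><name>einsteinbinary_ABP1</name></app>
  <app_version>
    <app_name>einsteinbinary_ABP1</app_name>
    <version_num>307</version_num>
    <plan_class>cuda</plan_class>
    <coproc><type>CUDA</type><count>1</count></coproc>
  </app_version>
  <app_version>
    <app_name>einsteinbinary_ABP1</app_name>
    <version_num>308</version_num>
  </app_version>
</app_info>
"""

def summarize_app_versions(xml_text):
    """Return (app_name, version_num, plan_class, uses_gpu) for each app_version."""
    root = ET.fromstring(xml_text.strip())
    summary = []
    for av in root.findall("app_version"):
        name = av.findtext("app_name")
        version = int(av.findtext("version_num"))
        plan_class = av.findtext("plan_class", default="")
        uses_gpu = av.find("coproc") is not None  # a coproc block marks a GPU version
        summary.append((name, version, plan_class, uses_gpu))
    return summary

for name, version, plan_class, uses_gpu in summarize_app_versions(APP_INFO):
    kind = "GPU" if uses_gpu else "CPU"
    print(f"{name} v{version} plan_class={plan_class or '-'} ({kind})")
```

Two entries with the same app name, distinct version numbers, and only one of them carrying a plan class and coproc block is the layout described above.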

stormdog
Joined: 3 May 05
Posts: 5
Credit: 282053947
RAC: 0

Hello,

I've just downloaded and installed the CUDA version and got errors immediately. Check tasks 135844155, 135908112, 135914178. Here are the details for one of these tasks:

6.6.36

The system cannot find the path specified. (0x3) - exit code 3 (0x3)

Activated exception handling...
[17:42:27][5224][INFO ] Starting data processing...
[17:42:29][5224][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[17:42:29][5224][INFO ] Header contents:
------> Original WAPP file: p2030_53703_72629_0179_G49.79-01.73.N_5.wapp
------> Sample time in microseconds: 128
------> Observation time in seconds: 268.9792
------> Time stamp (MJD): 53703.840613425928
------> Number of samples/record: 512
------> Center freq in MHz: 1440
------> Channel band in MHz: 0.390625
------> Number of channels/record: 256
------> Nifs: 1
------> RA (J2000): 192936.834455
------> DEC (J2000): 141150.559412
------> Galactic l: 49.8928
------> Galactic b: -1.7891
------> Name: G49.79-01.73.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 432.2594
------> ZA at start: 12.5091
------> AST at start: 0
------> LST at start: 0
------> Project ID: p2030
------> Observers: VickyKaspi
------> File size (bytes): 16190702
------> Data size (bytes): 16179201
------> Number of samples: 2097152
------> Trial dispersion measure: 303.4 cm^-3 pc
------> Scale factor: 6441.3
[17:42:31][5224][INFO ] Seed for random number generator is 1015217833.
[17:42:33][5224][ERROR] Error creating CUDA FFT plan (error code: 8)
[17:42:33][5224][ERROR] Demodulation failed (error: 3)!
called boinc_finish

Platform is Windows. Boinc says

CUDA device: GeForce 9600 GT (driver version 17779, compute capability 1.1, 512 MB)

Regards,
Andrew

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2753028342
RAC: 1371490

Message 94095 in response to message 94094

Quote:

CUDA device: GeForce 9600 GT (driver version 17779, compute capability 1.1, 512 MB)

Regards,
Andrew


Read the opening post again, and update your GeForce driver to a matching version - 181.20 or higher.
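
For anyone checking their own host: BOINC prints the driver as a single integer, and the sketch below assumes that integer is simply major*100 + minor (so 17779 would mean 177.79). That encoding is an assumption, not something confirmed in this thread, and the function names are made up for illustration:

```python
import re

MINIMUM_DRIVER = (181, 20)  # "181.20 or higher", as stated above

def parse_boinc_driver_version(message):
    """Extract (major, minor) from a BOINC 'CUDA device: ...' startup message."""
    match = re.search(r"driver version (\d+)", message)
    if match is None:
        return None
    # Assumption: 17779 encodes 177.79, i.e. major * 100 + minor.
    return divmod(int(match.group(1)), 100)

def driver_is_new_enough(message, minimum=MINIMUM_DRIVER):
    version = parse_boinc_driver_version(message)
    return version is not None and version >= minimum

msg = "CUDA device: GeForce 9600 GT (driver version 17779, compute capability 1.1, 512 MB)"
print(parse_boinc_driver_version(msg), driver_is_new_enough(msg))
```

Under that assumption, the 9600 GT host quoted above is below the required version.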

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2753028342
RAC: 1371490

Message 94096 in response to message 94093

Quote:
I've added two new file infos (308.exe and 308_graphics), and a whole new app_version at the end.....


That worked OK: now that AQUA has finished, BOINC has downloaded a couple of ABP1/308/CPU to match the CUDAs it already has.

Got a CUDA finishing in a couple of minutes - see you in the 3.09 thread. These 3.07/3.08/3.09 timings are going to be interesting.

Edit - ooops, misread. 309 is not a CUDA release. I'll let those 308s run on the existing app, and only then replace 308 with 309 to get comparative timings.

nenym
Joined: 4 Aug 09
Posts: 8
Credit: 432950967
RAC: 33472

Result p2030_53617_03095_0027_G53.81-00.16.N_2.dm_615_0 finished (App 3.07). Host ID 2028119: CPU E6550 @ 2.33GHz + GF 9600GT (drivers 19038 ASUS), Win XP x86. Wall time 21,812.95s, granted credit 250. Nothing interesting for credithunters, OK for testers.

hoarfrost
Joined: 9 Feb 05
Posts: 207
Credit: 92507737
RAC: 110666

10 Tasks and 10 errors for host 1275368. From 57029869 to 57030024...

Update: BOINC 6.6.36. NVIDIA GeForce 8500 GT.
CUDA device: GeForce 8500 GT (driver version 19038, compute capability 1.1, 256MB, est. 6GFLOPS)

Quote:


6.6.36

The system cannot find the path specified. (0x3) - exit code 3 (0x3)

Activated exception handling...
[21:43:15][2672][INFO ] Starting data processing...
[21:43:15][2672][INFO ] Using CUDA device #0 "GeForce 8500 GT" (54.43 GFLOPS)
[21:43:15][2672][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[21:43:15][2672][INFO ] Header contents:
------> Original WAPP file: p2030_53835_35106_0039_G36.72-00.43.C_5.wapp
------> Sample time in microseconds: 128
------> Observation time in seconds: 268.9792
------> Time stamp (MJD): 53835.406319444446
------> Number of samples/record: 512
------> Center freq in MHz: 1440
------> Channel band in MHz: 0.390625
------> Number of channels/record: 256
------> Nifs: 1
------> RA (J2000): 190014.12007
------> DEC (J2000): 31010.070266
------> Galactic l: 36.7553
------> Galactic b: -0.5094
------> Name: G36.72-00.43.C
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 334.3573
------> ZA at start: 16.662
------> AST at start: 0
------> LST at start: 0
------> Project ID: p2030
------> Observers: JD
------> File size (bytes): 16190754
------> Data size (bytes): 16179201
------> Number of samples: 2097152
------> Trial dispersion measure: 14.4 cm^-3 pc
------> Scale factor: 7081.6
[21:43:17][2672][INFO ] Seed for random number generator is 1009009674.
[21:43:17][2672][ERROR] Error creating CUDA FFT plan (error code: 2)
[21:43:17][2672][ERROR] Demodulation failed (error: 3)!
called boinc_finish
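
The failing tasks posted in this thread report different FFT plan error codes (8 and 2). A small Python sketch for putting a name to such a code, assuming the application prints the raw cufftResult value returned by cuFFT; that is a guess, since the thread does not say what the number means. The enum values are those of cufftResult in NVIDIA's cufft.h, and the function name is made up for illustration:

```python
import re

# cufftResult values from NVIDIA's cufft.h.
CUFFT_RESULT = {
    0: "CUFFT_SUCCESS",
    1: "CUFFT_INVALID_PLAN",
    2: "CUFFT_ALLOC_FAILED",
    3: "CUFFT_INVALID_TYPE",
    4: "CUFFT_INVALID_VALUE",
    5: "CUFFT_INTERNAL_ERROR",
    6: "CUFFT_EXEC_FAILED",
    7: "CUFFT_SETUP_FAILED",
    8: "CUFFT_INVALID_SIZE",
    9: "CUFFT_UNALIGNED_DATA",
}

def explain_fft_plan_error(stderr_text):
    """Return (code, name) for the first FFT plan error in a task's stderr, if any."""
    match = re.search(r"Error creating CUDA FFT plan \(error code: (\d+)\)", stderr_text)
    if match is None:
        return None
    code = int(match.group(1))
    # Assumption: the logged number is a cufftResult; unknown values are reported as such.
    return code, CUFFT_RESULT.get(code, "unknown")

print(explain_fft_plan_error("[21:43:17][2672][ERROR] Error creating CUDA FFT plan (error code: 2)"))
print(explain_fft_plan_error("[17:42:33][5224][ERROR] Error creating CUDA FFT plan (error code: 8)"))
```

If the assumption holds, code 2 would point to a failed GPU memory allocation and code 8 to a transform size the installed cuFFT rejects, but only the application's developers can confirm how the code is produced.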
