ABP2 CUDA applications

TamCaP
Joined: 22 Jan 05
Posts: 4
Credit: 4,281,922
RAC: 0

To paraphrase, maybe it would

To paraphrase, maybe it would be possible to run 2-3 parallel ABP tasks using 2-3 CPUs and 1 GPU (at a higher GPU utilization ratio). However, I guess the programming effort required might not bring enough ROI to justify it.
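As a rough sketch of what that would look like on the client side (the app name, version number and file name below are hypothetical placeholders, not the project's actual values), the standard BOINC anonymous-platform mechanism is a fractional `<count>` in app_info.xml — 0.5 tells the client two tasks may share one GPU:

```xml
<app_info>
  <app>
    <name>einsteinbinary_ABP2</name>  <!-- hypothetical app name -->
  </app>
  <file_info>
    <name>einsteinbinary_ABP2_cuda.exe</name>  <!-- hypothetical file name -->
    <executable/>
  </file_info>
  <app_version>
    <app_name>einsteinbinary_ABP2</app_name>
    <version_num>301</version_num>       <!-- placeholder version -->
    <avg_ncpus>1.0</avg_ncpus>           <!-- each task still wants a full CPU -->
    <coproc>
      <type>CUDA</type>
      <count>0.5</count>                 <!-- 0.5 GPU per task = 2 tasks per GPU -->
    </coproc>
    <file_ref>
      <file_name>einsteinbinary_ABP2_cuda.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
```

Whether the app would actually run faster that way is exactly the open question — the client will happily schedule two tasks per GPU, but nothing guarantees the kernels interleave usefully.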

I believe the guys at E@H are right now directing their efforts to increasing the overall CPU/GPU utilization (per arbitrary science unit, however defined), so let's just wait and see. My GTX260 is happily crunching anything I throw at it from E@H anyway.

Oh, and to all the naysayers saying that 100% GPU utilization = more science done... It simply means more calculations done... I could start my own BOINC project doing 2+2 on CUDA over and over again, exploiting 100% of all the GPUs you could throw at me - but what's the point? That's why I think E@H is worth the time - the science actually brings some results (papers, pulsar reconfirmations, etc.).

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,579
Credit: 306,922,947
RAC: 160,559

RE: That's why I think E@H

Message 96375 in response to message 96374

Quote:
That's why I think E@H is worth the time - the science actually brings some results (papers, pulsar reconfirmations, etc.).


Yeah! It'd be nice if you could 'feel' the gravity wave or pulsar data go through the machine. Alas, it's not the Enterprise, where a horn goes off, or a trumpet toots, or some such, when a signal is detected. :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 115,874,177,999
RAC: 35,436,006

RE: ... I asked for the old

Message 96376 in response to message 96372

Quote:
... I asked for the old page where the app_info for CUDA was.
Bottom line, I asked for one simple redirect link, and got two not helpful solutions.
Thanks a lot.


Bottom line is ... you are asking for something that no longer exists and probably wouldn't be of much use even if it did.

In this thread back in November last year, Bernd announced the automatically deployed ABP1 and its requirements. By implication, all previous beta tests were dead and people were advised to revert to official distribution channels. There has since been a complete change of app to ABP2 (and more recently the 'quad-task' version) and I'm sure you would have noticed that there are no beta test packages of any description on the beta test page. This is the place where you would normally be able to get app_info.xml files if they were needed or appropriate. What other link do you imagine that people could give you? The latest apps are more efficient but still require a full CPU - according to Bernd. I don't have any CUDA capable GPUs so I have no experience.

Over the period since last November, various people have pointed out the inefficiency of tying up a full CPU with a lightly loaded GPU. I haven't seen any indication that you can do much about this other than to stop E@H using your GPU and to assign it to other projects that can use it more efficiently, whilst awaiting further improvements in the CUDA app here (which may take some time). I don't imagine you can rectify things in any significant way by using AP 'trickery' :-). Of course, you are welcome to try but I would guess that if you do 'free up' some of that 1 full CPU that goes with your GPU, you may well further sabotage the already inefficient use of your GPU without any appreciable gain.

Please let us all know if you come up with some 'magic combination' that gives a significant overall improvement. If you post the full app_info.xml that's not working, maybe some knowledgeable person (Richard Haselgrove springs immediately to mind) may have some suggestions about how to get it working.

Cheers,
Gary.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,579
Credit: 306,922,947
RAC: 160,559

RE: Bottom line, I asked

Message 96377 in response to message 96372

Quote:
Bottom line, I asked for one simple redirect link, and got two not helpful solutions. Thanks a lot.


Err .... poor solutions with respect to the now stated, but previously unstated, assumptions? That could well be thankless! It's a bit tricky accounting for unrevealed "didn't wants" :-) :-)

Cheers, Mike.


Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,143
Credit: 2,924,283,493
RAC: 899,780

RE: Please let us all know

Message 96378 in response to message 96376

Quote:
Please let us all know if you come up with some 'magic combination' that gives a significant overall improvement. If you post the full app_info.xml that's not working, maybe some knowledgeable person (Richard Haselgrove springs immediately to mind) may have some suggestions about how to get it working.


I'm sure it can be 'got working' ;-)

However, before we go down that route, can we please make sure that we're not re-visiting old ground that was thoroughly thrashed over during ABP1 testing?

Please read, in particular, these two posts by Oliver Bock, the programmer responsible for the decisions you're re-visiting:

Message 100733
Message 100738
(those both come from the 'ABP1 CUDA applications' thread, once sticky. BOINC websites have a habit of losing useful information like that, causing much re-inventing of wheels).

If you're willing to accept the risk that your computer may become unresponsive for other non-BOINC tasks, and still want to go ahead with your experiments, post a suitable disclaimer here and I'll see what I can knock up.

Edit - it might help you understand what's been done already if you review the very start of the testing process, from message 98474.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,143
Credit: 2,924,283,493
RAC: 899,780

In case cristipurdel ever

In case cristipurdel ever visits us again, I have followed my own advice - allowed an ABP2 CUDA task to download to a test machine, and copied the relevant file names and other details from client_state.xml into the carcass of an app_info.xml left over from last August.

I won't post the result here, because that would only encourage people to mess around while the project is trying to concentrate on production work: but if anybody posts a reasonable request, demonstrating that they understand what they're letting themselves in for and are prepared to accept the consequences, then I can send them a PM.

Elphidieus
Joined: 20 Feb 05
Posts: 245
Credit: 20,603,702
RAC: 0

RE: RE: RE: I've got

Message 96380 in response to message 96365

Quote:
Quote:
Quote:

I've got three of these long CUDA workunits that are having error messages written all over them:

161525558
161525522
161525518

Any ideas...?

At least they validated OK.

Michael

I'm not sure if these error messages also affect the performance of my 8800GT. Mine averages around 11,000 sec per CUDA task, while I've seen a Linux-based system with an 8800GT do around 7,500 sec. Unless the Linux and OS X CUDA clients have a huge performance gap...? My GT120 does 13,500 sec, which shows my 8800GT is not much of an improvement over the GT120 side by side.

Anybody here with an 8800GT would want to clarify...?

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22,451,438
RAC: 0

RE: RE: RE: RE: I've

Message 96381 in response to message 96380

Quote:
Quote:
Quote:
Quote:

I've got three of these long CUDA workunits that are having error messages written all over them:

161525558
161525522
161525518

Any ideas...?

At least they validated OK.

Michael

I'm not sure if these error messages also affect the performance of my 8800GT. Mine averages around 11,000 sec per CUDA task, while I've seen a Linux-based system with an 8800GT do around 7,500 sec. Unless the Linux and OS X CUDA clients have a huge performance gap...? My GT120 does 13,500 sec, which shows my 8800GT is not much of an improvement over the GT120 side by side.

Anybody here with an 8800GT would want to clarify...?

Hi, I would if I could . . .
Here is the last task, on a 9800GTX+ & QX9650, both stock.
MoBo: ASUS P5E; OS: Win XP64 Pro; BOINC 6.10.18 (x64).
And here, the result.

Elphidieus
Joined: 20 Feb 05
Posts: 245
Credit: 20,603,702
RAC: 0

RE: Hi, I would if I

Message 96382 in response to message 96381

Quote:

Hi, I would if I could . . .
Here is the last task, on a 9800GTX+ & QX9650, both stock.
MoBo: ASUS P5E; OS: Win XP64 Pro; BOINC 6.10.18 (x64).
And here, the result.

Thank you for sharing the info.

I've reduced CPU utilisation by about 75%, and the GPU tasks sped up. With only 2 of my 8 physical CPU cores utilised, my 8800GT crunches through a CUDA task in an estimated sub-8,000 sec, down from the usual 11,000 sec of GPU time. Pretty consistent for Apple's underclocked nVidia 8800GT firmware. So what happened here...? A CPU contention problem...?
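For anyone wanting to reproduce the 2-of-8-cores setup without touching app_info.xml at all, the standard BOINC mechanism is a global_prefs_override.xml file in the BOINC data directory (a minimal sketch; the percentage shown is just this example's 2-of-8 ratio):

```xml
<!-- global_prefs_override.xml, placed in the BOINC data directory -->
<global_preferences>
  <!-- use at most 25% of processors, i.e. 2 of 8 cores -->
  <max_ncpus_pct>25.0</max_ncpus_pct>
</global_preferences>
```

The client picks it up on restart, or via "Read local prefs file" in the BOINC Manager's Advanced menu.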

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22,451,438
RAC: 0

RE: RE: Hi, I would if

Message 96383 in response to message 96382

Quote:
Quote:

Hi, I would if I could . . .
Here is the last Task , on a 9800GTX+ & QX9650, both stock.
MoBo ASUS P5E; O.S. WIN XP64 Pro; BOINC 6.10.18 (x64)
And here
the result.

Thank you for sharing the info.

I've reduced CPU utilisation by about 75%, and the GPU tasks sped up. With only 2 of my 8 physical CPU cores utilised, my 8800GT crunches through a CUDA task in an estimated sub-8,000 sec, down from the usual 11,000 sec of GPU time. Pretty consistent for Apple's underclocked nVidia 8800GT firmware. So what happened here...? A CPU contention problem...?

Hi, it's probably not so easy to tell without some (serious) debugging.
I don't even use an app_info.xml file to put some debug flags in.
I can try this too, with even less CPU use, say 50%, and see how the GPU behaves.
I'll have to do some more serious reading on the Einstein applications!
The way CUDA is 'used' in Einstein is hardly, if at all, comparable to SETI, and totally different from Collatz (3x+1).
And since the GPU is used for computing 'results' handed to it by the CPU, it has to 'wait' for the CPU. GPUs are excellent for many short 'parallel' tasks and graphics . . . :)
