ABP2 CUDA applications

DanNeely
Joined: 4 Sep 05
Posts: 1,358
Credit: 2,341,195,729
RAC: 3,306,178

RE: Elapsed time in BOINC

Message 96424 in response to message 96423

Quote:

Elapsed time in BOINC Manager is the Run time you see on your tasks list. It is counted from task start to task finish.

CPU time is purely the time that the CPU worked on the data in the task.

In the case of GPUs, the CPU is only used to translate and transport the data to the GPU, which does all the calculations. So counting CPU time here is bad, as it isn't the CPU doing any of the real calculations.

That is true of most GPU apps, but the current ABP2 CUDA app does only a small part of the calculation on the GPU; most of it is still done on the CPU.
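The difference between elapsed (run) time and CPU time can be seen in a small Python sketch. This is only an illustration, not how BOINC measures tasks: `busy_work` and `measure` are made-up helper names, and `time.sleep` stands in for a process that waits (for example on a GPU) without using the CPU.

```python
import time

def busy_work(n):
    """Burn CPU cycles with a pure-Python loop (sum of squares)."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def measure(fn, *args):
    """Return (elapsed_seconds, cpu_seconds) for a single call of fn."""
    wall_start = time.perf_counter()   # wall clock: start to finish
    cpu_start = time.process_time()    # CPU time used by this process
    fn(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu

# CPU-bound work: elapsed time and CPU time advance together.
elapsed_busy, cpu_busy = measure(busy_work, 2_000_000)

# Waiting (stand-in for "the CPU idles while another device computes"):
# elapsed time grows, CPU time barely moves.
elapsed_wait, cpu_wait = measure(time.sleep, 0.5)

print(f"busy: elapsed={elapsed_busy:.2f}s cpu={cpu_busy:.2f}s")
print(f"wait: elapsed={elapsed_wait:.2f}s cpu={cpu_wait:.2f}s")
```

For an app like the current ABP2 CUDA one, where most of the work is still on the CPU, the two numbers would be expected to stay close, as in the first case.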

Speedy
Joined: 11 Aug 05
Posts: 22
Credit: 7,886,132
RAC: 598


Thank you. I now have a better understanding of how ABP2 uses our system.

Saenger
Joined: 15 Feb 05
Posts: 403
Credit: 29,839,642
RAC: 17,895

RE: RE: @Saenger: The

Message 96426 in response to message 96417

Quote:
Quote:
@Saenger: The explanation given by Ver Greeneyes is correct. There is a certain part of the computation (here: the Fast Fourier Transform, FFT) that is executed exclusively on either the GPU (CUDA version) or the CPU (conventional app). No matter how you arrange the other work, the number of FFTs per second that your GPU can do will be the bottleneck if the GPU is sufficiently slow. Usually it's impractical to have the GPU and CPU collaborate closely on the same algorithm (e.g. FFT) at the same time, because between the CPU and the GPU there is a bottleneck called the PCIe bus. You want to push some data onto the card, then have the GPU crunch on it (using its ultra-fast on-board RAM but not the PCIe bus), and only at the end transfer the data (results) back from the board over PCIe to main RAM.

OK, my set-up is thus useless for running GPU tasks. I've disabled my GPU for Einstein for now.

I'm no programmer, so I don't know a) whether detecting this situation is possible at all, server side or in BOINC, and b) how hard it would be to implement.

From a non-programmer POV it should work like this:
Compare a list of GPUs with the benchmark for the CPU, and if the comparison says the CPU is too fast for that GPU, don't send any GPU work.
An Athlon XP 1500 would probably be accelerated even by my GPU, while my CPU would probably not get any use even out of a 9600.
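The comparison described above could look roughly like the following Python sketch. Everything in it is hypothetical: the `GPU_SCORES` table, the GPU names, the score values, the `min_ratio` threshold and the function `should_send_gpu_work` are invented for illustration, and nothing here is taken from the BOINC scheduler.

```python
# Hypothetical relative throughput scores, in arbitrary units.
# The names and values are placeholders, not measurements of real cards.
GPU_SCORES = {"gpu_slow": 1.0, "gpu_mid": 2.5, "gpu_fast": 10.0}

def should_send_gpu_work(gpu_name, cpu_score, min_ratio=1.5):
    """Return True if the GPU looks fast enough relative to the CPU.

    cpu_score is a CPU benchmark result in the same arbitrary units
    as GPU_SCORES. Unknown GPUs get no GPU work (conservative choice).
    """
    if cpu_score <= 0:
        raise ValueError("cpu_score must be positive")
    gpu_score = GPU_SCORES.get(gpu_name)
    if gpu_score is None:
        return False
    return gpu_score / cpu_score >= min_ratio

# Same GPU, two different hosts:
print(should_send_gpu_work("gpu_mid", 1.0))  # slow CPU: ratio 2.5
print(should_send_gpu_work("gpu_mid", 4.0))  # fast CPU: ratio 0.625
```

With these made-up numbers the slow-CPU host would get GPU work and the fast-CPU host would not, which is the behaviour asked for above.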


I will not get any real use even out of a GT240; I just tried it. It increases the speed by about 3%, and blocks the whole GPU for that.
So: + GT240 -> no good idea here ;)
I would probably need a high-end new one to get any real increase.

Greetings from Sänger

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 470,930,950
RAC: 54,803

RE: It increases the speed

Message 96427 in response to message 96426

Quote:
It increases the speed by about 3%, and blocks the whole GPU for that.
So: + GT240 -> no good idea here ;)
I would probably need a high-end new one to get any real increase.

It depends on which of the CPU tasks you use for reference (some take up to 4800 sec on your host), so the increase might actually be closer to 10%, but that doesn't change the big picture.

On a really fast host like yours, even a high-end GPU will not cause an impressive speed-up because only part of the work is done on the GPU in this generation of the ABP2 app. The picture is entirely different on slower CPUs and especially AMD CPUs where the CPU version doesn't perform that well. On those hosts, even a GT 240 should yield a noticeable speedup. The next generation of ABP apps will put more load on the GPU and should no longer require a full CPU core. This will make the CUDA app much more useful for the power users.

CU
HB
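The point that a fast GPU cannot help much while only part of the work runs on it is an instance of Amdahl's law. Below is a minimal Python sketch of that formula. The fractions and GPU speedups used in the example calls are assumptions chosen for illustration; the thread does not state what share of ABP2's run time is actually spent in the GPU part.

```python
def overall_speedup(gpu_fraction, gpu_speedup):
    """Amdahl's law.

    gpu_fraction: share of the original run time that is moved to the GPU (0..1).
    gpu_speedup:  how much faster that share runs on the GPU than on the CPU.
    """
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    if gpu_speedup <= 0:
        raise ValueError("gpu_speedup must be positive")
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_speedup)

def speedup_limit(gpu_fraction):
    """Upper bound on the overall speedup, even with an infinitely fast GPU."""
    if not 0.0 <= gpu_fraction < 1.0:
        raise ValueError("gpu_fraction must be in [0, 1)")
    return 1.0 / (1.0 - gpu_fraction)

# Assumed numbers: if only 10% of the run time is offloaded,
# a 10x faster GPU part gives under 10% overall gain.
print(round(overall_speedup(0.10, 10.0), 3))
print(round(speedup_limit(0.10), 3))
# If half of the run time is offloaded, the same GPU helps far more.
print(round(overall_speedup(0.50, 10.0), 3))
```

On a slower CPU the FFT part takes a larger share of the total run time, so `gpu_fraction` is effectively larger and the same card yields a bigger overall gain.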

Saenger
Joined: 15 Feb 05
Posts: 403
Credit: 29,839,642
RAC: 17,895

RE: RE: It increases the

Message 96428 in response to message 96427

Quote:
Quote:
It increases the speed by about 3%, and blocks the whole GPU for that.
So: + GT240 -> no good idea here ;)
I would probably need a high-end new one to get any real increase.

It depends on which of the CPU tasks you use for reference (some take up to 4800 sec on your host), so the increase might actually be closer to 10%, but that doesn't change the big picture.

On a really fast host like yours, even a high-end GPU will not cause an impressive speed-up because only part of the work is done on the GPU in this generation of the ABP2 app. The picture is entirely different on slower CPUs and especially AMD CPUs where the CPU version doesn't perform that well. On those hosts, even a GT 240 should yield a noticeable speedup. The next generation of ABP apps will put more load on the GPU and should no longer require a full CPU core. This will make the CUDA app much more useful for the power users.


I looked at the C/h rate: for my Arecibo tasks it's ~67-70 C/h, and for the one and only CUDA task it was 72 C/h. OK, that's more than 3% ;)

Any estimates when the new CUDA app can be tested?

Greetings from Sänger

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,031
Credit: 218,159,243
RAC: 49,950

RE: Any estimates when the

Message 96429 in response to message 96428

Quote:
Any estimates when the new CUDA app can be tested?

I'd say about 1-2 weeks of work remain on the app. The problem is that it has been that way for weeks now, because a lot of other, more urgent things kept popping up and prevented us from working on it.

BM


hotze33
Joined: 10 Nov 04
Posts: 100
Credit: 367,146,771
RAC: 318


Message 96430 in response to message 96426

Hi,
I have run some WUs with the CPU app (3.08, the first since CUDA support ;)). The 160-credit units took 11000 s (Q6600 @ 3.6 GHz). With CUDA support (9800GX2) they need only 7000 s. So for me the CUDA speedup is quite remarkable.
I think you gain the most from old but still fast enough CPUs.
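To put the two run times quoted above into numbers, here is a short Python calculation. The helper names `speedup` and `time_saved_fraction` are invented for this example; only the 11000 s and 7000 s figures come from the post.

```python
def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the CUDA run is than the CPU-only run."""
    return cpu_seconds / gpu_seconds

def time_saved_fraction(cpu_seconds, gpu_seconds):
    """Share of the CPU-only run time that the CUDA run saves."""
    return 1.0 - gpu_seconds / cpu_seconds

cpu_run = 11000  # seconds, CPU app on the Q6600 @ 3.6 GHz
gpu_run = 7000   # seconds, CUDA app with the 9800GX2

print(f"speedup: {speedup(cpu_run, gpu_run):.2f}x")                  # 1.57x
print(f"time saved: {time_saved_fraction(cpu_run, gpu_run):.0%}")    # 36%
```

So each task finishes about 1.57 times as fast, or in roughly 36% less time, which is far more than the few percent reported for the faster host earlier in the thread.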

Cannibal Corpse
Joined: 21 Feb 05
Posts: 18
Credit: 1,555,535
RAC: 0


Hello, I am new to running a CUDA card; mine is a GTS 250 with 128 cores (a low number of cores). While my one CUDA app (ABP2) is running, it reads "(1 GPU, 1 CPU)". So that would be normal.

DO WHAT THO WILL SHALL BE THE WHOLE OF THE LAW.
PROUD MEMBER OF THE CARL SAGAN TEAM.

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22,451,438
RAC: 0


Message 96432 in response to message 96431

Is there support for Fermi cards (GTX460/470)? I'm now crunching with a QX9650 plus a 9800GTX+ and an 8500GT (the latter is half as fast compared to the 9800GTX+).
STFP CUDA 2.3 takes 1 hour 10 minutes, but it does take a full CPU plus the GPU,
with little load on the GPU.

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22,451,438
RAC: 0


Message 96433 in response to message 96432

The minute I connected to the Internet, with BOINC 6.10.58 (64-bit) and the latest apps installed, I received a boatload of Arecibo and 'Globals' tasks, before the previous outage. That was with a 9800GTX+ and an 8500GT.
I switched these for a GTX470. Windows would not even start with the 9800GTX+ still in place, so I removed it for the moment.

I have upgraded to driver 256.38 (not exactly sure of the version) and CUDA 3.1; not much of a speedup, though.
I am also running 2 MB at a time on a GTX470 (CPU = QX9650; 3250 MHz; 5.5.5.15; 1T 800 MHz, FSB:DRAM = 5:6).
It should be possible to run two or more WUs together. It works in all other projects, depending on the card (Fermi is a must) and, of course, on the application.
