ABP1 CUDA applications

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 942
Credit: 25166626
RAC: 0

For your information,

We proposed an additional GPU memory check that runs before CUDA tasks are launched. Background: today, among other things, the total device memory is used to decide whether we send work to a client requesting GPU work. This is, however, not enough to ensure that enough memory is actually available on the device when the task is launched. The new method (already added to BOINC, pending release 6.10.25) will check the available memory and compare it to the minimum requirements of a given task. If it's not sufficient, the task will be deferred for five minutes, and other GPU tasks might be launched in the meantime. It's up to the user to decide whether this memory limitation is temporary or permanent (e.g. caused by a multi-head setup and/or the screen resolution). If it is permanent, one should opt out of GPU work. This opt-out process should eventually be made more fine-grained so that the decision can be made at the application level rather than the project level.
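To illustrate the check-and-defer idea described above, here is a minimal sketch in Python. It is not the actual BOINC code: the names `GpuTask`, `min_gpu_bytes`, `deferred_until` and `select_launchable` are hypothetical, and the free device memory is simply passed in as a number (on a real CUDA device it could be queried with something like the runtime call `cudaMemGetInfo`, which is an assumption about the approach, not a statement about what BOINC does).

```python
from dataclasses import dataclass

DEFER_SECONDS = 5 * 60  # the five-minute deferral mentioned in the post


@dataclass
class GpuTask:
    # All fields are hypothetical and for illustration only.
    name: str
    min_gpu_bytes: int           # minimum device memory the task needs
    deferred_until: float = 0.0  # earliest time the task may be retried


def select_launchable(tasks, free_gpu_bytes, now):
    """Return the tasks that may start now; defer the ones that do not fit.

    Each task is compared independently against the currently free memory.
    A real scheduler would also account for memory claimed by the tasks
    it decides to launch.
    """
    launchable = []
    for task in tasks:
        if now < task.deferred_until:
            continue  # still waiting out an earlier deferral
        if task.min_gpu_bytes <= free_gpu_bytes:
            launchable.append(task)
        else:
            # Not enough free memory right now: retry in five minutes.
            task.deferred_until = now + DEFER_SECONDS
    return launchable


if __name__ == "__main__":
    MIB = 2**20
    tasks = [GpuTask("big", 450 * MIB), GpuTask("small", 100 * MIB)]
    ready = select_launchable(tasks, free_gpu_bytes=256 * MIB, now=0.0)
    print([t.name for t in ready])
```

With 256 MiB free, only the smaller task is returned and the larger one is pushed back by five minutes, so other GPU work can run in the meantime.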

Cheers,
Oliver

 

Einstein@Home Project

Ver Greeneyes
Joined: 26 Mar 09
Posts: 140
Credit: 9562235
RAC: 0

Message 95689 in response to message 95688

The check-in notes mentioned that both the scheduler and the client need to be updated, so this will only help Einstein@Home once they upgrade their server-side software, right?

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 137324514
RAC: 18517

Message 95690 in response to message 95666

Quote:
Quote:
I also hope by then the size of the E@H CUDA/GPU tasks will also be less than 3 hours (a limit which appears to be what the people at S@H are using) so E@H plays better with others on the GPU.

Our tests indicate that the upcoming ABP2 GPU tasks typically take ~0.6 hours per WU. This is still "just" a factor of 2-3 faster than the ABP2 CPU version, but after the ABP2 release we are going to concentrate on improving the GPU code.

Cheers,
Oliver

You mentioned elsewhere that this one uses single precision a lot more. Would there be a speed benefit to having an x64 app?

Milkyway have one for ATI/Cal.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686043288
RAC: 582127

Message 95691 in response to message 95690

Quote:

You mentioned elsewhere that this one uses single precision a lot more. Would there be a speed benefit to having an x64 app?

Milkyway have one for ATI/Cal.

I did some experiments with x64 builds (under Linux) and found no significant performance increase so far.

CU
Bikeman

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 137324514
RAC: 18517

Message 95692 in response to message 95691

Quote:
Quote:

You mentioned elsewhere that this one uses single precision a lot more. Would there be a speed benefit to having an x64 app?

Milkyway have one for ATI/Cal.

I did some experiments with x64 builds (under Linux) and found no significant performance increase so far.

CU
Bikeman

Hi Bikeman,

I presume that is the CPU app you are referring to?

Cheers

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686043288
RAC: 582127

Message 95693 in response to message 95692

Quote:


Hi Bikeman,

I presume that is the CPU app you are referring to?

Cheers

Yes. If even the CPU is not benefitting from x64 compilation, the GPU app should not benefit either.

CU
Bikeman

cristipurdel
Joined: 19 Jul 07
Posts: 26
Credit: 11991887
RAC: 0

Any news on ABP2 CUDA, or on a possible ATI client?

rroonnaalldd
Joined: 12 Dec 05
Posts: 116
Credit: 537221
RAC: 0

Could the inefficiency of the current CUDA app have to do with the high number of page faults from the CPU app?
Here is a log from Sysinternals Process Explorer:

Quote:
Process          PID   CPU     Virtual Size  Working Set  Page Faults  PF Delta  CPU Time
einsteinbinary   2276  100.00  138.348 K     100.908 K    39.551.450   4.990     2:11:58.562
Amolqc-preRC1_5  2840  000.00  261.496 K     075.600 K    18.882       0.000     1:37:35.015

Okay, QMC is not running at the moment, but look at the CPU times and compare the page-fault counts.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686043288
RAC: 582127

Discussion of ABP2 CUDA app continues here:

http://einsteinathome.org/node/194710
