Any update on CUDA 5.5?

5pot
Joined: 8 Apr 12
Posts: 107
Credit: 7,577,619
RAC: 0
Topic 197131

Read you guys say the newest CUDA 5.5 may solve the bug that has been interfering with your app.

Any news in regards to this? Would love to come back crunching here. I mean, I can now, it's just that the app is rather inefficient on newer NVIDIA GPUs.

Cheers

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,023
Credit: 214,166,912
RAC: 42,753

Quote:
it's just that the app is rather inefficient on newer NVIDIA GPUs.

Yes, but this seems to be related to NVIDIA's drivers; it doesn't get better with CUDA 5.5.

Still under investigation.

BM

Shafa
Joined: 31 May 05
Posts: 53
Credit: 627,005,014
RAC: 0

Well, it seems like the purchase of the GTX660 will be postponed; a couple of GTX560 Ti cards will take its place instead :D

5pot
Joined: 8 Apr 12
Posts: 107
Credit: 7,577,619
RAC: 0

First, thx for the update. Shame you guys are having driver issues with the app; other projects are seeing over a 30% increase in performance.

To the other guy: it would appear this project is now becoming much, much better for AMD cards.

Betreger
Joined: 25 Feb 05
Posts: 953
Credit: 678,570,578
RAC: 298,512

I was planning on replacing an OEM ATI card with a GTX760. Maybe I should hold off for a while?

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 889
Credit: 25,165,240
RAC: 0

Hi,

Quote:
Read you guys say the newest CUDA 5.5 may solve the bug that has been interfering with your app.

Depends on what you mean by "the bug". There has been a bug that prevented us from using anything newer than CUDA 3.2. NVIDIA was able to reproduce it but couldn't fix it for quite some time. This bug now seems fixed with CUDA 5.5 and we could start testing it on albert as we have built all necessary binaries. It's just a matter of time.

However, there's also the performance regression we see with the latest drivers (>= 319.xx). That's unrelated to CUDA 5.5: we've received reports that our CUDA 3.2 app is affected as well, which clearly points at the driver. That's one of the reasons we haven't yet rolled out CUDA 5.5 apps: they would require the affected driver versions, whereas the CUDA 3.2 app can still be used with older, unaffected drivers.

Best,
Oliver

 

Einstein@Home Project

robl
Joined: 2 Jan 13
Posts: 1,686
Credit: 1,309,866,432
RAC: 16,117

I am crunching on Linux Ubuntu 12.04 with an NVIDIA 650 Ti. While some tools, like "nvidia-smi", are installed along with the driver, they don't let you acquire GPU utilization stats. I can get GPU temperature, fan speed, and a few other attributes, but nothing having to do with GPU performance/utilization.

I wrote to NVIDIA support, who informed me that the data I wanted is not supplied by their current driver set, and suggested that I download and install the CUDA tool set because it "might" provide the data I seek. Before expending the effort to install this tool set, I thought I would ask the community about their experience with it.

My question: has anyone running Linux installed the NVIDIA CUDA tool set, and how much "value added" performance was gained from the installation effort? Can someone who has experience with it enumerate the tools/features of the NVIDIA CUDA tool set?

TIA
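[Aside: on reasonably recent drivers, nvidia-smi itself has a scriptable CSV query mode (`--query-gpu=...`), no toolkit required. A minimal sketch in Python follows; whether fields like utilization come back with real values or as "N/A"/"[Not Supported]" depends on the card and driver, as discussed in this thread.]

```python
# Sketch: poll GPU stats through nvidia-smi's CSV query interface.
# Assumes a driver recent enough to support --query-gpu; fields the
# card/driver doesn't expose come back as "N/A" or "[Not Supported]".
import subprocess

FIELDS = ["utilization.gpu", "temperature.gpu", "fan.speed"]

def parse_csv_line(line):
    """Map one CSV line from nvidia-smi to a dict; unsupported fields -> None."""
    values = [v.strip() for v in line.split(",")]
    return {name: (None if v in ("N/A", "[Not Supported]") else v)
            for name, v in zip(FIELDS, values)}

def query_gpu():
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader"], text=True)
    return parse_csv_line(out.splitlines()[0])  # first GPU only

if __name__ == "__main__":
    print(query_gpu())
```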

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 889
Credit: 25,165,240
RAC: 0

Hi robl,

It seems you're asking two questions here:

1) What nvidia-smi reports doesn't just depend on the driver but also on the device. I'm not sure that fan speeds or utilisation details are reported by consumer cards; it could very well be that only the Tesla (workstation/GPGPU) series supports this (which I know it does).

2) I can't really follow you regarding the "value added" performance gain from installing the "NVIDIA CUDA tool set". The CUDA toolkit is meant to be used for building and running GPGPU applications. You don't need it for Einstein@Home, as we provide all necessary libraries with our apps. Also, the CUDA development drivers are in principle not optimised or by any means better than the regular drivers. AFAIK they are technically the same, and the dev drivers are just the ones which CUDA releases get developed and tested against. It shouldn't matter if you install any newer regular driver.

HTH,
Oliver

 

Einstein@Home Project

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,038
Credit: 671,293,207
RAC: 1,300,890

Quote:

Hi,

Quote:
Read you guys say the newest CUDA 5.5 may solve the bug that has been interfering with your app.

Depends on what you mean by "the bug". There has been a bug that prevented us from using anything newer than CUDA 3.2. NVIDIA was able to reproduce it but couldn't fix it for quite some time. This bug now seems fixed with CUDA 5.5 and we could start testing it on albert as we have built all necessary binaries. It's just a matter of time.

However, there's also the performance regression we see with the latest drivers (>= 319.xx). That's unrelated to CUDA 5.5: we've received reports that our CUDA 3.2 app is affected as well, which clearly points at the driver. That's one of the reasons we haven't yet rolled out CUDA 5.5 apps: they would require the affected driver versions, whereas the CUDA 3.2 app can still be used with older, unaffected drivers.

Best,
Oliver


I was interested by this, because I've recently upgraded an old 9800GT host from the 310.70 driver to the 326.14 beta. That should have crossed the 'slowdown' threshold at 319, but I did not see a slowdown with a third-party SETI app I'm testing.

So I ran it past the SETI developer (Jason Gee), and he replied:

Quote:

No it makes more sense now, which is satisfying to me, being a fan of symmetry.

In a nutshell, the drivers have not a lot at all to do with 'how stuff is processed', but everything to do with moving data/requests around, the protocols involved in the app-->runtime-->driver-->kernel-->device-->kernel-->runtime-->app chain, which is mandatory since the move to 'user mode drivers', and the net cost is roughly 10% across the board. [note: 'user mode drivers' are mandatory for the WDDM driver model in Vista/Win7 and above, and newer GPU models have special hardware to support these drivers: they are increasingly being emulated for older OSs and older cards for consistency and good practice - ed.]

These same factors affect the SETI app, just that I've been steadily reducing synchronisation demands as time progresses, so treading water if you like. The old ways aren't good, just fast.


So, the more you can do to minimise the data transfers and communications overheads, the better. Strangely enough, that's exactly the same point as was being stressed at the recent 'Preparing for Parallella' event, which I attended with Claggy.

Claggy has posted links to videos of the event in Interesting Project on Kickstarter, but I'd especially commend the keynote address by Iann Barron (Inmos, Transputer) - http://www.youtube.com/watch?v=8sO-jj9X2xc - to even a general audience.

robl
Joined: 2 Jan 13
Posts: 1,686
Credit: 1,309,866,432
RAC: 16,117

I probably did not do a good job of asking my question. What I really want to know is the % of GPU utilization, so that I may understand how hard I am pushing the GPU.

Quote:

Hi robl,

It seems you're asking two questions here:

1) What nvidia-smi reports doesn't just depend on the driver but also on the device. I'm not sure that fan speeds or utilisation details are reported by consumer cards; it could very well be that only the Tesla (workstation/GPGPU) series supports this (which I know it does).

Here is my output from nvidia-smi. You will notice that the GPU Utilization field is N/A. I would like to know this value so that I can better understand how many GPU WUs to crunch. I have noted elsewhere that "nvidia-smi" does provide this information on some older cards, but stopped providing it after the introduction of driver xxx.xx. Notice that GPU temp and fan speed are shown. Some values "have been changed".

==============NVSMI LOG==============

Timestamp                   : Thu Aug 15 12:00:35 2013
Driver Version              : 304.88

Attached GPUs               : 1
GPU 0000:01:00.0
    Product Name            : GeForce GTX 650 Ti
    Display Mode            : N/A
    Persistence Mode        : Disabled
    Driver Model
        Current             : N/A
        Pending             : N/A
    Serial Number           : N/A
    GPU UUID                : GPU-XXXXXXXXX-XXXXXXXXXXXXX-XXXXXXXXXXX
    VBIOS Version           : 11.22.33.44.55.66
    Inforom Version
        Image Version       : N/A
        OEM Object          : N/A
        ECC Object          : N/A
        Power Management Object : N/A
    GPU Operation Mode
        Current             : N/A
        Pending             : N/A
    PCI
        Bus                 : 0x01
        Device              : 0x00
        Domain              : 0x0000
        Device Id           : 0x1234567
        Bus Id              : 0000:01:00.0
        Sub System Id       : 0x842A1043
        GPU Link Info
            PCIe Generation
                Max         : N/A
                Current     : N/A
            Link Width
                Max         : N/A
                Current     : N/A
    Fan Speed               : 44 %
    Performance State       : N/A
    Clocks Throttle Reasons : N/A
    Memory Usage
        Total               : 1023 MB
        Used                : 352 MB
        Free                : 671 MB
    Compute Mode            : Default
    Utilization
        Gpu                 : N/A
        Memory              : N/A
    Ecc Mode
        Current             : N/A
        Pending             : N/A
    ECC Errors
        Volatile
            Single Bit
                Device Memory  : N/A
                Register File  : N/A
                L1 Cache       : N/A
                L2 Cache       : N/A
                Texture Memory : N/A
                Total          : N/A
            Double Bit
                Device Memory  : N/A
                Register File  : N/A
                L1 Cache       : N/A
                L2 Cache       : N/A
                Texture Memory : N/A
                Total          : N/A
        Aggregate
            Single Bit
                Device Memory  : N/A
                Register File  : N/A
                L1 Cache       : N/A
                L2 Cache       : N/A
                Texture Memory : N/A
                Total          : N/A
            Double Bit
                Device Memory  : N/A
                Register File  : N/A
                L1 Cache       : N/A
                L2 Cache       : N/A
                Texture Memory : N/A
                Total          : N/A
    Temperature
        Gpu                 : 56 C
    Power Readings
        Power Management    : N/A
        Power Draw          : N/A
        Power Limit         : N/A
        Default Power Limit : N/A
        Min Power Limit     : N/A
        Max Power Limit     : N/A
    Clocks
        Graphics            : N/A
        SM                  : N/A
        Memory              : N/A
    Applications Clocks
        Graphics            : N/A
        Memory              : N/A
    Max Clocks
        Graphics            : N/A
        SM                  : N/A
        Memory              : N/A
    Compute Processes       : N/A

Quote:

2) I can't really follow you regarding the "value added" performance gain from installing the "NVIDIA CUDA tool set". The CUDA toolkit is meant to be used for building and running GPGPU applications. You don't need it for Einstein@Home, as we provide all necessary libraries with our apps. Also, the CUDA development drivers are in principle not optimised or by any means better than the regular drivers. AFAIK they are technically the same, and the dev drivers are just the ones which CUDA releases get developed and tested against. It shouldn't matter if you install any newer regular driver.

HTH
Oliver

My reference to "value added" was meant to ask: if I go to the effort of installing the CUDA tool kit, will I have the tools to acquire % of GPU utilization, or will this tool set also fail to provide that information? In other words, if it doesn't provide what I want, I don't want to invest the time installing it.
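[Aside: even without the toolkit, the plain-text log above is scriptable. A rough sketch (the helper name is mine, not an NVIDIA tool) that pulls "Key : Value" fields out of `nvidia-smi -q` style output and treats "N/A" as missing:]

```python
# Sketch: extract fields from the plain-text nvidia-smi log format shown
# above, mapping "N/A" to None. Nesting is ignored, so duplicate keys like
# "Gpu" (under Utilization vs. Temperature) keep their first occurrence;
# real use would need to track the section headers too.
def parse_nvsmi_log(text):
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # section header, e.g. "Utilization"
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key and value:
            fields.setdefault(key, value if value != "N/A" else None)
    return fields

sample = """\
Fan Speed            : 44 %
Utilization
    Gpu              : N/A
Temperature
    Gpu              : 56 C
"""
# First "Gpu" line wins, so utilization's N/A is the one kept.
print(parse_nvsmi_log(sample))  # -> {'Fan Speed': '44 %', 'Gpu': None}
```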

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 889
Credit: 25,165,240
RAC: 0

I'm not aware that support for that got dropped in a certain driver version, but it may of course very well be the case (for desktop cards). All our Linux boxes run Tesla-series cards, and those still show these values with driver 319.37 (Tesla cards only the fan speed, Fermi cards also GPU utilization). But again, installing the CUDA toolkit shouldn't make any difference.

Sorry,
Oliver

 

Einstein@Home Project
