FGRP - High invalid rate on Nvidia 4090?

DF1DX
Joined: 14 Aug 10
Posts: 102
Credit: 3,086,104,491
RAC: 4,021,987

FYI:

In the meantime I have calculated over 700 BRP7 WUs on the 4090.
About 11% of these are currently invalid.

Of the remaining FGRPB WUs, about 12% are invalid. With petri's optimized AIO app, up to 20% were invalid here.

So far, not a single error with GW-WUs.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,276
Credit: 245,631,554
RAC: 11,200

If the "invalid" rate on BRP7 is also higher on the 4090, I would also be interested in whether there's a difference in validation between the Windows (CUDA) and the Linux (OpenCL) app.

Both the FGRP and BRP apps are mainly FFT bound, and the FFT happens in a pretty early step. Thus the result of the FFT has a much higher impact on the overall result than, e.g., in the GW app.

However, the different Apps use different libraries for the FFT:

* FGRP uses "clFFT", originally developed by AMD for their cards, now open source on GitHub

* BRP CUDA (BRP7 Windows) uses cuFFT

* BRP OpenCL uses our own implementation based on an Apple OpenCL code example, which seems to be derived from an early cuFFT version
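To illustrate why different FFT implementations need not agree bit for bit, here is a minimal sketch in plain Python. It is not code from any of the apps or libraries named above; `naive_dft`, `fft_radix2` and `max_abs_diff` are hypothetical helper names, and the random test signal is an assumption. Both functions compute the same transform but add up the terms in a different order, so their floating-point results can differ in the last digits.

```python
import cmath
import random


def naive_dft(x):
    """Direct O(n^2) evaluation of the discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for j in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * j / n) * odd[j]
        out[j] = even[j] + twiddle
        out[j + n // 2] = even[j] - twiddle
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two spectra."""
    return max(abs(p - q) for p, q in zip(a, b))


random.seed(0)  # fixed seed so the run is repeatable
signal = [random.uniform(-1.0, 1.0) for _ in range(256)]
diff = max_abs_diff(naive_dft(signal), fft_radix2(signal))
print(f"max |naive - radix2| = {diff:.3e}")
```

The difference is tiny in double precision, but when the FFT sits early in a long processing chain, as described above, small discrepancies between implementations can matter for validation.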

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,276
Credit: 245,631,554
RAC: 11,200

Hm. The overall "invalid" rate of 4090s on BRP7 is <3%, which is even lower than the overall "invalid" average there (~3.4%). However, in the DB I currently only have 4090 results from hosts running Windows. Actually, from Linux I only have 5(!) valid results in total from BRP7.

BM

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 216
Credit: 8,516,434,008
RAC: 4,514,805

Bernd Machenschalk wrote:

If the "invalid" rate on BRP7 is also higher on the 4090, I would also be interested in whether there's a difference in validation between the Windows (CUDA) and the Linux (OpenCL) app.

Both the FGRP and BRP apps are mainly FFT bound, and the FFT happens in a pretty early step. Thus the result of the FFT has a much higher impact on the overall result than, e.g., in the GW app.

However, the different Apps use different libraries for the FFT:

* FGRP uses "clFFT", originally developed by AMD for their cards, now open source on GitHub

* BRP CUDA (BRP7 Windows) uses cuFFT

* BRP OpenCL uses our own implementation based on an Apple OpenCL code example, which seems to be derived from an early cuFFT version

Looks like I have some learning to do about FFT! It is interesting that different libraries are used.

Bernd Machenschalk wrote:

Hm. The overall "invalid" rate of 4090s on BRP7 is <3%, which is even lower than the overall "invalid" average there (~3.4%). However, in the DB I currently only have 4090 results from hosts running Windows. Actually, from Linux I only have 5(!) valid results in total from BRP7.

I am going to try and get one of the 4090 systems working on BRP7 this week. 

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 216
Credit: 8,516,434,008
RAC: 4,514,805

Got it working on this host. It will crunch BRP7 full-time for the rest of the week to give us a good sample size (adding to DF1DX's completed work units). It is finishing a BRP7 work unit in ~3:09.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,276
Credit: 245,631,554
RAC: 11,200

Bernd Machenschalk wrote:

Hm. The overall "invalid" rate of 4090s on BRP7 is <3%, which is even lower than the overall "invalid" average there (~3.4%). However, in the DB I currently only have 4090 results from hosts running Windows. Actually, from Linux I only have 5(!) valid results in total from BRP7.

The OS distinction/selection in my query was somewhat wrong. Actually, on BRP7, Linux hosts have roughly 10% invalid results, while Windows hosts have only 0.5%. Judging from this, I'd guess that the problem lies in the OpenCL driver (or the compiler in it); the CUDA version of BRP7 seems to work fine.

So if you are on Windows and want to avoid these invalid rates, my recommendation for now would be to restrict yourself (or your hosts) to running BRP7.

Here's the thing with the Linux CUDA version: we found that the gcc version used to build the CPU part of the application is crucial for validation (some data preparation is done beforehand on the CPU, and this needs to yield the exact same results). However, I couldn't get the libgcc to link with the CUDA libraries, at least not with CUDA 5.5. I'll see if I can get this app to link with a newer CUDA version.

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,758
Credit: 36,025,442,734
RAC: 44,105,201

Petri has Linux CUDA 11 and 12 versions of BRP7 that validate well, at least on 30-series cards and earlier. Not sure about 40-series.

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,276
Credit: 245,631,554
RAC: 11,200

I published a BRP7 Linux app version (0.16) with CUDA 10.2. This was built on Ubuntu 18.04 and may not run on other systems with an older libc. It's Beta anyway. You may want to give it a try.

BM

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 216
Credit: 8,516,434,008
RAC: 4,514,805

Bernd Machenschalk wrote:

I published a BRP7 Linux app version (0.16) with CUDA 10.2. This was built on Ubuntu 18.04 and may not run on other systems with an older libc. It's Beta anyway. You may want to give it a try.

On it! The older version of the app gave us the following results (some pending): 

Pending (85)
Valid (293)
Invalid (57)
Error (0)
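
For reference, the invalid rate implied by these counts can be worked out with a short sketch. It assumes that pending tasks are left out because they have not been validated yet, and `invalid_rate` is just an illustrative helper name, not part of any project tool.

```python
def invalid_rate(valid, invalid):
    """Share of invalid results among tasks that have already been validated."""
    decided = valid + invalid
    if decided == 0:
        raise ValueError("no validated tasks yet")
    return invalid / decided


# Counts from the list above; the 85 pending tasks are excluded (assumption).
rate = invalid_rate(valid=293, invalid=57)
print(f"{rate:.1%}")  # prints 16.3%
```

So roughly one in six validated tasks from the older app version came back invalid on this host.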

 

I will enable beta apps and then run more of these on the 4090 this week. Will the host automatically receive version 0.16 when it requests tasks?

DF1DX
Joined: 14 Aug 10
Posts: 102
Credit: 3,086,104,491
RAC: 4,021,987

I only get the previous version 0.15 with OpenCL at the moment. Yes, beta is enabled.
