BRP4 Intel GPU app feedback thread

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2694028
RAC: 0

Quote:
One wu was distributed to a third pc. The wu crunched by an i5 HD4600 was then marked invalid.
The user has a lot of invalids (345) and some hundred in the queue.
http://einsteinathome.org/host/11706755/tasks&offset=0&show_names=1&state=4&appid=19


Host has an incompatible driver:

http://einstein.phys.uwm.edu/host_sched_logs/11706/11706755

Quote:
2014-12-29 17:17:23.6031 [PID=29652] [version] Checking plan class 'opencl-intel_gpu'
2014-12-29 17:17:23.6031 [PID=29652] [version] parsed project prefs setting 'gpu_util_brp': 0.500000
2014-12-29 17:17:23.6031 [PID=29652] [version] OpenCL driver version: 10.18.10.3960; platform version: OpenCL 1.2; device version: OpenCL 1.2
2014-12-29 17:17:23.6031 [PID=29652] [version] Peak flops supplied: 5.6e+10
2014-12-29 17:17:23.6032 [PID=29652] [version] plan class ok
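For anyone who wants to check a host's scheduler log without reading it by eye, here is a minimal sketch in Python. It assumes only the line format shown in the quote above; the function names are made up for illustration and are not part of BOINC or the Einstein@Home server code.

```python
import re

# Matches the "OpenCL driver version" field as it appears in the quoted log.
DRIVER_RE = re.compile(r"OpenCL driver version:\s*([\d.]+)")

def driver_versions(log_text):
    """Return every OpenCL driver version string found in the log text."""
    return DRIVER_RE.findall(log_text)

def build_number(version):
    """Return the last dotted field as an int, e.g. 3960 for 10.18.10.3960."""
    return int(version.rsplit(".", 1)[1])

# Sample line copied from the scheduler log quoted above.
sample = (
    "2014-12-29 17:17:23.6031 [PID=29652] [version] "
    "OpenCL driver version: 10.18.10.3960; platform version: OpenCL 1.2; "
    "device version: OpenCL 1.2"
)

for version in driver_versions(sample):
    print(version, build_number(version))
```

The build number (the last field) is what people in this thread use as shorthand, e.g. "3621" or "4080", so it is the handy part to compare between hosts.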

Claggy

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 536654324
RAC: 184837

Update
Alex reported on his Celeron J1900 with the very new driver version 10.18.10.3408 producing an impressive amount of valids: at the time of writing 2 invalids vs. 82 valids, whereas he reported 3 invalids of 185 WUs crunched.

The 2 invalids currently in the database happened against 2 hosts with HD4600 GPUs. One of them is using a problematic driver, 10.18.10.3977, whereas the others are not asking for work (lazy bastards!), so I can't see their driver version. They are all showing a ratio of 1/3 valid to 2/3 invalid WUs, which is typical for the problematic drivers.

This Celeron uses a Silvermont Atom with an Ivy Bridge GPU, so the results should at least apply to all HD2500 / HD4000 GPUs. Currently we can't get work from Einstein, but if this changes it may be worth testing driver 10.18.10.3408 on other hosts as well.

MrS

Scanning for our furry friends since Jan 2002

boinc127
Joined: 17 Mar 11
Posts: 23
Credit: 4003975
RAC: 1

I'm going to be a guinea pig and try out the 4080 driver that just got released. Hopefully this one works.

boinc127
Joined: 17 Mar 11
Posts: 23
Credit: 4003975
RAC: 1

I don't know why I thought the new 4080 driver would work. From the tiny sample I did, I can tell this driver doesn't work correctly either. I have a few inconclusives and 2 valid workunits from wingmen, with quite a few currently marked as invalid or as validate errors. Don't upgrade to the 4080 driver.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2139
Credit: 2752967155
RAC: 1383389

Following a similar discussion about drivers at SETI, here are some observations which may provide a clue about what's going on.

I've run some extended tests with a SETI application, using the 64-bit drivers under Windows 7. The driver versions I've tested are

HD 4000 with drivers 10.18.10.3621 and 10.18.10.4061
HD 4600 with drivers 10.18.10.3621 and 10.18.14.4080

Note that with the 3621 drivers (which date from May 2014), the same version can be used for both 'Ivy Bridge' (HD 4000) and 'Haswell' (HD 4600) CPUs.

This is not the case with current drivers - 4061 and 4080 can only be installed on Ivy Bridge and Haswell, respectively (I'm using the setup program contained within the zip download, so I can see more clearly what's going on).

The SETI science application provided by Raistmer is packaged differently from the Einstein BRP4 application: the OpenCL kernels are downloaded separately as a source-code .cl file, and compiled into binary files on the target host the first time the application is run. Using the SETI application, all four hardware/driver combinations produced valid results.

But it is clear that the intermediate binary files compiled on the Haswell CPU by the 4080 driver are significantly different from the other three cases - the main application binary is roughly a third of the size, and the clFFTplan binary files less than a quarter of the size.

The Einstein application, on the other hand, is downloaded as a single monolithic .exe file (plus a separate screensaver/graphics file, which I don't think is relevant to this discussion). I can't find any evidence of a separate 'in situ' compilation using the driver resources. The three driver/hardware combinations which compile large binary files all produce a respectable validation rate at Einstein, but the Haswell/4080 combination gave very poor results (including two validate errors in the half-dozen tasks I tried). Validation rates on my Haswell returned to normal when I reverted to the 3621 driver.

So my finger of suspicion is pointing towards the new drivers for the Haswell range (only) requiring a different format of compiled OpenCL kernel resource from the original build.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244932206
RAC: 16344

The Einstein@Home application binary contains the OpenCL source code (text) as a single large string (try the Linux "strings" tool on an OpenCL app binary). The OpenCL code is always compiled by the driver for the underlying device at the start of the application. Compile time is usually very small compared to the actual run time, so we don't bother to dump the compiled result.
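If you don't have the Linux "strings" tool to hand (e.g. on Windows), a rough Python equivalent is sketched below. It pulls runs of printable ASCII out of a binary and keeps the ones containing the OpenCL C `__kernel` qualifier. Searching for that marker is an assumption about how the embedded source is written, the function names are made up for illustration, and the demo bytes are synthetic rather than taken from a real app binary.

```python
import re

def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII (plus tab/newline) of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e\t\n]{" + str(min_len).encode() + rb",}")
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")

def find_kernel_source(data, marker="__kernel"):
    """Return the printable runs that contain the given marker text."""
    return [s for s in printable_strings(data) if marker in s]

# Synthetic stand-in for an application binary: junk bytes around a text string.
demo = b"\x00\x01binary\x00__kernel void f(__global float *x) { }\x00\xff"
for source in find_kernel_source(demo):
    print(source)

# For a real file, read it as bytes first (the path here is hypothetical):
#   with open("path/to/app_binary", "rb") as fh:
#       hits = find_kernel_source(fh.read())
```

A hit only tells you the source text is embedded in the executable; it says nothing about what the driver does with it at compile time.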

We always wanted to evaluate AMD's OpenCL FFT library (which I think the SETI app is using) for our purposes, but never actually got around to doing it. We still use an OpenCL FFT code that was published by Apple as an example, and it looks like it's derived from an early version of NVIDIA's CUDA FFT optimized for the GT200 chips. In particular it doesn't make any use of the SIMD units in AMD chips.

BM


ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 536654324
RAC: 184837

Have you tried to contact Intel? Or is there any other light on the horizon?

So far it seems like the problem is only going to get worse when Broadwell and Skylake arrive with even more powerful GPUs and require newer drivers.

MrS

Scanning for our furry friends since Jan 2002

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244932206
RAC: 16344

So far I don't have a precise understanding of what the problem actually is, or how exactly it expresses itself in the returned results. Currently the BRP4 file deleters are disabled in order to keep the result files around for closer analysis. Unfortunately I am away from my desk this week, so I won't have another look before Monday next week, unless there is yet another emergency, like a server crash or another security vulnerability that needs urgent fixing.

BM


ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 536654324
RAC: 184837

Thanks for looking into it! This problem has been with us for months, so I'm sure it won't go away for a few days more ;)

MrS

Scanning for our furry friends since Jan 2002

Maximilian Mieth
Joined: 4 Oct 12
Posts: 128
Credit: 9885704
RAC: 2055

MrS wrote:
This problem has been with us for months

And it is getting bigger and bigger. Right now I have 77 tasks marked as invalid and a large number of "validation inconclusive". As far as I can see it is always the same behavior: my HD4000 and my wingman's HD4600 cannot validate against each other, then another HD4600 gets the third task and validates against the first HD4600.
