BRP4 Intel GPUs and validation

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 248,801,863
RAC: 33,348
Topic 225940

We are aware that quite a few results computed on various devices, particularly ARM-based ones, currently "lose" validation when compared against those of the dominant Intel GPUs. We are still working on a fix in the application code, but due to various issues this will take some more time.

To improve this situation, which is understandably frustrating for some of you, we are reviving the "BRP4G" application and will have the Intel GPUs run it, so their results will be validated only against each other. The other devices will keep processing BRP4 work and will validate only within that group.

BM

Bill F
Joined: 24 Dec 05
Posts: 51
Credit: 87,690,182
RAC: 157,158

Once the results from the current BETA applications have been collected and it has been determined that an application is running at a low enough error rate to be safe for everyone to run, can its BETA status be removed so that mainline users who have not agreed to run "TEST Applications" can receive the new versions?

 

Thanks

Bill F

Dallas TX

In October of 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,143
Credit: 2,923,362,889
RAC: 933,773

I'd be against doing that just yet. I've been running these tasks on some sample machines since the BRP4G run was restarted. I'd say that my results on HD 4600 GPUs are OK, but anything later than that is still problematic.

Take these stats from an HD 530 (still over 5 years old - nowhere near the cutting edge):

Check those figures very carefully. The separate categories add up to 238, leaving 146 unaccounted for. They are all 'validation inconclusive' against the stock application. Coupled with the figure of 69 invalid, I'd say that's unacceptable from a quality control point of view. The machine is a Dell Optiplex 5040 desktop, with no overclocking or other stressors.
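To make the proportions explicit, here is a minimal sketch of the arithmetic, assuming the host total is simply 238 + 146 = 384 tasks and that the 69 invalid results are one of the categories already counted within the 238. Both are inferences from the figures quoted above, and the variable names are only illustrative.

```python
# Figures quoted in the post; the total is inferred, not read from the host page.
accounted = 238      # sum of the separately listed categories
inconclusive = 146   # 'validation inconclusive' against the stock application
invalid = 69         # assumed to be one of the categories inside `accounted`

total = accounted + inconclusive

share_inconclusive = inconclusive / total
share_invalid = invalid / total
share_problematic = (inconclusive + invalid) / total

print(f"total tasks:            {total}")
print(f"inconclusive:           {share_inconclusive:.1%}")
print(f"invalid:                {share_invalid:.1%}")
print(f"inconclusive + invalid: {share_problematic:.1%}")
```

Under those assumptions, roughly 38% of the tasks are inconclusive and about 18% are invalid, so more than half of the results from this host are not validating cleanly.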

Bernd intimated to me privately - just before his holidays - that "...  we finally have someone actively working on it now, so we're getting back to it. I'll take a look at this issue." I'm ready and willing to take part in that investigation whenever called upon: I have some experience of tackling similar issues at SETI@Home.

Bill F
Joined: 24 Dec 05
Posts: 51
Credit: 87,690,182
RAC: 157,158

Richard Haselgrove wrote:

I'd be against doing that just yet. I've been running these tasks on some sample machines since the BRP4G run was restarted. I'd say that my results on HD 4600 GPUs are OK, but anything later than that is still problematic.

Take these stats from an HD 530 (still over 5 years old - nowhere near the cutting edge):

Check those figures very carefully. The separate categories add up to 238, leaving 146 unaccounted for. They are all 'validation inconclusive' against the stock application. Coupled with the figure of 69 invalid, I'd say that's unacceptable from a quality control point of view. The machine is a Dell Optiplex 5040 desktop, with no overclocking or other stressors.

Bernd intimated to me privately - just before his holidays - that "...  we finally have someone actively working on it now, so we're getting back to it. I'll take a look at this issue." I'm ready and willing to take part in that investigation whenever called upon: I have some experience of tackling similar issues at SETI@Home.

 

I agree with Richard; I too would like the apps to be pretty solid before BETA status is dropped. I was thinking more of the two BETAs that have release dates in 2019.

 

Bill F

 

 


[AF] fansyl
Joined: 27 Sep 15
Posts: 3
Credit: 226,326,704
RAC: 12,014

Hello,

I can no longer receive BRP4(G) tasks for the Intel HD4000 GPU on my i7-3770T, even though I have not changed any settings.

Is this normal?

I have the following error message:

2022-05-02 17:38:36.1009 [PID=18124]   Request: [USER#xxxxx] [HOST#12060147] [IP xxx.xxx.xxx.144] client 7.16.11
2022-05-02 17:38:36.1614 [PID=18124] [debug]   have_master:1 have_working: 1 have_db: 1
2022-05-02 17:38:36.1615 [PID=18124] [debug]   using working prefs
2022-05-02 17:38:36.1615 [PID=18124] [debug]   have db 1; dbmod 1636371210.000000; global mod 1636371210.000000
2022-05-02 17:38:36.1615 [PID=18124]    [send] effective_ncpus 2 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2022-05-02 17:38:36.1615 [PID=18124]    [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2022-05-02 17:38:36.1615 [PID=18124]    [send] Not using matchmaker scheduling; Not using EDF sim
2022-05-02 17:38:36.1615 [PID=18124]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2022-05-02 17:38:36.1615 [PID=18124]    [send] Intel GPU: req 17280.00 sec, 1.00 instances; est delay 0.00
2022-05-02 17:38:36.1615 [PID=18124]    [send] work_req_seconds: 0.00 secs
2022-05-02 17:38:36.1616 [PID=18124]    [send] available disk 95.24 GB, work_buf_min 8640
2022-05-02 17:38:36.1616 [PID=18124]    [send] active_frac 0.964182 on_frac 0.994370 DCF 1.000000
2022-05-02 17:38:36.1625 [PID=18124]    [mixed] sending locality work first (0.1994)
2022-05-02 17:38:36.1626 [PID=18124]    [mixed] sending non-locality work second
2022-05-02 17:38:36.1881 [PID=18124]    [send] [HOST#12060147] will accept beta work.  Scanning for beta work.
2022-05-02 17:38:36.2329 [PID=18124]    [version] Checking plan class 'opencl-intel_gpu-new'
2022-05-02 17:38:36.2356 [PID=18124]    [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2022-05-02 17:38:36.2357 [PID=18124]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2022-05-02 17:38:36.2357 [PID=18124]    [version] GPU RAM calculated: min: 512 MB, use: 360 MB, WU#631275565 CPU: 248 MB
2022-05-02 17:38:36.2357 [PID=18124]    [version] [HOST#12060147] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 10.18.10.5161; platform version: OpenCL 1.2; device version: OpenCL 1.2
2022-05-02 17:38:36.2357 [PID=18124]    [version] driver version 1018105161, min: 0, max: 1018103906
2022-05-02 17:38:36.2357 [PID=18124]    [version] driver version required max: 1018103906, supplied: 1018105161
2022-05-02 17:38:36.2358 [PID=18124]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#9 (windows_x86_64) min_version 0
2022-05-02 17:38:36.2358 [PID=18124]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#2 (windows_intelx86) min_version 0
2022-05-02 17:38:36.2358 [PID=18124]    [version] Checking plan class 'BRP4X64'
2022-05-02 17:38:36.2358 [PID=18124]    [version] plan class ok
2022-05-02 17:38:36.2358 [PID=18124]    [version] Don't need CPU jobs, skipping version 133 for einsteinbinary_BRP4G (BRP4X64)
2022-05-02 17:38:36.2358 [PID=18124]    [version] Checking plan class 'opencl-intel_gpu-new'
2022-05-02 17:38:36.2358 [PID=18124]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2022-05-02 17:38:36.2359 [PID=18124]    [version] GPU RAM calculated: min: 512 MB, use: 360 MB, WU#638199463 CPU: 248 MB
2022-05-02 17:38:36.2359 [PID=18124]    [version] [HOST#12060147] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 10.18.10.5161; platform version: OpenCL 1.2; device version: OpenCL 1.2
2022-05-02 17:38:36.2359 [PID=18124]    [version] driver version 1018105161, min: 0, max: 1018103906
2022-05-02 17:38:36.2359 [PID=18124]    [version] driver version required max: 1018103906, supplied: 1018105161
2022-05-02 17:38:36.2359 [PID=18124]    [version] Don't need CPU jobs, skipping version 133 for einsteinbinary_BRP4G ()
2022-05-02 17:38:36.2359 [PID=18124]    [version] no app version available: APP#25 (einsteinbinary_BRP4G) PLATFORM#9 (windows_x86_64) min_version 0
2022-05-02 17:38:36.2359 [PID=18124]    [version] no app version available: APP#25 (einsteinbinary_BRP4G) PLATFORM#2 (windows_intelx86) min_version 0
2022-05-02 17:38:36.2364 [PID=18124]    [version] Checking plan class 'FGRPSSE'
2022-05-02 17:38:36.2364 [PID=18124]    [version] plan class ok
2022-05-02 17:38:36.2365 [PID=18124]    [version] Don't need CPU jobs, skipping version 108 for hsgamma_FGRP5 (FGRPSSE)
2022-05-02 17:38:36.2365 [PID=18124]    [version] no app version available: APP#46 (hsgamma_FGRP5) PLATFORM#9 (windows_x86_64) min_version 0
2022-05-02 17:38:36.2365 [PID=18124]    [version] no app version available: APP#46 (hsgamma_FGRP5) PLATFORM#2 (windows_intelx86) min_version 0
2022-05-02 17:38:36.2372 [PID=18124]    [version] Checking plan class 'FGRPopencl-ati'
2022-05-02 17:38:36.2372 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2372 [PID=18124]    [version] No ATI devices found
2022-05-02 17:38:36.2372 [PID=18124]    [version] Checking plan class 'FGRPopencl-intel_gpu'
2022-05-02 17:38:36.2372 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2372 [PID=18124]    [version] GPU RAM calculated: min: 766 MB, use: 750 MB, WU#638221306 CPU: 429 MB
2022-05-02 17:38:36.2373 [PID=18124]    [version] OpenCL GPU RAM required min: 803209216.000000, supplied: 609012941
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopencl-nvidia'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No CUDA devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopencl1K-ati'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No ATI devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopencl1K-nvidia'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No CUDA devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopenclTV-nvidia'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No CUDA devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopencl2-ati'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No ATI devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] Checking plan class 'FGRPopencl2Pup-nvidia'
2022-05-02 17:38:36.2373 [PID=18124]    [version] parsed project prefs setting 'gpu_util_fgrp': 1.000000
2022-05-02 17:38:36.2373 [PID=18124]    [version] No CUDA devices found
2022-05-02 17:38:36.2373 [PID=18124]    [version] no app version available: APP#40 (hsgamma_FGRPB1G) PLATFORM#9 (windows_x86_64) min_version 0
2022-05-02 17:38:36.2410 [PID=18124]    [send] [HOST#12060147] is looking for work from a non-preferred application
2022-05-02 17:38:36.2483 [PID=18124] [debug]   [HOST#12060147] MSG(high) No work sent
2022-05-02 17:38:36.2483 [PID=18124] [debug]   [HOST#12060147] MSG(high) see scheduler log messages on https://einsteinathome.org/host/12060147/log
2022-05-02 17:38:36.2483 [PID=18124]    Sending reply to [HOST#12060147]: 0 results, delay req 60.00
2022-05-02 17:38:36.2484 [PID=18124]    Scheduler ran 0.152 seconds
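
In a log this long, the lines that explain a refusal are the ones that state a required limit next to the supplied value. A minimal sketch for pulling those out is below. It assumes the line format seen in this one excerpt ("... required min|max: N, supplied: M"); the names `REQUIREMENT`, `SAMPLE` and `failed_requirements` are hypothetical helpers, not part of BOINC.

```python
import re

# Matches lines such as:
#   [version] driver version required max: 1018103906, supplied: 1018105161
# The pattern is inferred from this excerpt only and may not cover other messages.
REQUIREMENT = re.compile(
    r"\[version\]\s+(?P<what>.+?) required (?P<bound>min|max): "
    r"(?P<limit>[\d.]+), supplied: (?P<supplied>[\d.]+)"
)

# Two lines copied from the log above.
SAMPLE = """\
2022-05-02 17:38:36.2357 [PID=18124]    [version] driver version required max: 1018103906, supplied: 1018105161
2022-05-02 17:38:36.2373 [PID=18124]    [version] OpenCL GPU RAM required min: 803209216.000000, supplied: 609012941
"""


def failed_requirements(log_text):
    """Return (what, bound, limit, supplied) for every violated requirement."""
    failures = []
    for line in log_text.splitlines():
        match = REQUIREMENT.search(line)
        if match is None:
            continue
        limit = float(match["limit"])
        supplied = float(match["supplied"])
        # A 'max' bound is violated from above, a 'min' bound from below.
        if match["bound"] == "max":
            violated = supplied > limit
        else:
            violated = supplied < limit
        if violated:
            failures.append((match["what"], match["bound"], limit, supplied))
    return failures


for what, bound, limit, supplied in failed_requirements(SAMPLE):
    print(f"{what}: supplied {supplied:.0f} violates {bound} {limit:.0f}")
```

Run against the two sample lines, this reports the driver version exceeding its maximum and the OpenCL GPU RAM falling short of its minimum.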

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 115,825,691,900
RAC: 35,467,696

[AF] fansyl wrote:
I can no longer receive BRP4(G) tasks for my Intel HD4000 GPU from my i7-3770T even though I have not changed any settings.

I don't use Intel GPUs at all, so I have no experience with them, but did you see the following line in the log excerpt you published?

... driver version required max: 1018103906, supplied: 1018105161

It looks like there is a maximum driver version limit being applied and it shows that your supplied version is higher than the allowed max value.  Perhaps that's a recent change by the project, or perhaps your driver version was updated recently as part of normal system maintenance.  They've been trying to cut down on validation issues by excluding known problematic driver versions, hence the setting of these limits.

You could try downgrading your driver to be at or below the limit stated.
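
As a rough illustration of that check, here is a minimal sketch. It assumes the scheduler's integer form is just the dotted driver version with the dots removed, which matches the single example in the log (10.18.10.5161 becomes 1018105161) but is not confirmed by the project. The function names are hypothetical.

```python
def driver_version_to_int(dotted):
    """Concatenate the dotted components into one integer.

    Assumption: this mirrors how the log turns '10.18.10.5161'
    into 1018105161; the real scheduler may encode it differently.
    """
    return int("".join(dotted.split(".")))


def within_limits(version, min_version, max_version):
    """True if the version lies inside the inclusive [min, max] window."""
    return min_version <= version <= max_version


# Values taken from the log excerpt.
supplied = driver_version_to_int("10.18.10.5161")
min_allowed = 0
max_allowed = 1018103906

if within_limits(supplied, min_allowed, max_allowed):
    print("driver accepted")
else:
    print(f"driver {supplied} is outside [{min_allowed}, {max_allowed}]")
```

Read the same way, the limit 1018103906 would correspond to driver 10.18.10.3906, so any driver at or below that build should pass this particular test.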

Cheers,
Gary.

Link
Joined: 15 Mar 20
Posts: 118
Credit: 7,799,001
RAC: 47,480

Not Intel GPU related, but I have recently seen a similar issue with the "v1.61 aarch64-unknown-linux-gnu" application.

Examples of aarch64 losing against two Androids: WU 649167107 WU 650389871

Example of Android losing against two aarch64: WU 649945764

.

Link
Joined: 15 Mar 20
Posts: 118
Credit: 7,799,001
RAC: 47,480

Next example of Android losing against two aarch64: WU 655777188

Update 31.07.: Android losing against two aarch64: WU 659634571. Perhaps interesting: both aarch64 computers use an ARM BCM2835 CPU.

.
