Important news on BRP7 and FGRPB1 work on E@H

Scrooge McDuck
Joined: 2 May 07
Posts: 862
Credit: 16267400
RAC: 5913

New BRP7 app version for Intel GPU (Windows) released today (2023-09-26): all tasks error out immediately.

Please have a look at my comment in the Problems section thread "Lots of BRP7 errors" (incl. stderr.txt).

[EDIT:] It's a Beta test app. Sorry, I only realized that later. Please ignore my comments.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3743
Credit: 35452966376
RAC: 46113894

Boca Raton Community HS wrote:

An early observation: the RTX 40xx generation still has a high number of invalids for BRP7 tasks. We saw this behavior with FGRPB1 and were hoping that it was something about those exact tasks that was causing the issues, but it appears to be something larger. I am waiting to see results from other users that have RTX 40xx GPUs, but I would assume it would be roughly the same result. Of course, it is a small sample size and more time/tasks will tell the story better, but it is already not looking great.

Here are the hosts:

Host 1

Host 2

 

Disable test task processing and see if you have better luck with the OpenCL v0.17 app that the project posted yesterday.

You might also wait a little longer and see how things shake out with the current cuda102 app.

_________________________________________________________________________

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3743
Credit: 35452966376
RAC: 46113894

Scrooge McDuck wrote:

New BRP7 app version for Intel GPU (Windows) released today (2023-09-26): all tasks error out immediately.

Please have a look at my comment in the Problems section thread "Lots of BRP7 errors" (incl. stderr.txt).

[EDIT:] It's a Beta test app. Sorry, I only realized that later. Please ignore my comments.

FYI, I also got the same error about failing to build the OpenCL app on a Linux Intel GPU system (using the ATI app per Bernd's instructions). This GPU on the same platform previously ran BRP4 tasks fine (ignoring validation issues).

https://einsteinathome.org/task/1529874869

I guess the binary does need something tweaked.

_________________________________________________________________________

zyxbase
Joined: 26 Apr 22
Posts: 4
Credit: 79511928
RAC: 837

All BRP7 tasks on Intel GPU end after 2 seconds with a "computation error". Should we let the computer keep running like this, or stop BRP7 for now?!?

Scrooge McDuck
Joined: 2 May 07
Posts: 862
Credit: 16267400
RAC: 5913

zyxbase wrote:
All BRP7 tasks on Intel GPU end after 2 seconds with a "computation error". Should we let the computer keep running like this, or stop BRP7 for now?!?

It's just a new beta app that is producing these errors. The team does want to see the test results. BRP7 wasn't previously available for Intel GPUs. This app version will surely be stopped and replaced soon. Your wingmen with AMD or Nvidia GPUs will compute those tasks correctly. My BOINC client pauses for hours between scheduler requests because of the computation errors, so not too many tasks are fetched and then lost. If it bothers you: deactivate beta apps in the project preferences.

zyxbase
Joined: 26 Apr 22
Posts: 4
Credit: 79511928
RAC: 837

Thanks, Scrooge McDuck. Your explanation makes sense. So, as always in life: "Just let it run..." :-D

TRAPPIST-713
Joined: 13 May 20
Posts: 10
Credit: 2206449268
RAC: 1518242

Is BRP7 CPU bottlenecked?

---

1. For the same GPU (GTX 1660), the run time of “Binary Radio Pulsar Search (MeerKAT) v0.12 () windows_x86_64” depends significantly on the CPU:

Core(TM) i7-4770 CPU – Run time ~ 1,300 s

Core(TM)2 Quad CPU Q6600 – Run time ~ 1,600 s

---

2. For the other host, which has an RX 580 and a relatively old CPU (Xeon(R) CPU X5570), the run time increased ~40 times after the transition from “Gamma-ray pulsar binary search #1 on GPUs v1.22 () windows_x86_64” to “Binary Radio Pulsar Search (MeerKAT) v0.12 () windows_x86_64”: from ~500 s to ~20,000 s.
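
As a rough sketch of the arithmetic behind the two observations above (the function and variable names are made up for illustration; the numbers are the approximate run times quoted in this post):

```python
def slowdown(old_seconds: float, new_seconds: float) -> float:
    """Return how many times longer the new run time is than the old one."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return new_seconds / old_seconds


# Approximate run times in seconds, as quoted in the post.
same_gpu_i7_4770 = 1300   # BRP7 v0.12 with the i7-4770
same_gpu_q6600 = 1600     # BRP7 v0.12 with the Core 2 Quad Q6600
rx580_fgrpb1 = 500        # FGRPB1 v1.22 on the RX 580 / X5570 host
rx580_brp7 = 20000        # BRP7 v0.12 on the same host

cpu_effect = slowdown(same_gpu_i7_4770, same_gpu_q6600)
app_effect = slowdown(rx580_fgrpb1, rx580_brp7)

print(f"Same GPU, older CPU: {cpu_effect:.2f}x longer")
print(f"RX 580 host, FGRPB1 to BRP7: {app_effect:.0f}x longer")
```

The first factor is modest, while the second is far larger, which is why the second host looks like a different kind of problem than a plain CPU speed difference.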

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245468591
RAC: 11483

It seems like there's something in the OpenCL code of BRP7 that the Intel GPU drivers can't easily digest (while the AMD and NVIDIA drivers can). Actually, I don't have a clue yet; I have no machine to debug this myself and not much time for it. I can say, though, that there are a few successful results from that app version, so this problem doesn't occur on all installations.

BM

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5851
Credit: 110641145768
RAC: 33809218

TRAPPIST-713 wrote:
... For the other host, which has RX 580 and a relatively old CPU (Xeon(R) CPU X5570) run time increased ~ 40 times .... From ~ 500 s to ~ 20,000 s.

I also have an old host (Q6600 CPU) with an RX 570 (a little slower than an RX 580) that was using the v0.15 app.  It has just transitioned today to using the new v0.17 app which seems a little slower than the previous version.  I've been running tasks singly to get an accurate idea of single task crunch times.  Later, I'll do much the same at 2x.  My impression so far is that there will be a small gain in output for running 2x.

At 1x, the average crunch time for v0.15 was ~1550 sec.  The new v0.17 tasks look like averaging around 1750+ sec.  It's too early to say for sure.  The motherboard/PCIe_1.x/CPU combo dates back to 2008.

A more modern machine (i3-3240 CPU) also with an RX 570 seems slightly faster.  I started it up earlier for the first time on BRP7 and it immediately got the v0.17 app.  It has completed 16 tasks (at 1x) averaging ~1600 sec.

My machines don't run CPU tasks and to save electricity, the CPUs tend to run close to idle speeds a lot of the time.  For example, the Q6600 CPU has a 1600/2400 min/max range and a lot of the time it shows very close to 1600.  You wouldn't expect that if these tasks cause any sort of bottlenecking.

I would guess there might be something wrong with the setup on your machine.  I tried to look at the CPU time/Elapsed time range for your tasks but your computers are hidden - so not possible.  Is your CPU running very heavy loads outside of crunching?  What are the typical values for those two times?  For my Q6600 machine a typical pair of CPU/Run times would be 131s/1552s.
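
A minimal sketch of that CPU time versus run time comparison, assuming both values are copied by hand from a task's result page (the function name is hypothetical; the 131 s / 1552 s pair is the one quoted above for the Q6600 machine):

```python
def cpu_fraction(cpu_seconds: float, run_seconds: float) -> float:
    """Return the share of a task's elapsed run time that was spent as CPU time."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return cpu_seconds / run_seconds


# Typical CPU time / run time pair quoted for the Q6600 + RX 570 host.
q6600_fraction = cpu_fraction(131, 1552)
print(f"CPU time is {q6600_fraction:.1%} of the run time")
```

A small fraction like this one suggests the CPU is mostly waiting on the GPU. A fraction close to 1 would be consistent with the CPU being busy for the whole run, although that alone would not say why.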

Also, the stderr output for some returned tasks might give some sort of clue.  You should take a look by clicking on the taskID link for a returned task to see if there is anything of interest there.  Are these long-running tasks even validating??

Cheers,
Gary.

TRAPPIST-713
Joined: 13 May 20
Posts: 10
Credit: 2206449268
RAC: 1518242

Gary,

This is the link to that computer:

https://einsteinathome.org/host/12832734
