Possible bad batch of faulty Gravity Wave tasks

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770619447
RAC: 910183
Topic 229172

Got a bunch of tasks on host 12788501, all of which errored out with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-nvidia: `--tlCompartments' requires `--FreqBand' to be divisible without rest by `--dFreq', is 7e-08

Suspending GW work for the time being.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109974366987
RAC: 29681079

Hi Richard,

Thanks for the heads-up. I presume you would have messaged Bernd directly as well?

Cheers,
Gary.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770619447
RAC: 910183

No, I was just on my way out of the door when I saw them. And I doubt he would want to intervene at this time of a Friday night!

I see the same thing on host 12808716, but that's protected from further downloads by the same preference change.

cecht
Joined: 7 Mar 18
Posts: 1432
Credit: 2468181926
RAC: 726375

Richard Haselgrove wrote:

Got a bunch of tasks on host 12788501, all of which errored out with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-nvidia: `--tlCompartments' requires `--FreqBand' to be divisible without rest by `--dFreq', is 7e-08

Suspending GW work for the time being.

I'm getting the same error with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-ati

These are all with the VelaJr tasks.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109974366987
RAC: 29681079

Richard Haselgrove wrote:

... I doubt he would want to intervene at this time of a Friday night!

I've sent him a message so hopefully he might get it in the morning.

He may want to use remote access at least to stop sending faulty tasks. Otherwise nothing might happen until Monday.

Cheers,
Gary.

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 893658332
RAC: 625488

It's just the 1.03 GW GPU tasks that are failing.  I am running some 1.03 GW CPU tasks with no issues.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770619447
RAC: 910183

Yes, mine were GPU tasks too (for NVidia), so it might be another application problem - I didn't have any CPU tasks for comparison. But from the error message I posted, it seemed possible that the search parameters were at fault. All possibilities will have to be explored.
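
For anyone wondering what a check like the one in the error message involves, here is a minimal sketch in Python. It is not the application's actual code: the function names and the tolerance are hypothetical, and reading the reported "7e-08" as a leftover remainder is only an assumption. The point is that "divisible without rest" has to be tested with a tolerance in floating point, so a frequency band that is slightly off a whole multiple of the frequency step would be rejected regardless of which GPU runs the task.

```python
def divisibility_residual(freq_band: float, d_freq: float) -> float:
    """Distance of freq_band / d_freq from the nearest whole number.

    Both arguments are hypothetical stand-ins for the --FreqBand and
    --dFreq values named in the error message.
    """
    ratio = freq_band / d_freq
    return abs(ratio - round(ratio))


def is_divisible(freq_band: float, d_freq: float, tol: float = 1e-9) -> bool:
    """True if freq_band is a whole multiple of d_freq, within tol.

    tol is an assumed tolerance on the ratio, not a value taken from
    the real application.
    """
    return divisibility_residual(freq_band, d_freq) <= tol
```

With this sketch, is_divisible(0.3, 0.1) is True even though the naive 0.3 % 0.1 is not zero in binary floating point, while is_divisible(1.0, 0.3) is False because the band really is not a whole number of steps.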

Edit - I looked through some of my failed tasks on the original machine, but couldn't find any that had been reallocated to a CPU. They'd all been tried (and failed) by other users on GPUs, and then cancelled.

Link
Joined: 15 Mar 20
Posts: 97
Credit: 605105
RAC: 463

Richard Haselgrove wrote:
Edit - I looked through some of my failed tasks on the original machine, but couldn't find any that had been reallocated to a CPU. They'd all been tried (and failed) by other users on GPUs, and then cancelled.

O3MDF does not have any CPU apps, so it would be very unusual if they suddenly became O3MD1 tasks.

.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770619447
RAC: 910183

Bernd cancelled the batch (see tech news)

It's been re-started now. I got one bad task on the first attempt, but the ones being issued now seem to be working.

mikey
Joined: 22 Jan 05
Posts: 11948
Credit: 1832761197
RAC: 219860

Richard Haselgrove wrote:

Bernd cancelled the batch (see tech news)

It's been re-started now. I got one bad task on the first attempt, but the ones being issued now seem to be working. 

thanks
