Possible bad batch of faulty Gravity Wave tasks

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957363012
RAC: 714118
Topic 229172

Got a bunch of tasks on host 12788501, all of which errored out with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-nvidia: `--tlCompartments' requires `--FreqBand' to be divisible without rest by `--dFreq', is 7e-08

Suspending GW work for the time being.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117600219755
RAC: 35215082

Hi Richard,

Thanks for the heads-up. I presume you would have messaged Bernd directly as well?

Cheers,
Gary.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957363012
RAC: 714118

No, I was just on my way out of the door when I saw them. And I doubt he would want to intervene at this time of a Friday night!

I see the same thing on host 12808716, but that's protected from further downloads by the same preference change.

cecht
Joined: 7 Mar 18
Posts: 1533
Credit: 2904408837
RAC: 2187878

Richard Haselgrove wrote:

Got a bunch of tasks on host 12788501, all of which errored out with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-nvidia: `--tlCompartments' requires `--FreqBand' to be divisible without rest by `--dFreq', is 7e-08

Suspending GW work for the time being.

I'm getting the same error with:

einstein_O3MDF_1.00_x86_64-pc-linux-gnu__GW-opencl-ati

These are all with the VelaJr tasks.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117600219755
RAC: 35215082

Richard Haselgrove wrote:

... I doubt he would want to intervene at this time of a Friday night!

I've sent him a message so hopefully he might get it in the morning.

He may want to use remote access, at least to stop faulty tasks being sent. Otherwise nothing might happen until Monday.

Cheers,
Gary.

Ron Kosinski
Joined: 23 Mar 05
Posts: 57
Credit: 1062448837
RAC: 885526

It's just the 1.03 GW GPU tasks that are failing.  I am running some 1.03 GW CPU tasks with no issues.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957363012
RAC: 714118

Yes, mine were GPU tasks too (for NVidia), so it might be another application problem - I didn't have any CPU tasks for comparison. But from the error message I posted, it seemed possible that the search parameters were at fault. All possibilities will have to be explored.
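For anyone wondering how a check like the one in that error message can trip, here is a minimal Python sketch of a divisibility test on floating-point values. This is not the application's actual code: the function names, the tolerance and the sample numbers are all assumptions made up for illustration. It only shows that the "rest" of a floating-point division has to be compared against a tolerance, and that a rest of 7e-08 is well outside rounding noise for the hypothetical values used here.

```python
import math


def rest_to_nearest_multiple(freq_band: float, d_freq: float) -> float:
    """Distance from freq_band to the nearest integer multiple of d_freq.

    math.remainder() rounds the quotient to the NEAREST integer, so a band
    that is a hair below an exact multiple still gives a tiny rest.
    """
    return abs(math.remainder(freq_band, d_freq))


def is_divisible(freq_band: float, d_freq: float, rel_tol: float = 1e-9) -> bool:
    """True if freq_band is a multiple of d_freq within a relative tolerance.

    rel_tol is a hypothetical value, not one taken from the real application.
    """
    return rest_to_nearest_multiple(freq_band, d_freq) <= rel_tol * abs(d_freq)


if __name__ == "__main__":
    # Hypothetical parameters, not the values from the failing tasks.
    print(is_divisible(0.5, 0.25))           # exactly representable multiple
    print(is_divisible(0.3, 0.1))            # multiple up to rounding noise
    print(is_divisible(0.1 + 7e-08, 0.01))   # off by about 7e-08
    # Truncating remainder: shows why fmod alone is a poor test here.
    print(math.fmod(0.3, 0.1))
```

With a check of this kind, a band that misses a multiple of the step by about 7e-08 would be rejected, which would point at the search parameters in the workunits rather than at the GPU code itself. That is only one possible reading of the message.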

Edit - I looked through some of my failed tasks on the original machine, but couldn't find any that had been reallocated to a CPU. They'd all been tried (and failed) by other users on GPUs, and then cancelled.

Link
Joined: 15 Mar 20
Posts: 121
Credit: 10110059
RAC: 54829

Richard Haselgrove wrote:
Edit - I looked through some of my failed tasks on the original machine, but couldn't find any that had been reallocated to a CPU. They'd all been tried (and failed) by other users on GPUs, and then cancelled.

O3MDF does not have any CPU apps, so it would be very unusual if they suddenly became O3MD1 tasks.

.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2957363012
RAC: 714118

Bernd cancelled the batch (see tech news)

It's been re-started now. I got one bad task on the first attempt, but the ones being issued now seem to be working.

mikey
Joined: 22 Jan 05
Posts: 12684
Credit: 1839089911
RAC: 3810

Richard Haselgrove wrote:

Bernd cancelled the batch (see tech news)

It's been re-started now. I got one bad task on the first attempt, but the ones being issued now seem to be working. 

thanks
