Can't get an ack for a couple of uploads

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17541289108
RAC: 6354673
Topic 224095

Is anybody else seeing uploads that can't get an ack from the servers?

I have several tasks, a couple on each host, that can't get an ack from the server after their uploads have finished at 100%.

They are on their 7th retry with large backoffs.  In the meantime, every other finished task is getting uploaded and acknowledged normally.

http_xfer_debug is not that helpful; it only shows transient errors for the stuck tasks.

Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:45 AM PST |  | Project communication failed: attempting access to reference site
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_500.0_0_0.0_33517050_1_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_0: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1804 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2584 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 3091 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2996 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1561 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 201 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1499 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | Internet access OK - project servers may be temporarily down.
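
For anyone who would rather not scroll through the event log to find the stuck files, here is a minimal Python sketch that pulls the most recent backoff per file out of lines formatted like the ones above. The names `stuck_uploads` and `SAMPLE` are made up for illustration, and the regex assumes the "Backing off HH:MM:SS on upload of <file>" wording seen in this log; other client versions may word it differently.

```python
import re
from datetime import timedelta

# Assumed message format, copied from the event log lines in this post.
BACKOFF_RE = re.compile(
    r"Backing off (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) on upload of (?P<file>\S+)"
)


def stuck_uploads(log_text):
    """Return {file_name: timedelta} holding the last backoff seen per file."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            backoffs[match["file"]] = timedelta(
                hours=int(match["h"]),
                minutes=int(match["m"]),
                seconds=int(match["s"]),
            )
    return backoffs


# Three lines taken from the log above.
SAMPLE = """\
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
"""

if __name__ == "__main__":
    # Longest backoff first, so the worst offenders are at the top.
    found = stuck_uploads(SAMPLE)
    for name in sorted(found, key=found.get, reverse=True):
        print(found[name], name)
```

To use it on a real log, read the client's event log text into a string and pass that to `stuck_uploads` instead of `SAMPLE`.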
 

Weird. As I said, all other uploads for Einstein and my other projects are fine.  Anybody else having issues?

 

Harri Liljeroos
Joined: 10 Dec 05
Posts: 3607
Credit: 2901249022
RAC: 1034803

Yes, I have a few of those as well on my two hosts (2 & 4 uploads hanging). They seem to go through eventually, as it is only the latest finished tasks that are hanging. As I am typing, one host succeeded in the file transfer but the other got two more.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7022924931
RAC: 1835916

On first look this morning, all three of my machines showed a small number of tasks in uploading status.  The biggest backoff was three hours, with retry counts as high as 5.  The small count confirmed that, as you described, only a few of the tasks were affected.  Then, as I was typing, most cleared, and with a forced retry, all cleared.

Perhaps the actual problem at the server end is now fixed?

 

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17541289108
RAC: 6354673

Forced retries just increased the backoffs.  This morning, it looks like all the uploads have cleared on both hosts.

 

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17541289108
RAC: 6354673

Great!! . . . . another project with stalled-out, backed-off uploads.

 

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6909711637
RAC: 22020554

... that's "standard procedure" over the weekend ...

It usually happens Saturday night/Sunday morning, depending on where you are situated.

Have a nice Sunday!

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1695
Credit: 1041967720
RAC: 1405507

Same here for the last several hours.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6909711637
RAC: 22020554

There is a new post under "technical news" from Bernd about this "problem".

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17541289108
RAC: 6354673

Get used to long backoffs and upload issues here. 

GPUGrid is out of work, and I know many people run Einstein as their failover, 0-resource backup project when GPUGrid has issues.

Pages of stalled and backed off Einstein GR uploads on all my hosts.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3681
Credit: 33812231309
RAC: 37812599

no problems if you run Gravitational Wave tasks

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 5585
Credit: 7672532904
RAC: 1728769

Ian&Steve C. wrote:

no problems if you run Gravitational-Wave tasks

I am running GW CPU tasks.  Are you saying that the GW GPU tasks don't have an upload issue either?

I have hundreds of GR GPU tasks unable to upload.

Edit:

1/3/2021 7:29:36 AM | Einstein@Home | [error] Error reported by file upload server: File uploads are temporarily disabled.
1/3/2021 7:29:36 AM | Einstein@Home | Temporarily failed upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1: transient upload error
1/3/2021 7:29:36 AM | Einstein@Home | Backing off 00:28:27 on upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1
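
These lines differ from the ones earlier in the thread: here the upload server itself reports a reason, while the earlier failures were only "transient HTTP error". As a rough sketch, assuming the two message formats shown in this thread, a few lines of Python can tally the failure reasons and collect any server-side messages. The names `summarize_failures` and `SAMPLE` are hypothetical, not part of BOINC.

```python
import re
from collections import Counter

# Both patterns are assumptions based on the log lines quoted in this thread.
FAILED_RE = re.compile(r"Temporarily failed upload of (\S+): (.+)$")
SERVER_RE = re.compile(r"Error reported by file upload server: (.+)$")


def summarize_failures(log_text):
    """Return (Counter of failure reasons, set of upload-server messages)."""
    reasons = Counter()
    server_messages = set()
    for line in log_text.splitlines():
        line = line.strip()
        failed = FAILED_RE.search(line)
        if failed:
            reasons[failed.group(2)] += 1
        server = SERVER_RE.search(line)
        if server:
            server_messages.add(server.group(1))
    return reasons, server_messages


# Lines copied from the two log excerpts in this thread.
SAMPLE = """\
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_500.0_0_0.0_33517050_1_1: transient HTTP error
1/3/2021 7:29:36 AM | Einstein@Home | [error] Error reported by file upload server: File uploads are temporarily disabled.
1/3/2021 7:29:36 AM | Einstein@Home | Temporarily failed upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1: transient upload error
"""

if __name__ == "__main__":
    reasons, server_messages = summarize_failures(SAMPLE)
    for reason, count in reasons.most_common():
        print(count, reason)
    for message in sorted(server_messages):
        print("server said:", message)
```

If the set of server messages is non-empty, the project has said something explicit about uploads, so forcing retries on the client is unlikely to help until that changes.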
 

Tom M

Edit: P.S. I just created a 0-resource profile of short GPU tasks in PrimeGrid.  My GPUs are busy again.

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)
