Can't get an ack for a couple of uploads

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543153886
RAC: 6388162
Topic 224095

Is anybody else having uploads that can't get an ack from the servers?

I have several tasks, a couple on each host, that can't get an ack from the server after their uploads have finished at 100%.

They are on their 7th retry with large backoffs.  In the meantime, every other finished task is getting uploaded and acknowledged normally.

http_xfer_debug is not that helpful; it only shows transient errors for the stuck tasks.

Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:45 AM PST |  | Project communication failed: attempting access to reference site
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_500.0_0_0.0_33517050_1_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_0: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1804 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2584 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 3091 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2996 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1561 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 201 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1499 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | Internet access OK - project servers may be temporarily down.
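
For anyone who wants to see which uploads are backed off the longest, here is a minimal Python sketch that pulls the backoff times out of event-log text like the lines above. The names parse_backoffs and SAMPLE are made up for illustration, and the line format is assumed to match the "Backing off HH:MM:SS on upload of ..." lines quoted in this post; other client versions may word it differently.

```python
import re
from datetime import timedelta

# Assumed format, taken from the log lines quoted in this post:
#   "... | Einstein@Home | Backing off 04:57:41 on upload of <file>"
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoffs(log_text):
    """Return {file_name: timedelta} with the latest backoff seen per file."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            backoffs[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


# Three of the backoff lines from the log above, used as sample input.
SAMPLE = """\
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
"""

if __name__ == "__main__":
    # Print the longest backoffs first.
    ranked = sorted(parse_backoffs(SAMPLE).items(), key=lambda kv: kv[1], reverse=True)
    for name, delay in ranked:
        print(f"{delay}  {name}")
```

Sorting by backoff puts the uploads that will wait longest at the top, which makes it easy to spot the ones that have already been retried many times.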
 

Weird.  As I said, all other uploads for Einstein and my other projects are fine.  Anybody else having issues?

 

Harri Liljeroos
Joined: 10 Dec 05
Posts: 3609
Credit: 2901475666
RAC: 1032836

Yes, I have a few of those as well on my two hosts (2 and 4 uploads hanging). They seem to go through eventually, as it is only the latest finished tasks that are hanging. As I am typing, one host succeeded in the file transfer but the other got two more.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023294931
RAC: 1830267

On first look this morning all three of my machines showed a small number of tasks in uploading status.  The biggest backoff was three hours with retry counts as high as 5.  But the small number of tasks indicated that, as you specified, these were just a few of the tasks.  Then, as I was typing, most cleared, and with a forced retry, all cleared.

Perhaps the actual problem at the server end is now fixed?

 

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543153886
RAC: 6388162

Forced retries just increased the backoffs.  This morning, it looks like all the uploads have cleared on both hosts.

 

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543153886
RAC: 6388162

Great!! . . . . another project with stalled-out, backed-off uploads.

 

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6911321637
RAC: 21657954

... that's "standard procedure" over the weekend ...

It usually happens Saturday night/Sunday morning, depending on where you are situated.

Have a nice Sunday!

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1695
Credit: 1042244373
RAC: 1399590

Same here for the last several hours.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6911321637
RAC: 21657954

There is a new post under "technical news" from Bernd about this "problem".

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543153886
RAC: 6388162

Get used to long backoffs and upload issues here. 

GPUGrid is out of work and I know many people run Einstein as their failover, 0 resource backup project when GPUGrid has issues.

Pages of stalled and backed off Einstein GR uploads on all my hosts.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3681
Credit: 33821107393
RAC: 37791767

no problems if you run Gravitational Wave tasks

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 5586
Credit: 7673176219
RAC: 1746766

Ian&Steve C. wrote:

no problems if you run Gravitational-Wave tasks

I am running GW CPU tasks.  Are you saying that the GW GPU tasks don't have an upload issue either?

I have hundreds of GR GPU tasks unable to upload.

Edit:

1/3/2021 7:29:36 AM | Einstein@Home | [error] Error reported by file upload server: File uploads are temporarily disabled.
1/3/2021 7:29:36 AM | Einstein@Home | Temporarily failed upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1: transient upload error
1/3/2021 7:29:36 AM | Einstein@Home | Backing off 00:28:27 on upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1
 

Tom M

edit>> ps. Just created a 0 resource profile of short tasks for GPUs in PrimeGrid.  My GPUs are busy again.

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)
