Can't get an ack for a couple of uploads

Keith Myers
Joined: 11 Feb 11
Posts: 1,001
Credit: 1,041,739,043
RAC: 3,334,286
Topic 224095

Anybody else having issues with getting some uploads unable to get an ack from the servers?

I have several tasks, a couple on each host, that can't get an ack from the server even though their uploads have reached 100%.

They are on their 7th retry with large backoffs.  In the meantime, every other finished task is uploaded and acknowledged normally.

http_xfer_debug is not that helpful; it only shows transient errors for the stuck tasks.

Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:42 AM PST | Einstein@Home | Started upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:45 AM PST |  | Project communication failed: attempting access to reference site
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_500.0_0_0.0_33517050_1_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_0: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Temporarily failed upload of LATeah1066L30_508.0_0_0.0_1497258_0_1: transient HTTP error
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1804 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2584 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 3091 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 2996 bytes
Sun 29 Nov 2020 12:11:48 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1561 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 201 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | [http_xfer] [ID#0] HTTP: wrote 1499 bytes
Sun 29 Nov 2020 12:11:49 AM PST |  | Internet access OK - project servers may be temporarily down.
 

Weird. As I said, all other uploads for Einstein and my other projects are fine.  Anybody else having issues?
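
For anyone who wants to see at a glance which uploads are stuck and for how long, here is a minimal Python sketch that pulls the "Backing off" lines out of a pasted event log like the one above. The function name `parse_backoffs` and the `SAMPLE` text are just illustrative; it assumes the log keeps the exact "Backing off HH:MM:SS on upload of <file>" wording shown in this thread and is not part of BOINC itself.

```python
import re
from datetime import timedelta

# Matches the message part of lines such as:
#   "... | Einstein@Home | Backing off 04:57:41 on upload of <file>"
# Assumption: the wording is exactly as it appears in the log pasted above.
BACKOFF_RE = re.compile(
    r"Backing off (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) on upload of (?P<file>\S+)"
)


def parse_backoffs(log_text):
    """Return {file_name: timedelta}; a later line for the same file wins."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            backoffs[match["file"]] = timedelta(
                hours=int(match["h"]),
                minutes=int(match["m"]),
                seconds=int(match["s"]),
            )
    return backoffs


# Three lines copied from the log above, used only as sample input.
SAMPLE = """\
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 04:57:41 on upload of LATeah1066L30_500.0_0_0.0_33517050_1_1
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:52 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_0
Sun 29 Nov 2020 12:11:45 AM PST | Einstein@Home | Backing off 00:06:35 on upload of LATeah1066L30_508.0_0_0.0_1497258_0_1
"""

if __name__ == "__main__":
    # Longest backoff first, so the worst-stuck uploads are at the top.
    ranked = sorted(parse_backoffs(SAMPLE).items(), key=lambda kv: kv[1], reverse=True)
    for name, delay in ranked:
        print(f"{delay}  {name}")
```

Sorting by the backoff duration puts the longest-delayed files first, which makes it easy to compare against what the Transfers tab shows.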

 

Harri Liljeroos
Joined: 10 Dec 05
Posts: 704
Credit: 926,060,976
RAC: 1,690,133

Yes, I have a few of those as well on my two hosts (2 and 4 uploads hanging). They seem to go through eventually, as it is only the most recently finished tasks that are hanging. As I am typing, one host succeeded in its file transfers but the other got two more stuck.

archae86
Joined: 6 Dec 05
Posts: 2,933
Credit: 3,729,073,105
RAC: 4,919,859

On first look this morning, all three of my machines showed a small number of tasks in uploading status.  The biggest backoff was three hours, with retry counts as high as 5.  As you described, only a few of the tasks were affected.  Then, as I was typing, most cleared, and with a forced retry, all cleared.

Perhaps the actual problem at the server end is now fixed?

 

Keith Myers
Joined: 11 Feb 11
Posts: 1,001
Credit: 1,041,739,043
RAC: 3,334,286

Forced retries just increased the backoffs.  This morning, it looks like all the uploads have cleared on both hosts.

 

Keith Myers
Joined: 11 Feb 11
Posts: 1,001
Credit: 1,041,739,043
RAC: 3,334,286

Great!! ... another project with stalled, backed-off uploads.

 

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 150
Credit: 2,985,119,586
RAC: 13,705,933

... that's "standard procedure" over the weekend ...

It usually happens Saturday night/Sunday morning, depending on where you are located.

Have a nice Sunday!

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,317
Credit: 431,600,107
RAC: 101,062

Same here for the last several hours.

 

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 150
Credit: 2,985,119,586
RAC: 13,705,933

There is a new post under "technical news" from Bernd about this "problem".

Keith Myers
Joined: 11 Feb 11
Posts: 1,001
Credit: 1,041,739,043
RAC: 3,334,286

Get used to long backoffs and upload issues here. 

GPUGrid is out of work, and I know many people run Einstein as their failover, zero-resource-share backup project when GPUGrid has issues.

Pages of stalled and backed off Einstein GR uploads on all my hosts.

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 328
Credit: 780,013,252
RAC: 4,162,139

no problems if you run Gravitational Wave tasks

_____________________________________________


Tom M
Joined: 2 Feb 06
Posts: 390
Credit: 894,884,373
RAC: 5,806,287

Ian&Steve C. wrote:

no problems if you run Gravitational-Wave tasks

I am running GW CPU tasks.  Are you saying that the GW GPU tasks don't have an upload issue either?

I have hundreds of GR GPU tasks unable to upload.

Edit:

1/3/2021 7:29:36 AM | Einstein@Home | [error] Error reported by file upload server: File uploads are temporarily disabled.
1/3/2021 7:29:36 AM | Einstein@Home | Temporarily failed upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1: transient upload error
1/3/2021 7:29:36 AM | Einstein@Home | Backing off 00:28:27 on upload of LATeah2065L68aj_484.0_0_0.0_24879274_1_1
 

Tom M

Edit: P.S. I just created a zero-resource-share profile of short GPU tasks in PrimeGrid.  My GPUs are busy again.

 

 

Proud Member of the OFA (Old Farts Assoc.)
You are entitled to your own Opinion but not your own Data. (Senator and Prof. Pat Moynihan).
