Fatal S3.list-error #90

Open
siebrand opened this issue Feb 28, 2024 · 5 comments

@siebrand

s3p version: 3.5.2

With the latest version of s3p I'm running into an error that often causes my command to not complete successfully. Full details in attached S3.list-error.log.

Command used:

export AWS_REGION=eu-north-1
time npx s3p sync --bucket my-bucket --to-bucket my-target-bucket --overwrite --storage-class INTELLIGENT_TIERING --large-copy-concurrency 20

Error:

S3.list-error:
  bucket:     :my-bucket
  prefix:     undefined
  startAfter: :56Rh
  limit:      1000
  error:      Error:
    class: class Error
    stack:
      TimeoutError: socket hang up
          at connResetException (node:internal/errors:705:14)
          at TLSSocket.socketCloseListener (node:_http_client:467:25)
          at TLSSocket.emit (node:events:525:35)
          at node:net:301:12
          at TCP.done (node:_tls_wrap:588:7)

Initially this happened most often near the end of the job, with the log as attached, but I've also seen it 5-6 minutes in instead of after 22-23 minutes, which is the expected completion time for this job. I'm planning to repeat with s3p 3.4.10, but I haven't had the time yet.

@siebrand
Author

siebrand commented Feb 28, 2024

Just tried the same command with s3p 3.4.10 and this didn't have issues. It was considerably slower, though: 38m 15s for 3.4.10 vs. ±23 minutes for 3.5.2. But I'll take the version with fewer issues for now...

@shanebdavis
Member

Hmm. The difference is probably the AWS SDK v3, which sped things up over v2. I'm not sure how to handle "socket hang up" though. It's possible that adding a retry might help.
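
To make the retry idea concrete, here is a minimal sketch of a retry-with-backoff wrapper. It is not s3p's actual code: `withRetry`, `isTransient`, and `RetryOptions` are hypothetical names, and the set of errors treated as transient is an assumption based only on the "TimeoutError: socket hang up" trace above.

```typescript
// Hypothetical helper, not part of s3p: retries an async operation when it
// fails with what looks like a transient network error.
type RetryOptions = {
  retries: number;     // maximum number of retries after the first attempt
  baseDelayMs: number; // delay before the first retry; doubles on each retry
};

// Assumption: these names, codes, and messages mark an error as safe to retry.
function isTransient(err: unknown): boolean {
  const e = err as { name?: string; code?: string; message?: string } | null;
  const text = `${e?.name ?? ""} ${e?.code ?? ""} ${e?.message ?? ""}`;
  return /socket hang up|ECONNRESET|ETIMEDOUT|TimeoutError/.test(text);
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await operation();
    } catch (err) {
      // Give up once the retry budget is spent or the error is not transient.
      if (attempt >= options.retries || !isTransient(err)) throw err;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delayMs = options.baseDelayMs * 2 ** attempt;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      attempt++;
    }
  }
}
```

A caller would wrap the single list request, for example `withRetry(() => listPage(startAfter), { retries: 5, baseDelayMs: 200 })`, where `listPage` is a placeholder for whatever function issues one list call. Retrying a read-only list request is safe to repeat; whether the same holds for copy operations would need separate consideration.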

@shanebdavis
Member

The 1000 timeout is probably 1000 seconds, i.e. 16.7 minutes. But that isn't consistent with you seeing it after 5 minutes. Is the error any different when it fails that early?

@siebrand
Author

I don't recall the error having been different. After 10 tries or so failing at different points in the run, I gave up, and kept using 3.4.latest without any issue.

@siebrand
Author

siebrand commented May 6, 2024

Any way to make progress on this? What do you need exactly? This is blocking my adoption of 3.5.x, and I would really love to start using it because of the fixes it does have compared to 3.4.x.
