wget2 errors on site cloning #91

Open
fauno opened this issue Nov 11, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@fauno
Collaborator

fauno commented Nov 11, 2024

I'm getting a 500 error while cloning https://sutty.nl, even though running the same command locally with wget2 2.1.0 works correctly. The API response comes first, followed by the local run:

{
  "statusCode": 500,
  "code": "8",
  "error": "Internal Server Error",
  "message": "Command failed: wget2   --random-wait   --compression=identity,gzip,br   --user-agent=\"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0\"   --mirror   --page-requisites   --convert-links   --adjust-extension   --continue   --no-host-directories   --directory-prefix=sutty.nl   \"https://sutty.nl\"\nMissing host/domain in URI 'https:'\nCannot resolve URI 'https:'\ntoASCII(�y ni hablar del dinero que ganan <a href=\"https) failed (-200): string encoding error\ntoASCII(tml\">consejos piratas para la apostasía de redes sociales<) failed (-203): punycode encoded data will be too large\ntoASCII(go de convivencia está basado en los “<a href=\"https) failed (-203): punycode encoded data will be too large\ntoASCII(ión es almacenada por sólo un par de proveedores de servicios, o servidores,<a href=\"https) failed (-203): punycode encoded data will be too large\ntoASCII(da personal, de nuestra presencia online o de nuestros ingresos económicos a <a href=\"https) failed (-203): punycode encoded data will be too large\ntoASCII(s en orden de prioridad una lista de ideas que teníamos y otras que aportaron les participantes.<) failed (-203): punycode encoded data will be too large\ntoASCII(as, stickers en la compu y de fondo, la exposición del cc tierra violeta\" ) failed (-203): punycode encoded data will be too large\ntoASCII(amos un sitio específico que actúa como intermediario entre la página que queremos compartir y la instancia que aloja nuestre usuarie.<) failed (-203): punycode encoded data will be too large\ntoASCII(� que al final seguimos los <a href=\"https) failed (-200): string encoding error\n"
}
>_ wget2   --random-wait   --compression=identity,gzip,br   --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0"   --mirror   --page-requisites   --convert-links   --adjust-extension   --continue   --no-host-directories   --directory-prefix=sutty.nl   "https://sutty.nl"
31 files             100% [=====================================================================================>]    6.82M    1.28MB/s
70 files             100% [=====================================================================================>]   10.02M    2.15MB/s
34 files             100% [=====================================================================================>]   55.64M    5.40MB/s
36 files             100% [=====================================================================================>]   13.60M    2.60MB/s
73 files             100% [=====================================================================================>]    8.24M    1.18MB/s
                          [Files: 244  Bytes: 94.34M [3.68MB/s] Redirects: 2  Todo: 0  Errors: 19                ]
@RangerMauve
Contributor

Super weird. Unit tests for the clone API are passing

@RangerMauve
Contributor

Interesting, it seems to be an issue with one of the URLs in the sutty site producing a file name that is too long once it is combined with the rest of the filesystem path.

@RangerMauve
Contributor

Try running the wget2 command in a deeply nested directory. On the DP server it's running in /home/press/.local/share/distributed-press-nodejs/sites/sites

@RangerMauve
Contributor

I think this has to do with the filesystem having a limit on file name length: https://superuser.com/a/790264
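One way to test the file-name-length hypothesis is to measure a candidate output path in bytes rather than characters, since accented characters take more than one byte in UTF-8. The sketch below is only an illustration: the function name `overlong_parts` is made up here, and the limits of 255 bytes per path component and 4096 bytes per full path are assumed typical Linux values, not values confirmed for the DP server.

```python
from typing import List

NAME_MAX = 255   # assumed per-component limit in bytes (typical on Linux filesystems)
PATH_MAX = 4096  # assumed full-path limit in bytes, including the trailing NUL


def overlong_parts(path: str, name_max: int = NAME_MAX, path_max: int = PATH_MAX) -> List[str]:
    """Return a description of every length limit that `path` would exceed."""
    problems = []
    total = len(path.encode("utf-8"))
    # PATH_MAX counts the terminating NUL, so a path of exactly path_max bytes is too long.
    if total >= path_max:
        problems.append(f"full path is {total} bytes (limit {path_max})")
    for part in path.split("/"):
        size = len(part.encode("utf-8"))
        if size > name_max:
            problems.append(f"component is {size} bytes (limit {name_max}): {part[:40]}")
    return problems


if __name__ == "__main__":
    prefix = "/home/press/.local/share/distributed-press-nodejs/sites/sites"
    # Hypothetical file name: 200 accented characters are 400 bytes in UTF-8.
    long_name = "é" * 200 + ".html"
    print(overlong_parts(f"{prefix}/sutty.nl/index.html"))
    print(overlong_parts(f"{prefix}/sutty.nl/{long_name}"))
```

If every path wget2 would write passes this check, the length limit is probably not the cause and the encoding of the page content is the next thing to look at.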

@RangerMauve
Contributor

@fauno try now. I pushed a potential fix that sets wget2's local-encoding option to UTF-8.
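The actual change is not shown in this thread. As a rough sketch of what such a fix could look like, the helper below rebuilds the argument list from the failing command and adds wget2's `--local-encoding` option. The function name `build_wget2_args`, its parameters, and the position of the new flag are assumptions for illustration, not the project's code.

```python
import shlex
from typing import List, Optional

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
    "Gecko/20100101 Firefox/119.0"
)


def build_wget2_args(url: str, prefix: str, local_encoding: Optional[str] = "UTF-8") -> List[str]:
    """Build the wget2 argument list from the issue, optionally with --local-encoding."""
    args = [
        "wget2",
        "--random-wait",
        "--compression=identity,gzip,br",
        f"--user-agent={USER_AGENT}",
        "--mirror",
        "--page-requisites",
        "--convert-links",
        "--adjust-extension",
        "--continue",
        "--no-host-directories",
        f"--directory-prefix={prefix}",
    ]
    if local_encoding:
        # Assumed placement: the thread only says the option was set to UTF-8.
        args.append(f"--local-encoding={local_encoding}")
    args.append(url)
    return args


if __name__ == "__main__":
    # Print a shell-quoted command line for comparison with the one in the error message.
    print(shlex.join(build_wget2_args("https://sutty.nl", "sutty.nl")))
```

Passing the arguments as a list rather than one shell string also avoids quoting problems with the user agent, although whether the project invokes wget2 that way is not stated here.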
