Oh my. So I once thought (early 2022, when I built this) that I had issues with some non-ASCII file names.
So now I added a test that puts 🤗 into a name and voilà, it worked! I thought it was all good until I performed a scan on some FTP space other than the test area.
Turns out you can set the encoding on the ftplib FTP object, and these emojis and stuff work no problem!
But when you have an é in a name:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xeb in position 868: invalid continuation byte
But all good! These can be remedied by changing the encoding to latin-1. Yes? :D
Yes. But then emojis no longer work! 😫
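
To make the conflict concrete, here is a tiny sketch that needs no FTP server at all. It only assumes that one server hands back latin-1 bytes for é and another hands back UTF-8 bytes for 🤗; the helper `try_decode` is made up for this illustration.

```python
# Bytes as two different servers might send them (assumption for illustration).
accent_name = "é"
emoji_name = "🤗"
latin1_bytes = accent_name.encode("latin-1")  # a single byte, 0xe9
utf8_bytes = emoji_name.encode("utf-8")       # four bytes


def try_decode(raw, encoding):
    """Hypothetical helper: return the decoded text, or None if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# utf-8 handles the emoji but chokes on the latin-1 accent byte.
print(try_decode(utf8_bytes, "utf-8"))
print(try_decode(latin1_bytes, "utf-8"))

# latin-1 handles the accent and never raises, but turns the emoji into mojibake.
print(try_decode(latin1_bytes, "latin-1"))
print(repr(try_decode(utf8_bytes, "latin-1")))
```

So neither single encoding setting covers both kinds of names.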
Honestly I don't know how we would solve this completely now.
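
One idea that might be worth trying (an untested sketch, not something this project does today): since latin-1 maps every byte to exactly one character, a name decoded as latin-1 can always be turned back into the original bytes. So we could keep the connection on latin-1 and only *try* to reinterpret each name as UTF-8 for display. The function name `recover_name` is hypothetical.

```python
def recover_name(raw_name):
    """Try to reinterpret a latin-1 decoded name as UTF-8.

    Falls back to the name as-is when its bytes are not valid UTF-8
    (or when it cannot be encoded as latin-1 in the first place).
    """
    try:
        return raw_name.encode("latin-1").decode("utf-8")
    except UnicodeError:
        return raw_name


# What a latin-1 connection would show for a UTF-8 name on the server:
mojibake = "tëstfilé🤗.txt".encode("utf-8").decode("latin-1")
print(recover_name(mojibake))

# A genuinely latin-1 name is not valid UTF-8, so it stays untouched:
print(recover_name("é"))
```

The catch: the name sent back to the server would still have to be the raw latin-1 decoded one so the bytes stay identical, and a latin-1 name that happens to also be valid UTF-8 would be misread. So this narrows the problem rather than solving it.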
A 1st step: Less directory listing!
I already modified our mkdirs function so that it no longer lists the parent dir of each part of the path to check whether it already exists.
Now it just fires the mkd and catches ftplib.error_perm with error code 550. This code is not specific to "already exists", but close enough. And it's faster as well!
Listing for each part of the path is rather expensive.
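
Roughly, the listing-free approach looks like the sketch below. This is not the project's actual mkdirs, just a minimal version of the idea; the signature and the path handling are assumptions. It relies only on `FTP.mkd` and `ftplib.error_perm` from the standard library.

```python
import ftplib


def mkdirs(ftp, path):
    """Create every level of `path` without listing anything first.

    A 550 reply is swallowed because it most likely means the directory
    already exists. Any other permanent error is re-raised.
    """
    prefix = "/" if path.startswith("/") else ""
    current = ""
    for part in (p for p in path.split("/") if p):
        current = f"{current}/{part}" if current else prefix + part
        try:
            ftp.mkd(current)
        except ftplib.error_perm as err:
            # The exception text starts with the server's reply code.
            if not str(err).startswith("550"):
                raise
```

Since 550 also covers things like missing permissions, a real failure only surfaces later, for example when the upload into that directory fails.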
So under normal circumstances we no longer do directory listing at all on the FTP! 🙌
But we can! Maybe we should drop the option entirely. But then we'd still need a unittest that verifies that update still works with weird file names.
For kicks I just created a file named tëstfilé🤗.txt to trip up ANY encoding :D
When uploading it with WinSCP it turned into tëstfilé??.txt on the server, and when copying it back, into tëstfilé%3F%3F.txt
So not even THEY have it solved!
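
A guess at where those two names come from (not verified against WinSCP itself): the emoji is two UTF-16 code units, which would explain two question marks if each unit gets replaced separately, and `%3F` is simply the percent-encoded form of `?`. The snippet only checks those two facts.

```python
from urllib.parse import quote

emoji = "🤗"

# Outside the Basic Multilingual Plane, so UTF-16 needs a surrogate pair.
utf16_units = len(emoji.encode("utf-16-le")) // 2

# Replacing each code unit with "?" would give the name seen on the server.
server_name = "tëstfilé" + "?" * utf16_units + ".txt"

# "?" is not in quote()'s default safe set, so it gets percent-encoded.
escaped = quote("?")

print(utf16_units, server_name, escaped)
```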
Related Stack Overflow question: https://stackoverflow.com/q/77089678/469322