This repository has been archived by the owner on Nov 18, 2022. It is now read-only.

problems with "special" characters #3

Open
kalle-del-haye opened this issue Sep 1, 2015 · 9 comments

Comments

@kalle-del-haye

With every download which contains characters like é, ä or ß this script fails and I get error messages like this:

VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)

Is there any way to handle these kinds of characters correctly?

Thanks
kalle

@hugbug
Member

hugbug commented Sep 1, 2015

Are you sure you use the latest version?

@kalle-del-haye
Author

The About VideoSort text says:

PP-Script Version: 6.1.

Downloaded from GitHub one or two weeks ago.

@hugbug
Member

hugbug commented Sep 2, 2015

Please post the full log output, it should have more exception info.
What OS does it run on?
Please send me an example nzb-file to [email protected].

@kalle-del-haye
Author

The OS is Debian 7.8 with backport-kernel 3.16 and Python 2.7.3.

error   Sun Aug 30 2015 16:11:01   Post-process-script videosort/VideoSort.py for xxxxxxx failed
error   Sun Aug 30 2015 16:11:01   VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
error   Sun Aug 30 2015 16:11:01   VideoSort: Failed: roor-vsdb-1080p-subs.rar
info    Sun Aug 30 2015 16:11:01   VideoSort: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
info    Sun Aug 30 2015 16:11:01   VideoSort: matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 980, in guess_info
info    Sun Aug 30 2015 16:11:01   VideoSort: guess = guess_info(filename)
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 1065, in construct_path
info    Sun Aug 30 2015 16:11:01   VideoSort: new_path = construct_path(old_path)
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 1180, in <module>
info    Sun Aug 30 2015 16:11:01   VideoSort: Traceback (most recent call last):
info    Sun Aug 30 2015 16:11:01   Executing post-process-script videosort/VideoSort.py for xxxxxxx

The nzb-file is on the way.

@hugbug
Member

hugbug commented Sep 10, 2015

"The nzb-file is on the way."

I never got the email. Please send again.

@kalle-del-haye
Author

Sent again, using a different account and with the nzb file zipped.

@Sirvival21

Sirvival21 commented Jun 6, 2016

I have the same error with this nzb:
"Post-process-script videosort/VideoSort.py for Zoomania - Ganz schön ausgefuchst (2016) failed"

"VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)"

"VideoSort: Failed: Zoomania - 1080p - AC3.mkv
INFO Mon Jun 06 2016 12:54:18 VideoSort: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
INFO Mon Jun 06 2016 12:54:18 VideoSort: matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
INFO Mon Jun 06 2016 12:54:18 VideoSort: File "/storage/nzbget/scripts/videosort/VideoSort.py", line 980, in guess_info
INFO Mon Jun 06 2016 12:54:18 VideoSort: guess = guess_info(filename)
INFO Mon Jun 06 2016 12:54:18 VideoSort: File "/storage/nzbget/scripts/videosort/VideoSort.py", line 1065, in construct_path"

Can you tell what the problem with this file is?
Zoomania - Ganz schön ausgefuchst (2016) {{XDx3rX6JGRvRgnJpRW}}.zip

@mikenerone

mikenerone commented Nov 1, 2016

The root problem is that on modern OSes, filenames can contain Unicode characters, which are almost universally byte-encoded as UTF-8, but VideoSort tries to decode them with a bare unicode(guessfilename) call. Since no encoding is specified, Python 2 falls back to its default ASCII codec, which fails on any byte above 0x7F. The fix is to change this to unicode(guessfilename, encoding='utf-8'). This should be safe and backward-compatible because UTF-8 is a superset of ASCII.

Specifically, this line in VideoSort.py:

matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})

needs to be replaced with this (or the equivalent):

guessfilename = unicode(guessfilename, encoding='utf-8')
matcher = guessit.matcher.IterativeMatcher(guessfilename, filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
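
The failure and the proposed fix can be reproduced outside VideoSort. The following is only a sketch, not VideoSort code: to_text is a hypothetical helper name, and the sample title is taken from the report above. It is written so that it runs on Python 3 as well, using bytes.decode where Python 2 code would call unicode(). The isinstance check is an added assumption: it guards against passing a value that is already text, which the plain unicode(guessfilename, encoding='utf-8') call would reject in Python 2.

```python
# -*- coding: utf-8 -*-

def to_text(name, encoding='utf-8'):
    """Hypothetical helper (not part of VideoSort).

    Decode a byte-string filename to text; values that are already
    text are returned unchanged.
    """
    if isinstance(name, bytes):
        return name.decode(encoding)
    return name


# Filename bytes as a UTF-8 filesystem would hand them to the script.
raw = u'Zoomania - Ganz schön ausgefuchst (2016)'.encode('utf-8')

# Decoding with the ASCII codec is what a bare unicode(...) call does
# in Python 2, and it fails on the first non-ASCII byte.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding explicitly as UTF-8 succeeds.
decoded = to_text(raw)
```

Pure-ASCII names decode identically under both codecs, which is why the change should not affect downloads that already work.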

@hugbug
Member

hugbug commented Jun 22, 2017

@mikenerone:
Will it work on Windows too (as Windows doesn't use UTF-8 for file names)?
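
One possible way to hedge against this concern is to try UTF-8 first and fall back to the encoding Python reports for the filesystem (on Python 2 under Windows, sys.getfilesystemencoding() typically returns 'mbcs', the ANSI code page). This is a sketch under assumptions, not a tested answer to the question: decode_filename and its encodings parameter are hypothetical names, and it assumes guessfilename arrives as a byte string in one of the listed encodings.

```python
# -*- coding: utf-8 -*-
import sys


def decode_filename(name, encodings=None):
    """Hypothetical helper (not part of VideoSort).

    Try each candidate encoding in order and return the first result
    that decodes cleanly; as a last resort decode as UTF-8 and replace
    undecodable bytes instead of raising.
    """
    if not isinstance(name, bytes):
        return name  # already text, nothing to decode
    if encodings is None:
        # getfilesystemencoding() may return None on some Python 2 setups.
        encodings = ['utf-8', sys.getfilesystemencoding() or 'utf-8']
    for enc in encodings:
        try:
            return name.decode(enc)
        except UnicodeDecodeError:
            continue
    return name.decode('utf-8', 'replace')


# A name encoded with a Windows ANSI code page is not valid UTF-8,
# so the second candidate encoding is used.
windows_name = u'Ganz schön'.encode('cp1252')
text = decode_filename(windows_name, ['utf-8', 'cp1252'])
```

The order matters: non-ASCII text in a single-byte code page is rarely valid UTF-8, so trying UTF-8 first seldom misdecodes it, whereas a single-byte codec tried first would accept UTF-8 bytes and produce garbled text.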
