problems with "special" characters #3

kalle-del-haye · 2015-09-01T06:55:55Z

For every download that contains characters like é, ä or ß, this script fails and I get error messages like this:

VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)

Is there any way I can handle these kinds of characters correctly?

Thanks
kalle


hugbug · 2015-09-01T07:13:47Z

Are you sure you use the latest version?

kalle-del-haye · 2015-09-01T12:49:40Z

The About VideoSort text says:

PP-Script Version: 6.1.

Downloaded from GitHub one or two weeks ago.

hugbug · 2015-09-02T09:36:26Z

Please post the full log output; it should have more exception info.
What OS does it run on?
Please send me an example nzb-file to [email protected].

kalle-del-haye · 2015-09-02T17:44:03Z

The OS is Debian 7.8 with backport-kernel 3.16 and Python 2.7.3.

error   Sun Aug 30 2015 16:11:01   Post-process-script videosort/VideoSort.py for xxxxxxx failed
error   Sun Aug 30 2015 16:11:01   VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
error   Sun Aug 30 2015 16:11:01   VideoSort: Failed: roor-vsdb-1080p-subs.rar
info    Sun Aug 30 2015 16:11:01   VideoSort: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
info    Sun Aug 30 2015 16:11:01   VideoSort: matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 980, in guess_info
info    Sun Aug 30 2015 16:11:01   VideoSort: guess = guess_info(filename)
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 1065, in construct_path
info    Sun Aug 30 2015 16:11:01   VideoSort: new_path = construct_path(old_path)
info    Sun Aug 30 2015 16:11:01   VideoSort: File "/usr/share/nzbget/scripts/videosort/VideoSort.py", line 1180, in <module>
info    Sun Aug 30 2015 16:11:01   VideoSort: Traceback (most recent call last):
info    Sun Aug 30 2015 16:11:01   Executing post-process-script videosort/VideoSort.py for xxxxxxx

The nzb-file is on the way.

hugbug · 2015-09-10T08:33:44Z

> The nzb-file is on the way.

I never got the email. Please send again.

kalle-del-haye · 2015-09-10T08:53:43Z

Sent again, using a different account and with the nzb file zipped.

Sirvival21 · 2016-06-06T11:27:17Z

I have the same error with this nzb:
"Post-process-script videosort/VideoSort.py for Zoomania - Ganz schön ausgefuchst (2016) failed"

"VideoSort: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)"

"VideoSort: Failed: Zoomania - 1080p - AC3.mkv
INFO Mon Jun 06 2016 12:54:18 VideoSort: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
INFO Mon Jun 06 2016 12:54:18 VideoSort: matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
INFO Mon Jun 06 2016 12:54:18 VideoSort: File "/storage/nzbget/scripts/videosort/VideoSort.py", line 980, in guess_info
INFO Mon Jun 06 2016 12:54:18 VideoSort: guess = guess_info(filename)
INFO Mon Jun 06 2016 12:54:18 VideoSort: File "/storage/nzbget/scripts/videosort/VideoSort.py", line 1065, in construct_path"

Can you tell what the problem with this file is?
Zoomania - Ganz schön ausgefuchst (2016) {{XDx3rX6JGRvRgnJpRW}}.zip

mikenerone · 2016-11-01T05:50:00Z

The root problem is that on modern OSes filenames can typically contain Unicode characters, almost universally byte-encoded as UTF-8, but VideoSort tries to decode them with a bare unicode(guessfilename) call. As no encoding is specified, Python 2 falls back to its default ASCII codec, which fails on any byte above 0x7F. The change that's needed is making this unicode(guessfilename, encoding='utf-8'). This should be safe and backward-compatible because UTF-8 is a superset of ASCII.

Specifically, this line in VideoSort.py:

matcher = guessit.matcher.IterativeMatcher(unicode(guessfilename), filetype='autodetect', options={'nolanguage': True, 'nocountry': True})

needs to be replaced with this (or the equivalent):

guessfilename = unicode(guessfilename, encoding='utf-8')
matcher = guessit.matcher.IterativeMatcher(guessfilename, filetype='autodetect', options={'nolanguage': True, 'nocountry': True})
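
To make the difference concrete, here is a minimal sketch that reproduces the failure outside VideoSort. The byte string is an illustrative stand-in, not the actual value of guessfilename from the logs, and bytes.decode is used in place of unicode() so the snippet runs on both Python 2 and Python 3. In Python 2, unicode(raw) with no encoding behaves like raw.decode('ascii').

```python
# -*- coding: utf-8 -*-
# Illustrative UTF-8 encoded name (assumption: not the real guessfilename).
raw = b'Zoomania - Ganz sch\xc3\xb6n ausgefuchst'

# Decoding as ASCII is what unicode(raw) does by default in Python 2.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    # 0xc3 is the first byte of the two-byte UTF-8 sequence for "ö".
    ascii_error = exc

# Decoding as UTF-8 succeeds and yields the intended text.
text = raw.decode('utf-8')
```

One caveat with the proposed replacement: in Python 2, calling unicode(value, encoding='utf-8') on a value that is already a unicode object raises a TypeError, so the fix assumes guessfilename is always a byte string at that point.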

hugbug · 2017-06-22T19:27:44Z

@mikenerone:
Will it work on Windows too (as Windows doesn't use UTF-8 for file names)?
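
One possible way to hedge against the Windows case is to try UTF-8 first and fall back to another encoding only if that fails. The sketch below is an assumption, not something confirmed in this thread: to_text is a hypothetical helper name, and whether the names VideoSort receives on Windows are in the filesystem encoding has not been verified here.

```python
import sys


def to_text(name, fallback=None):
    """Hypothetical helper: return `name` as text, tolerating non-UTF-8 bytes."""
    if not isinstance(name, bytes):
        # Already text (unicode in Python 2, str in Python 3).
        return name
    try:
        return name.decode('utf-8')
    except UnicodeDecodeError:
        # Assumption: the filesystem encoding is a reasonable second guess.
        encoding = fallback or sys.getfilesystemencoding() or 'latin-1'
        # 'replace' substitutes undecodable bytes instead of raising.
        return name.decode(encoding, 'replace')
```

With such a helper, the call site would pass to_text(guessfilename) to the matcher instead of unicode(guessfilename). Bytes that decode under neither encoding end up as replacement characters rather than aborting the script.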

hugbug added the help wanted label Aug 11, 2016

hugbug mentioned this issue Apr 21, 2017

UnicodeDecodeError: 'ascii' codec can't decode byte #16

Closed

hugbug mentioned this issue Jul 22, 2018

Fix error when path has unicode chars #40

Closed
