Problem with folder name in non-western characters #28
Comments
All the code files involved are Plex's, not my agent's. |
Thanks for your feedback. Below is the full log.
And com.plexapp.system.log
|
still think it is Plex related |
Still the same in the latest Plex Media Server. A non-Western character in the directory name (it is fine in the filename) causes
I am not sure what exactly the serialization is, though. I suspect it is somehow related to metadata.id or some GUID (these are built from the directory name), so I tried to debug it but failed. :) Would you please test video |
Reproduced
|
Reproduced again. The show title has characters/formatting that cause issues
I am afraid I am not good enough at Python to solve this issue... |
I have reproduced this error with any video from the following channel As you point out, it looks like it is the Plex files that can't handle them, rather than anything in the bundle.
|
I wonder if I could manually replace it in the logs with a normal apostrophe, and whether the error would disappear too... |
I would guess so. Once I renamed the directory and filename to remove the ’, parsing succeeded. |
At line 42, in the function def filterInvalidXMLChars(string), try to add: Please report whether it fixes the issue, and I will add the code to master |
No luck with my case. |
You do use the "Absolute Series Scanner" as the scanner? It is normally meant to filter those out... https://stackoverflow.com/questions/2477452/%C3%A2%E2%82%AC-showing-on-page-instead-of So it is an encoding issue and the character is changed somehow
At line 42, in the function def filterInvalidXMLChars(string), try to add the first line: (the others are development notes)
|
I'm seeing this too: |
I am sorry, but I have no idea how to fix this. Ideas welcome. |
My workaround was simply to drop any special (non-ASCII) characters when I name my files. Because the agent matches only on the youtubeID, it seemed to parse with no problems as long as the filename didn't contain anything bizarre. Personally I would mark this as "won't fix"; people can just name their files nicely/differently :p |
I did this on another project once. There is a little Python trick to convert Unicode text to its near-ASCII equivalent. Could you use something like that and pass everything through it? |
Do you mean something like lines 40-43
And called at line 333 |
Ahh, found what I was doing in my other code.. foo.encode('ascii', 'ignore') which I think just drops the bad chars. |
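For anyone wanting to try the `encode('ascii', 'ignore')` idea, here is a minimal sketch, written for Python 3 and not taken from the agent's code. The helper name `to_near_ascii` is made up for illustration. It first applies NFKD normalization from the standard `unicodedata` module, so that characters with an ASCII-compatible decomposition keep something readable, and only then drops whatever is left.

```python
import unicodedata


def to_near_ascii(text):
    """Return an ASCII-only approximation of text (hypothetical helper)."""
    # NFKD splits accented letters into base letter + combining mark and
    # expands compatibility characters such as the trademark sign.
    decomposed = unicodedata.normalize("NFKD", text)
    # 'ignore' silently drops every character that still has no ASCII form.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for sample in ("Café", "It’s", "Brand™", "디에이엔지 [UCCaYq4ZNPHP9v_0vL2xjrWw]"):
        print(repr(to_near_ascii(sample)))
```

Note the trade-off: the curly apostrophe is simply removed, and a Korean channel name disappears entirely, leaving only the bracketed ID. That is fine when matching relies on the ID alone, but it is lossy for display titles.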
I found a temporary workaround for this issue: removing the folder name from the metadata id, as in https://github.com/wiserain/YouTube-Agent.bundle/commit/43f4828fca7d987fa63f6719ce7df11daf3c91d7 For the example in my original thread, metadata id with a folder name I don't know what the proper fix is, but I hope this helps in finding it. |
Knowing that the weird character in the id causes the issue is key to solving it, thank you...
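To illustrate the idea of keeping the folder name out of the metadata id, here is a rough sketch. It is not the code from the linked commit. The function name `metadata_id_from_folder` and the regular expression are assumptions, based only on the folder layouts that appear in this thread: `<channel name> [<id>]`, with an optional `youtube`-style prefix inside the brackets.

```python
import re

# Assumed layout: "<channel name> [<id>]" or "<channel name> [youtube2-<id>]".
# Only the bracketed ID is captured; the channel name is never used.
ID_IN_BRACKETS = re.compile(r"\[(?:youtube\d*-)?([A-Za-z0-9_-]+)\]\s*$")


def metadata_id_from_folder(folder_name):
    """Return the bracketed ID from a folder name, or None if absent."""
    match = ID_IN_BRACKETS.search(folder_name)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(metadata_id_from_folder("디에이엔지 [UCCaYq4ZNPHP9v_0vL2xjrWw]"))
    print(metadata_id_from_folder("test [youtube2-UCCaYq4ZNPHP9v_0vL2xjrWw]"))
```

Because the resulting id is pure ASCII whatever the channel is called, the non-Western characters never reach whatever Plex does with the id afterwards.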
|
@ZeroQI Unable to use |
Just found an option in youtube-dl that resolves this issue: --restrict-filenames. It makes the filenames slightly messy, but matching is fixed. |
Thanks for the update. |
A user solved this type of issue on HAMA, so I am noting it here for future reference: ZeroQI/Absolute-Series-Scanner#335
No need for
|
After full rewrite to reduce length and handle foreign codepage better
Coded to sort all issues and back to
Your folder name seemingly is not Unicode; however, I tried it and it passes the tests: https://apps.timwhitlock.info/unicode/inspect?s=%EB%94%94%EC%97%90%EC%9D%B4%EC%97%94%EC%A7%80+%5Byoutube2-UCCaYq4ZNPHP9v_0vL2xjrWw%5D#block-UAC00 The code that crashes is the Plex framework. It happens after the last agent output, so I cannot fix it. I did try... |
Is it possible to handle the crash on the agent side and output a clean error message such as "non-unicode characters detected in file/path", so that people can at least see the cause themselves? Obviously those of us in the know purge offending characters now, but others may not. |
I have a function called sanitize_path() which returns unicode (most compatible). After tests,
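As a rough illustration of that request, a pre-check along the following lines could run before the path is handed on. This is only a sketch under assumptions: `describe_non_ascii` is a hypothetical helper, not part of the agent or of Plex, and it cannot intercept a crash that happens inside the Plex framework itself. It also reports non-ASCII rather than "non-unicode" characters, since that is what can actually be detected in the string.

```python
def describe_non_ascii(path):
    """Return a warning naming the non-ASCII characters in path, or None."""
    # Collect each distinct character outside the 7-bit ASCII range.
    offenders = sorted({ch for ch in path if ord(ch) > 127})
    if not offenders:
        return None
    listed = ", ".join("U+%04X" % ord(ch) for ch in offenders)
    return "non-ASCII characters detected in file/path: %s" % listed


if __name__ == "__main__":
    for candidate in ("test [UCCaYq4ZNPHP9v_0vL2xjrWw]", "It’s a show"):
        warning = describe_non_ascii(candidate)
        if warning:
            print(warning)
```

The returned string could be written to the agent log so that users see the likely cause next to the later framework error.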
|
Regarding the issue in #79 with the "TM" symbol, can the offending symbol be removed from the title before it is sent to Plex? |
Removing specific special characters is not really a viable solution. It would be much cleaner if you removed the special characters yourself via |
Fair enough. It was something I was using but thought I wouldn't need any more after the recent changes. I'll go back to using it. |
I think i stopped using it because channels like |
I downloaded one YouTube video via youtube-dl using exactly the same format as in Readme.md; it includes a Korean channel name and a Korean video name, like below.
The log said
I found a related issue reporting a limitation in folder names, #14; however, I don't think it is closely related, because in my case the name doesn't contain any special characters except non-Western ones.
After I changed the folder name from
디에이엔지 [UCCaYq4ZNPHP9v_0vL2xjrWw]
to
test [UCCaYq4ZNPHP9v_0vL2xjrWw]
by replacing the channel name with "test", it successfully updated the meta info. Is this expected not to work with CJK characters? Or am I missing something else?
Thanks for your work.