problems with encoding of binary data #2

Open
jpriebe opened this issue Sep 3, 2013 · 1 comment

jpriebe commented Sep 3, 2013

Thanks for contributing this module; it looks very promising. I'm using it with Python 2.6.5, and I keep running into errors like this:

20:57:58 T:2960314368 ERROR: File "/Applications/XBMC.app/Contents/Resources/XBMC/addons/webinterface.qxbmc/zipstream2.py", line 175, in FileHeader
20:57:58 T:2960314368 ERROR: return header + self.filename + extra
20:57:58 T:2960314368 ERROR: UnicodeDecodeError: 'ascii' codec can't decode byte 0xec in position 10: ordinal not in range(128)

(I've made some modifications to zipstream so it can zip an arbitrary list of files rather than an entire directory tree, so my line numbers won't match yours. I also got this error with the unmodified code, though.)

It seems that Python is trying to interpret the binary header as if it were character data.

There are a few places in the code where this happens. I've been able to code around some of them by making multiple calls to yield instead of joining the binary strings together, but I'm not sure that's the right thing to do.

It also only happens with certain input files; some files don't trigger the problem.


jpriebe commented Sep 3, 2013

So I changed this line at the end of FileHeader() from this:

return header + self.filename + extra

to this:

return header + self.filename.encode('utf-8') + extra

and the error went away.

I think what's happening is that Python sees header and extra as byte strings and the filename as a unicode string. Then I guess you can't concatenate them if anything in header or extra is not ASCII-compatible?
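
That guess is consistent with how Python 2 treats mixed concatenation: when a byte string is added to a unicode string, the byte string is first decoded implicitly with the ASCII codec, which fails on any byte of 0x80 or above. It would also explain why only some files trigger the error, since the outcome depends on the bytes that happen to land in the header. Below is a minimal sketch of that mechanism, written so it runs on Python 3 by spelling out the implicit decode explicitly. The header bytes, the filename, and the helper names `py2_style_concat` and `safe_concat` are all made up for illustration; they are not part of zipstream.

```python
# Hypothetical stand-in for a binary ZIP header: ten ASCII-safe bytes
# followed by 0xec, the kind of byte a CRC or size field may contain.
header = b"PK\x03\x04" + b"\x00" * 6 + b"\xec\x11"
filename = u"movie.mkv"  # a unicode (text) filename, assumed for the example


def py2_style_concat(byte_part, text_part):
    """Emulate Python 2's `str + unicode`: decode the bytes as ASCII first."""
    return byte_part.decode("ascii") + text_part


def safe_concat(byte_part, text_part, encoding="utf-8"):
    """Encode the text explicitly so everything stays a byte string."""
    return byte_part + text_part.encode(encoding)


try:
    py2_style_concat(header, filename)
    error_message = None
except UnicodeDecodeError as exc:
    # Same failure as in the log: the ASCII codec rejects the 0xec byte.
    error_message = str(exc)

# Encoding the filename up front avoids the implicit decode entirely.
result = safe_concat(header, filename)
```

Encoding the filename before concatenating, as in the one-line change above, keeps every operand a byte string, so no implicit decode takes place. Whether UTF-8 is the right encoding for names inside the archive is a separate question that this sketch does not settle.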

I've also seen problems at the end of archive_footer(), where the return ''.join(data) call throws similar errors. In that case, I just return data directly, and in the iter() function I loop through all the items in the data array and yield them.

I'm not really a Python guy, so this is a bit beyond my limited knowledge of Python strings. Maybe you guys have some ideas.
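
One possible alternative to yielding the pieces separately is to normalize every piece to a byte string before joining. This is only a sketch under the assumption that data can hold a mix of byte strings and unicode strings; `to_bytes`, `join_chunks`, `iter_chunks`, and the sample values are hypothetical and do not come from zipstream.

```python
def to_bytes(chunk, encoding="utf-8"):
    """Return chunk unchanged if it is already bytes, otherwise encode it."""
    if isinstance(chunk, bytes):
        return chunk
    return chunk.encode(encoding)


def join_chunks(chunks, encoding="utf-8"):
    """Join mixed byte/text pieces into a single byte string."""
    return b"".join(to_bytes(chunk, encoding) for chunk in chunks)


def iter_chunks(chunks, encoding="utf-8"):
    """Yield the pieces one at a time, each as a byte string."""
    for chunk in chunks:
        yield to_bytes(chunk, encoding)


# Made-up footer pieces: binary record, text filename, more binary data.
data = [b"PK\x05\x06", u"caf\u00e9.txt", b"\xec\x00"]

joined = join_chunks(data)
streamed = b"".join(iter_chunks(data))
```

Both forms produce the same bytes. Yielding piece by piece only sidesteps the error if each piece is already a byte string; a unicode piece would still reach the consumer as text unless it is encoded first.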
