Description
The JSON specification defines JSON strings as:
A string is a sequence of Unicode code points wrapped with quotation marks (U+0022).
The natural Python analog of these appears to be Python unicode strings. Confusingly, the simplejson
library sometimes returns unicode strings and sometimes byte strings, e.g.:
>>> simplejson.loads('"\\u00e6"')
u'\xe6'
>>> simplejson.loads('"ae"')
'ae'
>>> simplejson.loads(u'"ae"')
u'ae'
This makes writing correct code on top of simplejson rather hard.
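To illustrate the kind of workaround this forces on callers, here is a minimal sketch of a helper that coerces every string in a decoded result to unicode. The name `normalize_text` and the UTF-8 default are assumptions made for this example; nothing like it exists in simplejson or the Riak client.

```python
def normalize_text(value, encoding="utf-8"):
    """Recursively turn byte strings in a decoded JSON value into unicode.

    Hypothetical helper for illustration only. On Python 2, `bytes` is an
    alias for `str`, so this catches the byte strings shown above.
    """
    if isinstance(value, bytes):
        # Assumption: byte strings are encoded with `encoding` (UTF-8 by default).
        return value.decode(encoding)
    if isinstance(value, dict):
        return {
            normalize_text(k, encoding): normalize_text(v, encoding)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [normalize_text(item, encoding) for item in value]
    # Unicode strings, numbers, booleans and None pass through unchanged.
    return value
```

Every caller that consumes parser output would need to apply something like this before comparing or storing keys, which is easy to forget.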
Riak uses simplejson in two kinds of places. First, for decoding object data in the client. These call sites allow the default encoders to be overridden, so they are less of an issue. Second, in the HTTP transport for decoding JSON responses. These call sites provide no means to control the JSON parser used, and they return decoded keys and indexes in many places.
For the HTTP transport, could we do one of the following?
- Add an option for controlling which parser is used by the HTTP client.
- Use the stdlib json library, which returns consistently typed results.
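A rough sketch of how the two options could combine: the transport takes the parser as a parameter and defaults to the stdlib `json.loads`. The class `HttpTransportSketch`, its `json_loads` parameter, and the `decode_keys` method are hypothetical names for this example and are not the client's actual API. The `{"keys": [...]}` response shape is also assumed here.

```python
import json


class HttpTransportSketch(object):
    """Hypothetical stand-in for the HTTP transport, for illustration only."""

    def __init__(self, json_loads=json.loads):
        # Callers can inject another parser; the default is the stdlib one.
        self._json_loads = json_loads

    def decode_keys(self, body):
        # Assumes `body` is the JSON text of a response shaped like {"keys": [...]}.
        return self._json_loads(body)["keys"]
```

With the default parser, ASCII and non-ASCII keys come back as the same text type, and a caller who needs a different parser can pass one in, for example `HttpTransportSketch(json_loads=simplejson.loads)`.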