Strings and bytes in Cython
Giovanni Torres edited this page Aug 19, 2017
·
1 revision
In Python 2, strings are bytes:
>>> type("a")
<type 'str'>
>>> type(b'a')
<type 'str'>
>>> type(u'a')
<type 'unicode'>
>>> type("a".encode("UTF-8"))
<type 'str'>
>>> type("a".decode("UTF-8"))
<type 'unicode'>
In Python 3, strings are Unicode:
>>> type("a")
<class 'str'>
>>> type(b'a')
<class 'bytes'>
>>> type(u'a')
<class 'str'>
>>> type("a".encode("UTF-8"))
<class 'bytes'>
>>> type("a".decode("UTF-8"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
A function to decode a C character pointer:
cdef unicode tounicode(char* s):
    if s == NULL:
        return None
    else:
        return s.decode("UTF-8", "replace")
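The decoding step in the function above can be tried in plain Python without compiling anything. The following is only a sketch: `tounicode_py` is a hypothetical pure-Python stand-in, with `None` playing the role of a `NULL` pointer and a `bytes` object playing the role of the `char*` buffer. It shows what the `"replace"` error handler does with bytes that are not valid UTF-8.

```python
def tounicode_py(s):
    """Hypothetical pure-Python analogue of the Cython tounicode() helper.

    `s` is either None (standing in for a NULL char*) or a bytes object
    (standing in for the C string contents).
    """
    if s is None:
        return None
    # "replace" substitutes U+FFFD for undecodable bytes instead of raising
    # UnicodeDecodeError.
    return s.decode("UTF-8", "replace")


print(repr(tounicode_py(b"debug")))          # valid ASCII/UTF-8 input
print(repr(tounicode_py(b"caf\xc3\xa9")))    # valid multi-byte UTF-8 input
print(repr(tounicode_py(b"\xff")))           # invalid byte becomes U+FFFD
print(repr(tounicode_py(None)))              # NULL stand-in
```

Using `"replace"` means a malformed string coming back from C never raises during decoding, at the cost of silently substituting the replacement character.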
In Python 2, the c_string is decoded to type unicode.
>>> type(c_string.decode("UTF-8"))
<type 'unicode'>
In Python 3, the c_string is decoded to type str, which is Unicode.
>>> type(c_string.decode("UTF-8"))
<class 'str'>
When calling c_function(item):
Python 2: item should be a str (which is bytes in Python 2) and needs no conversion. Calling .encode("UTF-8") on an ASCII-only str still returns a str/bytes object, which can be passed to C.
Python 3: item is a str (which is Unicode in Python 3) and must be encoded. .encode("UTF-8") converts it to bytes, which can then be passed to C.
In short:
- .encode() when passing to C (converts to bytes - py2 string is bytes)
- .decode() when receiving from C (converts to unicode - py3 string is unicode)
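The two rules above can be wrapped in a pair of small helpers. This is a minimal sketch, not part of PySlurm: the names `to_bytes` and `to_text` are hypothetical, and the code assumes UTF-8 is the right encoding for the C side. Checking `isinstance(item, bytes)` first lets the same helper accept a Python 2 str (already bytes) and a Python 3 bytes object unchanged.

```python
def to_bytes(item, encoding="UTF-8"):
    """Hypothetical helper: normalize a value before passing it to C."""
    if item is None:
        return None
    if isinstance(item, bytes):
        # Already bytes (this includes a Python 2 str): pass through as-is.
        return item
    # Text (Python 3 str, Python 2 unicode): encode to bytes.
    return item.encode(encoding)


def to_text(raw, encoding="UTF-8"):
    """Hypothetical helper: normalize a value received from C."""
    if raw is None:
        return None
    if isinstance(raw, bytes):
        # Bytes from C: decode to text, replacing undecodable bytes.
        return raw.decode(encoding, "replace")
    # Already text: return unchanged.
    return raw


name = to_bytes("debug")      # text in, bytes out
print(type(name))
print(to_text(name))          # bytes in, text out
```

Passing through values that are already in the target type avoids the implicit ASCII round trip that Python 2 performs when `.encode()` is called on a byte string.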