Document PyUnicode_* API #46236

Open
avassalotti opened this issue Jan 27, 2008 · 9 comments
Labels
docs (Documentation in the Doc dir), type-feature (A feature request or enhancement)

Comments

@avassalotti
Member

avassalotti commented Jan 27, 2008

BPO 1944
Nosy @malemburg, @birkenfeld, @vstinner, @avassalotti, @berkerpeksag, @vadmium, @serhiy-storchaka, @shihai1991, @furkanonder
PRs
  • bpo-1944: wrap functions with macro #20011
Files
  • unicode.patch: docs for PyUnicodes C-API functions: FromFormat, FromFormatV, FromString, FromStringAndSize, Partition, RPartition and RSplit
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2008-01-27.06:26:42.019>
    labels = ['type-feature', 'docs']
    title = 'Document PyUnicode_* API'
    updated_at = <Date 2020-06-21.12:39:22.129>
    user = 'https://github.com/avassalotti'

    bugs.python.org fields:

    activity = <Date 2020-06-21.12:39:22.129>
    actor = 'shihai1991'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation']
    creation = <Date 2008-01-27.06:26:42.019>
    creator = 'alexandre.vassalotti'
    dependencies = []
    files = ['13717']
    hgrepos = []
    issue_num = 1944
    keywords = ['patch']
    message_count = 9.0
    messages = ['61734', '86116', '86121', '89100', '185552', '185725', '264571', '264575', '368582']
    nosy_count = 11.0
    nosy_names = ['lemburg', 'georg.brandl', 'vstinner', 'alexandre.vassalotti', 'donlorenzo', 'docs@python', 'berker.peksag', 'martin.panter', 'serhiy.storchaka', 'shihai1991', 'furkanonder']
    pr_nums = ['20011']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1944'
    versions = ['Python 3.5', 'Python 3.6']


    @avassalotti
    Member Author

    I was wondering whether the pointer returned by PyUnicode_AsString needs
    to be freed after use (it turns out it doesn't, since the result is
    cached). However, I found that there isn't any documentation on
    docs.python.org for the PyUnicode_AsString and
    PyUnicode_AsStringAndSize functions, although both are documented in
    the public unicodeobject.h header.
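
The ownership rule described above (the result is cached, so the caller does not free it) can be illustrated with a toy model in plain C. This is only a sketch of the general pattern, not CPython's implementation: `ToyStr`, `toy_new`, `toy_as_utf8`, and `toy_free` are hypothetical names invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string object; not CPython's real layout. */
typedef struct {
    char *text;        /* the object's own contents */
    char *utf8_cache;  /* created on first request, owned by the object */
} ToyStr;

ToyStr *toy_new(const char *text)
{
    ToyStr *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    size_t n = strlen(text) + 1;
    s->text = malloc(n);
    if (s->text == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->text, text, n);
    s->utf8_cache = NULL;
    return s;
}

/* Returns a borrowed pointer: the buffer is cached inside the object,
   so the caller must NOT free it. Returns NULL on allocation failure. */
const char *toy_as_utf8(ToyStr *s)
{
    if (s->utf8_cache == NULL) {
        size_t n = strlen(s->text) + 1;
        s->utf8_cache = malloc(n);
        if (s->utf8_cache == NULL)
            return NULL;
        /* A real implementation would encode here; the toy just copies. */
        memcpy(s->utf8_cache, s->text, n);
    }
    return s->utf8_cache;
}

/* Destroying the object releases the cached buffer as well. */
void toy_free(ToyStr *s)
{
    if (s == NULL)
        return;
    free(s->utf8_cache);
    free(s->text);
    free(s);
}
```

In this model, two calls to `toy_as_utf8` on the same object return the same pointer, and that pointer stays valid only as long as the object does. That lifetime rule is exactly what a caller cannot learn without documentation.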

    I noticed that the documentation for several other Unicode functions is
    also missing. At a quick glance, I see:

    PyUnicode_Resize
    PyUnicode_InternImmortal
    PyUnicode_GetDefaultEncoding
    PyUnicode_SetDefaultEncoding
    PyUnicode_BuildEncodingMap
    PyUnicode_FromFormatV
    PyUnicode_*UTF7*
    PyUnicode_AsEncodedObject
    PyUnicode_FromOrdinal
    PyUnicode_DecodeFSDefault
    PyUnicode_DecodeFSDefaultAndSize

    It would probably be a good idea to polish up the documentation for
    PyUnicode as much as possible for Python 3000, since extension
    developers will certainly need to refer to it a lot during the
    transition from 2.x.

    @avassalotti avassalotti added easy docs Documentation in the Doc dir labels Jan 27, 2008
    @tiran tiran added the type-feature A feature request or enhancement label Jan 27, 2008
    @donlorenzo
    Mannequin

    donlorenzo mannequin commented Apr 18, 2009

    In addition to the above-mentioned functions, I found these to be
    undocumented:

    PyUnicode_DecodeUTF7
    PyUnicode_DecodeUTF7Stateful
    PyUnicode_EncodeDecimal
    PyUnicode_EncodeUTF7
    PyUnicode_FromFormat
    PyUnicode_FromString
    PyUnicode_FromStringAndSize
    PyUnicode_GetMax
    PyUnicode_Partition
    PyUnicode_RPartition
    PyUnicode_RSplit

    From the original list the following functions seem to have been removed:

    PyUnicode_InternImmortal
    PyUnicode_DecodeFSDefault
    PyUnicode_DecodeFSDefaultAndSize

    I will try to put together a patch for some of these over the weekend.

    @donlorenzo
    Mannequin

    donlorenzo mannequin commented Apr 18, 2009

    OK, here is my attempt at a patch for at least some of the undocumented
    functions. The patch documents the following functions:

    PyUnicode_FromFormat
    PyUnicode_FromFormatV
    PyUnicode_FromString
    PyUnicode_FromStringAndSize
    PyUnicode_Partition
    PyUnicode_RPartition
    PyUnicode_RSplit

    Please review this patch thoroughly, since I didn't really dig into the
    source to find out what the functions do, but rather just copied the old
    PyString documentation or derived it from the docs for the Python API.

    @avassalotti
    Member Author

    The patch looks alright. I don't like the documentation for
    PyUnicode_FromFormatV, however. Here's my attempt to document it:

    .. cfunction:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)

       Equivalent to the function :cfunc:`PyUnicode_FromFormat`, except that
       it takes a va_list instead of a variable number of arguments.
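
The relationship between a variadic function and its `va_list` counterpart can be sketched with standard C alone. This is an analogy built on `vsnprintf`, not the CPython implementation; `format_v`, `format_message`, and `log_error` are hypothetical names chosen for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* va_list variant: does the actual formatting work.
   Plays the role that PyUnicode_FromFormatV plays in the proposed docs. */
int format_v(char *buf, size_t size, const char *format, va_list vargs)
{
    return vsnprintf(buf, size, format, vargs);
}

/* Variadic variant: collects its arguments into a va_list and delegates.
   Plays the role of PyUnicode_FromFormat. */
int format_message(char *buf, size_t size, const char *format, ...)
{
    va_list vargs;
    int written;

    va_start(vargs, format);
    written = format_v(buf, size, format, vargs);
    va_end(vargs);
    return written;
}

/* A caller's own variadic wrapper cannot forward "..." directly, so it
   reuses the va_list variant. That is the usual reason to expose one. */
int log_error(char *buf, size_t size, const char *format, ...)
{
    va_list vargs;
    int written;

    va_start(vargs, format);
    written = format_v(buf, size, format, vargs);
    va_end(vargs);
    return written;
}
```

Calling `format_message(buf, sizeof buf, "%s has %d items", "list", 3)` fills `buf` exactly as a direct call to the `va_list` variant would, which is the sense of "equivalent" in the proposed wording.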

    @birkenfeld birkenfeld self-assigned this Apr 3, 2010
    @BreamoreBoy BreamoreBoy mannequin assigned docspython and unassigned birkenfeld Sep 20, 2010
    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Mar 30, 2013

    Is it worth applying the patch given the complete rewrite of unicode for 3.3 via PEP-393?

    @malemburg
    Member

    On 30.03.2013 13:09, Mark Lawrence wrote:

    > Is it worth applying the patch given the complete rewrite of unicode for 3.3 via PEP-393?

    PEP-393 only changed the way Unicode is internally stored.
    The Unicode API is mostly unaffected by this change.

    @berkerpeksag
    Member

    Remaining undocumented functions:

    From this issue:

    PyUnicode_RSplit
    PyUnicode_Partition
    PyUnicode_RPartition

    From bpo-10435:

    PyUnicode_IsIdentifier
    PyUnicode_Append
    PyUnicode_AppendAndDel
    PyUnicode_GetDefaultEncoding
    PyUnicode_FromOrdinal
    PyUnicode_Resize
    PyUnicode_GetMax
    PyUnicode_InternImmortal
    PyUnicode_CHECK_INTERNED

    From bpo-18688:

    Py_UNICODE_REPLACEMENT_CHARACTER
    PyUnicodeIter_Type
    PyUnicode_AsDecodedObject
    PyUnicode_AsDecodedUnicode
    PyUnicode_AsEncodedObject
    PyUnicode_AsEncodedUnicode
    PyUnicode_BuildEncodingMap

    @berkerpeksag berkerpeksag removed the easy label Apr 30, 2016
    @berkerpeksag berkerpeksag changed the title from "Documentation for PyUnicode_AsString (et al.) missing." to "Document PyUnicode_* API" Apr 30, 2016
    @serhiy-storchaka
    Member

    PyUnicode_DecodeCodePageStateful

    Following functions likely should be wrapped with "#ifndef Py_LIMITED_API":

    _PyUnicode_ClearStaticStrings
    _PyUnicode_EQ
    _PyUnicode_FromId

    @vstinner
    Member

    """
    Following functions likely should be wrapped with "#ifndef Py_LIMITED_API":

    _PyUnicode_ClearStaticStrings
    _PyUnicode_EQ
    _PyUnicode_FromId
    """

    This has been the case since at least Python 3.7. Extract from Python 3.7's Include/unicodeobject.h:

    #ifndef Py_LIMITED_API
    /* Return an interned Unicode object for an Identifier; may fail if there is no memory.*/
    PyAPI_FUNC(PyObject*) _PyUnicode_FromId(_Py_Identifier*);
    /* Clear all static strings. */
    PyAPI_FUNC(void) _PyUnicode_ClearStaticStrings(void);
    
    /* Fast equality check when the inputs are known to be exact unicode types
       and where the hash values are equal (i.e. a very probable match) */
    PyAPI_FUNC(int) _PyUnicode_EQ(PyObject *, PyObject *);
    #endif /* !Py_LIMITED_API */
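
The guard pattern in the extract above can be reproduced in a small single-file sketch. The macro `MYLIB_LIMITED_API` and the functions `mylib_equal` and `mylib_private_fast_equal` are hypothetical names that stand in for `Py_LIMITED_API` and the private helpers; none of this is CPython code.

```c
#include <stddef.h>
#include <string.h>

/* --- what would normally live in the library's public header --- */

/* Stable function: always declared. */
int mylib_equal(const char *a, const char *b);

#ifndef MYLIB_LIMITED_API
/* Private helper: its declaration disappears for any client that
   defines MYLIB_LIMITED_API before including the header. */
int mylib_private_fast_equal(const char *a, const char *b, size_t len);
#endif /* !MYLIB_LIMITED_API */

/* --- what would normally live in the library's .c file --- */

/* Compares exactly len bytes; assumes both inputs are at least that long. */
int mylib_private_fast_equal(const char *a, const char *b, size_t len)
{
    return memcmp(a, b, len) == 0;
}

int mylib_equal(const char *a, const char *b)
{
    size_t len = strlen(a);
    return len == strlen(b) && mylib_private_fast_equal(a, b, len);
}
```

The guard only hides the declaration: the helper still exists in the compiled library, but a client building against the limited subset gets a compile-time diagnostic instead of silently depending on it.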

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    koyuki7w added a commit to koyuki7w/cpython that referenced this issue Dec 14, 2024
    …eful`
    
    `PyUnicode_DecodeCodePageStateful` is in the limited C-API,
    and the document does not exist.