Document members with a larger range of unicode characters (e.g. σ) #13084

Open
satr-cowi opened this issue Oct 30, 2024 · 4 comments

@satr-cowi

Is your feature request related to a problem? Please describe.

I have functions and classes with the Unicode character σ in their names. I am using autosummary to generate my docs, but the .rst files for these members are not being produced.

I can see there are regular expressions that restrict names to ASCII (in a few places, I think: the autosummary parsing and the toctree generation), e.g.

autosummary_item_re = re.compile(r'^\s+(~?[_a-zA-Z][a-zA-Z0-9_.]*)\s*.*?')
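
To illustrate the problem, here is a minimal sketch that applies the pattern quoted above to a few indented entry lines. The names `mypkg`, `stress`, and `σ_max` are hypothetical and chosen only for the demonstration. How Sphinx uses the match afterwards is not shown here.

```python
import re

# Pattern quoted in the issue: only ASCII letters, digits, '_' and '.' are accepted.
autosummary_item_re = re.compile(r'^\s+(~?[_a-zA-Z][a-zA-Z0-9_.]*)\s*.*?')

# Hypothetical autosummary entries (indented, as in the directive body).
ascii_line = "   mypkg.stress"
sigma_line = "   mypkg.σ_max"
leading_sigma_line = "   σ_max"

ascii_match = autosummary_item_re.match(ascii_line)
sigma_match = autosummary_item_re.match(sigma_line)
leading_sigma_match = autosummary_item_re.match(leading_sigma_line)

print(ascii_match.group(1))   # the full ASCII name is captured
print(sigma_match.group(1))   # the capture stops just before 'σ'
print(leading_sigma_match)    # no match when the name starts with 'σ'
```

In this sketch the ASCII name is captured whole, the name containing σ is cut off at the first non-ASCII character, and the name starting with σ does not match at all.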

Describe the solution you'd like
A larger range of allowable characters in methods/functions/attributes etc.

Perhaps we could disallow problematic characters rather than only allow a limited range.
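
As a rough sketch of what a wider range might look like for Python names only, the pattern below swaps the ASCII classes for Unicode-aware ones and then checks each dotted part with `str.isidentifier()`. Both `unicode_item_re` and `extract_name` are hypothetical names introduced for illustration; this is not Sphinx's implementation, and other domains may need different rules.

```python
import re

# Hypothetical Unicode-aware variant of the quoted pattern.
# [^\W\d] means "a word character that is not a digit"; \w is Unicode-aware for str patterns.
unicode_item_re = re.compile(r'^\s+(~?[^\W\d][\w.]*)\s*.*?')


def extract_name(line):
    """Return the dotted name on an indented line, or None if it is not a valid Python name."""
    m = unicode_item_re.match(line)
    if m is None:
        return None
    name = m.group(1)
    # \w may accept characters that Python does not allow in identifiers,
    # so validate every dotted component with the language's own rule.
    parts = name.lstrip('~').split('.')
    if all(part.isidentifier() for part in parts):
        return name
    return None


print(extract_name("   mypkg.σ_max"))
print(extract_name("   1abc"))
```

Delegating the final check to `str.isidentifier()` keeps the regex permissive while leaving the decision about which characters are allowed to Python itself.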

Describe alternatives you've considered

If there is a good reason to hardcode this in (e.g. different OS having issues), perhaps there could be a configuration option to try and allow extra characters for your own build.

If the decision is to keep the current set, the documentation could be improved to state the allowable characters and to note that members with any other characters in their names will not be documented.

satr-cowi added the type:proposal (a feature suggestion) label on Oct 30, 2024
@jayaddison
Contributor

> A larger range of allowable characters in methods/functions/attributes etc.

Thank you for the suggestion @satr-cowi. Could you confirm what programming languages you're referring to? I'm guessing Python, based on the reference to autosummary - but perhaps also some of the other supported programming languages of Sphinx? (note that the acceptable symbols for names in each programming language may differ, which is why I ask)

@satr-cowi
Author

Ah yes, I was only referring to Python (and hadn't really thought about the others as I've never used Sphinx with them!).

Maybe config options could be a good way to deal with this? I'm unsure exactly what format would be best, but so far I couldn't find an easy way around it without a deep dive into all the regular expressions in the code.

@jayaddison
Contributor

Thanks @satr-cowi - yep, sometimes configuration options can help here; however, too much configurability can also create problems, so I'd note that attribute inheritance (in this case, from subclasses of Domain) might be another way to achieve the same.

What would be really nice would be to perform some kind of comparative analysis and produce a per-programming-language table of support for Unicode characters in various API objects. I'm not yet familiar enough to know what the common-denominator objects are across all domains (functions? classes? ...), but some kind of support status per language might (in my opinion) shed some light on how to design and implement this.

@jayaddison
Contributor

(I would like to volunteer for that, but at the moment I think I need to step back from a few threads/tasks - so please don't hold your breath)
