COMBINING Unicode categories and/or Unicode blocks? #30649

Closed
PO1970 opened this issue Aug 12, 2022 · 1 comment
Labels
dotnet-fundamentals/svc Pri1 High priority, do before Pri2 and Pri3 product-question Product usage related questions [org][type][category]

Comments

@PO1970

PO1970 commented Aug 12, 2022

Type of issue

Other (describe below)

Description

Not really feedback, just a QUESTION about the \p{...} construct:

https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#unicode-category-or-unicode-block-p

The example at
https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions#non-word-character-w
i.e.

…[^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}]…

...makes me think that every single category or block needs its own \p{...} construct, i.e. it is NOT possible to combine Unicode categories and/or Unicode blocks this way:

…\p{LtLoNdLm}…

or

…\p{Lt|Lo|Nd|Lm}…

Confirm?

I mean, I did try both via regex101.com and neither worked ... but there may be some other trick that is not mentioned on the above page.
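
For anyone who wants to check this against the .NET engine itself rather than an online tester, here is a minimal C# sketch. It assumes that an invalid pattern makes the `Regex` constructor throw an `ArgumentException` (or a subclass); the class and method names (`UnicodeClassDemo`, `IsValidPattern`) are made up for illustration and are not from the docs page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeClassDemo
{
    // Hypothetical helper: returns true if the .NET regex engine accepts the pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            _ = new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Assumption: parse errors surface as ArgumentException or a subclass.
            return false;
        }
    }

    public static void Main()
    {
        // One \p{...} per category, combined inside a positive character group.
        string combined = @"[\p{Lt}\p{Lo}\p{Nd}\p{Lm}]";
        Console.WriteLine(IsValidPattern(combined));       // True
        Console.WriteLine(Regex.IsMatch("7", combined));   // True:  '7' is in Nd
        Console.WriteLine(Regex.IsMatch("a", combined));   // False: 'a' is in Ll

        // The two combined spellings from the question are rejected as patterns.
        Console.WriteLine(IsValidPattern(@"\p{LtLoNdLm}"));     // False
        Console.WriteLine(IsValidPattern(@"\p{Lt|Lo|Nd|Lm}"));  // False
    }
}
```

So, as far as this sketch shows, each category keeps its own \p{...}, and the combining happens in the surrounding [...] group, as in the docs example.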

Page URL

https://learn.microsoft.com/en-us/dotnet/standard/base-types/character-classes-in-regular-expressions

Content source URL

https://github.com/dotnet/docs/blob/live/docs/standard/base-types/character-classes-in-regular-expressions.md

Document Version Independent Id

95abea42-fa7f-3feb-ae38-049719ab938f

Article author

@adegeo

Metadata

  • ID: a1958bf6-a17d-1256-d659-31066aa02604
  • Service: dotnet-fundamentals
@dotnet-bot dotnet-bot added the ⌚ Not Triaged Not triaged label Aug 12, 2022
@issues-automation issues-automation bot added dotnet-fundamentals/svc Pri1 High priority, do before Pri2 and Pri3 labels Aug 12, 2022
@gewarren gewarren added the product-question Product usage related questions [org][type][category] label Aug 15, 2022
@dotnet-bot dotnet-bot removed the ⌚ Not Triaged Not triaged label Aug 15, 2022
@adegeo
Contributor

adegeo commented Dec 12, 2024

I came across this issue while investigating another one. I know this is way too late for a response, but I had the same experience as you. I got them working in Visual Studio, but not in various online regex testing apps. I did find some info about Unicode blocks that talked about requiring a Unicode-friendly regex parser. I wonder if the online pages that test regex are all based on UTF-8 and just don't really handle those \p{...} constructs.

Regardless, I'm going to close this issue, as I don't think there is anything to do in the docs about this at the moment.

@adegeo adegeo closed this as completed Dec 12, 2024