dies with unicode bare key name #258

Open
plicease opened this issue Mar 29, 2021 · 3 comments

@plicease
Contributor

First reported to Perl::Critic, since I wasn't sure if the problem was there or here. However, it seems to be reproducible with ppidump (see Perl-Critic/Perl-Critic#948 (comment)), so I believe this is indeed a PPI issue.

code:

use utf8;
my %x = (Привет => 1);

my($key) = keys %x;
die unless $key eq 'Привет';

perlcritic:

% perlcritic .
Problem while critiquing "foo.pl": Can't parse code: Encountered unexpected character '208'

It does work when I quote the key, i.e. change the hash to ('Привет' => 1).

I admit to not being 100% certain that you can use a bare unicode key name like this, but the code does seem to work as intended. If so, a more useful diagnostic would indicate where the character is located, if that is possible.

@oalders
Collaborator

oalders commented Jan 21, 2022

Some notes on unicode support: https://metacpan.org/pod/PPI#Internationalisation

@ernstki

ernstki commented Nov 4, 2024

I'm seeing the same behavior (at least I think I'm seeing the same behavior) with something like:

warn "Converted user name to submission ID → $subid\n";

I think this leaves me with a choice between not using perlcritic and not using Unicode characters, which is kind of a bummer either way.

Update: Aha, no, it was a parsing error: a missing closing quote meant the string ran up to a Unicode ellipsis character (…) before finding another matching quote. It was not the "→" I mentioned above.

I wouldn't know offhand what to do with "Encountered unexpected character '226'". I suppose now I will recognize that as 0xE2, the first byte of the UTF-8 encoding of "…", but it's not something that will turn up in od -a or hexdump -C, unless I'm just using the tools wrong.
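For anyone else trying to map the number in the error message back to a character, here is a minimal sketch. It assumes the source file is UTF-8 and that the number PPI reports is the decimal value of a single byte, which matches the two cases in this thread (208 for "П", 226 for "…") but is not confirmed against PPI's source. The helper name `bytes_decimal` is made up for illustration; `od -An -v -tu1` is the standard way to get unsigned decimal bytes.

```shell
# Hypothetical helper: print every byte of the argument as an unsigned
# decimal number, space-separated on a single line.
bytes_decimal() {
    printf '%s' "$1" \
        | od -An -v -tu1 \
        | tr '\n' ' ' \
        | tr -s ' ' \
        | sed 's/^ //; s/ $//'
}

# The octal escapes below spell out the UTF-8 bytes explicitly so the
# example does not depend on the encoding of this snippet itself.
bytes_decimal "$(printf '\342\200\246')"   # "…" (U+2026): 226 128 166
bytes_decimal "$(printf '\320\237')"       # "П" (U+041F): 208 159
```

`hexdump -C` prints the same bytes in hexadecimal, so 226 shows up there as `e2` and 208 as `d0`; `printf '%x\n' 226` does the conversion. Note that "→" (U+2192) also begins with byte 226, so the first byte alone does not identify the character.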

@oalders
Collaborator

oalders commented Nov 4, 2024

If you think we can provide better error messaging, a PR would be most welcome.
