IDNA proposal #2874
Mentioning a number of people that I suspect have IDNA domains or knowledge about how IDNA works (based on skimming bugs and PRs) @adamus1red I'd love to receive feedback about this proposal!
I think your example for the ascii==unicode case might be wrong:
Was this what you intended?
I noticed you created both CREATE and MODIFY examples for
Ah, good point! The first one is where the label is ascii==unicode but the target is ascii!=unicode. I'll update the comment. Thanks for finding that! Tom
I've added more examples. I don't think I've covered every combination, but my goal is to show typical examples, not every possible example. I've also added examples where we use {} and
While I'm not against having the ASCII and UTF on the output lines, I do worry it might make the output too busy.
LGTM
@adamus1red wrote:
That's an interesting point! I guess my thought is that showing both versions helps with debugging.
@tlimoncelli maybe a compromise would be if the output was the same as what the DNS provider or registrar uses. I know I've had issues where the DNS is using UTF but the registrar is using ASCII. For example, Namecheap uses ASCII, so registrar output for Namecheap would use ASCII punycode, while the DNS is at Cloudflare, which uses UTF, so the DNS output would use UTF.
The only IDNA domain I have is for fun, so I don't have a strong preference. I'll give my input nonetheless :). If you want to show both, I think I like B better, as it feels more consistent to me. Anything not in brackets will always be ASCII that way. I'd probably go with showing what the original user input was, with a flag to only show ASCII if needed. It's less information to parse, and the user should be familiar with it as that's the way it's listed in their config. I could see points being made for showing both, but I've always liked things more distraction free and less dense. I think the Unicode brackets are a little too clever, perhaps even a little confusing ;)
First of all, better IDNA handling would be a great improvement to dnscontrol. Regarding output, the one thing I definitely do not like is having the ASCII output come first, because it is the one least likely to be understood or mentally associated with the relevant domain. I think simply using the original user input has merit, and pairing that with a toggle to additionally show ASCII seems fine to me.
I'm seconding this suggestion. By displaying the "human readable" format, I think the barrier for using IDNs with dnscontrol is lowered, because the IDNA format is not human readable, especially for languages not written in Latin script.
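For illustration only, the "show the user's input, with a toggle to additionally show ASCII" idea could look roughly like the sketch below. The type `domainNames`, the function `format`, and the `showASCII` flag are hypothetical names invented for this example; none of them exist in dnscontrol, and the punycode string is supplied by hand rather than computed.

```go
package main

import "fmt"

// domainNames is a hypothetical holder for two spellings of one domain.
// It is not a dnscontrol type.
type domainNames struct {
	Input string // the name as the user typed it in D(), e.g. "рф.com"
	ASCII string // the punycode form, e.g. "xn--p1ai.com"
}

// format returns the user's own spelling. When showASCII is set and the
// ASCII form differs, the ASCII form is appended in parentheses.
func format(d domainNames, showASCII bool) string {
	if showASCII && d.Input != d.ASCII {
		return fmt.Sprintf("%s (%s)", d.Input, d.ASCII)
	}
	return d.Input
}

func main() {
	d := domainNames{Input: "рф.com", ASCII: "xn--p1ai.com"}
	fmt.Println(format(d, false)) // user input only
	fmt.Println(format(d, true))  // user input plus ASCII
}
```

With this shape, pure-ASCII domains print the same way regardless of the toggle, so existing output would stay unchanged for most users.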
This is excellent feedback! It's getting me excited! Question: In what situations would people want to see something besides the .Name (the user input) version?
What about if the registrar or DNS provider uses something different than the .Name value, then include the version they are using in brackets?
Personally, I think whatever the DNS provider does isn't relevant to the CLI output. Behind the scenes at every provider, it's all punycode anyway.
Agreed, having the display format handled outside of the provider is to be preferred IMO.
I don't have experience with IDNA at all.
Hi folks! 2 ideas:

Support multiple formats?

There's been a lot of discussion about
I'll know more if this is possible when I start coding.

An idea that would break less existing code

Existing code expects .Name to be ASCII (the current code runs dc.Punycode() for all providers, which rewrites .Name to be ASCII). Rather than require every use of .Name to change to .NameASCII, maybe the names should be: .Name (ASCII, to be compatible with old code), .NameORIG (how the user input the string), .NameUNICODE, and .NameDisplay.
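A rough sketch of how the four fields from the second idea might be filled in follows. Everything except the field names is an assumption made for illustration: the trimmed-down `DomainConfig`, the `converter` hook, `newDomainConfig`, and the one-entry lookup table are all hypothetical. The table stands in for real conversion functions (for example `ToASCII` and `ToUnicode` from `golang.org/x/net/idna`, which have the same signature) so the sketch runs with only the standard library. The display rule is the "ascii (unicode)" form from the proposal.

```go
package main

import (
	"fmt"
	"strings"
)

// DomainConfig here holds only the name fields floated in the comment
// above; the real models.DomainConfig has many more fields.
type DomainConfig struct {
	Name        string // ASCII (punycode) form, compatible with existing code
	NameORIG    string // exactly what the user typed in D()
	NameUNICODE string // Unicode form
	NameDisplay string // form intended for CLI output
}

// converter is a hypothetical hook for an IDNA conversion function.
type converter func(string) (string, error)

// newDomainConfig lowercases the input, derives the ASCII and Unicode
// forms through the supplied converters, and builds the display form.
func newDomainConfig(input string, toASCII, toUnicode converter) (DomainConfig, error) {
	ascii, err := toASCII(strings.ToLower(input))
	if err != nil {
		return DomainConfig{}, fmt.Errorf("converting %q to ASCII: %w", input, err)
	}
	uni, err := toUnicode(ascii)
	if err != nil {
		return DomainConfig{}, fmt.Errorf("converting %q to Unicode: %w", ascii, err)
	}
	dc := DomainConfig{Name: ascii, NameORIG: input, NameUNICODE: uni, NameDisplay: ascii}
	if ascii != uni {
		dc.NameDisplay = fmt.Sprintf("%s (%s)", ascii, uni)
	}
	return dc, nil
}

// demoPairs is a stand-in lookup table, NOT a real IDNA implementation.
var demoPairs = map[string]string{"рф.com": "xn--p1ai.com"}

// demoToASCII maps known Unicode names to punycode and passes others through.
func demoToASCII(s string) (string, error) {
	if a, ok := demoPairs[s]; ok {
		return a, nil
	}
	return s, nil
}

// demoToUnicode maps known punycode names back and passes others through.
func demoToUnicode(s string) (string, error) {
	for u, a := range demoPairs {
		if a == s {
			return u, nil
		}
	}
	return s, nil
}

func main() {
	for _, in := range []string{"Example.COM", "рф.com", "xn--p1ai.com"} {
		dc, err := newDomainConfig(in, demoToASCII, demoToUnicode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", dc)
	}
}
```

Because .Name stays ASCII in this layout, provider code that already reads .Name would not need to change; only code that wants the user's spelling or the Unicode form would touch the new fields.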
models.DomainConfig:

.Name: the name from D() after downcased via unicode.ToLower()

I'd prefer to always put the punycode variant in here (which is identical to the non-punycode form if it's an ASCII-only domain).
This should require the fewest changes across all providers and "do the right thing" out of the box 99% of the time, without breaking pure-ASCII domains.
It's also the least surprising for users: how they write the name shouldn't affect how the provider treats it.
Which encoding the user provided in the `D` call is nothing individual providers need to worry about.
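The point that pure-ASCII domains are unaffected can be illustrated with a small check. This is only a sketch: `isASCIIOnly` is a hypothetical helper, not part of dnscontrol, and it assumes that an already-lowercased, ASCII-only name is expected to pass through punycode conversion unchanged.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCIIOnly reports whether every rune in name is within the ASCII range.
// Under the assumption stated above, such names need no punycode rewrite.
func isASCIIOnly(name string) bool {
	for _, r := range name {
		if r > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	for _, n := range []string{"example.com", "рф.com", "xn--p1ai.com"} {
		fmt.Printf("%s asciiOnly=%v\n", n, isASCIIOnly(n))
	}
}
```

Note that a name already written in punycode counts as ASCII-only here, which matches the reviewer's point that providers only ever need to see one spelling.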
models.DomainConfig:

.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)

drop
.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)
.NameUnicode: The name stored after calling ToUnicode()

`IDN` seems more fitting (Internationalized Domain Name)
.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)
.NameUnicode: The name stored after calling ToUnicode()
.NameDisplay: if .NameASCII != .NameUnicode, store as "ascii (unicode)"

This is confusing. Shouldn't this be the bigger slice that `Name` and `IDN` would be subslices of?
Here are some example outputs:

NOTE: Feedback needed! Do you prefer "a" or "b"? Is there an even better format I should consider? Should we use `{}` instead of `()`?

I prefer `b` and `()`. We should also specify that `NameDisplay` is a string with both variants only if the domain is non-ASCII.
Here are some example outputs:

NOTE: Feedback needed! Do you prefer "a" or "b"? Is there an even better format I should consider? Should we use `{}` instead of `()`?

`b` and `()`
models.DomainConfig:

.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)

yes, drop!
models.DomainConfig:

.Name: the name from D() after downcased via unicode.ToLower()

punycode variant in here ftw.
.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)
.NameUnicode: The name stored after calling ToUnicode()
.NameDisplay: if .NameASCII != .NameUnicode, store as "ascii (unicode)"

So, it will be either `mydomain.com` if it is not an IDN, and `xn--p1ai.com (рф.com)` otherwise.
Sounds good to me.
.Name: the name from D() after downcased via unicode.ToLower()
.NameASCII: The name stored after calling ToASCII() (with ACE prefix if any Unicode chars are present)
.NameUnicode: The name stored after calling ToUnicode()

I enjoy short names that make things clear, so `IDN` would be perfect. From a developer's perspective, though, having the property also start with `NameXXX` probably makes it more accessible in an IDE, as the `Name` properties appear right after each other.
`.NameIDN` isn't applicable, as it states "name" twice.
`.NameUnicode` is OK though.
`.NameInternationalized` eventually as well.
My favorites are therefore `.IDN` or `.NameUnicode`, with `.NameUnicode` still my personal pick. The `Unicode` and `ASCII` keywords are often used in IDN translation libraries, while the `IDN` and `Punycode` keywords are usually used on the domain/DNS provider side. That's at least what I have noticed over the years. Still, that shouldn't drive the decision.
Also, an IDN isn't the same IDN if we compare .de and .com. Some TLD providers support different IDNA standards (IDNA2003 vs. IDNA2008, UTS46), so translating an IDN might end in a different punycode variant. Let me provide some examples from the HEXONET provider's ConvertIDN API command:
No big difference here. But let us pick one with German special characters:
Let us ignore that Mhmm... DNSControl again runs on "existing" data configured by the user. The input should therefore be considered "correct" (anything else would be very stupid), so we can probably consider this special discussion superfluous... Mhmm 2 ... The DNS/domain provider should ultimately be capable of handling that on their own (returning an error message stating that the provided domain/DNS zone name is invalid), and you guys do not have to worry about all that. Sorry that I bumped this up :-)