Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Handle minimum table width for multi-column utf-8 characters #53

Closed
Nukesor opened this issue Oct 29, 2021 · 7 comments
Closed

[BUG] Handle minimum table width for multi-column utf-8 characters #53

Nukesor opened this issue Oct 29, 2021 · 7 comments
Labels
f: wontfix This will not be worked on t: bug Something isn't working

Comments

@Nukesor
Copy link
Owner

Nukesor commented Oct 29, 2021

This is a follow-up issue to #44.

By default, if there's not enough space and dynamic mode is active, comfy-table falls back to a column with minimum-width of 1.
However, if there are multi-column characters in the text, the format will break as the character doesn't fit into the line.

Example:

╭───╮
│   │
├╌╌╌┤
│ ⺀ │
╰───╯

This is a super rare edge-case and this will need quite a bit of special logic to be properly handled. This includes iterating through all characters of a column.

I'm not sure if it's worth doing it. However, it's probably better to do so, even if it's just for the sake of consistency and correctness.

This behavior will also show up if the user enforces a width of 1 for a column, but this is something we won't handle!

@Nukesor Nukesor added t: bug Something isn't working f: help wanted Feel free to start working on this labels Jan 6, 2023
@Nukesor Nukesor changed the title Handle minimum table width for multi-column utf-8 characters [BUG] Handle minimum table width for multi-column utf-8 characters Jan 6, 2023
@Nukesor
Copy link
Owner Author

Nukesor commented May 26, 2023

After a lot of consideration, this is an edge-case that's considered acceptable.

This will need a lot of additional checks and it will most likely never happen. If it happens, it's easily fixable by adding a Constraint with the minimum length of 2 on this column.

@Nukesor Nukesor closed this as completed May 26, 2023
@Nukesor Nukesor added f: wontfix This will not be worked on and removed f: help wanted Feel free to start working on this labels May 26, 2023
@soywod
Copy link

soywod commented Jan 1, 2025

I would like to re-open this issue, as it seems to break more often that the edge case you mentioned: just take the readme_table example, add an emoji like 🚴‍♀️ and you get a broken layout. Some of my users even made it panic, but I personally was not able to reproduce it.


Before using your lib, I manually handled this case (see pimalaya/himalaya#300 (comment)), I feel confident enough to propose a PR. Would you be interested in it?

@Nukesor
Copy link
Owner Author

Nukesor commented Jan 1, 2025

Hm. I think these are two different problems.

The ticket is about single utf-8 chars that occupy multiple spaces inside a string that're inside of columns with a width of 1 due to external constraints (such as terminal width or explicit constraints).

The issue you're showing is the other way around. It's about multiple utf-8 chars that're squashed into a single utf-8 char.
Comfy-table doesn't do any utf-8 logic itself, it utilizes the unicode-width library to determine the actual length of a given utf-8 string.
unicode-width shows a length of 2 for the 🚴‍♀️ emoji, which is correct as it in theory only occupies two spaces just like the bike which is the "base" emoji the selector is applied on.

This is a bit curious so I started to check out terminals.

Alacritty

Alacritty seems to not be able to display variation selectors:
image

As you can see, the selector isn't merged into the bike, which leads to a shift to the right. Otherwise this is perfectly fine and expected behavior.

Kitty

Kitt is able to display those chars, but it does some weird stuff:
image

The bike char correctly occupies 2 spaces, but for some reason it still occupies as much space as the original sequences which is 🚴<200d>♀️ and is thereby 4 spaces wide despite only using 2.

Wezterm

Wezterm gets it right

image

Conclusion

From what I can observe, this isn't an issue with comfy-table but rather bugs, non-supported unicode sets and non-standardized drawing behaviours accross various terminal emulators and fonts.
If this is a large issue for your user base, I would propose to implement the filtering logic into your own code before you build the table. It doesn't feel right to cripple the library and go off-spec to compensate for issues in terminal emulators.

Regarding the panic, that should obviously not happen. I'll see whether I can reproduce the issue.

@Nukesor
Copy link
Owner Author

Nukesor commented Jan 1, 2025

I would probably be okay with a helper function or config flag inside of comfy-table that goes through everything and strips those chars.

I really don't want to make this a default, as this won't be a problem for everyone and I'm certain that there'll be bug reports about those emojis not being properly displayed.
Furthermore, this will make the library a lot slower as every string needs to be parsed char by char. This simply isn't needed for the vast majority of people.

I never expected to have this many issues with utf-8 -.-.
Biggest cause of issues next to ANSII escape sequences.

@soywod
Copy link

soywod commented Jan 1, 2025

Thank you for diving into the issue. The terminal results are really unexpected. The good news is that there is nothing to do from your side. The only issue I still cannot reproduce (and I'm not even sure it comes from comfy-table) is the panic I mentioned earlier.

@soywod
Copy link

soywod commented Jan 2, 2025

I propose to close this issue, as it does not come from the lib. I will open another issue if we find the reason of the panic.

@Nukesor
Copy link
Owner Author

Nukesor commented Jan 2, 2025

Sounds good to me :)

Thanks!

@Nukesor Nukesor closed this as completed Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
f: wontfix This will not be worked on t: bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants