Loading a large number of user emoji can break emoji rendering Emacs wide #103

Open
matthew-piziak opened this issue Apr 7, 2022 · 4 comments

@matthew-piziak

I use the emacs-slack package with a Slack team that has a large number of custom emoji. I've managed to trigger a low-level REG_ESIZE error from regex-emacs.c because the compiled regular expression is too big for Emacs.

For example, here's the backtrace for (emojify-string " ❄"):

Debugger entered--Lisp error: (invalid-regexp "Regular expression too big")
  search-forward-regexp("\\(?::\\(?:\\(?:\\+11\\|0\\(?:2_\\(?:b\\(?:\\(?:lin\\|ore\\)d..." 3 t)
  #f(compiled-function (regexp) #<bytecode 0xe297d60576a0c31>)("\\(?::\\(?:\\(?:\\+11\\|0\\(?:2_\\(?:b\\(?:\\(?:lin\\|ore\\)d...")
  mapc(#f(compiled-function (regexp) #<bytecode 0xe297d60576a0c31>) ("\\(?::\\(?:\\(?:\\+11\\|0\\(?:2_\\(?:b\\(?:\\(?:lin\\|ore\\)d..." ":[[:alnum:]+_-]+:" "\\(?:#⃣\\|\\*⃣\\|0⃣\\|1⃣\\|2⃣\\|3⃣\\|4⃣\\|5⃣\\|6⃣\\|7⃣\\|8⃣\\|9..." "\\(?:#\\(?:-?)\\)\\|%\\(?:-?)\\)\\|'\\(?::\\(?:-[()D]\\|[()D..."))
  seq-do(#f(compiled-function (regexp) #<bytecode 0xe297d60576a0c31>) ("\\(?::\\(?:\\(?:\\+11\\|0\\(?:2_\\(?:b\\(?:\\(?:lin\\|ore\\)d..." ":[[:alnum:]+_-]+:" "\\(?:#⃣\\|\\*⃣\\|0⃣\\|1⃣\\|2⃣\\|3⃣\\|4⃣\\|5⃣\\|6⃣\\|7⃣\\|8⃣\\|9..." "\\(?:#\\(?:-?)\\)\\|%\\(?:-?)\\)\\|'\\(?::\\(?:-[()D]\\|[()D..."))
  emojify-display-emojis-in-region(1 3 nil)
  emojify-string(" ❄")
  eval-expression((emojify-string " ❄") nil nil 127)
  funcall-interactively(eval-expression (emojify-string " ❄") nil nil 127)
  command-execute(eval-expression)

Is there any way I can limit the number of emoji, shrink the compiled regex, or increase the space allocated for regexes?

@matthew-piziak
Author

I see that this package already uses regexp-opt, which is good. As a kludge, I now take only the first 2000 user emoji in emojify-set-emoji-data.

@ag91

ag91 commented Nov 25, 2024

I ran into the same issue now that I am maintaining emacs-slack. I don't understand why we use a regexp in the first place, though.
Isn't a hash table better suited, since we are displaying emojis by region with emojify-redisplay-emojis-in-region?

@ag91

ag91 commented Nov 25, 2024

Oh, I get it now: catching the ASCII and Unicode ones is a real pain unless you enumerate them.
Luckily emacs-slack uses GitHub-style ones, which should make the hash table approach much easier.
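
To illustrate why GitHub-style codes make the hash table route easy, here is a minimal sketch. It uses only built-in Emacs functions plus the generic pattern ":[[:alnum:]+_-]+:" that appears in the backtrace above. The names my/emoji-table and my/find-emoji-codes are hypothetical and are not part of emojify; this is not how emojify itself is implemented, only the general idea.

```elisp
;; Hypothetical table standing in for a team's custom emoji data.
(defvar my/emoji-table (make-hash-table :test #'equal)
  "Maps GitHub-style emoji codes to some display payload.")

(puthash ":snowflake:" "❄" my/emoji-table)
(puthash ":partyparrot:" "parrot.gif" my/emoji-table)

(defun my/find-emoji-codes (text)
  "Return the :code: tokens in TEXT that are keys of `my/emoji-table'."
  (let ((found '()))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; One fixed pattern: its compiled size does not grow with the
      ;; number of emoji, unlike a regexp-opt alternation of every code.
      (while (re-search-forward ":[[:alnum:]+_-]+:" nil t)
        (let ((code (match-string 0)))
          ;; Constant-time lookup decides whether the token is a known emoji.
          (when (gethash code my/emoji-table)
            (push code found)))))
    (nreverse found)))

(my/find-emoji-codes "hi :snowflake: and :unknown: :partyparrot:")
;; => (":snowflake:" ":partyparrot:")
```

The trade-off is that the generic pattern also matches tokens that are not emoji (such as :unknown: here), so every match costs one hash lookup, but the regexp can never exceed Emacs's size limit.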

@ag91

ag91 commented Nov 25, 2024

Very cool, so the solution is just to evaluate (setq emojify--user-emojis-regexp nil) when you have a large number of (GitHub-style) user emojis. Matching then falls back to the default GitHub-style regex, which is sufficient for emacs-slack.
Pretty cool that this was possible and that the hash table approach was already implemented. My bad, I didn't see it immediately!
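
As a config fragment, the workaround might look like the sketch below. It assumes emojify is installed and that the user emoji have already been registered by the time the form runs. Where exactly to place it (for example, after emacs-slack has loaded the team's emoji) is an assumption and is not confirmed in this thread.

```elisp
;; Assumption: this runs after emojify has built its user-emoji data.
;; If emojify rebuilds that data later, the variable may be set again
;; and this form would need to be evaluated again.
(with-eval-after-load 'emojify
  ;; Drop the giant alternation; the generic :code: regexp still matches.
  (setq emojify--user-emojis-regexp nil))
```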

ag91 added a commit to emacs-slack/emacs-slack that referenced this issue Nov 26, 2024
emojify creates an OR regex with all the custom user emojis; when there
are more than 2000, the regex overflows Emacs's limit.

Setting the regex to nil works around iqbalansari/emacs-emojify#103