Codeblocks should not be parsed #12

Open
anakojm opened this issue Jan 26, 2023 · 5 comments

@anakojm
Contributor

anakojm commented Jan 26, 2023

obsidian-to-hugo wrongly converts the following:

```python
if foo==bar and foo==baz:
    L = [[12,42],[13,90]]
```

to

```python
if foo<mark>bar and foo</mark>baz:
    L = [12,42],[13,90]({{< ref "12,42],[13,90" >}})
```

Codeblocks should instead be skipped to prevent such false positives (I have no idea how to implement this).

anakojm changed the title from "Codeblock should not be parsed" to "Codeblocks should not be parsed" on Jan 26, 2023
@devidw
Owner

devidw commented Jan 28, 2023

Hey @anakojm

I guess regex lookarounds should work for this use case. Ideally, they would narrow the regex down to only those matches that are not written between triple backticks.

If you would like to give it a shot, feel free to add a test case for this in the md marks suite.

devidw changed the title from "Codeblocks should not be parsed" to "Marks processor: Codeblocks should not be parsed" on Jan 28, 2023
@anakojm
Contributor Author

anakojm commented Jan 29, 2023

I am willing to try, but one problem I am facing is that I can't use something like r"(?<!^```.*?$).*?==([^=\n]+)==.*?(?!^```$)" with the gsm flags, because Python's re module does not support it: re.error: look-behind requires fixed-width pattern.
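
For readers who want to reproduce the error, here is a minimal sketch. The variable names are made up for illustration, and the first pattern is a shortened form of the one quoted above. It shows that `re` rejects a look-behind whose width can vary, while a fixed-width look-behind compiles but can only inspect a fixed number of preceding characters, which is not enough to tell whether a match sits inside a fence.

```python
import re

# Variable-width look-behind (the `.*?` can match any number of characters):
# Python's `re` refuses to compile it.
variable_width = r"(?<!^```.*?$)==([^=\n]+)=="
try:
    re.compile(variable_width, re.S | re.M)
    error_message = None
except re.error as exc:
    error_message = str(exc)

print(error_message)

# A fixed-width look-behind compiles fine, but it only looks at exactly one
# preceding character, so it cannot express "not inside a fenced code block".
fixed_width = re.compile(r"(?<!`)==([^=\n]+)==")
print(fixed_width.findall("a ==b== c"))
```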

I think you would be better off dealing with this issue as I lack experience in the matter.

Also why did you restrict the issue to the marks processor?
The issue affects the wikilinks parser too, as shown by my example.

In the meantime, I have written test cases, should I PR them? Maybe in another branch?

devidw changed the title from "Marks processor: Codeblocks should not be parsed" to "Codeblocks should not be parsed" on Jan 29, 2023
@devidw
Owner

devidw commented Jan 29, 2023

Alright I see

> Also why did you restrict the issue to the marks processor?
> The issue affects the wikilinks parser too, as shown by my example

Good point, I overlooked the change in the second line of the example 🙈

If we want to point out the issue clearly and avoid misunderstandings, we can use the diff block on GH 😉

```diff
- if foo==bar and foo==baz:
+ if foo<mark>bar and foo</mark>baz:
-    L = [[12,42],[13,90]]
+    L = [12,42],[13,90]({{< ref "12,42],[13,90" >}})
```

> In the meantime, I have written test cases, should I PR them? Maybe in another branch?

Cool, yes that would be awesome, maybe an extra branch like bug-codeblocks

@vonloxley

vonloxley commented Jan 27, 2024

This might do the trick since Python 3.6:

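    import re

    # Code blocks are matched first, so wiki links inside them are consumed with the block.
    wiki_link_regex = r"(?ms:```.*?```)|\[\[(.*?)\]\]"
    for match in re.finditer(wiki_link_regex, text):  # text: the note's Markdown source
        if not match.group(1):  # a code block (or an empty link) matched
            continue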
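
The same alternation idea can be stretched to cover both highlights and wiki links in one pass. The following is only a sketch under assumptions: `PATTERN` and `convert` are hypothetical names, not part of obsidian-to-hugo, and the output formats are inferred from the example at the top of this issue. The fence alternative is anchored to line starts here, which is a small deviation from the snippet above.

```python
import re

# Alternation: a fenced code block, OR a ==highlight==, OR a [[wiki link]].
# The fence alternative comes first, so anything inside a fence is consumed
# as part of the fence and never reaches the other two alternatives.
PATTERN = re.compile(r"(?ms:^```.*?^```)|==([^=\n]+)==|\[\[(.*?)\]\]")


def convert(text: str) -> str:
    """Convert highlights and wiki links, leaving fenced code blocks verbatim."""

    def replace(match: re.Match) -> str:
        mark, link = match.group(1), match.group(2)
        if mark is not None:
            return f"<mark>{mark}</mark>"
        if link is not None:
            return f'[{link}]({{{{< ref "{link}" >}}}})'
        return match.group(0)  # a fenced code block: return it unchanged

    return PATTERN.sub(replace, text)
```

This still does not handle inline code spans, `~~~` fences, indented fences, or an unclosed fence, so it narrows the false positives rather than eliminating them.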

@anakojm
Contributor Author

anakojm commented Jan 29, 2024

It might work, but I believe the problem is more fundamental: we can't properly parse Markdown with regex, since Markdown is not a regular language.
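
One way to step away from a single big regex, short of using a full Markdown parser, is to walk the text line by line and track whether the current line is inside a fence. This is a rough sketch, not the project's code: `apply_outside_fences` and `convert_marks` are hypothetical helpers, and the fence rule (up to three spaces of indentation, closing fence with the same character and at least the same length) is a simplification.

```python
import re
from typing import Callable

# An opening or closing fence: up to 3 spaces, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def apply_outside_fences(text: str, transform: Callable[[str], str]) -> str:
    """Apply `transform` to every line that is not part of a fenced code block."""
    out = []
    open_fence = None  # marker that opened the current block, or None
    for line in text.splitlines(keepends=True):
        match = FENCE.match(line)
        if open_fence is None:
            if match:
                open_fence = match.group(1)  # entering a code block
                out.append(line)
            else:
                out.append(transform(line))
        else:
            # A closing fence uses the same character, is at least as long
            # as the opening one, and has nothing else on the line.
            if (match and match.group(1)[0] == open_fence[0]
                    and len(match.group(1)) >= len(open_fence)
                    and line.strip() == match.group(1)):
                open_fence = None
            out.append(line)  # fence lines and code lines stay verbatim
    return "".join(out)


def convert_marks(line: str) -> str:
    """Hypothetical per-line processor: ==text== becomes <mark>text</mark>."""
    return re.sub(r"==([^=\n]+)==", r"<mark>\1</mark>", line)
```

With this split, each processor only ever sees lines outside code blocks, for example `apply_outside_fences(note_text, convert_marks)`. Inline code spans would still need separate handling.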

vonloxley added a commit to vonloxley/obsidian-to-hugo that referenced this issue Feb 11, 2024