sourcepos is incorrect for <script> tags #448

Open
GabeIsman opened this issue Jul 29, 2024 · 4 comments

@GabeIsman

I've encountered an odd issue. I'm using the Ruby wrapper (Commonmarker), and inspecting the AST shows some strange behavior around the sourcepos of <script> tags. AFAICT this only affects script tags, and only those that open and close on a single line.

Commonmarker.parse("<script></script>")
=> #<Commonmarker::Node(document):
  source_position={:start_line=>1, :start_column=>1, :end_line=>1, :end_column=>17}
  children=[#<Commonmarker::Node(html_block):
       source_position={:start_line=>1,
        :start_column=>1,
        :end_line=>0,
        :end_column=>0}>]>

Note that end_line and end_column are both zero. In general I've observed that end_line is start_line - 1 and end_column is always zero.
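A quick way to flag this symptom is to check that a node's end position does not come before its start. The sketch below is self-contained and does not use Comrak or Commonmarker: the `SourcePos` struct and `is_well_formed` helper are hypothetical names made up for illustration, and the values are the ones from the output above.

```rust
/// Hypothetical stand-in for a node's source position (1-based lines and columns).
/// This is not Comrak's own type; it only mirrors the four fields shown above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SourcePos {
    start_line: usize,
    start_column: usize,
    end_line: usize,
    end_column: usize,
}

impl SourcePos {
    /// A position is plausible when the end does not precede the start.
    /// Tuples compare lexicographically: line first, then column.
    fn is_well_formed(&self) -> bool {
        (self.end_line, self.end_column) >= (self.start_line, self.start_column)
    }
}

fn main() {
    // Values reported above for the html_block child.
    let html_block = SourcePos { start_line: 1, start_column: 1, end_line: 0, end_column: 0 };
    // Values reported above for the enclosing document node.
    let document = SourcePos { start_line: 1, start_column: 1, end_line: 1, end_column: 17 };

    println!("html_block well-formed: {}", html_block.is_well_formed());
    println!("document well-formed: {}", document.is_well_formed());
}
```

With the reported values, the document node passes the check and the html_block node fails it, since its end (0, 0) sorts before its start (1, 1).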

Let me know if I should open an issue on Commonmarker instead, but I don't see how this could be caused by the wrapper. I just lack the Rust expertise to test it directly in Rust, sorry!

@kivikakk
Owner

Thanks for the report! This is definitely a Comrak thing. No guarantees on when I'll be able to look into this, but it shoooooould be fairly simple. 🤞

@digitalmoksha
Collaborator

@kivikakk it looks like this is occurring for the tags <script>, <pre>, <textarea>, and <style>. They get recognized in html_block_start by the rule

[<] ('script'|'pre'|'textarea'|'style') (spacechar | [>]) { return Some(1); }

which returns 1, and then

comrak/src/parser/mod.rs, lines 1928 to 1938 in feaf5cf:

AddTextResult::HtmlBlock(block_type) => {
    self.add_line(container, line);
    let matches_end_condition = match block_type {
        1 => scanners::html_block_end_1(&line[self.first_nonspace..]),
        2 => scanners::html_block_end_2(&line[self.first_nonspace..]),
        3 => scanners::html_block_end_3(&line[self.first_nonspace..]),
        4 => scanners::html_block_end_4(&line[self.first_nonspace..]),
        5 => scanners::html_block_end_5(&line[self.first_nonspace..]),
        _ => false,
    };
handles it.
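As a rough illustration of the two steps quoted above, here is a self-contained sketch of a type 1 start check and end check. It is not Comrak's scanner code: the function names `starts_type_1_block` and `ends_type_1_block` are hypothetical, the tag list comes from the rule quoted above, and the end check (any of the four closing tags, case-insensitive, anywhere on the line) is my reading of the CommonMark spec's condition 1. It assumes the caller passes the line starting at its first non-space character.

```rust
/// Tag names that open a "type 1" HTML block, taken from the scanner rule quoted above.
const TYPE_1_TAGS: [&str; 4] = ["script", "pre", "textarea", "style"];

/// Hypothetical start check: `<tag` followed by whitespace, `>`, or end of line.
/// Assumes `line` already starts at its first non-space character.
fn starts_type_1_block(line: &str) -> bool {
    let lower = line.to_ascii_lowercase();
    TYPE_1_TAGS.iter().any(|&tag| {
        match lower.strip_prefix('<').and_then(|rest| rest.strip_prefix(tag)) {
            // The character after the tag name decides whether this is really that tag.
            Some(rest) => matches!(rest.chars().next(), None | Some(' ' | '\t' | '\n' | '>')),
            None => false,
        }
    })
}

/// Hypothetical end check: the line contains any of the four closing tags.
/// The closing tag does not have to match the tag that opened the block.
fn ends_type_1_block(line: &str) -> bool {
    let lower = line.to_ascii_lowercase();
    TYPE_1_TAGS.iter().any(|&tag| lower.contains(&format!("</{tag}>")))
}

fn main() {
    let line = "<script></script>";
    // Both checks hold on the same line, which matches the one-line case in the report.
    println!("start: {}, end: {}", starts_type_1_block(line), ends_type_1_block(line));
    // A <p> tag is not in the type 1 list, so it takes a different path.
    println!("start for <p>: {}", starts_type_1_block("<p>text</p>"));
}
```

The point of the sketch is only that a one-line `<script></script>` opens and closes a type 1 block on the same line; it makes no claim about where in Comrak the end position is lost.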

I'm not sure why these would be treated differently from any other HTML block tag, such as <p>.

@kivikakk
Owner

@digitalmoksha
Collaborator

Of course, I should've looked at the spec 🤦
