Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SourceText collection & toString() for fns and methods #4038

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

Nikita-str
Copy link
Contributor

This Pull Request closes #3780

In current approach all SourceText collected (because mostly a script consist of function codes and classes) and LinearSpan used to retrieve it when needed (to be fair toString() not often called); Gc::<SourceText> used for storage and therefore it can be deallocated only when we have access to no function from a source anymore.

It actually solve some tests from test262, but there is implementation for checking toString() correctness that take any [native code] as correct, without regard to whether it should accept in concrete case or not (assertToStringOrNativeFunction).

Also I don't sure about derive(arbitrary::Arbitrary) for core\ast\src\source.rs::Script & LinearSpan, just derive is ok here?

TODO: find out is `derive(arbitrary::Arbitrary)` in `core\ast\src\source.rs::Script` correct?
TODO: also find out correctness of `derive(arbitrary::Arbitrary)` for `LinearSpan`, `LinearPosition`
Copy link

codecov bot commented Nov 11, 2024

Codecov Report

Attention: Patch coverage is 78.93701% with 107 lines in your changes missing coverage. Please review.

Project coverage is 54.92%. Comparing base (6ddc2b4) to head (01e9a2f).
Report is 340 commits behind head on main.

Files with missing lines Patch % Lines
core/parser/src/parser/statement/if_stm/mod.rs 0.00% 15 Missing ⚠️
core/engine/src/bytecompiler/mod.rs 54.83% 14 Missing ⚠️
core/engine/src/spanned_source_text.rs 58.62% 12 Missing ⚠️
core/ast/src/position.rs 80.39% 10 Missing ⚠️
core/ast/src/source_text.rs 73.91% 6 Missing ⚠️
core/parser/src/lexer/identifier.rs 16.66% 5 Missing ⚠️
.../statement/declaration/hoistable/class_decl/mod.rs 64.28% 5 Missing ⚠️
core/ast/src/source.rs 40.00% 3 Missing ⚠️
core/parser/src/lexer/mod.rs 93.61% 3 Missing ⚠️
core/parser/src/lexer/number.rs 66.66% 3 Missing ⚠️
... and 24 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4038      +/-   ##
==========================================
+ Coverage   47.24%   54.92%   +7.67%     
==========================================
  Files         476      487      +11     
  Lines       46892    48747    +1855     
==========================================
+ Hits        22154    26774    +4620     
+ Misses      24738    21973    -2765     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@raskad
Copy link
Member

raskad commented Nov 15, 2024

I have not done an in depth review yet, just an initial question. As far as I see, the positions you used for the function start and end (LinearSpan and LinearPosition) only contain the start and end line of the function, correct?

I think that would not work in cases where other stuff is in the same line, e.g.:

let a = 1; function f() {return a + 1}; let b = 3;

In this case f.toString() should only be function f() {return a + 1} but it would return the complete line, right?

@Nikita-str
Copy link
Contributor Author

@raskad
No, it contains linear position, in source code, not line. It is position(and span) in representation of source code as ~vector of code points.
so f.toString() will return function f() {return a + 1}.

in details:

let a = 1; function f() {return a + 1}; let b = 3;
//         ↑                         ↑ 
//     span.start                span.end

here is tests
and in particular it tests that:

        ia1 = (b) => b * 2    ;
        // \\ // \\ // \\ // \\ // \\ // \\ // \\
        A    =    ia1 . toString(    );

A will be exactly (b) => b * 2

@Nikita-str
Copy link
Contributor Author

also you can read the issue where it was accepted to collect the source code(it should slow down parse stage performance)

Copy link
Member

@raskad raskad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for letting you wait so long for the full review. First of all thank you for the great contribution! Really nice work!

I added some specific comments, but I have one more general one. I would be in favor of Removing all the Options for the various SourceText fields. In principle, everytime we parse, compile and execute code, we must have the source text and a span. I did a quick check an I'm 99% sure the Options can be removed in all places.
In addition to the Script struct, the Module struct would also need the source text.
In some cases (e.g. the "dummy" use of the Bytecompiler for syntetic modules) there is really no source text, but there it is also never used. In those cases we can just use the default (empty vec).

Regarding Arbitrary, I think it should be a manual implementation that is just an empty source text. You can check here for en example of a manual impl:

#[cfg(feature = "arbitrary")]
impl<'a> arbitrary::Arbitrary<'a> for Literal {
fn arbitrary(u: &mut arbitrary::Unstructured<'a>) -> arbitrary::Result<Self> {
let c = <u8 as arbitrary::Arbitrary<'a>>::arbitrary(u)? % 6;
match c {
0 => Ok(Self::String(<Sym as arbitrary::Arbitrary>::arbitrary(u)?)),
1 => Ok(Self::Num(<f64 as arbitrary::Arbitrary>::arbitrary(u)?)),
2 => Ok(Self::Int(<i32 as arbitrary::Arbitrary>::arbitrary(u)?)),
3 => Ok(Self::BigInt(Box::new(
<BigInt as arbitrary::Arbitrary>::arbitrary(u)?,
))),
4 => Ok(Self::Bool(<bool as arbitrary::Arbitrary>::arbitrary(u)?)),
5 => Ok(Self::Null),
_ => unreachable!(),
}
}
}

core/ast/src/source.rs Outdated Show resolved Hide resolved
core/ast/src/source_text.rs Outdated Show resolved Hide resolved
core/engine/src/spanned_source_text.rs Outdated Show resolved Hide resolved
core/engine/src/spanned_source_text.rs Outdated Show resolved Hide resolved
core/engine/src/spanned_source_text.rs Outdated Show resolved Hide resolved
@Nikita-str
Copy link
Contributor Author

If remove Option<SourceText> from everywhere then I don't like what will happen with ByteCompiler::new / FunctionCompiler::new in no-source cases: we should have non-init marker, and create an empty Vec looks not safe enough. Option directly say that there is no object. But with Option we can forget to init it... I can add Option<SourceText> as fn new's arg and there will be no way to forget init.

@raskad
Copy link
Member

raskad commented Jan 9, 2025

If remove Option from everywhere then I don't like what will happen with ByteCompiler::new / FunctionCompiler::new in no-source cases: we should have non-init marker, and create an empty Vec looks not safe enough. Option directly say that there is no object. But with Option we can forget to init it... I can add Option as fn new's arg and there will be no way to forget init.

I was mostly concerned with the code in the bytecompiler being a bit easier to work with. Like you mentioned, with the Option and the set method you always have to remember it and do the whole if let check.
I don't see how the empty vec is unsafe, but what do you think about changing Gc<Inner> in SourceText to Option<Gc<Inner>>, or alternativeley, if you implement #4038 (comment), SpannedSourceText would be the only struct that is actually used outside of spanned_source_text.rs. That way you could also do the Option directly there.
If I'm not missing anything that should be a good solution, because we do not have the Options everywhere in the bytecompiler, the type is still the same size and in the case where we have no source, we do not allocate an empty Vec or a Gc.

@Nikita-str
Copy link
Contributor Author

Nikita-str commented Jan 9, 2025

Seems like all done.
I also pass SourceText to the Module & then pass it from the Module to the ByteCompiler, but I don't test it.


We don't always have LinearSpan (for example here) so SpannedSourceText contains Option<LinearPosition>:

pub struct SpannedSourceText {
source_text: SourceText,
#[unsafe_ignore_trace]
span: Option<LinearSpan>,
}

In all other places Option is removed.

@raskad
Copy link
Member

raskad commented Jan 10, 2025

Looks really nice! Thank you for the work! Could you do a rebase, best would be after #4117 is merged, because the CI is broken without that.

One very minor nitpick: I would like to rename SourceText::new_pseudo to SourceText::new_empty and PSEUDO_LINEAR_SPAN to EMPTY_LINEAR_SPAN. I think it describes the values a bit better.

@Nikita-str Nikita-str force-pushed the fn-to-string-2nd-way branch from 39e71a0 to 782a218 Compare January 10, 2025 03:58
Copy link
Member

@raskad raskad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution. Great stuff. I'd be happy to merge this.

@raskad raskad added this to the next-release milestone Jan 10, 2025
@raskad raskad added parser Issues surrounding the parser execution Issues or PRs related to code execution labels Jan 10, 2025
@raskad raskad added the ast Issue surrounding the abstract syntax tree label Jan 10, 2025
@raskad raskad requested a review from a team January 10, 2025 23:23
@jedel1043
Copy link
Member

jedel1043 commented Jan 10, 2025

How much progress is this towards errors with backtraces? Haven't looked at it yet, I'm just curious if this would be another step forward or if we would have to essentially rewrite everything when we have nicer errors

@Nikita-str
Copy link
Contributor Author

How much progress is this towards errors with backtraces?

Sorry, I actually don't sure what is needed for backtraced errors.
In this PR propagated only those linear spans that are needs for functions' toString.
LinearSpans are only needed when you need unchanged source code.
And I don't sure, do you need unchanged source code for backtraced errors at all?
If you need then at least it will be necessary to add LinearSpans to others AST nodes.
Otherwise it will be just parallel things, I suppose that you will not have to rewrite anything in any case.

@jedel1043
Copy link
Member

If we just need to increasingly add LinearSpan to the rest of the AST nodes, that sounds perfect! And yeah, we need unchanged source code, because even after optimization passes we need to be able to reference the original line that a bytecode instruction comes from.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ast Issue surrounding the abstract syntax tree execution Issues or PRs related to code execution parser Issues surrounding the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Function.prototyp.toString only returns a fixed [native code] string
3 participants