Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor codeAsInSource to not require command + fix nbJs bug fixes #163

Merged

Conversation

HugoGranstrom
Copy link
Collaborator

@HugoGranstrom HugoGranstrom commented Jan 22, 2023

This aims to solve #134 by changing the way nimib tries to find the start of a code block. This is mostly due to solving a bug that stopped us from getting the accurate start position.

TODO:

  • The nbCode might not be in the main file, so we must use the filename from lineInfo to get the file. Currently, we only have nb.source with the current file, but we need to handle this case:
# File A.nim
template hugoSlides() =
  nbCode:
    echo "Hello world"

# File B.nim
import A

hugoSlides()

B.nim is the main file in this case, but the source is in A.nim. I'm thinking about adding a nb.sources: Table[string, string] so that we can at least cache the files that we open.

  • Test this rigorously, all tests and docs are building without error. But I haven't verified that the docs are correct yet.
    • Check nimib docs
    • Check scinim/getting-started
    • Check nimiSlides
    • Check nimConf2022
    • norm
    • Snorlogue

This also fixes #167 and #164 by adding a check so nimib doesn't compile an empty nim file if no nbJs was used. And the temporary js file created now has a unique name for each file, so parallel builds should be safer now.

@HugoGranstrom
Copy link
Collaborator Author

From a quick look at the docs, it looks like this new approach is actually working 🎊 And if there would appear any edge cases, it is trivial to fix most of them by adding code in either startPosNew or findStartLineNew.

There is a list of all node kinds here but I don't even know what half of them means or what code looks like that produces them. So I'm leaning towards letting our users find out for us in the future which more ones actually require handling 🤣 I will go over all existing nimib uses I can find though, so I might still find new edge cases. Other than checking that, this PR is basically done implementation-wise. :o That was easier than I thought just a week ago 😎

@pietroppeter
Copy link
Owner

Good job! I guess there are a few lines of code that can be removed before merging?

@HugoGranstrom
Copy link
Collaborator Author

It depends on if you want to keep the old codeAsInSource around behind a flag or if we should just delete it?

@HugoGranstrom
Copy link
Collaborator Author

Ok, I have hit some trouble... It was too good to be true 🙃

It has to do with templates and the fact that they transform the AST into typed. If we have a nbCode in a template:

template foo() =
  nbCode:
    nb.renderPlans["nbText"] = @["mdOutputToHtml"]

then it will have a different AST compared with a "raw" nbCode:

nbCode:
  nb.renderPlans["nbText"] = @["mdOutputToHtml"]
# "raw" version AST
StmtList
  Asgn
    BracketExpr
      DotExpr
        Ident "nb"
        Ident "renderPlans"
      StrLit "nbText"
    Prefix
      Ident "@"
      Bracket
        StrLit "mdOutputToHtml"
# template version AST
StmtList
  Call
    OpenSymChoice 31 "[]=" # this causes the trouble!
    DotExpr
      Sym "nb"
      Ident "renderPlans"
    StrLit "nbText"
    Prefix
      OpenSymChoice 3 "@"
      Bracket
        StrLit "mdOutputToHtml"

In the "raw" untyped version, we will get the position of nb, but in the template version we will get the position of []= instead. And this is only a problem because the compiler shuffles around the order of the nodes when going from untyped to typed. And my code currently always takes the first node, because I thought it would always be the case that the first node also came first in the source.

I am not sure how to solve this at the moment, and this is mostly for my own memory before going to bed. Options:

  • Analyzing the AST to see if it is typed. But it is the fact that the function in this case is a kinda infix function that trips us up. And then we would need to write logic to determine if a given function ([]= in this case) is infix or not. And if it is, skip it.
  • Make the "command"-finding code better so we don't have to use the column information at all. The downside of this is that we have to make assumptions on the code...

@pietroppeter
Copy link
Owner

Very interesting anyway. With this notes little by little I can hopefully understand more about this stuff. It seems the problem is in this Open Choice nodes. Is it possibile to just skip those and consider the first node that is not open choice? Or do we have cases where the open choice node would be the correct pick?

@pietroppeter
Copy link
Owner

Another option, if we just go over all nodes and take the minimum of lineinfo?

@HugoGranstrom
Copy link
Collaborator Author

It's not really the openChoice node that is the problem, even if it was just a sym we would have the same problem as the []= comes before nb in the AST. So the problem really is that the "infix" (it isn't an infix but is kinda similar) proc is transformed to a nnkCall. This only seems to happen with special procs though, like []=. Will have to investigate more which other construct behaves like this as well. Then it could be feasible to just check if the call is for one of these fake infix procs.

The search for the minimum could work :o we just have to make sure we don't get AST from somewhere else as well. Then we could get weird behaviors.

@pietroppeter
Copy link
Owner

we just have to make sure we don't get AST from somewhere else as well

in the lineinfo object there is also the filename, right? we could use that to make sure we take the minimum conditioned on the correct file (if we actually are able to know which is the right one...)

@pietroppeter
Copy link
Owner

pietroppeter commented Jan 24, 2023

over all nodes

and I guess if we have a StmtList we might need only to go over all nodes in the first statement, I guess statements should not be mixed up...

actually there is the case of Call above, let's say all children of first node? to be honest not sure if I know what I am talking about, the above is mostly coming from your examples above... :D

@HugoGranstrom
Copy link
Collaborator Author

Regarding lineinfo, how do we know which filte of the correct one? A majority vote? (That would be in favour of the l file with longer code) And even if we get the correct file, if we use nnCode inside a template, the template will have a lower line number than the call to it.

But if we assume the file is correct, your idea is quite similar to what we are already doing. With the adddition of checking all childs of all nodes except nnkStmtList. And taking the minimum. I will try this out when I get some time :) thanks for the ideas! 😁

@pietroppeter
Copy link
Owner

Regarding lineinfo, how do we know which filte of the correct one?

Not sure if it works but we could probably use as source of truth the file set through nbInit (which can be overriden by the user if they know better, for example the override is used by nimibook to render plain markdown files):

nb.thisFile = instantiationInfo(-1, true).filename.AbsoluteFile

@HugoGranstrom
Copy link
Collaborator Author

That is a good idea, most of the time the code we read will be from the end-user and not the libraries (why would they show their code :P). 😄
Will give this a try this evening.

@HugoGranstrom
Copy link
Collaborator Author

Not sure if it works but we could probably use as source of truth the file set through nbInit (which can be overriden by the user if they know better, for example the override is used by nimibook to render plain markdown files):

One problem with this idea: we don't have access to nb.thisFile at compile time so the macro can't use it. I guess we could save all the lineInfos of all the nodes and try and inlining them at runtime, but at that point it becomes much harder for a hypothetical advantage. I'd say we wait for the first bug report and the nature of that code before we complicate things here.

@pietroppeter
Copy link
Owner

Ah right this is compile time!

@HugoGranstrom
Copy link
Collaborator Author

I haven't looked everywhere but this passes all tests so we might not need to know the actual file. The only time we get problems is if someone tries to do this:

template foo(body: untyped) =
  nbCode:
    echo "This line will be the one found"
    body

but I'm not really sure why one would do something like that either.

@HugoGranstrom
Copy link
Collaborator Author

I have now gone over the majority of the nimib code I could find and found one minor bug. But I think this is ready a merge after I clean up the code tomorrow. Please have a look yourself and see if you can find any edge cases. Otherwise we'll tag a new release in the weekend.

@pietroppeter
Copy link
Owner

ok, looks good thanks, I will review this (possibly later this evenining). I think this PR also fixes #167 and #164 and we should mention that in the title. also not sure if there is anything to document here, at the very least we should add some changelog lines, and I guess also bump version, since after this is merged we should release a new 0.3.x

@HugoGranstrom HugoGranstrom changed the title refactor codeAsInSource to not require command refactor codeAsInSource to not require command + fix nbJs bug fixes Jan 27, 2023
Copy link
Owner

@pietroppeter pietroppeter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a question because I gotta ask and a suggestion for an uncorrelated fix ;)
other than that looks good for me, thanks!

nimib.nimble Outdated
@@ -1,6 +1,6 @@
# Package

version = "0.3.4"
version = "0.3.5"
author = "Pietro Peterlongo"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, just noticed this but you need to add your name here!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh :o

line*: int
column*: int

proc `<`*(p1, p2: Pos): bool =
(p1.line, p1.column) < (p2.line, p2.column)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am guessing since you checked on quite a few examples it is not something that happens often or possibly ever, but I gotta ask. is it possible we end comparing positions from two different files? should we check that this does not happen and put an assert? and in case it happens? I guess we should only compare stuff that comes from same file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is totally possible that we compare from two different files. I'll add an assert with a prompt to open an issue if it ever happens. 👍

@pietroppeter
Copy link
Owner

if you want after this you can merge and release!

@HugoGranstrom
Copy link
Collaborator Author

Thank you :D

I will let this sit for the night and merge and release it first thing tomorrow :)

@pietroppeter
Copy link
Owner

During the night I was thinking about the assert filename thing. Maybe a warning is better? If it happens to someone in a harmless way we might not want make nimib useless... not sure about this either though (an assert feels correct)

@HugoGranstrom
Copy link
Collaborator Author

The only scenario where it's harmless is if it's on the exact same line in both files (or on adjacent whitespace lines), so I wouldn't exactly say that there are any realistic harmless cases. So I think it's better that we shout at them and investigates it rather than a silent bug (I rarely reas hints and warnings 🙊)

@HugoGranstrom
Copy link
Collaborator Author

Ok, it's also harmless if the line number in the incorrect file is larger than the correct file. But that's just a 50% probability. The other 50% of the times you would get an undefined behavior that depend on how you structure another file. Imagine this scenario: you write and check your nbCode output and it's correct and you never expect to have to update it again. Sometime in the future you refactor the incorrect file and trigger the bug. But if you don't explicitly check the nimib document (why would you, it was correct and you haven't changed it), this bug would silently creep into your document.

@pietroppeter
Copy link
Owner

ok, let's keep the assert. but this implicitly means that we do NOT expect to have code from different files, otherwise we should have a way to deal with that and not run in the assertion error. what could be the logic we use? first node is authority for filename (and we compare only with nodes with same filename) unless we explicitly override? do you think it could make sense to test again with the assert the different documents? or to try and actively create a case where we do expect to have different filenames? I am not entirely sure this can indeed happen, never tried... also, should it be a doAssert so that it still checks in release mode?

@HugoGranstrom
Copy link
Collaborator Author

The case when we run into trouble is when we have in file A a template with a nbCode which both inputs the body it gets as input, but also a piece of code of it's own:

template foo(body: untyped) =
  nbCode:
    body # code from file B
    echo "Hello world" # file A

And then in file B:

foo:
  echo "Code from file B"

Except for this kind of use-case, I don't currently see any way for the assert to be triggered. And what would be the correct code to show in this case? Of course we would like to combine the code from both files ideally. But what if the template is defined in file B instead, then we want to comine two different code blocks from the same file. And how we would differentiate between those is even harder.

So until we have a solution for these problems, I would say any usecase which triggers the assert is unsupported by codeAsInSource and should use codeFromAst instead. Giving the nodes authority based on order makes no sense as it depend on the order you put the statements in foo in. Better to be strict about this and just say it's not supported imo.

I can run through all the nimib documents and check so they don't assert 👍 If I recall correctly, asserts are only removed in danger mode. Release mode previously removed them but I think it was changed quite a while ago. But I'll change it to doAssert just in case 👍

@HugoGranstrom
Copy link
Collaborator Author

I have improved the error message now and tested it against all the nimib documents. And no assertions were raised.

@pietroppeter
Copy link
Owner

the more I think about it, the more complex it looks. :)
This PR is anyway definitely an improvement and it seems it does not break anything so let's go!

@HugoGranstrom
Copy link
Collaborator Author

Yes exactly, this is an improvement to the current. Merging and releasing now. Let's rerun the CI over on the nimibook repo when it's done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

do not output a js file if there is no js block
2 participants