Sem-check symbols lazily, deferring semcheck of declaration and body until they're needed:
a symbol declared gets added to current scope but isn't semchecked yet
a reference to an overloaded symbol (i.e. a routine call `fn(a,b)`) triggers semcheck of the declaration of each overload of `fn` (only the declaration, not the body)
once a call is resolved to a symbol, that symbol's body is semchecked, at which point a new scope is created (with parent scope = scope where symbol was declared) and semcheck recurses in this scope in same way as for the top-level scope
It works similarly to how nim implements dead code elimination in the backend: declarations are processed top-down, which has the built-in property of skipping unused declarations, even if those appear in a cycle (e.g. if `foo` calls `bar` but no top-level code calls, transitively, either `foo` or `bar`, both `foo` and `bar` will be dead-code eliminated).
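The "skip unused declarations, even in a cycle" property can be illustrated with a small reachability sketch. This is a toy model, not compiler code: the `calls` table, the routine names `main` and `used`, and the `semcheck` proc are all hypothetical, and `foo` and `bar` are assumed to call each other.

```nim
import std/[tables, sets]

# Hypothetical call graph: each routine maps to the routines its body calls.
var calls = initTable[string, seq[string]]()
calls["main"] = @["used"]
calls["used"] = newSeq[string]()
calls["foo"] = @["bar"]   # foo and bar form a cycle ...
calls["bar"] = @["foo"]   # ... but nothing reachable from main calls them

var checked = initHashSet[string]()

proc semcheck(name: string) =
  ## Depth-first visit; each symbol is processed at most once.
  if name in checked: return
  checked.incl name
  for callee in calls[name]:
    semcheck(callee)

# Only top-level code starts the traversal.
semcheck("main")
```

After the traversal, `checked` holds `main` and `used` only; `foo` and `bar` were declared but never visited.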
semcheck as depth first search
We define a processing stack containing declaration scopes (PScope), and semcheck consists in lazily processing statements in a scope and recursing when a non-declaration statement is visited that requires semchecking a symbol.
There is no need for forward declarations, complex data structures, or keeping track of scopes attached to declarations; all that's needed is a stack of scopes.
At any given time during semcheck, you only need to keep track of a stack of N scopes where N is the processing depth (e.g. `fn1` calls `fn2` calls `fn3` => N = 3). When a symbol `f` is declared in a scope `S`, `f`'s scope will grow if new declarations occur after `f` is declared (in the same lexical scope); when a statement triggers semcheck of `f` (declaration or body), a new scope is created (whose parent is `S`).
The parent relationship implicitly defines a tree (rooted at module top-level), but we never need to visit the children of a scope; all that's needed is walking up the scope when doing symbol resolution (ident => symbol).
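Symbol resolution by walking up the parent chain might look like the following minimal sketch. The `Scope` type here is a hypothetical stand-in (not the compiler's `PScope`), and symbols are modelled as plain strings.

```nim
import std/[tables, options]

type
  Scope = ref object
    parent: Scope                   # nil for the module top-level
    symbols: Table[string, string]  # ident -> hypothetical symbol description

proc newScope(parent: Scope = nil): Scope =
  Scope(parent: parent, symbols: initTable[string, string]())

proc lookup(s: Scope, ident: string): Option[string] =
  ## Walks from the given scope towards the root; children are never visited.
  var cur = s
  while cur != nil:
    if ident in cur.symbols:
      return some(cur.symbols[ident])
    cur = cur.parent
  none(string)

let top = newScope()
top.symbols["fn1"] = "proc fn1 (module level)"
let inner = newScope(parent = top)
inner.symbols["x"] = "var x (local)"
```

Here `lookup(inner, "fn1")` finds the module-level entry through `parent`, while `lookup(top, "x")` finds nothing because the search never descends into child scopes.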
semcheck consists in doing depth first search in the symbol graph, where recursion is triggered for each statement that requires semchecking a declaration (a new declaration doesn't require semchecking nor recursion; instead the symbol just declared is merely added to the current scope). Each time a symbol is semchecked, a new scope is created (with an appropriate parent) and pushed to the stack; it is popped when semchecking for this symbol is completed.
in particular, the position in processingStack is unrelated to the parent field.
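A rough sketch of the depth-first traversal with a processing stack, under simplifying assumptions: the types `Scope` and `Sym`, the `declare` helper, and the `uses` field are hypothetical, the module top-level scope itself is not pushed on the stack, and a symbol is marked as checked as soon as processing starts so that cycles terminate.

```nim
import std/tables

type
  Scope = ref object
    name: string
    parent: Scope
  Sym = ref object
    declScope: Scope    # scope S in which the symbol was declared
    uses: seq[string]   # symbols referenced by its body (hypothetical)
    checked: bool

var
  symbols = initTable[string, Sym]()
  processingStack: seq[Scope]
  maxDepth = 0
  trace: seq[string]

proc declare(name: string, scope: Scope, uses: seq[string]) =
  # Declaring only registers the symbol; nothing is semchecked here.
  symbols[name] = Sym(declScope: scope, uses: uses)

proc semcheck(name: string) =
  let sym = symbols[name]
  if sym.checked: return
  sym.checked = true
  # The new scope's parent is the declaration scope, not the stack top.
  let scope = Scope(name: name, parent: sym.declScope)
  processingStack.add scope
  maxDepth = max(maxDepth, processingStack.len)
  trace.add name & " (parent: " & scope.parent.name & ")"
  for u in sym.uses:
    semcheck(u)
  discard processingStack.pop()

let top = Scope(name: "module")
declare("fn1", top, @["fn2"])
declare("fn2", top, @["fn3"])
declare("fn3", top, newSeq[string]())
semcheck("fn1")
```

In this run the stack grows to `fn1, fn2, fn3` (depth 3), yet all three scopes have the module scope as parent, which is the sense in which stack position and `parent` are unrelated.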
example
    # m1.nim
    type
      A = object
      B = typeof(fn3())
      C = seq[B]

    proc fn1(a: C) =
      from m2 import fn2
      fn2(a)

    proc fn3(): int = 1

    # until this point, the symbol table just contains shallow entries:
    # `A(type), B(type), C(type), fn1(proc), fn3(proc)`;
    # in particular, fn1, fn3, C have not been semchecked.
    # If the module ended here, that would be the end of semcheck

    discard fn3() # this triggers semcheck of fn3
    var a: C # this triggers semcheck of C => B => fn3 (cached)
    fn1(a) # this triggers semcheck of fn1 => registers m2; then fn2(a) triggers import of m2, and semcheck of `fn2`
scoping rules
the declaration scope looks both before and after a declaration
    proc fn1 = discard
    # scope: fn1
    block:
      proc fn2 =
        # if/when body is semchecked, scope = fn1, fn2, fn3
        fn3()
      proc fn3 = discard
    # fn2 and fn3 are not in scope anymore (scope = fn1)
behavior of `declared`
    const a = declared(foo)
    static: echo a # prints `false` because top-level code triggers semcheck of `a` and `foo` isn't yet declared
    proc foo() = discard
contrast with:
    const a = declared(foo)
    proc foo() = discard
    static: echo a # prints `true`: the top-level code triggers semcheck after foo is declared
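The two cases can be modelled with a tiny simulation, assuming (as the proposal describes) that the initializer of `a` is only evaluated at its first use. The `StmtKind` enum and the `evalA` proc are hypothetical names for this sketch only.

```nim
import std/sets

type
  StmtKind = enum
    declConstA  # const a = declared(foo)
    declFoo     # proc foo() = discard
    useA        # static: echo a

proc evalA(program: seq[StmtKind]): bool =
  ## Returns the value `declared(foo)` yields when `a` is first used
  ## (false if `a` is never used, since it is then never evaluated).
  var scope = initHashSet[string]()
  for s in program:
    case s
    of declConstA: scope.incl "a"   # registered only; initializer deferred
    of declFoo: scope.incl "foo"
    of useA: return "foo" in scope  # first use forces evaluation now
  false
```

With this model, `evalA(@[declConstA, useA, declFoo])` gives `false` and `evalA(@[declConstA, declFoo, useA])` gives `true`, matching the two snippets above.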
behavior of lazy imports
    # a.nim
    proc a1 =
      from b import b1
      b1()
    proc a2 =
      from b import b2
      b2()
    a1()
    a2()

    # b.nim
    proc b0 = discard
    proc b1 = b0()
    proc b2 = discard
semcheck steps:
symbol table registers a1, a2 (without semchecking those)
a1() is seen, triggers semcheck of a1
from b import b1 is seen, registers symbols b, b1 (no semcheck nor import yet)
b1() is seen, triggers semcheck of b1, which first triggers the import of b and then the semcheck of b1 itself
b0() is seen, triggers semcheck of b0
a2() is seen, triggers semcheck of a2
from b import b2 is seen, registers symbols b, b2
b2() is seen, triggers semcheck of b2
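The steps above can be replayed with a small simulation that records when a module is imported and when a routine is semchecked. This is only a sketch: the `Routine` type, the `events` log, and the rule "import a module the first time one of its routines is semchecked" are assumptions made to mirror the listed steps.

```nim
import std/[tables, sets]

type Routine = object
  module: string
  calls: seq[string]   # routines called by the body, in order

# Hypothetical model of a.nim and b.nim from the example.
var routines = initTable[string, Routine]()
routines["a1"] = Routine(module: "a", calls: @["b1"])
routines["a2"] = Routine(module: "a", calls: @["b2"])
routines["b0"] = Routine(module: "b", calls: newSeq[string]())
routines["b1"] = Routine(module: "b", calls: @["b0"])
routines["b2"] = Routine(module: "b", calls: newSeq[string]())

var
  imported = initHashSet[string]()
  checked = initHashSet[string]()
  events: seq[string]

imported.incl "a"   # a.nim is the module being compiled

proc semcheck(name: string) =
  if name in checked: return
  checked.incl name
  let r = routines[name]
  if r.module notin imported:
    # The import is deferred until a routine from the module is needed.
    imported.incl r.module
    events.add "import " & r.module
  events.add "semcheck " & name
  for c in r.calls:
    semcheck(c)

# Top-level statements of a.nim.
semcheck("a1")
semcheck("a2")
```

The recorded order is: semcheck `a1`, import `b`, semcheck `b1`, semcheck `b0`, semcheck `a2`, semcheck `b2`; module `b` is imported once, and only when `b1` is first needed.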
benefits
this would solve cyclic dependencies
forward declarations become unnecessary
this is a more principled and effective way to achieve {.experimental: "codeReordering".}, and subsumes this feature
this would massively speed up compilation, and possibly also reduce binary sizes; the dead code elimination that nim currently does cannot recover from certain earlier decisions, such as generating module initializations for modules that would be skipped entirely thanks to lazy symbol resolution
there would be no need for include files anymore (these are still pervasive in compiler code)