dyno: Resolve nested functions without outer variables (#24523)
This PR adds support to the new compiler for resolving nested functions
that do not contain any references to outer variables.

To do so, it adds a new meta-type `OuterVariables`, which stores the
outer variables used in a nested function as well as their mentions.
Currently, both are stored in lexical order. Note that module-scope
variables are not considered outer variables for the purposes of this
analysis, as they have an infinite lifetime.
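
As a rough illustration of which references count as outer variables, the sketch below models the decision with plain booleans. It is a simplified model under stated assumptions, not the compiler's code: `ToyTag`, `ToyTarget`, and `toyIsOuterVariable` are hypothetical names, and the fields stand in for the ID and AST queries that the real check performs. It also reflects the exception visible in the diff, where a variable declared inside a statement of a module initializer (rather than directly at module scope) can still be an outer variable.

```cpp
// Hypothetical model; none of these names exist in the compiler.
enum class ToyTag { Function, Module, Other };

struct ToyTarget {
  bool definesScope;       // target is itself a function, record, class, ...
  ToyTag parentSymbolTag;  // kind of the symbol that encloses the target
  bool parentIsCurrentFn;  // target is declared in the function being resolved
  bool directlyInModule;   // target's AST parent is the module itself
};

bool toyIsOuterVariable(const ToyTarget& t) {
  // Functions and type declarations are always reachable, so never tracked.
  if (t.definesScope) return false;
  // A function's own locals are not outer variables.
  if (t.parentIsCurrentFn) return false;

  switch (t.parentSymbolTag) {
    case ToyTag::Function:
      return true;
    case ToyTag::Module:
      // True module-scope variables are excluded; variables declared in a
      // nested statement of the module initializer are not.
      return !t.directlyInModule;
    default:
      break;
  }
  return false;
}
```

The four fields are an assumption made to keep the sketch self-contained; the actual check derives the same facts from the target's ID.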

The set of outer variables is computed using a new query,
`computeOuterVariables`. It is implemented on top of the
`scopeResolveFunction` query, since that query already requires a full
traversal of the function. Note that determining the outer variables
for a given nested function 'NF' also requires calling
`computeOuterVariables` for any child functions of NF, so that distant
outer variables can be propagated.
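
The propagation step can be sketched with a toy model. This is only an illustration of the idea, assuming name-based lookup with no shadowing and assuming module-scope names have already been filtered out; `ToyFn` and `toyOuterVariables` are hypothetical and do not correspond to compiler types or queries.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of a function; these names are not the compiler's.
struct ToyFn {
  std::set<std::string> locals;   // variables declared directly in this function
  std::vector<std::string> uses;  // identifiers referenced directly in the body
  std::vector<ToyFn> children;    // functions nested inside this one
};

// Returns every variable that 'fn' needs from an enclosing function, either
// because its own body uses it or because a nested child does.
std::set<std::string> toyOuterVariables(const ToyFn& fn) {
  std::set<std::string> outer;
  for (const auto& name : fn.uses) {
    if (fn.locals.count(name) == 0) outer.insert(name);
  }
  for (const auto& child : fn.children) {
    // Anything the child needs that 'fn' does not declare must come from
    // one of the parents of 'fn', so it becomes an outer variable of 'fn'.
    for (const auto& name : toyOuterVariables(child)) {
      if (fn.locals.count(name) == 0) outer.insert(name);
    }
  }
  return outer;
}
```

In this model, a function whose body mentions no outer variable at all can still end up with a non-empty set if one of its children reaches past it, which is why the child functions have to be visited.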

The queries `resolveFunction` and `resolveConcreteFunction` as well as
`instantiateSignature` have been adjusted. If a nested function does not
capture outer variables, it can be resolved using the normal process. In
this case, nesting of the function is just a syntactic convenience for
the user, and does not affect the function's resolution at all.

I've added a predicate `idIsNestedFunction` to the parsing queries
header.
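
A minimal sketch of what the predicate checks, using a hypothetical `ToySymbol` tree instead of the compiler's `Context` and `ID` (both names below are assumptions for illustration): a symbol is a nested function when it is a function and its immediate parent symbol is also a function.

```cpp
// Hypothetical symbol model; not the compiler's representation.
enum class ToyKind { Module, Function, Record };

struct ToySymbol {
  ToyKind kind;
  const ToySymbol* parent;  // enclosing symbol, or nullptr at the top level
};

bool toyIsNestedFunction(const ToySymbol* sym) {
  if (sym == nullptr || sym->kind != ToyKind::Function) return false;
  return sym->parent != nullptr && sym->parent->kind == ToyKind::Function;
}
```

Only the immediate parent is inspected in this sketch, which matches the shape of the implementation shown in the diff.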

I've also introduced a new `ID` method called `isSymbolDefiningScope`.
It names a check that was previously encoded implicitly in the test
`id.postOrderId() == -1`.
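
The intent of the new method can be shown on a stripped-down stand-in for the real class. `SketchId` is a hypothetical name used only here; the real `ID` carries more state.

```cpp
// Hypothetical, minimal stand-in for the real ID class.
class SketchId {
  int postOrderId_;

 public:
  explicit SketchId(int postOrderId) : postOrderId_(postOrderId) {}

  int postOrderId() const { return postOrderId_; }

  // Gives a name to the check callers used to write as 'postOrderId() == -1'.
  bool isSymbolDefiningScope() const { return postOrderId_ == -1; }
};
```

Call sites can then ask the question directly instead of comparing against a magic number.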

FUTURE WORK

- We may not need to store mentions in `OuterVariables`, and we may not
need to store variables in lexical order. In this case, the type can be
simplified. I leave that as future work.
- Nested functions that use outer variables will need to be resolved
eventually. The implementation will probably combine two pieces:
storing close outer variables (those defined in the scope of the
immediate parent) in a function's `TypedFnSignature`, and passing some
sort of `CallContext` when resolving a call to a (potentially nested)
function. The `CallContext` would include the types of outer variables,
represented in some form, as well as the current `PoiInfo`. This would
enable us to instantiate the initial signature for a nested function
and perform resolution as normal.

Reviewed by @DanilaFe. Thanks!
dlongnecke-cray authored Mar 13, 2024
2 parents 1408aca + c0a52ce commit fe028ef
Showing 11 changed files with 575 additions and 17 deletions.
7 changes: 7 additions & 0 deletions frontend/include/chpl/framework/ID.h
@@ -97,6 +97,13 @@ class ID final {
*/
int postOrderId() const { return postOrderId_; }

/**
Returns 'true' if this symbol has a 'postOrderId()' value equal to -1,
which means this is an ID for something that defines a new symbol
scope.
*/
inline bool isSymbolDefiningScope() const { return postOrderId_ == -1; }

/**
Some IDs are introduced during compilation and don't represent
something that is directly contained within the source code.
5 changes: 5 additions & 0 deletions frontend/include/chpl/parsing/parsing-queries.h
@@ -352,6 +352,11 @@ uast::AstTag idToTag(Context* context, ID id);
*/
bool idIsParenlessFunction(Context* context, ID id);

/**
Returns true if the ID is a nested function.
*/
bool idIsNestedFunction(Context* context, ID id);

/**
Returns true if the ID refers to a private declaration.
*/
9 changes: 9 additions & 0 deletions frontend/include/chpl/resolution/resolution-queries.h
@@ -233,6 +233,9 @@ ApplicabilityResult instantiateSignature(Context* context,
Compute a ResolvedFunction given a TypedFnSignature.
Checks the generic cache for potential for reuse. When reuse occurs,
the ResolvedFunction might point to a different TypedFnSignature.
This function will resolve a nested function if it does not refer to
any outer variables.
*/
const ResolvedFunction* resolveFunction(Context* context,
const TypedFnSignature* sig,
@@ -260,6 +263,12 @@ const ResolvedFunction* resolveConcreteFunction(Context* context, ID id);
*/
const ResolvedFunction* scopeResolveFunction(Context* context, ID id);

/**
Compute the set of outer variables referenced by this function. Will return
'nullptr' if there are no outer variables.
*/
const OuterVariables* computeOuterVariables(Context* context, ID id);

/*
* Scope-resolve an AggregateDecl's fields, along with their type expressions
* and initialization expressions.
169 changes: 169 additions & 0 deletions frontend/include/chpl/resolution/resolution-types.h
@@ -341,6 +341,175 @@ class UntypedFnSignature {
/// \endcond DO_NOT_DOCUMENT
};

/**
This type represents the outer variables used in a function. It stores
the variables and all their mentions in lexical order. It presents the
concept of a 'reaching variable', which is a reference to an outer
variable that is not defined in the symbol's immediate parent.
*/
// TODO: We can drop some of this state if we decide we don't care about
// preserving lexical ordering or mentions at all (not 100% sure yet).
class OuterVariables {

// Record all outer variables used in lexical order.
std::vector<ID> variables_;

// Record all mentions of variables in lexical order. A variable may have
// zero mentions if it was only ever referenced by a child function. In
// this case, we still record the variable so that we can know to propagate
// it into our parent's state.
std::vector<ID> mentions_;

using VarAndMentionIndices = std::pair<size_t, std::vector<size_t>>;
using IdToVarAndMentionIndices = std::unordered_map<ID, VarAndMentionIndices>;

// Enables lookup of variables and their mentions given just an ID. The
// first part of the pair is the index of the variable, and the second
// component is the list of mention indices.
IdToVarAndMentionIndices idToVarAndMentionIndices_;

// The number of outer variables that are defined in distant (not our
// immediate) parents. Only variables defined by a function's most
// immediate parents need to be recorded into its 'TypedFnSignature'.
int numReachingVariables_ = 0;

// The function that owns this instance.
ID symbol_;

// The immediate parent of 'symbol_'. So that we can detect if a variable
// is 'reaching' without needing the compiler context.
ID parent_;

template <typename T>
static inline bool inBounds(const std::vector<T>& v, size_t idx) {
// 'idx' is unsigned, so only the upper bound needs checking.
return idx < v.size();
}

public:
OuterVariables(Context* context, ID symbol)
: symbol_(std::move(symbol)),
parent_(symbol_.parentSymbolId(context)) {
}

~OuterVariables() = default;

bool operator==(const OuterVariables& other) const {
return variables_ == other.variables_ &&
mentions_ == other.mentions_ &&
idToVarAndMentionIndices_ == other.idToVarAndMentionIndices_ &&
numReachingVariables_ == other.numReachingVariables_ &&
symbol_ == other.symbol_ &&
parent_ == other.parent_;
}

bool operator!=(const OuterVariables& other) const {
return !(*this == other);
}

void swap(OuterVariables& other) {
std::swap(variables_, other.variables_);
std::swap(mentions_, other.mentions_);
std::swap(idToVarAndMentionIndices_, other.idToVarAndMentionIndices_);
std::swap(numReachingVariables_, other.numReachingVariables_);
std::swap(symbol_, other.symbol_);
std::swap(parent_, other.parent_);
}

void mark(Context* context) const {
for (auto& v : variables_) v.mark(context);
for (auto& id : mentions_) id.mark(context);
for (auto& p : idToVarAndMentionIndices_) p.first.mark(context);
symbol_.mark(context);
parent_.mark(context);
}

static inline bool update(owned<OuterVariables>& keep,
owned<OuterVariables>& addin) {
return defaultUpdateOwned(keep, addin);
}

// Mutating method used to build up state.
void add(Context* context, ID mention, ID var);

/** Returns 'true' if there are no outer variables. */
bool isEmpty() const { return numVariables() == 0; }

/** The total number of outer variables. */
int numVariables() const { return variables_.size(); }

/** The number of outer variables declared in our immediate parent. */
int numImmediateVariables() const {
return numVariables() - numReachingVariables_;
}

/** The number of outer variables declared in our non-immediate parents. */
int numReachingVariables() const { return numReachingVariables_; }

/** The number of outer variable mentions in this symbol's body. */
int numMentions() const { return mentions_.size(); }

/** Get the number of mentions for 'var' in this symbol. */
int numMentions(const ID& var) const {
auto it = idToVarAndMentionIndices_.find(var);
return it != idToVarAndMentionIndices_.end()
? it->second.second.size()
: 0;
}

/** Returns 'true' if there is at least one mention of 'var'. */
bool mentions(const ID& var) const { return numMentions(var) > 0; }

/** Returns 'true' if this contains an entry for 'var'. */
bool contains(const ID& var) const {
return idToVarAndMentionIndices_.find(var) !=
idToVarAndMentionIndices_.end();
}

/** Get the i'th outer variable or the empty ID if 'idx' was out of bounds. */
ID variable(size_t idx) const {
return inBounds(variables_, idx) ? variables_[idx] : ID();
}

/** A reaching variable is declared in a non-immediate parent. */
bool isReachingVariable(const ID& var) const {
auto it = idToVarAndMentionIndices_.find(var);
if (it != idToVarAndMentionIndices_.end()) {
auto& var = variables_[it->second.first];
return !parent_.contains(var);
}
return false;
}

/** A reaching variable is declared in a non-immediate parent. */
bool isReachingVariable(size_t idx) const {
if (auto id = variable(idx)) return isReachingVariable(id);
return false;
}

/** Get the i'th mention in this function. */
ID mention(size_t idx) const {
return inBounds(mentions_, idx) ? mentions_[idx] : ID();
}

/** Get the i'th mention for 'var' within this function, or the empty ID. */
ID mention(const ID& var, size_t idx) const {
auto it = idToVarAndMentionIndices_.find(var);
if (it == idToVarAndMentionIndices_.end()) return {};
return inBounds(it->second.second, idx)
? mentions_[it->second.second[idx]]
: ID();
}

/** Get the first mention of 'var', or the empty ID. */
ID firstMention(const ID& var) const { return mention(var, 0); }

/** Get the ID of the symbol this instance was created for. */
const ID& symbol() const { return symbol_; }

/** Get the ID of the owning symbol's parent. */
const ID& parent() const { return parent_; }
};

/** CallInfoActual */
class CallInfoActual {
private:
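
The index structure used by `OuterVariables` (two ordered vectors plus a map from variable to a pair of indices) can be sketched in isolation. The class below is a simplified model under assumptions, not the real type: `SketchId` and `ToyOuterVariables` are hypothetical names, IDs are plain strings, and the body of `add` is a guess at reasonable behavior, since the diff shown here only declares `add` without its definition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the compiler's ID type.
using SketchId = std::string;

class ToyOuterVariables {
  std::vector<SketchId> variables_;  // outer variables, in first-seen order
  std::vector<SketchId> mentions_;   // every mention, in the order recorded

  // variable -> (index into variables_, indices into mentions_)
  using Entry = std::pair<std::size_t, std::vector<std::size_t>>;
  std::unordered_map<SketchId, Entry> index_;

 public:
  // Record that the identifier 'mention' refers to outer variable 'var'.
  // Assumed behavior: the real 'add' is not shown in this diff.
  void add(const SketchId& mention, const SketchId& var) {
    auto it = index_.find(var);
    if (it == index_.end()) {
      Entry fresh(variables_.size(), std::vector<std::size_t>());
      it = index_.emplace(var, fresh).first;
      variables_.push_back(var);
    }
    it->second.second.push_back(mentions_.size());
    mentions_.push_back(mention);
  }

  std::size_t numVariables() const { return variables_.size(); }
  std::size_t numMentions() const { return mentions_.size(); }

  // Number of recorded mentions of 'var', or 0 if it is unknown.
  std::size_t numMentions(const SketchId& var) const {
    auto it = index_.find(var);
    return it != index_.end() ? it->second.second.size() : 0;
  }

  // The i'th outer variable; 'idx' must be in range.
  const SketchId& variable(std::size_t idx) const {
    assert(idx < variables_.size());
    return variables_[idx];
  }

  // First recorded mention of 'var', or the empty string if there is none.
  SketchId firstMention(const SketchId& var) const {
    auto it = index_.find(var);
    if (it == index_.end() || it->second.second.empty()) return {};
    return mentions_[it->second.second.front()];
  }
};
```

Keeping indices in the map rather than copies of the IDs means the ordered vectors stay the single source of truth for lexical order, while lookups by variable remain a hash lookup.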
9 changes: 9 additions & 0 deletions frontend/lib/parsing/parsing-queries.cpp
@@ -857,6 +857,14 @@ bool idIsParenlessFunction(Context* context, ID id) {
return idIsFunction(context, id) && idIsParenlessFunctionQuery(context, id);
}

bool idIsNestedFunction(Context* context, ID id) {
if (id.isEmpty() || !idIsFunction(context, id)) return false;
if (auto up = id.parentSymbolId(context)) {
return idIsFunction(context, up);
}
return false;
}

bool idIsFunction(Context* context, ID id) {
// Functions always have their own ID symbol scope,
// and if it's not a function, we can return false
@@ -984,6 +992,7 @@ const ID& idToParentId(Context* context, ID id) {
}

const uast::AstNode* parentAst(Context* context, const uast::AstNode* node) {
if (node == nullptr) return nullptr;
auto parentId = idToParentId(context, node->id());
if (parentId.isEmpty()) return nullptr;
return idToAst(context, parentId);
82 changes: 77 additions & 5 deletions frontend/lib/resolution/Resolver.cpp
@@ -226,11 +226,13 @@ Resolver::createForInitializer(Context* context,
Resolver
Resolver::createForScopeResolvingFunction(Context* context,
const Function* fn,
- ResolutionResultByPostorderID& byId) {
+ ResolutionResultByPostorderID& byId,
+ owned<OuterVariables> outerVars) {
auto ret = Resolver(context, fn, byId, nullptr);
ret.typedSignature = nullptr; // re-set below
ret.signatureOnly = true; // re-set below
ret.scopeResolveOnly = true;
ret.outerVars = std::move(outerVars);
ret.fnBody = fn->body();

ret.byPostorder.setupForFunction(fn);
@@ -429,6 +431,46 @@ types::QualifiedType Resolver::typeErr(const uast::AstNode* ast,
return t;
}

static bool isOuterVariable(Resolver* rv, ID target) {
if (target.isEmpty()) return false;

// E.g., a function, or a class/record/union/enum. We don't need to track
// this, and shouldn't, because its current instantiation may not make any
// sense in this context. As well, these things have an "infinite lifetime",
// and are always reachable.
if (target.isSymbolDefiningScope()) return false;


auto parentSymbolId = target.parentSymbolId(rv->context);

// No match if there is no parent or if the parent is the resolver symbol.
if (parentSymbolId.isEmpty()) return false;
if (rv->symbol && parentSymbolId == rv->symbol->id()) return false;

switch (parsing::idToTag(rv->context, parentSymbolId)) {
case asttags::Function: return true;

// Module-scope variables are not considered outer-variables. However,
// variables declared in a module initializer statement can be, e.g.,
/**
module M {
if someCondition {
var someVar = 42;
proc f() { writeln(someVar); }
f();
}
}
*/
case asttags::Module: {
auto targetParentId = parsing::idToParentId(rv->context, target);
return parentSymbolId != targetParentId;
} break;
default: break;
}

return false;
}

/**
Find scopes for superclasses of a class. The passed ID should refer to a
Class declaration node. If not, this function will return an empty vector.
@@ -2782,6 +2824,13 @@ void Resolver::resolveIdentifier(const Identifier* ident,

maybeEmitWarningsForId(this, type, ident, id);

// Record uses of outer variables.
if (isOuterVariable(this, id) && outerVars) {
const ID& mention = ident->id();
const ID& var = id;
outerVars->add(context, mention, var);
}

if (type.kind() == QualifiedType::TYPE) {
// now, for a type that is generic with defaults,
// compute the default version when needed. e.g.
@@ -2974,10 +3023,33 @@ bool Resolver::enter(const NamedDecl* decl) {
}

void Resolver::exit(const NamedDecl* decl) {
- if (decl->id().postOrderId() < 0) {
- // It's a symbol with a different path, e.g. a Function.
- // Don't try to resolve it now in this
- // traversal. Instead, resolve it e.g. when the function is called.
+ // We are resolving a symbol with a different path (e.g., a Function or
+ // a CompositeType declaration). In most cases we do not try to resolve
+ // in this traversal. However, if we are a nested function and the child
+ // is also a nested function, we need to check and potentially propagate
+ // their outer variable set into our own.
+ auto idChild = decl->id();
+ if (idChild.isSymbolDefiningScope()) {
+ if (this->symbol != nullptr &&
+ parsing::idIsNestedFunction(context, this->symbol->id()) &&
+ parsing::idIsNestedFunction(context, idChild) &&
+ outerVars.get()) {
+ if (auto ovs = computeOuterVariables(context, idChild)) {
+ for (int i = 0; i < ovs->numVariables(); i++) {
+
+ // If the variable is reaching in the child function, it means it
+ // was defined in one of _our_ parent(s). So we need to track it.
+ if (ovs->isReachingVariable(i)) {
+ ID var = ovs->variable(i);
+
+ // Mentions from child functions are not recorded in the parent
+ // function's info, so just use the first (as a convenience).
+ ID mention = ovs->firstMention(var);
+ outerVars->add(context, mention, var);
+ }
+ }
+ }
+ }
return;
}
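
The propagation in `Resolver::exit` depends on telling reaching variables apart from immediate ones. A rough sketch of that distinction follows, assuming path-style string IDs such as "M.outer.mid" in which a symbol contains an ID when its path is a prefix of that ID's path. The string format and the names `toyContains` and `toyIsReaching` are assumptions for illustration; the real check goes through the `contains` method of `ID`.

```cpp
#include <string>

// Hypothetical containment test on dot-separated paths: 'symbolPath'
// contains 'idPath' when it is a prefix ending on a '.' boundary.
bool toyContains(const std::string& symbolPath, const std::string& idPath) {
  if (idPath.size() <= symbolPath.size()) return idPath == symbolPath;
  return idPath.compare(0, symbolPath.size(), symbolPath) == 0 &&
         idPath[symbolPath.size()] == '.';
}

// An outer variable is "reaching" when the immediate parent of the function
// does not contain it, meaning it was declared in a more distant parent.
bool toyIsReaching(const std::string& parentPath, const std::string& varPath) {
  return !toyContains(parentPath, varPath);
}
```

For a function whose parent is "M.outer.mid", the variable "M.outer.mid.y" is immediate, while "M.outer.x" is reaching and would therefore be propagated into the parent's own set.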

7 changes: 5 additions & 2 deletions frontend/lib/resolution/Resolver.h
@@ -63,6 +63,7 @@ struct Resolver {
ReceiverScopesVec savedReceiverScopes;
Resolver* parentResolver = nullptr;
owned<InitResolver> initResolver = nullptr;
owned<OuterVariables> outerVars;

// results of the resolution process

@@ -91,7 +92,8 @@ struct Resolver {
const PoiScope* poiScope)
: context(context), symbol(symbol),
poiScope(poiScope),
- byPostorder(byPostorder), poiInfo(makePoiInfo(poiScope)) {
+ byPostorder(byPostorder),
+ poiInfo(makePoiInfo(poiScope)) {

tagTracker.resize(uast::asttags::AstTag::NUM_AST_TAGS);
enterScope(symbol);
@@ -143,7 +145,8 @@ struct Resolver {
// set up Resolver to scope resolve a Function
static Resolver
createForScopeResolvingFunction(Context* context, const uast::Function* fn,
- ResolutionResultByPostorderID& byPostorder);
+ ResolutionResultByPostorderID& byPostorder,
+ owned <OuterVariables> outerVars);

static Resolver createForScopeResolvingField(Context* context,
const uast::AggregateDecl* ad,