Recursion lint, part 2 of 2: Indirect Recursion #15020

Open
felix91gr opened this issue Jun 9, 2025 · 7 comments
Labels
A-lint Area: New lints


@felix91gr
Contributor

felix91gr commented Jun 9, 2025

What it does

This is a restriction lint, mostly intended to be used in Safety Critical contexts.
It checks for indirect recursive calls.

It's a follow-up to #428; the Direct part has been implemented in #15006.

I plan to implement it this month.

Advantage

Context: the form asked about recommendations over the linted code

There are no universally good recommendations for the original code. Some recursive algorithms can be converted to constant-memory loops, but many genuinely require a stack to work. The reason recursion is tightly controlled in Safety Critical systems is the risk of running out of memory. This is why we cannot simply recommend a loop with an explicit stack: it could still run out of memory.

There is no "right" answer to how to change the original code:

  • Sometimes it must be changed to a loop,
  • Sometimes it must be memory-bound in some way, and then vetted to be correct,
  • And sometimes an entirely different solution is needed.

That's why this lint should issue no recommendations.

Drawbacks

I can think of two drawbacks of running this lint.

  • Possible performance issues related to the number of unique call cycles. If this turns out to be a problem, I'll add a setting that limits the cycle size to check. I don't expect it to be necessary if I do early exiting properly, but it is a Plan B.
  • The UX problem of how to give the user good error messages for this lint is not obvious. If done poorly, this lint could be very obtuse to read or feel clumsy to use. I want to iterate on this until the lint issues good error messages in all scenarios.

Example

```rust
fn i_call_myself_indirectly() {
    i_call_my_caller();
}

fn i_call_my_caller() {
    i_call_myself_indirectly();
}
```
@felix91gr felix91gr added the A-lint Area: New lints label Jun 9, 2025
@felix91gr
Contributor Author

@rustbot claim

@felix91gr
Contributor Author

Design work

Preamble

This lint reduces to finding loops in the function call graph. The Function Call Graph is a Directed Multigraph:

  1. The Nodes of this graph are function bodies.
  2. The Edges of this graph are function calls: if A calls B in a particular line of A, then that line becomes an edge of the form A -> B.
  3. It is a multigraph, which means that there can be more than one edge between two nodes. In this case, a function A could call another function B at more than one point in the body of A.
  4. It is a directed graph, which means that edges have a direction: edges are not of the form A - B but rather of the form A -> B or B -> A. In this case, if A calls B, it shows in the graph as an edge of the form A -> B.
  5. Recursive calls look like chains of edges where the same function's name is at the start and end. That's why this lint is tasked with finding loops in the graph.
    For example, a function that calls itself directly has a loop in the graph that looks like this: A -> A.
    A function that calls itself through other two functions has a loop in the graph that looks like this: A -> B -> C -> A.
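The points above can be sketched with a toy data structure (all names here — `FnId`, `Span`, `CallGraph` — are invented placeholders for this example, not actual rustc/Clippy types). Storing one edge per call site makes the graph a multigraph by construction:

```rust
// Hypothetical sketch of a function call multigraph.
type FnId = usize;

#[derive(Clone, Copy, Debug)]
struct Span(u32, u32); // placeholder for a source span

// Each call site is its own edge, so parallel edges
// (A calling B at two different lines) are represented naturally.
struct CallGraph {
    // edges[a] lists every call made from the body of `a`.
    edges: Vec<Vec<(FnId, Span)>>,
}

impl CallGraph {
    fn new(num_fns: usize) -> Self {
        Self { edges: vec![Vec::new(); num_fns] }
    }

    fn add_call(&mut self, caller: FnId, callee: FnId, at: Span) {
        self.edges[caller].push((callee, at));
    }
}

fn main() {
    // The A -> B -> C -> A loop from point 5 (A = 0, B = 1, C = 2).
    let mut g = CallGraph::new(3);
    g.add_call(0, 1, Span(0, 0));
    g.add_call(1, 2, Span(0, 0));
    g.add_call(2, 0, Span(0, 0));
    assert_eq!(g.edges[0].len(), 1);
}
```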

UX and error messages

It might make the most sense to lint for the smallest call loops that we can find. The smallest loops should be easier to reason about, and this would allow us to do Early Exiting and avoid an explosive performance cost due to extremely long call loops.

The Direct Recursion lint has already been implemented. That lint handles all loops of size 1. Therefore, the UX work of this lint will be focused on loops of size 2 or higher.

The algorithm

In principle, I'd do an IDDFS (Iterative-Deepening Depth-First Search) over the function call graph. This is an algorithm for finding shortest paths in unweighted graphs that uses very little memory (and thus should hopefully be faster than BFS, especially in very large programs). It also lets us keep the specific loop's details, which many algorithms don't, and it works seamlessly with multigraphs.

The algorithm works as follows:

  1. We set a target loop depth (we start at 2, since the Direct Recursion lint takes care of depth 1).
  2. We perform DFS (depth-first-search) recursively from every function body in the graph. Every time we find a function call, we check if it's our starting function.
    If it is, we store this call chain and continue.
    If it's not and we're at the depth limit, we ignore it. If we're not at the depth limit, we increase our depth counter and start checking that function's body.
  3. When this limited-depth DFS returns, we check if we have stored any call chains.
    If we have, those are loops of length equal to the Depth Limit. We lint for those and exit.
    If we haven't, then that means the Depth Limit we had set is too small. We increase it by 1 and take Step 2 once again.
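A minimal sketch of steps 1–3, under the assumption that the call graph has already been flattened into an adjacency list of `usize` node IDs (all names are invented; real code would carry spans along for diagnostics):

```rust
/// Depth-limited DFS: extend `path` one call at a time, recording it
/// when the final edge returns to `start`.
fn dfs(adj: &[Vec<usize>], start: usize, at: usize, left: usize,
       path: &mut Vec<usize>, found: &mut Vec<Vec<usize>>) {
    for &next in &adj[at] {
        if next == start && left == 1 {
            found.push(path.clone()); // a loop of exactly the target length
        } else if left > 1 && !path.contains(&next) {
            path.push(next);
            dfs(adj, start, next, left - 1, path, found);
            path.pop();
        }
    }
}

/// All simple loops of exactly `k` edges (each loop is reported once per
/// function on it, which is what we want for linting every participant).
fn cycles_of_length(adj: &[Vec<usize>], k: usize) -> Vec<Vec<usize>> {
    let mut found = Vec::new();
    for start in 0..adj.len() {
        let mut path = vec![start];
        dfs(adj, start, start, k, &mut path, &mut found);
    }
    found
}

/// Iterative deepening: depth 1 (direct recursion) is linted elsewhere,
/// so we start at 2 and stop at the first depth that yields loops.
fn smallest_cycles(adj: &[Vec<usize>]) -> Vec<Vec<usize>> {
    for k in 2..=adj.len() {
        let found = cycles_of_length(adj, k);
        if !found.is_empty() {
            return found;
        }
    }
    Vec::new()
}

fn main() {
    // The A -> B -> C -> A example: every function on the loop is reported.
    let adj = vec![vec![1], vec![2], vec![0]];
    let loops = smallest_cycles(&adj);
    assert_eq!(loops.len(), 3);
    assert!(loops.contains(&vec![0, 1, 2]));
}
```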

Performance considerations

Final depth

Let's say the program has N functions. Then, the Function Call Graph will have N nodes.

Since loops of length greater than N can't be "elementary" (they would have to revisit some node; I can expand on what this means in another comment), we can ignore them. If we have reached a Depth Limit of N and still haven't found any call loops, we know the graph is loop-free, and therefore our program is recursion-free.

What if "Too Many Edges"?

Normal Graphs with N nodes have an edge limit of N^2. This is not the case for Multigraphs. Multigraphs can have an arbitrary number of edges between two nodes.

In case this is a problem, I have an idea of an optimization that can be done to know in advance how long the smallest loops will be (or if there will be any at all).

Let's say we ignore for a moment that this is a multigraph, and make an adjacency matrix for the Function Call Graph as if it were a normal graph. This would be a square matrix of NxN, where N is the number of functions in the program.

The matrix is filled as follows.

Every pair of indices (i, j), with 0 <= i < N and 0 <= j < N, represents a specific pair of functions in the program. In the adjacency matrix, (i, j) = 1 if there is a function call from i to j, and 0 otherwise.

Let this matrix be called A, for Adjacency Matrix. If we then take the powers of A and look at the diagonal elements:

For k = 1..N: (i, i) > 0 in A^k if and only if there is a loop of length k from i back to i.

And furthermore:

If A^k = 0 for some k, then there are no loops of any length in the graph.

The optimization I have in mind would be to first build A and then compute its powers from 1 to N. The first power A^k to have a non-zero element in its diagonal would indicate that the smallest loops in the graph are of length k, and in particular, only for the functions with index i such that (i, i) > 0 in A^k.

This would allow us to run our IDDFS knowing the exact depth that we're looking for (making it k-DFS I guess), as well as knowing exactly what functions to start in (only the ones that have a non-zero (i, i) element in A^k).

It would also allow us to do an early exit in case the lint finds no loops in the graph.
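Here's a tiny worked example of that idea (illustrative only, using plain boolean matrices for clarity): for the A -> B -> C -> A graph, the first power of A with a non-zero diagonal is A^3, telling us the smallest loops have length 3 and involve all three functions.

```rust
// Boolean matrix product: entry (i, j) is set iff some t has both
// an edge i -> t in `a` and t -> j in `b`.
fn bool_matmul(a: &[Vec<bool>], b: &[Vec<bool>]) -> Vec<Vec<bool>> {
    let n = a.len();
    let mut c = vec![vec![false; n]; n];
    for i in 0..n {
        for j in 0..n {
            c[i][j] = (0..n).any(|t| a[i][t] && b[t][j]);
        }
    }
    c
}

fn has_nonzero_diagonal(m: &[Vec<bool>]) -> bool {
    (0..m.len()).any(|i| m[i][i])
}

fn main() {
    // A = 0 calls B = 1, B calls C = 2, C calls A.
    let a = vec![
        vec![false, true, false],
        vec![false, false, true],
        vec![true, false, false],
    ];
    // Compute A^k for k = 1, 2, ... until the diagonal is non-zero.
    let (mut p, mut k) = (a.clone(), 1);
    while !has_nonzero_diagonal(&p) {
        p = bool_matmul(&p, &a);
        k += 1;
    }
    assert_eq!(k, 3); // the smallest call loop has length 3
    // Every function sits on that loop: all diagonal entries are set.
    assert!((0..3).all(|i| p[i][i]));
}
```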

Algorithm requirements

Or: what would other algorithms have to have in order to be good alternatives

The algorithm we use must be able to:

  1. Find ALL the shortest loops in the graph, not just the first one;
  2. Be able to store span data in order to emit lints later; and
  3. Be able to work in multigraphs.

While ideally it should also:

  1. Have reasonable memory and time performance (BFS can be pretty bad at this); and
  2. Be able to exit early if we need it to.

Of all the algorithms I've been combing through this past week, only IDDFS (plus the optional Adjacency Matrix optimization) fits the bill. Please let me know if you can think of any alternatives that would fit as well :)

@samueltardieu
Contributor

Why would you need a multigraph? It seems to me that there are several ways to use a regular directed graph.

  • If your edge payload can be arbitrary, it can be a vector of spans representing the (possibly multiple) calls from the starting vertex to the ending one.
  • Even without this, you may complement your graph with a hash table with (start_def_id, end_def_id) keys and vec![Span] values.
  • Or you can ignore spans altogether while building the graph: once a cycle is detected and you want to emit a diagnostic, then you leave the performance critical path and can do more costly things. For example, now that you know the loop you want to warn about, you may go through one node and look up calls for the next node.

Also, note that closures (and function refs) will need to be handled as well. For example, if function f calls function g and passes it closure c, then you will need to assume that f calls g, and that f also calls c, but you should not assume that g calls c in the general case (especially for g coming from the standard library – those functions are well defined and "clean"). Otherwise, you risk marking any closure passed to map, and any function using map, as being part of loops when they aren't.

Also, closures and function refs might be stored in structures that will be passed to other functions (or used through a static variable with inner mutability) and called from there. This might be another difficulty there.

@felix91gr
Contributor Author

felix91gr commented Jun 9, 2025

@samueltardieu thank you! That's a lot of useful feedback.


Why would you need a multigraph? It seems to me that there are several ways to use a regular directed graph.

  • If your edge payload can be arbitrary, it can be a vector of spans representing the (possibly multiple) calls from the starting vertex to the ending one.

  • Even without this, you may complement your graph with a hash table with (start_def_id, end_def_id) keys and vec![Span] values.

The problem I see with that is that the size of the edge table is quadratic (in the number of functions) in the worst case. I'd rather have the table be implicit, and only put in memory the parts that must be used for emitting a lint.

Are there other lints that incur quadratic costs? If it's not such a terrible cost in practice (especially with very large programs like rustc), then maybe I could use structures like those and simplify the lint's structure.

Regardless, what you say about not needing a multigraph is quite valid. Even in IDDFS, I should be able to remember what fn calls I've already looked at and not recurse through calls to the same functions.


  • Or you can ignore spans altogether while building the graph: once a cycle is detected and you want to emit a diagnostic, then you leave the performance critical path and can do more costly things. For example, now that you know the loop you want to warn about, you may go through one node and look up calls for the next node.

Indeed. That's my thinking process for using the adjacency matrix optimization. I can build the adjacency matrix first, and then find the size and starting point of the smallest loops by computing its powers. The AM is quadratic in the number of functions however, so I should probably be careful with how I represent it in memory.

What's the single program with the most functions in the Rust ecosystem? I'd like to try testing how fast or slow the AM method is with programs of sizes such as those.


Also, note that closures (and function refs) will need to be handled as well. For example, if function f calls function g and pass it closure c, then you will need to assume that

I'm going to break this last bit into several pieces because I have a lot to ask about them.

f calls g,

This makes sense to me.

and that f also calls c,

Why?

but you should not assume that g calls c in the general case (especially for g coming from the standard library – those functions are well defined and "clean").

What do you mean by "clean", exactly?

Because the main target for the Safety Critical guidelines about recursion is the (possibly uncontrolled) increase in stack size. If g calls c and c calls f, then it doesn't matter how good or clean g is; it will still give f a possible call loop unto itself. In SC contexts where a specific use of recursion is desirable, the user is tasked with vetting said call and annotating it with something along the lines of #[expect(clippy::recursion, reason="We have vetted this call to be bounded and safe enough")].


Also, closures and function refs might be stored in structures that will be passed to other functions (or used through a static variable with inner mutability) and called from there. This might be another difficulty there.

You're right. I reckon those cases might be outside the scope of this lint. Maybe those should be addressed in a future extension, or through another SC Coding Guideline that restricts function refs + closures and statics with inner mutability.

@samueltardieu
Contributor

Also, note that closures (and function refs) will need to be handled as well. For example, if function f calls function g

f calls g,

This makes sense to me.

and that f also calls c,

Why?

Imagine that g is map: calling iter.map(|x| c(x, 3)) in a function f is akin to calling c from f, except that it is done several times. If g (here map) behaves properly, then c is not stored anywhere after the call; that's what I called "clean" (i.e. no surprises done with the argument function, such as storing it to call later from an unrelated piece of code).

Just as, if from f you do iter.map(f), you certainly want this recorded as a (here direct) recursive call.

Because the main target for the Safety Critical guidelines about recursion is the (possibly uncontrolled) increase in stack size.

Maybe the "main" target, but probably not the only one. If you look at what is done in Ada, for example, many standards forbid loops with dynamic bounds (a loop has to either loop forever, when this is allowed, or run a number of times determined before the loop starts). This is another situation where you might want to avoid recursion, even if tail recursion (and thus a controlled stack size) were guaranteed.

@felix91gr
Contributor Author

Also, note that closures (and function refs) will need to be handled as well. For example, if function f calls function g

f calls g,

This makes sense to me.

and that f also calls c,

Why?

Imagine that g is map: calling iter.map(|x| c(x, 3)) in a function f is akin to calling c from f, except that it is done several times. If g (here map) behaves properly, then c is not stored anywhere after the call; that's what I called "clean" (i.e. no surprises done with the argument function, such as storing it to call later from an unrelated piece of code).

Just as if, from f, you do a iter.map(f), you certainly want this recorded as a (here direct) recursive call.

Ah! I see what you mean now. That makes sense.

However, I would leave this level of analysis to future improvements to the lint.

I would be rehashing what I said in the PR for Direct Recursion: #15006 (comment), but basically this kind of analysis requires the compiler's dataflow machinery to be much more sophisticated than it currently is. It requires an analysis tool at least capable of powering program-flow-aware constant propagation, which I know the current machinery is not adequate for.

Because the main target for the Safety Critical guidelines about recursion is the (possibly uncontrolled) increase in stack size.

Maybe the "main" target, but probably not the only one. If you look at what is done in Ada for example, many standards forbid loops with dynamic bounds (a loop has to either loop forever when this is allowed, or a number of times determined before the loop starts in other cases). This is another situation where you might want to avoid recursion, even if tail recursion (and this controlled stack size) was guaranteed.

I get what you mean about loops (that's probably a coding guideline worthy of its own lint), but I don't get how recursion enters the picture here.

Regarding Tail Recursion, there is a lint proposal since eons ago #1678, as well as a recent RFC that looks like it might finally give us guaranteed TR rust-lang/rfcs#3407. I don't think that lint is possible until we're able to guarantee TR, but whenever it becomes possible, I think it would be a nice complement to the Direct Recursion lint.

@felix91gr
Contributor Author

Here's the optimization I have in mind, better written.

The graph is the call graph of the program, which we build as an adjacency matrix.

This optimization is intended to allow us to later collect only the Spans of the functions that we know we want to lint.

The problem

  • We have a directed graph,
  • With potentially tens of thousands of nodes,
  • With relatively few edges (let's say that the number of edges E is about 10 x N, where N is the number of nodes),

Note 1: I'm going by rustc's numbers here, which seem to be 30k functions under the /compiler directory
Note 2: I say "relatively few" because a more densely connected graph can have upwards of N x N edges

From that graph, we want to know:

  1. Whether it has any loops,
  2. What the smallest loop length is (if it has any), and
  3. Which nodes loop back to themselves at that length.

To do this, the fastest solution I've been able to think of is to use an adjacency matrix A, where rows are bitfields and their dot product looks like this:


Sidenote: Dot Product for when Adjacency Matrices are implemented as bitfields

```
// Bitfield dot product
v × u is defined as:
    w = v & u                          // bitwise AND of the two row vectors
    v × u = if w == 0 { 0 } else { 1 }
```

Basically it's a normal dot product, except that scalar results greater than 1 get clamped to 1.
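Under the (invented) convention that each matrix row is a bitset of `u64` words, the clamped dot product, plus a row-oriented boolean matrix multiply built on the same trick, could look like this sketch:

```rust
// Clamped dot product: AND the two rows word by word;
// the result is 1 iff any bit survives.
fn bit_dot(v: &[u64], u: &[u64]) -> bool {
    v.iter().zip(u).any(|(a, b)| a & b != 0)
}

/// Boolean matrix product over row bitsets: row i of A·B is the OR of
/// B's row t for every bit t set in A's row i.
fn bit_matmul(a: &[Vec<u64>], b: &[Vec<u64>]) -> Vec<Vec<u64>> {
    let words = a[0].len();
    a.iter()
        .map(|row| {
            let mut out = vec![0u64; words];
            for (w, &word) in row.iter().enumerate() {
                let mut bits = word;
                while bits != 0 {
                    let t = w * 64 + bits.trailing_zeros() as usize;
                    bits &= bits - 1; // clear the lowest set bit
                    for (o, &bw) in out.iter_mut().zip(&b[t]) {
                        *o |= bw;
                    }
                }
            }
            out
        })
        .collect()
}

fn main() {
    // A -> B -> C -> A, one u64 word per row (bit j = column j).
    let a = vec![vec![0b010u64], vec![0b100], vec![0b001]];
    let a2 = bit_matmul(&a, &a);
    assert_eq!(a2[0][0], 0b100); // in two calls, A reaches C
    assert!(bit_dot(&a[0], &a2[2])); // the rows share a set bit
}
```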


My solution draft

First requirement: finding out if the graph has loops

Anyways. Using the adjacency matrix, we can quickly check if the graph contains loops. We know from algebraic graph theory:

The powers of an adjacency matrix A reach zero if and only if the graph is acyclic

And the largest power that matters for such an adjacency matrix is N, where N is the number of nodes in the graph. That is, if A is an NxN matrix, then:

A represents an acyclic graph if and only if A^N = 0

We can reach the Nth power in about log N matrix multiplications, since we can square our result repeatedly:

A → A^2 → (A^2)^2 = A^4 → … → A^(2^⌈log₂ N⌉), which after ⌈log₂ N⌉ squarings is at least the Nth power.

If the result is 0 at any step, we exit early, since we know there are no loops.
If the result is not 0 by the end, we know there are loops.

This step then has a complexity of O(N^3) for each matrix multiplication, for a total time cost of O(N^3 log N).
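A sketch of this acyclicity check via repeated squaring (plain `Vec<Vec<bool>>` for clarity; the bitfield representation from the sidenote would make the inner loop much faster):

```rust
// Boolean matrix product: (i, j) is set iff i reaches j
// through some intermediate t.
fn bmul(a: &[Vec<bool>], b: &[Vec<bool>]) -> Vec<Vec<bool>> {
    let n = a.len();
    (0..n)
        .map(|i| (0..n).map(|j| (0..n).any(|t| a[i][t] && b[t][j])).collect())
        .collect()
}

fn is_zero(m: &[Vec<bool>]) -> bool {
    m.iter().all(|row| row.iter().all(|&x| !x))
}

/// True iff the graph is acyclic: a boolean adjacency matrix is
/// nilpotent (some power is the zero matrix) exactly when the graph
/// has no loops.
fn is_acyclic(a: &[Vec<bool>]) -> bool {
    let n = a.len();
    let mut p = a.to_vec(); // invariant: p == A^reach
    let mut reach = 1;
    while reach < n {
        if is_zero(&p) {
            return true; // early exit: no walks of length `reach` at all
        }
        p = bmul(&p, &p); // squaring: A^reach -> A^(2 * reach)
        reach *= 2;
    }
    is_zero(&p)
}

fn main() {
    // A -> B -> C -> A: cyclic, so the powers never reach zero.
    let cycle = vec![
        vec![false, true, false],
        vec![false, false, true],
        vec![true, false, false],
    ];
    assert!(!is_acyclic(&cycle));
}
```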

Second requirement: finding out the smallest loop length

Optimizing the first requirement

A shortcut we can take for when there are loops in the graph, is the following fact:

If A^k has 1 in one of its diagonal elements at index [i, i], then A has a loop of at most k steps, starting and ending at node i.

So in our prior requirement, if we check at every step for the presence of a non-zero diagonal element, we will know there are loops in the graph and can continue on.

Actually computing the second requirement

Knowing that some power k of A has a non-zero diagonal, we can find the smallest such k by searching over the exponents. One caveat: "A^k has a non-zero diagonal" is not monotone in k (a lone 3-cycle shows a non-zero diagonal only at multiples of 3), so a plain binary search over the powers of A doesn't work directly. A monotone variant does, e.g. checking the diagonal of A(I + A)^(k-1), which is non-zero if and only if there is a loop of length at most k.

Finding this special k, from now on k*, will take O(log N) matrix multiplications, each of them costing O(N^3), for a total cost of O(N^3 log N).

Third requirement

After calculating A^k*, its non-zero diagonal elements will be the nodes that loop unto themselves in k* edges.

Since this step reuses the result of the second step, we incur a cost of O(N) for reading the diagonal elements.

Total cost and better alternatives?

Sparse adjacency matrices

There is something to be said for using sparse matrices. However, sparse adjacency matrices quickly become dense once we start computing their powers, so sparse representations are out of the question.

Floyd-Warshall

One could argue that the Floyd-Warshall algorithm fares better, since it costs only O(N^3). However, the matrix-power approach allows us to exit early when there are no loops, or only short ones, which should be the most common scenario for users. And we want that scenario to be as lightweight as possible.

Total cost

I don't think there's anything better than O(N^3) for computing this lint. I might be able to give a proof of this if it's needed, but my gut and a quick scribble of the associated costs tell me that it's not lower than O(N^3).

That being said, my hope is that by leveraging fast matrix multiplications, we can make the constants associated with each operation as small as possible. And that for small enough programs, performing this computation is reasonably fast.
