-
I am finishing up a small data flow framework. The implementation is slightly different from the worklist algorithm, but I would like to share it and hear some thoughts. Some background on the abstractions:

API Overview

Block
This class represents a block. It contains an instruction list and some methods for easy manipulation, e.g. Block

CFGNode
A class that represents a node in the CFG. It contains a

CFG
The class that represents the Control-Flow-Graph. It takes as input a list of

Walkthrough
The main idea is that the
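For a concrete picture of what such abstractions might look like, here is a minimal Python sketch with hypothetical names (not the author's actual API):

```python
# Hypothetical sketch of Block / CFGNode / CFG-style abstractions (not the author's actual API).
class Block:
    """A basic block: an ordered list of Bril instructions."""
    def __init__(self, name, instrs=None):
        self.name = name
        self.instrs = instrs or []

    def append(self, instr):
        self.instrs.append(instr)


class CFGNode:
    """A node in the CFG: wraps a Block and records its neighbors by name."""
    def __init__(self, block):
        self.block = block
        self.preds = set()
        self.succs = set()


class CFG:
    """The control-flow graph: a name -> CFGNode mapping plus an edge helper."""
    def __init__(self, blocks):
        self.nodes = {b.name: CFGNode(b) for b in blocks}

    def add_edge(self, src, dst):
        self.nodes[src].succs.add(dst)
        self.nodes[dst].preds.add(src)
```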
-
My code is here.

Dataflow Analysis
I copied the testcases under here and generated my own output. The testcases can be easily run following the README using

Some Limitations & Rules of my implementation

Reaching Definition
I decided not to consider the function arguments here for simplicity and brevity. It is easy to add the function arguments to the

Constant Propagation
There are some limitations, or so-called rules, for my implementation of the Constant Propagation analysis, also for the sake of simplicity. There is no constant folding or any actual computation involved. For a
It would not recognize that both args of
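To illustrate the kind of rule described above, here is a minimal Python sketch (hypothetical names, not this implementation's code) of a constant-propagation merge and transfer that track constants without doing any folding or computation:

```python
UNKNOWN = "?"

def cp_merge(pred_outs):
    """Merge constant maps from predecessors: keep a constant only if all agree."""
    merged = {}
    for env in pred_outs:
        for var, val in env.items():
            if var not in merged:
                merged[var] = val
            elif merged[var] != val:
                merged[var] = UNKNOWN
    return merged

def cp_transfer(block_instrs, in_env):
    """Propagate constants through one block; no folding, so ops produce UNKNOWN."""
    env = dict(in_env)
    for instr in block_instrs:
        dest = instr.get("dest")
        if dest is None:
            continue
        if instr["op"] == "const":
            env[dest] = instr["value"]
        elif instr["op"] == "id":
            env[dest] = env.get(instr["args"][0], UNKNOWN)
        else:
            env[dest] = UNKNOWN  # no constant folding or actual computation
    return env
```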
-
My code is here.

CFG
I used my CFG implementation from last time, though I modified it to use the function names as the name of the first basic block of a function, or

The main testing came from the correctness of the code analyses. I should add a test that shows that the multiple-function naming does something reasonable.

Reachability
I first started the project by implementing reachability (tested in cond-args-reach.bril and cond-reach.bril), then refactored it to use a worklist algorithm. The reachability implementation is here, with most of the code starting here. The main issue for me here was coming up with a way of keeping track of the variable values when they get overwritten. Instead of performing a pass and renaming the variables that get overwritten, I just annotated each variable with a list of names of the basic blocks it might have been defined in, and deleted that information whenever it got "killed". This meant that instead of the domain of my worklist algorithm being over sets, it was over a mapping from a variable name to a set of basic block names. While this generally made the worklist functions (merge, transfer, etc.) more tricky than a simple set operation, it seemed easier to me than renaming the variables while also keeping track of an equivalence relation over renamed variables, so that for instance I would know that

Worklist Algorithm
My worklist algorithm implementation is here. I started by writing the reachability code in a worklist-ish shape, then refactored the code to use a worklist abstraction once I was reasonably confident that it was working. There was a brief snag in implementing the worklist when I realized that 1) I didn't need to put things in the worklist that were already there (which I don't think causes correctness issues, but did make me sad when looking at the debugging outputs), and 2) I needed to be popping things off the worklist if I was going to check what was already there or not. The worklist algorithm is really fun! I was super happy with how little additional code getting

Liveness
Finally, I confirmed my worklist implementation by also implementing live variable checking, implemented here with most code here, and tests in cond-args-live.bril, cond-live.bril, and fact.bril (all based on the example tests). There were two main issues here. The first was that I couldn't quite share as much code with reachability as I wanted. Assuming (as we do in this class) that the compiler only needs to handle correct Bril programs, anything which is used before it's defined is going to be a live variable. However, I wanted to use a simple function to compute the variables which were used. This didn't work, because liveness cares about whether a variable is used before or after it is defined, since defining and then using a variable in a basic block (like

The second issue was that I realized that my representation for variables from reachability wasn't fantastic, in that liveness seems to mostly care about what variables exist and are used, while reachability cares about which versions of the variables are used. So I just switched to using sets of variables instead of mappings from variables to sets of basic block names. This code was fun in that it was one of the only times I've worked on an assignment and found that my implementation already handled an issue -- adding args was fine and didn't need any extra work!
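As a rough Python sketch of the "variable name -> set of defining block names" domain described above (hypothetical names, not this poster's code), the merge unions the possible defining blocks per variable, and the transfer replaces them when a variable is redefined:

```python
def merge(pred_outs):
    """Union the possible defining blocks for each variable across predecessors."""
    merged = {}
    for env in pred_outs:
        for var, blocks in env.items():
            merged.setdefault(var, set()).update(blocks)
    return merged

def transfer(block_name, block_instrs, in_env):
    """A definition in this block 'kills' the old defining blocks for its dest."""
    env = {var: set(blocks) for var, blocks in in_env.items()}
    for instr in block_instrs:
        dest = instr.get("dest")
        if dest is not None:
            env[dest] = {block_name}  # overwrites any earlier definition sites
    return env
```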
-
My general dataflow solver implementation is here.

Description
I implemented five passes/analyses:
These passes can be invoked by passing different arguments to the

All passes use the same general worklist solver; each pass has its own merge and transfer functions. Reaching definitions and live variables are intuitive to implement. For CP, CSE, and CF, I added the local LVN from the last assignment to the transfer function.

Tests
I added all the df tests from Bril's examples, and added non-local CP, CSE, and CF test cases. All tests can be checked with

Discussion
My idea for implementing constant propagation and other LVN-related passes is to generate value tuples at the input of each basic block. When we perform LVN on a basic block, its upstream value tuples are treated as "pseudo instructions" to provide information about constants and existing expressions. When adding LVN, I noticed one issue: since all the merge/transfer functions are based on set operations, the order is ignored. This can cause issues for LVN, for example:
When we perform LVN on the

My ad-hoc solution for this issue is to convert the upstream set of "pseudo instructions" to a list first, and then move all the constant operations up front.
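A minimal sketch of that fix, assuming value tuples whose first element is the op (a hypothetical representation, not the actual code):

```python
def order_pseudo_instrs(pseudo_instrs):
    """Turn the unordered collection of value tuples, e.g. ('const', 'x', 4) or
    ('add', 'y', 'x', 'x'), into a list with constants first, so LVN sees the
    constants before the expressions that use them."""
    pseudo_list = list(pseudo_instrs)
    constants = [t for t in pseudo_list if t[0] == "const"]
    others = [t for t in pseudo_list if t[0] != "const"]
    return constants + others
```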
-
Links
Programs are at:

Summary
I implemented dataflow analysis using the worklist algorithm covered in class. I implemented several analyses that seemed interesting to me: reaching definitions, constant propagation, live variables, and available expressions. For each of these analyses, I wrote test cases and used turnt as a regression testing system to make sure changes I made did not break previously checked test outputs. The dataflow analyses can be run by running

Testing
I wrote different test programs for each of the analyses. In general, I aimed to check that specific features of each analysis were correct. For instance, for the liveness analysis, I created tests that make sure that variables defined in one basic block would

I then used turnt testing at the end to make sure changes I made in the programs did not make the test results incorrect.

Results
In terms of results, for this task, I reviewed my results from each test and made sure
I checked the output:
which I checked to make sense relative to the definitions we had been given.

Difficulties
When I first started, I had trouble understanding the live variables analysis. It took me a bit of time to understand that live variables are variables that have been defined and will have their value used sometime in the future. Another problem I faced was with the available expressions analysis, where I associated each expression with a value, like those calculated in local value numbering. I later realized that available expressions should be thought of as combinations of variables that you do not want to recompute. Interestingly, making the worklist solver generic was not too complicated, though that might be because I used Python. I can imagine that using another language like OCaml would make the worklist solver more difficult to parametrize, especially depending on what containers or functions one chooses to use.
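As a rough illustration of how small such a generic solver can be in Python (a sketch with hypothetical parameter names, not this poster's code), the forward worklist algorithm from class fits in a few lines:

```python
def worklist_forward(blocks, preds, succs, init, merge, transfer):
    """blocks: name -> list of instrs; preds/succs: name -> list of block names.
    init() gives the starting value; merge combines predecessor outputs;
    transfer pushes a value through one block."""
    in_map = {name: init() for name in blocks}
    out_map = {name: init() for name in blocks}
    work = list(blocks)
    while work:
        name = work.pop()
        in_map[name] = merge([out_map[p] for p in preds[name]])
        new_out = transfer(blocks[name], in_map[name])
        if new_out != out_map[name]:
            out_map[name] = new_out
            work.extend(succs[name])  # reprocess blocks whose input may change
    return in_map, out_map
```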
-
Team
The assignment was done by our group: Ayaka Yorihiro (ay436), Anshuman Mohan (am3327), and Shubham Chaudhary (sc2937). Yay for compilers!

Data Flow Analysis
The python file

We implemented the following analyses:
There is a Turnt test specification in

Examples
Experience
While Python made using and manipulating Bril JSON very easy, one pitfall we fell into a couple of times was with references and mutating methods. We therefore had to be careful to make sure that if we are returning objects from functions, we are returning
When
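A small illustration of the copying discipline described above (a hypothetical example, not the group's code):

```python
import copy

class Analysis:
    """Hypothetical holder of per-block dataflow results."""
    def __init__(self):
        self._out = {"entry": {"x": 1}}

    def block_out(self, name):
        # Returning self._out[name] directly would alias internal state, so a
        # caller that mutates the result would silently corrupt the analysis.
        # Returning a deep copy avoids that.
        return copy.deepcopy(self._out[name])
```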
-
Summary
I implemented constant propagation for my data flow analysis. I used previously-written code to construct a CFG from a Bril program, and wrote some additional code to construct a predecessors CFG (a mapping from block name to predecessors). Here is an example run:
Output:
Notice that in the last line of output, the values of a and c are known, but d is unknown. I could further optimize this analysis to do constant folding and assign d the value 37.

Testing
I used turnt to test, and I wrote several new example programs in the test/df directory. I made sure to test programs with branches that resulted in different variable assignments (and checked that these variable values were unknown after the merge), and some that resulted in the same variable assignments (and checked that these variable values were known after the merge).

Thoughts
Implementing constant propagation was pretty straightforward. The only unexpected issue I ran into was with printing my dictionaries directly using print and getting different orderings of variables every time, which turnt did not like. So I had to implement a printer function that sorted by keys.
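A deterministic printer along those lines might look like this (a small sketch, not the actual code): sorting by key keeps the output stable so turnt diffs don't churn.

```python
def print_env(block_name, env):
    """Print a block's constant environment with keys in sorted order."""
    pairs = ", ".join(f"{var}: {val}" for var, val in sorted(env.items()))
    print(f"{block_name}: {{{pairs}}}")
```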
-
I implemented a generic dataflow analysis framework that enables users to easily plug in different algorithms. My code can be found here, and tests are in the

Since I implemented constant propagation with simple control flow last time, and I think it is relatively straightforward to do that within the dataflow framework (it essentially needs to plug LVN into the transfer function), I didn't put much effort into that. Instead, I tried to implement uninitialized variables and sign analysis, which are not directly covered in the lecture but are widely used in static analysis.

Reaching definitions
This is the textbook algorithm but is still somewhat tricky, since it actually works on instructions/definitions instead of variables. Therefore, the implementation in class is not completely correct, and it is easy to conflate with live variables. To get the correct definition references, we need to first label those definitions. I used

I also added support for function arguments, which should be added to the initial definitions.

Live variables
The provided transfer function is correct in general but requires more care in the implementation. Consider the following example: if using the transfer function
To fix this problem, we should do the instruction-wise analysis, meaning Set
By traversing the instructions from back to front, we have

In summary, this example shows that sometimes the transfer function should be applied per instruction instead of per block; otherwise, it may lead to a different result.
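To make the per-instruction version concrete, here is a sketch of a backward, per-instruction liveness transfer in Python (a hypothetical instruction representation, not this poster's code):

```python
def live_transfer(block_instrs, live_out):
    """Backward, per-instruction liveness transfer over one basic block."""
    live = set(live_out)
    for instr in reversed(block_instrs):
        dest = instr.get("dest")
        if dest is not None:
            live.discard(dest)               # the definition kills the variable...
        live.update(instr.get("args", []))   # ...then its uses make operands live
    return live
```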
Uninitialized variables
This algorithm tries to examine whether there exist uninitialized variables in the program. I take all the variables and make them uninitialized at the beginning. When variables are defined in a block, they are removed from the set.

Take the above program as an example: this analysis can capture all the uninitialized variables along different paths. Although
Actually, this piece of program is valid in Python, since Python does not have this kind of static analysis to check whether a variable is initialized on all paths. But for C/C++, this should be an obvious error that is caught at compile time.

Sign analysis
Sign analysis basically determines the sign of each variable. I used the lattice definition, except for

For example, for the add operation, the sign analysis should follow the computation rule below.
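As a rough illustration (my own reconstruction, not necessarily the author's exact table), the add rule over a sign lattice can be written as a small lookup, where '+', '-', '0' are signs, 'T' is top (unknown), and 'B' is bottom (uninitialized):

```python
# Reconstructed sketch of a plausible add rule for sign analysis.
ADD_RULE = {
    ("+", "+"): "+", ("-", "-"): "-", ("0", "0"): "0",
    ("+", "0"): "+", ("0", "+"): "+",
    ("-", "0"): "-", ("0", "-"): "-",
    ("+", "-"): "T", ("-", "+"): "T",  # opposite signs: result could be anything
}

def add_sign(a, b):
    if "B" in (a, b):
        return "B"   # an uninitialized operand propagates bottom
    if "T" in (a, b):
        return "T"   # an unknown operand gives an unknown result
    return ADD_RULE[(a, b)]
```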
I only defined the rules for arithmetic operations like
The analysis outputs the following signs. We can see that
Sign analysis actually does something similar to finding uninitialized variables. Once the sign of a variable is
-
Introduction
I decided to skip writing a single data flow analysis and went straight to writing a generic data flow solver. The reason I chose to do this is that I have already written live variable analysis, reaching definitions, available expressions, and available copies for CS 4120.

Implementation
In writing a generic solver, I first created a CFG with the ability to get the predecessors and successors for each block in O(1) time. I first created a mapping to help me identify blocks in the CFG uniquely. I chose to identify each block by the label at the start of the block. I modified Professor Sampson's form_blocks code to generate a label at the start of any block that does not have one; this can occur at the start of a function. From here I iterated through each block, and in each iteration I built up two mappings: one from block name to successor blocks, and another from block name to predecessor blocks. These two maps allow us to later look up the successors and predecessors of any block in constant time.

The next part was to create the generic dataflow framework. As discussed in class, there are 4 elements to any analysis:
From here I realized that although we were working on sets, there is a distinction between having a set with real and tangible values and the top set, which is just an idea. I opted to create a representation that captures this:

The next step was to build the worklist algorithm. Since in and out, and preds and succs, swap depending on the direction of the analysis, I ended up with duplicate code to make this happen. After cleaning it up a little, the worklist function fit on a screen as a single page of code. Although I believe I could refactor out some more commonalities, I leave this as future work. I found the actual implementation of this step quite straightforward.

Next, I created a few different analyses. I chose to implement reaching definitions, live variable analysis, and defined variable analysis. I chose to do reaching definitions because we discussed it in class. I did live variable and defined variable analysis so I could reuse Professor Sampson's tests, and also because it gave me both a forward and a backward analysis to test. The last step was to create some pretty-printing functions, which was quite trivial.

Testing
I thought I was smart in implementing analyses that Professor Sampson used so I could compare against his tests. However, I was printing out sets in arbitrary order (since sets in Python are unordered). This made my output non-deterministic. I tried running live variable analysis for a small program a few times and eventually I got lucky on the ordering of the prints that

But moving to larger programs made the probability of getting the same printout as Professor Sampson's quite low. So I inspected manually for each program in the

In testing, I caught some bugs with my live variable analysis. Specifically, I had written the

Although my tests have passed the 3 or 4 examples in the test suite, I cannot say that I am very convinced that my code works. In order to be more confident, I would love to test on larger and more complex programs, but I cannot think of another way to do it than manual inspection. I suppose I could hammer out the ordering of printing sets and compare my output to Professor Sampson's output on more programs, but this only shows how correct I am relative to how correct his analysis is. It's better than nothing, though. Another way to become more confident is to write transformations that use these analyses. Then I can run the transformed code and compare against expected values for all the benchmark programs. This is what I plan to do in the next lessons, which is why I did not spend the time ordering my sets and doing manual inspection on larger programs. I understand this approach is less modular, and in a production environment I would suck it up and make sure the analyses are correct independent of any transformation, but in this learning environment, I hope this is sufficient.
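Returning to the "concrete set vs. top" distinction mentioned above, one way to capture it is a small wrapper that is either a real set or the symbolic top element; this is a hypothetical Python sketch, not this poster's actual representation:

```python
class Value:
    """Either a concrete set of facts or the symbolic 'top' (all possible facts)."""
    def __init__(self, items=None, is_top=False):
        self.is_top = is_top
        self.items = None if is_top else set(items or ())

    def union(self, other):
        # Top absorbs everything; otherwise fall back to ordinary set union.
        if self.is_top or other.is_top:
            return Value(is_top=True)
        return Value(self.items | other.items)

    def __eq__(self, other):
        return self.is_top == other.is_top and self.items == other.items


TOP = Value(is_top=True)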
-
My implementation is here

CFG
I went through a couple of iterations of what the CFG should be, and settled on what in hindsight was not ideal. I made a class for basic blocks that had a label, a list of instructions, and a variant for the outgoing edge(s): either going to another label, returning something, or a branch. This way I could write my analyses ignoring these control flow instructions. This was very stupid, since branch instructions "use" their condition, so I had to put them back in. In hindsight I would not have done it this way. The actual CFG is three hashmaps: labels -> labels for succs, labels -> labels for preds, and labels -> blocks to keep track of the code. This is partially because it is difficult to hash objects in Lisp, so it is much easier to hash only strings.

Dataflow
I actually believe that implementing the generalized dataflow was easier, as it gave me another layer of abstraction to work with. Also, Lisp has first-class-ish functions, so the syntax was pretty easy, and I could pretty much copy it from the lecture. I made a recursive function to do the flowing, so the nodes are processed as if they are being pushed onto a stack. I'm honestly not quite sure if this is ideal or not, but in my (somewhat limited) testing it didn't seem to be a problem. I implemented everything as a forward analysis, and then sneakily defined helper functions for get-succs and get-preds that have extremely misleading names when you are running a backward analysis :) I didn't swap the in and out maps, so the user has to make sure to put them the right way around when they get them returned (this is the return value of my worklist algorithm).

Reaching Definitions
After implementing the worklist algorithm, I implemented reaching definitions, partially as a way to test it, and partially because, well, that was the assignment. Again, this was pretty much just copied from the lecture, but I did have to add numbers to instructions. Each instruction in the function (including labels, since why bother filtering them out) is assigned an index so that if there are two "identical" instructions they can be easily differentiated. I initially forgot to do this, but it was pretty easy to add in. This was all relatively straightforward, as I used lists for my sets, and Lisp has built-in union/intersection/set difference on lists. The output for this was not the prettiest: I put a label for each function, and under that a label for each label, corresponding to a basic block. I then gave the list of definitions at the beginning and end of that block. I used the default printing of my internal representation, so it's readable, but not very Bril-like.

Live variables
In order to make sure I got the backward analysis working properly, I also implemented live variable analysis. This again was pretty much just from the lecture, and didn't present any serious issues. This was when I realized that I needed to include branch instructions to count their uses. Similar to reaching defs, I printed the live variables at the beginning and end of each basic block. I don't know why I was surprised to see that a variable defined and only used in a single block is never reported as live, but this does make sense lol.

Testing
Testing was one of the most challenging parts of this assignment, as I couldn't just run the code and make sure it did the right thing. Initially, I wrote the CFG, and wrote a function to recover the instructions from it.
Then, I verified that program -> cfg -> program preserved behavior on the benchmarks by using turnt. For my analyses, I did most of my work in the REPL, as it was a more integrated experience and helpful for testing bits and pieces of things. For the "real" testing, I again used turnt with the -pv flags so that I could manually inspect the output. When I was convinced it was correct, I ran turnt --save, so that if I found a bug later I could make sure I didn't break anything fixing it. I took the suggestion of using command line arguments to select the analysis, and that made integration with turnt pretty painless.

Difficulties/experience/mistakes
Lowlights:
Remark
I should have started earlier and implemented more analyses. I think doing one forward and one backward analysis means that I probably eliminated most bugs from the actual dataflow machinery, though.
-
I implemented a general dataflow framework in Rust. On top of this framework I implemented live variable analysis and defined variables. It can easily be extended to support any of the discussed dataflow analyses.

CFG
My CFG representation is a struct with maps for predecessors and successors, represented as Strings, and an ordered map from label strings to blocks.

Dataflow
A dataflow pass is defined in my framework as a struct that implements the Dataflow trait, written below:
Item can be set to change the type of items stored in the dataflow analysis sets. A struct that implements this trait is then passed into the general worklist algorithm, which calls the corresponding functions.

Testing
The framework is tested using a small set of turnt tests. The output is just a simple formatting of the dataflow analysis output, but this can be easily modified for each specific analysis.

Discussion
The most challenging part of this assignment was figuring out the best way to represent the generic dataflow framework in Rust. Dealing with generics mixed with structs in Rust is still new to me, but I think I'm starting to get the hang of it.
-
My implementation: https://github.coecis.cornell.edu/awy32/Brilliant/tree/main/lib/df

CFG
Yes, I was the person called out by Prof. Sampson in lecture the other day for using a full-fledged graph library to build the CFG. In fact my initial idea was even worse, as I tried to put the blocks themselves directly as nodes in the graph. This is very expensive and requires the block type to be hashable, comparable, etc., so I decided to just switch to a graph where the nodes are the block names. This has the additional advantage that it's really easy to build up the edges when converting a function to a CFG, for if an edge target node doesn't exist yet, you can just create it without having to worry about the instructions in the block represented by the node. One advantage of using the graph library, though, is that I get a digraph, so preds and succs are already defined functions. Furthermore, I decided to use mutable arrays instead of lists to hold the instructions inside each block. This should make traversal by index and applying optimizations much easier.

Dataflow
I built a generic dataflow framework, which turned out to be quite intuitive. The OCaml module system is very helpful here. I basically defined a module type for posets, requiring the appropriate functions and values like

Reaching Definitions
This one is a bit tricky because OCaml has finicky physical equality. In order to tell various copies of an instruction apart, I need to either add a field to the instruction type (which would require rewriting the parsing from JSON, etc.) or represent the instructions in the poset values in a different way. I chose the latter after Sampson's suggestion on Zulip and recorded the

Testing
This has not happened despite my valiant effort. However, I would argue that since I'm writing the code in OCaml, the fact that the code compiles means it's probably correct...

Other Dataflow Analyses
I plan to churn out a few more analyses, which should be really quick.
-
My implementation is here

CFG
For constructing the CFG, I relied on the method from Lesson 2 with the

Also, I guess I wanted to get ahead of any duplicate labels being generated, so I used the Python uuid library to generate block names, but this is a bit overkill IMO.

Dataflow Analysis
I didn't get a chance to implement constant propagation with LVN, so I tried to do it here with dataflow analysis. Implementing the worklist algorithm was the easiest part, especially after implementing it for just finding definitions. I structured my code as closely to the pseudocode as possible, and it ended up working pretty well with Python. Implementing constant folding required some drawing of CFGs and looking at how to solve the equations. If I had more time, I'd try to modularize my worklist function, which I got close to doing, but I lack DFA configs for the various analyses, which I need to figure out. For demonstration, my program
gets analyzed to
Testing
I moved my testing directory to be a general dump of all the tests I've created so far. This is one category I need to do more with: organizing each test around specific functions rather than running through all of them. Especially with my DFA not producing executable Bril, a lot of testing was just by inspection.
-
My implementation is here. I implemented a generic dataflow analysis solver and the reaching definitions dataflow analysis, which I tested using the given tests.

Generic Dataflow Analysis Solver
My generic solver, written in Python, implements the worklist algorithm and takes an initial value, meet operator, and transfer function as its input (along with an optional pretty-printer parameter). It keeps track of a mapping from each block to its input and output, and assumes that each block begins with a unique label (which is accomplished using my cfg implementation). The CFG is represented by
Limitations
Admittedly, I haven't yet tested my generic solver on an analysis that requires a backward pass, but the backward pass works simply by inverting the in and out objects both before and after the dataflow analysis, effectively reversing the direction of the control flow.

Discussion
The worklist algorithm was surprisingly easy to implement, but I ran into trouble implementing the dataflow analysis for reaching definitions because I did not realize that in Python, when a dictionary is shallow-copied, its values are not copied, so altering a mutable value in one dictionary will also alter the value in the copy.
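For what it's worth, here is a sketch of that trick with hypothetical names (not this implementation): hand a forward solver the reversed edges, then relabel the resulting maps. The transfer function itself is still responsible for scanning its block back to front.

```python
def solve_backward(solve_forward, blocks, preds, succs, init, merge, transfer):
    """Run a backward analysis by giving a forward worklist solver the reversed CFG."""
    # Feed the solver successors where it expects predecessors, and vice versa.
    fwd_in, fwd_out = solve_forward(blocks, succs, preds, init, merge, transfer)
    # Relabel: what the forward solver calls "in" (merged over real successors)
    # is the backward analysis's "out", and vice versa.
    return fwd_out, fwd_in
```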
-
Here is my implementation.

The data flow framework
The algorithm used is exactly the same as the pseudocode provided in class. The

I have implemented a generic solver so that only three new functions and one property need to be defined for new analyses, which are:
It's also possible to add common helper functions that multiple analyses can benefit from to the "Analysis" class.

Dataflow Pass
I have implemented

Discussion
I feel like implementing dataflow algorithms is fairly straightforward. The interesting part was to design a generic solver that maximizes reuse across different dataflow analyses. I feel like there is still a lot of work (tooling) that could be done on the generic framework to support new dataflow analyses while keeping it clean and simple.
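To make the shape of such an analysis spec concrete (the three functions plus one property mentioned above), here is a hypothetical Python sketch for a defined-variables pass; the actual "Analysis" interface in the repo may differ:

```python
class DefinedVariables:
    """Hypothetical analysis spec: three functions plus one property."""
    direction = "forward"                    # the one property

    def initial(self):                       # function 1: the starting value
        return set()

    def merge(self, values):                 # function 2: combine predecessor outputs
        out = set()
        for v in values:
            out |= v
        return out

    def transfer(self, block_instrs, in_value):  # function 3: flow through one block
        defined = set(in_value)
        for instr in block_instrs:
            if "dest" in instr:
                defined.add(instr["dest"])
        return defined
```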
-
I implemented the generic dataflow analysis algorithm here and tested it on the constant propagation use case. Below is an example on the fact.bril test with constant propagation:
Constant propagation result (the first line in each basic block corresponds to the first instruction in the block):
Limitations:
My current implementation does not do constant folding, just propagation (see discussion below).

Discussion
I'm wondering if dataflow analysis can be hooked up with constant folding (for example) to provide global constant folding. For example, in the above program, would it be possible to compute all eight iterations at compile time and fold everything, including the branches? My thought is that this is a very advanced feature for compilers, and very few real compilers implement it. Am I right? For example, such global constant folding is basically the whole constexpr business in C++11 and higher (which allows folding a whole program to a single value), and, I suppose, it's extremely complex to implement. Here is an awesome example of what one can do with global constexpr folding in C++: https://bitbucket.org/jjeka/mathcpp/src/master/ - the code reduces complex computations (with brackets) over const numbers expressed as string expressions (like "(67+987^(7-32))(34-123)+17^2+(-1)") into a single value entirely at compile time. A 500-LoC program gets reduced to a single-line one ;)
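On the narrower question of folding inside the dataflow pass itself, one common approach (sketched here under assumed names, not part of this implementation) is to let the constant-propagation transfer function evaluate an operation whenever all of its arguments are already known constants:

```python
UNKNOWN = "?"

# Hedged sketch: fold an op inside the CP transfer when every argument is constant.
FOLDABLE = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def fold_instr(instr, env):
    """Return the folded constant for `instr`, or UNKNOWN if any arg is unknown."""
    op = instr["op"]
    args = [env.get(a, UNKNOWN) for a in instr.get("args", [])]
    if op in FOLDABLE and all(a != UNKNOWN for a in args):
        return FOLDABLE[op](*args)
    return UNKNOWN
```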
-
Implementation and Testing
I implemented constant folding, live variable analysis, and a (probably very broken) interval analysis here. A lot more detail went into the README there and I don't want to really repeat myself, so I'll leave a brief description here. I also made my dataflow analysis general, so as long as your combine operator follows a monotone lattice definition, it would work. For example, I had a whole "evaluate_expr" function that would take the most specific guess it could get, but if it couldn't guess it would guess '?' (for constant folding). I tested on the lvn tests (to verify constant folding was working), the dataflow tests, and some of my own tests that I thought of to test loops. Live variables and constant folding seemed to work fine, but interval analysis kept breaking (or did it?). For example, when I had an infinite loop that kept dividing by two, the interval analysis would say [-MAXINT, MAXINT] for the in of the block and [-MAXINT/2, MAXINT/2] for the out of the block. I mean, I guess, but I feel like it should be more precise than just saying literally MAXINT and MAXINT/2.

Dataflow Analysis Theory
After implementing general dataflow analysis, I wasn't sure if I understood why dataflow analysis works. At first, I sort of intuitively understood monotonicity as "any time you join two variables, make sure you can't get anything more specific than the inputs", and that seemed to work for constant folding and live variable analysis, but when it got to interval analysis my programs kept breaking (well, not "breaking" per se, but the resulting variables were way too general), and I thought that assignment operations didn't seem so monotone. So I decided to look into the book that was supplied in class to learn more about the theory behind the algorithm. To be honest, I just handwaved everything past exercise 4.29 as "oh of course if you repeatedly apply functions they'll converge" or something. But I read all the way up to exercise 4.28, and I think my understanding of why the algorithm works has increased to the point where I'm at least convinced it works. I thought that assignment operations were weird - sometimes you can override a "top" variable with a smaller variable, which doesn't seem monotonic at all. But then, thinking through exercises 4.26-4.28, what it's really saying is that if the functions that act on each variable are individually monotone, then the function on the entire mapping lattice is monotone. I think what I was thinking of as "monotone" before was actually part 2 of exercise 4.26, where I thought you were changing the function itself, instead of changing the set of functions that act on the entire mapping. If that were the case, because of part 2 of exercise 4.26, of course changing the function itself isn't monotone, but exchanging one monotone function for another is still monotone. If that's the correct interpretation, I think that's pretty cool - instead of needing to think about whatever f_1...f_n is with the dataflow constraints, which I didn't really get, all you need to ensure is that every function you apply to a variable is monotone, which is easier and more elegant to think about IMO. So the two types of operations in constant folding are "evaluate_expr" and "assignment". The former is monotone because I specifically programmed it that way; the latter is monotone because it's a constant function. (at least, that's what I think is going on.
I could have just misinterpreted the whole thing, which is def possible since I had to ask a friend for help to understand what was going on in exercise 4.26 and took 2 days to understand that part alone lol) (also, did they get the notation wrong? Why is f_1(x) a mapping from L^n -> L, but x_1 = [some mapping from A -> L]? I understood the A -> L mapping thing but didn't understand what f_1(x) was really doing; I kind of just interpreted it the way I did above, which is another reason why I'm a little hesitant to just conclude my interpretation is correct). It's a really cool algorithm, regardless - it kind of reminds me of Bellman-Ford because of how you can locally update the answer and achieve a global minimum (or a global minimum to the best of our ability), and it also reminds me of segment trees because of using a mathematical property to generalize operations more abstractly (in the case of segment trees, it's using monoids; here it's using lattices and the idea of monotonicity).
-
This thread is for summarizing how your implementation of the data flow framework and its client optimizations went!