Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

goto Removal Algorithm #72

Open
aminya opened this issue Apr 26, 2020 · 3 comments
Open

goto Removal Algorithm #72

aminya opened this issue Apr 26, 2020 · 3 comments

Comments

@aminya
Copy link

aminya commented Apr 26, 2020

LLVM C back-end now generates a lot of gotos in the C code. These gotos can be removed by using certain transformations on the code. This can significantly improve the readability of the code.

In the following, I discuss a practical summary of the algorithm that was introduced in Taming Control Flow: A Structured Approach to Eliminating Goto Statements, and in the end, an implementation by @mbergin is given.

goto Removal Algorithm

It has 5 steps (keep reading if it has unknown words. Most of these are discussed later):

  • Collect a list of all goto and Target statements.
  • "Introduce one logical variable for each Target, initialize it to false, and insert another reinitialization to false at the point of the Target. These are required to make sure that the value of the logical variable is false on all paths except the path coming from the point at which the appropriate conditional test evaluated to true."
  • Convert unconditional gotos to conditional gotos: if (true) goto Target;
  • Eliminate Each goto one at a time: do movement transformations if needed to bring Target and goto to the same level (make them sibling). Then applying elimination transformation.
  • Remove unused labels

There are two main types of transformation that need to be applied:

Elimination Transformations:

when goto and Target are in the same level (siblings):

goto before target:

// ...

if (goto_condition) {
	goto Target;
}
statement_1;
Target: target_statement

    can be replaced with:

// ...

if (!goto_condition) {
	statement_1;
}
Target: target_statement

goto after target:

// ...

Target: target_statement;

if (goto_condition) {
	goto Target;
}

    can be replaced with:

// ...

do {
	Target: target_statement;
} while(goto_condition)

Movement Transformations:

when goto and Target are in different levels:

We should apply these transformations to bring the gotos to the same level of their Target. Then, we can use elimination transformations to remove them.

Each of these transformations moves goto up a level. So, it should be repeated until goto and its target are at the same level.

Outward-movement:

It doesn't matter if Target is before or after goto.

Move goto out of loop:

All types of loops are similar:

for (expression) {
    // ...
	if (goto_condition) {
		goto Target;
	}
    // ...
}
// ...
Target: target_statement;

    can be replaced with:

for (expression) {
    // ...
    goto_condition_var = goto_condition;
	if (goto_condition_var) {
		break;
	}
    // ...
}
if (goto_condition_var) {
    goto Target;
}
// ...
Target: target_statement;

Move goto out of if:

if (expression) {
    // ...

    if (goto_condition) {
        goto Target;
    }
    statement_1;
}
// ...
Target: target_statement;

    can be replaced with:

if (expression) {
    // ...

    goto_condition_var = goto_condition;
    if (!goto_condition_var) {
        statement_1;
    }
}
if (goto_condition_var) {
    goto Target;
}
// ...
Target: target_statement;

Move goto out of switch:

switch(i): {
	case 1:
        // ...

    	if (goto_condition) {
    		goto Target;
    	}

        // ...
    	break;

	case 2:
        // ...

    default:
        // ...
}
// ...
Target: target_statement;

    can be replaced with:

switch(i): {
	case 1:
        // ...

        goto_condition_var = goto_condition;
    	if (goto_condition_var) {
    		break;
    	}

        // ...
    	break;

    case 2:
        // ...

    default:
        // ...
}
if (goto_condition_var) {
    goto Target;
}
// ...
Target: target_statement;

Inward-movement:

We here assume that goto is before Target. If not, you should do go-lifting transformation before this step to bring goto before.

Move goto into a while-loop:

if (goto_condition) {
    goto Target;
}
statement_1;
while (loop_expression) {
    // ...
    Target: target_statement;
    // ...
}

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition_var) {
    statement_1;
}
while (goto_condition_var || loop_expression) {
    if (goto_condition_var) {
        goto Target;
    }
    // ...
    Target:
        goto_condition_var = false;
        target_statement;
    // ...
}

Move goto into a for-loop:

We convert for to an equivalent while and then transform it similar to the previous transformation:

if (goto_condition) {
    goto Target;
}
statement_1;
for (loop_expression) {
    // ...
    Target: target_statement;
    // ...
}

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition_var) {
    statement_1;
}
i = 0;
while (goto_condition_var || equivilant_loop_expression) {
    if (goto_condition_var) {
        goto Target;
    }
    // ...
    Target:
        goto_condition_var = false;
        target_statement;
    // ...

    i++; // loop counter
}

Move goto into a do-while-loop:

if (goto_condition) {
    goto Target;
}
statement_1;
do {
    // ...
    Target: target_statement;
    // ...
} while(loop_expression)

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition_var) {
    statement_1;
}
do {
    if (goto_condition_var) {
        goto Target;
    }
    // ...
    Target:
        goto_condition_var = false;
        target_statement;
    // ...
} while(loop_expression)

Move goto into a if:

  • Target is in the then part:
if (goto_condition) {
    goto Target;
}
statement_1;
if (if_expression) {
    // ...
    Target: target_statement;
    // ...
}

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition_var) {
    statement_1;
}
if (goto_condition_var || if_expression) {
    if (goto_condition_var) {
        goto Target;
    }
    // ...
    Target: target_statement;
    // ...
}
  • Target is in the else part:
if (goto_condition) {
    goto Target;
}
statement_1;
if (if_expression) {
	// ...

} else {
	if (goto_condition_var) {
			goto Target;
	}
	// ...
	Target: target_statement;
	// ...
}

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition_var) {
    statement_1;
}
if (!goto_condition_var && if_expression) {
    // ...

 else {
	 // ...
	 Target: target_statement;
	 // ...
}

Move goto into a switch:

if (goto_condition) {
    goto Target;
}
statement_1;
switch(i): {
	case 1:
        // ...
        Target: target_statement;
        // ...
    	break;

	case 2:
        // ...

    default:
        // ...
}

    can be replaced with:

goto_condition_var = goto_condition;
if (!goto_condition) {
    statement_1;
    t_switch = i;  // `i` is the original switch value
} else {
    t_switch = 1;  // `1` is the case that contains `Target`
}
switch(t_switch): {
	case 1:
        if (goto_condition_var) {
            goto Target;
        }
        // ...
        Target: target_statement;
        // ...
    	break;

	case 2:
        // ...

    default:
        // ...
}

Goto-lifting:

Inward movements require the goto to be before Target. In cases where Target is before goto, we can do goto-lifting transformation first and then apply inward movement.

The transformation is as follows:

// ...
// ...
Target: target_statement;
// ...
if (goto_condition) {
    goto Target;
}
// ...

    can be replaced with (we move the statement that contains Target):

int goto_condition_var = false;
// ...
do {
    if (goto_condition_var) {
        goto Target;
    }
    // ...
    Target: target_statement;
    // ...
    goto_condition_var = goto_condition;
} while (goto_condition_var);
// ...

Conflict of do-loop with outer break/continue:

There is a possibility that a do loop that we introduce has a break or continue that belongs to an outer loop or switch statement. To solve these cases, we use an extra check. For example:

while(true) {   // outer loop
    Target: target_statement;
    // ...
    if (outer_loop_break_condition) {
        break;
    }
    // ...
    if (goto_condition) {
        goto Target;
    }
    // ...
}

    should be replaced with the following for introducing do-while-loop instead of what we discussed before:

int do_break = 0;
while(true) {   // outer loop
    do {
        Target: target_statement;
        // ...
        if (outer_loop_break_condition) {
            do_break = 1;
            break;
        }
        // ...
        goto_condition_var = goto_condition;
    } while( goto_condition_var);
    if (do_break) {
        do_break = 0;
        break;
    }
    // ...
}

Optimizations

The cases that we can simplify our transformations:

  • goto next to Target:
// ...
if (goto_condition) {
    goto Target;
}
Target: target_statement;

    is reduced to:

// ...
Target: target_statement;
  • goto at the end of an if:
if (expression) {
	// ...
	if (goto_condition) {
   	 goto Target;
	}
}
// ...
Target: target_statement;

    is replaced with:

if (expression) {
	// ...
	goto_condition_var = goto_condition;
}
if (goto_condition_var) {
	 goto Target;
}
// ...
Target: target_statement;
  • goto right before a loop:
if (goto_condition) {
	 goto Target;
}
while (loop_expression) {
	//...
	Target: target_statement;
	//...
}

   
    is replaced with:

goto_condition_var = goto_condition;
while (goto_condition || loop_expression) {
	if (goto_condition_var) {
		 goto Target;
	}
	//...
	Target: target_statement;
	//...
}
  • goto right before switch
if (goto_condition) {
	 goto Target;
}
switch (i) {
	case 1:
		//...
		Target: target_statement;
		//...
	case 2:
		//...
	default:
		//...
}

    is replaced with:

goto_condition_var = goto_condition;
if (goto_condition_var) {
	i = 1; // `1` contains `Target`
}
switch (i) {
	case 1:
		if (goto_condition_var){
			goto Target;
		}
		//...
		Target: target_statement;
		//...
	case 2:
		//...
	default:
		//...
}
  • More than one goto associated with a Target inside the same while, if, switch or loop statement:

It would be preferable to insert only one conditional check per label. We can first check to see if the appropriate conditional has already been inserted, and avoid duplicating the code.

For example:

switch (i) {
	case 1:
		//...
	case 2:
		//...
		goto Target;
		break;
	case 3:
		//...
		goto Target;
}
Target: target_statement;

    can be replaced with:

switch (i) {
	case 1:
		//...
	case 2:
		//...
		goto_condition_var = true;
		break;
	case 3:
		//...
		goto_condition_var = true;
		break;
}
if (goto_condition_var){
	goto Target;
}
Target: target_statement;

Implementation:

This algorithm is implemented by @mbergin in https://github.com/mbergin/controlflow. The important functions of this program are:

  • ElimGotos/elimGotos : main function that removes any goto statements from stmts by rewriting them as conditionals and loops. It has a switch case and applies these functions:
    • movement functions:

    • IfStmt -> moveGotosOutOfIf: before: if body contains only gotos that refer to an outer label. after: if body contains no gotos

    • ForStmt -> moveGotosOutOfFor

    • RangeStmt -> moveGotosOutOfRange

    • SwitchStmt -> moveGotosOutOfSwitch

    • removal functions

      • BlockStmt -> elimSiblings: Eliminate conditional gotos from this block -> removeLabels: Remove the (now unused) labels
      • BranchStmt: Unconditional goto. Wrap in "if true { goto L }"

Here is the summary of the algorithm from the original paper:

References:

Related: #6

@vtjnash
Copy link
Member

vtjnash commented Apr 26, 2020

That's an awesome summary!

The history of this repo contains a past attempt at this, but I reset the state of the repository in f02d746 to exclude it since it was broken.

There is possibly also an implementation of this algorithm on LLVM code as "relooper", inside the WebAssembly target (https://reviews.llvm.org/D11691).

@aminya
Copy link
Author

aminya commented Apr 26, 2020

You're welcome!

Thanks for the links. I skimmed their code quickly. It seems that they are not directly working on gotos but certainly related.

Do you think we can implement this?

For the start, we can just implement elimination transformations for the cases where goto and Target are at the same level and probably have some simple optimizations for removal of obvious redundant gotos.

Later we can add movement transformation.

@jfm535
Copy link

jfm535 commented Jul 18, 2021

if you want to see another implementation of this it can be seen here https://github.com/lifting-bits/rellic

i think its implemented in this file https://github.com/lifting-bits/rellic/blob/master/rellic/AST/GenerateAST.cpp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants