Skip to content

Commit

Permalink
cEP-0025.md: Integrate pyflakes AST
Browse files Browse the repository at this point in the history
Closes #143
  • Loading branch information
ankitxjoshi committed May 19, 2018
1 parent 5cde28a commit 95bc075
Showing 1 changed file with 286 additions and 0 deletions.
286 changes: 286 additions & 0 deletions cEP-0025.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,286 @@
# Integration of pyflakes-enhanced-AST into coala

| Metadata | |
| -------- | ----------------------------------------------- |
| cEP | 22 |
| Version | 1.0 |
| Title | Integration of pyflakes-enhanced-AST into coala |
| Authors | Ankit Joshi <mailto:[email protected]> |
| Status | Proposed |
| Type | Feature |

## Abstract

This document describes how to create a metabear to integrate
**pyflakes-enhanced-AST** into coala and demonstrates its use by
creating a simple dependent bear.

## Introduction

**Flake8** is the most commonly used linter framework when it comes to static
code analysis for python. It wraps around the tools **pyflakes**,
**pycodestyle** and **mccabe** scripts. These tools are implemented on different
interfaces and flake8 combines their potential.

It even provides with a plugin mechanism. Additional linters are
supported using the flake8 plugin system, and may use **CPython** ast standard
library or may be built using pycodestyle's internal logical-line-based
checker API (checkout [Hacking](https://github.com/openstack-dev/hacking)).

There has not been an option to use the pyflakes-enhanced-AST based checker
API to create custom linters. Thus, the full potential of enhanced AST isn't
utilized , a whole lot of rework is required to do the basic traversing
and collection of important nodes. Pyflakes provides with a basic API
that does the traversing. So, if a developer uses enhanced AST he just
needs to work on the implementation of the new logic that his/her plugin
provides and not about the fidelity of the basic node handlers.

The reason for choosing pyflakes over its equivalents like
[Astroid](https://github.com/PyCQA/astroid) lies in its simplicity
and provision of lots of re-usable analysis. Astroid is more detailed and
powerful, thus making it complicated. Plus, there is a much larger ecosystem
of flake8 plugins.

## Proposed Change

Here is a brief overview of the architecture:

1. There will be a separate repository named as coala-pyflakes which will
be installable via `pip install pyflakes-bears`.
2. The repository will contain a new package `pyflakes-bears` which
would house all bears which use pyflakes-enhanced-ast.
3. The `pyflakes-bears` would contain a metabear `PyFlakesASTBear`
which would be used by all other plugin bears.
4. The repository would even contain another package
`pyflakes-generic-plugins` that would be independent of coala and could
be run by flake8 or coala or any other similar tool.

## Management of the new `coala-pyflakes` repository

The repository will use a similar mechanism of what is being
used in `coala-bears` to test all its bears.

## Implementation of PyFlakesASTBear

1. `PyFlakesASTBear` will be `LocalBear` returning result of the type
`HiddenResult` since these results are not meant for the users but
for the dependent bear. The result structure is described by
`PyFlakesResult` class.
2. `PyFlakesResult` consists of a snapshot of the module at class,
function, module, generator and doctest level. Thus, providing
the developer multiple views of a file allowing him to retrieve
results as per his demand.
3. Apart from scope information, the metabear also returns a list of
**messages generated by pyflakes** itself. These s can be used
by the developer to add an additional filter layer that is
required to be resolved by the user before executing the plugin.
This is primarily done because pyflakes emits additional syntax
errors which are not caught by python interpretor and may affect
the working of a plugin. This check is optional and may or may not
be incorparated by the developer.
4. Helper function `get_nodes` provides with all the nodes of a
particular type present in a scope.

Here is a prototype for the implementations of `PyFlakesASTBear`:

```python
# Imports not mentioned to maintain brevity

class PyFlakesResult(HiddenResult):

def __init__(self, origin, deadScopes, pyflakes_messages):

Result.__init__(self, origin, message='')

self.module_scope = self.get_scopes(ModuleScope, deadScopes)[0]
self.class_scopes = self.get_scopes(ClassScope, deadScopes)
self.function_scopes = self.get_scopes(FunctionScope, deadScopes)
self.generator_scopes = self.get_scopes(GeneratorScope, deadScopes)
self.doctest_scopes = self.get_scopes(DoctestScope, deadScopes)
self.pyflakes_messages = pyflakes_messages

def get_scopes(self, scope_type, scopes):
return list(filter(lambda scope: isinstance(scope, scope_type),
scopes))

def get_nodes(self, scope, node_type):
for _, node in scope.items():
if isinstance(node, node_type):
yield node


class PyFlakesASTBear(LocalBear):
AUTHORS = {'The coala developers'}
AUTHORS_EMAILS = {'[email protected]'}
LICENSE = 'AGPL-3.0'

def run(self, filename, file):
"""
Generates pyflakes-enhance-AST for the input file and returns
deadScopes as HiddenResult
:return:
One HiddenResult containing a dictionary with keys being
type of scope and values being a list of scopes generated
from the file.
"""
tree = ast.parse(''.join(file))
result = Checker(tree, filename=filename, withDoctest=True)

yield PyFlakesResult(self, result.deadScopes, result.messages)
```

## Demonstrating use of PyFlakesASTBear by creating a simple bear

A hassle free solution to the problem of finding all the future imports
present in the source code.

Here is a prototype for the `NoFutureImportBear` that uses `PyFlakesASTBear`:

```python
# Imports not mentioned to maintain brevity

class NoFutureImportBear(LocalBear):
"""
Uses pyflakes-enhance-AST to detect use of future imports
"""
BEAR_DEPS = {PyFlakesASTBear}

def run(self, filename, file,
dependency_results=dict()
):
for result in dependency_results.get(PyFlakesASTBear.name, []):
for node in result.get_nodes(result.module_scope,
FutureImportation):
corrected = self.remove_import(file, node.source.lineno, node.name)
yield Result.from_values(
origin=self,
message='Future import %s found' % node.name,
file=filename,
diffs={filename: corrected},
line=node.source.lineno)
```

## A generic implementation for detecting `__future__` imports

This implementation doesn't requires coala to be installed and is
designed so that it can be used in other tools like flake8. coala
will also be able to use this plugin, but it is also usable by flake8.
This implementation will be housed in `pyflakes-generic-plugins`.

Here is a prototype for the `NoFutureImport` using **pyflakes-enhanced-AST**:

```python
# Imports not mentioned to maintain brevity
__version__ = '0.1'

CODE = 'F482'


class NoFutureImport(object):
name = 'no_future'
version = __version__

def __init__(self, tree, filename):
self.tree = tree
self.filename = filename
self._checker = None
self._module_scope = None

@property
def checker(self):
if self._checker is None:
self._checker = Checker(self.tree, self.filename)
return self._checker

@checker.setter
def checker(self, checker):
self.checker = checker

@property
def module_scope(self):
if self._module_scope is None:
self._module_scope = list(filter(lambda scope: isinstance(scope,
ModuleScope),
self.checker.deadScopes))[0]
return self._module_scope

def run(self):
for _, node in self.module_scope.items():
if isinstance(node, FutureImportation):
message = '{code}: Future import {name} found'
.format(name=node.name,
code=CODE)
yield (node.source.lineno, node.source.col_offset,
message, NoFutureImport)
```

## Demonstrating use of PyFlakesASTBear by creating a fixes bear

A hassle free solution to the problem of removing all the future imports
present in the source code.

Here is a prototype for the `NoFutureImportBear` that uses `PyFlakesASTBear`:

```python
# Imports not mentioned to maintain brevity

class NoFutureImportBear(LocalBear):
"""
Uses pyflakes-enhance-AST to remove future imports
"""
BEAR_DEPS = {PyFlakesASTBear}

# TODO: Adjust logic to support removal when `\` is used
def remove_future_imports(self, file, lineno,
max_col_offset, affected_lines):
"""
Removes all __future__ imports from the input line
"""
diff = Diff(file)
line = file[lineno - 1]
semicolon_indices = [i for i, a in enumerate(line) if a == ';']
if len(semicolon_indices) == 0:
diff.delete_line(lineno)
else:
for indice in semicolon_indices:
if indice > max_col_offset:
replacement = line[indice + 1:].lstrip()
diff.modify_line(lineno, replacement)
break
return diff

def run(self, filename, file,
dependency_results=dict()
):
"""
Uses PyFlakesASTBear to get only FutureImportation
"""
# Preprocess all the lines containing __future__ imports
affected_lines = set()
# Get the maximum column offset of FutureImportation for a line
max_column_for_a_line = dict()
for result in dependency_results.get(PyFlakesASTBear.name, []):
for node in result.get_nodes(result.module_scope,
FutureImportation):
lineno = node.source.lineno
col_offset = node.source.col_offset
if lineno in max_column_for_a_line:
max_column_for_a_line[lineno] = max(
max_column_for_a_line[lineno],
col_offset)
else:
max_column_for_a_line[lineno] = col_offset
affected_lines.add(lineno)

for line in affected_lines:
corrected = self.remove_future_imports(
file, line, max_column_for_a_line[lineno],
affected_lines
)
yield Result.from_values(
origin=self,
message='Future import(s) found',
file=filename,
diffs={filename: corrected},
line=line)
```

0 comments on commit 95bc075

Please sign in to comment.