Michael E. Rose edited this page Apr 16, 2016 · 24 revisions

Welcome to the NetworkX wiki!

Design Specification - API Sandbox

Developer Notes


Top of the Module

  • Formatting copyright, list of authors, "the top of the module"

    If you use UTF-8 Unicode characters in your module, the first line must be

      # -*- coding: utf-8 -*-
    

    This is followed by comment lines giving the copyright, a blank comment line and then the authors. The authors line can be formatted as

      # Authors: name (email) and other_name (email)
    

    or

      # Authors: name (email)
      #          other_name (email)
    

    After the authors, the docstring for the module should appear. Start the docstring with r""" if it contains LaTeX math codes, so that backslashes are kept literally; otherwise use """. The first line should be a one-line description of the module, followed by a blank line and then a fuller description of what the module provides.

    After the module docstring, import Python libraries, followed by imports of any NetworkX modules. Typically this is followed by a definition of __all__ listing the module's public functions. Then add two blank lines and you can start your code.

    See richclub.py for an example. It is also a good example of using special features like displayed math and references.


Docstring formatting

  • Formatting docstrings to look nice in code and in Sphinx. First, we generally follow numpydoc recommendations. For example, we use back-ticks whenever referring to parameters like G.

    Common Gotchas:

    • Sections are separated by underlined headings. The underline must be the same length as the heading.
    • Extra blank lines are often flagged as warnings by Sphinx. Check for warnings when you build the docs locally.
    • The preferred order of sections is as follows: Parameters, Returns, Raises, Examples, See also, Notes, References.

    Additional comments about syntax where it is not so clear from the link above:

    • Our primary goal is to make the docs pretty both in their raw text form and in the HTML and PDF generated at ReadTheDocs. The guidelines can help, but sometimes you have to make a judgement call.
    • Python objects True, False and None do not require any formatting.
    • When appearing in a sentence/paragraph, object types like int and float refer to their class, as in: :class:`int`
    • Literal strings appear as regular text with single quotes, as in: 'full', or 'in+out'
    • Double back-ticks are for longer pieces of Python code. (What is long? Judgement call.)
    • Exceptions are formatted using: :exc:`NetworkXError`
    • Math expressions can often be written in ASCII text. For LaTeX formatting use: :math:`\sum_{i=0}^{\infty} x^i`

    If you are working on docstring issues, it is very helpful to install Sphinx locally so you can build the HTML pages and tweak them without having to run everything through ReadTheDocs first.
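
    To make the conventions above concrete, here is a sketch of a docstring with underlined section headings in the preferred order. The function degree_sum is a hypothetical example written for this page, not part of NetworkX; sections that do not apply (See also, References) are simply left out.

```python
import networkx as nx


def degree_sum(G, nodes=None):
    r"""Return the sum of the degrees of `nodes` in `G`.

    Parameters
    ----------
    G : NetworkX graph

    nodes : iterable of nodes, optional (default=None)
        The nodes to include. If None, all nodes of `G` are used.

    Returns
    -------
    total : int
        The sum :math:`\sum_{v} \deg(v)` over the chosen nodes.

    Raises
    ------
    NetworkXError
        If a node in `nodes` is not in `G`.

    Examples
    --------
    >>> G = nx.path_graph(4)
    >>> degree_sum(G)
    6
    >>> degree_sum(G, nodes=[0, 3])
    2

    Notes
    -----
    For an undirected graph, the sum over all nodes is twice the
    number of edges.
    """
    if nodes is None:
        nodes = G  # iterating over a graph yields its nodes
    total = 0
    for v in nodes:
        if v not in G:
            raise nx.NetworkXError("Node %r is not in the graph." % (v,))
        total += G.degree(v)
    return total
```

    Note that every underline has the same length as its heading and that the parameter names appear in single back-ticks in the running text.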


Handling None

  • How we tend to handle None

    When checking a value against None, use the if var is None: construction. We also use None (as does Python itself) as a null argument to functions where the default argument must be newly computed at runtime. For example, if you use def f(nlist=[]): the list is created once, when the function is defined, and that same list is shared by every call that omits nlist. If you want a new list for each function call, use

     def f(nlist=None):
         if nlist is None:
             nlist = []
    

    In some functions we allow nodes to be specified, with a default of processing all nodes. A default of None indicates that the argument was not provided. Thus, None is special in that it cannot be a node in our graphs. We state that a node can be any hashable other than None.
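
    The difference between the two default styles can be seen in a short sketch. The function names append_shared and append_fresh are made up for this illustration.

```python
def append_shared(node, nlist=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits nlist appends to the same list.
    nlist.append(node)
    return nlist


def append_fresh(node, nlist=None):
    # A new list is created on each call that omits nlist.
    if nlist is None:
        nlist = []
    nlist.append(node)
    return nlist


first = append_shared('a')
second = append_shared('b')
print(second)             # ['a', 'b']
print(first is second)    # True

print(append_fresh('a'))  # ['a']
print(append_fresh('b'))  # ['b']
```

    The same is None test is what lets a function tell "no nodes were given, so process all of them" apart from an explicitly passed empty list.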


Decorators for Not Implemented

  • Explicitly not supporting directed or multigraph in a function:

    Use the decorator not_implemented_for in networkx/utils/decorators.py to designate that a function doesn't accept 'directed', 'undirected', 'multigraph' or 'graph'. The function should have its first argument be the graph object to be checked.

     from networkx.utils import not_implemented_for

     @not_implemented_for('directed', 'multigraph')
     def function_not_for_MultiDiGraph(G, others):
         # function not for graphs that are directed *and* multigraph
         pass
    
     @not_implemented_for('directed')
     @not_implemented_for('multigraph')
     def function_only_for_Graph(G, others):
         # function not for directed graphs *or* for multigraphs
         pass
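
    From the caller's side, the decorator's effect can be sketched as follows. The function count_edges is a hypothetical example; the sketch assumes that the decorator is importable from networkx.utils and that it raises NetworkXNotImplemented for a rejected graph type.

```python
import networkx as nx
from networkx.utils import not_implemented_for


@not_implemented_for('directed')
def count_edges(G):
    """Return the number of edges in the undirected graph `G`."""
    return G.number_of_edges()


# An undirected graph is accepted.
print(count_edges(nx.path_graph(3)))  # 2

# A directed graph is rejected before the function body runs.
try:
    count_edges(nx.DiGraph([(0, 1)]))
except nx.NetworkXNotImplemented as err:
    print('not implemented:', err)
```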
    

Squashing Commits

  • How much and when should we squash commits?

    Often as a PR evolves, extra commits appear to correct typos, avoid conflicts with upstream/master, etc. Some projects like every PR to be one commit. Others prefer to keep all the history, and most are somewhere in between. We are in the "between" group. It is nice to have content-oriented commits separated from, e.g., PEP8 changes so that people can easily find the important change among the whitespace changes. Many times there is a logical progression of commits as the ideas or focus of the PR changed. Those are good reasons to have more than one commit in your PR. But when your PR looks about ready, think about whether "squashing" the commits that gain nothing from being separate would help clean up the commit history. Squashing essentially rewrites the history of your branch. If other people have based work on that history, don't squash.

    How do you squash commits? Basically you use git rebase -i (you can also specify which branch to rebase relative to). An editor opens with a list of all your commits. You choose which to squash by switching the first word of a line from "pick" to "squash"; each squashed commit is folded into the commit listed above it. When you save and close the list, the editor reopens so you can combine/edit the commit messages of the commits that were changed. When you're done with the messages, your history has been rewritten! The old commits still exist, but are no longer associated with the current branch. New commits have taken their place. Since this rewrites the history of your branch, you have to use git push --force to update the branch on GitHub that your PR is from.

    Here's a post describing the squash process. As always, it's possible to create a small test repo to play with before actually doing it, but it's also OK to play with it in your local repo until you understand it. So long as you don't push, you can mess up your local repo all you want (and then re-clone to start fresh).