Skip to content

GSoC 2015 Application Himanshu Mishra : Add On System for Networkx

Himanshu Mishra edited this page Mar 22, 2015 · 38 revisions

Table of Contents


About Me

Basic Information

Name: Himanshu Mishra

University: Indian Institute of Technology, Kharagpur

Email:: [email protected]

IRC: orkohunter

Github: OrkoHunter

Blog: http://blog.himanshumishra.in

Personal Background

I am a first year student of IIT Kharagpur, India. I'm pursuing a degree in Mathematics and Computing. I work on Linux Mint 17 LTS. I love Sublime Text for its slick user interface, extraordinary features and amazing performance. I'm proficient in C and Python. I like Python because it easily lets me convert my ideas into code.

#The Project
##The Problem and Motivation

NetworkX being a pure python library provides a swift way for people to use it because python doesn't need a compiler, so everything goes smoothly on the python interpreter. In spite of that, it is realized that there also exists other graph packages written in other programming languages. While they being written in compiled languages, they offer rich functionalities whose reimplementation in Python is an important task. Some of the common fuctionalities between networkx and other packages have greater performance in them due to their use of compiled languages. This motivates us strongly to integrate them with NetworkX and make it more awesome.

Related to the add-ons, in #1325 it was suggested to remove the drawing package out of the NetworkX core package. The thought is that graph drawing would be more interesting in developing purposes for those developers who are interested in visual elements, external graphics packages rather than algorithms and data structures. Keeping it under NetworkX umbrella as NetworkX-Matplotlib would make it more easier to develop and maintain.

##The Plan

Integration doesn't seem to have a single and best way. There are mixed ideas discussed in #1167 to choose the way of implementation. As I sum up, there are mainly two ways to do this.

  1. Handling every add-on package in networkx.external. All of their necessary modules would reside in the respective sub-directory of the package in networkx.external. Then writing code to NetworkX that interfaces between NetworkX objects and bare lists/arrays expected by the add-on package. Decorators and namespaces can be handled in networkx.utils. And lastly, changing setup.py to handle installation related issues, the way Cython works. When Cython is not available, it would skip the installation part of the add-on and everything would work as if there is no add-on system for NetworkX. But this will lead to complication as all those code in NetworkX for the add-on packages and their modules itself would still remain there, unused and counterproductive. That's why we should go on for the other way.

  2. Having add-ons as "optional components" under NetworkX umbrella. First of all NetworkX proper should not take those add-ons for granted that they would reliably exist. If so, then they should become an integral part of NetworkX alike NumPy and SciPy. Further, adding more add-ons should be as smooth as appending a list of add-ons. So, the idea is that setup.py should inspect and load configurations from networkx.addons and let the add-ons decide whether they can be installed or not.

My work can be divided into three phases.

####Phase 1: Designing the approach for add-on implementation

A separate repository under NetworkX umbrella on Github will be hosted as NetworkX-<addon> containing the add-on source files, tests and code files written for Cython. The python wrapper, however will be present in the NetworkX proper package. For example, a module named partition will be created in networkx.addons.<add-on> to provide the partitioning functionality for NetworkX.

####Phase 2: Working with two graph libraries and hosting the add-ons on Github

I will have to show that the implemented add-on system works. For this I'll have to implement two officially sanctioned add-ons and host them on github. For this I choose the following two libraries-

  1. METIS is a set of serial programs written in C, used for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices. It is a truly amazing software. Some of its key features are High Quality Partitions, Extraordinary speed and production of efficient low fill orderings.

  2. LEMON is graph library written in the C++ language provides implementations of common data structures and algorithms with focus on combinatorial optimization tasks connected mainly with graphs and networks. LEMON is an abbreviation of Library for Efficient Modeling and Optimization in Networks. LEMON employs generecity in C++ by using templates.

##Execution (Prototype) --To Do--

##Timeline(Tentative)

Community Bonding Period (27 April - 25 May)

Goal: Community Bonding

  • The principle focus in this period would be studying in detail, the functionalities of NetworkX. making notes, so as to compare them with that of the add-ons.

  • I'll ask guidance from my mentor upon the functioning of the add-on because I feel that would be the most challenging part of my project.

  • If possible, I would also start coding in this phase only, so that I get a head-start

Week 1 - Week 2 (25 May - 9 June)

Goal: Modifying METIS for use

  • In this period I'll be working on the original package of METIS under the hosted repository NetworkX-Metis, to modify it to work with Cython.

  • As soon as I get familiar with the METIS libraries, after modifying it, I'll head to write code for NetworkX

Week 3 - Week 4 (10 June - 26 June)

Goal: Writing Code into NetworkX, Writing tests

  • In this period, I'll be writing Python wrappers of METIS graph partitioning functions.

  • I'll also be writing tests along with it

Mid term Evaluation

  • Having a fully functioning METIS add-on

  • Fix bugs if any

Week 5 - Week 7 (27 June - 17 July)

Goal: Modifying Lemon Graph Library for use

  • Lemon Graph Library has more varied functionalities. This period will be dedicated to studying and modifying the functions to be used.

  • As soon as I'll finish modifying the Lemon graph packages, I'll head forward to write code to NetworkX.

Week 8 - Week 10 (18 July - 7 August)

Goal: Writing code to NetworkX, Writing tests

  • In this period I'll be writing Python wrappers of Lemon Graph library functions

  • Finishing the add-on after writing tests

Week 11 - Week 12 (8 August - 21 August)

Goal: Working with NetworkX-Matplotlib

  • Remove NetworkX drawing package out of the proper package and implement it as a add-on for NetworkX while it being hosted on github under the umbrella of NetworkX.

  • Finishing/fixing bugs for existing add-ons so far.

Future Work - Continue working over the add-on system forever. I'll love to see future implementations on it. :)

I would be able to devote 40 - 50 hours a week during the project, since I have no other big project devoted for the summer. My summer vacations start by 29 April and I'll not be engaged in vacations. My academic year would begin by July 17, but for the first month I would still be able to devote around 40 hours, since there would be no tests/exams.


Notes

  • I have no commitment during summer which means I'm free to work completely on my project.

  • I am very enthusiastic about my work being beneficial to other people in the open source community. I'll keep on seeing the work done by me and fixing issues if they emerge in future.


References