- Unzip the data from data.zip.
- Install the required packages listed in requirements.txt.
- Set PYTHONPATH to .:../:src (you can do this by running export PYTHONPATH=".:../:src" on the shell, for example).
- Please use Python 3.6 or above.
- The goal of this assignment is to implement a basic GCN model for node classification, link prediction, and graph classification. Most of the code related to training, data loading, and evaluation is provided. You need to implement the parts marked with a #TODO. Each TODO needs a few lines of code to complete (2-3 lines in most cases).
- The notebook notebooks/GraphExploration.ipynb contains the code for the first part of this assignment. This part aims to introduce you to graph.py, the class that represents a graph and some of its methods.
- This has two sub-parts:
- Implementing a GCN layer by completing this code.
- Using the GCN layer to complete the implementation of GCN here.
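As a rough illustration of what a GCN layer computes, the sketch below applies the commonly used propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} X W) to a tiny graph. It uses plain Python lists rather than tensors to stay dependency-free, and the function names (normalize_adjacency, matmul, gcn_layer) are hypothetical; they are not taken from the assignment code, which may use a different normalization or interface.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are at least 1 because of the self-loops.
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, value) for value in row] for row in hidden]


# Two connected nodes with one feature each and an identity weight.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weight = [[1.0]]
output = gcn_layer(adj, features, weight)
```

In this toy graph each node averages its own feature with its neighbor's, so both output rows end up close to 2.0. A full GCN model typically stacks two or more such layers.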
- Use the GCN model implemented in part 2 to classify nodes in the graph (by completing this).
- Use the GCN model implemented in part 2 to predict links in the graph (partial implementation). This notebook can be used to explore the training data.
- We will use the GCN model implemented in part 2 to classify graphs, i.e., predict a label for each graph (partial implementation here, notebook).
bash scripts/run_node_classification.sh [GRAPH_NAME]
Where:
- GRAPH_NAME is the name of the graph to use (e.g., cora, citeseer).
E.g., to run node classification on citeseer with topological features, run:
bash scripts/run_node_classification.sh citeseer_plus_topo
- The usage for link prediction is similar.
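For link prediction, one common way to turn GCN node embeddings into an edge probability is to pass the dot product of the two endpoint embeddings through a sigmoid. The sketch below shows that decoder on plain Python lists; edge_score is a hypothetical name, and the assignment's partial implementation may score edges differently.

```python
import math


def edge_score(emb_u, emb_v):
    """Probability-like score for an edge (u, v): sigmoid(emb_u . emb_v)."""
    dot = sum(a * b for a, b in zip(emb_u, emb_v))
    # Numerically stable sigmoid that avoids overflow for large |dot|.
    if dot >= 0:
        return 1.0 / (1.0 + math.exp(-dot))
    exp_dot = math.exp(dot)
    return exp_dot / (1.0 + exp_dot)


# Embeddings that point the same way score high; orthogonal ones score 0.5.
similar = edge_score([1.0, 2.0], [1.0, 2.0])
orthogonal = edge_score([1.0, 0.0], [0.0, 1.0])
```

Training then usually compares these scores against observed edges (label 1) and sampled non-edges (label 0) with a binary cross-entropy loss.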
bash scripts/run_graph_classification.sh [GRAPH_PATH] [NUM_CLASSES] [POOLING_OP (max|mean|last)] [NUM_EPOCHS]
Where:
- GRAPH_PATH: path to the graph file.
- NUM_CLASSES: number of graph classes (labels) to predict.
- POOLING_OP: pooling operation to use.
- NUM_EPOCHS: number of epochs to train the model.
More details can be found in scripts/run_graph_classification.sh.
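To make the POOLING_OP choices concrete, here is a minimal sketch of how per-node embeddings might be reduced to a single graph-level vector. The function name pool_nodes is hypothetical, and the reading of last as "use the embedding of the last node" is an assumption; check scripts/run_graph_classification.sh and the model code for the actual behavior.

```python
def pool_nodes(node_embeddings, op):
    """Reduce a list of node embedding vectors to one graph embedding."""
    if not node_embeddings:
        raise ValueError("cannot pool an empty graph")
    if op == "max":
        # Element-wise maximum over all nodes.
        return [max(column) for column in zip(*node_embeddings)]
    if op == "mean":
        # Element-wise average over all nodes.
        return [sum(column) / len(column) for column in zip(*node_embeddings)]
    if op == "last":
        # Assumption: take the embedding of the final node as-is.
        return list(node_embeddings[-1])
    raise ValueError(f"unknown pooling op: {op}")


# A graph with two nodes and two-dimensional embeddings.
embeddings = [[1.0, 4.0], [3.0, 2.0]]
pooled_max = pool_nodes(embeddings, "max")
pooled_mean = pool_nodes(embeddings, "mean")
pooled_last = pool_nodes(embeddings, "last")
```

The pooled vector is what a classifier head would map to NUM_CLASSES scores, so every pooling choice must return a vector of the same length regardless of how many nodes the graph has.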
E.g., to run graph classification on mutag (2 classes) with max-pooling and 200 epochs:
bash scripts/run_graph_classification.sh data/graph_classification/graph_mutag_with_node_num.pt 2 max 200
- You can create a private fork of this repository on GitHub and add the TAs as collaborators (usernames: madaan, yaushian). This lets you ask questions without copy-pasting your code on Piazza (you can just reference the code on GitHub or copy a permalink). You can use these instructions or just copy-paste the code into a new repository.
- As with all the other assignments, please do not share your solutions publicly.
- Parts of the data loader code are based on this repo.