GATTO (Graph ATtention network with TOpological information) is a framework that enhances Graph Attention Networks (GAT) by incorporating topological features for node classification tasks. Developed at the University of Padova, this research evaluates the impact of structural information on classification accuracy in citation networks.
- Degree Centrality
- Betweenness Centrality
- Closeness Centrality
- Suggested Label (via embedding clustering)
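The three centrality measures listed above can be computed directly from an adjacency list. The following is a minimal standard-library sketch, not the project's actual precomputation code: the function names and the toy path graph are assumptions made for illustration, and the Suggested Label feature is not covered. In practice a graph library such as NetworkX provides equivalent routines.

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {u: len(adj[u]) / (n - 1) for u in adj}

def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of hop distances, computed with BFS."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph (normalized)."""
    bc = {u: 0.0 for u in adj}
    for s in adj:
        stack = []
        preds = {u: [] for u in adj}
        sigma = {u: 0 for u in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}  # dependency of s on each node
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Every unordered pair is visited from both endpoints, so divide by
    # (n - 1)(n - 2) rather than (n - 1)(n - 2) / 2.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {u: bc[u] * scale for u in adj}

# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
degree = degree_centrality(toy_graph)
closeness = closeness_centrality(toy_graph)
betweenness = betweenness_centrality(toy_graph)
```

On the toy path, the two inner nodes score higher than the endpoints on all three measures, which is the kind of structural signal these features are meant to carry.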
Network | Nodes | Edges | Labels | Features |
---|---|---|---|---|
Cora | 2708 | 5429 | 7 | 1433 |
Citeseer | 3327 | 4732 | 6 | 3703 |
- Python 3.8.10
- Singularity container system
# Build the Singularity container
singularity build python3.8.10 Singularity.def
The framework consists of two main components:
- Precomputation Module
  - Computes topological features from the graph
  - Generates node embeddings
  - Performs clustering analysis
- GAT Module
  - Two-layer GAT model
  - First layer: 8 attention heads (8 features each) with ELU activation
  - Second layer: Single attention head for classification with softmax activation
  - Dropout rate: 0.5
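To make the attention heads in the GAT Module concrete, here is a minimal standard-library sketch of what a single head computes for one node. It is an illustration under assumptions, not the project's model code: the features are taken as already projected, the LeakyReLU slope of 0.2 and the self-loop in the neighborhood follow common GAT practice, and all names and numbers are hypothetical. Dropout and training are omitted.

```python
import math

def leaky_relu(x, slope=0.2):
    # Slope 0.2 is an assumed default, not a value stated in this README.
    return x if x > 0 else slope * x

def elu(x):
    return x if x > 0 else math.exp(x) - 1.0

def attention_coefficients(z, i, neighbors, a):
    """Softmax-normalized attention of node i over its neighbors.

    z maps each node to its projected feature vector; a is the attention
    vector applied to the concatenation [z_i || z_j].
    """
    scores = {}
    for j in neighbors:
        concat = z[i] + z[j]  # list concatenation
        scores[j] = leaky_relu(sum(w * x for w, x in zip(a, concat)))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}

def head_output(z, i, neighbors, a):
    """Attention-weighted sum of neighbor features followed by ELU."""
    alpha = attention_coefficients(z, i, neighbors, a)
    dim = len(z[i])
    aggregated = [sum(alpha[j] * z[j][k] for j in neighbors) for k in range(dim)]
    return [elu(v) for v in aggregated]

# Hypothetical projected features for three nodes and an attention vector.
z = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
a = [0.5, -0.5, 1.0, 0.25]
neighbors_of_0 = [0, 1, 2]  # includes node 0 itself (assumed self-loop)

alpha = attention_coefficients(z, 0, neighbors_of_0, a)
h0 = head_output(z, 0, neighbors_of_0, a)
```

With the configuration listed above, the first layer would concatenate eight such head outputs of 8 features each, giving 64 values per node, and the second layer would use one head whose outputs pass through softmax over the classes.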
Performance comparison on the Cora dataset:
Model | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
GAT | 0.888 | 0.891 | 0.888 | 0.888 |
GATTO | 0.890 | 0.893 | 0.890 | 0.890 |
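The averaging scheme behind these multi-class scores is not stated here. The identical Accuracy and Recall columns are consistent with support-weighted averaging, because weighted recall always equals accuracy. The sketch below illustrates that property on made-up labels; the function name and the data are hypothetical and do not reproduce the reported numbers.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / count
        f1_c = 2 * prec_c * rec_c / (prec_c + rec_c) if prec_c + rec_c else 0.0
        weight = count / n  # share of samples whose true class is `label`
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels for six nodes across three classes.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
metrics = weighted_metrics(y_true, y_pred)
```

Here `metrics["recall"]` and `metrics["accuracy"]` agree, mirroring the pattern in the table.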
Our rigorous statistical evaluation includes:
- Normality Testing
  - Shapiro-Wilk Test for score distributions
- Performance Comparison Tests
  - Two-Sample t-Test (equal variances)
  - Two-Sample t-Test (unequal variances)
  - Wilcoxon Signed-Rank Test
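The difference between the two t-test variants can be sketched with the standard library alone. The example below computes the test statistic and degrees of freedom for the pooled (equal variances) and Welch (unequal variances) forms. The score lists are made up for illustration and are not the project's measurements, and the function names are hypothetical. Turning the statistic into a p-value needs the t-distribution CDF, which SciPy exposes through `scipy.stats.ttest_ind` (with its `equal_var` argument); `scipy.stats.shapiro` and `scipy.stats.wilcoxon` cover the other listed tests.

```python
import math
from statistics import mean, variance

def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df

def welch_t(x, y):
    """Two-sample t statistic without the equal-variance assumption."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Made-up accuracy scores from five hypothetical runs of each model.
gat_scores = [0.886, 0.889, 0.887, 0.890, 0.888]
gatto_scores = [0.889, 0.891, 0.888, 0.892, 0.890]

t_pooled, df_pooled = pooled_t(gat_scores, gatto_scores)
t_welch, df_welch = welch_t(gat_scores, gatto_scores)
```

When both groups have the same number of runs, the two statistics coincide and only the degrees of freedom differ, so the two variants tend to give very similar p-values.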
Results on Cora Dataset:
Statistical Test Results (α ≤ 0.05):
Test Type | Accuracy | Precision | Recall | F1 Score |
---|---|---|---|---|
2S T-Test (=v) | 0.546 | 0.529 | 0.546 | 0.524 |
2S T-Test (≠v) | 0.546 | 0.530 | 0.546 | 0.525 |
Wilcoxon Test | 0.670 | 0.570 | 0.670 | 0.677 |
Key Finding: GATTO shows slight improvements over GAT (0.14%-0.54%), but every reported test value is well above the 0.05 threshold, so the statistical analysis indicates no significant performance difference from the standard GAT implementation on small datasets.
- Alternative embedding techniques
- Advanced clustering approaches
- Large-scale dataset evaluation
- Hyperparameter optimization
- Extension to feature-less graphs
- Francesco Biscaccia Carrara
- Riccardo Modolo
- Alessandro Viespoli
University of Padova - Computer Engineering