This repository is the official implementation of NeuroMLR: Robust & Reliable Route Recommendation on Road Networks.
Predicting the most likely route from a source location to a destination is a core functionality in mapping services. Although the problem has been studied in the literature, two key limitations remain to be addressed. First, a significant portion of the routes recommended by existing methods fail to reach the destination. Second, existing techniques are transductive in nature; hence, they fail to recommend routes if unseen roads are encountered at inference time. We address these limitations through an inductive algorithm called NEUROMLR. NEUROMLR learns a generative model from historical trajectories by conditioning on three explanatory factors: the current location, the destination, and real-time traffic conditions. The conditional distributions are learned through a novel combination of Lipschitz embeddings with Graph Convolutional Networks (GCN) on historical trajectories.
The code has been tested with Python 3.8.10 and CUDA 10.2; we recommend using the same versions.
To create and activate a virtual environment using conda, run:
conda create -n ENV_NAME python=3.8.10
conda activate ENV_NAME
Install all dependencies by running the following commands:
pip install -r requirements.txt
pip install --no-index torch-scatter -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html
pip install --no-index torch-sparse -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html
pip install --no-index torch-cluster -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html
pip install --no-index torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html
pip install torch-geometric
Download the preprocessed data and unzip the downloaded .zip file. Set the PREFIX_PATH variable in my_constants.py to the path of the extracted folder.
For each city (Chengdu, Harbin, Porto, Beijing, CityIndia), there are two types of data:
Trajectories are stored as a Python pickled list of tuples, where each tuple has the form (trip_id, trip, time_info) and each trip is a list of edge identifiers.
The map folder contains the following files:
nodes.shp: contains OSM node information (global node id mapped to (latitude, longitude))
edges.shp: contains network connectivity information (global edge id mapped to corresponding node ids)
graph_with_haversine.pkl: pickled NetworkX graph corresponding to the OSM data
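As a quick sanity check of the trajectory format described above, the following is a minimal sketch that loads a pickled trip list and prints basic statistics. The helper names `load_trips` and `summarize_trips` are hypothetical (they are not part of this repository), and the file path in the usage comment is a placeholder; the actual file names and the structure of `time_info` should be checked against the extracted data folder.

```python
import pickle


def load_trips(path):
    """Load a pickled list of (trip_id, trip, time_info) tuples (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize_trips(trips):
    """Return basic statistics; each trip is assumed to be a list of edge identifiers."""
    lengths = [len(trip) for _, trip, _ in trips]
    distinct_edges = {edge for _, trip, _ in trips for edge in trip}
    return {
        "num_trips": len(trips),
        "mean_edges_per_trip": sum(lengths) / len(lengths) if lengths else 0.0,
        "num_distinct_edges": len(distinct_edges),
    }


# Example usage (placeholder path, replace with a real file under PREFIX_PATH):
# trips = load_trips("/path/to/extracted/folder/<city>/<trajectory_file>.pkl")
# print(summarize_trips(trips))
```

Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.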
After setting PREFIX_PATH in the my_constants.py file, the training script can be run directly as follows:
python train.py -dataset beijing -gnn GCN -lipschitz
Other functionality can be toggled by adding the corresponding arguments, for example:
python train.py -dataset DATASET -gpu_index GPU_ID -eval_frequency EVALUATION_PERIOD_IN_EPOCHS -epochs NUM_EPOCHS
python train.py -traffic
python train.py -check_script
python train.py -cpu
A brief description of the other arguments and their functionality:
Argument | Functionality |
---|---|
-check_script | to run on a fixed subset of train_data, as a sanity test |
-cpu | forces computation on a cpu instead of the available gpu |
-gnn | can choose between a GCN or a GAT |
-gnn_layers | number of layers for the graph neural network used |
-epochs | number of epochs to train for |
-percent_data | percentage of data used for training |
-fixed_embeddings | makes the embeddings static, so they are not learnt as parameters of the network |
-embedding_size | the dimension of embeddings used |
-hidden_size | hidden dimension for the MLP |
-traffic | to toggle the attention module |
For exact details about the expected format and possible inputs, please refer to the args.py and my_constants.py files.
The training code generates logs for evaluation. To evaluate any pretrained model, run
python eval.py -dataset DATASET -model_path MODEL_PATH
There should be two files under MODEL_PATH, namely model.pt and model_support.pkl (refer to the function save_model() defined in train.py to understand these files).
The pretrained models are in the same zip file as the preprocessed data. To evaluate them, set PREFIX_PATH in the my_constants.py file and run
python eval.py -dataset DATASET
We present the performance results of both versions of NeuroMLR across five datasets.
Dataset | Precision(%) | Recall(%) | Reachability(%) | Reachability distance (km) |
---|---|---|---|---|
Beijing | 75.6 | 74.5 | 99.1 | 0.01 |
Chengdu | 86.1 | 83.8 | 99.9 | 0.0002 |
CityIndia | 74.3 | 70.1 | 96.1 | 0.03 |
Harbin | 59.6 | 48.6 | 99.1 | 0.02 |
Porto | 77.3 | 70.7 | 99.6 | 0.001 |
Since NeuroMLR-Dijkstra guarantees reachability, the reachability metrics are not relevant here.
Dataset | Precision(%) | Recall(%) |
---|---|---|
Beijing | 77.9 | 76.5 |
Chengdu | 86.7 | 84.2 |
CityIndia | 77.9 | 73.1 |
Harbin | 66.1 | 49.6 |
Porto | 79.2 | 70.9 |
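
The tables above do not state how precision and recall are computed. The following is a minimal sketch under the assumption that both are measured on the sets of edges in the predicted and ground-truth routes; this is a common convention, not a confirmed description of eval.py, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted_route, true_route):
    """Edge-set precision and recall for one trip (assumed definition, see lead-in).

    Both routes are lists of edge identifiers.
    """
    predicted, actual = set(predicted_route), set(true_route)
    overlap = len(predicted & actual)
    # Guard against empty routes to avoid division by zero.
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(actual) if actual else 0.0
    return precision, recall


# Example: two of the four predicted edges appear in the three-edge true route.
p, r = precision_recall([1, 2, 3, 4], [2, 3, 5])
print(f"precision={p:.2f}, recall={r:.2f}")
```

Averaging these per-trip values over a test set and multiplying by 100 would give percentages comparable in form to the reported ones, though the exact aggregation used by the evaluation script may differ.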
If you'd like to contribute, open an issue on this GitHub repository. All contributions are welcome!