SpectraGAN

This repo contains a dataset of synthetic mobile traffic for 5 cities in Germany generated using publicly available context data for those cities as input to the pre-trained SpectraGAN model. We also include the input context data as part of the dataset.

data/

  • context/:
    • Each city has a folder with data for every contextual attribute (condition) used (population, POI, land use and sea conditions).
  • synthetic-traffic/: synthetic traffic data for each city
    • The data is stored in the .npy format as 3D tensors with dimension [height, width, time].
      • It can be read via np.load(FILE_PATH) after import numpy as np in Python.
      • You can easily convert it to other formats; see the NumPy documentation.
    • There is also a video per city for visualisation of the traffic data as .mp4 files.
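
As a minimal sketch of the loading step described above, the snippet below reads a 3D tensor with np.load and slices it along the [height, width, time] axes. The file name demo_city.npy, the 4 x 5 spatial size and the random placeholder values are assumptions made so that the example runs on its own; substitute the path of a real file under data/synthetic-traffic/.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_traffic(path):
    """Load a traffic tensor and check that it is 3D: [height, width, time]."""
    traffic = np.load(path)
    if traffic.ndim != 3:
        raise ValueError(f"expected a 3D tensor, got shape {traffic.shape}")
    return traffic


# Placeholder data standing in for a real file from data/synthetic-traffic/.
# The file name and the 4 x 5 spatial size are made up for this demo.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_city.npy"
    np.save(demo_path, np.random.default_rng(0).random((4, 5, 504)))
    traffic = load_traffic(demo_path)

height, width, n_steps = traffic.shape

# Hourly time series of one pixel, and the spatial map at the first time step.
pixel_series = traffic[2, 3, :]
snapshot = traffic[:, :, 0]

# One possible conversion: a 2D table with one row per pixel and one column
# per time step, which np.savetxt could then write out as CSV.
flat = traffic.reshape(height * width, n_steps)
```

Row `i` of `flat` corresponds to the pixel at `(i // width, i % width)`, because NumPy reshapes in row-major order by default.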

When using this dataset, please cite our paper using the following BibTeX entry:

@inproceedings{10.1145/3485983.3494844,
author = {Xu, Kai and Singh, Rajkarn and Fiore, Marco and Marina, Mahesh K. and Bilen, Hakan and Usama, Muhammad and Benn, Howard and Ziemlicki, Cezary},
title = {SpectraGAN: Spectrum Based Generation of City Scale Spatiotemporal Mobile Network Traffic Data},
year = {2021},
isbn = {9781450390989},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3485983.3494844},
doi = {10.1145/3485983.3494844},
abstract = {City-scale spatiotemporal mobile network traffic data can support numerous applications in and beyond networking. However, operators are very reluctant to share their data, which is curbing innovation and research reproducibility. To remedy this status quo, we propose SpectraGAN, a novel deep generative model that, upon training with real-world network traffic measurements, can produce high-fidelity synthetic mobile traffic data for new, arbitrary sized geographical regions over long periods. To this end, the model only requires publicly available context information about the target region, such as population census data. SpectraGAN is an original conditional GAN design with the defining feature of generating spectra of mobile traffic at all locations of the target region based on their contextual features. Evaluations with mobile traffic measurement datasets collected by different operators in 13 cities across two European countries demonstrate that SpectraGAN can synthesize more dependable traffic than a range of representative baselines from the literature. We also show that synthetic data generated with SpectraGAN yield similar results to that with real data when used in applications like radio access network infrastructure power savings and resource allocation, or dynamic population mapping.},
booktitle = {Proceedings of the 17th International Conference on Emerging Networking EXperiments and Technologies},
pages = {243–258},
numpages = {16},
keywords = {deep generative modeling, mobile network traffic data, conditional GANs, synthetic data generation},
location = {Virtual Event, Germany},
series = {CoNEXT '21}
}

or other options from the ACM page.

Context data

We provide the context data for 5 cities in Germany: Aachen, Bonn, Dresden, Frankfurt and Munich. The context data is a set of layers of publicly available information about a city, stored as 2D images, one for each of the 27 contextual attributes. These comprise census data (i.e. population), 12 types of land use (e.g. whether or not a location is a green area) and 14 types of points of interest (e.g. whether or not a location is a cafe). See Section 3.1, Figure 5 and Table 1 in the paper for the full list and description of the context data, as well as how it was obtained.

Synthetic mobile traffic data

We provide synthetic spatiotemporal mobile traffic data generated by feeding the provided context data into the pre-trained SpectraGAN model. The spatial size of the traffic data varies by city, depending on the spatial size of that city's context data. Specifically, the traffic data is spatially smaller than the corresponding context data, so that each pixel in the traffic data has sufficient surrounding context data. In the time dimension, the traffic data spans 3 weeks at hourly granularity, i.e. 24x7x3 = 504 time steps.
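
The 24x7x3 = 504 arithmetic above suggests one way to work with the time axis: split it into weeks, days and hours. The sketch below assumes that the time steps are in chronological order and that the series starts at the beginning of a day; the README does not state this, nor which weekday the series starts on, so treat the day index as relative to the first day. The array is random placeholder data with a made-up 4 x 5 spatial size.

```python
import numpy as np

HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7
N_WEEKS = 3
N_STEPS = HOURS_PER_DAY * DAYS_PER_WEEK * N_WEEKS  # 504 hourly time steps


def split_time_axis(traffic):
    """Reshape [height, width, 504] into [height, width, week, day, hour]."""
    height, width, n_steps = traffic.shape
    if n_steps != N_STEPS:
        raise ValueError(f"expected {N_STEPS} time steps, got {n_steps}")
    return traffic.reshape(height, width, N_WEEKS, DAYS_PER_WEEK, HOURS_PER_DAY)


# Placeholder tensor standing in for a loaded .npy file (assumed 4 x 5 grid).
traffic = np.random.default_rng(0).random((4, 5, N_STEPS))
by_week = split_time_axis(traffic)

# Average week per pixel: [height, width, day, hour].
mean_week = by_week.mean(axis=2)

# City-wide average daily profile: one value per hour of the day.
daily_profile = by_week.mean(axis=(0, 1, 2, 3))
```

With this layout, time step `t` maps to week `t // 168`, day `(t % 168) // 24` and hour `t % 24`.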

The data generation process

The provided synthetic data is generated in two steps to comply with the NDAs that cover the operator-provided original mobile traffic datasets and the support for this work. First, we train the SpectraGAN model using one month of original mobile traffic data for 9 cities in Country 1 (City A to City I, as described in Section 3.1 of the paper) and their associated publicly available context data. Second, we use this trained SpectraGAN model to generate the mobile traffic data for the 5 cities in Germany listed above by giving their respective context data as input to the model.