TrillionG: A Trillion-scale Synthetic Graph Generator Using a Recursive Vector Model.
- Git
git clone https://github.com/chan150/TrillionG.git
- Compile
./compile.sh
or ./compile-local.sh
- Execute TrillionG on a single machine
./run-local.sh output_file
- Execute TrillionG on distributed machines
./run.sh output_file
- Generate a scale-20 graph with noise 0.1 in TSV output format, using 120 machines/threads
./run.sh output -n 20 -m 120 -format tsv -noise 0.1
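As a rough illustration of how the TSV output from such a run might be consumed, the sketch below counts out-degrees from an edge list. It assumes one edge per line, written as `source<TAB>destination` with integer vertex IDs; that layout is inferred from the format description and has not been checked against TrillionG's writer. The function name `out_degrees` and the sample data are hypothetical.

```python
import io
from collections import Counter

def out_degrees(lines):
    """Count out-degrees from an iterable of 'src<TAB>dst' lines.

    Assumption: one edge per line, two tab-separated integer vertex IDs.
    """
    degrees = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        src, dst = line.split("\t")[:2]
        degrees[int(src)] += 1
    return degrees

# Hypothetical sample standing in for a generated output file.
sample = io.StringIO("0\t1\n0\t2\n3\t0\n")
degrees = out_degrees(sample)
print(degrees[0], degrees[3])
```

For a real run, the same function could be fed an open file handle instead of the in-memory sample. On a cluster the output may be split across several part files, in which case each file would need to be read in turn.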
- -p: seed parameter (e.g. -p 0.57,0.19,0.19,0.05)
- -logn or -n: scale of the graph, i.e. the number of vertices in log scale (e.g. -n 20)
- -ratio or -r: ratio between the number of vertices and the number of edges (e.g. -r 16)
- -noise: noise added for a more realistic degree distribution (e.g. -noise 0.1)
- -machine or -m: number of machines/threads (e.g. -m 120)
- -format: output format (e.g. -format tsv)
- -compress: compression codec (e.g. -compress snappy; Snappy must be configured in Hadoop)
Output formats (values for -format)
- tsv: edge list format with tab-separated values
- adj: adjacency list format with 6-byte alignment; adj4, adj6, and adj8 are also available
- csr: compressed sparse row format with 6-byte alignment; csr4, csr6, and csr8 are also available
- or explicitly write the class name
Compression codecs (values for -compress)
- snappy: Snappy compression codec
- bzip or bzip2: Bzip2 compression codec
- or explicitly write the class name
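To give a feel for what the seed, scale, and ratio parameters control, here is a minimal sketch of the classic R-MAT edge-sampling procedure. This is not TrillionG's recursive vector model; it only illustrates how a four-value seed and a scale can define a skewed graph. It assumes that the scale is a base-2 logarithm (so scale 10 means 2^10 vertices) and that the edge count is the ratio times the vertex count; both are assumptions, and all function names are hypothetical.

```python
import random

def parse_seed(text):
    """Parse a '-p'-style string such as '0.57,0.19,0.19,0.05'."""
    probs = [float(x) for x in text.split(",")]
    if len(probs) != 4 or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("expected four probabilities summing to 1")
    return probs

def rmat_edge(scale, probs, rng):
    """Pick one edge by descending `scale` levels of a 2x2 seed matrix."""
    src = dst = 0
    for _ in range(scale):
        # Quadrants: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
        quadrant = rng.choices(range(4), weights=probs)[0]
        src = (src << 1) | (quadrant >> 1)  # append the row bit
        dst = (dst << 1) | (quadrant & 1)   # append the column bit
    return src, dst

probs = parse_seed("0.57,0.19,0.19,0.05")
scale, ratio = 10, 16                 # kept small so the sketch runs quickly
num_vertices = 1 << scale             # assumption: scale is log2 of the vertex count
num_edges = ratio * num_vertices      # assumption: ratio means edges per vertex

rng = random.Random(42)               # fixed seed for repeatability
edges = [rmat_edge(scale, probs, rng) for _ in range(num_edges)]
print(num_vertices, num_edges)
```

Under the same assumptions, `-n 20 -r 16` would correspond to 1,048,576 vertices and 16,777,216 edges.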
Licensed under the Apache License, Version 2.0.
For more details about TrillionG, please refer to our paper, which is available via open access on my github.io site.
If you use TrillionG, please cite:
@inproceedings{park2017trilliong,
title={TrillionG: A trillion-scale synthetic graph generator using a recursive vector model},
author={Park, Himchan and Kim, Min-Soo},
booktitle={Proceedings of the 2017 ACM International Conference on Management of Data},
pages={913--928},
year={2017},
organization={ACM}
}