You can generate your own data sets for LSQB. Their size may differ slightly between versions of the data generator, so for publications it is recommended to use the pre-generated data sets linked above.
- Run the LDBC Spark Datagen using CSV outputs and raw mode (see its README for instructions).
- In the Datagen output directory, record the path that the scripts in the converter repository will read:

      cd out/csv/raw/composite_merge_foreign/
      export DATAGEN_DATA_DIR=$(pwd)
- Go to the data converter repository:

      ./spark-concat.sh ${DATAGEN_DATA_DIR}
      ./load.sh ${DATAGEN_DATA_DIR} --no-header
      ./transform.sh
      cat export/snb-export-only-ids-projected-fk.sql | ./duckdb ldbc.duckdb
      cat export/snb-export-only-ids-merged-fk.sql | ./duckdb ldbc.duckdb
- Copy the generated files (set SF to the scale factor you generated; LSQB_REPOSITORY_DIRECTORY must point to your LSQB checkout):

      export SF=1
      cp -r data/csv-only-ids-projected-fk/ ${LSQB_REPOSITORY_DIRECTORY}/data/social-network-sf${SF}-projected-fk
      cp -r data/csv-only-ids-merged-fk/ ${LSQB_REPOSITORY_DIRECTORY}/data/social-network-sf${SF}-merged-fk
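
The copy step can silently produce a nested or incomplete directory if an export is missing or the target already exists. The following is a minimal sketch of a guarded copy, assuming it is run from the data converter repository after the export commands have finished. The function name `copy_lsqb_data` is hypothetical and not part of either repository. It takes the scale factor and the LSQB directory as arguments instead of reading `SF` and `LSQB_REPOSITORY_DIRECTORY`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with LSQB or the data converter):
# copies both exported CSV variants into an LSQB checkout, refusing to
# run if a source directory is missing or a target already exists.

copy_lsqb_data() {
  if [ "$#" -ne 2 ]; then
    echo "usage: copy_lsqb_data <scale-factor> <lsqb-repo-dir>" >&2
    return 2
  fi
  local sf="$1"
  local lsqb_dir="$2"
  local variant src dst

  if [ ! -d "${lsqb_dir}" ]; then
    echo "LSQB directory not found: ${lsqb_dir}" >&2
    return 1
  fi

  # First pass: validate everything before copying anything.
  for variant in projected-fk merged-fk; do
    src="data/csv-only-ids-${variant}"
    dst="${lsqb_dir}/data/social-network-sf${sf}-${variant}"
    if [ ! -d "${src}" ]; then
      echo "missing ${src}; run the export steps first" >&2
      return 1
    fi
    if [ -e "${dst}" ]; then
      echo "refusing to overwrite existing ${dst}" >&2
      return 1
    fi
  done

  # Second pass: perform the copies.
  mkdir -p "${lsqb_dir}/data"
  for variant in projected-fk merged-fk; do
    cp -r "data/csv-only-ids-${variant}" \
          "${lsqb_dir}/data/social-network-sf${sf}-${variant}" || return 1
  done
}

# Example (run from the data converter repository):
#   copy_lsqb_data 1 "${LSQB_REPOSITORY_DIRECTORY}"
```

Validating both variants before copying either one means a failed run leaves the LSQB `data` directory untouched, so the function can be re-run after the problem is fixed.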