Merge pull request #147 from luigi-asprino/master
Upload FBDA query generator and executor
Showing 42 changed files with 1,631 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# Façade-based Data Access Benchmark | ||
|
||
This folder provides a benchmark derived from GTFS-Madrid-Bench for evaluating Façade-based Data Access (FBDA) engines, such as [SPARQL Anything](https://github.com/SPARQL-Anything/sparql.anything). | ||
|
||
The extension consists of: | ||
- a *set of query templates* that translate the GTFS-Madrid-Bench queries and RML mappings into FBDA queries; | ||
- a *query executor* which fires the queries and measures the performance of the FBDA engines under four experimental regimes: | ||
- In-memory execution over a complete materialised view (in-memory+complete); | ||
- In-memory execution optimised by a triple-filtering approach (in-memory+triple-filtering); | ||
- In-memory execution over a sliced materialised view and optimised by triple-filtering (sliced+triple-filtering); | ||
- On-disk execution optimised by triple-filtering (on-disk+triple-filtering). | ||
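
To relate the four regimes to the arguments that `execute_queries.sh` passes to `monitor-query`, a small lookup can help when reading logs. This is only a sketch: `regime_label` is a hypothetical helper that is not part of the benchmark, and the mapping of `strategy0` to the complete materialised view and `strategy1` to triple-filtering is an assumption inferred from the calls in the script.

```shell
# Hypothetical helper, not part of the benchmark scripts.
# Assumption: "strategy0" = complete materialised view,
#             "strategy1" = triple-filtering,
#             a non-empty third argument (temporary folder) = on-disk execution.
regime_label() {
  local strategy="$1" slice="$2" tmp_folder="$3"
  case "$strategy:$slice:${tmp_folder:+ondisk}" in
    strategy0:no_slice:)       echo "in-memory+complete" ;;
    strategy1:no_slice:)       echo "in-memory+triple-filtering" ;;
    strategy1:slice:)          echo "sliced+triple-filtering" ;;
    strategy1:no_slice:ondisk) echo "on-disk+triple-filtering" ;;
    *)                         echo "unknown"; return 1 ;;
  esac
}
```

For instance, `regime_label strategy1 no_slice "$TMP_FOLDER"` would print the on-disk label whenever `TMP_FOLDER` is non-empty.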
|
||
More details can be found in this [article](https://www.semantic-web-journal.net/content/materialisation-approaches-fa%C3%A7ade-based-data-access-sparql). | ||
|
||
|
||
## Requirements | ||
|
||
Java 11 (or a later version) must be installed locally. | ||
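
The Java requirement can be checked before launching a long experiment. The functions below are a minimal sketch and not part of the benchmark: `java_major_version` and `java_is_supported` are hypothetical names, and the parser assumes version strings of the form `11.0.2`, `17` or the legacy `1.8.0_281`.

```shell
# Hypothetical helpers, not part of the benchmark scripts.
# Extract the major version from a Java version string such as
# "11.0.2", "17" or the legacy "1.8.0_281" form.
java_major_version() {
  local version="$1"
  local major="${version%%[._-]*}"   # text before the first '.', '_' or '-'
  if [ "$major" = "1" ]; then
    # Legacy scheme: "1.8.0_281" denotes Java 8.
    local rest="${version#1.}"
    major="${rest%%[._-]*}"
  fi
  printf '%s\n' "$major"
}

# Succeed only when the version string denotes Java 11 or later.
java_is_supported() {
  local major
  major=$(java_major_version "$1")
  [ "$major" -ge 11 ] 2>/dev/null
}
```

Since `java -version` writes to standard error, the version string can be obtained with something like `java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'` and then passed to `java_is_supported`.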
|
||
## Using FBDA Benchmark | ||
|
||
1. Generate data using GTFS-Madrid-Bench and move the result folder it generates into the `experiments` folder. At the moment, only the CSV, JSON and XML formats are supported. | ||
|
||
2. Generate FBDA queries for the scales passed to GTFS-Madrid-Bench (e.g. 1, 10, 100) | ||
|
||
``` | ||
./generate_queries.sh "1 10 100" "TMP_FOLDER" "xml csv json" | ||
``` | ||
|
||
where: | ||
- `TMP_FOLDER` is the path to a temporary folder that will be used during the experiments | ||
- "xml csv json" are the formats passed to GTFS-Madrid-Bench | ||
|
||
3. Download the executable jar file of the FBDA engine to evaluate (e.g. [SPARQL Anything v0.9.0](https://github.com/SPARQL-Anything/sparql.anything/releases/download/0.9.0/sparql-anything-0.9.0.jar)) | ||
|
||
4. Run the queries | ||
|
||
``` | ||
./execute_queries.sh /path/to/fbda_engine.jar "1 10 100" "xml csv json" "/path/to/results" "TMP_FOLDER" | ||
``` | ||
|
||
where: | ||
- "1 10 100" are the scales passed to GTFS-Madrid-Bench | ||
- "xml csv json" are the formats passed to GTFS-Madrid-Bench | ||
- "/path/to/results" is the path to a folder where the results of the execution of the queries (i.e. measures) will be stored | ||
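- `TMP_FOLDER` is the path to a temporary folder that will be used during the experiments (if the folder already exists, its contents are deleted) | ||
- an optional sixth argument lists the ids of the queries to run (e.g. "1 5 12"); by default, queries 1 to 18 are executed | ||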
|
||
|
||
## Analysing the results | ||
|
||
The execution of the queries generates two TSV files for each query executed on a given format, namely `time_q<query_id>_<format>.tsv` and `mem_q<query_id>_<format>.tsv`. | ||
These files trace the execution of the queries in terms of computational resources used by the engine (i.e. memory footprint, CPU and time). | ||
|
||
The files are stored in the directory `/path/to/results` passed as an argument to `execute_queries.sh`, which resolves it relative to the current working directory. | ||
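
Given the naming scheme above, a run can be checked for completeness by listing the files that are expected but absent. This is a sketch under assumptions: `missing_result_files` is a hypothetical helper, and it assumes that `<query_id>` is the bare query number and `<format>` the lowercase format name passed to the script.

```shell
# Hypothetical helper, not part of the benchmark scripts.
# Print the expected result files that are missing from the results directory.
# Arguments: results directory, space-separated formats, space-separated query ids.
missing_result_files() {
  local results_dir="$1" formats="$2" queries="$3"
  local format query kind file
  for format in $formats; do
    for query in $queries; do
      for kind in time mem; do
        file="$results_dir/${kind}_q${query}_${format}.tsv"
        [ -f "$file" ] || printf '%s\n' "$file"
      done
    done
  done
}
```

An empty output from, say, `missing_result_files results "xml csv json" "1 2 3"` would indicate that both TSV files exist for every requested query and format.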
|
||
The `time_q<query_id>_<format>.tsv` file records the execution time of the query for the given format. The table has the following structure: | ||
|
||
| Query | InputSize | Strategy | Slice | Ondisk | MemoryLimit | Run | Time | Unit | Status | STDErr | | ||
|-------|-----------|----------|-------|--------|-------------|-----|------|------|--------|--------| | ||
| | | | | | | | | | | | | ||
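
As an illustration of how the time table could be summarised, the sketch below averages the `Time` column over the runs of each `Strategy`/`Slice`/`Ondisk` combination. `mean_time_by_regime` is a hypothetical helper; it assumes that the file starts with a header row, that columns appear in the order shown above, and that all rows of a file share the same `Unit`.

```shell
# Hypothetical helper, not part of the benchmark scripts.
# Print: Strategy, Slice, Ondisk, mean Time, number of runs (tab-separated).
mean_time_by_regime() {
  awk -F'\t' '
    NR == 1 { next }                      # skip the assumed header row
    $8 ~ /^[0-9]+(\.[0-9]+)?$/ {          # keep rows with a numeric Time
      key = $3 "\t" $4 "\t" $5            # Strategy, Slice, Ondisk
      sum[key] += $8
      runs[key]++
    }
    END {
      for (key in sum)
        printf "%s\t%.3f\t%d\n", key, sum[key] / runs[key], runs[key]
    }
  ' "$1"
}
```

The order in which awk emits the groups is unspecified, so piping the output through `sort` gives a stable listing. Rows whose `Time` is not numeric are skipped rather than counted as zero, so failed runs would not distort the mean.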
|
||
The `mem_q<query_id>_<format>.tsv` file records the engine's CPU and memory usage during the evaluation of the queries. The table has the following structure: | ||
|
||
| Query | InputSize | Strategy | Slice | Ondisk | MemoryLimit | Run | PID | %cpu | %mem | vsz | rss | | ||
|-------|-----------|----------|-------|--------|-------------|-----|-----|------|------|-----|-----| | ||
| | | | | | | | | | | | | | ||
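
Since the memory table holds several samples per run, a common summary is the peak resident set size of each run. The sketch below is one way to compute it; `peak_rss_per_run` is a hypothetical helper that assumes a header row and the column order shown above, and it reports `rss` in whatever unit the file uses.

```shell
# Hypothetical helper, not part of the benchmark scripts.
# Print: Strategy, Slice, Ondisk, Run, peak rss (tab-separated).
peak_rss_per_run() {
  awk -F'\t' '
    NR == 1 { next }                       # skip the assumed header row
    {
      key = $3 "\t" $4 "\t" $5 "\t" $7     # Strategy, Slice, Ondisk, Run
      if (!(key in peak) || $12 + 0 > peak[key])
        peak[key] = $12 + 0                # keep the largest rss sample seen
    }
    END {
      for (key in peak)
        printf "%s\t%d\n", key, peak[key]
    }
  ' "$1"
}
```

As with any awk associative array, the output order is unspecified; `peak_rss_per_run mem_q1_csv.tsv | sort` gives a stable listing.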
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
#!/bin/bash | ||
# | ||
# Copyright (c) 2024 SPARQL Anything Contributors @ http://github.com/sparql-anything | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
SPARQL_ANYTHING_JAR=$1 | ||
RESULTS_DIR=$(pwd)/$4 | ||
TMP_FOLDER=$5 | ||
|
||
if [ ! -d "$RESULTS_DIR" ]; then | ||
mkdir -p "$RESULTS_DIR" | ||
else | ||
echo "$RESULTS_DIR already exists!" | ||
fi | ||
|
||
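if [ ! -d "$TMP_FOLDER" ]; then | ||
mkdir -p "$TMP_FOLDER" | ||
else | ||
echo "$TMP_FOLDER already exists! Cleaning it.." | ||
# ${TMP_FOLDER:?} aborts if the variable is empty, so this can never expand to "rm -rf /*" | ||
rm -rf "${TMP_FOLDER:?}"/* | ||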
fi | ||
|
||
source functions.sh | ||
|
||
if [ -n "$6" ]; then | ||
QUERIES_TO_EXECUTE=$6 | ||
else | ||
QUERIES_TO_EXECUTE="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18" | ||
fi | ||
|
||
|
||
for format in $3 | ||
do | ||
for size in $2 | ||
do | ||
for query in $QUERIES_TO_EXECUTE | ||
do | ||
|
||
#echo "Monitoring q$query strategy0 no_slice size $size $format" | ||
#monitor-query $size "q$query" "strategy0" "no_slice" $format | ||
#echo "Monitoring q$query strategy1 no_slice size $size $format" | ||
#monitor-query $size "q$query" "strategy1" "no_slice" $format | ||
#echo "Monitoring q$query strategy1 slice size $size $format" | ||
#monitor-query $size "q$query" "strategy1" "slice" $format | ||
|
||
# ON_DISK | ||
echo "Monitoring q$query strategy1 no_slice size $size $format ondisk" | ||
monitor-query "$size" "q$query" "strategy1" "no_slice" "$format" "$TMP_FOLDER" | ||
|
||
done | ||
done | ||
done |