Scripts to set up ViziQuer, DataShapeServer and SPARQLforHumans - all at the same time

apiskunovs/vq-dss-sfh-setup

vq-dss-sfh-setup

Scripts to set up ViziQuer, DataShapeServer and SPARQLforHumans - all in one go.

Setup

Required command-line utilities:

  • npm
  • git
  • xcopy
  • dotnet
  • powershell (for Expand-Archive)
  • meteor (the only utility in this list that check.requirements.bat does not check)
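Since meteor is the one utility the check script leaves out, it can be checked by hand. The following is a minimal sketch, assuming a POSIX-style shell such as Git Bash or WSL in which the tool is on the PATH; the helper name check_commands is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given commands cannot be found on PATH.
# Returns 0 when every command is present, 1 otherwise.
check_commands() {
  local cmd missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok: $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# meteor is the utility that check.requirements.bat does not cover.
check_commands meteor || echo "Install the missing tools before running setup.bat"
```

The same function accepts several names at once, for example check_commands npm git dotnet meteor.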
  1. Check that the required commands are available on the system. Note: meteor is not checked.
check.requirements.bat
  1. Download all git repositories, then build and install the dependencies. This takes a minute or so, depending on network speed.
setup.bat
  1. This step needs support from the developers of data-shape-server: special access has to be granted and then set in the .env file.
START /W notepad.exe data-shape-server\server\.env
  1. Download Wikidata dump (~6h)
download.wikidata.dump.bat
  1. Process dump (~60 hours)
# For simplicity, run the following commands from the "SparqlForHumans.CLI" folder
cd SPARQLforHumans/SPARQLforHumans.CLI

# Filter the data. Takes ~7.5h.
# Not all of the dump is needed for our solution. This step keeps only the relevant content and
# writes it to a new file, *.filterAll.gz
dotnet run -- -i ../../data/wikidata/latest-truthy.nt.gz -f 

# Sort the data. With 16 parallel processes it takes around 3h; with 8 processes, ~5h.
# !!! Be aware that the gzip tool is required here. Originally it was suggested to install it along
# with the Git SDK, but in practice the same tool can probably be used from any Linux/Unix
# environment (e.g. WSL on Windows). The recommended sort command is provided in the output of the
# filtering step. Creates a new file, *.filterAll-Sorted.gz
gzip -dc latest-truthy.filterAll.gz | LANG=C sort -S 200M --parallel=16 -T /tmp --compress-program=gzip | gzip > latest-truthy.filterAll-Sorted.gz

# Build the entity and property indexes. This takes ~35h and ~2h respectively.
# The indexing command now supports a configurable BaseFolder (-b <path>). By default it is set to
# "%userprofile%\SparqlForHumans\Wikidata". The server configuration needs this path to locate the
# indexes. REMARK: both indexes are expected to be in the same BaseFolder.
dotnet run -- -i latest-truthy.filterAll-Sorted.gz -e 
dotnet run -- -i latest-truthy.filterAll-Sorted.gz -p

# ... OR specify the BaseFolder explicitly if the default one is not suitable
# dotnet run -- -i latest-truthy.filterAll-Sorted.gz -e -b ".\Wikidata"
# dotnet run -- -i latest-truthy.filterAll-Sorted.gz -p -b ".\Wikidata"
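Because the sort step runs for hours, it may be worth trying the pipeline on a tiny file first to confirm that gzip and GNU sort behave as expected in the chosen environment. The sketch below wraps the same pipeline in a function with the thread count as a parameter. The function name sort_gz, the sample file names and the sample lines are made up for illustration and do not reflect the real dump format. Note that a set LC_ALL takes precedence over LANG, so byte-order sorting relies on LC_ALL being unset.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sort pipeline shown above.
# Decompresses the input, sorts it, and recompresses the result.
# Assumes GNU sort (for --parallel and --compress-program) and gzip.
sort_gz() {
  local input=$1 output=$2 threads=${3:-8}
  gzip -dc "$input" \
    | LANG=C sort -S 200M --parallel="$threads" -T "${TMPDIR:-/tmp}" --compress-program=gzip \
    | gzip > "$output"
}

# Try it on a tiny made-up sample before committing hours to the real dump.
workdir=$(mktemp -d)
printf '%s\n' 'Q3 P31 Q5' 'Q1 P31 Q5' 'Q2 P279 Q1' | gzip > "$workdir/sample.gz"
sort_gz "$workdir/sample.gz" "$workdir/sample-Sorted.gz" 2

# Print the sorted sample for a visual check.
gzip -dc "$workdir/sample-Sorted.gz"
```

For the real run, the equivalent call would be sort_gz latest-truthy.filterAll.gz latest-truthy.filterAll-Sorted.gz 16.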

Running

  1. Start all environments. After about 15 seconds the ViziQuer page opens in the default web browser. User: [email protected] ; Password: admin
run.bat
