The tool is based on QDR's curation practices and will likely require modifications for other repositories.
This program executes five main tasks:
- Creating a local curation folder (ideally synced elsewhere, e.g. with Dropbox) to edit the data, with subfolders for Original Deposit and QDR Prepared
- Downloading the .zip file for the full data project to the Original Deposit folder and an unzipped version to QDR Prepared for further curation
- Creating GitHub issues for standard curation tasks and associating them with a GitHub Project for the curation of the data project
- Automatically setting metadata for PDF files based on Dataverse metadata
- Anonymizing PDFs by stripping the beginning of filenames and metadata
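As an illustration of the first task, the folder layout could be created along these lines. This is a minimal sketch, not dvcurator's actual code: the function name `make_curation_folders` and the project name in the usage note are hypothetical, and only the two subfolder names come from the description above.

```python
from pathlib import Path


def make_curation_folders(base_dir, project_name):
    """Create <base_dir>/<project_name> with the two curation subfolders.

    Hypothetical helper for illustration; not part of dvcurator's API.
    """
    project_dir = Path(base_dir) / project_name
    # Subfolder names taken from the task list above.
    for name in ("Original Deposit", "QDR Prepared"):
        # parents=True creates the project folder if needed;
        # exist_ok=True makes the call safe to repeat.
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir
```

Calling `make_curation_folders("QDR GA", "Example Project")` would create both subfolders under `QDR GA/Example Project` and return that path.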
Most users will want to use the applications available in the release section. A release is built whenever a new tag is pushed.
NOTE: Syracuse University office computers cannot run this program directly. Instead, download the SU Lab Install.bat file (right-click "Raw" at the top and choose "Save Link As") and run it. This installs the program properly and creates a shortcut on your desktop.
This program is operated primarily through the GUI. If you downloaded the self-contained binaries, just double-click and run.
Running the .exe file from the release page requires no additional software.
An .ini configuration file is used to save program settings, such as the GitHub and Dataverse tokens. After entering the values into the program, you can save an .ini config file from the "File" menu at the top.
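For readers unfamiliar with the format, the sketch below shows how settings like these can be written to and read from an .ini file with Python's standard `configparser` module. The section name `dvcurator` and the key names `github_token` and `dataverse_token` are assumptions made for illustration; the file that the program itself saves may use different names.

```python
import configparser


def save_settings(path, github_token, dataverse_token):
    """Write both tokens to an .ini file (section and key names are hypothetical)."""
    # interpolation=None keeps any "%" characters in a token literal.
    config = configparser.ConfigParser(interpolation=None)
    config["dvcurator"] = {
        "github_token": github_token,
        "dataverse_token": dataverse_token,
    }
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)


def load_settings(path):
    """Read the tokens back as a plain dict."""
    config = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as handle:
        config.read_file(handle)
    return dict(config["dvcurator"])
```

Because such a file holds API tokens in plain text, it is best kept out of shared or publicly synced folders.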
Some functions, like downloading public datasets, will operate without API tokens, but expect potential bugs.
To be fully functional, the following parameters must be set:
- A project DOI in the form doi:10.1234/abcdef. Once metadata is loaded for that DOI, use the "Reset dvcurator" button to input a different DOI.
- A GitHub token
  - To create a GitHub token, go to your GitHub developer settings/personal access tokens at https://github.com/settings/tokens
  - Click on "Generate New Token", and select "Generate New Token (classic)"
  - Give the token a recognizable name such as "QDR Curation" and check the following boxes:
    - repo
    - admin:org
    - project
  - Click "Generate Token" at the bottom of the screen. Make sure to note down your token and keep it safe (you won't be able to access this later)
- A Dataverse API key -- this must be for the Dataverse installation you will work with.
  - Find or create this under https://data.qdr.syr.edu/dataverseuser.xhtml?selectTab=apiTokenTab (substitute the domain if not using QDR)
Both of these tokens are entered into the main window of dvcurator, under "Github token" and "Dataverse token" respectively.
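To show how a project DOI and a Dataverse token fit together, here is a minimal sketch that builds (but does not send) a metadata request. It assumes the Dataverse native API endpoint `/api/datasets/:persistentId/` and the `X-Dataverse-key` header. Whether dvcurator issues exactly this request is an assumption, and the function name and DOI pattern are hypothetical.

```python
import re
import urllib.parse
import urllib.request

# Loose check for the "doi:10.1234/abcdef" form; an illustrative assumption,
# not dvcurator's own validation.
DOI_PATTERN = re.compile(r"^doi:10\.\d{4,9}/\S+$")


def build_dataset_request(host, doi, dataverse_token=None):
    """Build a GET request for a dataset's metadata without sending it."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Expected a DOI like doi:10.1234/abcdef, got {doi!r}")
    # urlencode percent-escapes the ":" and "/" inside the DOI.
    query = urllib.parse.urlencode({"persistentId": doi})
    url = f"https://{host}/api/datasets/:persistentId/?{query}"
    request = urllib.request.Request(url)
    if dataverse_token:
        # Public datasets can be read without a token; drafts cannot.
        request.add_header("X-Dataverse-key", dataverse_token)
    return request
```

Passing the returned object to `urllib.request.urlopen` would perform the request. Keeping construction separate from sending makes the URL and headers easy to inspect first.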
Other parameters are:
- QDR GA folder: Where the archive will be downloaded and extracted to. This usually points to a folder that syncs with Dropbox, but it does not have to. For QDR GAs, this should be exactly the "QDR GA" folder within the QDR Dropbox.
The more adventurous can install this package directly through pip. If you have both pip and git installed, this package can be downloaded and installed directly with:
pip install git+https://github.com/QualitativeDataRepository/dvcurator-python
Otherwise, this package can be installed from a zip file:
pip install dvcurator-python-master.zip
If you want to run dvcurator as an interpreted program, the Python library requirements are listed in requirements.txt.
Installations through pip can be run directly, e.g.
python3 -m dvcurator