Data Analysis of the SQL vs. Visual Diagrams on time and correctness matching relational query patterns Mturk study
-
Clone the repo or otherwise download the files.
-
cd to the repo directory. Create and activate a virtual environment for this project. You may need to modify the commands you use depending on which Python you have installed and how your machine is configured. Note that the authors used Python 3.10.11 on Windows with the packages specified in requirements.txt.
-
Run the setup commands below.
- On macOS or Linux, run these three commands separately in case there are errors:
python3 -m venv env
source env/bin/activate
which python
- On Windows, run these three commands separately in case there are errors:
python -m venv env
.\env\Scripts\activate.bat
where.exe python
Check the path(s) provided by which python or where.exe python: the first one listed should be inside the env folder you just created.
-
Install necessary packages
pip install -r requirements.txt
If you want the latest package versions rather than the exact versions we used, run this instead:
pip install -r requirements-basic.txt
If you have trouble running any of these steps, see the Troubleshooting section below.
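After installing, it can help to confirm that the active environment really matches the pinned versions. The following is a minimal sketch, not part of the original project: the function names are hypothetical, it assumes requirements.txt uses plain name==version lines, and it ignores extras such as name[extra].

```python
from importlib import metadata


def parse_pins(lines):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0]   # drop trailing comments
        line = line.split(";", 1)[0]  # drop environment markers
        line = line.strip()
        if "==" not in line:
            continue  # skip blanks, options, and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed_or_None)} where versions differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Example usage (assumes requirements.txt is UTF-8 text in the current folder):
# with open("requirements.txt", encoding="utf-8") as handle:
#     print(find_mismatches(parse_pins(handle)))
```

An empty result means every pinned package is installed at the pinned version. If you installed from the unpinned requirements file instead, mismatches are expected.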
- Run
jupyter lab
It should open your browser and let you select any Jupyter Notebook .ipynb file.
- Run individual cells with Ctrl+Enter. In the menu you can run all cells and restart the kernel to clear variables.
-
Make sure to save your .ipynb file and shut down Jupyter Lab properly through the File menu. Otherwise you need to use
jupyter notebook stop
-
Deactivate the venv to return to your terminal using
deactivate
-
If you have made any changes to the required packages you should export a list of all installed packages and their versions:
pip freeze > requirements.txt
-
Before you commit a Jupyter Notebook .ipynb file, clear the outputs of all cells. This decreases file size, removes unnecessary metadata, and makes diffs easier to understand. In Jupyter Lab you can use the GUI: Edit->Clear All Outputs.
- Install jq by running
sudo apt-get install jq
For more options check here.
- Append the following block of code to either your local repo .gitconfig file or your global .gitconfig. I recommend doing it in your global .gitconfig so you don't need to redo it for future .ipynb files. For more details look at this great tutorial here.
[core]
attributesfile = ~/.gitattributes_global
[filter "nbstrip_full"]
clean = "jq --indent 1 \
    '(.cells[] | select(has(\"outputs\")) | .outputs) = [] \
    | (.cells[] | select(has(\"execution_count\")) | .execution_count) = null \
    | .metadata = {\"language_info\": {\"name\": \"python\", \"pygments_lexer\": \"ipython3\"}} \
    | .cells[].metadata = {} \
    '"
smudge = cat
required = true
- Create a global gitattributes file named .gitattributes_global (usually placed at the root level, so ~/.gitattributes_global).
- Add the following line of code to it:
*.ipynb filter=nbstrip_full
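If jq is not available on your machine, the same cleanup can be approximated in Python. This is a sketch under assumptions, not the project's method: the function names are hypothetical, it mirrors the four steps of the jq filter (empty outputs, null execution counts, minimal notebook metadata, empty cell metadata), and its output is not guaranteed to be byte-for-byte identical to what jq produces.

```python
import json


def strip_notebook(notebook):
    """Return a copy of a notebook dict with outputs and metadata cleared."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy of plain JSON data
    for cell in cleaned.get("cells", []):
        if "outputs" in cell:
            cell["outputs"] = []
        if "execution_count" in cell:
            cell["execution_count"] = None
        cell["metadata"] = {}
    cleaned["metadata"] = {
        "language_info": {"name": "python", "pygments_lexer": "ipython3"}
    }
    return cleaned


def strip_notebook_text(text):
    """Strip a notebook given as JSON text and return JSON text (indent 1)."""
    stripped = strip_notebook(json.loads(text))
    return json.dumps(stripped, indent=1, ensure_ascii=False) + "\n"


# Example usage (the file name is a placeholder):
# with open("analysis.ipynb", encoding="utf-8") as handle:
#     print(strip_notebook_text(handle.read()))
```

The function works on a copy, so the notebook you pass in is left unchanged.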
- In the JupyterLab menu click File → Save and Export Notebook As → PDF and wait for it to finish and download the PDF.
-
To get markdown section numbering use jupyterlab-toc. To install run:
jupyter labextension install @jupyterlab/toc
- Then click the enumerated list button on the left strip in JupyterLab to bring up the table of contents. There you can click the itemized list button in the top to add section numbers to the markdown cells.
-
For spell checking, the following extension is useful.
- To install run:
jupyter labextension install @ijmbarr/jupyterlab_spellchecker
-
To install both of the above run:
jupyter labextension install @jupyterlab/toc @ijmbarr/jupyterlab_spellchecker
-
Are you using Python 3.6 or newer? Check inside your virtual environment by running
python --version
If not, download and install an updated version of Python for your OS.
-
Are you using the Anaconda Distribution? We've had nothing but trouble using Anaconda with Jupyter Lab. See the instructions at the end of the venv virtual environment section below.
-
If you get a NotImplementedError for asyncio while running Python 3.8, edit /env/Lib/site-packages/tornado/platform/asyncio.py following the instructions here. Right after the line
import asyncio
add these lines:
import sys
if sys.platform == 'win32':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
-
Are you in your virtual environment? Your command prompt / terminal prompt should be prefixed with (env) to show you that.
-
Are you using the python executable from your virtual environment?
-
Check it!
- On macOS or Linux, run:
which python
- On Windows, run:
where.exe python
Check the path(s) provided by which python or where.exe python: the first one listed should be inside the env folder you just created.
-
If the first listed path is not inside the env folder you just created, then find a way to run the correct python executable.
- One common problem is having the Anaconda Distribution installed. E.g., your first listed path matches one of these. In that case, one fix is to uninstall Anaconda completely (option B listed here) and install basic Python (if you don't have it already).
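The same check can be made from inside Python, which is handy in a notebook cell where which and where.exe are awkward to run. This is a minimal sketch with hypothetical helper names. It assumes the environment was created with venv and lives in a folder called env, and it compares paths without following symlinks, because env/bin/python is usually a symlink to the base interpreter on macOS and Linux.

```python
import os
import sys
from pathlib import Path


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def executable_is_inside(executable, env_dir):
    """True when the interpreter path lies somewhere under env_dir."""
    # abspath normalizes the path but does not resolve symlinks.
    exe = Path(os.path.abspath(executable))
    env = Path(os.path.abspath(env_dir))
    return env in exe.parents


# Example usage, run from the repo directory:
# print(sys.executable)
# print(in_virtual_environment(), executable_is_inside(sys.executable, "env"))
```

If either check is false while your prompt shows (env), the python on your PATH is probably not the one from the environment.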
-
-
Did you rename your env folder after creating it? If so, delete it and run the commands to create it again. venv uses hard-coded paths, so renaming the folder is fraught.
-
You may get a warning like
WARNING: You are using pip version 20.1.1; however, version 20.2.3 is available. You should consider upgrading...
You don't need to worry about fixing this.
-
You may run into issues where pip is using a different python than Jupyter Lab is running. E.g., you may install a package but then Jupyter complains it is unavailable. In that case:
- Instead of pip install, try running python -m pip install
- Additionally, check if python is from the same environment as pip:
- On macOS or Linux: which pip or which pip3 and which python
- On Windows: where.exe pip or where.exe pip3 and where.exe python
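When the mismatch shows up inside Jupyter, one way to sidestep it is to call pip through the interpreter that the notebook kernel is actually running. This is a sketch with hypothetical function names, relying only on sys.executable and the standard python -m pip invocation.

```python
import subprocess
import sys


def pip_install_command(*packages):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


def install(*packages):
    """Install packages into the environment of the running interpreter."""
    subprocess.check_call(pip_install_command(*packages))


# Example usage in a notebook cell:
# install("altair")
```

Because the command starts with sys.executable, the package lands in the same environment the kernel imports from, whichever pip happens to be first on your PATH.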
-
You can also try installing the required packages without pinning to particular versions like we have done in requirements.txt. Do this by running:
pip install -r requirements-noversions.txt
You can also run the installs one by one to see if there are issues. E.g.,
pip install altair
-
If you are using PowerShell (not the Command Prompt) and you get an error message saying "the execution of scripts is disabled on this system", follow these steps.
- Open a new PowerShell as Administrator.
- Enable running unsigned scripts by entering
set-executionpolicy remotesigned
See the documentation for details.
-
If you get this error
numpy.distutils.system_info.NotFoundError: no lapack/blas resources found
try installing it manually. (Instructions modified from here.)
-
Open PowerShell, cd to your repo folder, and enter your virtual environment.
-
Download the numpy+mkl wheel from one of the links here. Use the version that matches your Python version (check using python --version). E.g., if your Python is 3.6.2, download the wheel which shows cp36. Keep the original file name when saving, because pip rejects wheel files whose names do not follow the wheel naming convention. E.g., for Python 3.9:
wget https://download.lfd.uci.edu/pythonlibs/x2tqcw5k/numpy-1.19.2+mkl-cp39-cp39-win_amd64.whl -OutFile numpy-1.19.2+mkl-cp39-cp39-win_amd64.whl
-
Install the wheel:
pip install numpy-1.19.2+mkl-cp39-cp39-win_amd64.whl
-
Likewise, install SciPy from one of the links here using the same version as your Python. E.g., for Python 3.9:
wget https://download.lfd.uci.edu/pythonlibs/x2tqcw5k/scipy-1.5.2-cp39-cp39-win_amd64.whl -OutFile scipy-1.5.2-cp39-cp39-win_amd64.whl
pip install scipy-1.5.2-cp39-cp39-win_amd64.whl
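To find out which cpXY tag matches your interpreter without reading it off python --version by hand, a short sketch like the following can help. The helper names are hypothetical, and the tag logic covers only the simple CPython case described above (cp followed by the major and minor version).

```python
import sys


def cpython_tag(version_info=None):
    """Return the CPython wheel tag, e.g. 'cp39' for Python 3.9."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    return f"cp{major}{minor}"


def wheel_matches(filename, tag):
    """True when a wheel file name carries the given Python tag."""
    return f"-{tag}-" in filename


# Example usage:
# print(cpython_tag())  # tag for the interpreter running this code
```

For the wheel named in the step above, wheel_matches("numpy-1.19.2+mkl-cp39-cp39-win_amd64.whl", cpython_tag((3, 9))) is true, so that file is the right one for Python 3.9.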
-
-
When you run pip install -r requirements.txt, pip install numpy, or pip install scipy you may get this error:
RuntimeError: Broken toolchain: cannot link a simple C program
Note that this error may be in the middle / end of a large error message. It means that GCC is not available for compiling C programs (which Python is based on). Follow these steps:
- Try running
brew help
to see if you have Homebrew installed. If you get command not found, install Homebrew by running:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
- Then run
brew install gcc
- Try running this again:
pip install -r requirements.txt
This readme.md file and the preregistration template are based on Leventidis et al., 2020, which was released under CC-By Attribution 4.0 International.