Overview

slobodan-ilic edited this page Sep 13, 2018 · 10 revisions

SCrunch is a Pythonic scripting library for cleaning data in Crunch.

Its two most significant pieces are the class and function helpers, which ease interaction with the Crunch HTTP API, and the Expression Filter interpreter, which provides a concise way to express exclusions and other kinds of filters for use in Crunch.

Installation Guide

Supported Platforms

  • Python 2.7
  • Python 3.4
  • Python 3.5
  • Python 3.6
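
As a quick pre-install check, the list above can be compared against the running interpreter. This is a minimal sketch, not part of SCrunch; the names SUPPORTED_VERSIONS and is_supported are hypothetical helpers introduced only for illustration.

```python
import sys

# Versions listed under "Supported Platforms" above.
SUPPORTED_VERSIONS = {(2, 7), (3, 4), (3, 5), (3, 6)}


def is_supported(version_info=None):
    """Return True if the (major, minor) pair is one of the listed versions."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    status = "listed as supported" if is_supported() else "not listed as supported"
    print("Python {} is {}".format(current, status))
```

A version missing from the list is not necessarily broken; it only means this page does not list it.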

Supported Installation Methods

Install using pip

SCrunch is not available through PyPI yet. In the meantime, you can install the library by checking out the code and installing it manually:

  1. check out the code
  2. get into the directory:
cd scrunch
  3. execute the pip install command:
pip install -r requirements.txt

All done. You should now have the latest version of the pycrunch and scrunch libraries installed.

How to update SCrunch

Updating scrunch requires pulling all changes committed to the repository and reinstalling the library. Do it with the following commands:

  1. get into the directory:
cd scrunch
  2. pull all changes:
git pull origin master
  3. once again, execute the pip install command:
pip install -r requirements.txt
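
If you update often, the steps above can be wrapped in a small script. The sketch below assumes the checkout lives in a directory named scrunch relative to where the script runs, and that git and pip are on your PATH; update_scrunch and UPDATE_COMMANDS are hypothetical names, not part of the library.

```python
import subprocess

# The two commands from the update steps above, run inside the checkout.
UPDATE_COMMANDS = [
    ["git", "pull", "origin", "master"],
    ["pip", "install", "-r", "requirements.txt"],
]


def update_scrunch(checkout_dir="scrunch", run=subprocess.check_call):
    """Run each update command in checkout_dir, stopping at the first failure.

    subprocess.check_call raises CalledProcessError on a non-zero exit
    status, so pip is not run if the pull fails.
    """
    for command in UPDATE_COMMANDS:
        run(command, cwd=checkout_dir)
```

Calling update_scrunch() performs the pull and the reinstall. The run parameter exists only so the command sequence can be inspected without executing anything.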

Checking the Installed SCrunch Version

Check package version

The package version is available at scrunch.version.__version__. Check your version with the following code:

>>> import scrunch.version
>>> scrunch.version.__version__
'0.1.0'
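
To compare the installed version against a minimum from a script, the string can be converted to a tuple of integers. This is a sketch under the assumption that the version consists only of dot-separated numbers, as in '0.1.0'; a suffix such as a development tag would need extra handling. Both function names are hypothetical.

```python
def parse_version(version_string):
    """Turn a dotted version string such as '0.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version_string.split("."))


def installed_scrunch_version():
    """Return scrunch's version string, or None if scrunch is not importable."""
    try:
        import scrunch.version
    except ImportError:
        return None
    return scrunch.version.__version__


if __name__ == "__main__":
    version = installed_scrunch_version()
    if version is None:
        print("scrunch is not installed in this environment")
    else:
        recent_enough = parse_version(version) >= (0, 1, 0)
        print("scrunch {} (at least 0.1.0: {})".format(version, recent_enough))
```

Comparing tuples rather than raw strings avoids the case where '0.10.0' sorts before '0.9.0'.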

Check revision

When installing from source (either directly from GitHub using pip or from your local copy), you install the library at a specific revision. Execute the following command to get the revision information:

pip freeze|grep scrunch

You will see output similar to the following:

-e git+https://github.com/Crunch-io/scrunch@53e814ff7989fc275442ca98a756da6354488e21#egg=scrunch
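
If you need the revision in a script, a line in this format can be split into its parts. The sketch below assumes an editable git install whose revision is a hexadecimal commit hash, exactly as in the output above; other forms that pip may print (branch names, extra URL fragments) are not handled. The names EDITABLE_LINE and parse_editable_line are hypothetical.

```python
import re

# Pattern for editable git lines of the form shown above:
#   -e git+<repository url>@<revision>#egg=<name>
EDITABLE_LINE = re.compile(
    r"^-e\s+git\+(?P<url>\S+)@(?P<revision>[0-9a-fA-F]+)#egg=(?P<name>\S+)$"
)


def parse_editable_line(line):
    """Return a dict with url, revision and name, or None if no match."""
    match = EDITABLE_LINE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    sample = (
        "-e git+https://github.com/Crunch-io/scrunch"
        "@53e814ff7989fc275442ca98a756da6354488e21#egg=scrunch"
    )
    print(parse_editable_line(sample))
```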

Developing SCrunch

Follow the instructions below to set up your local development environment.

How to check out the code

Clone the git repository from GitHub:

git clone https://github.com/Crunch-io/scrunch.git

or use your own repository URL if you have a fork:

git clone https://github.com/YOUR-GITHUB-USERNAME/scrunch.git

How to set up the development environment

Get into the scrunch directory and install all development dependencies:

cd scrunch
pip install -e .[testing] -r requirements.txt

How to run the tests

Now that you have a local copy with all dependencies installed, you can run the test suite using py.test to confirm that everything works:

py.test

This will produce something like the following:

Test session starts (platform: darwin, Python 3.6.0, pytest 3.0.5, pytest-sugar 0.8.0)
rootdir: /Users/dmo/tmp/scrunch, inifile: pytest.ini
plugins: sugar-0.8.0, cov-2.4.0, backports.unittest-mock-1.2.1

 scrunch/tests/test_datasets.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓                                               34% ███▍      
 scrunch/tests/test_expressions.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓           90% █████████ 
 scrunch/tests/test_recodes.py ✓✓✓✓✓✓✓✓✓✓✓✓                                                       98% █████████▊
 scrunch/tests/test_transformations.py ✓✓                                                         99% █████████▉
 scrunch/tests/test_utilities.py ✓                                                               100% ██████████

-------- coverage: platform darwin, python 3.6.0-candidate-1 ---------
Name                     Stmts   Miss  Cover   Missing
------------------------------------------------------
scrunch/__init__.py          2      0   100%
scrunch/datasets.py        847    193    77%   49-72, 85-97, ..., 1627-1660, 1669
scrunch/expressions.py     346      8    98%   133-136, 220, 446, 493, 507, 513
scrunch/variables.py         4      0   100%
scrunch/version.py           1      0   100%
------------------------------------------------------
TOTAL                     1200    201    83%


Results (3.13s):
     146 passed

Run tests using tox

This project uses tox to execute the test suite against all supported Python versions. It also checks code quality with flake8 and isort.

Check your installed tox version by executing:

tox --version

This should produce something like:

2.5.0 imported from /Users/dmo/.virtualenvs/scrunch/lib/python3.4/site-packages/tox/__init__.py

If it fails, it is because tox is not installed. Make it available by executing:

pip install -r dev_requirements.txt

You can run all tox environments at once by executing the tox command with no arguments. It runs the tests for every configured Python version, plus flake8 and isort.

tox
...TRIMMED// progress of tests for each environment ... 
___________________ summary ___________________
  py27: commands succeeded
  py34: commands succeeded
  py35: commands succeeded
  py36: commands succeeded
  flake8: commands succeeded
  isort: commands succeeded
  congratulations :)

You must have all supported Python versions on your development machine to run every tox environment. If one of them is missing, you can still run tox for a specific Python version or another environment by passing its name with -e ENV_NAME, for example:

tox -e py35

or

tox -e flake8

All available tox environments are: py27, py34, py35, py36, flake8, isort
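
When only some interpreters are installed, the list of environments to pass to -e can be worked out automatically. The sketch below is an illustration, not part of the project: it assumes Python 3.3 or later (for shutil.which), assumes interpreters are installed under the conventional names python2.7, python3.4 and so on, and assumes the flake8 and isort environments can run with whatever interpreter is available. tox accepts a comma-separated list after -e.

```python
import shutil

# Interpreter executable assumed for each Python tox environment.
INTERPRETERS = {
    "py27": "python2.7",
    "py34": "python3.4",
    "py35": "python3.5",
    "py36": "python3.6",
}
# Environments assumed not to need one specific interpreter.
TOOL_ENVS = ["flake8", "isort"]


def runnable_envs(which=shutil.which):
    """List tox environments whose interpreter is found on PATH."""
    python_envs = [
        env for env, exe in sorted(INTERPRETERS.items()) if which(exe) is not None
    ]
    return python_envs + TOOL_ENVS


def tox_command(envs):
    """Build the tox invocation for a comma-separated environment list."""
    return ["tox", "-e", ",".join(envs)]


if __name__ == "__main__":
    print(" ".join(tox_command(runnable_envs())))
```

The printed command can be copied into a shell or passed to subprocess.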

Run integration tests

You can run a set of integration tests against a real API endpoint. To do so, make the following environment variables available:

  • SCRUNCH_HOST: URL of the API host, such as https://app.crunch.io/
  • SCRUNCH_USER: An email address to log in as (must be an advanced Crunch.io user)
  • SCRUNCH_PASS: Password for the given user

Then, run the tests with the following command from the repository root:

py.test integration/

This runs all the test files under the integration directory.
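
Before starting the integration run, it can help to confirm that the three variables are set. This is a minimal sketch; missing_integration_vars is a hypothetical helper, and it treats an empty value the same as an unset one, which is an assumption rather than documented behaviour of the test suite.

```python
import os

# The variables listed in the integration test instructions above.
REQUIRED_VARS = ("SCRUNCH_HOST", "SCRUNCH_USER", "SCRUNCH_PASS")


def missing_integration_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_integration_vars()
    if missing:
        print("Set these before running py.test integration/: " + ", ".join(missing))
    else:
        print("All integration variables are set")
```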

Having issues?

If you find anything wrong during the installation or use of this library, please open a ticket in the issue tracker, or submit a pull request if you managed to fix it yourself.