Release 1.0.0 (#88)

Co-authored-by: Ed Moss <[email protected]>
IDEMSInternational · Jul 12, 2023 · 930b6b7
1 parent d77976b

Showing 145 changed files with 618 additions and 376 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -1,19 +1,22 @@
 name: Project Tests
-on: [push]
+on:
+  push:
+  pull_request:
+    branches:
+      - main
 
 jobs:
   build:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
       - name: Set up Python 3.8
-        uses: actions/setup-python@v1
+        uses: actions/setup-python@v4
         with:
           python-version: 3.8
       - name: Install dependencies
         run: |
-          cd $GITHUB_WORKSPACE
           python -m pip install --upgrade pip
-          pip install -r requirements.txt
-      - name: Run unittests
-        run: python -m unittest
+          pip install .
+      - name: Run unit tests
+        run: python -m unittest discover -s src
diff --git a/README.md b/README.md
@@ -1,87 +1,84 @@
 # RapidPro Flow Toolkit
-Toolkit for using spreadsheets to create and modify RapidPro flows 
 
-This is a clean up of https://github.com/IDEMSInternational/conversation-parser
+Toolkit for using spreadsheets to create and modify RapidPro flows.
 
-In the future this should also include a rewrite of https://github.com/geoo89/rapidpro_abtesting
+# Quickstart
 
-## Setup
-1. Install python `>=3.6`
-2. Run `pip install -r requirements.txt`
-
-## Console tool
-```
-main.py {create_flows,flow_to_sheet} input1 input2 ... -o output --format {csv,xlsx,google_sheets} [--datamodels DATAMODELS]
+```sh
+pip install rpft
+rpft --help
 ```
 
-Example:
-```
-main.py create_flows tests/input/example1/content_index.csv -o out.json --format=csv --datamodels=tests.input.example1.nestedmodel
+# Command Line Interface (CLI)
+
+The CLI allows spreadsheets in various formats to be converted to RapidPro flows in JSON format. Full details of the available options can be found via the help feature:
+
+```sh
+rpft --help
 ```
 
-`main.py -h` for more details.
+Below is a concrete example of a valid execution of the command line tool. The line breaks are merely for improving readability; the command would also be valid on a single line.
 
-## Processing Google sheets
+```sh
+rpft create_flows \
+  --output flows.json \
+  --datamodels=tests.input.example1.nestedmodel \
+  --format=csv \
+  src/rpft/tests/input/example1/content_index.csv
+```
 
-Follow the steps _Enable the API_ and _Authorize credentials for a desktop application_ from 
-https://developers.google.com/sheets/api/quickstart/python
+# Using the toolkit in other Python projects
 
-Note: Google sheets need to be in native Google sheets format,
-not `XLSX`, `XLS`, `ODS`, etc
+1. Add the package `rpft` as a dependency of your project, e.g. in `requirements.txt` or `pyproject.toml`
+1. Import the `create_flows` function
+1. Call `create_flows` to convert spreadsheets to flows
 
-## Running Tests
-1. Run `python -m unittest`
+```python
+from rpft.converters import create_flows
 
-# Components
+sheets = ["sheet1.csv", "sheet2.csv"]
+create_flows(
+    sheets, "flows.json", "csv", data_models="your_project.models"
+)
+```
 
-## Generic parsers from spreadsheets to data models
+_It should be noted that this project is still considered beta software that may change significantly at any time._
 
-### Cell parser
+# RapidPro flow spreadsheet format
 
-See `./parsers/common/cellparser.py`. Parser to convert a spreadsheet cell
-into a nested list. (Currently no nesting as only `;` is supported as an
-element separator.)
+The expected contents of the input spreadsheets are [documented separately][3].
 
-### Row parser
+# Processing Google Sheets
 
-See `./parsers/common/rowparser.py`. Parser to turn rows of a sheet
-into a specified data model. Column headers determine which field of the
-model the column contains data for, and different ways to address fields
-in the data models are supported. See `./parsers/common/tests/test_full_rows.py`
-and `./parsers/common/tests/test_differentways.py` for examples.
+It is possible to read in spreadsheets via the Google Sheets API by specifying `--format=google_sheets` on the command line. Spreadsheets must be in the Google Sheets format rather than XLSX, CSV, etc.
 
-The reverse operation is also supported, but only to a limited extent:
-All models are spread out into a flat dict of fields, each becoming the
-header of a column.
+Instead of specifying paths to individual spreadsheets on your local filesystem, you must supply the IDs of the Sheets you want to process. The ID can be extracted from the URL of the Sheet i.e. docs.google.com/spreadsheets/d/**ID**/edit.
 
-### Sheet parser
+The toolkit will need to authenticate with the Google Sheets API and be authorized to access your spreadsheets. Two methods for doing this are supported.
 
-See `./parsers/common/sheetparser.py`.
+- **OAuth 2.0 for installed applications**: for cases where human interaction is possible e.g. when using the CLI
+- **Service accounts**: for cases where interaction is not possible or desired e.g. in automated pipelines
 
-## RapidPro tools
+## Installed applications
 
-### RapidPro models
+Follow the steps in the [set up your environment section][1] of the Google Sheets quickstart for Python.
 
-See `./rapidpro/models`. Models for flows, nodes, etc, with convenience
-functions to assemble RapidPro flows. Each model has a `render` method
-to render the model into a dictionary, that can be exported to a json
-file whose fields are consistent with the format used by RapidPro.
+Once you have a `credentials.json` file in your current working directory, the toolkit will automatically use it to authenticate whenever it runs. The refresh token (`token.json`) is saved automatically in the same directory, so the full authentication process is not needed every time.
 
-### Standard format flow parser
+## Service accounts
 
-See `./parsers/creation/flowparser.py`. Parser to turn sheets in
-the standard format (Documentation TBD) into RapidPro flows.
-See `./tests/input` and `./tests/output` for some examples.
+Follow the steps in the [creating a service account section][2] to obtain a service account key. The toolkit will accept the key as an environment variable called `CREDENTIALS`.
 
-Examples: 
-- `./tests/test_flowparser.py`
-- `./parsers/creation/tests/test_flowparser.py`
+```sh
+export CREDENTIALS=$(cat service-account-key.json)
+rpft ...
+```
 
-### Parsing collections of flows (with templating)
+# Development
 
-See `./parsers/creation/contentindexparser.py`, `parse_all_flows`.
-Examples: 
-- `./tests/test_contentindexparser.py`
-- `./parsers/creation/tests/test_contentindexparser.py`
+For instructions on setting up a development environment for the toolkit, see the [development][4] page.
 
-Documentation (request access): https://docs.google.com/document/d/1Onx2RhNoWKW9BQvFrgTc5R5hcwDy1OMsLKnNB7YxQH0/edit?pli=1#
+[1]: https://developers.google.com/sheets/api/quickstart/python#set_up_your_environment
+[2]: https://developers.google.com/identity/protocols/oauth2/service-account#creatinganaccount
+[3]: https://docs.google.com/document/d/1Onx2RhNoWKW9BQvFrgTc5R5hcwDy1OMsLKnNB7YxQH0/edit?pli=1#
+[4]: https://github.com/IDEMSInternational/rapidpro-flow-toolkit/blob/main/docs/development.md
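The README above says that the Sheet ID is taken from a URL of the form docs.google.com/spreadsheets/d/**ID**/edit. A minimal sketch of that extraction follows. The `extract_sheet_id` helper and its regular expression are assumptions made for illustration; they are not part of `rpft`, and the character set allowed in IDs is a guess.

```python
import re

# Assumption: IDs consist of letters, digits, "-" and "_".
SHEET_ID_PATTERN = re.compile(r"/spreadsheets/d/([A-Za-z0-9_-]+)")


def extract_sheet_id(url: str) -> str:
    """Return the ID segment of a Google Sheets URL.

    Hypothetical helper for illustration; not part of the rpft package.
    """
    match = SHEET_ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"No Google Sheets ID found in: {url}")
    return match.group(1)
```

For a made-up URL such as `https://docs.google.com/spreadsheets/d/abc123_XYZ/edit#gid=0`, the helper returns `abc123_XYZ`, which could then be passed to the CLI in place of a file path when `--format=google_sheets` is used.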
diff --git a/coverage.sh b/coverage.sh
@@ -1,3 +1,3 @@
 #!/bin/sh
-python3 -m coverage run --source . --omit="*/test*" -m unittest
+python3 -m coverage run --source src --omit="*/test*" -m unittest discover -s src
 python3 -m coverage html
diff --git a/docs/components.md b/docs/components.md
@@ -0,0 +1,49 @@
+# Generic parsers from spreadsheets to data models
+
+## Cell parser
+
+See `./parsers/common/cellparser.py`. Parser to convert a spreadsheet cell
+into a nested list. (Currently no nesting as only `;` is supported as an
+element separator.)
+
+## Row parser
+
+See `./parsers/common/rowparser.py`. Parser to turn rows of a sheet
+into a specified data model. Column headers determine which field of the
+model the column contains data for, and different ways to address fields
+in the data models are supported. See `./parsers/common/tests/test_full_rows.py`
+and `./parsers/common/tests/test_differentways.py` for examples.
+
+The reverse operation is also supported, but only to a limited extent:
+All models are spread out into a flat dict of fields, each becoming the
+header of a column.
+
+## Sheet parser
+
+See `./parsers/common/sheetparser.py`.
+
+# RapidPro tools
+
+## RapidPro models
+
+See `./rapidpro/models`. Models for flows, nodes, etc., with convenience
+functions to assemble RapidPro flows. Each model has a `render` method
+that renders the model into a dictionary, which can be exported to a JSON
+file whose fields are consistent with the format used by RapidPro.
+
+## Standard format flow parser
+
+See `./parsers/creation/flowparser.py`. Parser to turn sheets in
+the standard format (Documentation TBD) into RapidPro flows.
+See `./tests/input` and `./tests/output` for some examples.
+
+Examples:
+- `./tests/test_flowparser.py`
+- `./parsers/creation/tests/test_flowparser.py`
+
+## Parsing collections of flows (with templating)
+
+See `./parsers/creation/contentindexparser.py`, `parse_all_flows`.
+Examples:
+- `./tests/test_contentindexparser.py`
+- `./parsers/creation/tests/test_contentindexparser.py`
diff --git a/docs/development.md b/docs/development.md
@@ -0,0 +1,73 @@
+# Setup
+
+1. Install Python >= 3.7
+1. Clone the source code repository: `git clone https://github.com/IDEMSInternational/rapidpro-flow-toolkit`
+1. Change to the project root directory: `cd rapidpro-flow-toolkit`
+1. Create a virtual environment: `python -m venv .venv`
+1. Activate venv: `source .venv/bin/activate`
+1. Upgrade pip: `pip install --upgrade pip`
+1. Install the project in dev mode: `pip install --editable .`
+
+# Running tests
+
+```sh
+python -m unittest discover -s src
+```
+
+# Build
+
+1. Install the build tool: `pip install --upgrade build`
+1. Build the project: `python -m build`
+1. Results of the build should be found in the `dist` directory
+
+To verify that the build produced a valid and working Python package, install the package in a clean virtual environment.
+
+```sh
+python -m venv build_verification
+source build_verification/bin/activate
+pip install dist/rpft-x.y.z-py3-none-any.whl
+rpft --help
+deactivate
+rm -rf build_verification
+```
+
+# Release
+
+You will need:
+
+- sufficient access to the GitHub repo to create releases
+- the project in a tested and fully-working state
+
+Once ready:
+
+1. Create a release in GitHub
+1. Decide what the next version number should be
+1. Edit the release notes
+1. Publish the release
+
+Upon publishing, the project should be tagged automatically with the release version number. This can be used later to build specific versions of the project.
+
+# Upload to PyPI
+
+## TestPyPI
+
+It is recommended to get comfortable with uploading packages to PyPI by first experimenting on the test index. See [Using TestPyPI][1] for details.
+
+## PyPI
+
+You will need:
+
+- an account on PyPI
+- membership of the `rapidpro-flow-tools` project in PyPI
+- the `twine` package installed
+
+Once ready:
+
+1. Check out the project at the relevant release tag
+1. Build the project as per the [Build](#build) section
+1. Upload to TestPyPI: `twine upload -r testpypi dist/*`
+1. Check everything looks ok
+1. Upload to PyPI: `twine upload dist/*`
+
+
+[1]: https://packaging.python.org/en/latest/guides/using-testpypi/
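The build verification steps above install `dist/rpft-x.y.z-py3-none-any.whl`, where `x.y.z` stands for the version that was built. As a rough sketch, the wheel files in `dist` could be located from Python rather than typed by hand. The `find_wheels` helper is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import List


def find_wheels(dist_dir: str = "dist", package: str = "rpft") -> List[Path]:
    """Return wheel files for `package` found in `dist_dir`, sorted by name.

    Hypothetical helper for illustration; not part of the rpft package.
    """
    # Wheel filenames start with "<package>-<version>-", so a glob suffices.
    return sorted(Path(dist_dir).glob(f"{package}-*.whl"))
```

An empty list suggests that `python -m build` has not been run yet or that it wrote its output elsewhere.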
diff --git a/main.py b/main.py
diff --git a/parsers/creation/datarowmodel.py b/parsers/creation/datarowmodel.py