Dockerization, Poetry, and Github Actions (#8)
* Add poetry

* Update pyproject.toml

* Fix tests incorrect API tag

* Refactor output paths

* Dockerize

* Add github action for unittest

* Change python version in github action

* Change python version to 3.9 from 3.9.6 for ubuntu github action

* Remove setup.cfg

* Add few changes in github actions file

* Change name for unittest

* Update file path

* Update README.md

* Add s flag in unittest command

* Add private/output to git

* Add private/output to git

* Change output paths to /private/output

* Use poetry virtualenv and update docker-compose

* Update README.md

* Add .gitkeep

* Add private/ to .gitignore

* Edit gitkeep comment
ibrahimjaved12 committed Jan 22, 2024
1 parent 4183324 commit 69525d5
Showing 14 changed files with 644 additions and 98 deletions.
33 changes: 33 additions & 0 deletions .github/workflows/python-unittest.yml
@@ -0,0 +1,33 @@
---
name: Backend Unit Tests
on:
  push:
    branches:
      - main
  pull_request:
jobs:
  python-unittests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          version: 1.5.1
          virtualenvs-create: true
          virtualenvs-in-project: true

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.9"
          cache: "poetry"

      - name: Set up Poetry Path
        run: echo "$HOME/.local/bin" >> $GITHUB_PATH

      - name: Install dependencies
        run: poetry install --no-interaction

      - name: Run unittests
        run: poetry run python -m unittest discover -s tests
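
The last workflow step runs `python -m unittest discover -s tests`, which imports every module matching `test*.py` under `tests/` and runs the `TestCase` classes it finds. A minimal sketch of roughly the same discovery done programmatically is shown here; the temporary directory, the `test_example.py` module, and the helper names `discover_and_run` and `run_example` are illustrative assumptions, not part of this commit.

```python
import tempfile
import textwrap
import unittest
from pathlib import Path

# Hypothetical test module used only for illustration; the real tests
# live in the repository's tests/ directory.
EXAMPLE_TEST = textwrap.dedent(
    '''
    import unittest


    class ExampleTest(unittest.TestCase):
        def test_addition(self):
            self.assertEqual(1 + 1, 2)
    '''
)


def discover_and_run(start_dir):
    """Roughly what `python -m unittest discover -s <start_dir>` does."""
    suite = unittest.TestLoader().discover(start_dir=str(start_dir))
    runner = unittest.TextTestRunner(verbosity=0)
    return runner.run(suite)


def run_example():
    """Write the example test into a throwaway tests/ directory and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        tests_dir = Path(tmp) / "tests"
        tests_dir.mkdir()
        (tests_dir / "test_example.py").write_text(EXAMPLE_TEST, encoding="utf-8")
        return discover_and_run(tests_dir)
```

The returned `unittest.TestResult` exposes `testsRun` and `wasSuccessful()`, which is the same information the CI step turns into a pass or fail status.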
5 changes: 2 additions & 3 deletions .gitignore
@@ -162,6 +162,5 @@ cython_debug/
 # DS_Store
 .DS_Store

-# OER Export JSON and CSV files created
-ocw_api_data.json
-ocw_oer_export.csv
+# Private directory containing output directory
+private/
41 changes: 41 additions & 0 deletions Dockerfile
@@ -0,0 +1,41 @@
FROM python:3.9

WORKDIR /tmp

# Add, and run as, non-root user.
RUN mkdir /src
RUN adduser --disabled-password --gecos "" --shell /bin/bash mitodl

# Install Python packages
## Set some poetry config
ENV \
    POETRY_VERSION=1.5.1 \
    POETRY_CACHE_DIR='/tmp/cache/poetry' \
    POETRY_HOME='/home/mitodl/.local' \
    VIRTUAL_ENV="/opt/venv"
ENV PATH="$VIRTUAL_ENV/bin:$POETRY_HOME/bin:$PATH"

COPY pyproject.toml /src/
RUN chown -R mitodl:mitodl /src
RUN mkdir ${VIRTUAL_ENV} && chown -R mitodl:mitodl ${VIRTUAL_ENV}

## Install poetry itself, and pre-create a venv with predictable name
USER mitodl
RUN curl -sSL https://install.python-poetry.org \
    | \
    POETRY_VERSION=${POETRY_VERSION} \
    POETRY_HOME=${POETRY_HOME} \
    python3 -q
WORKDIR /src
RUN python3 -m venv $VIRTUAL_ENV
RUN poetry install

# Add project
USER root
COPY . /src
WORKDIR /src
RUN chown -R mitodl:mitodl /src

RUN apt-get clean && apt-get purge

USER mitodl
48 changes: 13 additions & 35 deletions README.md
@@ -12,59 +12,37 @@ This demonstration project showcases how to utilize the MIT Open API. It specifi

 ## Initial Setup & Usage

-The _ocw_oer_export_ package is available [on PyPI](link). To install:
+1. Build the container:

 ```
-pip install ocw_oer_export
+docker compose build
 ```

-### Usage as a Python Package
-
-To use `ocw_oer_export` in your Python code:
-
-```
-from ocw_oer_export import create_csv
-create_csv()
-```
-
-By default, the `create_csv` function uses `source="api"` and `output_file="ocw_oer_export.csv"`. The `source` parameter can be altered to `source="json"` if a local JSON file of courses' metadata is available. To generate the JSON file:
+2. Start the container:

 ```
-from ocw_oer_export import create_json
-create_json()
+docker compose run --rm app
 ```

-Then, create the CSV from the JSON file:
+To generate a JSON file containing complete API data:

 ```
-create_csv(source="json")
-```
-
-### CLI Usage
-
-`ocw_oer_export` also provides a Command Line Interface (CLI). After installation, you can use the following commands:
-
-To create the CSV file:
-
-```
-ocw-oer-export --create_csv
-```
-
-To generate a JSON file:
-
-```
-ocw-oer-export --create_json
+docker compose run --rm app --create_json
 ```

 To create a CSV file from the local JSON file:

 ```
-ocw-oer-export --create_csv --source=json
+docker compose run --rm app --create_csv --source=json
 ```

 ## File Output Directory

-When using either the Python package or the CLI, the output files (CSV or JSON) are saved in the current working directory from which it is executed.
+The output files, whether in CSV or JSON format, are stored within the `private/output` directory relative to the current working directory from which the command is executed.
+
+Therefore, the above commands will generate `private/output/ocw_oer_export.csv` or `private/output/ocw_api_data.json` in the current working directory.
+
+If you want to change this, you will not only have to change the `output_path` in the function (`create_csv` or `create_json`) but also have to change the mapping in `docker-compose.yml`.

 ## Requirements

@@ -79,7 +57,7 @@ Additionally, the `mapping_files` should be up-to-date. If new topics are added
 To run unit tests:

 ```
-python -m unittest discover
+docker run --rm ocw_oer_export python -m unittest discover -s tests
 ```

 ## Committing & Formatting
11 changes: 11 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,11 @@
version: '3.8'

services:
  app:
    build: .
    image: ocw_oer_export
    entrypoint: ["python", "-m", "ocw_oer_export.cli"]
    command: ["--create_csv"]
    tty: true
    volumes:
      - ./private/output:/private/output
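
The `volumes` entry binds the host directory `./private/output` to `/private/output` inside the container, which is why files written to the container paths used as defaults in `create_csv` and `create_json` show up under `private/output` on the host. A small sketch of that translation follows; the helper `host_path_for` is hypothetical and only illustrates the mapping, it is not part of the package.

```python
from pathlib import PurePosixPath


def host_path_for(container_path, volume="./private/output:/private/output"):
    """Translate a path inside the container to the matching host path.

    `volume` uses the short docker-compose syntax HOST:CONTAINER[:MODE].
    Raises ValueError if `container_path` is not under the mounted directory.
    """
    parts = volume.split(":")
    host_dir, container_dir = parts[0], parts[1]
    relative = PurePosixPath(container_path).relative_to(container_dir)
    return str(PurePosixPath(host_dir) / relative)
```

A path outside the mounted directory has no host counterpart, so anything written there stays inside the container and is discarded when `docker compose run --rm` removes it. This is why changing `output_path` also requires changing the mapping.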
22 changes: 12 additions & 10 deletions ocw_oer_export/create_csv.py
@@ -129,8 +129,8 @@ def transform_single_course(course, ocw_topics_mapping):
         "CR_PROVIDER_SET": "MIT OpenCourseWare",
         "CR_COU_URL": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
         "CR_COU_COPYRIGHT_HOLDER": "MIT",
-        "CR_EDUCATIONAL_USE": get_cr_educational_use(course["resource_content_tags"]),
-        "CR_ACCESSIBILITY": get_cr_accessibility(course["resource_content_tags"]),
+        "CR_EDUCATIONAL_USE": get_cr_educational_use(course["course_feature"]),
+        "CR_ACCESSIBILITY": get_cr_accessibility(course["course_feature"]),
     }


@@ -146,9 +146,14 @@ def transform_data(data, ocw_topics_mapping):


 def create_csv(
-    source="api", input_file="ocw_api_data.json", output_file="ocw_oer_export.csv"
+    source="api",
+    input_file="/private/output/ocw_api_data.json",
+    output_path="/private/output/ocw_oer_export.csv",
 ):
-    """Create a CSV file from either the MIT OpenCourseWare API or a locally stored JSON file."""
+    """
+    Create a CSV file from either the MIT OpenCourseWare API or a locally stored JSON file.
+    output_path: The output path inside the docker container.
+    """
     api_data_json = {}

     if source == "api":
@@ -182,14 +187,11 @@ def create_csv(
         "CR_EDUCATIONAL_USE",
         "CR_ACCESSIBILITY",
     ]
-    with open(output_file, "w", newline="", encoding="utf-8") as csv_file:
+    with open(output_path, "w", newline="", encoding="utf-8") as csv_file:
         writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
         writer.writeheader()
         writer.writerows(transformed_data)
-
-    current_dir = os.getcwd()
     logging.info(
-        "CSV file '%s' successfully created at directory: %s",
-        output_file,
-        current_dir,
+        "CSV file '%s' successfully created.",
+        output_path,
     )
11 changes: 7 additions & 4 deletions ocw_oer_export/create_json.py
@@ -11,12 +11,15 @@
 logger = logging.getLogger(__name__)


-def create_json(output_file="ocw_api_data.json"):
-    """Fetches data from MIT OpenCourseWare API and writes it to a JSON file."""
+def create_json(output_path="/private/output/ocw_api_data.json"):
+    """
+    Fetches data from MIT OpenCourseWare API and writes it to a JSON file.
+    output_path: The output path inside the docker container.
+    """
     api_data = extract_data_from_api(api_url=API_URL)
     try:
-        with open(output_file, "w", encoding="utf-8") as json_file:
+        with open(output_path, "w", encoding="utf-8") as json_file:
             json.dump(api_data, json_file, ensure_ascii=False, indent=4)
-        logger.info("Data saved to %s at present directory.", output_file)
+        logger.info("JSON file '%s' successfully created.", output_path)
     except IOError as e:
         logger.error("Error saving data to JSON: %s", e)
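
With the new absolute defaults, `open(output_path, "w")` raises `FileNotFoundError` when `/private/output` does not exist, for example when the functions are called outside the container. In `create_json` that error would be caught and logged by the `except IOError` branch, since `IOError` is an alias of `OSError` in Python 3. One possible way to make the write more forgiving is sketched here; `write_json` is a hypothetical helper, not something this commit adds.

```python
import json
from pathlib import Path


def write_json(data, output_path):
    """Write `data` as JSON, creating the parent directory if it is missing.

    Hypothetical helper for illustration; it mirrors the json.dump call
    used in create_json but is not part of the package.
    """
    path = Path(output_path)
    # Create e.g. private/output/ on first use instead of failing.
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)
    return path
```

Inside the container this makes no difference, because the bind mount already provides `/private/output`; the sketch only matters for local runs and tests.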