Dockerization, Poetry, and GitHub Actions #8

Merged 24 commits on Jan 22, 2024
33 changes: 33 additions & 0 deletions .github/workflows/python-unittest.yml
@@ -0,0 +1,33 @@
---
name: Backend Unit Tests
on:
  push:
    branches:
      - main
  pull_request:
jobs:
  python-unittests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          version: 1.5.1
          virtualenvs-create: true
          virtualenvs-in-project: true

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.9"
          cache: "poetry"

      - name: Set up Poetry Path
        run: echo "$HOME/.local/bin" >> $GITHUB_PATH

      - name: Install dependencies
        run: poetry install --no-interaction

      - name: Run unittests
        run: poetry run python -m unittest discover -s tests
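The final workflow step relies on `unittest` discovery in the `tests` directory. As a rough illustration of what discovery collects, a test module could look like the sketch below. Both the file name (for example `tests/test_example.py`) and the helper `normalize_title` are hypothetical and do not come from this PR.

```python
import unittest


def normalize_title(title):
    """Hypothetical helper, defined here only so the tests have something to check."""
    return " ".join(title.split())


class NormalizeTitleTests(unittest.TestCase):
    """Discovery picks up TestCase subclasses in files matching test*.py."""

    def test_collapses_whitespace(self):
        self.assertEqual(normalize_title("  Linear   Algebra "), "Linear Algebra")

    def test_empty_string(self):
        self.assertEqual(normalize_title(""), "")
```

With a file like this in place, `python -m unittest discover -s tests` would run both test methods, provided the `tests` directory is importable.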
5 changes: 2 additions & 3 deletions .gitignore
@@ -162,6 +162,5 @@ cython_debug/
 # DS_Store
 .DS_Store

-# OER Export JSON and CSV files created
-ocw_api_data.json
-ocw_oer_export.csv
+# Private directory containing output directory
+private/
41 changes: 41 additions & 0 deletions Dockerfile
@@ -0,0 +1,41 @@
FROM python:3.9

WORKDIR /tmp

# Add, and run as, non-root user.
RUN mkdir /src
RUN adduser --disabled-password --gecos "" --shell /bin/bash mitodl

# Install Python packages
## Set some poetry config
ENV \
POETRY_VERSION=1.5.1 \
POETRY_CACHE_DIR='/tmp/cache/poetry' \
POETRY_HOME='/home/mitodl/.local' \
VIRTUAL_ENV="/opt/venv"
ENV PATH="$VIRTUAL_ENV/bin:$POETRY_HOME/bin:$PATH"

COPY pyproject.toml /src/
RUN chown -R mitodl:mitodl /src
RUN mkdir ${VIRTUAL_ENV} && chown -R mitodl:mitodl ${VIRTUAL_ENV}

## Install poetry itself, and pre-create a venv with predictable name
USER mitodl
RUN curl -sSL https://install.python-poetry.org \
| \
POETRY_VERSION=${POETRY_VERSION} \
POETRY_HOME=${POETRY_HOME} \
python3 -q
WORKDIR /src
RUN python3 -m venv $VIRTUAL_ENV
RUN poetry install

# Add project
USER root
COPY . /src
WORKDIR /src
RUN chown -R mitodl:mitodl /src

RUN apt-get clean && apt-get purge

USER mitodl
48 changes: 13 additions & 35 deletions README.md
@@ -12,59 +12,37 @@ This demonstration project showcases how to utilize the MIT Open API. It specifi

 ## Initial Setup & Usage

-The _ocw_oer_export_ package is available [on PyPI](link). To install:
+1. Build the container:

 ```
-pip install ocw_oer_export
+docker compose build
 ```

-### Usage as a Python Package
-
-To use `ocw_oer_export` in your Python code:
-
-```
-from ocw_oer_export import create_csv
-create_csv()
-```
-
-By default, the `create_csv` function uses `source="api"` and `output_file="ocw_oer_export.csv"`. The `source` parameter can be altered to `source="json"` if a local JSON file of courses' metadata is available. To generate the JSON file:
+2. Start the container:

 ```
-from ocw_oer_export import create_json
-create_json()
+docker compose run --rm app
 ```

-Then, create the CSV from the JSON file:
+To generate a JSON file containing complete API data:

 ```
-create_csv(source="json")
-```
-
-### CLI Usage
-
-`ocw_oer_export` also provides a Command Line Interface (CLI). After installation, you can use the following commands:
-
-To create the CSV file:
-
-```
-ocw-oer-export --create_csv
-```
-
-To generate a JSON file:
-
-```
-ocw-oer-export --create_json
+docker compose run --rm app --create_json
 ```

 To create a CSV file from the local JSON file:

 ```
-ocw-oer-export --create_csv --source=json
+docker compose run --rm app --create_csv --source=json
 ```

 ## File Output Directory

-When using either the Python package or the CLI, the output files (CSV or JSON) are saved in the current working directory from which it is executed.
+The output files, whether in CSV or JSON format, are stored within the `private/output` directory relative to the current working directory from which the command is executed.
+
+Therefore, the above commands will generate `private/output/ocw_oer_export.csv` or `private/output/ocw_api_data.json` in the current working directory.
+
+If you want to change this, you will not only have to change the `output_path` in the function (`create_csv` or `create_json`) but also have to change the mapping in `docker-compose.yml`.

 ## Requirements

@@ -79,7 +57,7 @@ Additionally, the `mapping_files` should be up-to-date. If new topics are added
 To run unit tests:

 ```
-python -m unittest discover
+docker run --rm ocw_oer_export python -m unittest discover -s tests
 ```

 ## Committing & Formatting
11 changes: 11 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,11 @@
version: '3.8'

services:
  app:
    build: .
    image: ocw_oer_export
    entrypoint: ["python", "-m", "ocw_oer_export.cli"]
    command: ["--create_csv"]
    tty: true
    volumes:
      - ./private/output:/private/output
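The `volumes` entry is what makes files written to `/private/output` inside the container appear under `./private/output` on the host. The sketch below illustrates that translation only; `host_path_for` is a hypothetical helper that is not part of this project, and it assumes a single bind mount in the short `host:container` form with no mode suffix such as `:ro`.

```python
from pathlib import PurePosixPath


def host_path_for(container_path, volume_spec):
    """Translate a path inside the container to its host-side location.

    volume_spec is a short-form bind mount such as
    "./private/output:/private/output" (host part, then container part).
    """
    host_dir, container_dir = volume_spec.split(":", 1)
    # relative_to raises ValueError if the path is outside the mounted directory.
    relative = PurePosixPath(container_path).relative_to(container_dir)
    return PurePosixPath(host_dir) / relative


csv_on_host = host_path_for(
    "/private/output/ocw_oer_export.csv",
    "./private/output:/private/output",
)
print(csv_on_host)
```

This is also why changing `output_path` in the code without updating the mapping would leave the output inside the container only.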
22 changes: 12 additions & 10 deletions ocw_oer_export/create_csv.py
@@ -129,8 +129,8 @@ def transform_single_course(course, ocw_topics_mapping):
         "CR_PROVIDER_SET": "MIT OpenCourseWare",
         "CR_COU_URL": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
         "CR_COU_COPYRIGHT_HOLDER": "MIT",
-        "CR_EDUCATIONAL_USE": get_cr_educational_use(course["resource_content_tags"]),
-        "CR_ACCESSIBILITY": get_cr_accessibility(course["resource_content_tags"]),
+        "CR_EDUCATIONAL_USE": get_cr_educational_use(course["course_feature"]),
+        "CR_ACCESSIBILITY": get_cr_accessibility(course["course_feature"]),
     }


@@ -146,9 +146,14 @@ def transform_data(data, ocw_topics_mapping):


 def create_csv(
-    source="api", input_file="ocw_api_data.json", output_file="ocw_oer_export.csv"
+    source="api",
+    input_file="/private/output/ocw_api_data.json",
+    output_path="/private/output/ocw_oer_export.csv",
 ):
-    """Create a CSV file from either the MIT OpenCourseWare API or a locally stored JSON file."""
+    """
+    Create a CSV file from either the MIT OpenCourseWare API or a locally stored JSON file.
+    output_path: The output path inside the docker container.
+    """
     api_data_json = {}

     if source == "api":
@@ -182,14 +187,11 @@ def create_csv(
         "CR_EDUCATIONAL_USE",
         "CR_ACCESSIBILITY",
     ]
-    with open(output_file, "w", newline="", encoding="utf-8") as csv_file:
+    with open(output_path, "w", newline="", encoding="utf-8") as csv_file:
         writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
         writer.writeheader()
         writer.writerows(transformed_data)
-
-    current_dir = os.getcwd()
     logging.info(
-        "CSV file '%s' successfully created at directory: %s",
-        output_file,
-        current_dir,
+        "CSV file '%s' successfully created.",
+        output_path,
     )
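Opening a fixed `output_path` for writing raises `FileNotFoundError` when the target directory does not exist, for example if the function is called outside the container. As a minimal sketch of the same `csv.DictWriter` pattern that creates the parent directory first, the snippet below may be useful. `write_rows` and the sample row values are hypothetical and are not part of this PR.

```python
import csv
import tempfile
from pathlib import Path


def write_rows(output_path, fieldnames, rows):
    """Write dict rows to a CSV file, creating the parent directory if needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings, as in create_csv.
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Usage with made-up sample data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "private" / "output" / "sample.csv"
    sample_rows = [
        {"CR_EDUCATIONAL_USE": "sample use", "CR_ACCESSIBILITY": "sample value"}
    ]
    write_rows(target, ["CR_EDUCATIONAL_USE", "CR_ACCESSIBILITY"], sample_rows)
    with open(target, newline="", encoding="utf-8") as csv_file:
        rows_read = list(csv.DictReader(csv_file))
```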
11 changes: 7 additions & 4 deletions ocw_oer_export/create_json.py
@@ -11,12 +11,15 @@
 logger = logging.getLogger(__name__)


-def create_json(output_file="ocw_api_data.json"):
-    """Fetches data from MIT OpenCourseWare API and writes it to a JSON file."""
+def create_json(output_path="/private/output/ocw_api_data.json"):
+    """
+    Fetches data from MIT OpenCourseWare API and writes it to a JSON file.
+    output_path: The output path inside the docker container.
+    """
     api_data = extract_data_from_api(api_url=API_URL)
     try:
-        with open(output_file, "w", encoding="utf-8") as json_file:
+        with open(output_path, "w", encoding="utf-8") as json_file:
             json.dump(api_data, json_file, ensure_ascii=False, indent=4)
-        logger.info("Data saved to %s at present directory.", output_file)
+        logger.info("JSON file '%s' successfully created.", output_path)
     except IOError as e:
         logger.error("Error saving data to JSON: %s", e)
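In the hunk above, a failed write is logged rather than raised, so a caller cannot tell from the call itself whether the file exists afterwards. One possible variation, sketched here under the assumption that a boolean result is acceptable, reports success explicitly and reads the file back the way a JSON-sourced CSV export would need to. `save_json` and the sample data are hypothetical and are not part of this PR.

```python
import json
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)


def save_json(data, output_path):
    """Write data as UTF-8 JSON; return True on success, False on an I/O error."""
    try:
        with open(output_path, "w", encoding="utf-8") as json_file:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            json.dump(data, json_file, ensure_ascii=False, indent=4)
    except OSError as e:  # IOError is an alias of OSError in Python 3
        logger.error("Error saving data to JSON: %s", e)
        return False
    logger.info("JSON file '%s' successfully created.", output_path)
    return True


# Usage with made-up sample data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "sample.json"
    sample = [{"title": "Álgebra Lineal"}]
    written = save_json(sample, target)
    with open(target, encoding="utf-8") as json_file:
        loaded = json.load(json_file)
    # A missing parent directory is reported as False instead of raising.
    failed = save_json(sample, Path(tmp_dir) / "missing" / "sample.json")
```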