refacto: automate transform program metadata (#171)
polomarcus authored May 2, 2024
1 parent aca8f17 commit 771310e
Showing 6 changed files with 4,687 additions and 437 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -36,6 +36,7 @@ RUN pip install poetry
COPY quotaclimat ./quotaclimat
COPY postgres ./postgres
COPY alembic/ ./alembic
+COPY transform_program.py ./transform_program.py

# Docker compose overwrite this config to have only one Dockerfile
CMD ["ls"]
1 change: 1 addition & 0 deletions Dockerfile_api_import
@@ -34,6 +34,7 @@ COPY postgres ./postgres
COPY pyproject.toml pyproject.toml
COPY alembic/ ./alembic
COPY alembic.ini ./alembic.ini
+COPY transform_program.py ./transform_program.py

# healthcheck
EXPOSE 5050
8 changes: 4 additions & 4 deletions README.md
@@ -357,14 +357,14 @@ poetry run python3 quotaclimat/transform_excel_to_json.py > cc-bio.json
```

## Program Metadata table
-How to update `postgres/program_metadata.json` (Program Metadata table)
-After updating "quotaclimat/data_processing/mediatree/channel_program.json" you need to execute to update `postgres/program_metadata.json`
+After updating "quotaclimat/data_processing/mediatree/channel_program.json" you need to execute this command to update `postgres/program_metadata.json`
```
-poetry run python3 quotaclimat/transform_program.py
-# need to transform it to an array of json
+poetry run python3 transform_program.py
```
The SQL queries are based on this file that generate the Program Metadata table.
+
+With the docker-entrypoint.sh this command is done automatically, so for production uses, you will not have to run this command.

### Fix linting
Before committing, make sure that the line of codes you wrote are conform to PEP8 standard by running:
```bash
9 changes: 9 additions & 0 deletions docker-entrypoint.sh
@@ -4,5 +4,14 @@
echo "Running migrations with alembic if exists"
poetry run alembic upgrade head

+
+echo "update program metadata file"
+python transform_program.py
+if [[ $? -eq 0 ]]; then
+echo "Command succeeded"
+else
+echo "Command failed"
+fi
+
echo "starting mediatree import app"
python quotaclimat/data_processing/mediatree/api_import.py
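
Review note on the `docker-entrypoint.sh` hunk: as shown, the script reports whether `transform_program.py` failed but then goes on to start the import app either way (unless the first lines of the script, which this hunk does not show, enable `set -e`). If stopping on failure is preferred, one possible shape is sketched below. `run_step` is a hypothetical helper that does not exist in the repository, and aborting on failure is an assumption about the desired behavior, not something the commit does.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): run a command, report the
# outcome, and hand the command's exit status back to the caller.
run_step() {
    local label="$1"
    shift
    # Testing the command directly avoids a separate `$?` check.
    if "$@"; then
        echo "${label}: succeeded"
    else
        # In the else branch, $? still holds the failed command's status.
        local status=$?
        echo "${label}: failed (exit ${status})" >&2
        return "${status}"
    fi
}
```

In the entrypoint this could be called as `run_step "update program metadata file" python transform_program.py || exit 1`, which stops the container before the import starts. Dropping the `|| exit 1` keeps the commit's current behavior of logging the failure and continuing.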
1 comment on commit 771310e

@github-actions

Coverage

Coverage Report
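| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| postgres/insert_data.py | 44 | 7 | 84% | 36–38, 57–59, 64 |
| postgres/insert_existing_data_example.py | 19 | 3 | 84% | 25–27 |
| postgres/schemas/models.py | 134 | 8 | 94% | 113–120, 132–133, 198–199 |
| quotaclimat/data_ingestion/scrap_sitemap.py | 134 | 17 | 87% | 27–28, 33–34, 66–71, 95–97, 138–140, 202, 223–228 |
| quotaclimat/data_ingestion/ingest_db/ingest_sitemap_in_db.py | 55 | 37 | 33% | 21–42, 45–58, 62–73 |
| quotaclimat/data_ingestion/scrap_html/scrap_description_article.py | 36 | 3 | 92% | 19–20, 32 |
| quotaclimat/data_processing/mediatree/api_import.py | 184 | 110 | 40% | 42–46, 51–63, 67–70, 76, 79–112, 118–133, 137–138, 151–163, 167–173, 186–197, 200–204, 210, 237–238, 242, 246–265, 268–270 |
| quotaclimat/data_processing/mediatree/channel_program.py | 91 | 9 | 90% | 21–23, 34–36, 50, 86, 95 |
| quotaclimat/data_processing/mediatree/config.py | 15 | 2 | 87% | 7, 16 |
| quotaclimat/data_processing/mediatree/detect_keywords.py | 180 | 4 | 98% | 178, 230–232 |
| quotaclimat/data_processing/mediatree/update_pg_keywords.py | 44 | 30 | 32% | 14–84, 105–106, 127–152, 158 |
| quotaclimat/data_processing/mediatree/utils.py | 66 | 23 | 65% | 18, 29–53, 56, 65, 81–82 |
| quotaclimat/utils/healthcheck_config.py | 29 | 14 | 52% | 22–24, 27–38 |
| quotaclimat/utils/logger.py | 24 | 11 | 54% | 22–24, 28–37 |
| quotaclimat/utils/sentry.py | 10 | 2 | 80% | 21–22 |
| TOTAL | 1091 | 280 | 74% | |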

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 79 | 0 💤 | 0 ❌ | 0 🔥 | 57.528s ⏱️ |
