Commit

chores: update dep + add sentry (#107)
polomarcus authored Feb 21, 2024
1 parent ba1dea9 commit 0db52c5
Showing 10 changed files with 617 additions and 539 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/deploy-main.yml
@@ -13,7 +13,7 @@ on:

env:
PYTHON_VERSION: '3.11'
-POETRY_VERSION: '1.6.1'
+POETRY_VERSION: '1.7.1'

jobs:
build:
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -5,7 +5,7 @@ on:

env:
PYTHON_VERSION: '3.11'
-POETRY_VERSION: '1.6.1'
+POETRY_VERSION: '1.7.1'

jobs:
# Label of the runner job
2 changes: 1 addition & 1 deletion Dockerfile
@@ -12,7 +12,7 @@ WORKDIR /app

COPY pyproject.toml poetry.lock ./

-RUN pip install poetry==1.6.1
+RUN pip install poetry==1.7.1

RUN poetry install

2 changes: 1 addition & 1 deletion Dockerfile_api_import
@@ -12,7 +12,7 @@ WORKDIR /app

COPY pyproject.toml poetry.lock ./

-RUN pip install poetry==1.6.1
+RUN pip install poetry==1.7.1

RUN poetry install

2 changes: 1 addition & 1 deletion Dockerfile_ingest
@@ -12,7 +12,7 @@ WORKDIR /app

COPY pyproject.toml poetry.lock ./

-RUN pip install poetry==1.6.1
+RUN pip install poetry==1.7.1

RUN poetry install

2 changes: 1 addition & 1 deletion Dockerfile_streamlit
@@ -12,7 +12,7 @@ WORKDIR /app

COPY pyproject.toml poetry.lock ./

-RUN pip install poetry==1.6.1
+RUN pip install poetry==1.7.1

RUN poetry install

9 changes: 9 additions & 0 deletions README.md
@@ -180,6 +180,10 @@ When you need to install a new dependency (use a new package, e.g. nltk), run
```bash
poetry add nltk
```
Update Poetry itself (to update the project's dependencies, run `poetry update` instead)
```
poetry self update
```

After committing to the repo, other team members will be able to use the exact same environment you are using.

Expand Down Expand Up @@ -264,6 +268,11 @@ Contact QuotaClimat team to 2 files with the API's username and password inside
docker compose up mediatree
```

## Monitoring
Error and performance monitoring uses Sentry. Set the `SENTRY_DSN` environment variable to enable it.

Learn more: https://docs.sentry.io/platforms/python/configuration/options/
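
As a minimal sketch of how this configuration could be made explicit, the snippet below builds the keyword arguments for `sentry_sdk.init` from the environment and skips initialisation when `SENTRY_DSN` is unset. The helper names `sentry_options` and `init_sentry` and the `SENTRY_TRACES_RATE` variable are hypothetical and not part of the project; the default rate of 0.7 mirrors the value passed in `api_import.py`. Passing `dsn` explicitly is optional, since the SDK also reads `SENTRY_DSN` on its own when no `dsn` argument is given.

```python
import os
from typing import Mapping, Optional


def sentry_options(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Build keyword arguments for sentry_sdk.init, or None when no DSN is set."""
    dsn = env.get("SENTRY_DSN", "").strip()
    if not dsn:
        return None  # monitoring stays off, e.g. on a developer machine
    # SENTRY_TRACES_RATE is a hypothetical variable name used for illustration only.
    rate = float(env.get("SENTRY_TRACES_RATE", "0.7"))
    rate = min(max(rate, 0.0), 1.0)  # sample rates must lie in [0, 1]
    return {"dsn": dsn, "traces_sample_rate": rate, "profiles_sample_rate": rate}


def init_sentry() -> bool:
    """Initialise Sentry if SENTRY_DSN is set; return whether it was enabled."""
    options = sentry_options()
    if options is None:
        return False
    import sentry_sdk  # imported lazily so this module loads without the SDK

    sentry_sdk.init(**options)
    return True
```

Keeping the option building separate from the `init` call makes the sampling logic testable without a network connection or a real DSN.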

### Batch import based on time
Set the `START_DATE` environment variable, as in the docker compose file, to a Unix timestamp in seconds (for example 1705409797).
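
To illustrate the epoch-seconds format, here is a small sketch that converts `START_DATE` into a timezone-aware datetime. The function name `parse_start_date` is hypothetical, and returning `None` when the variable is unset is an assumption; how the project itself handles a missing `START_DATE` is not shown in this commit.

```python
import os
from datetime import datetime, timezone
from typing import Mapping, Optional


def parse_start_date(env: Mapping[str, str] = os.environ) -> Optional[datetime]:
    """Return START_DATE as a timezone-aware UTC datetime, or None if unset."""
    raw = env.get("START_DATE", "").strip()
    if not raw:
        return None  # assumption: the caller falls back to its own default window
    # int() raises ValueError on non-numeric input, which surfaces bad config early.
    return datetime.fromtimestamp(int(raw), tz=timezone.utc)


print(parse_start_date({"START_DATE": "1705409797"}))
# 2024-01-16 12:56:37+00:00
```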

1,113 changes: 584 additions & 529 deletions poetry.lock

Large diffs are not rendered by default.

8 changes: 5 additions & 3 deletions pyproject.toml
@@ -12,8 +12,8 @@ priority = "primary"

[tool.poetry.dependencies]
python = ">=3.11.0,<3.13.0"
-pandas = "^1.5.3"
-advertools = "^0.13.2"
+pandas = "^2.2.0"
+advertools = "^0.14.1"
xmltodict = "^0.13.0"
sqlalchemy = "^2.0.21"
psycopg2-binary = "^2.9.5"
@@ -33,9 +33,11 @@ asyncio = "^3.4.3"
tomli = "^2.0.1"
pandera = "^0.17.2"
aiohttp = "^3.8.6"
-pytest-asyncio = "^0.21.1"
+pytest-asyncio = "^0.23.5"
swifter = "^1.4.0"
tenacity = "^8.2.3"
+sentry-sdk = "^1.40.5"
+coverage = "^7.4.2"

[build-system]
requires = ["poetry-core>=1.1"]
14 changes: 13 additions & 1 deletion quotaclimat/data_processing/mediatree/api_import.py
@@ -21,6 +21,18 @@
from quotaclimat.data_processing.mediatree.keyword.keyword import THEME_KEYWORDS
from typing import List, Optional
from tenacity import *
+import sentry_sdk
+
+# read SENTRY_DSN from env
+sentry_sdk.init(
+# Set traces_sample_rate to 1.0 to capture 100%
+# of transactions for performance monitoring.
+traces_sample_rate=0.7,
+# Set profiles_sample_rate to 1.0 to profile 100%
+# of sampled transactions.
+# We recommend adjusting this value in production.
+profiles_sample_rate=0.7,
+)

#read whole file to a string
password = get_password()
@@ -204,7 +216,7 @@ def parse_reponse_subtitle(response_sub, channel = None) -> Optional[pd.DataFram
logging.info(f"{total_results} 'total_results' field")

new_df = json_normalize(response_sub.get('data'))
-logging.info("Schema from API before formatting :\n%s", new_df.dtypes)
+logging.debug("Schema from API before formatting :\n%s", new_df.dtypes)
new_df.drop('channel.title', axis=1, inplace=True) # keep only channel.name

new_df['timestamp'] = (pd.to_datetime(new_df['start'], unit='s').dt.tz_localize('utc').dt.tz_convert('Europe/Paris'))

1 comment on commit 0db52c5

@github-actions

Coverage

Coverage Report
| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| **postgres** | | | | |
| insert_data.py | 46 | 7 | 85% | 38–40, 59–61, 66 |
| insert_existing_data_example.py | 20 | 3 | 85% | 25–27 |
| **postgres/schemas** | | | | |
| models.py | 71 | 15 | 79% | 74–81, 91–92, 101–111 |
| **quotaclimat/data_analytics** | | | | |
| analytics_signataire_charte.py | 29 | 29 | 0% | 1–67 |
| bilan.py | 108 | 108 | 0% | 2–372 |
| data_coverage.py | 34 | 34 | 0% | 1–94 |
| exploration.py | 125 | 125 | 0% | 1–440 |
| sitemap_analytics.py | 118 | 118 | 0% | 1–343 |
| **quotaclimat/data_ingestion** | | | | |
| categorization_program_type.py | 1 | 1 | 0% | 1 |
| config_youtube.py | 1 | 1 | 0% | 1 |
| scaleway_db_backups.py | 34 | 34 | 0% | 1–74 |
| scrap_chartejournalismeecologie_signataires.py | 50 | 50 | 0% | 1–169 |
| scrap_sitemap.py | 134 | 17 | 87% | 27–28, 33–34, 66–71, 95–97, 138–140, 202, 223–228 |
| scrap_tv_program.py | 62 | 62 | 0% | 1–149 |
| scrap_youtube.py | 114 | 114 | 0% | 1–238 |
| **quotaclimat/data_ingestion/ingest_db** | | | | |
| ingest_sitemap_in_db.py | 54 | 40 | 26% | 18–39, 42–61, 65–76 |
| **quotaclimat/data_ingestion/scrap_html** | | | | |
| scrap_description_article.py | 36 | 3 | 92% | 19–20, 32 |
| **quotaclimat/data_processing/mediatree** | | | | |
| api_import.py | 181 | 106 | 41% | 44–48, 53–56, 60–63, 69, 72–97, 103–118, 123–125, 150–157, 161–164, 168–174, 185–196, 199–203, 209, 234–235, 241, 243, 246–272, 276–287 |
| config.py | 15 | 2 | 87% | 7, 16 |
| detect_keywords.py | 88 | 6 | 93% | 101–108 |
| utils.py | 64 | 21 | 67% | 27–51, 54, 73–74 |
| **quotaclimat/data_processing/sitemap** | | | | |
| sitemap_processing.py | 41 | 27 | 34% | 15–19, 23–25, 29–47, 51–58, 66–96, 101–103 |
| **quotaclimat/utils** | | | | |
| channels.py | 6 | 6 | 0% | 1–95 |
| climate_keywords.py | 2 | 2 | 0% | 3–35 |
| healthcheck_config.py | 29 | 14 | 52% | 22–24, 27–38 |
| logger.py | 14 | 3 | 79% | 22–24 |
| plotly_theme.py | 17 | 17 | 0% | 1–56 |
| **TOTAL** | 1538 | 965 | 37% | |

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 39 | 0 💤 | 0 ❌ | 0 🔥 | 10.890s ⏱️ |
