Merge sitemap scaleway from polomarcus/barometre #83

Merged 89 commits on Nov 24, 2023

Commits
dbee944
docker: scrapper app
polomarcus Oct 3, 2023
e77da36
fix(docker): scrapper app name logs
polomarcus Oct 3, 2023
f996ac0
docker: force container name
polomarcus Oct 3, 2023
47a8523
wip: docker and config
polomarcus Oct 4, 2023
ad666b0
wip: docker
polomarcus Oct 9, 2023
5268fed
chores: update poetry lock
polomarcus Oct 10, 2023
e08c6e7
fix: docker start
polomarcus Oct 10, 2023
5f2e593
fix: docker streamlit
polomarcus Oct 10, 2023
c282e67
fix(streamlit): docker start
polomarcus Oct 10, 2023
a50d422
chores: poetry lock
polomarcus Oct 10, 2023
6e3b66e
chores: python 3.10 to 3.11
polomarcus Oct 10, 2023
96d881a
remove: scrap_sitemap
polomarcus Oct 10, 2023
efa6784
Merge pull request #1 from polomarcus/docker
polomarcus Oct 10, 2023
d04c242
wip: add SQLAlchemy
polomarcus Oct 10, 2023
7755a26
wip(refacto): SQLAchemy + test
polomarcus Oct 11, 2023
62db13c
fix: ci
polomarcus Oct 11, 2023
cad584d
fix: ci install poetry
polomarcus Oct 12, 2023
2949493
test(wip): save using pandas.to_sql
polomarcus Oct 12, 2023
5a30d99
ci: --dev-dependency
polomarcus Oct 12, 2023
862b9d9
chores: pytest update
polomarcus Oct 12, 2023
e2605a0
clean: notebooks should be in another repo
polomarcus Oct 12, 2023
3648ef7
lint
polomarcus Oct 12, 2023
e8fe510
ci: test job
polomarcus Oct 12, 2023
d2db6f0
ci: need docker to run test
polomarcus Oct 12, 2023
e4899bb
ci: directly use docker
polomarcus Oct 12, 2023
913e522
docker: wait for PG to be ready
polomarcus Oct 12, 2023
a501277
ci: docker exit code from container
polomarcus Oct 12, 2023
5e30fba
ci(test): use github action services to run PG
polomarcus Oct 12, 2023
a1bb342
ci(test): use poetry first
polomarcus Oct 12, 2023
e94be43
ci(test): use github action services to run PG
polomarcus Oct 12, 2023
80b1545
ci: searching for postgres host on CI
polomarcus Oct 12, 2023
30c84e1
ci: searching for postgres host on CI
polomarcus Oct 12, 2023
f19bc72
ci: log connection
polomarcus Oct 12, 2023
f587087
ci: yet
polomarcus Oct 12, 2023
9e61f4b
ci: fix test
polomarcus Oct 12, 2023
0f42f5f
ci: lint yaml
polomarcus Oct 12, 2023
c6b0135
Merge pull request #2 from polomarcus/feat/sqlachemy
polomarcus Oct 12, 2023
5ddc729
wip(refacto): scrapping sitemap
polomarcus Oct 12, 2023
43c68a7
refacto: use ENV to dev to test locally
polomarcus Oct 16, 2023
1868f5c
test: find section in urls
polomarcus Oct 16, 2023
1c778ff
test: query_one_sitemap_and_transform
polomarcus Oct 16, 2023
fca2dc9
fix(ci): env
polomarcus Oct 16, 2023
9d7a169
ci: add nginx
polomarcus Oct 16, 2023
df8e726
ci: nginx background
polomarcus Oct 16, 2023
7ed6082
refacto: remove unsued code
polomarcus Oct 16, 2023
77b1151
refacto: sitemap
polomarcus Oct 16, 2023
03e3057
fix: test local ci
polomarcus Oct 16, 2023
a900699
fix: test depending on env
polomarcus Oct 16, 2023
4d3da1c
fix: test depending on env
polomarcus Oct 16, 2023
9c81847
auto review
polomarcus Oct 16, 2023
9f6b57b
doc: autoreview
polomarcus Oct 16, 2023
1728a98
ci: desactive all jobs
polomarcus Oct 16, 2023
786d673
review: add error log
polomarcus Oct 17, 2023
b25b96b
feat: add url to save inside pg
polomarcus Oct 17, 2023
58fcd78
Merge pull request #3 from polomarcus/refacto/sitemap_parsing
polomarcus Oct 17, 2023
20eee8d
chores: remove some deps
polomarcus Oct 18, 2023
b80811e
cd: scaleway docker
polomarcus Oct 18, 2023
f28fce1
Merge pull request #4 from polomarcus/deployment/task
polomarcus Oct 18, 2023
c3837f7
poetry lock
polomarcus Oct 18, 2023
cf423c0
fix: wrong folder
polomarcus Oct 18, 2023
f6fa733
Merge pull request #5 from polomarcus/chores/dependencies
polomarcus Oct 19, 2023
6b4e49f
Docker Scaleway (#6)
polomarcus Oct 20, 2023
31e57e7
feat: add healthcheck for scaleways deployment (#7)
polomarcus Oct 23, 2023
f846392
fix: cd
polomarcus Oct 23, 2023
b01b120
[no ci]: bumping version
Oct 23, 2023
dc77dcb
Feat/add medias (#8)
polomarcus Oct 25, 2023
4694dad
[no ci]: bumping version
Oct 25, 2023
1ae77b7
refacto: change PK due to publication date changing over time (#9)
polomarcus Oct 25, 2023
a5fa297
[no ci]: bumping version
Oct 25, 2023
f7c9db2
fix(streamlit): use sqlachemy (#10)
polomarcus Oct 25, 2023
f0aa6af
[no ci]: bumping version
Oct 25, 2023
ed429a6
medias: add francebleu, nouvelobs, mediapart
polomarcus Oct 26, 2023
b2efef2
[no ci]: bumping version
Oct 26, 2023
32a7b8b
Feat: parse description meta tag for every news (#11)
polomarcus Oct 26, 2023
ad7b97f
[no ci]: bumping version
Oct 26, 2023
9c3c8e3
feat: only save not known sitemap (PG already saved id) and sitemap l…
polomarcus Oct 27, 2023
a1f9e11
[no ci]: bumping version
Oct 27, 2023
469a7dc
fix: corsematin
polomarcus Oct 27, 2023
c2828af
[no ci]: bumping version
Oct 27, 2023
e985894
refacto: healthcheck port renamed to PORT
polomarcus Oct 30, 2023
1f9a7f4
fix(description): log 20minutes missing hat
polomarcus Oct 30, 2023
b4d01b8
[no ci]: bumping version
Oct 30, 2023
8d070cb
log: log level + colors
polomarcus Oct 30, 2023
f87677e
[no ci]: bumping version
Oct 30, 2023
d0b0f3d
refacto: docker compose / CD push steps (#13)
polomarcus Oct 30, 2023
5faa5dd
[no ci]: bumping version
Oct 30, 2023
d296c1b
add medias: charentelibre, courrierpicard, ... (#14)
polomarcus Oct 30, 2023
b857083
[no ci]: bumping version
Oct 30, 2023
fc9188e
Merge branch 'main' into merge-sitemap-scaleway
estellerambier Nov 24, 2023
8 changes: 8 additions & 0 deletions .dockerignore
@@ -0,0 +1,8 @@
pgdata
.git
.venv
venv
.vscode
notebooks
LICENSE
.idea
26 changes: 0 additions & 26 deletions .github/workflows/check_integration.yml

This file was deleted.

35 changes: 0 additions & 35 deletions .github/workflows/db_backup_on_scaleway.yml

This file was deleted.

56 changes: 56 additions & 0 deletions .github/workflows/deploy-main.yml
@@ -0,0 +1,56 @@
name: Build & Deploy to Scaleway

on:
  push:
    # Sequence of patterns matched against refs/heads
    branches:
      - main


  # to be able to force deploy
  workflow_dispatch:


env:
  PYTHON_VERSION: '3.11'
  POETRY_VERSION: '1.6.1'

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ env.PYTHON_VERSION }}
      - uses: actions/checkout@v4
      - name: Login to Scaleway Container Registry
        uses: docker/login-action@v3
        with:
          username: nologin
          password: ${{ secrets.SCALEWAY_API_KEY }}
          registry: ${{ secrets.CONTAINER_REGISTRY_ENDPOINT }}
      - name: Build ingest_to_db image
        run: docker build -f Dockerfile_ingest . -t ${{ secrets.CONTAINER_REGISTRY_ENDPOINT }}/ingest_to_db
      - name: Push ingest_to_db Image
        run: docker push ${{ secrets.CONTAINER_REGISTRY_ENDPOINT }}/ingest_to_db
      - name: Build streamlit image
        run: docker build -f Dockerfile_streamlit . -t ${{ secrets.CONTAINER_REGISTRY_ENDPOINT }}/streamlit
      - name: Push streamlit Image
        run: docker push ${{ secrets.CONTAINER_REGISTRY_ENDPOINT }}/streamlit
      - name: Install Poetry
        uses: snok/install-poetry@v1
        with:
          version: ${{ env.POETRY_VERSION }}
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true
      - name: Poetry install & bump version
        run: |
          poetry install --only dev
          poetry version patch
          git config user.name barometre-github-actions
          git config user.email [email protected]
          git add pyproject.toml
          git commit -m "[no ci]: bumping version"
          git push origin main
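
Reviewer note: the last step bumps the patch version and pushes a commit whose message starts with `[no ci]`. The sketch below is only an illustration of those two ideas, not the project's code. `bump_patch` and `skips_ci` are hypothetical helper names; `bump_patch` handles plain MAJOR.MINOR.PATCH strings only (it does not reproduce everything `poetry version patch` supports), and the marker list reflects the skip keywords GitHub documents for `push` and `pull_request` workflows.

```python
import re

# Commit-message markers that GitHub Actions documents for skipping
# push / pull_request workflow runs (assumed complete for this sketch).
SKIP_MARKERS = (
    "[skip ci]",
    "[ci skip]",
    "[no ci]",
    "[skip actions]",
    "[actions skip]",
)


def bump_patch(version: str) -> str:
    """Increment the PATCH part of a plain MAJOR.MINOR.PATCH version string."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"


def skips_ci(commit_message: str) -> bool:
    """Return True if the commit message contains one of the skip markers."""
    return any(marker in commit_message for marker in SKIP_MARKERS)
```

With these helpers, `skips_ci("[no ci]: bumping version")` is true, which is consistent with the bump commits in the list above not triggering another deploy on `main`.
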
20 changes: 20 additions & 0 deletions .github/workflows/docker-compose.yml
@@ -0,0 +1,20 @@
name: Docker Compose CI

on:
  workflow_dispatch: # https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: init and load data
        run: docker compose up -d
      - name: sleep
        run: sleep 60
      - name: log sitemap
        run: docker logs sitemap
      - name: log db ingestion
        run: docker logs ingest_to_db
      - name: log streamlit
        run: docker logs streamlit
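
Reviewer note: this workflow waits a fixed 60 seconds before reading the container logs. One possible alternative, in the spirit of the "docker: wait for PG to be ready" commit, is to poll until a service port accepts connections. The sketch below is an assumption-laden illustration, not code from this PR: `wait_for_port` is a hypothetical helper, and the host and port to poll depend on what the compose file actually publishes.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 5432)` would return as soon as PostgreSQL's default port accepts connections, provided the compose file maps that port to the runner (an assumption here). An open port only shows that the process is listening, not that ingestion has finished, so the log steps may still need some delay.
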
42 changes: 0 additions & 42 deletions .github/workflows/homepage_lemonde.yml

This file was deleted.

35 changes: 0 additions & 35 deletions .github/workflows/main.yml

This file was deleted.

42 changes: 0 additions & 42 deletions .github/workflows/scrap_sitemap.yml

This file was deleted.

35 changes: 0 additions & 35 deletions .github/workflows/scrap_sitemap_and_ingest_db.yml

This file was deleted.

42 changes: 0 additions & 42 deletions .github/workflows/scrap_tv_program.yml

This file was deleted.

42 changes: 0 additions & 42 deletions .github/workflows/scrap_youtube.yml

This file was deleted.
