WTWM Newsroom Mentions Detector

This repository contains the source code of a joint project by the AI + Automation Lab of Bayerischer Rundfunk (BR), Mitteldeutscher Rundfunk (mdr), and ida. The project identifies user comments that address the newsrooms directly, in order to foster constructive exchange with our audiences.

This repository documents the status of the project work during the JournalismAI Fellowship in 2022 (more information below in the chapter On the fellowship). The project team used the fellowship to explore technical solutions that support the comment moderation teams of mdr and BR. The goal was to allow the moderation teams to engage in real-time communication with their audiences. For this purpose, we built a system that brings comments with direct mentions of the media house to the immediate attention of the moderation team. This involves the following steps:

fetch the comments instantaneously
preprocess the comments
store the comments
classify them into relevant and irrelevant comments
publish the relevant comments to the moderation team's instance
forward the moderation team to the comment in the moderation tool
collect feedback from the moderation team to improve the model
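
The first steps of this list can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: the `Comment` fields, the helpers `preprocess`, `mentions_newsroom`, and `run_pipeline`, and the keyword rule that stands in for the released text classification model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Comment:
    """Hypothetical comment record; the real class lives in src/models.py."""
    id: str
    body: str


def preprocess(comment: Comment) -> Comment:
    """Collapse repeated whitespace (placeholder for the real preprocessing)."""
    return Comment(id=comment.id, body=" ".join(comment.body.split()))


def mentions_newsroom(comment: Comment) -> bool:
    """Toy keyword rule standing in for the trained classification model."""
    text = comment.body.lower()
    return any(keyword in text for keyword in ("redaktion", "@mdr", "@br"))


def run_pipeline(
    comments: Iterable[Comment],
    classify: Callable[[Comment], bool] = mentions_newsroom,
) -> List[Comment]:
    """Preprocess, store, and classify comments; return the relevant ones."""
    stored: List[Comment] = []    # stands in for the database
    relevant: List[Comment] = []  # would be published to the moderation team
    for raw in comments:
        comment = preprocess(raw)
        stored.append(comment)
        if classify(comment):
            relevant.append(comment)
    return relevant


incoming = [
    Comment("1", "Liebe  Redaktion,   warum fehlt die Quelle?"),
    Comment("2", "Schönes Wetter heute."),
]
relevant_comments = run_pipeline(incoming)
```

Passing the classifier as a parameter keeps the sketch independent of any particular model, so the keyword rule could be swapped for a call to a trained model without touching the rest of the loop.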

Part of the running project is a text classification model that was released on Hugging Face.

Architecture

Local setup

Create a virtualenv using

python3.9 -m venv .venv

source .venv/bin/activate

Install dependencies using

pip3 install -r requirements.txt

Usage

Run the API as uvicorn api:app --host <ip_address> --port <port>

The settings required to run this project can be found in settings.py.

The project's APIs are documented via the endpoint /docs.

Deployment

This repository is connected via GitHub Actions to the Google Cloud Kubernetes cluster of BR. Access to the BR infrastructure is restricted to members of BR.

The settings required for the deployment can be found in config.yaml.

NOTE: To optimize the deployment runtime of this repository, the dependencies documented in requirements.txt were moved into the base image wtwm-application-base-image, which the deployment inherits from. The base image can be adapted based on the dependencies in requirements.txt.

image:
  imageFrom: europe-west3-docker.pkg.dev/brdata-dev/cloud-deploy-images/wtwm-application-base-image

Data source information

The comment data from BR and mdr is provided through APIs external to this repository. To include your own comment data APIs, follow the example of the endpoints /v1/get_mdr_comments and /v1/get_latest_br_comments. The comment data must fit the format of the Comment class in src/models.py.
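
As an illustration of adapting an external source to a common comment format, the sketch below maps one item of an API response onto a comment record. Every name in it is an assumption made for the example: the fields of this stand-in `Comment`, the payload keys `id`, `text`, and `createdAt`, and the helper `to_comment`. The authoritative field list is the Comment class in src/models.py.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict


@dataclass(frozen=True)
class Comment:
    """Illustrative stand-in; the field names here are assumptions."""
    id: str
    body: str
    created_at: datetime
    source: str


def to_comment(payload: Dict[str, Any], source: str) -> Comment:
    """Map one item of a (hypothetical) external API response onto Comment."""
    missing = {"id", "text", "createdAt"} - payload.keys()
    if missing:
        # Fail early so malformed items never reach the classifier.
        raise ValueError(f"payload is missing fields: {sorted(missing)}")
    return Comment(
        id=str(payload["id"]),
        body=payload["text"].strip(),
        created_at=datetime.fromisoformat(payload["createdAt"]),
        source=source,
    )


example = to_comment(
    {"id": 7, "text": " Hallo BR ", "createdAt": "2022-11-01T12:30:00"},
    source="custom",
)
```

Validating the payload at the boundary means a new data source only needs its own small adapter, while the rest of the system keeps working with a single format.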

File/Model storage

Data files and the various models are stored in a persistent Google Cloud Storage bucket that the deployment routine connects to the pod.

The classification model that was last used in the running system can be found here.

Database integration

The processed comments and their mentions are stored in a PostgreSQL database instance that the deployment routine connects to the pod.

API Endpoints

The project's APIs are documented via the endpoint /docs.

Authentication

API endpoints are secured by a bearer token. Requests must include the bearer token to be accepted.
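
A request with a bearer token can be built as sketched below, using only the Python standard library. The base URL, the placeholder token, and the helper name `build_authorized_request` are assumptions for the example, as is the use of a GET request for this endpoint. The `Authorization: Bearer <token>` header is the standard form for bearer tokens.

```python
import urllib.request


def build_authorized_request(
    base_url: str, path: str, token: str
) -> urllib.request.Request:
    """Return a GET request that carries the bearer token in its header."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


# Placeholder host, port, and token; replace them with your own values.
request = build_authorized_request(
    "http://localhost:8000", "/v1/get_mdr_comments", "<your-token>"
)

# To actually send the request against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```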

On the fellowship

JournalismAI is a project of Polis – the journalism think-tank at the London School of Economics and Political Science – and is sponsored by the Google News Initiative. If you want to know more about the Fellowship and the other JournalismAI activities, sign up for the newsletter or get in touch with the team via [email protected]
