News Bias Tool

The goal of this project is to make it easier to digest news articles and find biases within them. Our plan is to use a vector database to store articles scraped from the web and posted by users for analysis, then use a large language model (LLM) like ChatGPT to analyze the articles' biases.

Problem: There are many different reporters and organizations reporting news, and it is impossible for one person to research each one. It is also impossible for news reporting to be 100% objective since reporters have to choose what details to include in their stories and frame the story in a way that keeps readers interested. Given all this, how can a person catch the biases that inform each news report they consume?

Solution: Build a tool that finds all the articles/reports for a given story, analyzes the differences between them, and uses AI to detect potential biases in each one. When a consumer reads a news article, they can use the tool to read "between the lines" of the article and get a sense of how biased or objective it may be.

This is a senior design project. Progress and Goals discusses what we completed in Fall 2023 and what we plan to complete in Spring 2024.

target customers

People who read news articles and want an easier way to handle the complexity of taking in news from many sources.

implementation

tl;dr:

scrape all the news
store in vector database
analyze bias with LLM

long form:

For this idea we'll need to scrape web data to find the publicly available articles on a specific topic to compare against the article the user wants to analyze.
We can use a vector database to store article content as vectors, which lets us query for relationships/similarities between articles. We can update this database with articles users upload (that they want to analyze) and articles scraped from the web. Each addition changes which stored articles are most similar to one another. Out-of-date articles can also be updated.
We will use a large language model (LLM, e.g. ChatGPT) to process an article and detect biases (and possibly find misinformation or opinionated judgements in the article). We may also keep track of media scores to deepen the model's understanding of what content is biased and in what direction.
We will have API endpoints to interact with the database and return the bias analysis on a given article.
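To make the vector database idea above more concrete, here is a minimal sketch of similarity search over stored articles. It is an illustration only: `ToyArticleStore`, `embed`, and the sample headlines are hypothetical, and the word-count "embedding" stands in for whatever learned embedding model and vector database the project ends up using.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumption: a real system would use a learned model)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyArticleStore:
    """In-memory stand-in for a vector database (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._vectors: dict[str, Counter] = {}

    def add(self, article_id: str, text: str) -> None:
        # Re-adding an existing id overwrites the old vector (an "update").
        self._vectors[article_id] = embed(text)

    def most_similar(self, text: str, k: int = 3) -> list[tuple[str, float]]:
        # Score every stored article against the query and keep the top k.
        query = embed(text)
        scored = [(aid, cosine_similarity(query, vec)) for aid, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = ToyArticleStore()
store.add("a1", "Senate passes budget bill after long debate")
store.add("a2", "Local team wins championship game")
store.add("a3", "Budget bill debate divides the Senate")
top = store.most_similar("Senate votes on the budget bill", k=2)
print(top)  # the two budget articles rank above the sports article
```

A real vector database would replace the linear scan in `most_similar` with an approximate nearest-neighbor index, but the query shape (embed the input, rank stored vectors by similarity) stays the same.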

future work

These are things we will consider but are out of the scope of our solution...for this semester, at least ;)

prioritize a user-friendly interface
include a search tool to find news articles from our site
analyze sentiment of a text

potential problems

We will be accessing a large volume of data, so it will be hard to parse it all correctly and find the best way to store it.
If we do not have enough articles stored on a topic, the analysis may be skewed, which risks spreading misinformation.
how to prevent hallucinations?
properly analyzing news sources with context
how to know what's biased, and what ways it's biased?
- needs research
dealing with vector databases
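On the hallucination question above, one commonly used mitigation is to ground the model in the retrieved articles and ask it to quote its evidence. The sketch below shows only the prompt assembly. `build_bias_prompt` and its wording are hypothetical, no LLM is called, and this approach reduces rather than eliminates hallucinations.

```python
def build_bias_prompt(article: str, related: list[str], max_related: int = 3) -> str:
    """Assemble a prompt that asks the model to rely only on the supplied text."""
    sources = related[:max_related]  # cap context size (the limit of 3 is an arbitrary assumption)
    lines = [
        "You are reviewing a news article for potential bias.",
        "Use only the ARTICLE and RELATED sections below.",
        "Quote the exact passage that supports each claim you make.",
        'If the supplied text is not enough to judge, answer "insufficient context".',
        "",
        "ARTICLE:",
        article.strip(),
    ]
    for i, text in enumerate(sources, start=1):
        lines += ["", f"RELATED {i}:", text.strip()]
    return "\n".join(lines)


prompt = build_bias_prompt(
    "City council approves controversial stadium deal.",
    ["Council votes 5-4 for stadium funding.", "Residents protest stadium subsidy."],
)
print(prompt)
```

The "insufficient context" instruction also speaks to the concern about thinly covered topics: the model is given an explicit way to decline instead of guessing.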

competitors

Ground News
AllSides
The Dispatch - right-leaning
News facts network - left-center leaning

resources

Web scraping

LangChain and vector databases

Our documentation

Writing Markdown on GitHub

Flask

https://newsdata.io/blog/news-api-python-client/

Steps to run

Note

If you're using a Python virtual environment, make sure it is activated. Also make sure your NewsData.io API key is stored in the environment variable `NEWS_API_KEY`. To automate both steps, you can run `source act.sh` if you have an existing virtual environment at the root named `.venv` and your API key is stored in a file named `.env` based on `.env.example`. For more information on NewsData.io, see API Setup.
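As a quick sanity check for the note above, the following sketch reports whether a virtual environment is active and whether `NEWS_API_KEY` is set. The `preflight` helper is hypothetical and not part of the repository, and the suggested `source .venv/bin/activate` command assumes the standard `venv` layout on macOS or Linux.

```python
import os
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


def preflight(environ=os.environ) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not in_virtualenv():
        problems.append("No virtual environment is active (try: source .venv/bin/activate).")
    if not environ.get("NEWS_API_KEY"):
        problems.append("NEWS_API_KEY is not set.")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"warning: {problem}")
```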

Run `make dev_env`.
Run `make tests`.
Run `./local.sh`.
Run `make prod`.
Run the menu: `dev.sh`.

API Setup

Set Environment Variable:

Get an API key from NewsData.io.
On macOS or Linux:
1. Open your terminal.
2. Run the following command to set an environment variable (replace YOUR_API_KEY with the actual API key):
```
export NEWS_API_KEY="YOUR_API_KEY"
```
On Windows:
1. Press Win + X and select "System".
2. Click on "Advanced system settings" on the left.
3. Click on "Environment Variables".
4. Under "System variables", click "New" and enter NEWS_API_KEY as the variable name and your actual API key as the variable value.
The environment variable must be set in every shell session that runs the Python code. On macOS and Linux, you can add the export command to your shell's profile script (e.g., ~/.bash_profile or ~/.zshrc) so that it is set automatically whenever you open a new terminal window.

Future notes/plans
- For a more permanent and portable solution, consider using a configuration file or a dedicated secret management solution, especially in a production environment.
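One way the configuration-file idea could look is sketched below: prefer the environment variable, and fall back to a simple `KEY=VALUE` file. The helper names are hypothetical, and the file format is an assumption, since the actual layout of `.env.example` is not shown here.

```python
import os
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            # Tolerate shell-style lines such as: export NEWS_API_KEY="..."
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_news_api_key(env_file: Path = Path(".env"), environ=os.environ) -> str:
    """Prefer the environment variable; fall back to the file if it exists."""
    key = environ.get("NEWS_API_KEY")
    if not key and env_file.is_file():
        key = parse_env_file(env_file).get("NEWS_API_KEY")
    if not key:
        raise RuntimeError("NEWS_API_KEY not found in environment or " + str(env_file))
    return key
```

Failing with a clear error at startup is usually easier to debug than a rejected request later on. If a file like this is used, it should stay out of version control.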

Setting up MongoDB on macOS

This section guides you through the process of installing MongoDB on macOS using Homebrew and connecting to your MongoDB instance.

Prerequisites

macOS with Homebrew installed.
Terminal access.

Installation Steps

Add MongoDB Repository to Homebrew: MongoDB provides a custom Homebrew tap. Adding this tap allows you to install MongoDB directly through Homebrew. Run the following command to add the MongoDB tap:
```
brew tap mongodb/brew
```
Install MongoDB Community Edition: Once you have tapped the MongoDB repository, you can install MongoDB Community Edition using the following command:
```
brew install mongodb-community
```
Start the MongoDB Service: After installation, you can start the MongoDB service. This will initiate the MongoDB server and make it ready for connections:
```
brew services start mongodb-community
```
Connecting to MongoDB:
- Using mongosh (Recommended): Newer MongoDB installations come with mongosh, the MongoDB Shell, as the default CLI tool for interaction. You can connect to your local MongoDB instance by simply typing:
```
mongosh
```
- Using mongo (Legacy): The legacy mongo shell was deprecated in MongoDB 5.0 and removed in 6.0. On older versions, or if you have mongo installed separately, you can connect using:
```
mongo
```

Verifying the Installation

To verify that MongoDB is running correctly, use the mongosh or mongo command to connect to your MongoDB instance. If you encounter any issues, ensure that MongoDB is correctly started and that there are no network or firewall configurations blocking the connection.
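If you prefer to check from Python, the sketch below tests whether anything is accepting TCP connections on MongoDB's default port, 27017. The `mongo_port_open` helper is hypothetical, and it only confirms that the port is reachable, not that the listener is actually MongoDB or that queries succeed.

```python
import socket


def mongo_port_open(host: str = "localhost", port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("MongoDB port reachable:", mongo_port_open())
```

A `False` result right after installation usually means the service has not been started. Running `brew services start mongodb-community` again and retrying is a reasonable first step.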

For detailed documentation and advanced configuration, refer to the official MongoDB documentation.
