FAKE NEWS CHALLENGE STAGE 1 (FNC-1): STANCE DETECTION
Input
A headline and a body text - either from the same news article or from two different articles.
Output
Classify the stance of the body text relative to the claim made in the headline into one of four categories:
Agrees: The body text agrees with the headline.
Disagrees: The body text disagrees with the headline.
Discusses: The body text discusses the same topic as the headline, but does not take a position.
Unrelated: The body text discusses a different topic than the headline.
Stance Detection involves estimating the relative perspective (or stance) of two pieces of text relative to a topic, claim or issue. For FNC-1 we have chosen the task of estimating the stance of a body text from a news article relative to a headline. Specifically, the body text may agree, disagree, discuss or be unrelated to the headline.
For details of the task, see FakeNewsChallenge.org
The general approach here is to combine a conventional classifier and a feed-forward neural network to classify instances in two steps. The conventional classifier chosen is an SVM, which performs a binary classification to predict whether an instance is related or unrelated to a headline. The neural network then classifies related instances as one of agree, disagree, or discuss.
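The sketch below shows one way this two-step setup could be wired up, assuming scikit-learn's LinearSVC for the binary related/unrelated classifier and a small Keras feed-forward network for the agree/disagree/discuss classifier; the layer sizes and hyperparameters are illustrative, not necessarily those used in main.py.

```python
# Sketch of the two-step classifier setup (illustrative hyperparameters).
from sklearn.svm import LinearSVC
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

def build_related_classifier():
    # Step 1: binary SVM deciding related vs. unrelated.
    return LinearSVC(C=1.0)

def build_stance_classifier(n_features):
    # Step 2: feed-forward network deciding agree / disagree / discuss.
    model = Sequential([
        Dense(128, activation="relu", input_shape=(n_features,)),
        Dropout(0.5),
        Dense(64, activation="relu"),
        Dense(3, activation="softmax"),  # agree, disagree, discuss
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```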
Three kinds of features are extracted from each headline/body pair: cosine similarity, KL divergence, and n-gram overlap.
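As an illustration, the sketch below computes all three features over simple whitespace-tokenized bag-of-words counts; the actual preprocessing and weighting in FeatureExtract.py may differ.

```python
# Illustrative versions of the three headline/body features.
import numpy as np
from collections import Counter

def _tokens(text):
    return text.lower().split()

def cosine_similarity(headline, body):
    # Cosine similarity between term-count vectors of headline and body.
    h, b = Counter(_tokens(headline)), Counter(_tokens(body))
    vocab = sorted(set(h) | set(b))
    hv = np.array([h[w] for w in vocab], dtype=float)
    bv = np.array([b[w] for w in vocab], dtype=float)
    denom = np.linalg.norm(hv) * np.linalg.norm(bv)
    return float(hv @ bv / denom) if denom else 0.0

def kl_divergence(headline, body, eps=0.1):
    # KL(P_headline || P_body) over smoothed unigram distributions.
    h, b = Counter(_tokens(headline)), Counter(_tokens(body))
    vocab = sorted(set(h) | set(b))
    p = np.array([h[w] + eps for w in vocab], dtype=float)
    q = np.array([b[w] + eps for w in vocab], dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def ngram_overlap(headline, body, n=2):
    # Fraction of headline n-grams that also appear in the body.
    grams = lambda toks: set(zip(*[toks[i:] for i in range(n)]))
    h, b = grams(_tokens(headline)), grams(_tokens(body))
    return len(h & b) / len(h) if h else 0.0
```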
The two classifiers are trained on the same data set, but with different target labels. For the SVM, all target labels are collapsed to either related or unrelated. For the neural network's training input, the original labels agree, disagree, and discuss are kept.
When predicting (testing), the same data set is used. Instances marked as unrelated by the SVM are kept untouched. For those labeled as related, their target labels are replaced with the neural network's predictions (agree, disagree, discuss).
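A minimal sketch of this label handling, assuming the standard FNC-1 stance strings (agree, disagree, discuss, unrelated):

```python
# Relabeling for training and merging of predictions at test time.
RELATED = ("agree", "disagree", "discuss")

def to_binary_labels(stances):
    # Collapse agree/disagree/discuss into "related" for the SVM.
    return ["related" if s in RELATED else "unrelated" for s in stances]

def merge_predictions(svm_preds, nn_preds):
    # Keep "unrelated" from the SVM; replace "related" with the
    # neural network's agree/disagree/discuss prediction.
    return [nn if svm == "related" else "unrelated"
            for svm, nn in zip(svm_preds, nn_preds)]
```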
The result of the two-step classification is as follows:
ACCURACY: 0.865
MAX - the best possible score (100% accuracy)
NULL - score as if all predicted stances were unrelated
TEST - score based on the provided predictions
| MAX | NULL | TEST |
| --- | --- | --- |
| 11651.25 | 4587.25 | 9055.5 |
The average accuracy reached is 86.5%, and the leaderboard score is 9055.5.
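For context, MAX, NULL, and TEST come from the official FNC-1 weighted scorer: a correct related/unrelated decision earns 0.25 points, and a correct stance on a related pair earns a further 0.75. A sketch of that scoring rule (as we understand the official scorer):

```python
# FNC-1 weighted scoring behind the MAX/NULL/TEST numbers above.
RELATED = {"agree", "disagree", "discuss"}

def fnc_score(true_stances, predicted_stances):
    score = 0.0
    for truth, pred in zip(true_stances, predicted_stances):
        if truth == pred:
            score += 0.25            # exact label match
            if truth != "unrelated":
                score += 0.50        # correct stance on a related pair
        if truth in RELATED and pred in RELATED:
            score += 0.25            # related/unrelated decided correctly
    return score
```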
Python 3.7.4
The source code contains two main .py files: main.py and FeatureExtract.py.
The first file is the driver script, which contains the entire life cycle of the classification process; it calls the corresponding APIs to get cleaned data -> extract features from headlines and bodies -> feed the models for classification.
The latter contains the life cycle of taking a headline and body as input and producing features as output.
The other .py files are helper scripts.
main.py is the driver script. Simply run main.py to reproduce the submission (prediction) files stored in the ../data/submission/ folder.
All data files used are provided by fnc-1 on GitHub; those files are placed in the ../data/ folder.
Before running, download the word embedding file glove.6B.50d.txt and put it into the ../data/ folder. You are good to go.
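For example, the GloVe file can be read into a word-to-vector dictionary along these lines (the actual loading code in the helper scripts may differ):

```python
# Read glove.6B.50d.txt into a {word: 50-dimensional vector} dictionary.
import numpy as np

def load_glove(path="../data/glove.6B.50d.txt"):
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings
```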