forked from nargesam/factCC

Project for the AI Fellowship at Insight - Fact checking of text summarization models using Transformers.

AutoFact: Automated tool for evaluating the factual consistency of summarized text

Project for the AI Fellowship at Insight

In this project I introduce AutoFact, an automated tool for evaluating the factual consistency of summarized text.

This project is based on this paper from Salesforce Research: https://arxiv.org/abs/1910.12840

Project description

AutoFact: a tool that evaluates the factual consistency between the summary of a large text document and its source using Transformer models.

Development: used HuggingFace to build the AutoFact model and compared several Transformer models, such as Google BERT, Facebook RoBERTa, and DistilBERT, to demonstrate the runtime-accuracy trade-off.

Demo: Served via an interactive command line interface created with the Python package Click.
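As a rough illustration of what a Click-based entry point for this tool could look like, here is a minimal sketch. The option names mirror the command shown later in this README, but the body is a placeholder, not the project's actual implementation:

```python
# Minimal sketch of a Click CLI in the style of try_factcc.py.
# The prediction logic is a stand-in; only the option interface
# reflects the command documented in this README.
import click

MODEL_TYPES = ["bert-base-cased", "bert-base-uncased", "roberta-base"]

@click.command()
@click.option("--model-type", type=click.Choice(MODEL_TYPES), required=True,
              help="Which pretrained Transformer to load.")
@click.option("--config-path", type=click.Path(), required=True,
              help="Path to the .cfg file with model settings.")
def evaluate(model_type, config_path):
    """Evaluate the factual consistency of a claim against its source."""
    click.echo(f"Loading {model_type} with config {config_path}")
```

Click turns the decorated function into a command object, so it can be invoked from the shell (`python try_factcc.py --model-type roberta-base --config-path src/factcc/try_factcc.cfg`) or tested programmatically with `click.testing.CliRunner`.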

Data description

The source texts are CNN and Daily Mail news articles, and each claim is an abstractive summary of one of those articles. Summaries consistent with their source are labeled "SUPPORTS"; each is then augmented to create a false claim labeled "REFUTES".

The data is provided by Salesforce Research, and you can re-create the dataset here: https://github.com/salesforce/factCC/tree/master/data_generation

Test AutoFact

To try AutoFact, first clone the repo:

git clone https://github.com/nargesam/factCC.git

You can recreate the conda environment:

conda env create -f factcc_environment.yml 

Or install the dependencies from requirements.txt:

pip3 install -r requirements.txt

Please download the BERT Base Cased model and its config file, BERT Base Uncased model and its config file, or RoBERTa model and its config file, and save each to its directory: models/saved_models/< model-type >/batch_size-50__epoch-6__datasize-10perc

< model-type >: bert-base-cased, bert-base-uncased, roberta-base
< Config Path >: src/factcc/try_factcc.cfg

Run the test Python file:

python src/factcc/try_factcc.py  --model-type < model-type > --config-path < Config Path >
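The script reads its settings from the .cfg file passed via --config-path. As a hedged sketch of how such a file might be consumed with Python's standard configparser, the section and key names below are illustrative assumptions, not the actual contents of try_factcc.cfg:

```python
# Sketch of reading an INI-style config like try_factcc.cfg.
# Section and key names here are hypothetical examples.
import configparser

cfg_text = """
[model]
saved_model_dir = models/saved_models/roberta-base/batch_size-50__epoch-6__datasize-10perc
max_seq_length = 512
"""

config = configparser.ConfigParser()
config.read_string(cfg_text)  # with a real file: config.read(config_path)

model_dir = config["model"]["saved_model_dir"]
max_len = config.getint("model", "max_seq_length")
```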
