This repository implements functions for extracting factors from fact-checks, based on the appropriately-named paper Factoring Fact-Checks:
Structured Information Extraction from Fact-Checking Articles. If you're interested in evaluation results, see the docs
directory for a mini-writeup.
First, if you haven't already, run `pip install -r requirements.txt`.
This repository has two main functions:
- Download and reformat the ClaimReview dataset for NLP processing
- Fine-tune and apply LLMs to ClaimReview data
To download the ClaimReview data, first run:

`python get_dataset.py`

This will download the Datacommons ClaimReview dataset, filter for English articles, and extract the article text for each entry. The output will be saved to `data/dataset_with_articles.jsonl`.
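If you want to sanity-check the output before moving on, each line of `data/dataset_with_articles.jsonl` is a standalone JSON object. Here is a minimal sketch for peeking at it; the schema isn't documented here, so the snippet just prints whatever keys each record actually has:

```python
import json

# Print the keys of the first few records so you can see the actual schema.
with open("data/dataset_with_articles.jsonl", encoding="utf-8") as f:
    for i, line in enumerate(f):
        record = json.loads(line)
        print(sorted(record.keys()))
        if i >= 2:  # only inspect the first three lines
            break
```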
Then, to perform the fuzzy matching algorithm, run:

`python fuzzy_match_factors.py`

This will produce `data/matched_articles.jsonl`, which is ready for processing! WARNING: This may take a long time.
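For context, the goal of `fuzzy_match_factors.py` is to align each structured factor with the article text it came from. The repository's actual matching algorithm isn't described here; the sketch below only illustrates the general idea of fuzzy matching using the standard library's `difflib`, with made-up inputs:

```python
import difflib

def best_match(factor: str, sentences: list[str]) -> tuple[str, float]:
    """Return the article sentence most similar to `factor`, plus its score.

    Illustrative only -- fuzzy_match_factors.py may use a different
    similarity measure and a different unit of text than sentences.
    """
    best_sentence, best_score = "", 0.0
    for sentence in sentences:
        score = difflib.SequenceMatcher(None, factor.lower(), sentence.lower()).ratio()
        if score > best_score:
            best_sentence, best_score = sentence, score
    return best_sentence, best_score

sentences = [
    "The claim was first posted on Facebook in March.",
    "Experts say the figure is overstated.",
]
print(best_match("First posted on Facebook in March", sentences))
```

Comparing every factor against every candidate span of every article is quadratic in practice, which is part of why the full run can take a long time.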
To run the LLM experiments, first change to the `llm` directory. There are a few scripts here that you can use for processing the data:
- `python factor.py --model <model>`: Run a cloud LLM (e.g. `--model gpt-3.5-turbo`) on a subset of the dataset and produce a new file with predictions, `data/predicted_factors.jsonl`. NOTE: if using GPT, you must have an OpenAI API key saved as the environment variable `OPENAI_API_KEY` for this to work! If using Claude, save your API key under `ANTHROPIC_API_KEY` instead. (A rough sketch of this kind of API call appears after this list.)
- `python fine_tune.py --model <model>`: Fine-tune an Anyscale model (e.g. `--model mistralai/Mixtral-8x7b`) on the dataset. NOTE: you must have an Anyscale API key saved as the environment variable `ANYSCALE_API_KEY` for this to work!
- `python tune_dspy.py`: Use DSPy to "fine-tune" a CoT few-shot prompt on the dataset, and run evaluation. Like `factor.py`, you must have an OpenAI API key saved. (See the DSPy sketch after this list.)
- `python eval.py --dataset <your_file.jsonl>`: Run the evaluation script on your prediction file and report ROUGE-1 scores. (See the ROUGE example after this list.)
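`factor.py` handles the prompting and output parsing itself; the snippet below is only a rough, hypothetical sketch of the kind of chat-completion call involved when using GPT. The prompt wording and the factor names in it are invented for illustration:

```python
import os
from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment, as noted above.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

article_text = "..."  # one article pulled from data/matched_articles.jsonl

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": "Extract the factors (e.g. claim, verdict, evidence) from this fact-checking article.",
        },
        {"role": "user", "content": article_text},
    ],
)
print(response.choices[0].message.content)
```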
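Similarly, `tune_dspy.py` owns the actual DSPy program; the sketch below only shows roughly what compiling a chain-of-thought few-shot prompt looks like in DSPy. The configuration calls follow the classic DSPy 2.x API and may need adjusting for your installed version, and the signature fields, metric, and training example are all placeholders:

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Classic DSPy 2.x-style setup; newer versions configure the LM differently.
dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))

class ExtractFactors(dspy.Signature):
    """Extract structured factors from a fact-checking article."""
    article = dspy.InputField()
    factors = dspy.OutputField(desc="the extracted factors, one per line")

program = dspy.ChainOfThought(ExtractFactors)

# In practice the training set would come from data/matched_articles.jsonl;
# the field names here are placeholders.
trainset = [
    dspy.Example(article="...", factors="...").with_inputs("article"),
]

def overlap_metric(example, prediction, trace=None):
    # Placeholder metric; tune_dspy.py defines its own.
    return example.factors == prediction.factors

compiled = BootstrapFewShot(metric=overlap_metric).compile(program, trainset=trainset)
```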
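`eval.py` reports ROUGE-1. If you just want to compute the same kind of score on two strings by hand, the `rouge-score` package (a common choice, though not necessarily what `eval.py` uses internally) works like this:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
# score(target, prediction): the first argument is the reference text.
scores = scorer.score(
    "the claim originated on facebook in march",        # reference factor
    "the claim was first posted to facebook in march",  # predicted factor
)
print(scores["rouge1"].fmeasure)
```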