IMRO832000/CSE_574

Enhancing Question Generation with Novel Reward Functions: Evaluation and Comparison

Overview

This project improves question generation by fine-tuning a base LLM with reinforcement learning, using novel reward functions. It evaluates and compares these reward functions to see how well they work and what improvements they bring.

Architecture:

  • Pretrained SQuAD v2 model
  • PPO (Proximal Policy Optimization) instead of SCST (Self-Critical Sequence Training)
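The difference between the two training objectives named above can be sketched for a single sampled question. This is a minimal illustration, not the repository's training code: the function names, the scalar inputs, and the clipping value `clip_eps=0.2` are all assumptions made for the example. SCST uses the reward of a greedy decode as its baseline, while PPO clips the probability ratio between the updated and the old policy.

```python
import math


def scst_loss(sample_logprob, sample_reward, greedy_reward):
    """Self-critical loss for one sampled sequence (hypothetical helper).

    The greedy decode's reward serves as the baseline, so a sample is
    reinforced only when it beats the greedy output.
    """
    advantage = sample_reward - greedy_reward
    return -advantage * sample_logprob


def ppo_clipped_loss(new_logprob, old_logprob, advantage, clip_eps=0.2):
    """PPO clipped surrogate loss for one sequence (hypothetical helper).

    clip_eps=0.2 is an illustrative value, not one taken from this README.
    """
    # Probability ratio between the updated policy and the sampling policy.
    ratio = math.exp(new_logprob - old_logprob)
    # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the two surrogates, negated to form a loss.
    return -min(ratio * advantage, clipped * advantage)
```

With a positive advantage, the clipped term stops rewarding the policy once the ratio exceeds `1 + clip_eps`, which limits how far a single update can move away from the policy that generated the samples. SCST has no such limit.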

Hyperparameters

  • Batch Size: 512
  • Total Batches: 50 out of 170
  • Learning Rate: 5e-5
  • Generation Kwargs: "min_new_tokens": 1, "max_new_tokens": 32
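The settings above could be collected in a small configuration object, sketched below. The class name `TrainConfig` and its field and method names are hypothetical. Reading "50 out of 170" as 50 batches processed out of 170 available is an assumption, not something the README states.

```python
from dataclasses import dataclass, field


@dataclass
class TrainConfig:
    """Hypothetical container for the hyperparameters listed above."""

    batch_size: int = 512
    batches_run: int = 50          # assumed meaning of "50 out of 170"
    batches_available: int = 170   # assumed meaning of "50 out of 170"
    learning_rate: float = 5e-5
    generation_kwargs: dict = field(
        default_factory=lambda: {"min_new_tokens": 1, "max_new_tokens": 32}
    )

    def samples_seen(self) -> int:
        # Number of examples processed across all batches that were run.
        return self.batch_size * self.batches_run

    def fraction_of_data(self) -> float:
        # Share of the available batches that were actually processed.
        return self.batches_run / self.batches_available


cfg = TrainConfig()
print(cfg.samples_seen())
print(round(cfg.fraction_of_data(), 3))
```

Under that reading, the run covers 512 × 50 = 25,600 examples, or about 29% of the available batches.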

Dataset:

Reference Paper:

