SIG

This repository contains code for SIG.

SIG: Speaker Identification in Literature via Prompt-Based Generation

Zhenlin Su, Liyan Xu, Jin Xu, Jiangnan Li, Mingdu Huangfu

Introduction

Speaker identification in literary text aims to identify the speaker of each quotation in narrative genres. Our method identifies the speaker via prompt-based generation: we prompt a generation model such as BART and use its generation scores to identify the speaker.

The main idea of SIG consists of two parts:

Prompt design: Both the training input (X) and the training label (Y) are paired with an appropriate prompt. A natural-language prompt is appended after the quotation so that the input stays close to the MLM training method of the pre-trained model. The training label is preceded by a prefix prompt, as in "Speaker: Y", which lets the model predict the speaker from the input and the prefix.
Classification by generation: SIG calculates the generation probability of each candidate speaker, which makes better use of prior knowledge and restricts the final answer to the candidate set.
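
The two parts above can be illustrated with a minimal sketch. It is not the repository's implementation: the prompt wording, the function names, and the optional length normalization are assumptions made for illustration, and `toy_token_log_probs` is a stand-in for the per-token log-probabilities that a seq2seq model such as BART would assign to the target given the source.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

# Hypothetical templates; the real prompts are configured in config.py and may differ.
INPUT_PROMPT = ' Who said "{quote}"?'
LABEL_PREFIX = "Speaker: "

# A scorer maps (source, target) to one log-probability per target token.
TokenLogProbs = Callable[[str, str], Sequence[float]]


def build_input(context: str, quote: str) -> str:
    """Append a natural-language prompt after the quotation context."""
    return context + INPUT_PROMPT.format(quote=quote)


def build_target(speaker: str) -> str:
    """Put the prefix prompt in front of the label, e.g. 'Speaker: Alice'."""
    return LABEL_PREFIX + speaker


def rank_candidates(
    source: str,
    candidates: Sequence[str],
    token_log_probs: TokenLogProbs,
    length_normalize: bool = True,  # assumption: average so long names are not penalized
) -> List[Tuple[str, float]]:
    """Score every candidate target and return (name, score) pairs, best first."""
    scored = []
    for name in candidates:
        log_probs = list(token_log_probs(source, build_target(name)))
        score = sum(log_probs)
        if length_normalize and log_probs:
            score /= len(log_probs)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def _words(text: str) -> List[str]:
    return re.findall(r"\w+", text.lower())


def toy_token_log_probs(source: str, target: str) -> List[float]:
    """Placeholder scorer: target words that also occur in the source are likely."""
    source_words = set(_words(source))
    return [math.log(0.9 if w in source_words else 0.1) for w in _words(target)]


if __name__ == "__main__":
    src = build_input('Alice sighed. "I am late."', "I am late")
    for name, score in rank_candidates(src, ["Carol", "Alice"], toy_token_log_probs):
        print(f"{name}: {score:.3f}")
```

Because only the listed candidates are scored, the prediction can never fall outside the candidate set, which is the restriction described above. Swapping the toy scorer for a real model's target log-likelihood leaves `rank_candidates` unchanged.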

By changing the prompt and the way candidates are selected, SIG can be applied to many other tasks.

Main code Introduction:

config.py: Modify parameters here to control training and evaluation, including prompt construction, training parameters, checkpoint and data paths, etc.
train.py: Code used to train the model.
evaluate.py: Code used to evaluate the trained model in two ways: direct generation and classification by generation.
SIG_test.py: Code used to test the model, select an appropriate prompt, and compare the various methods (optional; not needed for training or evaluation).
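
The two evaluation modes named for evaluate.py differ in how a prediction is obtained. The sketch below shows one way to read them, not the script's actual code: the helper names are hypothetical, and the "Speaker: " prefix is assumed from the example in the introduction. In direct generation the name is parsed from free-form model output, so it may be missing or fall outside the candidate list; in classification by generation the highest-scoring candidate is chosen.

```python
from typing import Dict, List, Optional

LABEL_PREFIX = "Speaker: "  # assumed label prefix, taken from the introduction's example


def parse_generated_speaker(generated: str) -> Optional[str]:
    """Direct generation: read the name that follows the prefix in the output text."""
    text = generated.strip()
    if not text.startswith(LABEL_PREFIX):
        return None  # the model did not follow the expected format
    name = text[len(LABEL_PREFIX):].strip()
    return name or None


def pick_by_score(candidate_scores: Dict[str, float]) -> str:
    """Classification by generation: choose the candidate with the highest score."""
    return max(candidate_scores, key=candidate_scores.get)


def accuracy(predictions: List[Optional[str]], gold: List[str]) -> float:
    """Fraction of examples whose predicted speaker equals the gold speaker."""
    if not gold:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, gold))
    return correct / len(gold)


if __name__ == "__main__":
    direct = [parse_generated_speaker(t) for t in ["Speaker: Alice", "Alice said it"]]
    ranked = [pick_by_score(s) for s in [{"Alice": -0.4, "Bob": -1.3},
                                         {"Alice": -0.9, "Bob": -0.7}]]
    gold = ["Alice", "Bob"]
    print("direct generation:", accuracy(direct, gold))
    print("classification by generation:", accuracy(ranked, gold))
```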

Requirements:

pytorch==1.8.1
transformers==4.4.1
jieba==0.42.1
spacy==3.6.1

Citation:

@article{su2023sig,
title={SIG: Speaker Identification in Literature via Prompt-Based Generation},
journal={Proceedings of the AAAI Conference on Artificial Intelligence}, 
author={Zhenlin Su and Liyan Xu and Jin Xu and Jiangnan Li and Mingdu Huangfu}, 
year={2024}
}

Contact

If you have any problems, raise an issue or contact [email protected]
