FYP-Project

This repository contains a Final Year Project (FYP) on restoring punctuation in speech transcripts using Large Language Models (LLMs) and pre-LLMs (small-scale LLMs). The project is divided into two main parts: LLM and pre-LLM.

Introduction

The project focuses on restoring punctuation in text using different language models. It has two main sections:

LLM: This section fine-tunes the Llama-2 model with LoRA on the LibriHeavy small dataset and compares its performance with that of Gemini-pro, accessed through its API.
pre-LLM: This section uses a modified XLM-RoBERTa model for punctuation restoration.
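
Encoder models such as XLM-RoBERTa are commonly applied to punctuation restoration by treating it as token classification: each word is tagged with the punctuation mark that follows it. The sketch below illustrates that framing in plain Python, without any model. It is not taken from this repository; the label names (`COMMA`, `PERIOD`, `QUESTION`, `O`) and the helper functions `text_to_labels` and `labels_to_text` are hypothetical, and the project's actual label set and preprocessing may differ.

```python
import re

# Hypothetical label set for illustration; the project's classes may differ.
PUNCT_TO_LABEL = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
LABEL_TO_PUNCT = {label: mark for mark, label in PUNCT_TO_LABEL.items()}


def text_to_labels(text):
    """Convert punctuated text into (lowercased word, label) training pairs."""
    pairs = []
    for token in text.split():
        # Split a token into its word part and one optional trailing mark.
        match = re.match(r"^(.*?)([,.?]?)$", token)
        word, mark = match.group(1), match.group(2)
        if not word:
            continue  # skip a stray punctuation mark with no word attached
        pairs.append((word.lower(), PUNCT_TO_LABEL.get(mark, "O")))
    return pairs


def labels_to_text(pairs):
    """Rebuild readable text from (word, predicted label) pairs."""
    words = []
    capitalize = True  # capitalize the first word and any word after . or ?
    for word, label in pairs:
        if capitalize:
            word = word.capitalize()
        words.append(word + LABEL_TO_PUNCT.get(label, ""))
        capitalize = label in ("PERIOD", "QUESTION")
    return " ".join(words)


if __name__ == "__main__":
    pairs = text_to_labels("Hello, how are you? I am fine.")
    print(pairs)
    print(labels_to_text(pairs))
```

In this framing, a classifier would predict the label column from the unpunctuated words, and `labels_to_text` would turn its predictions back into readable text.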

Project Structure

The repository is organized into the following directories:

LLM: Contains scripts and resources for fine-tuning and testing the Llama-2 model.
pre_LLM: Contains scripts and resources for using the XLM-RoBERTa model for punctuation restoration.

Please refer to the README files in the LLM and pre_LLM directories for more detailed instructions.

Folders and files

LLM
pre_LLM
.gitattributes
FYP_amended_Final_Report_LiuChangsong.pdf
Presentation Slides_refined.pptx
README.md


ntuspeechlab/LiuChangsong_FYP2024_SUD
