Create a virtual environment and use pip to install the required dependencies:
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Example of a data generation and training procedure:

```bash
# first, create an annotated dataset using an auxiliary model
python generate_instruct.py --language_model meta-llama/Llama-3.2-3B-Instruct --cuda

# then train probes for layers 20-27, pointing to the dataset created above
python example_train.py --path 'data/meta-llama/Llama-3.2-3B-Instruct/STOKE_500_wikiqa' --layers 20 21 22 23 24 25 26 27 --batch_size 4 --cuda
```
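For intuition, each probe is a lightweight classifier trained on a single layer's hidden states. The sketch below is illustrative only; the class name, shapes, and training objects are assumptions, not the repo's actual implementation.

```python
# Minimal sketch of a per-layer probe: a linear head over one transformer
# layer's hidden states. All names and shapes here are illustrative assumptions.
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    def __init__(self, hidden_size: int, num_classes: int = 2):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) from a single layer
        return self.classifier(hidden_states)

probe = LinearProbe(hidden_size=3072)  # Llama-3.2-3B-Instruct hidden size
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
```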
You can download trained probes for Llama-3.2-1B-Instruct here. More pre-trained models will be added soon!
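As a rough illustration of how a downloaded probe might be applied, the snippet below assumes a hypothetical checkpoint path and a plain linear-layer state dict; the actual file names and format of the release may differ.

```python
# Hypothetical example of loading a downloaded probe and scoring hidden states.
# The checkpoint path and state-dict layout are assumptions, not the release format.
import torch
import torch.nn as nn

probe = nn.Linear(2048, 2)  # Llama-3.2-1B-Instruct hidden size is 2048
state = torch.load("probes/Llama-3.2-1B-Instruct/layer_12.pt", map_location="cpu")
probe.load_state_dict(state)
probe.eval()

hidden_states = torch.randn(1, 10, 2048)  # stand-in for one layer's activations
scores = probe(hidden_states).softmax(dim=-1)
```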
In order to launch the chat demo (shown in the image above):
```bash
export HF_TOKEN="your token here..."
python chat.py
```
In order to launch the playground (shown below):
```bash
streamlit run stoke/src/playground/app.py
```
To make the streaming classifiers easy to use, this repo relies on a custom fork of transformers.
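To convey the general idea behind streaming classification, the sketch below uses only stock transformers: at each decoding step, the newest token's hidden state is exposed and can be passed to a probe. This is not the fork's API; the model name, layer index, and loop structure are assumptions.

```python
# Illustrative sketch of per-token ("streaming") classification using vanilla
# transformers. The custom fork streamlines this; everything below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
past_key_values = None

for _ in range(20):
    with torch.no_grad():
        out = model(
            input_ids=input_ids[:, -1:] if past_key_values is not None else input_ids,
            past_key_values=past_key_values,
            use_cache=True,
            output_hidden_states=True,
        )
    past_key_values = out.past_key_values
    next_id = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
    # out.hidden_states[layer][:, -1, :] is the newest token's hidden state;
    # feeding it to a trained probe classifies each token as it streams.
    token_state = out.hidden_states[20][:, -1, :]
    input_ids = torch.cat([input_ids, next_id], dim=-1)
```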