Add multicontext training #359

Open
wants to merge 3 commits into
base: master

gkielian
Collaborator

This PR is a prototype for multiple simultaneous contexts.

It includes a labelled shakespeare_char_punct, which has code for
assigning a label to each character (in this case vowel, punctuation,
consonant, or other, but the labels could also be POS tags).
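
As a rough illustration of the per-character labelling described above, here is a minimal sketch. The function names and the label ids are hypothetical and are not taken from the PR; the actual shakespeare_char_punct code may use a different vocabulary or a different rule for edge cases such as digits and whitespace.

```python
import string

VOWELS = set("aeiouAEIOU")

# Hypothetical label ids; the PR's actual label vocabulary may differ.
LABELS = {"vowel": 0, "consonant": 1, "punctuation": 2, "other": 3}


def label_char(ch: str) -> int:
    """Map one character to a coarse class id."""
    if ch in VOWELS:
        return LABELS["vowel"]
    if ch in string.ascii_letters:
        # Any ASCII letter that is not a vowel is treated as a consonant.
        return LABELS["consonant"]
    if ch in string.punctuation:
        return LABELS["punctuation"]
    # Whitespace, digits, and anything else fall through to "other".
    return LABELS["other"]


def label_text(text: str) -> list[int]:
    """Return one label per character, aligned with the input text."""
    return [label_char(ch) for ch in text]
```

The label sequence has the same length as the character sequence, so it can be fed as a second context stream alongside the character tokens.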

Currently we default to averaging the losses for each lm_head.
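
The averaging can be sketched as follows. This is a standard-library illustration of the idea only, not the PR's implementation (which presumably operates on framework tensors); `multicontext_loss` and its arguments are hypothetical names. Each head has its own logits and targets, and the combined loss is the unweighted mean of the per-head cross-entropy losses.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one prediction, from raw logits and a target index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(batch_logits, targets):
    """Mean cross-entropy over all positions for a single head."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


def multicontext_loss(logits_per_head, targets_per_head):
    """Return (combined_loss, per_head_losses).

    The combined loss is the unweighted mean over heads; the per-head
    losses are returned as well so each stream can be logged separately.
    """
    per_head = [
        mean_cross_entropy(l, t)
        for l, t in zip(logits_per_head, targets_per_head)
    ]
    return sum(per_head) / len(per_head), per_head
```

Returning the per-head values alongside the mean keeps the individual stream losses available for inspection, and a weighted sum could replace the mean if one stream should dominate training.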

Also, this only has two streams, and there is no method yet for
selecting which head to run inference from.

This allows for concurrent labels, and lets us view the separate losses
for the a and b context streams.

Note, this is currently essentially equivalent to position embeddings.

We can now add simple NLP-based labels, such as vowel vs. consonant, and
perhaps position embeddings within the word itself.
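
One possible reading of "position within the word" is an index that restarts at each word boundary. The sketch below is an assumption about how such a stream could be built, not something the PR implements; `within_word_positions` is a hypothetical helper.

```python
def within_word_positions(text: str) -> list[int]:
    """Return, for each character, its 1-based position inside its word.

    Non-letter characters (spaces, punctuation, digits) get position 0 and
    reset the counter. This is an assumed convention; note that it also
    splits words at apostrophes.
    """
    positions = []
    pos = 0
    for ch in text:
        if ch.isalpha():
            pos += 1
            positions.append(pos)
        else:
            pos = 0
            positions.append(0)
    return positions
```

Like the character-class labels, this yields one integer per character, so it could serve as another context stream or index into a small embedding table.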
@gkielian gkielian linked an issue Jan 19, 2025 that may be closed by this pull request
10 tasks
Development

Successfully merging this pull request may close these issues.

Add Semantic Factorization