Sparse autoencoders for mechanistic interpretability of the DNA sequence-based model Borzoi @ f(DNA) Calico

Large machine learning models, such as the DNA sequence-based model Borzoi, are trained on large amounts of data. The training data contain many features, and because the model performs well on prediction tasks and downstream benchmarks, it has presumably learned to extract these features from the input sequence. We aim to use sparse autoencoders to decompose activations from the first few layers of the pre-trained model into monosemantic concepts that map to known and unknown transcriptional regulatory motifs.

We use the top-K approach to sparsity described by L. Gao et al. to reconstruct the activations of the first few convolutional layers.
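
To make the top-K idea concrete, here is a minimal, framework-free sketch of a top-K sparse autoencoder forward pass. It is not the code in `sae.py`: the class name `TopKSAE`, the helpers `top_k` and `matvec`, the layer sizes, and the tied initialization are all assumptions chosen for illustration. A real implementation would use a tensor library and train the weights. The sketch only shows the general recipe: subtract a bias from the input, project into a wider latent space, keep the k largest pre-activations, and decode back.

```python
import random


def top_k(values, k):
    """Keep the k largest entries of `values` and set all others to zero."""
    if k <= 0:
        return [0.0] * len(values)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TopKSAE:
    """Hypothetical top-K sparse autoencoder with untrained weights."""

    def __init__(self, d_in, d_latent, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        # Encoder: d_latent rows of length d_in.
        self.w_enc = [[rng.gauss(0.0, 0.1) for _ in range(d_in)]
                      for _ in range(d_latent)]
        # Decoder: initialized as the transpose of the encoder (an assumption).
        self.w_dec = [[self.w_enc[j][i] for j in range(d_latent)]
                      for i in range(d_in)]
        # Bias subtracted before encoding and added back after decoding.
        self.b_pre = [0.0] * d_in

    def encode(self, x):
        centered = [xi - b for xi, b in zip(x, self.b_pre)]
        return top_k(matvec(self.w_enc, centered), self.k)

    def decode(self, z):
        return [r + b for r, b in zip(matvec(self.w_dec, z), self.b_pre)]


def mse(x, x_hat):
    """Mean squared reconstruction error between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


# One activation vector, e.g. the channels of a conv layer at one position.
sae = TopKSAE(d_in=4, d_latent=8, k=2)
activation = [1.0, 0.5, -0.3, 2.0]
latent = sae.encode(activation)
reconstruction = sae.decode(latent)
error = mse(activation, reconstruction)
```

At most `k` entries of `latent` are non-zero, so sparsity is enforced directly by the activation function rather than by a penalty term in the loss.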

Repository files:

README.md
config.json
conv1d_1.txt
dataset.py
find_global_max.py
grid.py
grid_infer.py
infer.py
infer_one_instance.py
params_grid.json
plot_acts_pretrained.py
sae.py
train_one_instance.py
train_sae.py

anyakors/sae_borzoi
