“Somewhere in the semantics of natural language and the ambiguity of our understanding in reality leaves truth as one of the great mysteries”
Hi, I’m @lhallee!
My name is Logan Hallee, a scientist working on computational protein modeling through the lens of machine learning. Most notably, I am the Chief Scientific Officer and Founder of Synthyra, a Public Benefit LLC which functions as a research org for protein science. I am also a PhD Candidate in Bioinformatics at the University of Delaware in the Gleghorn Lab, where my research is focused on (you guessed it) protein modeling with transformer neural networks. On the side, I run a fun blog called Minds and Molecules, which touches on philosophical ideas I find fascinating.
You can find my CV here
- SYNTERACT is a large language model for protein–protein interaction prediction.
- First LLM approach to PPI.
- Ranks in the top 3% of research outputs by Altmetric.
- Collaborated with Stephen Wolfram & other mentors at the Wolfram Winter School.
- Developed “Tetris For Proteins” – a shape-based metric emulating lock-and-key enzyme-substrate interactions.
- Generates hypotheses on protein aggregation likelihood.
- Invented the Annotation Vocabulary, a unique set of integers mapped to popular protein and gene ontologies.
- Enables state-of-the-art protein annotation and generation models when paired with its own token embedding.
- Codon usage bias is highlighted as a key biological phenomenon and valuable feature for machine learning in Nature Scientific Reports.
- Our models show that codon usage has a strong phylogenetic association.
- Introduced cdsBERT, showcasing cost-effective ways to enhance biological relevance in protein language models via a codon vocabulary.
- Invented a Mixture of Experts extension for scalable transformer networks adept at sentence similarity tasks.
- Future networks with N experts could perform like N independently trained networks, offering significant time and computational savings in semantic retrieval systems.
- In review.
- Collaborates on lab projects involving deep learning for reconstructing 3D organs from 2D Z-stacks.
- Informs morphometric and pharmacokinetic studies to further understanding of organ structure and function.
- featureranker: A Python package for feature ranking.
- Textbook Chapter on Protein Language Models.
- Machine Learning for identifying cardioprotective molecules in minority groups.
- Investigations of Hsp90 and Gamma secretase in cardiac disease.
Research-related queries - [email protected]
Business-related queries - [email protected]
Last Updated: April 2025