Giulio Starace thesofakillers

79 followers · 173 following

Liminalis, Limbo
09:34 (UTC +01:00)
giuliostarace.com

Pinned

openai/mle-bench (Public)

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python · 555 stars · 62 forks

openai/evals (Public)

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python · 15.2k stars · 2.6k forks

GPTrue-or-False (Public)

📝🔍 A browser extension that displays the GPT-2 Log Probability of selected text

JavaScript · 113 stars · 11 forks

nlgoals (Public)

Official repository for my MSc thesis: "Addressing Goal Misgeneralization with Natural Language Interfaces."

TeX · 3 stars

infoshare (Public)

Official repository for the paper: "Probing LLMs for Joint Encoding of Linguistic Categories." Findings of EMNLP 2023.

Python · 6 stars

dlml-tutorial (Public)

🤓 A tutorial on the Discretized Logistic Mixture Likelihood (DLML)

Python · 8 stars