sam-paech

Sam Paech sam-paech

Independent AI tinkerer

25 followers · 0 following

Popular repositories

antislop-sampler Public

Python 270 27

diplobench Public

Benchmark for LLMs playing full press diplomacy

HTML 26 6

lm-evaluation-harness Public

Forked from EleutherAI/lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 4 1

Ollama-MMLU-Pro-IRT Public

Forked from chigkim/Ollama-MMLU-Pro

Ollama-MMLU-Pro fork, using a smaller IRT-tuned subset of MMLU-Pro

Jupyter Notebook 2

entropix-gsm8k-eval Public

Jupyter Notebook 1

FastEval Public

Forked from FastEval/FastEval

Fast & more realistic evaluation of chat language models. Includes leaderboard.

Python