sam-paech

Popular repositories

  1. antislop-sampler (Public)

    Python, 241 stars, 23 forks

  2. lm-evaluation-harness (Public)

    Forked from EleutherAI/lm-evaluation-harness

    A framework for few-shot evaluation of language models.

    Python, 4 stars, 1 fork

  3. Ollama-MMLU-Pro-IRT (Public)

    Forked from chigkim/Ollama-MMLU-Pro

    A fork of Ollama-MMLU-Pro that uses a smaller IRT-tuned subset of MMLU-Pro.

    Jupyter Notebook 2

  4. entropix-gsm8k-eval (Public)

    Jupyter Notebook 1

  5. FastEval (Public)

    Forked from FastEval/FastEval

    Fast and more realistic evaluation of chat language models. Includes a leaderboard.

    Python

  6. MMLU-Pro-IRT (Public)

    Forked from TIGER-AI-Lab/MMLU-Pro

    The scripts for MMLU-Pro, using a smaller IRT-tuned dataset.

    Python