Pinned

  1. OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python · 5.7k stars · 621 forks

  2. dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python · 1.3k stars · 144 forks

  3. ai2thor Public

    An open-source platform for Visual AI.

    C# · 1.4k stars · 245 forks

  4. olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python · 13k stars · 937 forks

  5. OLMoE Public

    OLMoE: Open Mixture-of-Experts Language Models

    Jupyter Notebook · 788 stars · 72 forks

Repositories

Showing 10 of 505 repositories
  • ai2thor Public

    An open-source platform for Visual AI.

    C# · 1,422 stars · Apache-2.0 · 245 forks · 262 open issues · 5 open pull requests · Updated Jun 25, 2025
  • open-instruct Public

    AllenAI's post-training codebase

    Python · 3,027 stars · Apache-2.0 · 408 forks · 13 open issues · 9 open pull requests · Updated Jun 25, 2025
  • dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python · 1,250 stars · Apache-2.0 · 144 forks · 29 open issues · 16 open pull requests · Updated Jun 25, 2025
  • olmo-cookbook Public

    OLMost every training recipe you need to perform data interventions with the OLMo family of models.

    Python · 32 stars · Apache-2.0 · 7 forks · 0 open issues · 20 open pull requests · Updated Jun 25, 2025
  • OLMo-core Public

    PyTorch building blocks for the OLMo ecosystem

    Python · 240 stars · Apache-2.0 · 40 forks · 0 open issues · 27 open pull requests · Updated Jun 25, 2025
  • beaker-gantry Public

    Gantry streamlines running Python experiments in Beaker by managing containers and boilerplate for you

    Python · 23 stars · Apache-2.0 · 6 forks · 2 open issues · 2 open pull requests · Updated Jun 25, 2025
  • olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python · 13,036 stars · Apache-2.0 · 937 forks · 95 open issues · 3 open pull requests · Updated Jun 24, 2025
  • genesys Public

    2024 internship project on LLM-driven model discovery

    Python · 1 star · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 24, 2025
  • S2AND Public

    Semantic Scholar's Author Disambiguation Algorithm & Evaluation Suite

    Python · 93 stars · 19 forks · 6 open issues · 0 open pull requests · Updated Jun 24, 2025
  • codescientist Public

    CodeScientist: An automated scientific discovery system for code-based experiments

    Python · 271 stars · Apache-2.0 · 33 forks · 1 open issue · 0 open pull requests · Updated Jun 24, 2025