    Repositories list

    • Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.
      Shell
      150148414 Updated Feb 25, 2025
    • sb-cli

      Public
      Run SWE-bench evaluations remotely
      Python
      MIT License
      0530 Updated Feb 25, 2025
    • Landing page + leaderboard for the SWE-bench benchmark
      HTML
      4111 Updated Feb 25, 2025
    • SWE-bench

      Public
      SWE-bench [Multimodal]: Can Language Models Resolve Real-world GitHub Issues?
      Python
      MIT License
      4262.5k325 Updated Feb 24, 2025
    • .github

      Public
      0000 Updated Oct 24, 2024
    • Evaluation data + results for SWE-agent inference on the HumanEvalFix task
      Jupyter Notebook
      0000 Updated Jul 11, 2024