Pinned

  1. terminal-bench Public

    A benchmark for LLMs on complicated tasks in the terminal

    Python · 2.4k stars · 544 forks

  2. harbor Public

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    Python · 2.5k stars · 1.2k forks

  3. terminal-bench-2 Public

    Shell · 290 stars · 88 forks

  4. terminal-bench-3 Public

    Measuring agents' ability to get work done on a computer

    Python · 243 stars · 292 forks

  5. terminal-bench-science Public

    Terminal-Bench-Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal

    Python · 146 stars · 76 forks

  6. awesome-harbor Public

    A curated list of awesome Harbor ecosystem projects

    42 stars · 2 forks

Repositories

Showing 10 of 16 repositories
  • terminal-bench-science Public

    Terminal-Bench-Science: Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal

    Python · 146 stars · Apache-2.0 · 76 forks · 2 issues · 53 pull requests · Updated Jun 18, 2026
  • terminal-bench-3 Public

    Measuring agents' ability to get work done on a computer

    Python · 243 stars · 292 forks · 3 issues · 104 pull requests · Updated Jun 18, 2026
  • harbor Public

    Harbor is a framework for running agent evaluations and creating and using RL environments.

    Python · 2,546 stars · Apache-2.0 · 1,176 forks · 136 issues · 290 pull requests · Updated Jun 18, 2026
  • t-bench-docs Public
    TypeScript · 8 stars · 14 forks · 2 issues · 0 pull requests · Updated Jun 18, 2026
  • terminal-bench-challenges
    Shell · 11 stars · 3 forks · 0 issues · 2 pull requests · Updated Jun 18, 2026
  • harbor-adapters-experiments
    Python · 7 stars · Apache-2.0 · 12 forks · 0 issues · 0 pull requests · Updated Jun 16, 2026
  • docs Public
    MDX · 0 stars · MIT · 0 forks · 0 issues · 0 pull requests · Updated Jun 3, 2026
  • benchmark-template Public template

    Harbor Benchmark Template

    Python · 13 stars · 10 forks · 7 issues · 7 pull requests · Updated May 30, 2026
  • awesome-harbor Public

    A curated list of awesome Harbor ecosystem projects

    42 stars · 2 forks · 0 issues · 1 pull request · Updated May 29, 2026
  • harbor-datasets
    35 stars · 115 forks · 6 issues · 21 pull requests · Updated May 16, 2026
