# What is Runloop?

> Runloop: Sandbox Tools for AI Agent Workflows

Runloop is the *batteries-included* platform for building and optimizing
AI-driven software engineering agents. [Get started now.](/docs/tutorials/quickstart)

With the Runloop platform, you get:

* [Devboxes](/docs/devboxes/overview): Our lightning-fast, secure, sandboxed development environments for executing agents & agent tools
* [Axons](/docs/axons/overview): Distributed event streams for sequencing, recording, and observing agent interactions in real time
* [Blueprints](/docs/devboxes/blueprints): Create & share templates for Devboxes with custom configuration
* [Snapshots](/docs/devboxes/snapshots): Save, suspend and resume activity for your Devboxes
* [Benchmarks](/docs/benchmarks/overview): Use state-of-the-art benchmarks and evals, or create your own, to measure and improve agent performance

Whether you are building an AI agent that responds to pull requests or one that generates new UI components, Runloop gets you from zero to production in just a few lines of code.

## Why Runloop?

At Runloop, our mission is to keep you focused on the things that improve your agents. Spend time on what actually matters, and leave the rest to us.

As your agents evolve, so will your needs. Runloop is designed for builders at all stages:

| Stage       | Why Runloop                                                                                                                          |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| Prototyping | <ul><li>Zero infrastructure worries using managed, instant-on devboxes.</li><li>Build, deploy, learn, and iterate quickly.</li></ul> |
| Production  | <ul><li>Team-shared blueprints and projects.</li><li>24/7/365 managed platform and on-call team.</li><li>SOC 2 compliant.</li></ul>  |
| Growth      | <ul><li>Benchmarking and evaluation stack to monitor and fine-tune your agent's performance.</li></ul>                               |

### Use cases

Our customers are already leveraging Runloop to build AI agents that can:

* Respond to pull requests and enhance the code review process
* Enable users to chat with and navigate their codebase
* Generate new test cases for existing codebases
* Act as pair programmers
* Generate new UI components for their frontend
* Create custom benchmarks to train your agent using Reinforcement Fine-Tuning (RFT)
* ...and many more

Not sure how to incorporate AI agents into your workflow?
Send us an email at [support@runloop.ai](mailto:support@runloop.ai) to learn more about how Runloop can help you build AI agents.

## Ready to get started?

<CardGroup cols={2}>
  <Card title="Quickstart" icon="rocket" href="/docs/tutorials/quickstart">
    Create your first devbox and run a command in under a minute.
  </Card>

  <Card title="Explore Features" icon="sparkles" href="/docs/overview/runloop-features">
    See everything the Runloop platform has to offer.
  </Card>
</CardGroup>

{/*
Removing APIs until they're ready for public access

### Code Scenarios
<Note>
Code Scenarios APIs are currently in beta. Please contact us at [support@runloop.ai](mailto:support@runloop.ai) to get access.
</Note>

Tuning the behavior of your AI agent is a critical part of making it work reliably. However, going from POC to production is where most AI agents fail.
Code Scenarios is a set of Benchmarking and Eval Tools that help you understand and improve your AI agent's behavior in a methodical way.
For example, with Code Scenarios:
- You can run your agent against well-known benchmarks such as SWE-bench or create your own custom benchmarks
- You can record live production agent traces and monitor agent performance, or use the traces to create new benchmarks
- You can create custom Reward Models based on production traces and benchmark data to fine-tune your agent's behavior

<CodeGroup>
  ```typescript TypeScript [expandable]
  import RunloopSDK from '@runloop/api-client';

  const runloop = new RunloopSDK(); // API Key is automatically loaded from "RUNLOOP_API_KEY" environment variable

  // 1. Create new Benchmarks or use existing ones like SWE-bench
  const myBenchmark = await runloop.api.benchmarks.create({
      benchmark: 'UI Component Generation',
      testCases: [
          {
              name: 'Create a Login Page',
              problemStatement: 'Create a login page that allows users to login to the application. Call the component `LoginPage` and export it from the file `src/components/LoginPage.tsx`.',
              // Configure specific starting Devbox environments for your Agent to use as part of a benchmark test
              environment: 'DEFAULT_DEVBOX',
              outputContractRules: [
                  {
                      type: 'typescript',
                      files: ["expected_snapshot.png"],
                      validate: (output) => {
                          // Validate the storybook snapshot is updated and roughly matches the expected_snapshot.png
                      }
                  }
              ]                
          }
      ]
  });

  // 2. Run your agent against the benchmark
  const benchmarkRun = await runloop.api.benchmarks.beginTestRun(myBenchmark.id);
  // For each test case run our agent and report the completion
  for (const testCase of myBenchmark.testCases) {
      const agentOutput = await myAgent.run({
          prompt: testCase.problemStatement,
          devbox: testCase.devbox
      });
      await runloop.api.benchmarks.reportTestCaseRun(testCaseRun.id, {
          output: agentOutput
      });
  }

  ```
</CodeGroup>
*/}