# Complete a BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/complete-a-benchmarkrun

post /v1/benchmarks/runs/{id}/complete
Complete a currently running BenchmarkRun.


# Create a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/create-a-benchmark

post /v1/benchmarks
Create a Benchmark with a set of Scenarios.


# Get a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/get-a-benchmark

get /v1/benchmarks/{id}
Get a previously created Benchmark.


# Get a previously created BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/get-a-previously-created-benchmarkrun

get /v1/benchmarks/runs/{id}
Get a BenchmarkRun given ID.


# List BenchmarkRuns.
Source: https://docs.runloop.ai/api-reference/benchmark/list-benchmarkruns

get /v1/benchmarks/runs
List all BenchmarkRuns matching filter.


# List Benchmarks.
Source: https://docs.runloop.ai/api-reference/benchmark/list-benchmarks

get /v1/benchmarks
List all Benchmarks matching filter.


# List Public Benchmarks.
Source: https://docs.runloop.ai/api-reference/benchmark/list-public-benchmarks

get /v1/benchmarks/list_public
List all public benchmarks matching filter.


# Start a new BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/start-a-new-benchmarkrun

post /v1/benchmarks/start_run
Start a new BenchmarkRun based on the provided Benchmark.


# Update a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/update-a-benchmark

post /v1/benchmarks/{id}
Update a Benchmark with a set of Scenarios.


# Create and build a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/create-and-build-a-blueprint

post /v1/blueprints
Start the build of a custom-defined container Blueprint. The Blueprint begins in the 'provisioning' step and transitions to the 'building' step once it is selected off the build queue. When the build completes successfully, it transitions to 'building_complete'.


# Delete a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/delete-a-blueprint

post /v1/blueprints/{id}/delete
Delete a previously created Blueprint.


# Get a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/get-a-blueprint

get /v1/blueprints/{id}
Get the details of a previously created Blueprint including the build status.


# Get Blueprint build logs.
Source: https://docs.runloop.ai/api-reference/blueprint/get-blueprint-build-logs

get /v1/blueprints/{id}/logs
Get all logs from the building of a Blueprint.


# List Blueprints.
Source: https://docs.runloop.ai/api-reference/blueprint/list-blueprints

get /v1/blueprints
List all Blueprints or filter by name.


# Preview Dockerfile definition for a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/preview-dockerfile-definition-for-a-blueprint

post /v1/blueprints/preview
Preview building a Blueprint with the specified configuration. You can take the resulting Dockerfile and test out your build using any local docker tooling.


# The Blueprint Object
Source: https://docs.runloop.ai/api-reference/blueprint/the-blueprint-object


Blueprints are customized starting points for Devboxes. They let you cache environment setup so that Devboxes boot faster.


# Create a Browser.
Source: https://docs.runloop.ai/api-reference/browser/create-a-browser

post /v1/devboxes/browsers
Create a Devbox that has a managed Browser and begin the boot process. As part of booting the Devbox, the browser will automatically be started with connection utilities activated.


# Get Browser Details.
Source: https://docs.runloop.ai/api-reference/browser/get-browser-details

get /v1/devboxes/browsers/{id}


# Create a Computer.
Source: https://docs.runloop.ai/api-reference/computer/create-a-computer

post /v1/devboxes/computers
Create a Computer and begin the boot process. The Computer will initially launch in the 'provisioning' state while Runloop allocates the necessary infrastructure. It will transition to the 'initializing' state while the booted Computer runs any Runloop or user-defined setup scripts. Finally, the Computer will transition to the 'running' state when it is ready for use.


# Get Computer Details.
Source: https://docs.runloop.ai/api-reference/computer/get-computer-details

get /v1/devboxes/computers/{id}


# Asynchronously execute a command via the Devbox shell
Source: https://docs.runloop.ai/api-reference/devbox/asynchronously-execute-a-command-via-the-devbox-shell

post /v1/devboxes/{id}/execute_async
Execute the given command in the Devbox shell asynchronously and return an execution that can be used to track the command's progress.


# Create a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-devbox

post /v1/devboxes
Create a Devbox and begin the boot process. The Devbox will initially launch in the 'provisioning' state while Runloop allocates the necessary infrastructure. It will transition to the 'initializing' state while the booted Devbox runs any Runloop or user-defined setup scripts. Finally, the Devbox will transition to the 'running' state when it is ready for use.
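
The boot sequence above can be followed by polling the Devbox status until it reaches 'running'. The sketch below is one way to do that against the raw API; the SDK examples elsewhere in these docs use `devboxes.awaitRunning` for the same purpose. `isBooting`, `waitUntilRunning`, and the injected `getStatus` and `sleep` callbacks are hypothetical names, and treating any status outside the three documented states as a failure is an assumption.

```typescript
// True while the Devbox is still in one of the documented pre-running states.
function isBooting(status: string): boolean {
  return status === "provisioning" || status === "initializing";
}

// Polls until the Devbox reports 'running'. `getStatus` and `sleep` are injected
// so the sketch does not assume any particular HTTP client or timer.
async function waitUntilRunning(
  getStatus: () => Promise<string>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 60,
  intervalMs = 1000,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === "running") {
      return;
    }
    if (!isBooting(status)) {
      // Assumption: any other status means the boot did not succeed.
      throw new Error(`Devbox left the boot sequence with status '${status}'`);
    }
    await sleep(intervalMs);
  }
  throw new Error("Timed out waiting for the Devbox to reach 'running'");
}
```

In practice `getStatus` would wrap a call to `get /v1/devboxes/{id}` and return the status it reports.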


# Create a disk snapshot of a running Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-disk-snapshot-of-a-running-devbox

post /v1/devboxes/{id}/snapshot_disk
Create a disk snapshot of a devbox with the specified name and metadata to enable launching future Devboxes with the same disk state.


# Create a tunnel to an available port on the Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-tunnel-to-an-available-port-on-the-devbox

post /v1/devboxes/{id}/create_tunnel
Create a live tunnel to an available port on the Devbox. Note the port must be made available using Devbox.create.availablePorts. Otherwise, the tunnel will not connect to any running processes on the Devbox.


# Create an SSH key for a Devbox
Source: https://docs.runloop.ai/api-reference/devbox/create-an-ssh-key-for-a-devbox

post /v1/devboxes/{id}/create_ssh_key
Create an SSH key for a Devbox to enable remote access.


# Delete a disk snapshot of a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/delete-a-disk-snapshot-of-a-devbox

post /v1/devboxes/disk_snapshots/{id}/delete
Delete a previously taken disk snapshot of a Devbox.


# Download binary file contents from Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/download-binary-file-contents-from-devbox-filesystem

post /v1/devboxes/{id}/download_file
Download file contents of any type (binary, text, etc) from a specified path on the Devbox.


# Get Devbox details.
Source: https://docs.runloop.ai/api-reference/devbox/get-devbox-details

get /v1/devboxes/{id}
Get the latest details and status of a Devbox.


# Get Devbox logs.
Source: https://docs.runloop.ai/api-reference/devbox/get-devbox-logs

get /v1/devboxes/{id}/logs
Get all logs from a running or completed Devbox.


# Get status of an asynchronous execution on a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/get-status-of-an-asynchronous-execution-on-a-devbox

get /v1/devboxes/{devbox_id}/executions/{execution_id}
Get the latest status of a previously launched asynchronous execution, including stdout/stderr and the exit code if complete.
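
Endpoint paths in this reference use `{name}` placeholders for IDs. As a rough illustration of how such a template is turned into a concrete path, here is a small helper. `fillPath` is a hypothetical function, not part of any Runloop SDK, and the IDs are made-up placeholders.

```typescript
// Hypothetical helper: substitutes `{name}` placeholders in an endpoint template.
function fillPath(template: string, params: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, name: string) => {
    const value = params[name];
    if (value === undefined) {
      throw new Error(`Missing path parameter: ${name}`);
    }
    // Encode the value so it cannot break out of its path segment.
    return encodeURIComponent(value);
  });
}

const executionPath = fillPath(
  "/v1/devboxes/{devbox_id}/executions/{execution_id}",
  { devbox_id: "dbx_example", execution_id: "exec_example" }, // placeholder IDs
);
// executionPath === "/v1/devboxes/dbx_example/executions/exec_example"
```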


# Kill an asynchronous execution currently running on a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/kill-an-asynchronous-execution-currently-running-on-a-devbox

post /v1/devboxes/{id}/executions/{execution_id}/kill


# List Devboxes.
Source: https://docs.runloop.ai/api-reference/devbox/list-devboxes

get /v1/devboxes
List all Devboxes while optionally filtering by status.


# List disk snapshots of a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/list-disk-snapshots-of-a-devbox

get /v1/devboxes/disk_snapshots
List all snapshots of a Devbox while optionally filtering by Devbox ID.


# Live Tail Devbox Logs.
Source: https://docs.runloop.ai/api-reference/devbox/live-tail-devbox-logs

get /v1/devboxes/{id}/logs/tail
Tail the logs for the given Devbox. The response returns past log entries first, then streams new entries until the connection is closed.


# Read text file contents from Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/read-text-file-contents-from-devbox-filesystem

post /v1/devboxes/{id}/read_file_contents
Read the contents of a file on a Devbox and return them as a UTF-8 string. For large files (greater than 100MB), use 'downloadFile' instead.


# Remove an open tunnel on the Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/remove-an-open-tunnel-on-the-devbox

post /v1/devboxes/{id}/remove_tunnel
Remove a previously opened tunnel on the Devbox.


# Shutdown a running Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/shutdown-a-running-devbox

post /v1/devboxes/{id}/shutdown
Shut down a running Devbox. This permanently stops the Devbox. To save its state, take a snapshot before shutting down, or suspend the Devbox instead.


# Synchronously execute a shell command on a Devbox
Source: https://docs.runloop.ai/api-reference/devbox/synchronously-execute-a-shell-command-on-a-devbox

post /v1/devboxes/{id}/execute_sync
Execute a bash command in the Devbox shell, await the command completion and return the output.


# The Devbox Object
Source: https://docs.runloop.ai/api-reference/devbox/the-devbox-object


A Devbox represents a virtual development environment. It is an isolated sandbox that can be given to agents and used to run arbitrary code such as AI generated code.


# Upload binary file contents to Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/upload-binary-file-contents-to-devbox-filesystem

post /v1/devboxes/{id}/upload_file
Upload file contents of any type (binary, text, etc) to a Devbox. Note this API is suitable for large files (larger than 100MB) and efficiently uploads files via multipart form data.


# Write text file contents to Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/write-text-file-contents-to-devbox-filesystem

post /v1/devboxes/{id}/write_file_contents
Write UTF-8 string contents to a file at path on the Devbox. Note for large files (larger than 100MB), the upload_file endpoint must be used.
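
The two write endpoints split along file type and size: `write_file_contents` for UTF-8 text and `upload_file` for binary content or anything over 100MB. A minimal sketch of that decision follows. The helper names are hypothetical, the Devbox ID is a placeholder, and reading "100MB" as 100 × 1024 × 1024 bytes is an assumption.

```typescript
// 100MB threshold from the endpoint descriptions; binary megabytes are an assumption.
const LARGE_FILE_BYTES = 100 * 1024 * 1024;

type FileWriteEndpoint = "write_file_contents" | "upload_file";

// Picks the endpoint for a file based on its size and whether it is UTF-8 text.
function chooseWriteEndpoint(sizeBytes: number, isUtf8Text: boolean): FileWriteEndpoint {
  // Binary content or anything over the threshold goes through multipart upload.
  if (!isUtf8Text || sizeBytes > LARGE_FILE_BYTES) {
    return "upload_file";
  }
  return "write_file_contents";
}

// Builds the request path for the chosen endpoint.
function writePath(devboxId: string, sizeBytes: number, isUtf8Text: boolean): string {
  return `/v1/devboxes/${devboxId}/${chooseWriteEndpoint(sizeBytes, isUtf8Text)}`;
}

const smallTextPath = writePath("dbx_example", 2048, true);
const largeTextPath = writePath("dbx_example", 200 * 1024 * 1024, true);
const binaryPath = writePath("dbx_example", 2048, false);
```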


# Introduction
Source: https://docs.runloop.ai/api-reference/introduction

Welcome to the Runloop API Reference Documentation

## Authentication

All API endpoints are authenticated using Bearer tokens generated via the [API Key section of the Runloop Dashboard](https://platform.runloop.ai/manage/keys).

```bash
curl -X 'POST' \
  'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H 'Content-Type: application/json'  
```
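
For callers not using an SDK, the same request can be assembled in TypeScript. The sketch below only builds the URL and headers shown in the curl example and does not send anything; `prepareRequest` and `PreparedRequest` are hypothetical names, not part of the Runloop SDKs.

```typescript
// Base URL taken from the curl example above.
const RUNLOOP_BASE_URL = "https://api.runloop.ai";

interface PreparedRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
}

// Hypothetical helper: builds the pieces of an authenticated request without sending it.
function prepareRequest(
  method: "GET" | "POST",
  path: string,
  apiKey: string,
): PreparedRequest {
  if (apiKey.trim() === "") {
    throw new Error("An API key is required");
  }
  return {
    url: `${RUNLOOP_BASE_URL}${path}`,
    method,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
  };
}

const createDevboxRequest = prepareRequest("POST", "/v1/devboxes", "<YOUR_API_KEY>");
```

The resulting `url`, `method`, and `headers` can be handed to `fetch` or any other HTTP client.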

## SDKs

To make it easier to interact with the Runloop API, we've created SDKs in a variety of languages.

* [Runloop Python SDK](https://github.com/runloopai/api-client-python)
* [Runloop TypeScript SDK](https://github.com/runloopai/api-client-ts)

Please reach out if you need SDKs in other languages.


# Create a Repository Connection.
Source: https://docs.runloop.ai/api-reference/repository/create-a-repository-connection

post /v1/repositories
Create a connection to a GitHub Repository and trigger an initial inspection of the repo's technical stack and developer environment requirements.


# Delete a Repository Connection and associated objects.
Source: https://docs.runloop.ai/api-reference/repository/delete-a-repository-connection-and-associated-objects

post /v1/repositories/{id}/delete
Permanently delete a Repository Connection, including any automatically generated inspection insights.


# Get Repository Connection details.
Source: https://docs.runloop.ai/api-reference/repository/get-repository-connection-details

get /v1/repositories/{id}
Get Repository Connection details including latest inspection status and generated repository insights.


# List analyzed repository versions.
Source: https://docs.runloop.ai/api-reference/repository/list-analyzed-repository-versions

get /v1/repositories/{id}/versions
List all analyzed versions of a repository connection including automatically generated insights for each version.


# List available repository connections.
Source: https://docs.runloop.ai/api-reference/repository/list-available-repository-connections

get /v1/repositories
List all available repository connections.


# The Repository Object
Source: https://docs.runloop.ai/api-reference/repository/the-repository-object


The Repository object manages the link to a remote repository and the automated analysis of the repository.


# Trigger inspection of the latest version of a repository connection.
Source: https://docs.runloop.ai/api-reference/repository/trigger-inspection-of-the-latest-version-of-a-repository-connection

post /v1/repositories/{id}/inspect_latest
Trigger inspection of the latest version of a repository, including the repo's technical stack and developer environment requirements.


# Complete a ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/complete-a-scenariorun

post /v1/scenarios/runs/{id}/complete
Complete a currently running ScenarioRun. Calling complete will shut down the underlying Devbox resource.


# Create a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/create-a-custom-scenario-scorer

post /v1/scenarios/scorers
Create a custom scenario scorer.


# Create a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/create-a-scenario

post /v1/scenarios
Create a Scenario, a repeatable AI coding evaluation test that defines the starting environment as well as evaluation success criteria.


# Get a previously created ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/get-a-previously-created-scenariorun

get /v1/scenarios/runs/{id}
Get a ScenarioRun given ID.


# Get a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/get-a-scenario

get /v1/scenarios/{id}
Get a previously created scenario.


# List Public Scenarios.
Source: https://docs.runloop.ai/api-reference/scenario/list-public-scenarios

get /v1/scenarios/list_public
List all public scenarios matching filter.


# List Scenario Scorers.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenario-scorers

get /v1/scenarios/scorers
List all Scenario Scorers matching filter.


# List ScenarioRuns.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenarioruns

get /v1/scenarios/runs
List all ScenarioRuns matching filter.


# List Scenarios.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenarios

get /v1/scenarios
List all Scenarios matching filter.


# Retrieve Scenario Scorer.
Source: https://docs.runloop.ai/api-reference/scenario/retrieve-scenario-scorer

get /v1/scenarios/scorers/{id}
Retrieve Scenario Scorer.


# Score a ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/score-a-scenariorun

post /v1/scenarios/runs/{id}/score
Score a currently running ScenarioRun.


# Start a new ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/start-a-new-scenariorun

post /v1/scenarios/start_run
Start a new ScenarioRun based on the provided Scenario.


# Update a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/update-a-custom-scenario-scorer

post /v1/scenarios/scorers/{id}
Update a scenario scorer.


# Update a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/update-a-scenario

post /v1/scenarios/{id}
Update a Scenario, a repeatable AI coding evaluation test that defines the starting environment as well as evaluation success criteria.


# Validate a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/validate-a-custom-scenario-scorer

post /v1/scenarios/scorers/{id}/validate
Validate a scenario scorer.


# Build Custom Agent Benchmarks with Runloop
Source: https://docs.runloop.ai/benchmarks/custom-benchmarks

Learn how to create and run custom benchmarks. We're personally excited about this part of our platform - let us know at support@runloop.ai if you need any help!

## Creating Custom Scenarios

Creating custom scenarios allows users to tailor problem statements and environments specific to their needs. This is useful for testing agents in controlled conditions or building unique challenges.

To define your own scenario:

1. Create a development environment (devbox).
2. Take a snapshot of the environment at a key point in time.
3. Define a problem statement for the scenario.
4. Attach scoring functions to measure performance.

Example:

<CodeGroup>
  ```typescript TypeScript
  const devbox = await runloop.devboxes.create({ blueprint_name: "bpt_123" });
  const mySnapshot = await runloop.devboxes.snapshotDisk(devbox.id, {
      name: 'div incorrectly centered in flexbox',
  });

  const myNewScenario = await runloop.scenarios.create({
      name: 'My New Scenario',
      input_context: { problem_statement: 'Create a UI component' },
      environment_parameters: { snapshot_id: mySnapshot.id },
      scoring_contract: {
          scoring_function_parameters: [{
              name: 'bash_scorer',
              scorer: {
                  type: 'bash_script_scorer',
                  bash_script: 'some script that writes files and validates output',
              },
              weight: 1.0,
          }],
      },
  });
  ```
</CodeGroup>

## Understanding Scoring Functions

Scoring functions validate whether a scenario was successfully completed. These functions help ensure solutions are correct, provide feedback, and assign a score for evaluation.

### Basic Scoring Function Example

A simple scoring function is a bash script that echoes a score between 0 and 1:

<CodeGroup>
  ```typescript TypeScript
  scoring_function_parameters: [{
      name: 'my-custom-pytest-script',
      scorer: {
          name: 'bash_scorer',
          type: 'bash_script_scorer',
          bash_script: 'echo 1.0',
      },
      weight: 1.0,
  }]
  ```
</CodeGroup>
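
Each entry in `scoring_function_parameters` carries a `weight`. How the platform combines weighted scores is not spelled out here, so the sketch below assumes one plausible rule: a weight-normalized average of per-scorer scores, each in the range 0 to 1. `ScorerResult` and `combineScores` are hypothetical names used only for illustration.

```typescript
interface ScorerResult {
  name: string;
  score: number; // value reported by the scoring function, expected in [0, 1]
  weight: number; // weight taken from scoring_function_parameters
}

// Assumption: the overall score is the weight-normalized average of scorer scores.
function combineScores(results: ScorerResult[]): number {
  if (results.length === 0) {
    throw new Error("At least one scorer result is required");
  }
  let weightedSum = 0;
  let totalWeight = 0;
  for (const result of results) {
    if (result.score < 0 || result.score > 1) {
      throw new Error(`Score out of range for ${result.name}`);
    }
    if (result.weight < 0) {
      throw new Error(`Negative weight for ${result.name}`);
    }
    weightedSum += result.score * result.weight;
    totalWeight += result.weight;
  }
  if (totalWeight === 0) {
    throw new Error("Weights must not all be zero");
  }
  return weightedSum / totalWeight;
}

const overallScore = combineScores([
  { name: "unit-tests", score: 1.0, weight: 0.5 },
  { name: "lint", score: 0.5, weight: 0.5 },
]);
// overallScore === 0.75
```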

### Custom Scoring Functions

To make scoring more reusable and flexible, you can define **custom scoring functions**. These are used to evaluate performance in specific ways, such as running tests or analyzing output logs.

Example:

<CodeGroup>
  ```typescript TypeScript
  const myCustomScenario = await runloop.scenarios.create({
      name: 'scenario with custom scorer',
      input_context: { problem_statement: 'Create a UI component' },
      environment_parameters: { snapshot_id: mySnapshot.id },
      scoring_contract: {
          scoring_function_parameters: [{
              name: 'my-custom-pytest-script',
              scorer: {
                  type: 'custom_scorer',
                  custom_scorer_type: 'my-custom-pytest-script',
                  scorer_params: { relevant_tests: ['foo.test.py', 'bar.test.py'] },
              },
              weight: 1.0,
          }],
      },
  });

  ```
</CodeGroup>

### Custom benchmarks

Once you have your scenarios and scoring functions defined, you can run all of your custom scenarios as a **custom benchmark**.

You'll need to create the benchmark instance first, then run it. Here's how:

<CodeGroup>
  ```typescript TypeScript
  const myBenchmark = await runloop.benchmarks.create({
      name: 'py bench',
      scenarios: [myNewScenario.id, myCustomScenario.id]
  })
  ```
</CodeGroup>

You can update both scenarios and benchmarks at any time, so you can build up a benchmark incrementally.


# Overview of Benchmarking on Runloop
Source: https://docs.runloop.ai/benchmarks/overview

Make your agent better and more reliable with Runloop's tools for benchmarking.

Your AI coding agent is capable of numerous tasks such as reading code, writing and preparing patches, and submitting commits to code repositories.

A common problem with such agents is ensuring that they *perform*: Without monitoring, tuning and optimization, your agent may be prone to making mistakes, experience regressions over time, and generally not deliver the best user experience.

Runloop Benchmarking is a suite of tools to help you address these issues and stay focused on building the best possible agent.

## Main Features

Runloop Benchmarking includes several tools to save you time while optimizing your agent:

* **Run Public Benchmarks:** Easily run your agent against a matrix of well-known and open source benchmarks, such as SWE-bench.
* **Run Custom Benchmarks:** Write custom scoring functions for each of your agent's tasks, then evaluate the agent's performance against them.
* **Reports & Insights:** As you run benchmarks over time, you will see how your agent's performance changes in the Runloop dashboard.

## Key Concepts

Whether you're using public or custom benchmarks, you'll keep the following key concepts in mind:

* **Code Scenario**: A single test case where an agent is given a problem and is expected to modify a target environment to solve it. Scenarios help test AI agents in realistic coding environments.
* **Scoring Function**: A script or function that runs after the agent completes its task to validate whether the solution works. These functions generate a final score between 0 and 1 to indicate performance.
* **Benchmark**: A collection of Code Scenarios designed to evaluate AI agents on a broader set of tasks. Benchmarks help measure agent capabilities systematically.

Next, learn how to [run public benchmarks](/benchmarks/public-benchmarks).


# Public Benchmarks
Source: https://docs.runloop.ai/benchmarks/public-benchmarks

Learn how to easily run your agent against popular public benchmarks.

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/public_benchmarks_example" />

## Public Benchmarks

Runloop Public Benchmarks make it simple to validate your coding agent against the most popular, open source
coding evaluation datasets.

Each Benchmark contains a set of Scenarios based on each test in the dataset. The Scenario contains the **problem statement** that your agent
must work through, a pre-built **environment** containing all of the context needed to complete the job, and a built-in **scoring contract**
to properly evaluate the result for correctness.

## Viewing Public Benchmarks

We're constantly adding new supported datasets. To get the current list of supported public Benchmarks, use the following API call:

<CodeGroup>
  ```typescript TypeScript
  // Query to see the latest list of supported public benchmarks
  // princeton-nlp/SWE-bench_Lite, etc
  const { benchmarks } = await runloop.benchmarks.listPublic();
  ```
</CodeGroup>

<Note>Are we missing your favorite open source benchmark? Let us know at [support@runloop.ai](mailto:support@runloop.ai)</Note>

Each Benchmark contains a set of Scenarios, each of which corresponds to a test case in the evaluation dataset.

<CodeGroup>
  ```typescript TypeScript
  // The Benchmark definition contains a list of all scenarios contained in the benchmark
  console.log(benchmarks[0].scenarioIds)
  ```
</CodeGroup>

## Running Scenarios & Benchmarks

Each Scenario can be **run** to evaluate an AI agent's performance. Running a scenario involves:

1. Initiating a scenario run.
2. Launching a development environment (devbox).
3. Running the agent against the problem statement.
4. Scoring the results.
5. Uploading traces for analysis.

### Run a single scenario from a public benchmark

Here's an example of how to run a single scenario from a public benchmark against your own agent.

First, create a **scenario run** to track the status and results of this run:

<CodeGroup>
  ```typescript TypeScript
  const scenarioId = benchmarks[0].scenarioIds[0]
  const scenarioRun = await runloop.scenarios.startRun({
      scenario_id: scenarioId,
      run_name: 'marshmallow-code__marshmallow-1359 test run'
  });
  ```
</CodeGroup>

When starting a run, Runloop will create a Devbox with the *environment*
specified by the test requirements.

Wait for the devbox used by the scenario to become ready:

<CodeGroup>
  ```typescript TypeScript
  const devboxId = scenarioRun.devbox_id;
  await runloop.devboxes.awaitRunning(devboxId);
  ```
</CodeGroup>

Now, run your agent. How and where your agent runs is up to you. Here's an example of an agent that leverages the Runloop Devbox that was just created:

<CodeGroup>
  ```typescript TypeScript
  const myAgent = new MyAgent({
      prompt: scenarioRun.scenario.context.problemStatement,
      tools: [runloop.devboxes.shellTools(devboxId)],
  });
  ```
</CodeGroup>

Finally, run the scoring function to validate the agent's performance:

<CodeGroup>
  ```typescript TypeScript
  // Run the scoring function. Automatically marks the scenario run as done.
  const validateResults = await runloop.scenarios.runs.scoreAndAwait(
      scenarioRun.id
  );
  console.log(validateResults);
  ```
</CodeGroup>

### Perform a full benchmark run of a public benchmark

Once your agent is excelling at an individual scenario, you will want to test
against all Scenarios for a given Benchmark.

Here's an example of how to perform a full benchmark run of a public benchmark.

<CodeGroup>
  ```typescript TypeScript
  // Start a full run of the first public benchmark returned
  let benchmarkRun = await runloop.benchmarks.startRun({
      benchmark_id: benchmarks[0].id,
      run_name: 'optional run name'
  });

  // Run the scenarios one at a time. A `for...of` loop awaits each scenario before
  // starting the next; the work can also be parallelized to any degree.
  for (const scenarioId of benchmarkRun.pending_scenarios) {
      // Create a scenario run tied to the benchmark run
      const scenarioRun = await runloop.scenarios.startRunAndAwaitEnvReady({
          scenario_id: scenarioId,
          benchmark_run_id: benchmarkRun.id
      });

      const devboxId = scenarioRun.devbox_id;

      // Run your agent on the problem at hand to see how it does
      const myAgent = new MyAgent({
          prompt: scenarioRun.scenario.context.problemStatement,
          tools: [runloop.devboxes.shellTools(devboxId)],
      });

      // Score and complete the run. This also shuts down the Devbox environment.
      const validateResults = await runloop.scenarios.runs.scoreAndComplete(
          scenarioRun.id
      );
  }

  // A benchmark run ends automatically once no scenarios are pending.
  // You can also end it early by completing it explicitly:
  await runloop.benchmarks.runs.complete(benchmarkRun.id)
  ```
</CodeGroup>
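
If you want bounded parallelism rather than one scenario at a time, one simple approach is to process the pending scenario IDs in fixed-size batches. The sketch below is generic and does not call the Runloop API; `chunk` and `runInBatches` are hypothetical helpers, and the batch size is whatever your agent and account limits can sustain.

```typescript
// Splits a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Runs `worker` over every item with at most `limit` calls in flight at once.
// Each batch finishes completely before the next one starts.
async function runInBatches<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, limit)) {
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```

Here `worker` would start the scenario run, drive your agent, and score the run for a single scenario ID. Awaiting `runInBatches` before completing the benchmark run ensures every scenario has been scored first.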

Public Benchmarks make it fast and easy to start evaluating your agent against industry standard coding evaluations.
When you're ready to expand or customize Benchmarks that meet your specific needs, move on to creating [Custom Benchmarks](/benchmarks/custom-benchmarks).


# Quickstart - Controlling a Browser in a Runloop Devbox
Source: https://docs.runloop.ai/devboxes/addons/browser

Learn how to control a browser programmatically inside a Runloop Devbox using the Runloop SDK.

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/runloop-demo-browser" />

## Introduction

This guide will walk you through using the **Runloop SDK** to control a browser inside a **Runloop Devbox**. The Runloop API provides a **browser-ready Devbox**, enabling AI agents to interact with web pages programmatically.

<Steps>
  <Step title="Set Up Your Environment">
    Set up your authentication key:

    ```bash
    export RUNLOOP_API_KEY="your-api-key"
    ```
  </Step>

  <Step title="Install and Initialize the Runloop SDK">
    First, install the Runloop SDK if you haven't already:

    <CodeGroup>
      ```bash Python
      pip install runloop_api_client
      ```

      ```bash TypeScript
      npm install @runloop/api-client
      ```
    </CodeGroup>

    Then, import and initialize the SDK:

    <CodeGroup>
      ```python Python
      from runloop_api_client import Runloop

      client = Runloop(bearer_token="your-api-key")
      ```

      ```typescript TypeScript
      import Runloop from '@runloop/api-client';

      const client = new Runloop({
        bearerToken: 'your-api-key'
      });
      ```
    </CodeGroup>

    This `client` object allows interaction with the Runloop API.
  </Step>

  <Step title="Create a Devbox and Start the Browser">
    Set up your **browser-ready Devbox** and obtain the connection details:

    <CodeGroup>
      ```python Python
      # Create a Devbox with a browser instance
      browser = client.devboxes.browsers.create()

      # Wait for the Devbox to be fully running
      client.devboxes.await_running(browser.devbox.id)

      # View your remote browser here:
      browser.live_view_url

      # Connect to your browser here:
      browser.connection_url 
      ```

      ```typescript TypeScript
      // Create a Devbox with a browser instance
      const browser = await client.devboxes.browsers.create();

      // Wait for the Devbox to be fully running
      await client.devboxes.awaitRunning(browser.devbox.id);

      // View your remote browser here:
      console.log(browser.live_view_url);

      // Connect to your browser here:
      console.log(browser.connection_url);
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to the Browser using Playwright">
    To interact with the browser, you can use automation tools like **Selenium, Puppeteer, or Playwright**. Here's an example that connects with **Playwright over the Chrome DevTools Protocol (CDP)**:

    <CodeGroup>
      ```python Python
      from playwright.async_api import async_playwright

      # Start Playwright (call `await playwright.stop()` when you are done)
      playwright = await async_playwright().start()

      # Connect to the remote browser using the connection URL from the previous step
      cdp_browser = await playwright.chromium.connect_over_cdp(browser.connection_url)

      # Reuse the browser's default context, which holds any pages that are already open
      context = cdp_browser.contexts[0]

      # Access the first page in the context's list of pages
      page = context.pages[0]
      ```

      ```typescript TypeScript
      import { chromium } from 'playwright';

      // Connect to the remote browser using the connection URL from the previous step
      const cdpBrowser = await chromium.connectOverCDP(browser.connection_url);

      // Reuse the browser's default context, which holds any pages that are already open
      const context = cdpBrowser.contexts()[0];

      // Access the first page in the context's list of pages
      const page = context.pages()[0];
      ```
    </CodeGroup>
  </Step>

  <Step title="Defining Tools for AI Agents">
    You can create **custom tools** for AI agents to interact with the browser programmatically. Here's an example of a **navigation tool** using Playwright:

    <CodeGroup>
      ```python Python
      from playwright.async_api import async_playwright

      class NavigateTool:
          """A tool for navigating to a URL using Playwright."""

          async def __call__(self, *, url: str):
              async with async_playwright() as p:
                  browser = await p.chromium.launch()
                  page = await browser.new_page()
                  await page.goto(url)
                  content = await page.content()
                  await browser.close()
                  return {"output": f"Navigated to {url}", "content": content[:500]}

          def to_params(self):
              return {
                  "name": "navigate_tool",
                  "description": "Navigates to a URL and retrieves content.",
                  "input_schema": {
                      "type": "object",
                      "properties": {"url": {"type": "string"}},
                      "required": ["url"],
                  },
              }
      ```

      ```typescript TypeScript
      import { chromium, Browser, Page } from 'playwright';

      class NavigateTool {
          /**
           * A tool for navigating to a URL using Playwright.
           */
          
          async call({ url }: { url: string }) {
              const browser = await chromium.launch();
              const page = await browser.newPage();
              await page.goto(url);
              const content = await page.content();
              await browser.close();
              return { 
                  output: `Navigated to ${url}`, 
                  content: content.slice(0, 500) 
              };
          }

          toParams() {
              return {
                  name: "navigate_tool",
                  description: "Navigates to a URL and retrieves content.",
                  input_schema: {
                      type: "object",
                      properties: { url: { type: "string" } },
                      required: ["url"],
                  },
              };
          }
      }
      ```
    </CodeGroup>
  </Step>

  <Step title="Passing Tools to an AI Agent">
    Now, you can pass this tool to an AI agent, enabling it to use the browser autonomously:

    <CodeGroup>
      ```python Python
      tool_instance = NavigateTool()

      response = client.messages.create(
          model=model,
          max_tokens=max_tokens,
          messages=messages,
          tools=[tool_instance.to_params()]
      )
      ```

      ```typescript TypeScript
      const toolInstance = new NavigateTool();

      const response = await client.messages.create({
          model,
          max_tokens: maxTokens,
          messages,
          tools: [toolInstance.toParams()]
      });
      ```
    </CodeGroup>

    <Note>
      Different LLM providers have their own specific formats and requirements for defining and passing tools. Make sure to reference your LLM provider's documentation for the correct implementation details of tool schemas and function calling.
    </Note>
  </Step>

  <Step title="Properly Freeing Resources">
    To ensure efficient resource management, **always shut down the Devbox** when you're done:

    <CodeGroup>
      ```python Python
      client.devboxes.shutdown(browser.devbox.id)
      ```

      ```typescript TypeScript
      await client.devboxes.shutdown(browser.devbox.id);
      ```
    </CodeGroup>
  </Step>
</Steps>
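
The steps above define a tool and hand its schema to the model, but they stop before the point where the model asks for the tool to be run. The following is a minimal sketch of that dispatch step and is not part of the Runloop SDK. It assumes a tool request arrives as a dictionary with `name` and `input` keys (a hypothetical shape; your LLM provider's format may differ). `EchoTool` and `dispatch_tool_call` are illustrative names, with `EchoTool` standing in for `NavigateTool` so the example runs without a browser.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class EchoTool:
    """Stand-in for NavigateTool so the sketch runs without a browser."""

    async def __call__(self, *, url: str) -> Dict[str, str]:
        return {"output": f"Navigated to {url}"}

    def to_params(self) -> Dict[str, Any]:
        return {
            "name": "navigate_tool",
            "description": "Navigates to a URL and retrieves content.",
            "input_schema": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        }


async def dispatch_tool_call(
    tools: Dict[str, Callable[..., Awaitable[Any]]],
    request: Dict[str, Any],
) -> Dict[str, Any]:
    """Look up the requested tool by name and call it with the model-supplied input."""
    tool = tools.get(request["name"])
    if tool is None:
        # Report unknown tools back to the caller instead of raising.
        return {"error": f"Unknown tool: {request['name']}"}
    return await tool(**request["input"])


tool = EchoTool()
registry = {tool.to_params()["name"]: tool}

# Assumed request shape: {"name": ..., "input": {...}}
request = {"name": "navigate_tool", "input": {"url": "https://example.com"}}
result = asyncio.run(dispatch_tool_call(registry, request))
print(result["output"])
```

Keying the registry on the name returned by `to_params()` keeps the schema sent to the model and the dispatch table in sync.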

## Additional Resources

* [Runloop GitHub Repository](https://github.com/runloopai/examples) - Explore more examples.
* [Runloop API Documentation](https://docs.runloop.ai) - Official API reference.


# Quickstart - Controlling a remote computer in a Runloop Devbox
Source: https://docs.runloop.ai/devboxes/addons/computer

Learn how to control a computer programmatically inside a Runloop Devbox using the Runloop SDK

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/runloop-demo-computer" />

## Introduction

This guide will walk you through using the **Runloop SDK** to control a remote computer inside a **Runloop Devbox**. The Runloop API provides a **computer-ready Devbox**, enabling AI agents to interact with the system programmatically.

<Steps>
  <Step title="Set Up Your Environment">
    Set up your authentication key:

    ```bash
    export RUNLOOP_API_KEY="your-api-key"
    ```
  </Step>

  <Step title="Install and Initialize the Runloop SDK">
    First, install the Runloop SDK if you haven't already:

    <CodeGroup>
      ```bash Python
      pip install runloop_api_client
      ```

      ```bash TypeScript
      npm install @runloop/api-client
      ```
    </CodeGroup>

    Then, import and initialize the SDK:

    <CodeGroup>
      ```python Python
      from runloop_api_client import Runloop

      client = Runloop(bearer_token="your-api-key")
      ```

      ```typescript TypeScript
      import Runloop from '@runloop/api-client';

      const client = new Runloop({
        bearerToken: 'your-api-key'
      });
      ```
    </CodeGroup>

    This `client` object allows interaction with the Runloop API.
  </Step>

  <Step title="Create a Devbox and Start the Computer Tool">
    Create your Devbox, wait for it to be ready, and retrieve the connection details:

    <CodeGroup>
      ```python Python
      # Create a Devbox with a computer instance
      computer = client.devboxes.computers.create()

      # Wait for the Devbox to be fully running
      client.devboxes.await_running(computer.devbox.id)

      # Retrieve the computer connection details
      devbox_id = computer.devbox.id
      display_url = computer.live_screen_url
      ```

      ```typescript TypeScript
      // Create a Devbox with a computer instance
      const computer = await client.devboxes.computers.create();

      // Wait for the Devbox to be fully running
      await client.devboxes.awaitRunning(computer.devbox.id);

      // Retrieve the computer connection details
      const devboxId = computer.devbox.id;
      const displayUrl = computer.live_screen_url;
      ```
    </CodeGroup>
  </Step>

  <Step title="Interacting with the Computer">
    The **computer-ready Devbox** offers a suite of **Computer Tools** for agent interactions. The available actions include:

    * **Keyboard interaction**: `key`, `type`
    * **Mouse interaction**: `mouse_move`, `left_click`, `left_click_drag`, `right_click`, `middle_click`, `double_click`
    * **Screen interaction**: `screenshot`, `cursor_position`

    You can access these tools using the Runloop client as shown below:

    <CodeGroup>
      ```python Python
      # Keyboard usage
      client.devboxes.computers.keyboard_interaction(devbox_id, action=action, text=text)

      # Mouse usage
      client.devboxes.computers.mouse_interaction(devbox_id, action=action, coordinate={"x": x, "y": y})

      # Take a screenshot or retrieve current mouse coordinates
      client.devboxes.computers.screen_interaction(devbox_id, action=action)
      ```

      ```typescript TypeScript
      // Keyboard usage
      await client.devboxes.computers.keyboardInteraction(devboxId, { 
        action: action, 
        text: text 
      });

      // Mouse usage
      await client.devboxes.computers.mouseInteraction(devboxId, {
        action: action,
        coordinate: { x: x, y: y }
      });

      // Take a screenshot or retrieve current mouse coordinates
      await client.devboxes.computers.screenInteraction(devboxId, {
        action: action
      });
      ```
    </CodeGroup>
  </Step>

  <Step title="Using API Tools with the Computer Tool">
    Once you create tools for your agent, you can integrate them with your preferred LLM. Here's an example of integrating them with **Anthropic's Claude**:

    <CodeGroup>
      ```python Python
      from anthropic import Anthropic

      anthropic_client = Anthropic(api_key="your-anthropic-api-key")

      response = anthropic_client.messages.create(
          model="claude-3",  # Replace with a Claude model ID available to your account
          max_tokens=300,
          messages=messages,
          tools=[tool],
      )
      ```

      ```typescript TypeScript
      import Anthropic from '@anthropic-ai/sdk';

      const anthropicClient = new Anthropic({
        apiKey: 'your-anthropic-api-key'
      });

      const response = await anthropicClient.messages.create({
        model: 'claude-3', // Replace with a Claude model ID available to your account
        max_tokens: 300,
        messages: messages,
        tools: [tool]
      });
      ```
    </CodeGroup>

    <Note>
      Different LLM providers have their own specific formats and requirements for defining and passing tools. Make sure to reference your LLM provider's documentation for the correct implementation details of tool schemas and function calling.
    </Note>
  </Step>

  <Step title="Properly Freeing Resources">
    To ensure efficient resource management, **always shut down the Devbox** when you're done:

    <CodeGroup>
      ```python Python
      client.devboxes.shutdown(computer.devbox.id)
      ```

      ```typescript TypeScript
      await client.devboxes.shutdown(computer.devbox.id);
      ```
    </CodeGroup>
  </Step>
</Steps>
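
The three interaction methods above each accept a different set of actions, so an agent loop needs to decide which method an action belongs to. A minimal sketch of that routing follows, using only the action names listed in this guide. `route_action` and the three constant names are illustrative and not part of the Runloop SDK.

```python
KEYBOARD_ACTIONS = frozenset({"key", "type"})
MOUSE_ACTIONS = frozenset({
    "mouse_move", "left_click", "left_click_drag",
    "right_click", "middle_click", "double_click",
})
SCREEN_ACTIONS = frozenset({"screenshot", "cursor_position"})


def route_action(action: str) -> str:
    """Return the interaction family ("keyboard", "mouse", or "screen") for an action."""
    if action in KEYBOARD_ACTIONS:
        return "keyboard"
    if action in MOUSE_ACTIONS:
        return "mouse"
    if action in SCREEN_ACTIONS:
        return "screen"
    # Fail loudly on anything the guide does not list.
    raise ValueError(f"Unsupported computer action: {action!r}")


for name in ("type", "left_click", "screenshot"):
    print(name, "->", route_action(name))
```

The returned family can then select between `keyboard_interaction`, `mouse_interaction`, and `screen_interaction` on the client.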

## Additional Resources

* [Runloop GitHub Repository](https://github.com/runloopai/examples) - Explore more examples.
* [Runloop API Documentation](https://docs.runloop.ai) - Official API reference.


# Overview of Devbox Add-ons
Source: https://docs.runloop.ai/devboxes/addons/overview


Devboxes are more than just flexible, general-purpose virtual machines. They also come with a set of optional add-ons that extend their capabilities in ways that many coding agents need.

## Available add-ons

Currently available add-ons:

* **Browser** (beta): A remotely-controllable Playwright browser.
* **Computer** (beta): A remotely-controllable Ubuntu Desktop environment.

See the next chapters in this section for instructions on using these add-ons.

## Pricing & availability

During open beta, add-ons are available at no additional cost.


# Devbox Blueprints
Source: https://docs.runloop.ai/devboxes/blueprints

Reproducible templates for devboxes

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/blueprints" />

Often you will want to start your devboxes with your own customizations. For example, you may want to always boot with a specific version of a language or framework or set up a specific repository.

Rather than running these commands every time you launch a devbox, you can use a blueprint to optimize boot time by saving the state of your devbox after these commands have been run. By building a blueprint, you get:

1. **Standardization**: Define tools, binaries, and configurations your AI agent needs at runtime.
2. **Consistency**: Ensure reproducible AI behavior across environments.
3. **Efficiency**: Reduce Devbox startup time by pre-installing necessary tools.
4. **Customization**: Tailor environments to specific AI-assisted development needs.

<Note>
  When should I use a Blueprint vs. a Snapshot?

  Snapshots and Blueprints both allow you to run devboxes with customizations. **Blueprints** are fast to boot and cacheable using Docker layers, while **Snapshots** are a bit slower on boot (reproducing each step taken in the devbox) but can be created quickly from an existing devbox.

  Examples:

  * **[Blueprint](/devboxes/blueprints)**: You have a coding agent that is performing a task that requires installing a specific tool. Create a blueprint with set-up steps for the tool and future devboxes will cache the installation to speed up boot and execution time.
  * **[Snapshot](/devboxes/snapshots)**: You have a coding agent in a devbox considering 3 different ways to complete a task. Create a snapshot of the initial state of the devbox, create 3 parallel devboxes from that snapshot, collate the results, and then choose the best option to continue.
</Note>

## Creating a Blueprint

One use case for a blueprint is preinstalling tools your AI agent may want to use. For example, let's create a simple Blueprint that installs `jq`, a lightweight command-line JSON processor:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/blueprints' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -d '{
    "name": "docs-template",
    "system_setup_commands": ["sudo apt install -y jq"]
  }'
  ```

  ```python Python
  from runloop_api_client import Runloop
  import os

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  blueprint = client.blueprints.create(
      name="docs-template",
      system_setup_commands=["sudo apt install -y jq"]
  )

  print(f"Blueprint created with ID: {blueprint.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createBlueprint() {
    const blueprint = await client.blueprints.create({
      name: "docs-template",
      system_setup_commands: ["sudo apt install -y jq"]
    });

    console.log(`Blueprint created with ID: ${blueprint.id}`);
  }

  createBlueprint();
  ```
</CodeGroup>

<Note>
  Use the Debian package manager (apt) for installing system packages on the Runloop base image.
</Note>

## Using Your Blueprint

Once your Blueprint's status is `build_complete`, create a Devbox using it:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "blueprint_name": "fe-bot",
      "setup_commands": [
        "cd /home/user/runloop-fe && git pull",
        "npm install"
      ]
    }'
  ```

  ```python Python
  from runloop_api_client import Runloop
  import os

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
      blueprint_name="fe-bot",
      setup_commands=[
          "cd /home/user/runloop-fe && git pull",
          "npm install"
      ]
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createDevbox() {
    const devbox = await client.devboxes.create({
      blueprint_name: "fe-bot",
      setup_commands: [
        "cd /home/user/runloop-fe && git pull",
        "npm install"
      ]
    });

    console.log(`Devbox created with ID: ${devbox.id}`);
  }

  createDevbox();
  ```
</CodeGroup>

## Creating Blueprints with CodeMounts

### Basic Configuration

To add a CodeMount to your Blueprint:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/blueprints' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -d '{
      "name": "fe-bot",
      "code_mounts": [{
        "repo_name": "runloop-fe",
        "repo_owner": "runloop"
      }]
    }'
  ```

  ```python Python
  from runloop_api_client import Runloop
  import os

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  blueprint = client.blueprints.create(
      name="fe-bot",
      code_mounts=[{
          "repo_name": "runloop-fe",
          "repo_owner": "runloop"
      }]
  )
  print(f"Blueprint created with ID: {blueprint.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  const blueprint = await client.blueprints.create({
    name: "fe-bot",
    code_mounts: [{
      repo_name: "runloop-fe",
      repo_owner: "runloop"
    }]
  });
  console.log(`Blueprint created with ID: ${blueprint.id}`);
  ```
</CodeGroup>

This creates a Blueprint named "fe-bot" that includes the "runloop-fe" repository.

### Private Repository Authentication

For private repositories, include a GitHub Personal Access Token (PAT):

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/blueprints' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -d '{
      "name": "fe-bot",
      "code_mounts": [{
        "repo_name": "runloop-fe",
        "repo_owner": "runloop",
        "token": "'"${GH_TOKEN}"'"
      }]
    }'
  ```

  ```python Python
  blueprint = client.blueprints.create(
      name="fe-bot",
      code_mounts=[{
          "repo_name": "runloop-fe",
          "repo_owner": "runloop",
          "token": os.environ.get("GH_TOKEN")
      }]
  )
  ```

  ```typescript TypeScript
  const blueprint = await client.blueprints.create({
    name: "fe-bot",
    code_mounts: [{
      repo_name: "runloop-fe",
      repo_owner: "runloop",
      token: process.env.GH_TOKEN
    }]
  });
  ```
</CodeGroup>

This sets up the necessary environment for immediate use of Git and GitHub tools.
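
The public and private cases above differ only in whether a `token` key is present. A minimal sketch of a helper that builds either form is shown here. `build_code_mount` is a hypothetical name, not an SDK function, and the token is read from a `GH_TOKEN` environment variable as in the examples above.

```python
import os
from typing import Dict, Optional


def build_code_mount(repo_owner: str, repo_name: str, token: Optional[str] = None) -> Dict[str, str]:
    """Build a code_mounts entry, adding "token" only when one is supplied."""
    mount = {"repo_name": repo_name, "repo_owner": repo_owner}
    if token:
        # Omit the key entirely for public repositories or unset tokens.
        mount["token"] = token
    return mount


# Public repository: no token key at all.
public_mount = build_code_mount("runloop", "runloop-fe")

# Private repository: token taken from the environment if it is set.
private_mount = build_code_mount("runloop", "runloop-fe", token=os.environ.get("GH_TOKEN"))

print(public_mount)
```

Either dictionary can be passed as an element of the `code_mounts` list when creating a Blueprint.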

## The Blueprint Build Process

When you create a Blueprint, Runloop builds a custom image containing all specified tools and configurations.

### Checking Build Status

After creating a Blueprint, check its status:

<CodeGroup>
  ```bash curl
  curl -X GET 'https://api.runloop.ai/v1/blueprints/{blueprint_id}' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY"
  ```

  ```python Python
  from runloop_api_client import Runloop
  import os

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  blueprint = client.blueprints.retrieve("{blueprint_id}")
  print(f"Blueprint status: {blueprint.status}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function checkBlueprintStatus(blueprintId: string) {
    const blueprint = await client.blueprints.retrieve(blueprintId);
    console.log(`Blueprint status: ${blueprint.status}`);
  }

  checkBlueprintStatus("{blueprint_id}");
  ```
</CodeGroup>

Replace `{blueprint_id}` with the ID returned when you created the Blueprint.

Example response:

```json
{
  "id": "bpt_123",
  "name": "docs-template",
  "status": "build_complete",
  "create_time_ms": 1722264065963,
  "parameters": {
    ...
  }
}
```

The `status` field indicates the current state of your Blueprint:

* `build_complete`: Blueprint is ready to use
* `build_failed`: Refer to the [Blueprint troubleshooting](/devboxes/troubleshooting-blueprints) guide
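
Because a build takes time, a single status check is rarely enough. The following is a minimal polling sketch, assuming the two terminal statuses listed above. `wait_for_build` is a hypothetical helper, not an SDK method, and it takes a status-fetching callable so the example runs without network access. The intermediate status strings in the simulation are only stand-ins for any non-terminal value.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"build_complete", "build_failed"}


def wait_for_build(
    get_status: Callable[[], str],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Blueprint still in status {status!r} after {timeout_s} seconds")
        time.sleep(poll_interval_s)


# Simulated status sequence standing in for repeated API calls.
statuses = iter(["provisioning", "building", "build_complete"])
final_status = wait_for_build(lambda: next(statuses), poll_interval_s=0.0)
print(final_status)
```

With a real client, the callable could be `lambda: client.blueprints.retrieve(blueprint_id).status`, using the `retrieve` call shown above.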

## Advanced Usage: Custom Dockerfiles

For more complex environments, you can use a full Dockerfile as the basis for your Blueprint. This is useful when you need to install multiple tools or perform complex setup operations.

1. Base your Dockerfile on the Runloop base image:

   ```
   FROM public.ecr.aws/f7m5a7m8/devbox:prod
   ```

2. Runloop will:
   * Use your Dockerfile as the base
   * Apply any `system_setup_commands` specified
   * Set up any defined `CodeMount`s

<Note>
  The Runloop base image is public and can be downloaded for local testing.
</Note>

## Keeping Blueprints Updated

Periodically update Blueprints by building a new blueprint with the same name. This ensures that your AI agents always work with the latest code and dependencies.

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/blueprints' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -d '{
      "name": "fe-bot",
      "code_mounts": [{
        "repo_name": "runloop-fe",
        "repo_owner": "runloop",
        "token": "'"${GH_TOKEN}"'"
      }]
    }'
  ```

  ```python Python
  blueprint = client.blueprints.create(
      name="fe-bot",
      code_mounts=[{
          "repo_name": "runloop-fe",
          "repo_owner": "runloop",
          "token": os.environ.get("GH_TOKEN")
      }]
  )
  ```

  ```typescript TypeScript
  const blueprint = await client.blueprints.create({
    name: "fe-bot",
    code_mounts: [{
      repo_name: "runloop-fe",
      repo_owner: "runloop",
      token: process.env.GH_TOKEN
    }]
  });
  ```
</CodeGroup>

This creates a new Blueprint version with the same `name`, allowing for faster updates and efficient resource use.

## Best Practices

1. **Start Simple**: Begin with basic Blueprints and gradually add complexity.
2. **Test Manually Using SSH**: Create a devbox, SSH into it, and install tools by hand to confirm the commands work before layering them into a Blueprint.
3. **Reference Blueprints by Name**: Use `blueprint_name` instead of `blueprint_id` to get the latest version. Use a specific Blueprint ID only when you need to pin a particular build.
4. **Refresh at Boot**: Pass `setup_commands` when creating a Devbox to keep code and dependencies up to date.
5. **Rebuild Regularly**: Update your Blueprints with the latest repository changes.

By leveraging Blueprints effectively, you can create optimized, consistent environments for your AI-assisted software engineering tasks, enhancing productivity and reliability in your development process.

## Upcoming Features

Planned for future releases:

* Multiple repository support in a single Blueprint
* Specific branch specifications
* Git submodules support
* Advanced multi-step build processes

<Note>
  If any of these features are critical for your use case, please let us know.
</Note>


# Mount a Code Repository on a Devbox
Source: https://docs.runloop.ai/devboxes/code-mounts

Enable AI agents to work with full projects: access public and private repositories

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/code-mounts" />

## Overview

Enabling your AI agent to work on full existing code projects unlocks a new set of capabilities. This guide explains how to give your AI agent access to entire codebases, allowing it to make changes and run projects end-to-end like a human engineer.

## Using Code Mounts

While you can use normal shell exec commands to clone a public GitHub repository, Runloop provides a more powerful feature called `CodeMounts`. This allows you to mount a repository into your Devbox at a specific path.

### Creating a Devbox with a Code Mount

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "code_mounts": [
        {
          "repo_name": "rl-cli",
          "repo_owner": "runloopai",
          "token": "<YOUR_GITHUB_TOKEN>"
        }
      ]
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
      code_mounts=[
          {
            "repo_name": "rl-cli",
            "repo_owner": "runloopai",
            "token": os.environ.get("GITHUB_TOKEN"),
          }
      ]
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createDevbox() {
    const devbox = await client.devboxes.create({
      code_mounts: [
        {
          "repo_name": "rl-cli",
          "repo_owner": "runloopai",
          "token": process.env.GITHUB_TOKEN,
        }
      ]
    });
    console.log(`Devbox created with ID: ${devbox.id}`);
  }

  createDevbox();
  ```
</CodeGroup>

This clones the repository onto the Devbox and lets you pull changes and branches. To create pull requests or perform other write operations, you must configure Git authentication as described below.

## Connecting to Private GitHub Repositories

To enable your Devbox to interact with private GitHub repositories, you need to provide proper authentication credentials. Runloop offers several methods to achieve this.

### Using Code Mounts with GitHub Token

When you create a Devbox with a Code Mount, Runloop automatically sets up the `GH_TOKEN` environment variable and credential cache for you. This authenticates all command-line tools in your Devbox with your GitHub token, so your AI agent can use GitHub and open authenticated pull requests with the `gh` CLI tool.

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "code_mounts": [
        {
          "repo_name": "rl-cli",
          "repo_owner": "runloopai",
          "token": "<YOUR_GITHUB_TOKEN>"
        }
      ]
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
      code_mounts=[
          {
            "repo_name": "rl-cli",
            "repo_owner": "runloopai",
            "token": os.environ.get("GITHUB_TOKEN"),
          }
      ]
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createDevbox() {
    const devbox = await client.devboxes.create({
      code_mounts: [
        {
          "repo_name": "rl-cli",
          "repo_owner": "runloopai",
          "token": process.env.GITHUB_TOKEN,
        }
      ]
    });
    console.log(`Devbox created with ID: ${devbox.id}`);
  }

  createDevbox();
  ```
</CodeGroup>

### Manually Configuring Your Devbox for GitHub

Alternatively, you can configure your Devbox manually using the `setup_commands` argument:

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json'  \
    -d '{
      "environment_variables": {"GH_TOKEN": "<YOUR_GITHUB_TOKEN>"},
      "setup_commands": [
        "git config --global credential.helper '\''cache --timeout=3600'\''",
        "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store"      
      ]
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
      environment_variables={"GH_TOKEN": "<YOUR_GITHUB_TOKEN>"},
      setup_commands=[
          "git config --global credential.helper 'cache --timeout=3600'",
          "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store"
      ]
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createDevbox() {
    const devbox = await client.devboxes.create({
      environment_variables: { GH_TOKEN: "<YOUR_GITHUB_TOKEN>" },
      setup_commands: [
        "git config --global credential.helper 'cache --timeout=3600'",
        "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store"
      ]
    });
    console.log(`Devbox created with ID: ${devbox.id}`);
  }

  createDevbox();
  ```
</CodeGroup>

This command:

1. Creates a new Devbox
2. Sets the `GH_TOKEN` environment variable with your GitHub token
3. Configures Git to use the credential cache
4. Stores your GitHub token in the Git credential cache for one hour

<Tip>
  Adjust the `--timeout` value in the git config command to change how long the credentials are cached.
</Tip>
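
A minimal sketch of generating the two setup commands with a configurable cache timeout is shown below. `git_credential_setup_commands` is a hypothetical helper, not part of the SDK. The command strings are the ones used in the examples above, with only the `--timeout` value parameterized.

```python
from typing import List


def git_credential_setup_commands(timeout_seconds: int = 3600) -> List[str]:
    """Return setup_commands that cache the GH_TOKEN credential for timeout_seconds."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return [
        # Enable the in-memory credential cache with the requested lifetime.
        f"git config --global credential.helper 'cache --timeout={timeout_seconds}'",
        # Store the token; $GH_TOKEN is expanded by the shell inside the Devbox.
        'echo "protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN" | git credential-cache store',
    ]


commands = git_credential_setup_commands(timeout_seconds=900)
print(commands[0])
```

The returned list can be passed as the `setup_commands` argument when creating the Devbox.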

### Best Practices for Token Security

1. Use tokens with the minimum required permissions for your tasks.
2. Regularly rotate your GitHub tokens.
3. Never commit or push files containing your tokens to version control.
4. Use environment variables when possible to avoid exposing tokens in command-line arguments.

By following these guidelines, you can securely enable your AI agent to work with full projects and private repositories, expanding its capabilities within the Runloop Devbox environment.


# Debugging Agent Output with SSH
Source: https://docs.runloop.ai/devboxes/debugging-agent-output-with-ssh

Securely connect to a remote Runloop Devbox using SSH for debugging

## Overview

When working with AI-generated code, you may need to debug the state of the project after the AI has run various commands. SSH allows you to connect your computer directly to a Devbox, enabling you to debug, run remote commands, and view or modify the remote filesystem.

Runloop uses a transparent proxy to facilitate routing for all SSH access. Your SSH connection is end-to-end encrypted using standard SSH public key cryptography. The Runloop API provides a mechanism for retrieving SSH keys using a Runloop API key.

## Setup

We recommend using the `rl` CLI to interact with Devboxes. You can find installation instructions at [https://github.com/runloopai/rl-cli](https://github.com/runloopai/rl-cli).

## Create and SSH into a Devbox

<Steps>
  <Step title="Export your API key">
    ```bash
    export RUNLOOP_API_KEY="ak_<your_key_here>"
    ```
  </Step>

  <Step title="Create an empty Devbox">
    ```bash
    rl devbox create
    ```

    You'll receive a response like this:

    ```json
    {
       "id": "dbx_2xMEVq0JpPtxUxZikhOLm",
       "blueprint_id": null,
       "create_time_ms": 1723232059063,
       "end_time_ms": null,
       "initiator_id": null,
       "initiator_type": "invocation",
       "name": null,
       "status": "provisioning"
    }
    ```
  </Step>

  <Step title="SSH into the Devbox">
    Once the Devbox's status is `running`, SSH into it using the returned `id`:

    ```bash
    rl devbox ssh --id dbx_2xMEVq0JpPtxUxZikhOLm
    ```

    You should now have a shell into the Devbox:

    ```
    user@devbox-019138a2-7e80-7233-8100-1add224f41ee-zst79:~$
    ```
  </Step>

  <Step title="Exit the SSH session">
    Type `exit` to leave the SSH session.
  </Step>

  <Step title="Shut down the Devbox">
    When you're done, shut down the Devbox:

    ```bash
    rl devbox shutdown --id dbx_2xMEVq0JpPtxUxZikhOLm
    ```
  </Step>
</Steps>

## Using VSCode with SSH

You can use SSH access to connect VSCode to the remote Devbox.

<Steps>
  <Step title="Install VSCode SSH extension">
    Install the [Visual Studio Code Remote - SSH extension](https://code.visualstudio.com/docs/remote/ssh).
  </Step>

  <Step title="Create a Devbox">
    ```bash
    rl devbox create
    ```
  </Step>

  <Step title="Generate SSH config entry">
    ```bash
    rl devbox ssh --id dbx_2xMEa8BVcYOOGtXGqWNVj --config-only
    ```
  </Step>

  <Step title="Append to SSH config file">
    ```bash
    rl devbox ssh --id dbx_2xMEa8BVcYOOGtXGqWNVj --config-only >> ~/.ssh/config
    ```
  </Step>

  <Step title="Verify the configuration">
    ```bash
    ssh dbx_2xMEa8BVcYOOGtXGqWNVj "whoami"
    ```

    This should return `user`.
  </Step>

  <Step title="Connect VSCode to your Devbox">
    You now have a ready-to-use SSH connection to the Devbox. Follow the remaining instructions in the [VSCode SSH documentation](https://code.visualstudio.com/docs/remote/ssh#_connect-to-a-remote-host) to connect VSCode to your Devbox.
  </Step>
</Steps>
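
Appending with `>>` adds a duplicate entry each time the command is rerun for the same Devbox. The following is a minimal sketch of an idempotent append. It assumes the generated entry starts with a `Host <devbox_id>` line, which matches the `ssh <devbox_id>` usage above but is not confirmed here. `append_ssh_entry` and the sample entry text are hypothetical; the real entry comes from `rl devbox ssh --config-only`.

```python
import tempfile
from pathlib import Path


def append_ssh_entry(config_path: Path, host_alias: str, entry: str) -> bool:
    """Append entry unless a "Host <host_alias>" line already exists. Return True if written."""
    existing = config_path.read_text() if config_path.exists() else ""
    for line in existing.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == "host" and host_alias in parts[1:]:
            return False
    with config_path.open("a") as f:
        # Keep entries separated even if the file lacks a trailing newline.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(entry.rstrip("\n") + "\n")
    return True


# Hypothetical entry text used only to exercise the helper.
entry = "Host dbx_example\n  HostName example.invalid\n  User user\n"
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config"
    first = append_ssh_entry(config, "dbx_example", entry)
    second = append_ssh_entry(config, "dbx_example", entry)
    written = config.read_text()

print(first, second)
```

The example writes to a temporary file; pointing `config_path` at `~/.ssh/config` would apply the same check to your real configuration.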

## Security Notes

* All SSH connections are routed through Runloop's transparent proxy.
* Connections are end-to-end encrypted using SSH public key cryptography.
* SSH keys are generated and managed securely through the Runloop API.

By following these steps, you can securely connect to your Runloop Devbox for debugging, code inspection, and project management tasks.


# Execute Commands on a Devbox
Source: https://docs.runloop.ai/devboxes/execute-commands

Run and execute code at scale

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/devboxes/execute-commands" />

## Running Commands Synchronously vs Asynchronously

The Runloop shell APIs support both synchronous execution, for immediate results, and asynchronous execution, for long-running commands or daemons.

### Synchronous Commands

Synchronous commands block until the command finishes and then return its results, including stdout, stderr, and the exit code of the command process.

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "command": "echo Hello World"
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

  result = client.devboxes.execute_sync("<YOUR_DEVBOX_ID>", command="echo Hello World")
  print(result)
  ```

  ```typescript TypeScript
  const result = await client.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
    command: 'echo Hello World'
  });
  console.log(result);
  ```
</CodeGroup>
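Callers usually need to branch on the outcome of a synchronous command. The following is a minimal sketch of that check, assuming the result object exposes `stdout`, `stderr`, and an integer `exit_status` attribute. The attribute name `exit_status` and the helper `check_result` are assumptions made for illustration, so verify them against the response type your SDK version returns.

```python
from types import SimpleNamespace


class CommandFailed(RuntimeError):
    """Raised when a command finishes with a non-zero exit code."""


def check_result(result):
    """Return stdout if the command succeeded, otherwise raise CommandFailed.

    `result` is any object with `stdout`, `stderr`, and `exit_status`
    attributes (the attribute names are assumed, not confirmed by this page).
    """
    if result.exit_status != 0:
        raise CommandFailed(
            f"exit status {result.exit_status}: {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in for an SDK response so the sketch runs without network access.
fake_result = SimpleNamespace(stdout="Hello World\n", stderr="", exit_status=0)
print(check_result(fake_result))
```

In real code you would pass the value returned by `execute_sync` instead of `fake_result`.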

### Asynchronous Commands

Asynchronous commands return immediately instead of blocking until the command finishes. This is useful for long-running commands or daemons, such as dev servers or background processes.

<Steps>
  <Step title="Launch an async command">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_async' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "command": "while true; do echo \"Hello World\"; sleep 1; done"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      result = client.devboxes.execute_async("<YOUR_DEVBOX_ID>", command="while true; do echo 'Hello World'; sleep 1; done")
      print(result.stdout)
      ```

      ```typescript TypeScript
      const result = await client.devboxes.executeAsync('<YOUR_DEVBOX_ID>', {
        command: 'while true; do echo "Hello World"; sleep 1; done'
      });
      console.log(result.stdout);
      ```
    </CodeGroup>
  </Step>

  <Step title="Retrieve the Status of the Async Command including the latest output">
    <CodeGroup>
      ```bash curl
      curl -X GET 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/executions/<EXECUTION_ID>' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      # Now we can load the latest status of the execution
      client.devboxes.executions.retrieve(
          "<EXECUTION_ID>",
          devbox_id="<YOUR_DEVBOX_ID>",
      )

      # Alternatively, we can wait for the background command to complete
      client.devboxes.executions.await_completed(
          "<EXECUTION_ID>",
          devbox_id="<YOUR_DEVBOX_ID>",
      )
      ```

      ```typescript TypeScript
      const result = await client.devboxes.executions.retrieve('<YOUR_DEVBOX_ID>', "<EXECUTION_ID>");
      console.log(result.stdout);

      // Alternatively, we can wait for the background command to complete
      await client.devboxes.executions.awaitCompleted('<YOUR_DEVBOX_ID>', "<EXECUTION_ID>");
      ```
    </CodeGroup>
  </Step>

  <Step title="(Optionally) Kill the async command if needed">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/executions/<EXECUTION_ID>/kill' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{}'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      client.devboxes.executions.kill("<EXECUTION_ID>", devbox_id="<YOUR_DEVBOX_ID>")
      ```

      ```typescript TypeScript
      await client.devboxes.executions.kill('<YOUR_DEVBOX_ID>', "<EXECUTION_ID>");
      ```
    </CodeGroup>
  </Step>
</Steps>
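If you prefer to poll the execution status yourself, for example to enforce your own deadline, the loop can be sketched as below. This is an illustrative helper, not part of the SDK: `poll_until` and its parameters are hypothetical names, and the status strings in the demo (`queued`, `running`, `completed`) are placeholders rather than confirmed API values.

```python
import time


def poll_until(fetch, is_done, timeout_s=30.0, interval_s=1.0, sleep=time.sleep):
    """Call `fetch()` repeatedly until `is_done(value)` is true or time runs out.

    Returns the last fetched value; raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = fetch()
        if is_done(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not finished after {timeout_s} seconds")
        sleep(interval_s)


# Stand-in for retrieving an execution: reports "completed" on the third call.
statuses = iter(["queued", "running", "completed"])
final = poll_until(
    fetch=lambda: {"status": next(statuses)},
    is_done=lambda execution: execution["status"] == "completed",
    interval_s=0.0,
)
print(final["status"])
```

With a real client, `fetch` would wrap the `executions.retrieve` call shown above, and `is_done` would inspect whichever status field the response provides.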

## Isolated vs Stateful Shells

By default, every Devbox command is run in an isolated shell. This means that each command is executed in a new shell session, and the state of the shell is not preserved between commands.

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"command": "echo Hello World"}'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

  result = client.devboxes.execute_sync("<YOUR_DEVBOX_ID>", command="echo Hello World")
  print(result.stdout)
  ```

  ```typescript TypeScript
  const result = await client.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
    command: 'echo Hello World'
  });
  console.log(result.stdout);
  ```
</CodeGroup>

## Using Stateful Shells

Alternatively, you can use the `shell_name` parameter to use a 'stateful' shell. This means that the shell will maintain its state across commands including environment variables and working directory.

As an example, let's create a series of interdependent commands that need to be run in the same shell:

<Steps>
  <Step title="Check initial directory">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "command": "pwd",
          "shell_name": "my-shell"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      result = client.devboxes.execute_sync("<YOUR_DEVBOX_ID>", command="pwd", shell_name="my-shell")
      print(result.stdout)
      ```

      ```typescript TypeScript
      const result = await client.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
        command: 'pwd',
        shell_name: 'my-shell'
      });
      console.log(result.stdout);
      ```
    </CodeGroup>
  </Step>

  <Step title="Create and enter new directory">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "command": "mkdir mynewfolder && cd mynewfolder",
          "shell_name": "my-shell"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      client.devboxes.execute_sync(
          "<YOUR_DEVBOX_ID>",
          command="mkdir mynewfolder && cd mynewfolder",
          shell_name="my-shell"
      )
      ```

      ```typescript TypeScript
      await client.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
        command: 'mkdir mynewfolder && cd mynewfolder',
        shell_name: 'my-shell'
      });
      ```
    </CodeGroup>
  </Step>

  <Step title="Verify new working directory is preserved!">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "command": "pwd",
          "shell_name": "my-shell"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

      result = client.devboxes.execute_sync("<YOUR_DEVBOX_ID>", command="pwd", shell_name="my-shell")
      print(result.stdout)
      ```

      ```typescript TypeScript
      const result = await client.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
        command: 'pwd',
        shell_name: 'my-shell'
      });
      console.log(result.stdout);
      ```
    </CodeGroup>
  </Step>
</Steps>
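Passing the same `shell_name` on every call is easy to forget. One way to avoid that is a small wrapper that binds the devbox ID and shell name once. The `ShellSession` class below is a hypothetical convenience, not an SDK type; it only assumes an `execute` callable with the same shape as the `execute_sync` calls above.

```python
class ShellSession:
    """Binds a devbox ID and shell name so every command reuses the same shell."""

    def __init__(self, execute, devbox_id, shell_name):
        # `execute` is any callable shaped like
        # execute(devbox_id, command=..., shell_name=...).
        self._execute = execute
        self._devbox_id = devbox_id
        self._shell_name = shell_name

    def run(self, command):
        return self._execute(
            self._devbox_id, command=command, shell_name=self._shell_name
        )


# Recording stand-in so the sketch runs without a devbox.
sent = []


def fake_execute(devbox_id, *, command, shell_name):
    sent.append((devbox_id, command, shell_name))


shell = ShellSession(fake_execute, "<YOUR_DEVBOX_ID>", "my-shell")
shell.run("mkdir mynewfolder && cd mynewfolder")
shell.run("pwd")
print(sent)
```

With a real client you would pass `client.devboxes.execute_sync` as `execute`.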


# Read and Write Files on a Devbox
Source: https://docs.runloop.ai/devboxes/files

Give your AI agent access to modify and interact with files on your devbox.

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/devboxes/read-write-files" />

## Overview

In addition to running commands, your AI agent may need to modify or read files on your Devbox. The Runloop Devbox provides full programmatic access to the underlying filesystem, allowing your agent to interact with files as needed.

## Writing Files to the Devbox

When authoring code, your AI Agent will often need to write files to disk. There are two main methods for this:

### Writing Small Text Files

You can use `write_file_contents` to easily write a UTF-8 string to a file on disk. Note that all file paths are relative to the user's home directory by default.

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/write_file_contents' \
    -H "Authorization: Bearer <YOUR_API_KEY>" \
    -H 'Content-Type: application/json' \
    -d '{
      "file_path": "/home/user/main.py",
      "contents": "print(\"Hello, World!\")"
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  client.devboxes.write_file_contents(
      "<YOUR_DEVBOX_ID>",
      file_path="/home/user/main.py",
      contents='print("Hello, World!")'
  )
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function writeFileContents() {
    await client.devboxes.writeFileContents('<YOUR_DEVBOX_ID>', {
      file_path: '/home/user/main.py',
      contents: 'print("Hello, World!")'
    });
  }

  writeFileContents();
  ```
</CodeGroup>

### Uploading Large or Non-Text Files

For larger files or binary data, use the `upload_file` API, which supports files of any size as well as non-text data:

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/upload_file' \
    -H "Authorization: Bearer <YOUR_API_KEY>" \
    -H 'Content-Type: multipart/form-data' \
    -F "path=/home/user/large_data.txt" \
    -F "file=@large_data.txt"
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  with open('large_data.txt', 'rb') as file:
      client.devboxes.upload_file(
          "<YOUR_DEVBOX_ID>",
          path="/home/user/large_data.txt",
          file=file
      )
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';
  import fs from 'fs';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function uploadFile() {
    const file = fs.createReadStream('large_data.txt');
    await client.devboxes.uploadFile('<YOUR_DEVBOX_ID>', {
      path: '/home/user/large_data.txt',
      file: file
    });
  }

  uploadFile();
  ```
</CodeGroup>
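An agent that writes arbitrary data needs a rule for picking between the two write methods. The sketch below shows one possible rule: small, valid UTF-8 payloads go through `write_file_contents`, everything else through `upload_file`. The helper name `choose_write_method` and the 64 KiB cutoff are assumptions for illustration; this page does not state a size limit.

```python
def choose_write_method(data: bytes, max_inline_bytes: int = 64 * 1024) -> str:
    """Pick 'write_file_contents' for small UTF-8 text, else 'upload_file'.

    The 64 KiB default is an arbitrary illustrative cutoff.
    """
    if len(data) > max_inline_bytes:
        return "upload_file"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the payload as binary.
        return "upload_file"
    return "write_file_contents"


print(choose_write_method(b'print("Hello, World!")'))
print(choose_write_method(b"\xff\xfe\x00binary"))
```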

## Reading Files

Your AI Agent will often also need to read files from the Devbox. There are two main methods for this:

### Reading Small Text Files

You can use `read_file_contents` to read the contents of a file on the Devbox as a UTF-8 string.

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/read_file_contents' \
    -H "Authorization: Bearer <YOUR_API_KEY>" \
    -H 'Content-Type: application/json' \
    -d '{
      "file_path": "/home/user/test_results.txt"
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  contents = client.devboxes.read_file_contents(
      "<YOUR_DEVBOX_ID>",
      file_path="/home/user/test_results.txt"
  )
  print(contents)
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function readFile() {
    const contents = await client.devboxes.readFileContents('<YOUR_DEVBOX_ID>', {
      file_path: '/home/user/test_results.txt'
    });
    console.log(contents);
  }

  readFile();
  ```
</CodeGroup>

### Downloading Large or Non-Text Files

For large or non-text files, use `download_file` to download the file from the Devbox directly.

<CodeGroup>
  ```bash curl
  curl -X POST \
    'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/download_file' \
    -H "Authorization: Bearer <YOUR_API_KEY>" \
    -H 'Content-Type: application/json' \
    -d '{
      "file_path": "/home/user/large_data.txt"
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  client.devboxes.download_file(
      "<YOUR_DEVBOX_ID>",
      file_path="/home/user/large_data.txt"
  )
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function downloadFile() {
    await client.devboxes.downloadFile('<YOUR_DEVBOX_ID>', {
      file_path: '/home/user/large_data.txt'
    });
  }

  downloadFile();
  ```
</CodeGroup>

## Best Practices

1. Always specify the full path when working with files to avoid ambiguity.

2. Be mindful of file permissions when reading or writing files in different directories.

3. Use error handling in your AI agent's code to manage potential issues with file operations, such as "file not found" or "permission denied" errors.

By leveraging these file operations, your AI agent can effectively manage code, data, and results within the Runloop Devbox environment.
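The first and third practices can be combined in a thin guard around any file call. This is a sketch under stated assumptions: `require_absolute` and `safe_file_op` are hypothetical helpers, and the broad `except Exception` is deliberate because this page does not list the SDK's error types.

```python
import posixpath


def require_absolute(path: str) -> str:
    """Return a normalized absolute POSIX path, or raise ValueError."""
    if not posixpath.isabs(path):
        raise ValueError(f"expected an absolute path, got {path!r}")
    return posixpath.normpath(path)


def safe_file_op(operation, path):
    """Run `operation(path)` and turn failures into an (ok, value) pair.

    `operation` stands in for a read or write call bound to a devbox.
    """
    try:
        return True, operation(require_absolute(path))
    except Exception as exc:  # broad on purpose: SDK error types are not assumed
        return False, f"{type(exc).__name__}: {exc}"


# Stand-in operation so the sketch runs without a devbox.
print(safe_file_op(lambda p: f"read {p}", "/home/user/./main.py"))
print(safe_file_op(lambda p: f"read {p}", "main.py"))
```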


# The Devbox Lifecycle
Source: https://docs.runloop.ai/devboxes/lifecycle

Reference documentation for the various states a Devbox can be in.

## Understanding the Devbox State Machine

<Frame>
  <img src="https://mintlify.s3.us-west-1.amazonaws.com/runloopai/images/devbox_lifecycle_cropped.png" style={{ borderRadius: '0.5rem' }} />
</Frame>

Devboxes represent a persistent dev environment that can be launched and shut down as needed.
Over the course of a Devbox's lifecycle, it will transition through a series of states depending on your use case:

* **provisioning**: Runloop is allocating and booting the necessary infrastructure resources.
* **initializing**: Runloop-defined boot scripts are running to prepare the environment for interaction.
* **running**: The Devbox is ready for interaction.
* **failure**: The Devbox failed while booting or while running user-requested actions.
* **shutdown**: The Devbox was successfully shut down and is no longer using active compute.
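Client code that waits on a Devbox often needs to know whether a state can still change. The sketch below encodes the states listed above; treating `failure` and `shutdown` as terminal is an inference from their descriptions, not something the API is stated to guarantee, and the names `DevboxState`, `is_terminal`, and `is_ready` are illustrative.

```python
from enum import Enum


class DevboxState(str, Enum):
    PROVISIONING = "provisioning"
    INITIALIZING = "initializing"
    RUNNING = "running"
    FAILURE = "failure"
    SHUTDOWN = "shutdown"


# States assumed to be final for the purposes of this sketch.
TERMINAL_STATES = {DevboxState.FAILURE, DevboxState.SHUTDOWN}


def is_terminal(status: str) -> bool:
    """True if no further transitions are expected; unknown states raise ValueError."""
    return DevboxState(status) in TERMINAL_STATES


def is_ready(status: str) -> bool:
    """True only when the Devbox can accept commands."""
    return DevboxState(status) is DevboxState.RUNNING


for status in ("provisioning", "running", "shutdown"):
    print(status, is_ready(status), is_terminal(status))
```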

{/*
  - **suspending**: The Devbox disk is being snapshotted and as part of suspension.
  - **suspended**: The Devbox disk is saved and no more active compute is being used for the Devbox.
  - **resuming**: The Devbox disk is being loaded as part of booting a suspended Devbox.
  */}

{/*
  Re-enable this section once suspend and resume is stable
  ### Suspending and Resuming Devboxes to Save Disk State
  In addition to use idle management configuration, you can also manually suspend and resume a devbox.
  <Note>Only disk state, not in-memory state is preserved during suspend/resume operations</Note>

  <Steps>
  <Step title="Suspend the devbox">
    <CodeGroup>

    ```bash curl
    curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/suspend' \
      -H "Authorization: Bearer $RUNLOOP_API_KEY" \
      -H 'Content-Type: application/json'
    ```

    ```python Python
    from runloop_api_client import Runloop

    client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))
    client.devboxes.suspend(devbox_id)
    ```

    ```typescript TypeScript
    const client = new Runloop({
      bearerToken: process.env.RUNLOOP_API_KEY,
    });
    await client.devboxes.suspend(devbox_id);
    ```
    </CodeGroup>
  </Step>

  <Step title="Resume when needed">
    <CodeGroup>

    ```bash curl
    curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/resume' \
      -H "Authorization: Bearer $RUNLOOP_API_KEY"
    ```

    ```python Python
    client.devboxes.resume(devbox_id)
    ```

    ```typescript TypeScript
    await client.devboxes.resume(devbox_id);
    ```
    </CodeGroup>
  </Step>

  <Step title="Wait for the devbox to be running again">
    <CodeGroup>

    ```bash curl
    # wait for the devbox to be running again
    while true; do
      status=$(curl -s -X GET 'https://api.runloop.ai/v1/devboxes/{devbox_id}' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" | jq -r '.status')
      if [ "$status" == "running" ]; then
        break
      fi
      sleep 1
    done
    ```

    ```python Python
    devbox = client.devboxes.await_running(devbox_id)    
    ```

    ```typescript TypeScript
    const devbox = await client.devboxes.awaitRunning(devbox_id);    
    ```
    </CodeGroup>
  </Step>
  </Steps>


  ### Important Notes
  - Suspended Devboxes still incur storage charges until explicitly shut down
  - The suspend/resume process typically takes seconds, depending on the amount of modified data
  - Daemons or other processes running at suspend time must be manually restarted after resuming
  - The original Devbox ID and SSH keys are preserved through suspend/resume cycles
  */}


# Managing Devbox Metadata
Source: https://docs.runloop.ai/devboxes/metadata

Effectively manage and organize large numbers of Devboxes using metadata

When working with hundreds or thousands of Devboxes, effective organization becomes crucial. Runloop provides a powerful metadata system to help you tag, categorize, and filter your Devboxes efficiently.

## Using Metadata

Metadata allows you to attach custom key-value pairs to your Devboxes. This information can include:

* Project names
* Team assignments
* Environment types (e.g., development, staging, production)
* Any other relevant tags for your workflow

## Adding Metadata to Devboxes

When creating a Devbox, you can include metadata to help organize and filter them later:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "metadata": {
        "project": "runloop-fe",
        "team": "frontend",
        "environment": "development"
      }
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

  devbox = client.devboxes.create(
      metadata={
          "project": "runloop-fe",
          "team": "frontend",
          "environment": "development"
      }
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  const devbox = await client.devboxes.create({
    metadata: {
      project: "runloop-fe",
      team: "frontend",
      environment: "development"
    }
  });
  console.log(`Devbox created with ID: ${devbox.id}`);
  ```
</CodeGroup>

## Benefits of Using Metadata

1. **Easy Filtering**: Quickly find Devboxes related to specific projects or teams.
2. **Improved Organization**: Group Devboxes logically based on your workflow.
3. **Enhanced Visibility**: Easily identify the purpose and ownership of each Devbox.
4. **Streamlined Management**: Perform bulk operations on Devboxes with similar metadata.

## Viewing and Filtering Metadata

The Runloop dashboard displays metadata tags for each Devbox, allowing you to:

* View all metadata associated with a Devbox at a glance
* Use integrated filters to sort and find Devboxes based on their metadata
* Create custom views based on frequently used metadata filters
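The same kind of filtering can be done in your own code once you have a list of Devboxes and their metadata. The sketch below works on plain dictionaries so it runs without the API; the `filter_by_metadata` helper and the `dbx_a`-style IDs are hypothetical.

```python
def filter_by_metadata(devboxes, **wanted):
    """Return devboxes whose metadata contains every wanted key/value pair."""
    return [
        box
        for box in devboxes
        if all(box.get("metadata", {}).get(key) == value for key, value in wanted.items())
    ]


# Sample records shaped like the metadata used in the creation example above.
devboxes = [
    {"id": "dbx_a", "metadata": {"project": "runloop-fe", "team": "frontend"}},
    {"id": "dbx_b", "metadata": {"project": "runloop-be", "team": "backend"}},
    {"id": "dbx_c", "metadata": {"project": "runloop-fe", "team": "design"}},
]
matches = filter_by_metadata(devboxes, project="runloop-fe")
print([box["id"] for box in matches])
```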

## Best Practices for Using Metadata

1. **Consistent Naming**: Use a consistent naming convention for your metadata keys and values.
2. **Relevant Information**: Include only metadata that is useful for organizing and filtering.
3. **Update Regularly**: Keep metadata up-to-date as projects evolve or team assignments change.
4. **Use Hierarchies**: Consider using hierarchical metadata (e.g., "env:production" instead of just "production").

By effectively using metadata, you can maintain organization and clarity even when managing thousands of Devboxes across multiple projects and teams.


# Overview of Devboxes
Source: https://docs.runloop.ai/devboxes/overview


Runloop devboxes are the foundation of building AI coding agents fast. We built devboxes because we were tired of hitting the same common problems and needs when building new agents.

## Devboxes & Your Stack

Your AI agent will need to do more than just chat. Very likely, you are building an agent that will:

* Query external APIs
* Pull, build, and execute code from git repositories
* Run a headless browser to scrape or interact with websites
* Read and write files on a filesystem
* Run proprietary code or binaries

In development, it's easy to do all of these things with a script on your local machine. But in production, you'll need a better approach. That's where devboxes come in.

Runloop devboxes are **the isolated virtual machine your AI agent does its work on.** By building your agent against devbox APIs, your agent can do all of these things without you investing significant time and effort in building infrastructure.

## Key Devbox Features

* **Isolated, ephemeral virtual machines:** Devboxes are created on demand, and deleted when they are no longer needed.
* **Super fast boot times:** Our base devbox images are optimized to boot in less than `200ms`.
* **Stateful or stateless:** By default, devboxes are stateless and are destroyed after each run. But devboxes also support **snapshot**, **suspend**, and **resume**, each with one simple API call.
* **Customizable sizes and images:** You can choose machine size and resources from a range of options, and you can create and customize team-shared images with blueprints.

## Working with Devboxes

Your agent code will interact with devboxes through the Runloop API. We provide [client SDKs](/tools/sdks) for Python and TypeScript.

You can also use the [Runloop CLI](/tools/cli) and the [Runloop Dashboard](/tools/dashboard) to view, manage, and monitor your devboxes.

Ready to get started? Read on for quick examples showcasing common devbox uses.


# Configuring Devbox Instance Sizes
Source: https://docs.runloop.ai/devboxes/sizes

Configure your Devboxes using predefined sizes

Runloop offers flexible options to tailor your Devbox resources and lifecycle to your specific AI workloads. This guide covers predefined resource sizes for standardized configurations.

## Predefined Resource Sizes

Runloop provides the following resource configurations for Devboxes:

| Size      | CPU | Memory | Storage |
| --------- | --- | ------ | ------- |
| X\_SMALL  | 0.5 | 1GB    | 4GB     |
| SMALL     | 1   | 2GB    | 4GB     |
| MEDIUM    | 2   | 4GB    | 8GB     |
| LARGE     | 2   | 8GB    | 16GB    |
| X\_LARGE  | 4   | 16GB   | 16GB    |
| XX\_LARGE | 8   | 32GB   | 16GB    |

## Launch Parameters

When creating a Devbox, use `LaunchParameters` to specify the desired configuration.

### Resource Size

Set the `resource_size` parameter to choose a predefined size:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
      "launch_parameters": {
        "resource_size": "MEDIUM"
      }
    }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
      launch_parameters={
          "resource_size": "MEDIUM"
      }
  )
  print(f"Devbox created with ID: {devbox.id}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function createDevbox() {
    const devbox = await client.devboxes.create({
      launch_parameters: {
        resource_size: "MEDIUM"
      }
    });
    console.log(`Devbox created with ID: ${devbox.id}`);
  }

  createDevbox();
  ```
</CodeGroup>

This example creates a Devbox with 2 CPU cores, 4GB of memory, and 8GB of storage.
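To choose a size programmatically, the table of predefined sizes can be encoded as a lookup. The sketch below copies the values from that table and returns the first size that meets a requirement; `SIZES` and `smallest_size` are illustrative names, not SDK features.

```python
# (CPU cores, memory in GB, storage in GB), listed from smallest to largest.
SIZES = {
    "X_SMALL": (0.5, 1, 4),
    "SMALL": (1, 2, 4),
    "MEDIUM": (2, 4, 8),
    "LARGE": (2, 8, 16),
    "X_LARGE": (4, 16, 16),
    "XX_LARGE": (8, 32, 16),
}


def smallest_size(cpu=0, memory_gb=0, storage_gb=0):
    """Return the first predefined size that meets every requirement."""
    for name, (size_cpu, size_memory, size_storage) in SIZES.items():
        if size_cpu >= cpu and size_memory >= memory_gb and size_storage >= storage_gb:
            return name
    raise ValueError("no predefined size satisfies the request")


print(smallest_size(cpu=2, memory_gb=4))
print(smallest_size(memory_gb=8))
```

The returned name can then be passed as `resource_size` in `launch_parameters`.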


# Devbox Snapshots
Source: https://docs.runloop.ai/devboxes/snapshots

Save disk state from existing devboxes for re-use and branching

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/snapshots" />

Snapshots can be used to save the current disk state of a devbox, and to create
new devboxes from a previous point in time. These can be used to:

* Improve build times by snapshotting a populated build cache.
* Roll back to a known good point in time.
* Perform fan-out and attempt multiple approaches to a code change.

Snapshots are referenced by a random identifier and can be queried via the API. Currently only disk snapshots are supported.

<Note>
  When should I use a Blueprint vs. a Snapshot?

  Snapshots and Blueprints both allow you to run devboxes with customizations. **Blueprints** are fast to boot and cacheable using Docker layers, while **Snapshots** are a bit slower on boot (reproducing each step taken in the devbox) but can be created quickly from an existing devbox.

  Examples:

  * **[Blueprint](/devboxes/blueprints)**: You have a coding agent that is performing a task that requires installing a specific tool. Create a blueprint with set-up steps for the tool and future devboxes will cache the installation to speed up boot and execution time.
  * **[Snapshot](/devboxes/snapshots)**: You have a coding agent in a devbox considering 3 different ways to complete a task. Create a snapshot of the initial state of the devbox, create 3 parallel devboxes from that snapshot, collate the results, and then choose the best option to continue.
</Note>

<Steps>
  <Step title="Identify the devbox to snapshot">
    First, identify a running devbox id using the dashboard or rl-cli.

    ```shell
    $ rl devbox list --status=running
    ```

    Optionally, you may want to remove any temporary files before proceeding to reduce the latency of snapshot operations.
  </Step>

  <Step title="Snapshot the disk of a currently running devbox">
    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/snapshot_disk' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{}'
      ```

      ```python Python
      snapshot = client.devboxes.snapshot_disk(devbox.id)
      print(f"Snapshot created with ID: {snapshot.id}")
      ```

      ```typescript TypeScript
      const snapshot = await client.devboxes.snapshotDisk(devbox.id);
      console.log(`Snapshot created with ID: ${snapshot.id}`);
      ```
    </CodeGroup>
  </Step>

  <Step title="Create a new devbox from a snapshot">
    Using the `snapshot_id` from the previous step, launch a new devbox.

    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "snapshot_id": "<snapshot_id>"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

      devbox = client.devboxes.create(
          snapshot_id="<snapshot_id>"
      )
      print(f"Devbox created with ID: {devbox.id}")
      ```

      ```typescript TypeScript
      import Runloop from '@runloop/api-client';

      const client = new Runloop({
        bearerToken: process.env.RUNLOOP_API_KEY,
      });

      const devbox = await client.devboxes.create({
        snapshot_id: '<snapshot_id>'
      });
      console.log(`Devbox created with ID: ${devbox.id}`);
      ```
    </CodeGroup>
  </Step>
</Steps>
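The fan-out use case mentioned earlier (several devboxes from one snapshot, then pick the best result) can be sketched with a thread pool. Everything here is a hedged illustration: `fan_out`, `create_devbox`, and `run_attempt` are hypothetical names, and the demo uses stand-in callables so it runs without the API.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(create_devbox, snapshot_id, run_attempt, attempts):
    """Create one devbox per attempt from the same snapshot and collect results.

    `create_devbox(snapshot_id)` and `run_attempt(devbox, attempt)` are
    caller-supplied callables; results come back in the order of `attempts`.
    """
    def one(attempt):
        devbox = create_devbox(snapshot_id)
        return attempt, run_attempt(devbox, attempt)

    with ThreadPoolExecutor(max_workers=max(1, len(attempts))) as pool:
        return list(pool.map(one, attempts))


# Stand-ins: a fake devbox factory and a fake scoring function.
results = fan_out(
    create_devbox=lambda snapshot_id: {"from_snapshot": snapshot_id},
    snapshot_id="<snapshot_id>",
    run_attempt=lambda devbox, attempt: len(attempt),
    attempts=["approach-a", "approach-bb", "approach-ccc"],
)
best = max(results, key=lambda pair: pair[1])
print(best)
```

With a real client, `create_devbox` would wrap `client.devboxes.create(snapshot_id=...)` and `run_attempt` would execute commands on the new devbox and return a score.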


# Start and Stop a Devbox
Source: https://docs.runloop.ai/devboxes/start-stop

Getting started with the Runloop platform

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/devboxes/start-stop" />

### Launching and Shutting Down new Devboxes

When a Devbox is launched, Runloop will allocate the necessary infrastructure. You should see the Devbox transition to the `running` state at which point you will be able to interact with the Devbox:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{}'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))
  # create the devbox and wait for it to be ready
  devbox = runloop_client.devboxes.create_and_await_running()
  print(devbox.id)
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const runloopClient = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  const devbox = await runloopClient.devboxes.createAndAwaitRunning();
  console.log(devbox.id);

  ```
</CodeGroup>

<Note>
  You can also create a devbox from a snapshot or blueprint to optimize boot times.
  Check out the [Snapshots Guide](/devboxes/snapshots) for more information.
</Note>

When you are done with a devbox, you can shut it down to free up resources.

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/shutdown' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY"
  ```

  ```python Python
  runloop_client.devboxes.shutdown(devbox.id)
  ```

  ```typescript TypeScript
  await runloopClient.devboxes.shutdown(devbox.id);
  ```
</CodeGroup>

<Note>Once a devbox is shut down, its disk state is deleted. If you want to keep the disk state, suspend the devbox instead or take a snapshot first.</Note>
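To make sure a devbox is always shut down, even when the work in between raises, the create and shutdown calls can be paired in a context manager. This is a minimal sketch: `temporary_devbox` is a hypothetical helper, and the `create` and `shutdown` arguments are stand-ins for the SDK calls shown above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_devbox(create, shutdown):
    """Yield a devbox from `create()` and always call `shutdown(devbox)` afterwards."""
    devbox = create()
    try:
        yield devbox
    finally:
        shutdown(devbox)


# Stand-ins that record what happened instead of calling the API.
log = []
with temporary_devbox(
    create=lambda: "dbx_example",
    shutdown=lambda devbox: log.append(f"shutdown {devbox}"),
) as devbox:
    log.append(f"using {devbox}")
print(log)
```

With a real client, `create` could be `runloop_client.devboxes.create_and_await_running` and `shutdown` could be `lambda d: runloop_client.devboxes.shutdown(d.id)`.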

{/*
  ### Idle Management
  By default, Devboxes will automatically shutdown after 1 hour of inactivity. 
  However, you can configure your Devbox to either suspend or shutdown after a custom period of time to optimize costs.

  <CodeGroup>

  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "launch_parameters": {
      "after_idle": {
        "on_idle": "suspend",
        "idle_time_seconds": 1800
      }
    }
  }'
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  devbox = client.devboxes.create(
    launch_parameters={
        "after_idle": {
            "idle_time_seconds": 1800,
            "on_idle": "suspend"
        }
    }
  )
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
  });

  const devbox = await client.devboxes.create({
  launch_parameters: {
    after_idle: {
      idle_time_seconds: 1800,
      on_idle: "suspend"
    }
  }
  });
  ```
  </CodeGroup>
  */}


# Troubleshooting Blueprint Builds
Source: https://docs.runloop.ai/devboxes/troubleshooting-blueprints

Debug and fix your Blueprint builds in Runloop.

## Step 1: Check Blueprint Logs

Start by examining the build process logs. Runloop builds a Docker image behind the scenes, and you can access these logs using the Blueprint logs endpoint.

<CodeGroup>
  ```bash curl
  curl -X GET 'https://api.runloop.ai/v1/blueprints/{blueprint_id}/logs' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY"
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  logs = client.blueprints.logs("{blueprint_id}")
  for log in logs.logs:
      print(f"{log.level}: {log.message}")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function getBlueprintLogs(blueprintId: string) {
    const logs = await client.blueprints.logs(blueprintId);
    logs.logs.forEach(log => {
      console.log(`${log.level}: ${log.message}`);
    });
  }

  getBlueprintLogs("{blueprint_id}");
  ```
</CodeGroup>

Replace `{blueprint_id}` with your actual Blueprint ID.

### Interpreting Log Output

The logs can help you identify specific build issues. Here's an example of what you might see:

```json
[
  {
    "level": "info",
    "timestamp_ms": 1722357063912,
    "message": "fatal: could not read Password for 'https://$GH_TOKEN@github.com': No such device or address"
  },
  {
    "level": "info",
    "timestamp_ms": 1722357063915,
    "message": "error building image: error building stage: failed to execute command: waiting for process to exit: exit status 128"
  }
]
```

In this example, the error suggests an issue with GitHub authentication, possibly due to an invalid or missing token.
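
Note that both entries in this example are logged at the `info` level even though they describe a failure, so filtering on `level` alone can miss the root cause. The snippet below is a minimal sketch of scanning messages for failure keywords instead. It assumes log entries shaped like the example above; `find_failure_lines` and its keyword list are hypothetical helpers, not part of the Runloop SDK.

```python
# Hypothetical helper: scan Blueprint log entries for messages that look like failures.
# The keyword list is an assumption; extend it to match the errors you actually see.
FAILURE_KEYWORDS = ("fatal", "error", "failed", "exit status")


def find_failure_lines(entries):
    """Return the messages that contain any failure keyword (case-insensitive)."""
    return [
        entry["message"]
        for entry in entries
        if any(keyword in entry["message"].lower() for keyword in FAILURE_KEYWORDS)
    ]


sample_logs = [
    # Made-up benign entry, included only for contrast.
    {"level": "info", "timestamp_ms": 1722357063900, "message": "Cloning repository"},
    # Entry taken from the example output above.
    {
        "level": "info",
        "timestamp_ms": 1722357063912,
        "message": "fatal: could not read Password for 'https://$GH_TOKEN@github.com': No such device or address",
    },
]

for message in find_failure_lines(sample_logs):
    print(message)
```

With the SDK, the same function could be fed `[{"message": log.message} for log in logs.logs]` from the Python example in Step 1.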

## Step 2: Local Build Testing

If the logs don't reveal an obvious problem, you may want to build the Docker image locally. This can help identify issues specific to your development environment or configuration.

### 2.1 Obtain the Dockerfile

Use the `preview` endpoint to get the full Docker configuration:

<CodeGroup>
  ```bash curl
  curl -X POST 'https://api.runloop.ai/v1/blueprints/preview' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $RUNLOOP_API_KEY" \
    -d '{
    "name": "ai-dev-environment",
    "dockerfile": "FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n"
  }' | jq -r '.dockerfile' > Dockerfile.runloop
  ```

  ```python Python
  import os
  from runloop_api_client import Runloop

  client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  preview = client.blueprints.preview(
      name="ai-dev-environment",
      dockerfile="FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n"
  )

  with open("Dockerfile.runloop", "w") as f:
      f.write(preview.dockerfile)

  print("Dockerfile saved as Dockerfile.runloop")
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';
  import fs from 'fs/promises';

  const client = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  async function getDockerfilePreview() {
    const preview = await client.blueprints.preview({
      name: "ai-dev-environment",
      dockerfile: "FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n"
    });

    await fs.writeFile("Dockerfile.runloop", preview.dockerfile);
    console.log("Dockerfile saved as Dockerfile.runloop");
  }

  getDockerfilePreview();
  ```
</CodeGroup>

<Note>
  The curl example uses `jq` to extract the Dockerfile from the response and save it to a file named `Dockerfile.runloop`.
</Note>

### 2.2 Build Locally

With your `Dockerfile.runloop` ready, you can test and debug the build locally:

```bash
docker build --build-arg GH_TOKEN_0="$GH_TOKEN" -f Dockerfile.runloop -t local-blueprint-img .
```

## Step 3: Common Issues and Solutions

Here are some common issues you might encounter and how to resolve them:

1. **GitHub Authentication Errors**:
   * Ensure your `GH_TOKEN` is valid and has the necessary permissions.
   * Check that the token is correctly set in your environment variables.

2. **Package Installation Failures**:
   * Verify that your `system_setup_commands` are correct and compatible with the base image.
   * Ensure you're using the correct package manager (apt for Debian-based images).

3. **CodeMount Issues**:
   * Double-check the repository name, owner, and access permissions.
   * Verify that the `install_command` is appropriate for your project.

4. **Resource Constraints**:
   * If the build is timing out or failing due to resource limits, consider optimizing your Dockerfile or increasing resource allocations.

## Step 4: Seeking Additional Help

If you're still encountering issues after following these steps:

1. Review the [Runloop Documentation](https://docs.runloop.ai) for any updates or known issues.
2. Reach out to Runloop support with:
   * Your Blueprint ID
   * The full logs from both Runloop and your local build attempts
   * A description of the steps you've taken to troubleshoot

By following this troubleshooting guide, you should be able to identify and resolve most issues with your Blueprint builds.


# Open a Tunnel to a Service on a Devbox
Source: https://docs.runloop.ai/devboxes/tunnels

Create a tunnel to access ports on your Devbox

export const ExampleRepoLink = props => {
  return <Info><h3><a href={props.link}>Full example</a></h3></Info>;
};

<ExampleRepoLink link="https://github.com/runloopai/runloop-examples/tree/main/devboxes/web-tunnels" />

When developing software on your Devbox, you will often want to expose local services running on your Devbox to the outside world.
For example, you may want to have your agent start a local web server to serve a frontend application and then expose the live frontend to your users.
Other uses for tunnels include:

* collaborating remotely on a frontend project,
* testing a web service,
* accessing a Jupyter notebook running on your Devbox,
* accessing a local database running on your Devbox.

Devbox tunnels let you securely access ports on your Devbox through a simple URL.

## Setting up a tunnel

To set up a Devbox tunnel, first declare the ports you want to access when you create the Devbox.

<Steps>
  <Step title="Create a devbox with the ports you want to expose">
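    List each port you want to expose under `launch_parameters.available_ports`. The example below also starts a simple HTTP server on port 4040 as the entrypoint.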

    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "launch_parameters": {
            "available_ports": [4040]
          },
          "entrypoint": "python3 -m http.server 4040"
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

      devbox = client.devboxes.create(
          launch_parameters={
              "available_ports": [4040]
          },
          entrypoint="python3 -m http.server 4040"
      )
      ```

      ```typescript TypeScript
      import Runloop from '@runloop/api-client';

      const client = new Runloop({
        bearerToken: process.env.RUNLOOP_API_KEY,
      });

      async function createDevbox() {
        const devbox = await client.devboxes.create({
          launch_parameters: {
            available_ports: [4040]
          },
          entrypoint: "python3 -m http.server 4040"
        });
      }

      createDevbox();
      ```
    </CodeGroup>
  </Step>

  <Step title="Create a tunnel to the port you want to expose">
    Once your Devbox is running, you can open a tunnel. Use the `create_tunnel`
    endpoint to create a unique URL for the exposed port.

    <CodeGroup>
      ```bash curl
      curl -X POST 'https://api.runloop.ai/v1/devboxes/<devbox_id>/create_tunnel' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{
          "port": 4040
        }'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

      # `devbox` is the Devbox created in the previous step
      tunnel = client.devboxes.create_tunnel(
          id=devbox.id,
          port=4040
      )
      print(tunnel.url)
      ```

      ```typescript TypeScript
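      // `client` is the Runloop client initialized in the previous step
      async function createTunnel(devboxId: string) {
        const tunnel = await client.devboxes.createTunnel(devboxId, {
          port: 4040
        });
        console.log(tunnel.url);
      }

      // Pass the ID of the Devbox created in the previous step
      createTunnel("<devbox_id>");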
      ```
    </CodeGroup>

    From the result, extract the `url` field and open it in your browser to access the service running on your Devbox.

    <Warning>While the Devbox is active and the tunnel is open, anyone with the URL can reach this port on your Devbox. Treat the URL with care.</Warning>
  </Step>
</Steps>
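
As a rough sketch of the final step, the snippet below pulls the URL out of a `create_tunnel` response body returned by the curl call. The JSON shape (top-level `devbox_id`, `port`, and `url` fields) and the sample values are assumptions for illustration; check the API reference for the exact response schema. `extract_tunnel_url` is a hypothetical helper.

```python
import json


def extract_tunnel_url(response_body: str) -> str:
    """Parse a create_tunnel JSON response and return its URL.

    Assumes the response carries a top-level "url" field.
    """
    payload = json.loads(response_body)
    url = payload.get("url")
    if not url:
        raise ValueError("Tunnel response did not include a 'url' field")
    return url


# Made-up response body for illustration only.
sample_response = '{"devbox_id": "dbx_example", "port": 4040, "url": "https://example.invalid/tunnel"}'
print(extract_tunnel_url(sample_response))
```

Raising on a missing field keeps a malformed or error response from being passed along as if it were a usable URL.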


# Usage with Common Model Providers and Frameworks
Source: https://docs.runloop.ai/examples/llm-integrations


### **Initialization**

To integrate with Runloop and your preferred LLM provider, initialize the respective SDK clients.

<CodeGroup>
  ```python Python
  from anthropic import Anthropic
  from runloop_api_client import Runloop
  import os

  # Initialize clients
  anthropic = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))
  ```

  ```typescript TypeScript
  import Runloop from '@runloop/api-client';
  import Anthropic from '@anthropic-ai/sdk';

  // Initialize clients
  const anthropic = new Anthropic({apiKey: process.env.ANTHROPIC_API_KEY,});
  const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY});
  ```
</CodeGroup>
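
`os.environ.get` returns `None` when a variable is unset, so a missing key is easier to diagnose if it is caught before any client is constructed. The following is a minimal sketch of such a check; `require_env` is a hypothetical helper, not part of either SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (commented out so the snippet runs without real credentials):
# runloop = Runloop(bearer_token=require_env("RUNLOOP_API_KEY"))
# anthropic = Anthropic(api_key=require_env("ANTHROPIC_API_KEY"))
```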

### **Defining Prompts**

Defining a clear, actionable prompt ensures accurate LLM responses.

<CodeGroup>
  ```python Python
  system_prompt = (
      "You are a helpful coding assistant that can generate and execute python code. "
      "You only respond with the code to be executed and nothing else. "
      "Strip backticks in code blocks."
  )

  prompt = (
      "Write a Python script that generates a maze. The script should:\n"
      "1. Accept a size parameter from command line arguments\n"
      "2. Generate a random maze of the specified size. Remember to make the maze solvable "
      "and easy and to make it clear the outer borders of the maze.\n"
      "3. Print the maze where '#' represents walls and ' ' represents paths. "
      "Mark the Maze start with 'S' and end with 'E'\n"
      "4. Use argparse for command line argument parsing\n"
      "The code should be in the format of a Python script that can be run directly "
      "with 'python gen_maze.py --size 5'.\n"
      "ONLY output the code and do NOT wrap the code in markdown! The code should begin "
      "with an import and end with a print statement."
  )
  ```

  ```typescript TypeScript
  const systemPrompt =
    "You are a helpful coding assistant that can generate and execute python code. " +
    "You only respond with the code to be executed and nothing else. " +
    "Strip backticks in code blocks.";

  const prompt =
    "Write a Python script that generates a maze. The script should:\n" +
    "1. Accept a size parameter from command line arguments\n" +
    "2. Generate a random maze of the specified size. Remember to make the maze solvable " +
    "and easy and to make it clear the outer borders of the maze.\n" +
    "3. Print the maze where '#' represents walls and ' ' represents paths. " +
    "Mark the Maze start with 'S' and end with 'E'\n" +
    "4. Use argparse for command line argument parsing\n" +
    "The code should be in the format of a Python script that can be run directly " +
    "with 'python gen_maze.py --size 5'.\n" +
    "ONLY output the code and do NOT wrap the code in markdown!";
  ```
</CodeGroup>

### **Generating Code**

Send the defined prompts to the LLM's message endpoint, configure parameters and extract the generated code from the response.

<CodeGroup>
  ```python Python
  # Generate code using Claude
  response = anthropic.messages.create(
      model="claude-3-5-sonnet-20240620",
      max_tokens=1024,
      # The system prompt goes in the `system` parameter, not in the messages list
      system=system_prompt,
      messages=[
          {"role": "user", "content": prompt}
      ]
  )
  maze_generation_script = response.content[0].text
  ```

  ```typescript TypeScript
  const {content} = await anthropic.messages.create({
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1000,
    temperature: 0,
    system: "Respond only with code. Do not include any markdown or comments.",
    messages: [{
        "role": "user",
        "content": [
            {
            "type": "text",
            "text": prompt
            }]
        }]
    });
  const mazeGenerationScript = (content[0] as { text: string }).text;
  ```
</CodeGroup>
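
Even when the prompt asks for raw code, a model may still wrap its answer in a Markdown code fence, which would make the saved script fail to run. The snippet below is a minimal sketch of a defensive cleanup step that could be applied to the generated text before writing it to a Devbox; `strip_code_fences` is a hypothetical helper and only handles a single fence around the whole response.

```python
FENCE = "`" * 3  # three backticks, built this way so the snippet embeds cleanly in Markdown


def strip_code_fences(text: str) -> str:
    """Remove one surrounding Markdown code fence, if present, and return the inner code."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]  # drop the opening fence, including any language tag
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence
    return "\n".join(lines)


fenced = f"{FENCE}python\nimport argparse\nprint('maze')\n{FENCE}"
print(strip_code_fences(fenced))
```

Text that is not fenced passes through unchanged apart from leading and trailing whitespace being trimmed.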

### **Running Code on a Devbox**

After retrieving the code from the LLM, execute it in a Runloop Devbox.

<CodeGroup>
  ```python Python
  devbox = runloop.devboxes.create_and_await_running()
  print("Devbox ID:", devbox.id)

  runloop.devboxes.write_file_contents(
    devbox.id,
    file_path="gen_maze.py",
    contents=maze_generation_script
  )

  result = runloop.devboxes.execute_sync(
    devbox.id,
    command="python gen_maze.py --size 10"
  )

  # An exit status of 0 means the script ran successfully
  if not result.exit_status:
    print("Maze generated successfully\n", result.stdout)
  else:
    print("Script execution failed:", result.stderr)
  ```

  ```typescript TypeScript
  const devbox = await runloop.devboxes.createAndAwaitRunning();
  console.log(`Devbox ID: ${devbox.id}`);

  await runloop.devboxes.writeFileContents(devbox.id, {
    file_path: "gen_maze.py",
    contents:  mazeGenerationScript,
  });

  const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
    command: "python gen_maze.py --size 11",
  });

  exit_status === 0
    ? console.log("Maze generated successfully\n", stdout)
    : console.error("Maze generation failed\n", stderr);
  ```
</CodeGroup>
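
When the generated script is one step in a larger pipeline, it can help to turn a non-zero exit status into an exception rather than only printing it. The following is a minimal sketch using a stand-in `ExecutionResult` dataclass with the three fields read above (`exit_status`, `stdout`, `stderr`); both the dataclass and `output_or_raise` are hypothetical helpers, not SDK types.

```python
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Stand-in for the object returned by execute_sync; only the fields used above."""
    exit_status: int
    stdout: str
    stderr: str


def output_or_raise(result: ExecutionResult) -> str:
    """Return stdout for a zero exit status, otherwise raise with stderr attached."""
    if result.exit_status != 0:
        raise RuntimeError(
            f"Command exited with status {result.exit_status}: {result.stderr.strip()}"
        )
    return result.stdout


# Made-up result for illustration only.
print(output_or_raise(ExecutionResult(exit_status=0, stdout="maze output\n", stderr="")))
```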

### **Integrating with Popular Frameworks**

The examples below show how to integrate Runloop with popular frameworks and LLM providers.

The examples follow this structure:

1. **Client Initialization**: Set up SDK clients with environment variables.
2. **Prompt Definition**: Use pre-defined system and user prompts.
3. **Code Generation**: Generate code based on the prompts.
4. **Execution**: Run the code in a secure Runloop Devbox.

<Note>Prompts are defined above and reused across examples.</Note>
<Note>Some TypeScript examples use the non-null assertion operator (`!`) on environment variables. Replace it with a default value or an explicit check as needed.</Note>

## **TypeScript Integrations**

<CodeGroup>
  ```typescript Claude
  import Runloop from '@runloop/api-client';
  import Anthropic from '@anthropic-ai/sdk';

  // Initialize clients
  const anthropic = new Anthropic({apiKey: process.env.ANTHROPIC_API_KEY,});
  const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY});


  async function generateMazeCreator() {
    try{
      const {content} = await anthropic.messages.create({
        model: "claude-3-5-sonnet-20241022",
        max_tokens: 1000,
        temperature: 0,
        system: "Respond only with code. Do not include any markdown or comments.",
        messages: [{
          "role": "user",
          "content": [
            {
            "type": "text",
            "text": prompt
            }]
          }
      ]
      });
      const mazeGenerationScript = (content[0] as { text: string }).text;

      // Execute the script in a Devbox
      const devbox = await runloop.devboxes.createAndAwaitRunning();
      console.log(`Devbox ID: ${devbox.id}`);

      await runloop.devboxes.writeFileContents(devbox.id, {
        file_path: "gen_maze.py",
        contents:  mazeGenerationScript,
      });

      const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
        command: "python gen_maze.py --size 11",
      });

      exit_status === 0
        ? console.log("Maze generated successfully\n", stdout)
        : console.error("Maze generation failed\n", stderr);

      } catch (error) {
        console.error("Error:", error);
      }
  }

  generateMazeCreator();
  ```

  ```typescript Gemini
  import Runloop from '@runloop/api-client'; 
  import { GoogleGenerativeAI } from "@google/generative-ai";

  // Initialize clients
  const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
  const model = genAI.getGenerativeModel({ model: process.env.GEMINI_MODEL!}); // non-null assertion operator in use, add default value to handle undefined
  const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY});

  async function generateMazeCreator() {
    try{
      // Generate code using Google Gemini
      const mazeGenerationScript = (await model.generateContent(prompt)).response.text();
      console.log(mazeGenerationScript);

      // Execute the script in a Devbox
      const devbox = await runloop.devboxes.createAndAwaitRunning();
      console.log(`Devbox ID: ${devbox.id}`);

      await runloop.devboxes.writeFileContents(devbox.id, {
          file_path: "gen_maze.py",
          contents:  mazeGenerationScript,
      });

      const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
        command: "python gen_maze.py --size 11",
      });

      exit_status === 0
        ? console.log("Maze generated successfully\n", stdout)
        : console.error("Maze generation failed\n", stderr);

    } catch (error) {
        console.error("Error:", error);
    }
  }

  generateMazeCreator();
  ```

  ```typescript LangChain
  import { ChatOpenAI } from "@langchain/openai"; 
  import { ChatPromptTemplate } from "@langchain/core/prompts";
  import { StringOutputParser } from "@langchain/core/output_parsers";
  import Runloop from '@runloop/api-client';

  // Initialize Runloop client
  const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

  async function generateMazeCreator() {
    try {
        // Generate code using OpenAI
        const llm = new ChatOpenAI({ model: "gpt-4o", temperature: 0 });
        const promptTemplate = ChatPromptTemplate.fromMessages([
            { role: "system", content: systemPrompt },
            { role: "user", content: "{input}" },
        ]);

        const outputParser = new StringOutputParser();

        // Create and run the chain
        const chain = promptTemplate.pipe(llm).pipe(outputParser);
        const mazeGenerationScript = await chain.invoke({ input: prompt });
        console.log(mazeGenerationScript);

        // Execute the script in a Devbox
        const devbox = await runloop.devboxes.createAndAwaitRunning();
        console.log(`Devbox ID: ${devbox.id}`);

        await runloop.devboxes.writeFileContents(devbox.id, {
            file_path: "gen_maze.py",
            contents:  mazeGenerationScript,
        });

        const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
            command: "python gen_maze.py --size 10",
        });

        exit_status === 0
            ? console.log("Maze generated successfully\n", stdout)
            : console.error("Maze generation failed\n", stderr);

    } catch (error) {
        console.error("Error:", error);
    }
  }
  generateMazeCreator();
  ```

  ```typescript LlamaIndex
  import { OpenAI, Settings} from "llamaindex"
  import Runloop from '@runloop/api-client';

  // Initialize clients
  const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });
  Settings.llm = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
      model: "gpt-4o",
  })

  // Create an OpenAI agent to generate Python code and run it in a Runloop Devbox
  async function generateMazeCreator() {
      try{
          const {message} = await Settings.llm.chat({
              messages: [{
                  role: "user",
                  content: prompt,}],
          });

          const mazeGenerationScript = message.content.toString();
          console.log(mazeGenerationScript)

          // Execute the script in a Devbox
          const devbox = await runloop.devboxes.createAndAwaitRunning();
          console.log(`Devbox ID: ${devbox.id}`);

          await runloop.devboxes.writeFileContents(devbox.id, {
              file_path: "gen_maze.py",
              contents:  mazeGenerationScript,
          });

          const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
              command: "python gen_maze.py --size 10",
          });

          exit_status === 0
              ? console.log("Maze generated successfully\n", stdout)
              : console.error("Maze generation failed\n", stderr);

      } catch (error) {
          console.error("Error:", error);
      }
  }

  generateMazeCreator();

  ```

  ```typescript Mistral
  import Runloop from '@runloop/api-client';
  import { Mistral } from '@mistralai/mistralai';

  // Create Mistral client
  const client = new Mistral({apiKey: process.env.MISTRAL_API_KEY});
  const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY});


  async function generateMazeCreator() {
      try{       
           // Generate code using Mistral 
          const {choices}= await client.chat.complete({
              model: 'codestral-latest',
              messages: [
                  {role: 'system', content: systemPrompt},
                  {role: 'user', content: prompt}],
            });
          const mazeGenerationScript = choices![0].message.content!.toString();
          console.log(mazeGenerationScript);
          
          // Execute the script in a Devbox
          const devbox = await runloop.devboxes.createAndAwaitRunning();
          console.log(`Devbox ID: ${devbox.id}`);

          await runloop.devboxes.writeFileContents(devbox.id, {
              file_path: "gen_maze.py",
              contents:  mazeGenerationScript,
          });

          const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
              command: "python gen_maze.py --size 10",
          });

          exit_status === 0
              ? console.log("Maze generated successfully\n", stdout)
              : console.error("Maze generation failed\n", stderr);

      } catch (error) {
          console.error("Error:", error);
      }
  }

  generateMazeCreator();
  ```

  ```typescript OpenAI
  import { OpenAI } from "openai";
  import Runloop from "@runloop/api-client";

  // Initialize clients
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
  const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

  async function generateMazeCreator() {
      try {
          // Generate code using OpenAI
          const { choices } = await openai.chat.completions.create({
              model: "gpt-3.5-turbo",
              messages: [{ role: "user", content: prompt }],
              temperature: 0,
          });
          const mazeGenerationScript = choices[0].message.content ?? "print('AI could not generate the code')";

          // Execute the script in a Devbox
          const devbox = await runloop.devboxes.createAndAwaitRunning();
          console.log(`Devbox ID: ${devbox.id}`);

          await runloop.devboxes.writeFileContents(devbox.id, {
              file_path: "gen_maze.py",
              contents:  mazeGenerationScript,
          });

          const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
              command: "python gen_maze.py --size 10",
          });

          exit_status === 0
              ? console.log("Maze generated successfully\n", stdout)
              : console.error("Maze generation failed\n", stderr);

      } catch (error) {
          console.error("Error:", error);
      }
  }

  generateMazeCreator();
  ```

  ```typescript VercelAI
  // npm install ai @ai-sdk/openai @runloop/api-client
  import { openai } from '@ai-sdk/openai'
  import { generateText } from 'ai'
  import Runloop from '@runloop/api-client';

  // Initialize clients
  const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });
  const model = openai('gpt-4o')


  async function generateMazeCreator() {
      try{
          // Generate code with OpenAI
          const { text: mazeGenerationScript } = await generateText({
          model,
          prompt
          })
          console.log(mazeGenerationScript)

          // Execute the script in a Devbox
          const devbox = await runloop.devboxes.createAndAwaitRunning();
          console.log(`Devbox ID: ${devbox.id}`);

          await runloop.devboxes.writeFileContents(devbox.id, {
              file_path: "gen_maze.py",
              contents:  mazeGenerationScript,
          });

          const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
              command: "python gen_maze.py --size 10",
          });

          exit_status === 0
              ? console.log("Maze generated successfully\n", stdout)
              : console.error("Maze generation failed\n", stderr);

      } catch (error) {
          console.error("Error:", error);
      }
  }

  generateMazeCreator();

  ```
</CodeGroup>

## **Python Integrations**

<CodeGroup>
  ```python Claude
  from anthropic import Anthropic
  from runloop_api_client import Runloop
  import os

  # Initialize clients
  anthropic = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try:
          # Generate code using Claude
          response = anthropic.messages.create(
              model="claude-3-5-sonnet-20240620",
              max_tokens=1024,
              # The system prompt goes in the `system` parameter, not in the messages list
              system=system_prompt,
              messages=[
                  {"role": "user", "content": prompt}
              ]
          )
          maze_generation_script = response.content[0].text

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
           file_path= "gen_maze.py",
           contents= maze_generation_script
           )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()
  ```

  ```python Gemini
  import google.generativeai as genai
  from runloop_api_client import Runloop
  import os

  # Initialize clients
  genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try:
          # Generate code using Gemini
          model = genai.GenerativeModel("gemini-1.5-flash")
          response = model.generate_content(prompt)
          maze_generation_script = response.text.strip()

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
              file_path= "gen_maze.py",
              contents= maze_generation_script
              )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()
  ```

  ```python LangChain
  from langchain_openai import ChatOpenAI
  from langchain_core.prompts import ChatPromptTemplate
  from langchain_core.output_parsers import StrOutputParser
  from runloop_api_client import Runloop
  import os

  # Initialize client
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try: 
          # Generate code using OpenAI
          llm = ChatOpenAI(model="gpt-4o")
          prompt_template = ChatPromptTemplate.from_messages([
              ("system", system_prompt),
              # "{input}" is filled in by chain.invoke({"input": prompt}) below
              ("human", "{input}")
          ])
          output_parser = StrOutputParser()

          # Create and run the chain
          chain = prompt_template | llm | output_parser
          maze_generation_script = chain.invoke({"input": prompt})

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
           file_path= "gen_maze.py",
           contents= maze_generation_script
           )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()
  ```

  ```python LlamaIndex
  from llama_index.core.llms import ChatMessage
  from llama_index.llms.openai import OpenAI
  from runloop_api_client import Runloop
  import os

  # Initialize client
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try:
          # Generate code using OpenAI through LlamaIndex
          llm = OpenAI(model="gpt-4o")
          response = llm.chat([
              ChatMessage(role="system", content=system_prompt),
              ChatMessage(role="user", content=prompt)
          ])
          maze_generation_script = response.message.content
          print(maze_generation_script)

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
           file_path= "gen_maze.py",
           contents= maze_generation_script
           )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()

  ```

  ```python Mistral

  import os
  from mistralai import Mistral
  from runloop_api_client import Runloop

  # Initialize clients
  client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try:
          # Generate code using Mistral
          response = client.chat.complete(
              model="codestral-latest",
              messages=[
                  {"role": "system", "content": system_prompt},
                  {"role": "user", "content": prompt}])

          maze_generation_script = response.choices[0].message.content

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
           file_path= "gen_maze.py",
           contents= maze_generation_script
           )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()
  ```

  ```python OpenAI
  import os
  from openai import OpenAI
  from runloop_api_client import Runloop

  # Initialize clients
  openai = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
  runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  def generate_maze_creator():
      try:
          # Generate code using OpenAI
          response = openai.chat.completions.create(
              model="gpt-4o",
              messages=[{"role": "user", "content": prompt}],
              temperature=0
          )
          maze_generation_script = response.choices[0].message.content

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(devbox.id,
           file_path= "gen_maze.py",
           contents= maze_generation_script
           )

          result = runloop.devboxes.execute_sync(devbox.id,
              command= "python gen_maze.py --size 10"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
          else:
              print("Script execution failed:", result.stderr)

      except Exception as e:
          print("Error:", e)

  if __name__ == "__main__":
      generate_maze_creator()
  ```

  ```python CrewAI
  from crewai.tools import tool
  from crewai import Agent, Task, Crew, LLM
  from runloop_api_client import Runloop
  import os

  @tool("Tool to generate Python code using LLM")
  def generate_code(prompt: str) -> str:
      """
      Generate Python code based on a given prompt using the LLM.
      """
      try:
          # Generate code using the LLM
          llm = LLM(model="gpt-4o")
          response = llm.chat(messages=[{"role": "user", "content": prompt}])
          generated_code = response['choices'][0]['message']['content']
          
          return generated_code

      except Exception as e:
          print("LLM Exception occurred:", e)
          return str(e)

  @tool("Tool to execute Python code on Runloop")
  def execute_code_on_runloop(code: str, size: int):
      """
      Execute Python code on a Runloop Devbox.
      """
      try:
          # Initialize client
          runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

          # Execute the script in a Devbox
          devbox = runloop.devboxes.create_and_await_running()
          print("Devbox ID:", devbox.id)

          runloop.devboxes.write_file_contents(
              devbox.id,
              file_path="gen_maze.py",
              contents=code
          )

          result = runloop.devboxes.execute_sync(
              devbox.id,
              command=f"python gen_maze.py --size {size}"
          )

          if not result.exit_status:
              print("Maze generated successfully\n", result.stdout)
              return result.stdout
          else:
              print("Script execution failed:", result.stderr)
              return result.stderr

      except Exception as e:
          print("Runloop Exception occurred:", e)
          return str(e)

  # Define the agent
  code_writer_executor = Agent(
      role='Python Code Writer and Executor',
      goal='Write Python scripts based on prompts, execute them, and return the results.',
      backstory='You are an expert Python programmer capable of writing, executing code, and returning results.',
      tools=[generate_code, execute_code_on_runloop],
      llm=LLM(model="gpt-4o")
  )

  # Define the task
  generate_maze_task = Task(
      description="Generate and execute a Python script that creates a maze.",
      agent=code_writer_executor,
      expected_output="A successfully generated and executed maze of size 11.",
      inputs={
          "prompt": prompt,
          "size": 11
      }
  )

  # Create the crew
  maze_generation_crew = Crew(
      agents=[code_writer_executor],
      tasks=[generate_maze_task],
      verbose=True,
  )

  # Run the crew
  result = maze_generation_crew.kickoff()
  print(result)
  ```
</CodeGroup>
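
The examples above write the model's reply straight to `gen_maze.py`. Models sometimes wrap their answer in a Markdown code fence even when asked not to, and a fenced reply fails as a Python script. A minimal sketch of a guard you could apply before uploading the file is shown below; `strip_code_fences` is a hypothetical helper, not part of the Runloop or LLM SDKs.

```python
import re

# Three backticks, built at runtime so this example stays fence-safe.
FENCE = "`" * 3

# Matches a reply that is exactly one fenced block, with an optional language tag.
_FENCED_BLOCK = re.compile(
    rf"^{FENCE}[\w+-]*[ \t]*\n(.*?)\n?{FENCE}[ \t]*$",
    re.DOTALL,
)


def strip_code_fences(reply: str) -> str:
    """Return the code inside a single Markdown fence.

    If the reply is not fenced, it is returned with surrounding
    whitespace trimmed and otherwise unchanged.
    """
    text = reply.strip()
    match = _FENCED_BLOCK.match(text)
    return match.group(1) if match else text
```

You could then pass `strip_code_fences(maze_generation_script)` as the `contents` argument of `write_file_contents`.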


# Quickstart: Giving Agents a Development Environment
Source: https://docs.runloop.ai/overview/quickstart


## Running AI generated code securely with Runloop

Runloop Devboxes are secure, isolated environments for running AI-generated code.

Let's see how we can use Devboxes to safely run AI-generated code that generates mazes.

<Steps>
  <Step title="Set Up Your Environment">
    Set up your API keys as environment variables:

    ```bash
    export RUNLOOP_API_KEY=<your_runloop_api_key_here>
    export OPENAI_API_KEY=<your_openai_api_key_here>
    ```

    <Note>Replace the placeholders with your actual API keys. You can get your Runloop API key from the [Runloop Dashboard](https://platform.runloop.ai/manage/keys).</Note>
  </Step>

  <Step title="Use AI to generate a maze generator program">
    First, we'll use the OpenAI API to generate Python code that generates a maze.

    <CodeGroup>
      ````bash curl
      response=$(curl "https://api.openai.com/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
          "model": "gpt-4o-mini",
          "messages": [
              {
                  "role": "system",
                  "content": "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that."
              },
              {
                  "role": "user",
                  "content": "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`."
              }
          ]
      }')

      python_script=$(echo "$response" | jq -r '.choices[0].message.content')
      ````

      ````python Python
      import os
      import openai

      openai.api_key = os.environ.get("OPENAI_API_KEY")

      response = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
          {"role": "system", "content": "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that."},
          {"role": "user", "content": "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`."}
        ]
      )

      python_script = response.choices[0].message.content
      ````

      ````typescript TypeScript
      import OpenAI from "openai";

      const openai = new OpenAI();

      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
            { role: "system", content: "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that." },
            {
                role: "user",
                content: "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`.",
            },
        ],
      });

      console.log(completion.choices[0].message);

      const pythonScript = completion.choices[0].message?.content;
      ````
    </CodeGroup>
  </Step>

  <Step title="Create a Devbox to securely run the AI generated code">
    Now, let's create a Devbox to use as our sandbox environment.
    Once a Devbox is created, Runloop will automatically provision a secure microVM that can be used to load and run any coding projects.

    A Devbox starts in the 'provisioning' state. Once the Devbox is ready, it transitions to the 'running' state, at which point we can begin using it.

    <CodeGroup>
      ```bash curl
      curl -X POST \
        'https://api.runloop.ai/v1/devboxes' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{}'
      ```

      ```python Python
      import os
      from runloop_api_client import Runloop

      # ... previous code ...

      runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))
      # create the devbox and wait for it to be ready
      devbox = runloop_client.devboxes.create_and_await_running()
      print(devbox.id)
      ```

      ```typescript TypeScript
      import Runloop from '@runloop/api-client';

      // ... previous code ...

      const runloopClient = new Runloop({
        bearerToken: process.env.RUNLOOP_API_KEY,
      });

      // create the devbox and wait for it to be ready
      async function runGenerateMazeProgram(program: string) {
        const devbox = await runloopClient.devboxes.createAndAwaitRunning();
        console.log(devbox.id);
      }

      runGenerateMazeProgram(pythonScript);
      ```
    </CodeGroup>

    <Note>This command will return a Devbox ID (e.g. `dbx_1234567890`). This ID will be used to perform operations on the Devbox.</Note>
  </Step>

  <Step title="Upload the Maze Generator Program to the Devbox">
    Now, let's upload the Python script to the Devbox so we can run it securely.

    <CodeGroup>
      ```bash curl
      curl -X POST \
        'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/write_file_contents' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d "$(jq -n --arg contents "$python_script" '{file_path: "maze.py", contents: $contents}')"
      ```

      ```python Python
      # ... previous code ...
      runloop_client.devboxes.write_file_contents(devbox.id, file_path="maze.py", contents=python_script)
      ```

      ```typescript TypeScript
      async function runGenerateMazeProgram(program: string) {
        // ... previous code ...
        await runloopClient.devboxes.writeFileContents(devbox.id, {
          file_path: 'maze.py',
          contents: program,
        });
      }
      ```
    </CodeGroup>
  </Step>

  <Step title="Run the Maze Generator Program">
    <CodeGroup>
      ```bash curl
      curl -X POST \
        'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY" \
        -H 'Content-Type: application/json' \
        -d '{"command": "python maze.py"}'
      ```

      ```python Python
      exec_result = runloop_client.devboxes.execute_sync("<YOUR_DEVBOX_ID>", command="python maze.py")
      # Print stdout
      print(exec_result.stdout)
      ```

      ```typescript TypeScript
      const execResult = await runloopClient.devboxes.executeSync('<YOUR_DEVBOX_ID>', {
        command: 'python maze.py',
      });
      console.log(execResult.stdout);
      ```
    </CodeGroup>
  </Step>

  <Step title="Shutdown the Devbox">
    Once we are done generating mazes, we can shut down the Devbox to free up resources:

    <CodeGroup>
      ```bash curl
      curl -X POST \
        'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/shutdown' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY"
      ```

      ```python Python
      runloop_client.devboxes.shutdown(devbox.id)
      ```

      ```typescript TypeScript
      await runloopClient.devboxes.shutdown(devbox.id);
      ```
    </CodeGroup>

    By default, Devboxes are configured to automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to automatically shut down after a specified idle period.
  </Step>
</Steps>
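
The steps above can be combined into one function that always shuts the Devbox down, even when the upload or the script fails. The sketch below uses only the client calls shown in this quickstart (`create_and_await_running`, `write_file_contents`, `execute_sync`, `shutdown`); the function name `run_script_in_devbox` and its error handling are assumptions made for illustration.

```python
from typing import Any


def run_script_in_devbox(client: Any, script: str, file_path: str = "maze.py") -> str:
    """Upload `script` to a fresh Devbox, run it, and return its stdout.

    `client` is expected to be a Runloop client, e.g.
    Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")).
    """
    devbox = client.devboxes.create_and_await_running()
    try:
        client.devboxes.write_file_contents(
            devbox.id, file_path=file_path, contents=script
        )
        result = client.devboxes.execute_sync(
            devbox.id, command=f"python {file_path}"
        )
        # A non-zero exit status means the generated script failed.
        if result.exit_status:
            raise RuntimeError(
                f"script exited with status {result.exit_status}: {result.stderr}"
            )
        return result.stdout
    finally:
        # Runs on success and on failure, so the Devbox never lingers.
        client.devboxes.shutdown(devbox.id)
```

With the variables from the steps above, `print(run_script_in_devbox(runloop_client, python_script))` would print the maze.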

## Giving agents a secure development environment via Tools

Besides calling the Runloop API yourself to upload and run code, you can use Runloop Tools to give an agent full access to a Devbox.

For example, let's make a simple coding agent that can generate Python code, and ask it to write a command-line script that prints its command-line arguments as ASCII words!

<Steps>
  <Step title="Create a Devbox for the agent to use">
    Let's create a Devbox to use as a development environment for the agent.

    <CodeGroup>
      ```python Python
      import os
      from runloop_api_client import Runloop

      # Initialize the Runloop client
      runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

      # Initialize a devbox and retrieve the devbox id
      devbox = runloop_client.devboxes.create_and_await_running()
      print(devbox.id)
      ```

      ```typescript TypeScript
      import { Runloop } from "@runloop/api-client";

      // Initialize the Runloop client
      const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

      // Create a devbox and await running
      const devbox = await runloop.devboxes.create();
      await runloop.devboxes.awaitRunning(devbox.id);
      console.log(devbox.id);
      ```
    </CodeGroup>

    <Note>This command will return a Devbox ID (e.g. `dbx_1234567890`). This ID will be used to perform operations on the Devbox.</Note>
  </Step>

  <Step title="Create tools so the agent can use the Devbox">
    Next, let's create tools bound to the Devbox so the agent can use it. In the examples below, we will use:

    * **Python**: Utilize the Ell framework to create tools and run an agent.
    * **TypeScript**: Define each tool and pass them as a `ChatCompletionTool[]`.

    However, the tools can easily be created in any language and any framework! Check out our [examples repository](https://github.com/runloopai/examples.git) for more examples in your favorite language or framework.

    <CodeGroup>
      ```python Python
      import ell
      from runloop_api_client import Runloop

      @ell.tool()
      def execute_shell_command(command: str):
          """Run a shell command in the devbox."""
          return runloop_client.devboxes.execute_sync(devbox.id, command=command).stdout

      @ell.tool()
      def read_file(filename: str):
          """Reads a file on the devbox."""
          return runloop_client.devboxes.read_file_contents(devbox.id, file_path=filename)

      @ell.tool()
      def write_file(filename: str, contents: str):
          """Writes a file on the devbox."""
          runloop_client.devboxes.write_file_contents(
              devbox.id, file_path=filename, contents=contents
          )
      ```

      ```typescript TypeScript
      const tools = [
        {
          type: "function",
          function: {
            name: "executeShellCommand",
            description: "Run a shell command in the devbox",
            parameters: {
              type: "object",
              properties: {
                command: {
                  type: "string",
                  description: "The shell command to execute.",
                },
              },
              required: ["command"],
              additionalProperties: false,
            },
            strict: true,
          },
        } as const,
      ];
      ```
    </CodeGroup>

    By default, Devboxes are configured to automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to automatically shut down after a specified idle period.
  </Step>

  <Step title="Run the agent with the tools">
    Now we can give the tools to the agent and ask it to generate a script and run the program in the devbox.

    <CodeGroup>
      ```python Python
      from typing import List

      from ell import Message

      @ell.complex(
          model="gpt-4-turbo", tools=[execute_shell_command, read_file, write_file]
      )
      def invoke_agent(message_history: List[Message]):
          """Calls the LLM to generate the program."""
          messages = [
              ell.system(SYSTEM_PROMPT),
              ell.user(USER_PROMPT),
          ] + message_history
          return messages
      ```

      ```typescript TypeScript
      import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

      // The system prompt uses the "system" role; the user request follows it.
      const messageHistory: ChatCompletionMessageParam[] = [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: USER_PROMPT },
      ];

      let response = await openai.chat.completions.create({
        messages: messageHistory,
        model: "gpt-4-turbo",
        tools: tools,
        tool_choice: "auto",
      });
      ```
    </CodeGroup>
  </Step>

  <Step title="Shutdown the Devbox">
    Once the agent has finished its work, we can shut down the Devbox:

    <CodeGroup>
      ```bash curl
      curl -X POST \
        'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/shutdown' \
        -H "Authorization: Bearer $RUNLOOP_API_KEY"
      ```

      ```python Python
      runloop_client.devboxes.shutdown(devbox.id)
      ```

      ```typescript TypeScript
      await runloop.devboxes.shutdown(devbox.id);
      ```
    </CodeGroup>
  </Step>
</Steps>
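
The steps above declare tools and send a request, but something still has to execute the tool calls the model returns and feed the results back. The sketch below shows one way to do that dispatch in a framework-agnostic way. It assumes OpenAI-style tool calls, where each call carries a function name and a JSON string of arguments; `dispatch_tool_call` and `registry` are hypothetical names, not part of the Runloop SDK or Ell.

```python
import json
from typing import Callable, Dict


def dispatch_tool_call(
    registry: Dict[str, Callable[..., object]],
    name: str,
    arguments_json: str,
) -> str:
    """Run one tool call and return a string the model can read.

    Errors are returned as text instead of raised, so the agent can
    see what went wrong and try again.
    """
    tool = registry.get(name)
    if tool is None:
        return f"error: unknown tool {name!r}"
    try:
        kwargs = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON ({exc})"
    if not isinstance(kwargs, dict):
        return "error: arguments must be a JSON object"
    try:
        result = tool(**kwargs)
    except Exception as exc:  # report tool failures back to the model
        return f"error: {type(exc).__name__}: {exc}"
    return "" if result is None else str(result)
```

In an agent loop, `registry` would map tool names to plain Python functions that wrap the Devbox calls shown earlier, and each returned string would be appended to the message history as the tool's result.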

<div className="section" title="Explore More with Runloop">
  Explore the [Runloop Examples Repository](https://github.com/runloopai/examples.git) to see how to build AI agents that run on Runloop.
</div>


# What is Runloop?
Source: https://docs.runloop.ai/overview/what-is-runloop


Runloop is the *batteries included* platform designed for building and optimizing
AI-driven software engineering agents.

With the Runloop platform, you get:

* Fast, isolated, snapshottable virtual machines for executing agents & agent tools ([Devboxes](/devboxes/overview)).
* Team-shareable templates for launching new devboxes with custom configuration ([Blueprints](/devboxes/blueprints)), and persistent disk state ([Snapshots](/devboxes/snapshots)).
* Zero-configuration code repository integration ([Code Mounts](/devboxes/code-mounts)) and a fully-featured, ready-to-use language server{/* ([Code Understanding APIs](/overview/what-is-runloop#code-understanding-apis))*/}.
* Turnkey benchmarking ([Benchmarks](/benchmarks/overview)) and evaluation services for fine-tuning your agent's behavior{/* ([Code Scenarios](/overview/what-is-runloop#code-scenarios))*/}.

Whether you are trying to build an AI agent that can respond to pull requests, or an AI agent that can generate new UI components, Runloop makes it possible to get from zero to POC in just a few lines of code.

## Why Runloop?

Our mission at Runloop is to keep you focused on the things that differentiate your AI agent. Leave the building blocks to us and spend time on what actually matters.

As your agent evolves, your needs will evolve too. Runloop is designed for builders at all stages:

| Stage       | Why Runloop                                                                                                                          |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| Prototyping | <ul><li>Zero infrastructure worries using managed, instant-on devboxes.</li><li>Build, deploy, learn, and iterate quickly.</li></ul> |
| Production  | <ul><li>Team-shared blueprints and projects.</li><li>24/7/365 managed platform and on-call team.</li><li>SOC2 compliant.</li></ul>   |
| Growth      | <ul><li>Benchmarking and evaluation stack to monitor and fine-tune your agent's performance.</li></ul>                               |

### Use cases

Our customers are already leveraging Runloop to build AI agents that can:

* Respond to Pull Requests and enhance the code review process
* Enable users to chat with and navigate their codebase
* Generate new test cases for existing codebases
* Act as pair programmers
* Generate new UI components for their frontend

Have a use case that we didn't cover?
Send us an email at [support@runloop.ai](mailto:support@runloop.ai) to learn more about how Runloop can help you build AI agents.

## Core Components of Runloop

### Devboxes

Devboxes are isolated, cloud-based development environments that can be controlled by AI agents via the Runloop API.
You can give agents access to a devbox to let them run and test code in a safe, isolated environment.

<CodeGroup>
  ```typescript TypeScript
  import Runloop from '@runloop/api-client';
  import { openai } from '@ai-sdk/openai';
  import { generateText, tool } from 'ai';

  const runloopClient = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  // Create an isolated Devbox for the agent to use
  const devbox = await runloopClient.devboxes.createAndAwaitRunning();
  // Get the Runloop tool representations for the Devbox as Vercel AI SDK tools
  const runloopDevboxShellTools = runloopClient.devboxes.tools.shellTools(devbox.id);
  const runloopDevboxFileTools = runloopClient.devboxes.tools.fileTools(devbox.id);

  // Use the Vercel AI SDK to create a simple agent that uses the Devbox to code a game
  const { text: answer } = await generateText({
      model: openai('gpt-4o-2024-08-06'),
      tools: {
          ...runloopDevboxShellTools,
          ...runloopDevboxFileTools
      },
      maxSteps: 10,
      system:
          'You are an expert python coder that specializes in making CLI games.',
      prompt:
          'Create a CLI game that is a guessing game where the user has to guess a number between 1 and 100. Write the python script in the file `game.py`. The program should be callable from the command line via `python game.py`. Once you have generated the program, run it and print the output to stdout.'
  });

  console.log(`ANSWER: ${answer}`);
  ```
</CodeGroup>

{/*
  Removing APIs until they're ready for public access

  ### Code Understanding APIs
  <Note>
  Code Understanding APIs are currently in beta. Please contact us at [support@runloop.ai](mailto:support@runloop.ai) to get access.
  </Note>
  A critical part of making AI SWE agents work reliably is giving them the right context to solve the problem. In many cases, this means extracting context from the existing codebase such as function signatures, finding tests that cover a specific segment of code, or understanding which files are often edited together.
  For example, one common heuristic that helps AI agents navigate codebases is [the Repository Map used by Aider](https://aider.chat/docs/repomap.html). Writing your own Repository Map style heuristic can be difficult as it requires static analysis of the codebase. Other heuristics can be even harder to create and rely on gathering information from the runtime dataflow of the codebase.
  The Code Understanding APIs aim to make it possible to create these types of heuristics in just a few lines of code. 


  <CodeGroup>
  ```typescript TypeScript [expandable]
  import Runloop from '@runloop/api-client';

  const runloopClient = new Runloop({
    bearerToken: process.env.RUNLOOP_API_KEY,
  });

  // First create a repository connection for Runloop to preindex the codebase we will run on:
  const repositoryConnectionView = await client.repositories.createAndAwaitIndexing({ name: 'repository-name', owner: 'repository-owner' });

  // Now let's use the repository connection to recreate the repository map heuristic:
  // 1. We list all the code files in the repository
  const codeFiles = await client.repositories.codeFiles(repositoryConnectionView.id);
  // 2. We can use the special file viewer and query syntax to only view the files with method signatures
  const files_with_method_signatures_only = await client.repositories.fileViewer(repositoryConnectionView.id, {
    files: codeFiles,
    // We make our query against the AST such that we only get class and method nodes and only include the signature and comments 
    query: 'class(signatures, comments) || method(signatures, comments)'
  });

  // Or we can simply use the built in repository map heuristic:
  const repositoryMap = await client.repositories.repositoryMap(repositoryConnectionView.id);
  ```

  ```python Python [expandable]
  import os
  from runloop_api_client import Runloop

  # Create a Runloop client
  runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

  # First create a repository connection for Runloop to preindex the codebase we will run on:
  repository_connection_view = runloop_client.repositories.create(
    name='repository-name', 
    owner='repository-owner'
  )

  # Now let's use the repository connection to recreate the repository map heuristic:
  # 1. We list all the code files in the repository
  code_files = runloop_client.repositories.code_files(repository_connection_view.id)

  # 2. We can use the special file viewer and query syntax to only view the files with method signatures
  files_with_method_signatures_only = runloop_client.repositories.file_viewer(
    repository_connection_view.id,
    files=code_files,
    # We make our query against the AST such that we only get class and method nodes and only include the signature and comments 
    query='class(signatures, comments) || method(signatures, comments)'
  )

  # Or we can simply use the built in repository map heuristic:
  repository_map = runloop_client.repositories.repository_map(repository_connection_view.id)
  ```

  </CodeGroup>

  ### Code Scenarios
  <Note>
  Code Scenarios APIs are currently in beta. Please contact us at [support@runloop.ai](mailto:support@runloop.ai) to get access.
  </Note>

  Tuning the behavior of your AI agent is a critical part of making it work reliably. However, going from POC to production is where most AI agents fail.
  Code Scenarios is a set of Benchmarking and Eval Tools that help you understand and improve your AI agent's behavior in a methodical way.
  For example, with Code Scenarios:
  - You can run your agent against well known benchmarks such as SWE-bench or create your own custom benchmarks
  - You can record live production agent traces and monitor the Agent performance or use the traces to create new benchmarks
  - You can create custom Reward Models based on production traces and benchmark data to fine tune your agent's behavior

  <CodeGroup>
    ```typescript TypeScript [expandable]
    import Runloop from '@runloop/api-client';

    const runloopClient = new Runloop({
      bearerToken: process.env.RUNLOOP_API_KEY,
    });

    // 1. Create new Benchmarks or use existing ones like SWE-bench
    const myBenchmark = await runloop.benchmarks.create({
        benchmark: 'UI Component Generation',
        testCases: [
            {
                name: 'Create a Login Page',
                problemStatement: 'Create a login page that allows users to login to the application. Call the component `LoginPage` and export it from the file `src/components/LoginPage.tsx`.',
                // Configure specific starting Devbox environments for your Agent to use as part of a benchmark test
                environment: 'DEFAULT_DEVBOX',
                outputContractRules: [
                    {
                        type: 'typescript',
                        files: ["expected_snapshot.png"],
                        validate: (output) => {
                            // Validate the storybook snapshot is updated and roughly matches the expected_snapshot.png
                        }
                    }
                ]                
            }
        ]
    });

    // 2. Run your agent against the benchmark
    const benchmarkRun = await runloop.benchmarks.beginTestRun(myBenchmark.id);
    // For each test case run our agent and report the completion
    for (const testCase of myBenchmark.testCases) {
        const agentOutput = await myAgent.run({
            prompt: testCase.problemStatement,
            devbox: testCase.devbox
        });
        await runloop.benchmarks.reportTestCaseRun(testCaseRun.id, {
            output: agentOutput
        });
    }

    ```
  </CodeGroup>
  */}


# Support for AI tools
Source: https://docs.runloop.ai/tools/ai-tools

Add context about the Runloop API to your LLMs

#### Runloop provides first-class context for AI tools to integrate with our APIs

<Info>[The /llms.txt index is available here.](/llms.txt)</Info>

<Info>[Full specification is available here.](/llms-full.txt)</Info>


# Runloop CLI
Source: https://docs.runloop.ai/tools/cli

Explore, experiment with, and test the Runloop API using the Runloop CLI.

## Key Features

* Easy interaction with Runloop API
* Helpful for blueprint testing and devbox management
* Debugging tool for running devboxes
* Potential tool for AI agents to interact with Runloop

## Setting Up the CLI

The Runloop CLI is written in Python and wraps the Runloop Python SDK.

To set up the Runloop CLI:

1. Visit the [Runloop CLI GitHub page](https://github.com/runloopai/rl-cli)
2. Follow the installation instructions provided in the repository

## Essential CLI Commands

Once set up and validated, the CLI offers a range of helpful commands:

### Create a Blueprint

```bash
rl blueprint create --system_setup_commands "sudo apt install cowsay -y" --name cowsay-devbox
```

This command creates a new blueprint named `cowsay-devbox` with the `cowsay` package installed.
The response from the CLI will include an `id` for the blueprint.

### Create a Devbox

```bash
rl devbox create --entrypoint "/usr/games/cowsay runloop" --blueprint_id "<bpt_your_blueprint>"
```

This creates a devbox using a specified blueprint and sets an entrypoint command.

### Monitor Devbox Logs

```bash
watch rl devbox logs --id <dbx_your_id>
```

This command lets you watch the logs of a running devbox in real time.
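
If you want to drive these commands from a script, it helps to build each command as an argument list rather than a shell string, so values containing spaces are passed intact. The sketch below is one way to do that in Python. It uses only the subcommands and flags shown above and assumes the `rl` executable is on your `PATH`; the helper function names are hypothetical.

```python
import shlex
from typing import List


def blueprint_create_argv(name: str, setup_command: str) -> List[str]:
    """Arguments for `rl blueprint create` with one system setup command."""
    return [
        "rl", "blueprint", "create",
        "--system_setup_commands", setup_command,
        "--name", name,
    ]


def devbox_create_argv(blueprint_id: str, entrypoint: str) -> List[str]:
    """Arguments for `rl devbox create` from an existing blueprint."""
    return [
        "rl", "devbox", "create",
        "--entrypoint", entrypoint,
        "--blueprint_id", blueprint_id,
    ]


def devbox_logs_argv(devbox_id: str) -> List[str]:
    """Arguments for `rl devbox logs`."""
    return ["rl", "devbox", "logs", "--id", devbox_id]


# shlex.join shows the equivalent, safely quoted shell command.
print(shlex.join(blueprint_create_argv("cowsay-devbox", "sudo apt install cowsay -y")))
```

Each list can be passed directly to `subprocess.run(argv, check=True)` to execute the command.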

### SSH into a Devbox

For more advanced use cases, you can use the CLI to SSH into your Devbox. For detailed instructions, refer to our [SSH Access guide](/devboxes/debugging-agent-output-with-ssh).

## Contributing

The Runloop CLI is open-source, and we enthusiastically welcome contributions and feedback from our community. To contribute:

1. Visit our [GitHub repository](https://github.com/runloopai/rl-cli)
2. Fork the repository
3. Make your changes
4. Submit a pull request

We appreciate all forms of contribution, from bug reports and feature requests to code contributions and documentation improvements.

## Getting Help

If you encounter any issues or have questions about using the Runloop CLI:

1. Check the [CLI documentation](https://github.com/runloopai/rl-cli/blob/main/README.md) for detailed usage instructions
2. Open an issue on the GitHub repository for bug reports or feature requests
3. Reach out to our team for additional assistance


# Cursor Rules
Source: https://docs.runloop.ai/tools/cursor-files


## Download .mdc Files

We recommend using the [Cursor rules](https://docs.cursor.com/context/rules) below when working with the Runloop SDK. Download the `.mdc` files here:

* <a href="/static/files/runloop-python-client.mdc" download="runloop-python-client.mdc">Python File</a>
* <a href="/static/files/runloop-typescript-client.mdc" download="runloop-typescript-client.mdc">TypeScript File</a>

These files should be added to `.cursor/rules` in your project directory.

## Recommended rules for working with Runloop on Cursor

<Tabs>
  <Tab title="Python">
    ```
    You are working with runloop_api_client, a Python SDK for deploying and managing remote devboxes for AI agents. Use this guide to properly interact with the SDK.

    Devbox Overview

    A Devbox is a virtual development environment designed for running AI-generated or arbitrary code in an isolated sandbox. It provides configurable compute resources, supports execution of shell commands, and can be managed via API.

    Each Devbox has a unique ID, a status (e.g., running, suspended, or shutdown), and metadata for user-defined settings. It includes launch parameters for customizing resource size, execution behavior, and available ports.

    Key attributes:

    ID (string, required) – Unique identifier of the Devbox.
    Status (enum, required) – Current state of the Devbox (e.g., provisioning, running, shutdown).
    Create time (integer, required) – Timestamp (ms) when the Devbox was created.
    Launch parameters (object, required) – Includes startup commands, resource configuration, idle timeout, and network settings.
    Capabilities (list, required) – Defines supported tools such as computer usage APIs, browser usage, and language servers.
    Blueprint/Snapshot ID (string, optional) – Identifier if created from a predefined Blueprint or Snapshot.
    Failure/Shutdown reason (enum, optional) – Reason for termination if applicable (e.g., out of memory, idle shutdown).
    Devboxes can be started, suspended, resumed, or shut down, and they support file operations, shell command execution, and network tunneling.

    CORE SDK USAGE:

    Initialize client:
    from runloop_api_client import Runloop
    client = Runloop(bearer_token="YOUR_API_KEY")

    DEVBOX LIFECYCLE:

    Create a new Devbox:
    devbox_view = client.devboxes.create(
        name="devbox_name",
        blueprint_id="blueprint_id",
        snapshot_id="snapshot_id",
        launch_parameters={}
    )
    client.devboxes.await_running(devbox_view.id)

    Retrieve an existing Devbox:
        devbox_view = client.devboxes.retrieve(id="devbox_id")
    List Devboxes:
        page = client.devboxes.list(status="running", limit=10)
    Suspend a Devbox:
        devbox_view = client.devboxes.suspend(id="devbox_id")
    Resume a Devbox:
        devbox_view = client.devboxes.resume(id="devbox_id")
    Shutdown a Devbox:
        devbox_view = client.devboxes.shutdown(id="devbox_id")
    Keep a Devbox alive:
        response = client.devboxes.keep_alive(id="devbox_id")

    DEVBOX FILE OPERATIONS:

    Write file contents:
        response = client.devboxes.write_file_contents(
            id="devbox_id",
            file_path="path/to/file.txt",
            contents="Hello, world!"
        )
    Read file contents:
        response = client.devboxes.read_file_contents(
            id="devbox_id",
            file_path="path/to/file.txt"
        )
    Upload a file:
        response = client.devboxes.upload_file(id="devbox_id", path="path/to/upload")
    Download a file:
        response = client.devboxes.download_file(id="devbox_id", path="path/to/file.txt")
        content = response.read()

    DEVBOX EXECUTION:

    Execute command synchronously:
        execution_detail = client.devboxes.execute_sync(id="devbox_id", command="ls -la")
    Execute command asynchronously:
        async_execution = client.devboxes.execute_async(id="devbox_id", command="ls -la")
    Retrieve execution status:
        execution_status = client.devboxes.executions.retrieve(
            devbox_id="devbox_id",
            execution_id="execution_id"
        )

    DEVBOX NETWORKING:

    Create a tunnel:
        tunnel = client.devboxes.create_tunnel(id="devbox_id", port=8080)
    Remove a tunnel:
        tunnel = client.devboxes.remove_tunnel(id="devbox_id", port=8080)
    Create an SSH key:
        ssh_key = client.devboxes.create_ssh_key(id="devbox_id")

    DEVBOX SNAPSHOT MANAGEMENT:

    List disk snapshots:
        page = client.devboxes.list_disk_snapshots(devbox_id="devbox_id", limit=10)
    Create a disk snapshot:
        snapshot = client.devboxes.snapshot_disk(id="devbox_id", name="snapshot_name")
    Delete a disk snapshot:
        response = client.devboxes.delete_disk_snapshot(id="snapshot_id")

    DEVBOX LOGGING:

    Retrieve logs:
        logs = client.devboxes.logs.list(id="devbox_id", execution_id="execution_id")

    REPOSITORY OBJECT

      A Repository in Runloop represents a connection to a remote repository and facilitates its automated analysis. This enables users to manage and inspect repositories efficiently.

      ATTRIBUTES
      - id (string, required) - Unique identifier of the Repository.
      - name (string, required) - The name of the Repository.
      - owner (string, required) - The account owner associated with the Repository.
      - status (enum<string>, required) - The current state of the Repository.  
        Available options: `pending`, `failure`, `active`.
      - failure_reason (string | null, optional) - The reason for failure, if the repository has a `failure` status.

      Repositories can be created, retrieved, listed, and deleted using the Runloop API.


    REPOSITORY OPERATIONS


      List repositories:
          repositories = client.repositories.list(limit=10)
      Create a repository connection:
          repository = client.repositories.create(name="repo_name", owner="repo_owner")
      Retrieve repository details:
          repository = client.repositories.retrieve(id="repo_id")
      Delete a repository:
          response = client.repositories.delete(id="repo_id")
      List repository versions:
          repo_versions = client.repositories.versions(id="repo_id")

      Using Blueprints in Runloop SDK

      Blueprints provide a way to create customized starting points for Devboxes, caching environment setups to improve boot times. They allow pre-configured development environments to be reused efficiently.

      Blueprint Object Attributes

      ID (string, required) – Unique identifier of the Blueprint.
      Name (string, required) – The name of the Blueprint.
      Status (enum, required) – Current state of the Blueprint build (e.g., provisioning, building, build_complete, failed).
      Create time (integer, required) – Timestamp (ms) when the Blueprint was created.
      Parameters (object, required) – Configuration used to create the Blueprint.
      Failure reason (enum, optional) – Reason for failure if the build failed (e.g., out_of_memory, out_of_disk, build_failed).

      BLUEPRINT OPERATIONS

      List all Blueprints:
          page = client.blueprints.list()
      Create a new Blueprint:
          blueprint_view = client.blueprints.create(name="custom_blueprint")
      Retrieve a Blueprint by ID:
          blueprint_view = client.blueprints.retrieve(id="blueprint_id")
      Retrieve build logs for a Blueprint:
          blueprint_logs = client.blueprints.logs(id="blueprint_id")
      Preview a Blueprint before creation:
          blueprint_preview = client.blueprints.preview(name="custom_blueprint")

      Blueprints accelerate Devbox setup by caching pre-configured environments, making them ideal for reproducible and scalable development workflows.


      EXECUTION GUIDELINES:

      Always shut down Devboxes after use to free up resources.
      Use execute_async for non-blocking commands.
      Handle API errors using try/except blocks.
      Use tunnels for exposing services running inside Devboxes.
      Use snapshots to persist Devbox states.

    ```
  </Tab>

  <Tab title="Typescript">
    ```.cursorrules
    You are working with @runloop/api-client, a TypeScript SDK for deploying and managing remote devboxes for AI agents. Use this guide to properly interact with the SDK.

    Devbox Overview

    A Devbox is a virtual development environment designed for running AI-generated or arbitrary code in an isolated sandbox. It provides configurable compute resources, supports execution of shell commands, and can be managed via API.

    Each Devbox has a unique ID, a status (e.g., running, suspended, or shutdown), and metadata for user-defined settings. It includes launch parameters for customizing resource size, execution behavior, and available ports.

    Key attributes:

    ID (string, required) – Unique identifier of the Devbox.
    Status (enum, required) – Current state of the Devbox (e.g., provisioning, running, shutdown).
    Create time (integer, required) – Timestamp (ms) when the Devbox was created.
    Launch parameters (object, required) – Includes startup commands, resource configuration, idle timeout, and network settings.
    Capabilities (list, required) – Defines supported tools such as computer usage APIs, browser usage, and language servers.
    Blueprint/Snapshot ID (string, optional) – Identifier if created from a predefined Blueprint or Snapshot.
    Failure/Shutdown reason (enum, optional) – Reason for termination if applicable (e.g., out of memory, idle shutdown).
    Devboxes can be started, suspended, resumed, or shut down, and they support file operations, shell command execution, and network tunneling.


    CORE SDK USAGE

    Initialize client:
        import Runloop from '@runloop/api-client';
        const client = new Runloop({
          bearerToken: process.env['RUNLOOP_API_KEY'],
        });

    BLUEPRINT OPERATIONS

    List Blueprints:
        for await (const blueprintView of client.blueprints.list()) {
          console.log(blueprintView.id);
        }

    Create a Blueprint:
        const blueprintView = await client.blueprints.create({ name: 'name' });
        console.log(blueprintView.id);

    Retrieve a Blueprint:
        const blueprintView = await client.blueprints.retrieve('id');
        console.log(blueprintView.id);

    Get Blueprint Logs:
        const blueprintBuildLogsListView = await client.blueprints.logs('id');
        console.log(blueprintBuildLogsListView.blueprint_id);

    Preview a Blueprint:
        const blueprintPreviewView = await client.blueprints.preview({ name: 'name' });
        console.log(blueprintPreviewView.dockerfile);

    DEVBOX OPERATIONS

    List Devboxes:
        for await (const devboxView of client.devboxes.list()) {
          console.log(devboxView.id);
        }

    Create a Devbox:
        const devboxView = await client.devboxes.create();
        await client.devboxes.awaitRunning(devboxView.id);
        console.log(devboxView.id);

    Retrieve a Devbox:
        const devboxView = await client.devboxes.retrieve('id');
        console.log(devboxView.id);

    Suspend a Devbox:
        const devboxView = await client.devboxes.suspend('id');
        console.log(devboxView.id);

    Resume a Devbox:
        const devboxView = await client.devboxes.resume('id');
        console.log(devboxView.id);

    Shutdown a Devbox:
        const devboxView = await client.devboxes.shutdown('id');
        console.log(devboxView.id);

    Keep Devbox Alive:
        const response = await client.devboxes.keepAlive('id');
        console.log(response);

    FILE OPERATIONS

    Write File Contents:
        const devboxExecutionDetailView = await client.devboxes.writeFileContents('id', {
          contents: 'contents',
          file_path: 'file_path',
        });
        console.log(devboxExecutionDetailView.devbox_id);

    Read File Contents:
        const response = await client.devboxes.readFileContents('id', { file_path: 'file_path' });
        console.log(response);

    Upload a File:
        const response = await client.devboxes.uploadFile('id', { path: 'path' });
        console.log(response);

    Download a File:
        const response = await client.devboxes.downloadFile('id', { path: 'path' });
        console.log(response);
        const content = await response.blob();
        console.log(content);

    SHELL COMMAND EXECUTION

    Execute Command Synchronously:
        const devboxExecutionDetailView = await client.devboxes.executeSync('id', { command: 'command' });
        console.log(devboxExecutionDetailView.devbox_id);

    Execute Command Asynchronously:
        const devboxAsyncExecutionDetailView = await client.devboxes.executeAsync('id', { command: 'command' });
        console.log(devboxAsyncExecutionDetailView.devbox_id);

    Retrieve Execution Status:
        const devboxAsyncExecutionDetailView = await client.devboxes.executions.retrieve(
          'devbox_id',
          'execution_id',
        );
        console.log(devboxAsyncExecutionDetailView.devbox_id);

    NETWORK OPERATIONS

    Create a Devbox Tunnel:
        const devboxTunnelView = await client.devboxes.createTunnel('id', { port: 0 });
        console.log(devboxTunnelView.devbox_id);

    Remove a Devbox Tunnel:
        const devboxTunnelView = await client.devboxes.removeTunnel('id', { port: 0 });
        console.log(devboxTunnelView.devbox_id);

    Create SSH Key:
        const response = await client.devboxes.createSSHKey('id');
        console.log(response.id);

    DEVBOX PERSISTENCE TOOLS

    List Disk Snapshots:
        for await (const devboxSnapshotView of client.devboxes.listDiskSnapshots()) {
          console.log(devboxSnapshotView.id);
        }

    Create a Snapshot:
        const devboxSnapshotView = await client.devboxes.snapshotDisk('id');
        console.log(devboxSnapshotView.id);

    Delete a Snapshot:
        const response = await client.devboxes.deleteDiskSnapshot('id');
        console.log(response);

    DEVBOX OBSERVABILITY TOOLS

    Retrieve Devbox Logs:
        const devboxLogsListView = await client.devboxes.logs.list('id');
        console.log(devboxLogsListView.logs);

    Fetch Logs via API:
        const options = { method: 'GET', headers: { Authorization: 'Bearer <token>' } };
        fetch('https://api.runloop.ai/v1/devboxes/{id}/logs/tail', options)
          .then(response => response.json())
          .then(response => console.log(response))
          .catch(err => console.error(err));


    REPOSITORY OBJECT

    A Repository in Runloop represents a connection to a remote repository and facilitates its automated analysis. This enables users to manage and inspect repositories efficiently.

    ATTRIBUTES
    - id (string, required) - Unique identifier of the Repository.
    - name (string, required) - The name of the Repository.
    - owner (string, required) - The account owner associated with the Repository.
    - status (enum<string>, required) - The current state of the Repository.  
      Available options: `pending`, `failure`, `active`.
    - failure_reason (string | null, optional) - The reason for failure, if the repository has a `failure` status.

    Repositories can be created, retrieved, listed, and deleted using the Runloop API.


    REPOSITORY OPERATIONS

    List Repositories:
        for await (const repositoryConnectionView of client.repositories.list()) {
          console.log(repositoryConnectionView.id);
        }

    Create a Repository:
        const repositoryConnectionView = await client.repositories.create({ name: 'name', owner: 'owner' });
        console.log(repositoryConnectionView.id);

    Retrieve a Repository:
        const repositoryConnectionView = await client.repositories.retrieve('id');
        console.log(repositoryConnectionView.id);

    Delete a Repository:
        const repository = await client.repositories.delete('id');
        console.log(repository);

    Inspect Latest Repository State via API:
        const options = { method: 'POST', headers: { Authorization: 'Bearer <token>' } };
        fetch('https://api.runloop.ai/v1/repositories/{id}/inspect_latest', options)
          .then(response => response.json())
          .then(response => console.log(response))
          .catch(err => console.error(err));

    List Repository Versions:
        const repositoryVersionListView = await client.repositories.versions('id');
        console.log(repositoryVersionListView.analyzed_versions);


    EXECUTION GUIDELINES:

    Always shut down Devboxes after use to free up resources.
    Use executeAsync for non-blocking commands.
    Handle API errors using try/catch blocks.
    Use tunnels for exposing services running inside Devboxes.
    Use snapshots to persist Devbox states.
    ```
  </Tab>
</Tabs>
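
The execution guidelines in the rules above (shut down Devboxes after use, handle errors) can be sketched in Python as follows. This is an illustration rather than a prescribed SDK pattern: `run_command_in_devbox` is a hypothetical helper, `client` is assumed to be an already initialized Runloop client, and only the `create`, `await_running`, `execute_sync`, and `shutdown` calls listed in the rules are used.

```python
from typing import Any


def run_command_in_devbox(client: Any, command: str, name: str = "example-devbox") -> Any:
    """Create a Devbox, run one command in it, and always shut it down.

    Hypothetical helper: `client` is assumed to expose the `devboxes` methods
    listed in the Cursor rules above.
    """
    devbox = client.devboxes.create(name=name)
    try:
        # Wait until the Devbox reaches the running state before executing.
        client.devboxes.await_running(devbox.id)
        return client.devboxes.execute_sync(id=devbox.id, command=command)
    finally:
        # Runs on success and on error, so the Devbox is never left running.
        client.devboxes.shutdown(id=devbox.id)
```

The `try`/`finally` block is the design choice that matters here: any exception raised while waiting or executing still propagates to the caller, but the shutdown call is made first.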

## Download .cursorrules Files (Legacy)

Legacy `.cursorrules` files for working with the Runloop SDK are also available below:

* <a href="/static/files/python.cursorrules" download="python.cursorrules">Python File</a>
* <a href="/static/files/typescript.cursorrules" download="typescript.cursorrules">TypeScript File</a>


# Runloop Dashboard
Source: https://docs.runloop.ai/tools/dashboard

Manage, monitor, and optimize your AI-powered coding environments with the Runloop Dashboard.

The Runloop Dashboard is a powerful web-based interface designed to help developers manage, monitor, and optimize their AI-powered coding environments. It serves as a central command center for your Devboxes, offering intuitive tools for deployment, monitoring, and troubleshooting.

## Getting Started

1. Log in to your Runloop account at [https://platform.runloop.ai](https://platform.runloop.ai)
2. Navigate through the sidebar to access different tools and features

## Key Features

1. **Runloop Shell**: An in-browser command-line interface for interacting with `running` Devboxes.
2. **Comprehensive Search**: Quickly find specific Devboxes using metadata and status filters.
3. **Log Viewer**: Deep dive into Devbox logs with real-time streaming and querying.
4. **Resource Monitoring**: Track and optimize CPU, memory, and storage usage across your Devboxes.

## Essential Dashboard Tools

### Runloop Shell

The Runloop Shell allows you to manage active Devboxes, execute commands, and troubleshoot issues without leaving your browser.

### Advanced Search

Use the filter functionality to find the right Devboxes:

* By status: `status:running`
* By metadata: `metadata.project:ai-refactor`
* By time range: `created_after:2023-01-01`
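
A rough programmatic counterpart to the status and metadata filters can be sketched with the Python SDK. The sketch below relies on the `client.devboxes.list(status=..., limit=...)` call shown in the Cursor rules; everything else is an assumption: `find_running_devboxes` is a hypothetical helper, the returned page is assumed to be iterable, each item is assumed to expose `id` and `metadata`, and the metadata match is done client-side rather than by the API.

```python
from typing import Any, Dict, List


def find_running_devboxes(client: Any, metadata: Dict[str, str], limit: int = 10) -> List[str]:
    """Return IDs of running Devboxes whose metadata contains all given pairs.

    Hypothetical helper; the metadata comparison happens locally.
    """
    page = client.devboxes.list(status="running", limit=limit)
    matches: List[str] = []
    for devbox in page:
        # Treat a missing or empty metadata field as an empty mapping.
        devbox_metadata = getattr(devbox, "metadata", None) or {}
        if all(devbox_metadata.get(key) == value for key, value in metadata.items()):
            matches.append(devbox.id)
    return matches
```

With an initialized client, `find_running_devboxes(client, {"project": "ai-refactor"})` would mirror the dashboard filters `status:running` and `metadata.project:ai-refactor`, limited to the first page of results.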

### Log Analysis

Access and analyze logs for any Devbox:

1. Select a Devbox from the dashboard
2. Navigate to the "Logs" tab
3. Use built-in filters to isolate specific log entries
4. Enable real-time streaming for active monitoring

### Resource Optimization (Coming Soon)

Monitor resource utilization:

1. View historical usage graphs
2. Receive optimization recommendations


# SDKs
Source: https://docs.runloop.ai/tools/sdks

Use the Runloop SDKs to interact with the Runloop API.

# Runloop SDKs

Runloop provides SDKs in common languages to interact with the Runloop API. These SDKs allow you to create, manage, and interact with Devboxes, Blueprints, and other Runloop resources programmatically.

* If you are using Runloop from Python, use the [Runloop Python SDK](https://github.com/runloopai/api-client-python)
* If you are using Runloop from Node, use the [Runloop TypeScript SDK](https://github.com/runloopai/api-client-ts)

Please reach out if you need SDKs in other languages.