# Complete a BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/complete-a-benchmarkrun

post /v1/benchmarks/runs/{id}/complete
Complete a currently running BenchmarkRun.

# Create a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/create-a-benchmark

post /v1/benchmarks
Create a Benchmark with a set of Scenarios.

# Get a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/get-a-benchmark

get /v1/benchmarks/{id}
Get a previously created Benchmark.

# Get a previously created BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/get-a-previously-created-benchmarkrun

get /v1/benchmarks/runs/{id}
Get a BenchmarkRun given its ID.

# List BenchmarkRuns.
Source: https://docs.runloop.ai/api-reference/benchmark/list-benchmarkruns

get /v1/benchmarks/runs
List all BenchmarkRuns matching filter.

# List Benchmarks.
Source: https://docs.runloop.ai/api-reference/benchmark/list-benchmarks

get /v1/benchmarks
List all Benchmarks matching filter.

# List Public Benchmarks.
Source: https://docs.runloop.ai/api-reference/benchmark/list-public-benchmarks

get /v1/benchmarks/list_public
List all public benchmarks matching filter.

# Start a new BenchmarkRun.
Source: https://docs.runloop.ai/api-reference/benchmark/start-a-new-benchmarkrun

post /v1/benchmarks/start_run
Start a new BenchmarkRun based on the provided Benchmark.

# Update a Benchmark.
Source: https://docs.runloop.ai/api-reference/benchmark/update-a-benchmark

post /v1/benchmarks/{id}
Update a Benchmark with a set of Scenarios.

# Create and build a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/create-and-build-a-blueprint

post /v1/blueprints
Starts the build of a custom-defined container Blueprint. The Blueprint begins in the 'provisioning' step and transitions to the 'building' step once it is selected off the build queue. When the build completes, it transitions to 'building_complete' if the build is successful.

# Delete a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/delete-a-blueprint

post /v1/blueprints/{id}/delete
Delete a previously created Blueprint.

# Get a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/get-a-blueprint

get /v1/blueprints/{id}
Get the details of a previously created Blueprint including the build status.

# Get Blueprint build logs.
Source: https://docs.runloop.ai/api-reference/blueprint/get-blueprint-build-logs

get /v1/blueprints/{id}/logs
Get all logs from the building of a Blueprint.

# List Blueprints.
Source: https://docs.runloop.ai/api-reference/blueprint/list-blueprints

get /v1/blueprints
List all Blueprints or filter by name.

# Preview Dockerfile definition for a Blueprint.
Source: https://docs.runloop.ai/api-reference/blueprint/preview-dockerfile-definition-for-a-blueprint

post /v1/blueprints/preview
Preview building a Blueprint with the specified configuration. You can take the resulting Dockerfile and test out your build using any local Docker tooling.

# The Blueprint Object
Source: https://docs.runloop.ai/api-reference/blueprint/the-blueprint-object

Blueprints define customized starting points for Devboxes, so that environment setup can be cached to improve Devbox boot times.

# Create a Browser.
Source: https://docs.runloop.ai/api-reference/browser/create-a-browser

post /v1/devboxes/browsers
Create a Devbox that has a managed Browser and begin the boot process. As part of booting the Devbox, the browser will automatically be started with connection utilities activated.

# Get Browser Details.
Source: https://docs.runloop.ai/api-reference/browser/get-browser-details

get /v1/devboxes/browsers/{id}

# Create a Computer.
Source: https://docs.runloop.ai/api-reference/computer/create-a-computer

post /v1/devboxes/computers
Create a Computer and begin the boot process.
The Computer will initially launch in the 'provisioning' state while Runloop allocates the necessary infrastructure. It will transition to the 'initializing' state while the booted Computer runs any Runloop or user-defined setup scripts. Finally, the Computer will transition to the 'running' state when it is ready for use.

# Get Computer Details.
Source: https://docs.runloop.ai/api-reference/computer/get-computer-details

get /v1/devboxes/computers/{id}

# Asynchronously execute a command via the Devbox shell
Source: https://docs.runloop.ai/api-reference/devbox/asynchronously-execute-a-command-via-the-devbox-shell

post /v1/devboxes/{id}/execute_async
Execute the given command in the Devbox shell asynchronously and return the execution, which can be used to track the command's progress.

# Create a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-devbox

post /v1/devboxes
Create a Devbox and begin the boot process. The Devbox will initially launch in the 'provisioning' state while Runloop allocates the necessary infrastructure. It will transition to the 'initializing' state while the booted Devbox runs any Runloop or user-defined setup scripts. Finally, the Devbox will transition to the 'running' state when it is ready for use.

# Create a disk snapshot of a running Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-disk-snapshot-of-a-running-devbox

post /v1/devboxes/{id}/snapshot_disk
Create a disk snapshot of a Devbox with the specified name and metadata to enable launching future Devboxes with the same disk state.

# Create a tunnel to an available port on the Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/create-a-tunnel-to-an-available-port-on-the-devbox

post /v1/devboxes/{id}/create_tunnel
Create a live tunnel to an available port on the Devbox. Note that the port must be made available using Devbox.create.availablePorts. Otherwise, the tunnel will not connect to any running processes on the Devbox.
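
As a rough illustration of the tunnel endpoint above, the following sketch only assembles the HTTP request and does not send it. The base URL and Bearer header follow the curl example in the Introduction; the `port` body field and the `buildCreateTunnelRequest` helper are assumptions made for illustration, so check the endpoint's schema before relying on them.

```typescript
// Base URL as used by the curl example in the Introduction.
const RUNLOOP_API_BASE = "https://api.runloop.ai";

interface HttpRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Hypothetical helper (not part of the Runloop SDK): builds, but does not send,
// the request for POST /v1/devboxes/{id}/create_tunnel.
// ASSUMPTION: the request body carries the port in a field named `port`.
function buildCreateTunnelRequest(
  apiKey: string,
  devboxId: string,
  port: number,
): HttpRequest {
  // Reject values that cannot be a TCP port before calling the API.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid TCP port: ${port}`);
  }
  return {
    method: "POST",
    url: `${RUNLOOP_API_BASE}/v1/devboxes/${encodeURIComponent(devboxId)}/create_tunnel`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ port }),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping request construction separate from sending makes the URL and headers easy to inspect in tests.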
# Create an SSH key for a Devbox
Source: https://docs.runloop.ai/api-reference/devbox/create-an-ssh-key-for-a-devbox

post /v1/devboxes/{id}/create_ssh_key
Create an SSH key for a Devbox to enable remote access.

# Delete a disk snapshot of a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/delete-a-disk-snapshot-of-a-devbox

post /v1/devboxes/disk_snapshots/{id}/delete
Delete a previously taken disk snapshot of a Devbox.

# Download binary file contents from Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/download-binary-file-contents-from-devbox-filesystem

post /v1/devboxes/{id}/download_file
Download file contents of any type (binary, text, etc.) from a specified path on the Devbox.

# Get Devbox details.
Source: https://docs.runloop.ai/api-reference/devbox/get-devbox-details

get /v1/devboxes/{id}
Get the latest details and status of a Devbox.

# Get Devbox logs.
Source: https://docs.runloop.ai/api-reference/devbox/get-devbox-logs

get /v1/devboxes/{id}/logs
Get all logs from a running or completed Devbox.

# Get status of an asynchronous execution on a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/get-status-of-an-asynchronous-execution-on-a-devbox

get /v1/devboxes/{devbox_id}/executions/{execution_id}
Get the latest status of a previously launched asynchronous execution, including stdout/error and the exit code if complete.

# Kill an asynchronous execution currently running on a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/kill-an-asynchronous-execution-currently-running-on-a-devbox

post /v1/devboxes/{id}/executions/{execution_id}/kill

# List Devboxes.
Source: https://docs.runloop.ai/api-reference/devbox/list-devboxes

get /v1/devboxes
List all Devboxes while optionally filtering by status.

# List disk snapshots of a Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/list-disk-snapshots-of-a-devbox

get /v1/devboxes/disk_snapshots
List all snapshots of a Devbox while optionally filtering by Devbox ID.

# Live Tail Devbox Logs.
Source: https://docs.runloop.ai/api-reference/devbox/live-tail-devbox-logs

get /v1/devboxes/{id}/logs/tail
Tail the logs for the given Devbox. This returns past log entries and then continues streaming new logs until the connection is closed.

# Read text file contents from Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/read-text-file-contents-from-devbox-filesystem

post /v1/devboxes/{id}/read_file_contents
Read the contents of a file on a Devbox as UTF-8 text. Note that 'downloadFile' should be used for large files (greater than 100MB). Returns the file contents as a UTF-8 string.

# Remove an open tunnel on the Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/remove-an-open-tunnel-on-the-devbox

post /v1/devboxes/{id}/remove_tunnel
Remove a previously opened tunnel on the Devbox.

# Shutdown a running Devbox.
Source: https://docs.runloop.ai/api-reference/devbox/shutdown-a-running-devbox

post /v1/devboxes/{id}/shutdown
Shut down a running Devbox. This will permanently stop the Devbox. If you want to save the state of the Devbox, take a snapshot before shutting down, or suspend the Devbox instead.

# Synchronously execute a shell command on a Devbox
Source: https://docs.runloop.ai/api-reference/devbox/synchronously-execute-a-shell-command-on-a-devbox

post /v1/devboxes/{id}/execute_sync
Execute a bash command in the Devbox shell, await the command's completion, and return the output.

# The Devbox Object
Source: https://docs.runloop.ai/api-reference/devbox/the-devbox-object

A Devbox represents a virtual development environment. It is an isolated sandbox that can be given to agents and used to run arbitrary code, such as AI-generated code.

# Upload binary file contents to Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/upload-binary-file-contents-to-devbox-filesystem

post /v1/devboxes/{id}/upload_file
Upload file contents of any type (binary, text, etc.) to a Devbox. Note that this API is suitable for large files (larger than 100MB) and efficiently uploads files via multipart form data.

# Write text file contents to Devbox filesystem.
Source: https://docs.runloop.ai/api-reference/devbox/write-text-file-contents-to-devbox-filesystem

post /v1/devboxes/{id}/write_file_contents
Write UTF-8 string contents to a file at path on the Devbox. Note that for large files (larger than 100MB), the upload_file endpoint must be used.

# Introduction
Source: https://docs.runloop.ai/api-reference/introduction

Welcome to the Runloop API Reference Documentation

## Authentication

All API endpoints are authenticated using Bearer tokens generated via the [API Key section of the Runloop Dashboard](https://platform.runloop.ai/manage/keys).

```bash
# Assumes your API key is exported as RUNLOOP_API_KEY
curl -X 'POST' \
  'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json'
```

## SDKs

To make it easier to interact with the Runloop API, we've created SDKs in a variety of languages.

* [Runloop Python SDK](https://github.com/runloopai/api-client-python)
* [Runloop TypeScript SDK](https://github.com/runloopai/api-client-ts)

Please reach out if you need SDKs in other languages.

# Create a Repository Connection.
Source: https://docs.runloop.ai/api-reference/repository/create-a-repository-connection

post /v1/repositories
Create a connection to a GitHub repository and trigger an initial inspection of the repo's technical stack and developer environment requirements.

# Delete a Repository Connection and associated objects.
Source: https://docs.runloop.ai/api-reference/repository/delete-a-repository-connection-and-associated-objects

post /v1/repositories/{id}/delete
Permanently delete a Repository Connection, including any automatically generated inspection insights.
# Get Repository Connection details.
Source: https://docs.runloop.ai/api-reference/repository/get-repository-connection-details

get /v1/repositories/{id}
Get Repository Connection details including latest inspection status and generated repository insights.

# List analyzed repository versions.
Source: https://docs.runloop.ai/api-reference/repository/list-analyzed-repository-versions

get /v1/repositories/{id}/versions
List all analyzed versions of a repository connection including automatically generated insights for each version.

# List available repository connections.
Source: https://docs.runloop.ai/api-reference/repository/list-available-repository-connections

get /v1/repositories
List all available repository connections.

# The Repository Object
Source: https://docs.runloop.ai/api-reference/repository/the-repository-object

The Repository object manages the link to a remote repository and the automated analysis of the repository.

# Trigger inspection of the latest version of a repository connection.
Source: https://docs.runloop.ai/api-reference/repository/trigger-inspection-of-the-latest-version-of-a-repository-connection

post /v1/repositories/{id}/inspect_latest
Trigger inspection of the latest version of a repository, including the repo's technical stack and developer environment requirements.

# Complete a ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/complete-a-scenariorun

post /v1/scenarios/runs/{id}/complete
Complete a currently running ScenarioRun. Calling complete will shut down the underlying Devbox resource.

# Create a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/create-a-custom-scenario-scorer

post /v1/scenarios/scorers
Create a custom scenario scorer.

# Create a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/create-a-scenario

post /v1/scenarios
Create a Scenario, a repeatable AI coding evaluation test that defines the starting environment as well as evaluation success criteria.
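
To make the two halves of a Scenario (starting environment and success criteria) concrete, here is a minimal sketch of a request body for `post /v1/scenarios`. The field names are taken from the `scenarios.create` example in the Custom Benchmarks guide; the `buildScenarioPayload` helper and its local check are hypothetical and are not part of the Runloop SDK.

```typescript
// Shapes mirror the `scenarios.create` example in the Custom Benchmarks guide.
interface BashScriptScorer {
  type: "bash_script_scorer";
  bash_script: string;
}

interface ScoringFunction {
  name: string;
  scorer: BashScriptScorer;
  weight: number;
}

interface ScenarioPayload {
  name: string;
  input_context: { problem_statement: string };
  environment_parameters: { snapshot_id: string };
  scoring_contract: { scoring_function_parameters: ScoringFunction[] };
}

// Hypothetical helper: assembles the JSON body from the problem statement,
// the starting environment (a snapshot ID), and the success criteria (scorers).
function buildScenarioPayload(
  name: string,
  problemStatement: string,
  snapshotId: string,
  scorers: ScoringFunction[],
): ScenarioPayload {
  // Local sanity check only; the API's own validation rules may differ.
  if (scorers.length === 0) {
    throw new Error("expected at least one scoring function");
  }
  return {
    name,
    input_context: { problem_statement: problemStatement },
    environment_parameters: { snapshot_id: snapshotId },
    scoring_contract: { scoring_function_parameters: scorers },
  };
}
```

Building the payload as a typed object first lets the compiler catch a missing environment or scoring contract before any request is made.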
# Get a previously created ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/get-a-previously-created-scenariorun

get /v1/scenarios/runs/{id}
Get a ScenarioRun given ID.

# Get a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/get-a-scenario

get /v1/scenarios/{id}
Get a previously created scenario.

# List Public Scenarios.
Source: https://docs.runloop.ai/api-reference/scenario/list-public-scenarios

get /v1/scenarios/list_public
List all public scenarios matching filter.

# List Scenario Scorers.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenario-scorers

get /v1/scenarios/scorers
List all Scenario Scorers matching filter.

# List ScenarioRuns.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenarioruns

get /v1/scenarios/runs
List all ScenarioRuns matching filter.

# List Scenarios.
Source: https://docs.runloop.ai/api-reference/scenario/list-scenarios

get /v1/scenarios
List all Scenarios matching filter.

# Retrieve Scenario Scorer.
Source: https://docs.runloop.ai/api-reference/scenario/retrieve-scenario-scorer

get /v1/scenarios/scorers/{id}
Retrieve Scenario Scorer.

# Score a ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/score-a-scenariorun

post /v1/scenarios/runs/{id}/score
Score a currently running ScenarioRun.

# Start a new ScenarioRun.
Source: https://docs.runloop.ai/api-reference/scenario/start-a-new-scenariorun

post /v1/scenarios/start_run
Start a new ScenarioRun based on the provided Scenario.

# Update a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/update-a-custom-scenario-scorer

post /v1/scenarios/scorers/{id}
Update a scenario scorer.

# Update a Scenario.
Source: https://docs.runloop.ai/api-reference/scenario/update-a-scenario

post /v1/scenarios/{id}
Update a Scenario, a repeatable AI coding evaluation test that defines the starting environment as well as evaluation success criteria.

# Validate a custom scenario scorer.
Source: https://docs.runloop.ai/api-reference/scenario/validate-a-custom-scenario-scorer

post /v1/scenarios/scorers/{id}/validate
Validate a scenario scorer.

# Build Custom Agent Benchmarks with Runloop
Source: https://docs.runloop.ai/benchmarks/custom-benchmarks

Learn how to create and run custom benchmarks.

We're personally excited about this part of our platform - let us know at support@runloop.ai if you need any help!

## Creating Custom Scenarios

Creating custom scenarios lets you tailor problem statements and environments to your needs. This is useful for testing agents in controlled conditions or building unique challenges.

To define your own scenario:

1. Create a development environment (devbox).
2. Take a snapshot of the environment at a key point in time.
3. Define a problem statement for the scenario.
4. Attach scoring functions to measure performance.

Example:

```typescript TypeScript
const devbox = await runloop.devboxes.create({ blueprint_name: "bpt_123" });

const mySnapshot = await runloop.devboxes.snapshotDisk(devbox.id, {
  name: 'div incorrectly centered in flexbox',
});

const myNewScenario = await runloop.scenarios.create({
  name: 'My New Scenario',
  input_context: { problem_statement: 'Create a UI component' },
  // Start the scenario from the snapshot taken above
  environment_parameters: { snapshot_id: mySnapshot.id },
  scoring_contract: {
    scoring_function_parameters: [{
      name: 'bash_scorer',
      scorer: {
        type: 'bash_script_scorer',
        bash_script: 'some script that writes files and validates output',
      },
      weight: 1.0,
    }],
  },
});
```

## Understanding Scoring Functions

Scoring functions validate whether a scenario was successfully completed. They check that a solution is correct, provide feedback, and assign a score for evaluation.
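
Each entry in a scoring contract carries a `weight`, and each scoring function produces a score between 0 and 1. One plausible way to read those weights is as a weighted average, sketched below. This is an assumption made for illustration: the guide does not state how Runloop combines scores, and `combineScores` is a hypothetical helper.

```typescript
interface ScorerResult {
  name: string;
  score: number;  // expected to be in [0, 1]
  weight: number; // relative importance of this scorer
}

// ASSUMPTION: scores are combined as a weighted average. The platform's
// actual aggregation rule may differ.
function combineScores(results: ScorerResult[]): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const r of results) {
    if (r.score < 0 || r.score > 1) {
      throw new RangeError(`score for ${r.name} must be between 0 and 1`);
    }
    if (r.weight <= 0) {
      throw new RangeError(`weight for ${r.name} must be positive`);
    }
    weightedSum += r.score * r.weight;
    totalWeight += r.weight;
  }
  if (totalWeight === 0) {
    throw new Error("expected at least one scorer result");
  }
  // Dividing by the total weight keeps the result in [0, 1].
  return weightedSum / totalWeight;
}
```

Under this reading, a passing scorer with weight 3 and a failing scorer with weight 1 would combine to 0.75.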
### Basic Scoring Function Example

A simple scoring function is a bash script that echoes a score between 0 and 1:

```typescript TypeScript
scoring_function_parameters: [{
  name: 'bash_scorer',
  scorer: {
    type: 'bash_script_scorer',
    bash_script: 'echo 1.0',
  },
  weight: 1.0,
}]
```

### Custom Scoring Functions

To make scoring more reusable and flexible, you can define **custom scoring functions**. These evaluate performance in specific ways, such as running tests or analyzing output logs.

Example:

```typescript TypeScript
const myCustomScenario = await runloop.scenarios.create({
  name: 'scenario with custom scorer',
  input_context: { problem_statement: 'Create a UI component' },
  environment_parameters: { snapshot_id: mySnapshot.id },
  scoring_contract: {
    scoring_function_parameters: [{
      name: 'my-custom-pytest-script',
      scorer: {
        type: 'custom_scorer',
        custom_scorer_type: 'my-custom-pytest-script',
        scorer_params: { relevant_tests: ['foo.test.py', 'bar.test.py'] },
      },
      weight: 1.0,
    }],
  },
});
```

### Custom Benchmarks

Once you have your scenarios and scoring functions defined, you can run all of your custom scenarios as a **custom benchmark**. You'll need to create the benchmark instance first, then run it. Here's how to create it:

```typescript TypeScript
const myBenchmark = await runloop.benchmarks.create({
  name: 'py bench',
  scenarios: [myNewScenario.id, myCustomScenario.id]
})
```

You can update both scenarios and benchmarks at any time, so you can build them up over time.

# Overview of Benchmarking on Runloop
Source: https://docs.runloop.ai/benchmarks/overview

Make your agent better and more reliable with Runloop's tools for benchmarking.

Your AI coding agent is capable of numerous tasks such as reading code, writing and preparing patches, and submitting commits to code repositories.
A common problem with such agents is ensuring that they *perform*: without monitoring, tuning, and optimization, your agent may be prone to mistakes, regress over time, and generally fail to deliver the best user experience. Runloop Benchmarking is a suite of tools to help you address these issues and stay focused on building the best possible agent.

## Main Features

Runloop Benchmarking includes several tools to save you time while optimizing your agent:

* **Run Public Benchmarks:** Easily run your agent against a matrix of well-known and open source benchmarks, such as SWE-bench.
* **Run Custom Benchmarks:** Write custom scoring functions for each of your agent's tasks, then evaluate the agent's performance against them.
* **Reports & Insights:** As you run benchmarks over time, you will see how your agent's performance changes in the Runloop dashboard.

## Key Concepts

Whether you're using public or custom benchmarks, keep the following key concepts in mind:

* **Code Scenario**: A single test case where an agent is given a problem and is expected to modify a target environment to solve it. Scenarios help test AI agents in realistic coding environments.
* **Scoring Function**: A script or function that runs after the agent completes its task to validate whether the solution works. These functions generate a final score between 0 and 1 to indicate performance.
* **Benchmark**: A collection of Code Scenarios designed to evaluate AI agents on a broader set of tasks. Benchmarks help measure agent capabilities systematically.

Next, learn how to [run public benchmarks](/benchmarks/public-benchmarks).

# Public Benchmarks
Source: https://docs.runloop.ai/benchmarks/public-benchmarks

Learn how to easily run your agent against popular public benchmarks.

## Public Benchmarks

Runloop Public Benchmarks make it simple to validate your coding agent against the most popular open source coding evaluation datasets.

Each Benchmark contains a set of Scenarios, one for each test in the dataset. A Scenario contains the **problem statement** that your agent must work through, a pre-built **environment** containing all of the context needed to complete the job, and a built-in **scoring contract** to evaluate the result for correctness.

## Viewing Public Benchmarks

We're constantly adding new supported datasets. To see the up-to-date list of supported public Benchmarks, use the following API call:

```typescript TypeScript
// Query to see the latest list of supported public benchmarks
// princeton-nlp/SWE-bench_Lite, etc
const { benchmarks } = await runloop.benchmarks.listPublic();
```

Are we missing your favorite open source benchmark? Let us know at [support@runloop.ai](mailto:support@runloop.ai)

Each Benchmark contains a set of Scenarios, each corresponding to a test case in the evaluation dataset.

```typescript TypeScript
// The Benchmark definition contains a list of all scenarios contained in the benchmark
console.log(benchmarks[0].scenarioIds)
```

## Running Scenarios & Benchmarks

Each Scenario can be **run** to evaluate an AI agent's performance. Running a scenario involves:

1. Initiating a scenario run.
2. Launching a development environment (devbox).
3. Running the agent against the problem statement.
4. Scoring the results.
5. Uploading traces for analysis.

### Run a single scenario from a public benchmark

Here's an example of how to run a single scenario from a public benchmark against your own agent.
First, create a **scenario run** to track the status and results of this run:

```typescript TypeScript
const scenarioId = benchmarks[0].scenarioIds[0]
const scenarioRun = await runloop.scenarios.startRun({
  scenario_id: scenarioId,
  run_name: 'marshmallow-code__marshmallow-1359 test run'
});
```

When starting a run, Runloop will create a Devbox with the *environment* specified by the test requirements.

Wait for the devbox used by the scenario to become ready:

```typescript TypeScript
const devboxId = scenarioRun.devbox_id;
await runloop.devboxes.awaitRunning(devboxId);
```

Now, run your agent. How and where your agent runs is up to you. Here's an example of an agent that leverages the Runloop Devbox that was just created:

```typescript TypeScript
const myAgent = new MyAgent({
  prompt: scenarioRun.scenario.context.problemStatement,
  tools: [runloop.devboxes.shellTools(devboxId)],
});
```

Finally, run the scoring function to validate the agent's performance:

```typescript TypeScript
// Run the scoring function. Automatically marks the scenario run as done.
const validateResults = await runloop.scenarios.runs.scoreAndAwait(
  scenarioRun.id
);
console.log(validateResults);
```

### Perform a full benchmark run of a public benchmark

Once your agent is excelling at an individual scenario, you will want to test it against all Scenarios for a given Benchmark.

Here's an example of how to perform a full benchmark run of a public benchmark.
```typescript TypeScript
// Start a full run of the first public benchmark returned
const benchmarkRun = await runloop.benchmarks.startRun({
  benchmark_id: benchmarks[0].id,
  run_name: 'optional run name'
});

// Run the scenarios one at a time. A for...of loop awaits each iteration,
// whereas forEach with an async callback would not wait for any of them.
// Scenarios can also be run with any level of parallelism.
for (const scenarioId of benchmarkRun.pending_scenarios) {
  // Create a scenario run tied to the benchmark run
  const scenarioRun = await runloop.scenarios.startRunAndAwaitEnvReady({
    scenario_id: scenarioId,
    benchmark_run_id: benchmarkRun.id
  });
  const devboxId = scenarioRun.devbox_id;

  // Run your agent on the problem at hand to see how it does
  const myAgent = new MyAgent({
    prompt: scenarioRun.scenario.context.problemStatement,
    tools: [runloop.devboxes.shellTools(devboxId)],
  });

  // Score and complete the run. This also shuts down the Devbox environment.
  const validateResults = await runloop.scenarios.runs.scoreAndComplete(
    scenarioRun.id
  );
}

// Benchmark runs end automatically once no scenarios are pending.
// You can also end a benchmark run early:
await runloop.benchmarks.runs.complete(benchmarkRun.id)
```

Public Benchmarks make it fast and easy to start evaluating your agent against industry-standard coding evaluations. When you're ready to build Benchmarks tailored to your specific needs, move on to creating [Custom Benchmarks](/benchmarks/custom-benchmarks).

# Quickstart - Controlling a Browser in a Runloop Devbox
Source: https://docs.runloop.ai/devboxes/addons/browser

Learn how to control a browser programmatically inside a Runloop Devbox using the Runloop SDK

## Introduction

This guide will walk you through using the **Runloop SDK** to control a browser inside a **Runloop Devbox**. The Runloop API provides a **browser-ready Devbox**, enabling AI agents to interact with web pages programmatically.

Set up your authentication key:

```bash
export RUNLOOP_API_KEY="your-api-key"
```

First, install the Runloop SDK if you haven't already:

```bash Python
pip install runloop_api_client
```

```bash TypeScript
npm install @runloop/api-client
```

Then, import and initialize the SDK:

```python Python
from runloop_api_client import Runloop

client = Runloop(bearer_token="your-api-key")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({ bearerToken: 'your-api-key' });
```

This `client` object allows interaction with the Runloop API.

Set up your **browser-ready Devbox** and obtain the connection details:

```python Python
# Create a Devbox with a browser instance
browser = client.devboxes.browsers.create()

# Wait for the Devbox to be fully running
client.devboxes.await_running(browser.devbox.id)

# View your remote browser here:
print(browser.live_view_url)
# Connect to your browser here:
print(browser.connection_url)
```

```typescript TypeScript
// Create a Devbox with a browser instance
const browser = await client.devboxes.browsers.create();

// Wait for the Devbox to be fully running
await client.devboxes.awaitRunning(browser.devbox.id);

// View your remote browser here:
console.log(browser.live_view_url);
// Connect to your browser here:
console.log(browser.connection_url);
```

To interact with the browser, you can use automation tools like **Selenium, Puppeteer, or Playwright**.
Here's an example that connects with **Playwright** over the **Chrome DevTools Protocol (CDP)**:

```python Python
from playwright.async_api import async_playwright

# Start Playwright (run this inside an async function)
playwright = await async_playwright().start()
# Connect to your remote browser using the connection URL from the previous step
pw_browser = await playwright.chromium.connect_over_cdp(browser.connection_url)
# Create your browser context
context = await pw_browser.new_context()
# Open a page in the new context (a fresh context starts with no pages)
page = await context.new_page()
```

```typescript TypeScript
import { chromium } from 'playwright';

// Connect Playwright to your remote browser using the connection URL from the previous step
const pwBrowser = await chromium.connectOverCDP(browser.connection_url);
// Create your browser context
const context = await pwBrowser.newContext();
// Open a page in the new context (a fresh context starts with no pages)
const page = await context.newPage();
```

You can create **custom tools** for AI agents to interact with the browser programmatically. Here's an example of a **navigation tool** using Playwright:

```python Python
from playwright.async_api import async_playwright

class NavigateTool:
    """A tool for navigating to a URL using Playwright."""

    async def __call__(self, *, url: str):
        async with async_playwright() as p:
            # This example launches a local browser; to drive the Devbox
            # browser instead, connect with connect_over_cdp as shown above.
            browser = await p.chromium.launch()
            page = await browser.new_page()
            await page.goto(url)
            content = await page.content()
            await browser.close()
            return {"output": f"Navigated to {url}", "content": content[:500]}

    def to_params(self):
        return {
            "name": "navigate_tool",
            "description": "Navigates to a URL and retrieves content.",
            "input_schema": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        }
```

```typescript TypeScript
import { chromium } from 'playwright';

class NavigateTool {
  /**
   * A tool for navigating to a URL using Playwright.
   */
  async call({ url }: { url: string }) {
    // This example launches a local browser; to drive the Devbox
    // browser instead, connect with connectOverCDP as shown above.
    const browser = await chromium.launch();
    const page = await browser.newPage();
    await page.goto(url);
    const content = await page.content();
    await browser.close();
    return { output: `Navigated to ${url}`, content: content.slice(0, 500) };
  }

  toParams() {
    return {
      name: "navigate_tool",
      description: "Navigates to a URL and retrieves content.",
      input_schema: {
        type: "object",
        properties: { url: { type: "string" } },
        required: ["url"],
      },
    };
  }
}
```

Now, you can pass this tool to an AI agent, enabling it to use the browser autonomously:

```python Python
tool_instance = NavigateTool()
# llm_client is your LLM provider's client, not the Runloop client.
# The call below follows an Anthropic-style Messages API.
response = llm_client.messages.create(
    model=model,
    max_tokens=max_tokens,
    messages=messages,
    tools=[tool_instance.to_params()]
)
```

```typescript TypeScript
const toolInstance = new NavigateTool();
// llmClient is your LLM provider's client, not the Runloop client.
// The call below follows an Anthropic-style Messages API.
const response = await llmClient.messages.create({
  model,
  max_tokens: maxTokens,
  messages,
  tools: [toolInstance.toParams()]
});
```

Different LLM providers have their own specific formats and requirements for defining and passing tools. Make sure to reference your LLM provider's documentation for the correct implementation details of tool schemas and function calling.

To ensure efficient resource management, **always shut down the Devbox** when you're done:

```python Python
client.devboxes.shutdown(browser.devbox.id)
```

```typescript TypeScript
await client.devboxes.shutdown(browser.devbox.id);
```

## Additional Resources

* [Runloop GitHub Repository](https://github.com/runloopai/examples) - Explore more examples.
* [Runloop API Documentation](https://docs.runloop.ai) - Official API reference.

# Quickstart - Controlling a remote computer in a Runloop Devbox
Source: https://docs.runloop.ai/devboxes/addons/computer

Learn how to control a computer programmatically inside a Runloop Devbox using the Runloop SDK

## Introduction

This guide will walk you through using the **Runloop SDK** to control a remote computer inside a **Runloop Devbox**. The Runloop API provides a **computer-ready Devbox**, enabling AI agents to interact with the system programmatically.

Set up your authentication key:

```bash
export RUNLOOP_API_KEY="your-api-key"
```

First, install the Runloop SDK if you haven't already:

```bash Python
pip install runloop_api_client
```

```bash TypeScript
npm install @runloop/api-client
```

Then, import and initialize the SDK:

```python Python
from runloop_api_client import Runloop

client = Runloop(bearer_token="your-api-key")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({ bearerToken: 'your-api-key' });
```

This `client` object allows interaction with the Runloop API.

Create your Devbox, wait for it to be ready, and retrieve the connection details:

```python Python
# Create a Devbox with a computer instance
computer = client.devboxes.computers.create()

# Wait for the Devbox to be fully running
client.devboxes.await_running(computer.devbox.id)

# Retrieve the computer connection details
devbox_id = computer.devbox.id
display_url = computer.live_screen_url
```

```typescript TypeScript
// Create a Devbox with a computer instance
const computer = await client.devboxes.computers.create();

// Wait for the Devbox to be fully running
await client.devboxes.awaitRunning(computer.devbox.id);

// Retrieve the computer connection details
const devboxId = computer.devbox.id;
const displayUrl = computer.live_screen_url;
```

The **computer-ready Devbox** offers a suite of **Computer Tools** for agent interactions.
The available actions include:

* **Keyboard interaction**: `key`, `type`
* **Mouse interaction**: `mouse_move`, `left_click`, `left_click_drag`, `right_click`, `middle_click`, `double_click`
* **Screen interaction**: `screenshot`, `cursor_position`

You can access these tools using the Runloop client as shown below:

```python Python
# Keyboard usage
client.devboxes.computers.keyboard_interaction(devbox_id, action=action, text=text)

# Mouse usage (x and y are integer screen coordinates)
client.devboxes.computers.mouse_interaction(devbox_id, action=action, coordinate={"x": x, "y": y})

# Take a screenshot or retrieve current mouse coordinates
client.devboxes.computers.screen_interaction(devbox_id, action=action)
```

```typescript TypeScript
// Keyboard usage
await client.devboxes.computers.keyboardInteraction(devboxId, { action: action, text: text });

// Mouse usage (x and y are numeric screen coordinates)
await client.devboxes.computers.mouseInteraction(devboxId, { action: action, coordinate: { x: x, y: y } });

// Take a screenshot or retrieve current mouse coordinates
await client.devboxes.computers.screenInteraction(devboxId, { action: action });
```

Once you create tools for your agent, you can integrate them with your preferred LLM. Here's an example of integrating them with **Anthropic's Claude**:

```python Python
from anthropic import Anthropic

anthropic_client = Anthropic(api_key="your-anthropic-api-key")

response = anthropic_client.messages.create(
    model="claude-3",
    max_tokens=300,
    messages=messages,
    tools=[tool],
)
```

```typescript TypeScript
import Anthropic from '@anthropic-ai/sdk';

const anthropicClient = new Anthropic({ apiKey: 'your-anthropic-api-key' });

const response = await anthropicClient.messages.create({
  model: 'claude-3',
  max_tokens: 300,
  messages: messages,
  tools: [tool]
});
```

Different LLM providers have their own formats and requirements for defining and passing tools. Refer to your LLM provider's documentation for the correct tool schema and function-calling details.
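As a rough sketch of what the `tool` value passed above might look like, the snippet below builds a tool definition for the screen actions listed earlier, using the `name` / `description` / `input_schema` shape. The tool name `computer_screen`, its description, and the helper `build_screen_tool` are illustrative assumptions rather than part of the Runloop SDK; confirm the exact schema against your LLM provider's documentation.

```python
import json

# Screen actions taken from the list of Computer Tools above.
SCREEN_ACTIONS = ["screenshot", "cursor_position"]


def build_screen_tool(name: str = "computer_screen") -> dict:
    """Return a tool definition dict (the name and description are hypothetical)."""
    return {
        "name": name,
        "description": "Take a screenshot or read the cursor position on the Devbox.",
        "input_schema": {
            "type": "object",
            "properties": {
                # Restrict the model to the actions the Devbox supports.
                "action": {"type": "string", "enum": SCREEN_ACTIONS},
            },
            "required": ["action"],
        },
    }


tool = build_screen_tool()
print(json.dumps(tool, indent=2))
```

When the model requests this tool, your agent loop would forward the chosen `action` to `client.devboxes.computers.screen_interaction` and return the result to the model.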
To ensure efficient resource management, **always shut down the Devbox** when you're done:

```python Python
client.devboxes.shutdown(computer.devbox.id)
```

```typescript TypeScript
await client.devboxes.shutdown(computer.devbox.id);
```

## Additional Resources

* [Runloop GitHub Repository](https://github.com/runloopai/examples) - Explore more examples.
* [Runloop API Documentation](https://docs.runloop.ai) - Official API reference.

# Overview of Devbox Add-ons

Source: https://docs.runloop.ai/devboxes/addons/overview

Devboxes are more than just flexible, general-purpose virtual machines. They also come with a set of optional add-ons that extend their capabilities in ways that many coding agents need.

## Available add-ons

Currently available add-ons:

* **Browser** (beta): A remotely controllable Playwright browser.
* **Computer** (beta): A remotely controllable Ubuntu Desktop environment.

See the add-on chapters in this section for instructions on using these add-ons.

## Pricing & availability

During open beta, add-ons are available at no additional cost.

# Devbox Blueprints

Source: https://docs.runloop.ai/devboxes/blueprints

Reproducible templates for devboxes

Often you will want to start your devboxes with your own customizations. For example, you may want to always boot with a specific version of a language or framework, or set up a specific repository. Rather than running these commands every time you launch a devbox, you can use a blueprint to optimize boot time by saving the state of your devbox after these commands have been run.

By building a blueprint, you get:

1. **Standardization**: Define tools, binaries, and configurations your AI agent needs at runtime.
2. **Consistency**: Ensure reproducible AI behavior across environments.
3. **Efficiency**: Reduce Devbox startup time by pre-installing necessary tools.
4. **Customization**: Tailor environments to specific AI-assisted development needs.

**When should I use a Blueprint vs. a Snapshot?**

Snapshots and Blueprints both allow you to run devboxes with customizations. **Blueprints** are fast to boot and cacheable using Docker layers, while **Snapshots** are a bit slower on boot (reproducing each step taken in the devbox) but can be created quickly from an existing devbox.

Examples:

* **[Blueprint](/devboxes/blueprints)**: You have a coding agent performing a task that requires installing a specific tool. Create a blueprint with set-up steps for the tool, and future devboxes will cache the installation to speed up boot and execution time.
* **[Snapshot](/devboxes/snapshots)**: You have a coding agent in a devbox considering 3 different ways to complete a task. Create a snapshot of the initial state of the devbox, create 3 parallel devboxes from that snapshot, collate the results, and then choose the best option to continue.

## Creating a Blueprint

One use case for a blueprint is preinstalling tools your AI agent may want to use.
For example, let's create a simple Blueprint that installs `jq`, a lightweight command-line JSON processor: ```bash curl curl -X POST 'https://api.runloop.ai/v1/blueprints' \ -H 'accept: application/json' \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -d '{ "name": "docs-template", "system_setup_commands": ["sudo apt install -y jq"] }' ``` ```python Python from runloop_api_client import Runloop import os client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) blueprint = client.blueprints.create( name="docs-template", system_setup_commands=["sudo apt install -y jq"] ) print(f"Blueprint created with ID: {blueprint.id}") ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function createBlueprint() { const blueprint = await client.blueprints.create({ name: "docs-template", system_setup_commands: ["sudo apt install -y jq"] }); console.log(`Blueprint created with ID: ${blueprint.id}`); } createBlueprint(); ``` Use the Debian package manager (apt) for installing system packages on the Runloop base image. 
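Because a Blueprint has to finish building before it can be used, a script will usually want to wait for the build. The following is a minimal polling sketch, not an official helper: `wait_for_blueprint`, its parameters, and the default timeout are assumptions made for illustration. It takes a status-fetching callable so it can run offline; with the SDK you could pass something like `lambda bid: client.blueprints.retrieve(bid).status`.

```python
import time
from typing import Callable


def wait_for_blueprint(
    get_status: Callable[[str], str],
    blueprint_id: str,
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the Blueprint build completes, fails, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(blueprint_id)
        if status == "build_complete":
            return status
        if status == "build_failed":
            raise RuntimeError(f"Blueprint {blueprint_id} failed to build")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Blueprint {blueprint_id} still '{status}' after {timeout_s}s")
        sleep(poll_interval_s)


# Stand-in for the API so the sketch runs without network access.
_fake_statuses = iter(["provisioning", "building", "build_complete"])
final = wait_for_blueprint(lambda _id: next(_fake_statuses), "bpt_123", sleep=lambda _s: None)
print(final)
```

Injecting `get_status` and `sleep` keeps the loop easy to test and lets you swap in whichever client call and backoff policy you prefer.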
## Using Your Blueprint

Once your Blueprint's status is `build_complete`, create a Devbox using it:

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "blueprint_name": "fe-bot",
    "setup_commands": [
      "cd /home/user/runloop-fe && git pull",
      "npm install"
    ]
  }'
```

```python Python
from runloop_api_client import Runloop
import os

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

devbox = client.devboxes.create(
    blueprint_name="fe-bot",
    setup_commands=[
        "cd /home/user/runloop-fe && git pull",
        "npm install"
    ]
)
print(f"Devbox created with ID: {devbox.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

async function createDevbox() {
  const devbox = await client.devboxes.create({
    blueprint_name: "fe-bot",
    setup_commands: [
      "cd /home/user/runloop-fe && git pull",
      "npm install"
    ]
  });
  console.log(`Devbox created with ID: ${devbox.id}`);
}

createDevbox();
```

## Creating Blueprints with CodeMounts

### Basic Configuration

To add a CodeMount to your Blueprint:

```bash curl
curl -X POST 'https://api.runloop.ai/v1/blueprints' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -d '{
    "name": "fe-bot",
    "code_mounts": [{
      "repo_name": "runloop-fe",
      "repo_owner": "runloop"
    }]
  }'
```

```python Python
from runloop_api_client import Runloop
import os

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

blueprint = client.blueprints.create(
    name="fe-bot",
    code_mounts=[{
        "repo_name": "runloop-fe",
        "repo_owner": "runloop"
    }]
)
print(f"Blueprint created with ID: {blueprint.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

const blueprint = await client.blueprints.create({
  name: "fe-bot",
  code_mounts: [{
    repo_name: "runloop-fe",
    repo_owner: "runloop"
  }]
});
console.log(`Blueprint created with ID: ${blueprint.id}`);
```

This creates a Blueprint named "fe-bot" that includes the "runloop-fe" repository.

### Private Repository Authentication

For private repositories, include a GitHub Personal Access Token (PAT):

```bash curl
curl -X POST 'https://api.runloop.ai/v1/blueprints' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -d '{
    "name": "fe-bot",
    "code_mounts": [{
      "repo_name": "runloop-fe",
      "repo_owner": "runloop",
      "token": "'"${GH_TOKEN}"'"
    }]
  }'
```

```python Python
blueprint = client.blueprints.create(
    name="fe-bot",
    code_mounts=[{
        "repo_name": "runloop-fe",
        "repo_owner": "runloop",
        "token": os.environ.get("GH_TOKEN")
    }]
)
```

```typescript TypeScript
const blueprint = await client.blueprints.create({
  name: "fe-bot",
  code_mounts: [{
    repo_name: "runloop-fe",
    repo_owner: "runloop",
    token: process.env.GH_TOKEN
  }]
});
```

This sets up the necessary environment for immediate use of Git and GitHub tools.

## The Blueprint Build Process

When you create a Blueprint, Runloop builds a custom image containing all specified tools and configurations.
### Checking Build Status

After creating a Blueprint, check its status:

```bash curl
curl -X GET 'https://api.runloop.ai/v1/blueprints/{blueprint_id}' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"
```

```python Python
from runloop_api_client import Runloop
import os

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

blueprint = client.blueprints.retrieve("{blueprint_id}")
print(f"Blueprint status: {blueprint.status}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

async function checkBlueprintStatus(blueprintId: string) {
  const blueprint = await client.blueprints.retrieve(blueprintId);
  console.log(`Blueprint status: ${blueprint.status}`);
}

checkBlueprintStatus("{blueprint_id}");
```

Replace `{blueprint_id}` with the ID returned when you created the Blueprint.

Example response:

```json
{
  "id": "bpt_123",
  "name": "docs-template",
  "status": "build_complete",
  "create_time_ms": 1722264065963,
  "parameters": { ... }
}
```

The `status` field indicates the current state of your Blueprint:

* `build_complete`: The Blueprint is ready to use.
* `build_failed`: The build failed. Refer to the [Blueprint troubleshooting](/devboxes/troubleshooting-blueprints) guide.

## Advanced Usage: Custom Dockerfiles

For more complex environments, you can use a full Dockerfile as the basis for your Blueprint. This is useful when you need to install multiple tools or perform complex setup operations.

1. Base your Dockerfile on the Runloop base image:

   ```
   FROM public.ecr.aws/f7m5a7m8/devbox:prod
   ```

2. Runloop will:
   * Use your Dockerfile as the base
   * Apply any `system_setup_commands` specified
   * Set up any defined `CodeMount`s

The Runloop base image is public and can be downloaded for local testing.

## Keeping Blueprints Updated

Periodically update a Blueprint by building a new Blueprint with the same name. This ensures that your AI agents always work with the latest code and dependencies.
```bash curl curl -X POST 'https://api.runloop.ai/v1/blueprints' \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -d '{ "name": "fe-bot", "code_mounts": [{ "repo_name": "runloop-fe", "repo_owner": "runloop", "token": "'"${GH_TOKEN}"'" }] }' ``` ```python Python blueprint = client.blueprints.create( name="fe-bot", code_mounts=[{ "repo_name": "runloop-fe", "repo_owner": "runloop", "token": os.environ.get("GH_TOKEN") }] ) ``` ```typescript TypeScript const blueprint = await client.blueprints.create({ name: "fe-bot", code_mounts: [{ repo_name: "runloop-fe", repo_owner: "runloop", token: process.env.GH_TOKEN }] }); ``` This creates a new Blueprint version with the same `name`, allowing for faster updates and efficient resource use. ## Best Practices 1. **Start Simple**: Begin with basic Blueprints and gradually add complexity. 2. **Test Manually Using SSH**: You can create a devbox and SSH into it and manually install tools to make sure the commands are correct before layering them into Blueprints. 3. Always use `blueprint_name` instead of `blueprint_id` to ensure you're using the latest version. Use specific Blueprint IDs only when you need version control for particular setups. 4. Implement `setup_commands` in your Devbox creation to keep code and dependencies up-to-date. 5. Regularly update your Blueprints with the latest repository changes. By leveraging Blueprints effectively, you can create optimized, consistent environments for your AI-assisted software engineering tasks, enhancing productivity and reliability in your development process. ## Upcoming Features Future releases plan to include: * Multiple repository support in a single Blueprint * Specific branch specifications * Git submodules support * Advanced multi-step build processes If any of these features are critical for your use case, please let us know. 
# Mount a Code Repository on a Devbox

Source: https://docs.runloop.ai/devboxes/code-mounts

Enable AI agents to work with full projects: access public and private repositories

## Overview

Enabling your AI agent to work on full existing code projects unlocks a new set of capabilities. This guide explains how to give your AI agent access to entire codebases, allowing it to make changes and run projects end-to-end like a human engineer.

## Using Code Mounts

While you can use normal shell exec commands to clone a public GitHub repository, Runloop provides a more powerful feature called `CodeMounts`. This allows you to mount a repository into your Devbox at a specific path.

### Creating a Devbox with a Code Mount

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "code_mounts": [
      {
        "repo_name": "rl-cli",
        "repo_owner": "runloopai",
        "token": ""
      }
    ]
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

devbox = client.devboxes.create(
    code_mounts=[
        {
            "repo_name": "rl-cli",
            "repo_owner": "runloopai",
            "token": os.environ.get("GITHUB_TOKEN"),
        }
    ]
)
print(f"Devbox created with ID: {devbox.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

async function createDevbox() {
  const devbox = await client.devboxes.create({
    code_mounts: [
      {
        "repo_name": "rl-cli",
        "repo_owner": "runloopai",
        "token": process.env.GITHUB_TOKEN,
      }
    ]
  });
  console.log(`Devbox created with ID: ${devbox.id}`);
}

createDevbox();
```

This clones the repo onto the Devbox and allows you to pull changes and branches. Note that to create pull requests or perform other mutating actions, you must configure Git authentication as described below.

## Connecting to Private GitHub Repositories

To enable your Devbox to interact with private GitHub repositories, you need to provide proper authentication credentials. Runloop offers several methods to achieve this.
### Using Code Mounts with GitHub Token When you create a Devbox with a Code Mount, Runloop automatically sets up the `GH_TOKEN` environment variable and credential cache for you. This authenticates all command-line tools in your Devbox with your GitHub token. This allows your AI agent to use Github and open authenticated pull requests using the `gh` cli tool. ```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "code_mounts": [ { "repo_name": "rl-cli", "repo_owner": "runloopai", "token": "" } ] }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) devbox = client.devboxes.create( code_mounts=[ { "repo_name": "rl-cli", "repo_owner": "runloopai", "token": os.environ.get("GITHUB_TOKEN"), } ] ) print(f"Devbox created with ID: {devbox.id}") ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function createDevbox() { const devbox = await client.devboxes.create({ code_mounts: [ { "repo_name": "rl-cli", "repo_owner": "runloopai", "token": process.env.GITHUB_TOKEN, } ] }); console.log(`Devbox created with ID: ${devbox.id}`); } createDevbox(); ``` ### Manually Configuring Your Devbox for GitHub Alternatively, you can configure your Devbox manually using the `setup_commands` argument: ```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "environment_variables": {"GH_TOKEN": ""}, "setup_commands": [ "git config --global credential.helper '\''cache --timeout=3600'\''", "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store" ] }' ``` ```python Python import os from runloop_api_client import Runloop client = 
Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) devbox = client.devboxes.create( environment_variables={"GH_TOKEN": ""}, setup_commands=[ "git config --global credential.helper 'cache --timeout=3600'", "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store" ] ) print(f"Devbox created with ID: {devbox.id}") ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function createDevbox() { const devbox = await client.devboxes.create({ environment_variables: { GH_TOKEN: "" }, setup_commands: [ "git config --global credential.helper 'cache --timeout=3600'", "echo \"protocol=https\nhost=github.com\nusername=$GH_TOKEN\npassword=$GH_TOKEN\" | git credential-cache store" ] }); console.log(`Devbox created with ID: ${devbox.id}`); } createDevbox(); ``` This command: 1. Creates a new Devbox 2. Sets the `GH_TOKEN` environment variable with your GitHub token 3. Configures Git to use the credential cache 4. Stores your GitHub token in the Git credential cache for one hour Adjust the `--timeout` value in the git config command to change how long the credentials are cached. ### Best Practices for Token Security 1. Use tokens with the minimum required permissions for your tasks. 2. Regularly rotate your GitHub tokens. 3. Never commit or push files containing your tokens to version control. 4. Use environment variables when possible to avoid exposing tokens in command-line arguments. By following these guidelines, you can securely enable your AI agent to work with full projects and private repositories, expanding its capabilities within the Runloop Devbox environment. 
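The token practices above can be sketched in a few lines of Python. This is an illustrative helper, not part of the Runloop SDK: `require_env` and `mask` are hypothetical names, and the sample token is made up. The idea is to read the token from the environment, fail fast when it is missing, and never print it in full.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the named variable, raising if it is unset or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def mask(token: str, visible: int = 4) -> str:
    """Mask a token for logging, keeping only the last few characters."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


# Demo with an explicit mapping so the sketch does not depend on your shell.
demo_env = {"GH_TOKEN": "ghp_exampletoken1234"}
token = require_env("GH_TOKEN", env=demo_env)
print(f"Using GitHub token {mask(token)}")
```

In real code you would call `require_env("GH_TOKEN")` with the default environment and pass the result as the `token` of a code mount or as the `GH_TOKEN` environment variable.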
# Debugging Agent Output with SSH Source: https://docs.runloop.ai/devboxes/debugging-agent-output-with-ssh Securely connect to a remote Runloop Devbox using SSH for debugging ## Overview When working with AI-generated code, you may need to debug the state of the project after the AI has run various commands. SSH allows you to connect your computer directly to a Devbox, enabling you to debug, run remote commands, and view or modify the remote filesystem. Runloop uses a transparent proxy to facilitate routing for all SSH access. Your SSH connection is end-to-end encrypted using standard SSH public key cryptography. The Runloop API provides a mechanism for retrieving SSH keys using a Runloop API key. ## Setup We recommend using the `rl` CLI to interact with Devboxes. You can find installation instructions at [https://github.com/runloopai/rl-cli](https://github.com/runloopai/rl-cli). ## Create and SSH into a Devbox ```bash export RUNLOOP_API_KEY="ak_" ``` ```bash rl devbox create ``` You'll receive a response like this: ```json { "id": "dbx_2xMEVq0JpPtxUxZikhOLm", "blueprint_id": null, "create_time_ms": 1723232059063, "end_time_ms": null, "initiator_id": null, "initiator_type": "invocation", "name": null, "status": "provisioning" } ``` SSH into an `active` Devbox using the returned `id`: ```bash rl devbox ssh --id dbx_2xMEVq0JpPtxUxZikhOLm ``` You should now have a shell into the Devbox: ``` user@devbox-019138a2-7e80-7233-8100-1add224f41ee-zst79:~$ ``` Type `exit` to leave the SSH session. When you're done, shut down the Devbox: ```bash rl devbox shutdown --id dbx_2xMEVq0JpPtxUxZikhOLm ``` ## Using VSCode with SSH You can use SSH access to connect VSCode to the remote Devbox. Install the [Visual Studio Code Remote - SSH extension](https://code.visualstudio.com/docs/remote/ssh). 
```bash
rl devbox create
```

```bash
rl devbox ssh --id dbx_2xMEa8BVcYOOGtXGqWNVj --config-only
```

```bash
rl devbox ssh --id dbx_2xMEa8BVcYOOGtXGqWNVj --config-only >> ~/.ssh/config
```

```bash
ssh dbx_2xMEa8BVcYOOGtXGqWNVj "whoami"
```

This should return `user`. You now have a ready-to-use SSH connection to the Devbox. Follow the remaining instructions in the [VSCode SSH documentation](https://code.visualstudio.com/docs/remote/ssh#_connect-to-a-remote-host) to connect VSCode to your Devbox.

## Security Notes

* All SSH connections are routed through Runloop's transparent proxy.
* Connections are end-to-end encrypted using SSH public key cryptography.
* SSH keys are generated and managed securely through the Runloop API.

By following these steps, you can securely connect to your Runloop Devbox for debugging, code inspection, and project management tasks.

# Execute Commands on a Devbox

Source: https://docs.runloop.ai/devboxes/execute-commands

Run and execute code at scale

## Running Commands Synchronously vs Asynchronously

The Runloop shell APIs support both synchronous execution, for immediate results, and asynchronous execution, for long-running commands or daemons.

### Synchronous Commands

Synchronous commands run and block until the command results are available, including stdout, stderr, and the exit code of the command process.

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_sync' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "command": "echo Hello World"
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

result = client.devboxes.execute_sync("", command="echo Hello World")
print(result)
```

```typescript TypeScript
const result = await client.devboxes.executeSync('', { command: 'echo Hello World' });
console.log(result);
```

### Asynchronous Commands

Asynchronous commands let you start a command without blocking until its results are available. This can be useful for long-running commands or daemons such as launching dev servers or background processes.
```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_async' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "command": "while true; do echo \"Hello World\"; sleep 1; done"
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

result = client.devboxes.execute_async("", command="while true; do echo 'Hello World'; sleep 1; done")
print(result.stdout)
```

```typescript TypeScript
const result = await client.devboxes.executeAsync('', { command: 'while true; do echo "Hello World"; sleep 1; done' });
console.log(result.stdout);
```

```bash curl
curl -X GET 'https://api.runloop.ai/v1/devboxes//executions/' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

# Load the latest status of the execution
client.devboxes.executions.retrieve(
    "",
    devbox_id="",
)

# Alternatively, wait for the background command to complete
client.devboxes.executions.await_completed(
    "",
    devbox_id="",
)
```

```typescript TypeScript
const result = await client.devboxes.executions.retrieve('', "");
console.log(result.stdout);

// Alternatively, wait for the background command to complete
await client.devboxes.executions.awaitCompleted('', "");
```

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes//executions//kill' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{}'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

client.devboxes.executions.kill("", devbox_id="")
```

```typescript TypeScript
await client.devboxes.executions.kill('', "");
```

## Isolated vs Stateful Shells

By default, every Devbox command is run in an isolated shell.
This means that each command is executed in a new shell session, and the state of the shell is not preserved between commands. ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_sync' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{"command": "echo Hello World"}' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"]) result = client.devboxes.execute_sync("", command="echo Hello World") print(result.stdout) ``` ```typescript TypeScript const result = await client.devboxes.executeSync('', { command: 'echo Hello World' }); console.log(result.stdout); ``` ## Using Stateful Shells Alternatively, you can use the `shell_name` parameter to use a 'stateful' shell. This means that the shell will maintain its state across commands including environment variables and working directory. As an example, let's create a series of interdependent commands that need to be run in the same shell: ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_sync' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "command": "pwd", "shell_name": "my-shell" }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"]) result = client.devboxes.execute_sync("", command="pwd", shell_name="my-shell") print(result.stdout) ``` ```typescript TypeScript const result = await client.devboxes.executeSync('', { command: 'pwd', shell_name: 'my-shell' }); console.log(result.stdout); ``` ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_sync' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "command": "mkdir mynewfolder && cd mynewfolder", "shell_name": "my-shell" }' ``` ```python Python import os from runloop_api_client import Runloop client = 
Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"]) client.devboxes.execute_sync( "", command="mkdir mynewfolder && cd mynewfolder", shell_name="my-shell" ) ``` ```typescript TypeScript await client.devboxes.executeSync('', { command: 'mkdir mynewfolder && cd mynewfolder', shell_name: 'my-shell' }); ``` ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes//execute_sync' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "command": "pwd", "shell_name": "my-shell" }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"]) result = client.devboxes.execute_sync("", command="pwd", shell_name="my-shell") print(result.stdout) ``` ```typescript TypeScript const result = await client.devboxes.executeSync('', { command: 'pwd', shell_name: 'my-shell' }); console.log(result.stdout); ``` # Read and Write Files on a Devbox Source: https://docs.runloop.ai/devboxes/files Give your AI agent access to modify and interact with files on your devbox. export const ExampleRepoLink = props => { return


; }; ## Overview In addition to running commands, your AI agent may need to modify or read files on your Devbox. The Runloop Devbox provides full programmatic access to the underlying filesystem, allowing your agent to interact with files as needed. ## Writing Files to the Devbox When authoring code, your AI Agent will often need to write files to disk. There are two main methods for this: ### Writing Small Text Files You can use `write_file_contents` to easily write a UTF-8 string to a file on disk. Note that all file paths are relative to the user's home directory by default. ```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes//write_file_contents' \ -H "Authorization: Bearer " \ -H 'Content-Type: application/json' \ -d '{ "file_path": "/home/user/main.py", "contents": "print(\"Hello, World!\")" }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) client.devboxes.write_file_contents( "", file_path="/home/user/main.py", contents='print("Hello, World!")' ) ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function writeFileContents() { await client.devboxes.writeFileContents('', { file_path: '/home/user/main.py', contents: 'print("Hello, World!")' }); } writeFileContents(); ``` ### Uploading Large or Non-Text Files For larger files or binary data, you should use the `upload_file` API, which supports files of any sizes and allows passing non text data: ```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes//upload_file' \ -H "Authorization: Bearer " \ -H 'Content-Type: multipart/form-data' \ -F "path=/home/user/large_data.txt" \ -F "file=@large_data.txt" ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) with open('large_data.txt', 'rb') as file: client.devboxes.upload_file( "", 
path="/home/user/large_data.txt", file=file ) ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; import fs from 'fs'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function uploadFile() { const file = fs.createReadStream('large_data.txt'); await client.devboxes.uploadFile('', { path: '/home/user/large_data.txt', file: file }); } uploadFile(); ``` ## Reading Files Your AI Agent will often also need to read files from the Devbox. There are two main methods for this: ### Reading Small Text Files You can use `read_file_contents` to read the contents of a file on the Devbox as a UTF-8 string. ```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes//read_file_contents' \ -H "Authorization: Bearer " \ -H 'Content-Type: application/json' \ -d '{ "file_path": "/home/user/test_results.txt" }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) contents = client.devboxes.read_file_contents( "", file_path="/home/user/test_results.txt" ) print(contents) ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function readFile() { const contents = await client.devboxes.readFileContents('', { file_path: '/home/user/test_results.txt' }); console.log(contents); } readFile(); ``` ### Downloading Large or Non-Text Files You can also use `download_file` to download a file from the Devbox directly for large or non-text files. 
```bash curl curl -X POST \ 'https://api.runloop.ai/v1/devboxes//download_file' \ -H "Authorization: Bearer " \ -H 'Content-Type: application/json' \ -d '{ "file_path": "/home/user/large_data.txt" }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) client.devboxes.download_file( "", file_path="/home/user/large_data.txt" ) ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function downloadFile() { await client.devboxes.downloadFile('', { file_path: '/home/user/large_data.txt' }); } downloadFile(); ``` ## Best Practices 1. Always specify the full path when working with files to avoid ambiguity. 2. Be mindful of file permissions when reading or writing files in different directories. 3. Use error handling in your AI agent's code to manage potential issues with file operations, such as "file not found" or "permission denied" errors. By leveraging these file operations, your AI agent can effectively manage code, data, and results within the Runloop Devbox environment. # The Devbox Lifecycle Source: https://docs.runloop.ai/devboxes/lifecycle Reference documentation for the various states a Devbox can be in. ## Understanding the Devbox State Machine Devboxes represent a persistent dev environment that can be launched and shut down as needed. Over the course of a Devbox's lifecycle, it will transition through a series of states depending on your use case: * **provisioning**: Runloop is allocating and booting the necessary infrastructure resources. * **initializing**: Runloop defined boot scripts are running to enable the environment for interaction. * **running**: The Devbox is ready for interaction. * **failure**: The Devbox failed as part of booting or running user requested actions. * **shutdown**: The Devbox was successfully shutdown and no more active compute is being used. 
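When writing client code against these states, it can help to model them explicitly. The enum below is a small sketch built only from the states listed above; the names `DevboxStatus`, `is_ready`, and `is_terminal` are hypothetical, and treating `failure` and `shutdown` as terminal is an assumption for polling purposes. The API may return other values that this sketch does not cover.

```python
from enum import Enum


class DevboxStatus(str, Enum):
    """Devbox states as listed in the lifecycle reference above."""
    PROVISIONING = "provisioning"
    INITIALIZING = "initializing"
    RUNNING = "running"
    FAILURE = "failure"
    SHUTDOWN = "shutdown"


# Assumption: once a Devbox fails or shuts down, polling can stop.
TERMINAL = {DevboxStatus.FAILURE, DevboxStatus.SHUTDOWN}


def is_ready(status: str) -> bool:
    """True when the Devbox can accept commands."""
    return DevboxStatus(status) is DevboxStatus.RUNNING


def is_terminal(status: str) -> bool:
    """True when no further state transitions are expected."""
    return DevboxStatus(status) in TERMINAL


for s in ["provisioning", "running", "shutdown"]:
    print(s, "ready" if is_ready(s) else "terminal" if is_terminal(s) else "booting")
```

Parsing the raw string through the enum makes an unexpected status raise `ValueError` immediately instead of being silently ignored.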
{/* - **suspending**: The Devbox disk is being snapshotted and as part of suspension. - **suspended**: The Devbox disk is saved and no more active compute is being used for the Devbox. - **resuming**: The Devbox disk is being loaded as part of booting a suspended Devbox. */} {/* Re-enable this section once suspend and resume is stable ### Suspending and Resuming Devboxes to Save Disk State In addition to use idle management configuration, you can also manually suspend and resume a devbox. Only disk state, not in-memory state is preserved during suspend/resume operations ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/suspend' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' ``` ```python Python from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) client.devboxes.suspend(devbox_id) ``` ```typescript TypeScript const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); await client.devboxes.suspend(devbox_id); ``` ```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/resume' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" ``` ```python Python client.devboxes.resume(devbox_id) ``` ```typescript TypeScript await client.devboxes.resume(devbox_id); ``` ```bash curl # wait for the devbox to be running again while true; do status=$(curl -s -X GET 'https://api.runloop.ai/v1/devboxes/{devbox_id}' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" | jq -r '.status') if [ "$status" == "running" ]; then break fi sleep 1 done ``` ```python Python devbox = client.devboxes.await_running(devbox_id) ``` ```typescript TypeScript const devbox = await client.devboxes.awaitRunning(devbox_id); ``` ### Important Notes - Suspended Devboxes still incur storage charges until explicitly shut down - The suspend/resume process typically takes seconds, depending on the amount of modified data - Daemons or other processes running at suspend time must be 
manually restarted after resuming - The original Devbox ID and SSH keys are preserved through suspend/resume cycles */}

# Managing Devbox Metadata

Source: https://docs.runloop.ai/devboxes/metadata

Effectively manage and organize large numbers of Devboxes using metadata

When working with hundreds or thousands of Devboxes, effective organization becomes crucial. Runloop provides a powerful metadata system to help you tag, categorize, and filter your Devboxes efficiently.

## Using Metadata

Metadata allows you to attach custom key-value pairs to your Devboxes. This information can include:

* Project names
* Team assignments
* Environment types (e.g., development, staging, production)
* Any other relevant tags for your workflow

## Adding Metadata to Devboxes

When creating a Devbox, you can include metadata to help organize and filter them later:

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "metadata": {
      "project": "runloop-fe",
      "team": "frontend",
      "environment": "development"
    }
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])

devbox = client.devboxes.create(
    metadata={
        "project": "runloop-fe",
        "team": "frontend",
        "environment": "development"
    }
)

print(f"Devbox created with ID: {devbox.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

// Same client setup as the other TypeScript examples in these docs
const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

const devbox = await client.devboxes.create({
  metadata: {
    project: "runloop-fe",
    team: "frontend",
    environment: "development"
  }
});

console.log(`Devbox created with ID: ${devbox.id}`);
```

## Benefits of Using Metadata

1. **Easy Filtering**: Quickly find Devboxes related to specific projects or teams.
2. **Improved Organization**: Group Devboxes logically based on your workflow.
3. **Enhanced Visibility**: Easily identify the purpose and ownership of each Devbox.
4. **Streamlined Management**: Perform bulk operations on Devboxes with similar metadata.

## Viewing and Filtering Metadata

The Runloop dashboard displays metadata tags for each Devbox, allowing you to:

* View all metadata associated with a Devbox at a glance
* Use integrated filters to sort and find Devboxes based on their metadata
* Create custom views based on frequently used metadata filters

## Best Practices for Using Metadata

1. **Consistent Naming**: Use a consistent naming convention for your metadata keys and values.
2. **Relevant Information**: Include only metadata that is useful for organizing and filtering.
3. **Update Regularly**: Keep metadata up-to-date as projects evolve or team assignments change.
4. **Use Hierarchies**: Consider using hierarchical metadata (e.g., "env:production" instead of just "production").

By effectively using metadata, you can maintain organization and clarity even when managing thousands of Devboxes across multiple projects and teams.

# Overview of Devboxes

Source: https://docs.runloop.ai/devboxes/overview

Runloop devboxes are the foundation of building AI coding agents fast. We built devboxes because we were tired of hitting the same common problems and needs when building new agents.

## Devboxes & Your Stack

Your AI agent will need to do more than just chat. Very likely, you are building an agent that will:

* Query external APIs
* Pull, build, and execute code from git repositories
* Run a headless browser to scrape or interact with websites
* Read and write files on a filesystem
* Run proprietary code or binaries

In development, it's easy to do all of these things with a script on your local machine. But in production, you'll need a better approach. That's where devboxes come in.

Runloop devboxes are **the isolated virtual machine your AI agent does its work on.** By building your agent against devbox APIs, your agent can do all of these things without you investing significant time and effort in building infrastructure.
## Key Devbox Features

* **Isolated, ephemeral virtual machines:** Devboxes are created on demand, and deleted when they are no longer needed.
* **Super fast boot times:** Our base devbox images are optimized to boot in less than `200ms`.
* **Stateful or stateless:** By default, devboxes are stateless and are destroyed after each run. But devboxes also support **snapshot**, **suspend**, and **resume**, each with one simple API call.
* **Customizable sizes and images:** You can choose machine size and resources from a range of options, and you can create and customize team-shared images with blueprints.

## Working with Devboxes

Your agent code will interact with devboxes through the Runloop API. We provide [client SDKs](/tools/sdks) for Python and TypeScript. You can also use the [Runloop CLI](/tools/cli) and the [Runloop Dashboard](/tools/dashboard) to view, manage, and monitor your devboxes.

Ready to get started? Read on for quick examples showcasing common devbox uses.

# Configuring Devbox Instance Sizes

Source: https://docs.runloop.ai/devboxes/sizes

Configure your Devboxes using predefined sizes

Runloop offers flexible options to tailor your Devbox resources and lifecycle to your specific AI workloads. This guide covers predefined resource sizes for standardized configurations.

## Predefined Resource Sizes

Runloop provides the following resource configurations for Devboxes:

| Size      | CPU | Memory | Storage |
| --------- | --- | ------ | ------- |
| X\_SMALL  | 0.5 | 1GB    | 4GB     |
| SMALL     | 1   | 2GB    | 4GB     |
| MEDIUM    | 2   | 4GB    | 8GB     |
| LARGE     | 2   | 8GB    | 16GB    |
| X\_LARGE  | 4   | 16GB   | 16GB    |
| XX\_LARGE | 8   | 32GB   | 16GB    |

## Launch Parameters

When creating a Devbox, use `LaunchParameters` to specify the desired configuration.
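The size table above can also be captured as a lookup for choosing a `resource_size` value programmatically. This is a minimal sketch under stated assumptions: `SizeSpec`, `RESOURCE_SIZES`, and `pick_resource_size` are hypothetical names that do not exist in the Runloop SDK, and the numbers are copied from the table, so they would need updating if the table changes.

```python
from typing import NamedTuple, Optional


class SizeSpec(NamedTuple):
    """One row of the predefined size table (hypothetical helper type)."""

    cpu: float
    memory_gb: int
    storage_gb: int


# Values copied from the table above, ordered smallest to largest.
RESOURCE_SIZES = {
    "X_SMALL": SizeSpec(0.5, 1, 4),
    "SMALL": SizeSpec(1, 2, 4),
    "MEDIUM": SizeSpec(2, 4, 8),
    "LARGE": SizeSpec(2, 8, 16),
    "X_LARGE": SizeSpec(4, 16, 16),
    "XX_LARGE": SizeSpec(8, 32, 16),
}


def pick_resource_size(
    cpu: float, memory_gb: int, storage_gb: int = 0
) -> Optional[str]:
    """Return the first (smallest) size meeting every requirement, or None."""
    for name, spec in RESOURCE_SIZES.items():
        if (
            spec.cpu >= cpu
            and spec.memory_gb >= memory_gb
            and spec.storage_gb >= storage_gb
        ):
            return name
    return None
```

The returned name could then be passed as the `resource_size` value in `launch_parameters`.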
### Resource Size

Set the `resource_size` parameter to choose a predefined size:

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "launch_parameters": {
      "resource_size": "MEDIUM"
    }
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

devbox = client.devboxes.create(
    launch_parameters={
        "resource_size": "MEDIUM"
    }
)

print(f"Devbox created with ID: {devbox.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

async function createDevbox() {
  const devbox = await client.devboxes.create({
    launch_parameters: {
      resource_size: "MEDIUM"
    }
  });

  console.log(`Devbox created with ID: ${devbox.id}`);
}

createDevbox();
```

This example creates a Devbox with 2 CPU cores and 4GB of memory, matching the `MEDIUM` row in the size table.

# Devbox Snapshots

Source: https://docs.runloop.ai/devboxes/snapshots

Saved disk states from existing devboxes for re-use & branching

export const ExampleRepoLink = props => { return

Full example

; }; Snapshots can be used to save the current disk state of a devbox, and to create new devboxes from a previous point in time. These can be used to: * Improve build times by snapshotting a populated build cache. * Roll back to a known good point in time. * Perform fan-out and attempt multiple approaches to a code change. Snapshots are referenced by a random identifier and can be queried via the API. Currently only disk snapshots are supported. When should I use a Blueprint vs. a Snapshot? Snapshots and Blueprints both allow you to run devboxes with customizations. **Blueprints** are fast to boot and cacheable using Docker layers, while **Snapshots** are a bit slower on boot (reproducing each step taken in the devbox) but can be created quickly from an existing devbox. Examples: * **[Blueprint](/devboxes/blueprints)**: You have a coding agent that is performing a task that requires installing a specific tool. Create a blueprint with set-up steps for the tool and future devboxes will cache the installation to speed up boot and execution time. * **[Snapshot](/devboxes/snapshots)**: You have a coding agent in a devbox considering 3 different ways to complete a task. Create a snapshot of the initial state of the devbox, create 3 parallel devboxes from that snapshot, collate the results, and then choose the best option to continue. First, identify a running devbox id using the dashboard or rl-cli. ```shell $ rl devbox list --status=running ``` Optionally, you may want to remove any temporary files before proceeding to reduce the latency of snapshot operations. 
```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/snapshot_disk' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{}'
```

```python Python
snapshot = client.devboxes.snapshot_disk(devbox.id)
print(f"Snapshot created with ID: {snapshot.id}")
```

```typescript TypeScript
const snapshot = await client.devboxes.snapshotDisk(devbox.id);
console.log(`Snapshot created with ID: ${snapshot.id}`);
```

Using the `snapshot_id` from the previous step, launch a new devbox.

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "snapshot_id": "<snapshot_id>"
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

devbox = client.devboxes.create(
    snapshot_id=snapshot.id  # the snapshot created in the previous step
)

print(f"Devbox created with ID: {devbox.id}")
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

const devbox = await client.devboxes.create({
  snapshot_id: snapshot.id // the snapshot created in the previous step
});

console.log(`Devbox created with ID: ${devbox.id}`);
```

# Start and Stop a Devbox

Source: https://docs.runloop.ai/devboxes/start-stop

Getting started with the Runloop platform

export const ExampleRepoLink = props => { return

Full example

; };

### Launching and Shutting Down New Devboxes

When a Devbox is launched, Runloop will allocate the necessary infrastructure. You should see the Devbox transition to the `running` state, at which point you will be able to interact with the Devbox:

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{}'
```

```python Python
import os
from runloop_api_client import Runloop

runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

# create the devbox and wait for it to be ready
devbox = runloop_client.devboxes.create_and_await_running()
print(devbox.id)
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const runloopClient = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

const devbox = await runloopClient.devboxes.createAndAwaitRunning();
console.log(devbox.id);
```

You can also create a devbox from a snapshot or blueprint to optimize boot times. Check out the [Snapshots Guide](/devboxes/snapshots) for more information.

When you are done with a devbox, you can shut it down to free up resources.

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes/{devbox_id}/shutdown' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"
```

```python Python
# reuses the client and devbox from the launch example above
runloop_client.devboxes.shutdown(devbox.id)
```

```typescript TypeScript
// reuses the client and devbox from the launch example above
await runloopClient.devboxes.shutdown(devbox.id);
```

Once a devbox is shut down, its disk state is deleted. If you want to keep your devbox's disk state, you should suspend it instead or use a snapshot.

{/* ### Idle Management By default, Devboxes will automatically shutdown after 1 hour of inactivity. However, you can configure your Devbox to either suspend or shutdown after a custom period of time to optimize costs.
```bash curl curl -X POST 'https://api.runloop.ai/v1/devboxes' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -H 'Content-Type: application/json' \ -d '{ "launch_parameters": { "after_idle": { "on_idle": "suspend", "idle_time_seconds": 1800 } } }' ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) devbox = client.devboxes.create( launch_parameters={ "after_idle": { "idle_time_seconds": 1800, "on_idle": "suspend" } } ) ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); const devbox = await client.devboxes.create({ launch_parameters: { after_idle: { idle_time_seconds: 1800, on_idle: "suspend" } } }); ``` */} # Troubleshooting Blueprint Builds Source: https://docs.runloop.ai/devboxes/troubleshooting-blueprints Debug and fix your Blueprint builds in Runloop. ## Step 1: Check Blueprint Logs Start by examining the build process logs. Runloop builds a Docker image behind the scenes, and you can access these logs using the Blueprint logs endpoint. ```bash curl curl -X GET 'https://api.runloop.ai/v1/blueprints/{blueprint_id}/logs' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) logs = client.blueprints.logs("{blueprint_id}") for log in logs.logs: print(f"{log.level}: {log.message}") ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function getBlueprintLogs(blueprintId: string) { const logs = await client.blueprints.logs(blueprintId); logs.logs.forEach(log => { console.log(`${log.level}: ${log.message}`); }); } getBlueprintLogs("{blueprint_id}"); ``` Replace `{blueprint_id}` with your actual Blueprint ID. 
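When a build produces many log lines, it can help to filter for the ones that look like failures. The sketch below is a hypothetical helper, not part of the Runloop SDK: it works on plain dictionaries with `level` and `message` keys (the same two fields printed above), and the marker list is an assumption you would tune for your own builds. It matches on the message text rather than the level, on the assumption that output from a failing command may not be logged at an error level.

```python
from typing import Dict, Iterable, List

# Assumed markers of a failing build step; adjust for your own builds.
FAILURE_MARKERS = ("fatal:", "error", "failed", "exit status")


def find_suspect_logs(logs: Iterable[Dict[str, str]]) -> List[str]:
    """Return messages containing any failure marker (case-insensitive)."""
    suspects = []
    for entry in logs:
        message = entry.get("message", "")
        lowered = message.lower()
        if any(marker in lowered for marker in FAILURE_MARKERS):
            suspects.append(message)
    return suspects
```

Reading the returned messages in order usually points at the first failing command, which is a reasonable place to start debugging.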
### Interpreting Log Output The logs can help you identify specific build issues. Here's an example of what you might see: ```json [ { "level": "info", "timestamp_ms": 1722357063912, "message": "fatal: could not read Password for 'https://$GH_TOKEN@github.com': No such device or address" }, { "level": "info", "timestamp_ms": 1722357063915, "message": "error building image: error building stage: failed to execute command: waiting for process to exit: exit status 128" } ] ``` In this example, the error suggests an issue with GitHub authentication, possibly due to an invalid or missing token. ## Step 2: Local Build Testing If the logs don't reveal an obvious problem, you may want to build the Docker image locally. This can help identify issues specific to your development environment or configuration. ### 2.1 Obtain the Dockerfile Use the `preview` endpoint to get the full Docker configuration: ```bash curl curl -X POST 'https://api.runloop.ai/v1/blueprints/preview' \ -H 'accept: application/json' \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $RUNLOOP_API_KEY" \ -d '{ "name": "ai-dev-environment", "dockerfile": "FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n" }' | jq -r '.dockerfile' > Dockerfile.runloop ``` ```python Python import os from runloop_api_client import Runloop client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) preview = client.blueprints.preview( name="ai-dev-environment", dockerfile="FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n" ) with open("Dockerfile.runloop", "w") as f: f.write(preview.dockerfile) print("Dockerfile saved as Dockerfile.runloop") ``` ```typescript TypeScript import Runloop from '@runloop/api-client'; import fs from 'fs/promises'; const client = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); async function getDockerfilePreview() { const preview = await client.blueprints.preview({ name: 
"ai-dev-environment", dockerfile: "FROM public.ecr.aws/f7m5a7m8/devbox:prod\nRUN apt-get update && apt-get install -y cowsay\n" }); await fs.writeFile("Dockerfile.runloop", preview.dockerfile); console.log("Dockerfile saved as Dockerfile.runloop"); } getDockerfilePreview(); ``` This command uses `jq` to extract the Dockerfile from the response and save it to a file named `Dockerfile.runloop`. ### 2.2 Build Locally With your `Dockerfile.runloop` ready, you can test and debug the build locally: ```bash docker build --build-arg GH_TOKEN_0="$GH_TOKEN" -f Dockerfile.runloop -t local-blueprint-img . ``` ## Step 3: Common Issues and Solutions Here are some common issues you might encounter and how to resolve them: 1. **GitHub Authentication Errors**: * Ensure your `GH_TOKEN` is valid and has the necessary permissions. * Check that the token is correctly set in your environment variables. 2. **Package Installation Failures**: * Verify that your `system_setup_commands` are correct and compatible with the base image. * Ensure you're using the correct package manager (apt for Debian-based images). 3. **CodeMount Issues**: * Double-check the repository name, owner, and access permissions. * Verify that the `install_command` is appropriate for your project. 4. **Resource Constraints**: * If the build is timing out or failing due to resource limits, consider optimizing your Dockerfile or increasing resource allocations. ## Step 4: Seeking Additional Help If you're still encountering issues after following these steps: 1. Review the [Runloop Documentation](https://docs.runloop.ai) for any updates or known issues. 2. Reach out to Runloop support with: * Your Blueprint ID * The full logs from both Runloop and your local build attempts * A description of the steps you've taken to troubleshoot By following this troubleshooting guide, you should be able to identify and resolve most issues with your Blueprint builds. 
# Open a Tunnel to a Service on a Devbox Source: https://docs.runloop.ai/devboxes/tunnels Create a tunnel to access ports on your Devbox export const ExampleRepoLink = props => { return

Full example

; };

When developing software on your Devbox, you will often want to expose services running on your Devbox to the outside world. For example, you may want to have your agent start a local web server to serve a frontend application and then expose the live frontend to your users. Tunnels are also useful to:

* remotely collaborate on a frontend project,
* test a web service,
* access a Jupyter notebook running on your Devbox,
* or access a local database running on your Devbox.

Let's use Devbox tunnels to securely access ports on your Devbox over a simple URL.

## Setting up a tunnel

To set up a Devbox tunnel, first make any ports you want to access available at Devbox creation time by creating a devbox with the ports you want to expose.

```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "launch_parameters": {
      "available_ports": [4040]
    },
    "entrypoint": "python3 -m http.server 4040"
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

devbox = client.devboxes.create(
    launch_parameters={
        "available_ports": [4040]
    },
    entrypoint="python3 -m http.server 4040"
)
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

async function createDevbox() {
  const devbox = await client.devboxes.create({
    launch_parameters: {
      available_ports: [4040]
    },
    entrypoint: "python3 -m http.server 4040"
  });
  return devbox;
}

createDevbox();
```

Once your devbox is created and running, you can open a tunnel. Use the `create_tunnel` endpoint to create a unique URL to your devbox.
```bash curl
curl -X POST 'https://api.runloop.ai/v1/devboxes//create_tunnel' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "port": 4040
  }'
```

```python Python
import os
from runloop_api_client import Runloop

client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

# `devbox` is the Devbox created in the previous step
tunnel = client.devboxes.create_tunnel(
    devbox.id,
    port=4040
)
```

```typescript TypeScript
// `client` and `devbox` come from the previous step
async function createTunnel() {
  const tunnel = await client.devboxes.createTunnel(devbox.id, {
    port: 4040
  });
  return tunnel;
}

createTunnel();
```

From the result, extract the URL and open it in your browser to access the service running on your Devbox.

While the Devbox is active and the tunnel is open, anyone with the URL has remote access to this port of your Devbox. Treat the URL with care.

# Usage with Common Model Providers and Frameworks

Source: https://docs.runloop.ai/examples/llm-integrations

### **Initialization**

To integrate with Runloop and your preferred LLM provider, initialize the respective SDK clients.

```python Python
from anthropic import Anthropic
from runloop_api_client import Runloop
import os

# Initialize clients
anthropic = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))
```

```typescript TypeScript
import Runloop from '@runloop/api-client';
import Anthropic from '@anthropic-ai/sdk';

// Initialize clients
const anthropic = new Anthropic({apiKey: process.env.ANTHROPIC_API_KEY,});
const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY});
```

### **Defining Prompts**

Defining a clear, actionable prompt ensures accurate LLM responses.

```python Python
# Parentheses let Python join the adjacent string literals into one string
system_prompt = (
    "You are a helpful coding assistant that can generate and execute python code. "
    "You only respond with the code to be executed and nothing else. "
    "Strip backticks in code blocks."
)

prompt = (
    "Write a Python script that generates a maze. The script should:\n"
    "1. Accept a size parameter from command line arguments\n"
    "2. Generate a random maze of the specified size. Remember to make the maze solvable "
    "and easy and to make it clear the outer borders of the maze.\n"
    "3. Print the maze where '#' represents walls and ' ' represents paths. "
    "Mark the Maze start with 'S' and end with 'E'\n"
    "4. Use argparse for command line argument parsing\n"
    "The code should be in the format of a Python script that can be run directly "
    "with 'python gen_maze.py --size 5'. "
    "ONLY output the code and do NOT wrap the code in markdown! The code should begin "
    "with an import and end with a print statement."
)
```

```typescript TypeScript
// Referenced by the framework examples further down
const systemPrompt = `You are a helpful coding assistant that can generate and execute python code. You only respond with the code to be executed and nothing else. Strip backticks in code blocks.`;

const prompt = `Write a Python script that generates a maze. The script should:
1. Accept a size parameter from command line arguments
2. Generate a random maze of the specified size. Remember to make the maze solvable and easy and to make it clear the outer borders of the maze.
3. Print the maze where '#' represents walls and ' ' represents paths. Mark the Maze start with 'S' and end with 'E'
4. Use argparse for command line argument parsing
The code should be in the format of a Python script that can be run directly with 'python gen_maze.py --size 5'.
ONLY output the code and do NOT wrap the code in markdown!`;
```

### **Generating Code**

Send the defined prompts to the LLM's message endpoint, configure parameters, and extract the generated code from the response.

```python Python
# Generate code using Claude
response = anthropic.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=system_prompt,  # the system prompt is a top-level parameter, not a message
    messages=[
        {"role": "user", "content": prompt}
    ]
)

maze_generation_script = response.content[0].text
```

```typescript TypeScript
const {content} = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1000,
  temperature: 0,
  system: "Respond only with code. Do not include any markdown or comments.",
  messages: [{
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": prompt
      }]
  }]
});

const mazeGenerationScript = (content[0] as { text: string }).text;
```

### **Running Code on a Devbox**

After retrieving the code from the LLM, execute it in a Runloop Devbox.

```python Python
def run_maze_script(code: str, size: int) -> str:
    """Write the generated script to a new Devbox, run it, and return its output."""
    devbox = runloop.devboxes.create_and_await_running()
    print("Devbox ID:", devbox.id)

    runloop.devboxes.write_file_contents(
        devbox.id,
        file_path="gen_maze.py",
        contents=code
    )

    result = runloop.devboxes.execute_sync(
        devbox.id,
        command=f"python gen_maze.py --size {size}"
    )

    if result.exit_status == 0:
        print("Maze generated successfully\n", result.stdout)
        return result.stdout
    else:
        print("Script execution failed:", result.stderr)
        return result.stderr


run_maze_script(maze_generation_script, size=11)
```

```typescript TypeScript
const devbox = await runloop.devboxes.createAndAwaitRunning();
console.log(`Devbox ID: ${devbox.id}`);

await runloop.devboxes.writeFileContents(devbox.id, {
  file_path: "gen_maze.py",
  contents: mazeGenerationScript,
});

const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
  command: "python gen_maze.py --size 11",
});

exit_status === 0
  ? console.log("Maze generated successfully\n", stdout)
  : console.error("Maze generation failed\n", stderr);
```

### **Integrating with Popular Frameworks**

The examples below show integration of Runloop with popular frameworks and LLM providers. The examples follow this structure:

1. **Client Initialization**: Set up SDK clients with environment variables.
2. **Prompt Definition**: Use pre-defined system and user prompts.
3. **Code Generation**: Generate code based on the prompts.
4. **Execution**: Run the code in a secure Runloop Devbox.

Prompts are defined above and reused across examples. Where an example uses the non-null assertion operator (`!`) on an environment variable, replace it with a default value or explicit handling as needed.
## **TypeScript Integrations** ```typescript Claude import Runloop from '@runloop/api-client'; import Anthropic from '@anthropic-ai/sdk'; // Initialize clients const anthropic = new Anthropic({apiKey: process.env.ANTHROPIC_API_KEY,}); const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY}); async function generateMazeCreator() { try{ const {content} = await anthropic.messages.create({ model: "claude-3-5-sonnet-20241022", max_tokens: 1000, temperature: 0, system: "Respond only with code. Do not include any markdown or comments.", messages: [{ "role": "user", "content": [ { "type": "text", "text": prompt }] } ] }); const mazeGenerationScript = (content[0] as { text: string }).text; // Execute the script in a Devbox const devbox = await runloop.devboxes.createAndAwaitRunning(); console.log(`Devbox ID: ${devbox.id}`); await runloop.devboxes.writeFileContents(devbox.id, { file_path: "gen_maze.py", contents: mazeGenerationScript, }); const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, { command: "python gen_maze.py --size 11", }); exit_status === 0 ? 
console.log("Maze generated successfully\n", stdout) : console.error("Maze generation failed\n", stderr); } catch (error) { console.error("Error:", error); } } generateMazeCreator(); ``` ```typescript Gemini import Runloop from '@runloop/api-client'; import { GoogleGenerativeAI } from "@google/generative-ai"; // Initialize clients const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!); const model = genAI.getGenerativeModel({ model: process.env.GEMINI_MODEL!}); // non-null assertion operator in use, add default value to handle undefined const runloop = new Runloop({bearerToken: process.env.RUNLOOP_API_KEY}); async function generateMazeCreator() { try{ // Generate code using Google Gemini const mazeGenerationScript = (await model.generateContent(prompt)).response.text(); console.log(mazeGenerationScript); // Execute the script in a Devbox const devbox = await runloop.devboxes.createAndAwaitRunning(); console.log(`Devbox ID: ${devbox.id}`); await runloop.devboxes.writeFileContents(devbox.id, { file_path: "gen_maze.py", contents: mazeGenerationScript, }); const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, { command: "python gen_maze.py --size 11", }); exit_status === 0 ? 
console.log("Maze generated successfully\n", stdout) : console.error("Maze generation failed\n", stderr); } catch (error) { console.error("Error:", error); } } generateMazeCreator(); ``` ```typescript LangChain import { ChatOpenAI } from "@langchain/openai"; import { ChatPromptTemplate } from "@langchain/core/prompts"; import { StringOutputParser } from "@langchain/core/output_parsers"; import Runloop from '@runloop/api-client'; // Initialize Runloop client const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY }); async function generateMazeCreator() { try { // Generate code using OpenAI const llm = new ChatOpenAI({ model: "gpt-4o", temperature: 0 }); const promptTemplate = ChatPromptTemplate.fromMessages([ { role: "system", content: systemPrompt }, { role: "user", content: "{input}" }, ]); const outputParser = new StringOutputParser(); // Create and run the chain const chain = promptTemplate.pipe(llm).pipe(outputParser); const mazeGenerationScript = await chain.invoke({ input: prompt }); console.log(mazeGenerationScript); // Execute the script in a Devbox const devbox = await runloop.devboxes.createAndAwaitRunning(); console.log(`Devbox ID: ${devbox.id}`); await runloop.devboxes.writeFileContents(devbox.id, { file_path: "gen_maze.py", contents: mazeGenerationScript, }); const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, { command: "python gen_maze.py --size 10", }); exit_status === 0 ? 
console.log("Maze generated successfully\n", stdout) : console.error("Maze generation failed\n", stderr); } catch (error) { console.error("Error:", error); } } generateMazeCreator(); ``` ```typescript LlamaIndex import { OpenAI, Settings} from "llamaindex" import Runloop from '@runloop/api-client'; // Initialize clients const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY }); Settings.llm = new OpenAI({ apiKey: process.env.OPENAI_API_KEY, model: "gpt-4o", }) // Create an OpenAI agent to generate Python code and run it in a Runloop Devbox async function generateMazeCreator() { try{ const {message} = await Settings.llm.chat({ messages: [{ role: "user", content: prompt,}], }); const mazeGenerationScript = message.content.toString(); console.log(mazeGenerationScript) // Execute the script in a Devbox const devbox = await runloop.devboxes.createAndAwaitRunning(); console.log(`Devbox ID: ${devbox.id}`); await runloop.devboxes.writeFileContents(devbox.id, { file_path: "gen_maze.py", contents: mazeGenerationScript, }); const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, { command: "python gen_maze.py --size 10", }); exit_status === 0 ? 
      console.log("Maze generated successfully\n", stdout)
      : console.error("Maze generation failed\n", stderr);
  } catch (error) {
    console.error("Error:", error);
  }
}

generateMazeCreator();
```

```typescript Mistral
import Runloop from '@runloop/api-client';
import { Mistral } from '@mistralai/mistralai';

// Create Mistral and Runloop clients
const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY });
const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

async function generateMazeCreator() {
  try {
    // Generate code using Mistral
    const { choices } = await client.chat.complete({
      model: 'codestral-latest',
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: prompt },
      ],
    });
    const mazeGenerationScript = choices![0].message.content!.toString();
    console.log(mazeGenerationScript);

    // Execute the script in a Devbox
    const devbox = await runloop.devboxes.createAndAwaitRunning();
    console.log(`Devbox ID: ${devbox.id}`);
    await runloop.devboxes.writeFileContents(devbox.id, {
      file_path: "gen_maze.py",
      contents: mazeGenerationScript,
    });
    const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
      command: "python gen_maze.py --size 10",
    });
    exit_status === 0
      ? console.log("Maze generated successfully\n", stdout)
      : console.error("Maze generation failed\n", stderr);
  } catch (error) {
    console.error("Error:", error);
  }
}

generateMazeCreator();
```

```typescript OpenAI
import { OpenAI } from "openai";
import Runloop from "@runloop/api-client";

// Initialize clients
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

async function generateMazeCreator() {
  try {
    // Generate code using OpenAI
    const { choices } = await openai.chat.completions.create({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
      temperature: 0,
    });
    // Fall back to a script that reports the failure if the model returned nothing
    const mazeGenerationScript = choices[0].message.content ??
      "print('AI could not generate the code')";

    // Execute the script in a Devbox
    const devbox = await runloop.devboxes.createAndAwaitRunning();
    console.log(`Devbox ID: ${devbox.id}`);
    await runloop.devboxes.writeFileContents(devbox.id, {
      file_path: "gen_maze.py",
      contents: mazeGenerationScript,
    });
    const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
      command: "python gen_maze.py --size 10",
    });
    exit_status === 0
      ? console.log("Maze generated successfully\n", stdout)
      : console.error("Maze generation failed\n", stderr);
  } catch (error) {
    console.error("Error:", error);
  }
}

generateMazeCreator();
```

```typescript VercelAI
// npm install ai @ai-sdk/openai @runloop/api-client
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';
import Runloop from '@runloop/api-client';

// Initialize clients
const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });
const model = openai('gpt-4o');

async function generateMazeCreator() {
  try {
    // Generate code with OpenAI
    const { text: mazeGenerationScript } = await generateText({ model, prompt });
    console.log(mazeGenerationScript);

    // Execute the script in a Devbox
    const devbox = await runloop.devboxes.createAndAwaitRunning();
    console.log(`Devbox ID: ${devbox.id}`);
    await runloop.devboxes.writeFileContents(devbox.id, {
      file_path: "gen_maze.py",
      contents: mazeGenerationScript,
    });
    const { exit_status, stdout, stderr } = await runloop.devboxes.executeSync(devbox.id, {
      command: "python gen_maze.py --size 10",
    });
    exit_status === 0 ?
      console.log("Maze generated successfully\n", stdout)
      : console.error("Maze generation failed\n", stderr);
  } catch (error) {
    console.error("Error:", error);
  }
}

generateMazeCreator();
```

## **Python Integrations**

```python Claude
from anthropic import Anthropic
from runloop_api_client import Runloop
import os

# Initialize clients
anthropic = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

def generate_maze_creator():
    try:
        # Generate code using Claude.
        # The system prompt is passed via the top-level `system` parameter,
        # not as a message in the `messages` list.
        response = anthropic.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            system=system_prompt,
            messages=[
                {"role": "user", "content": prompt}
            ]
        )
        maze_generation_script = response.content[0].text

        # Execute the script in a Devbox
        devbox = runloop.devboxes.create_and_await_running()
        print("Devbox ID:", devbox.id)
        runloop.devboxes.write_file_contents(devbox.id,
            file_path="gen_maze.py",
            contents=maze_generation_script
        )
        result = runloop.devboxes.execute_sync(devbox.id,
            command="python gen_maze.py --size 10"
        )
        if not result.exit_status:
            print("Maze generated successfully\n", result.stdout)
        else:
            print("Script execution failed:", result.stderr)
    except Exception as e:
        print("Error:", e)

if __name__ == "__main__":
    generate_maze_creator()
```

```python Gemini
import google.generativeai as genai
from runloop_api_client import Runloop
import os

# Initialize clients
genai.configure(api_key=os.environ.get("GEMINI_API_KEY"))
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

def generate_maze_creator():
    try:
        # Generate code using Gemini
        model = genai.GenerativeModel("gemini-1.5-flash")
        response = model.generate_content(prompt)
        maze_generation_script = response.text.strip()

        # Execute the script in a Devbox
        devbox = runloop.devboxes.create_and_await_running()
        print("Devbox ID:", devbox.id)
        runloop.devboxes.write_file_contents(devbox.id,
            file_path="gen_maze.py",
            contents=maze_generation_script
        )
        result = runloop.devboxes.execute_sync(devbox.id,
            command="python gen_maze.py --size 10"
        )
        if not result.exit_status:
            print("Maze generated successfully\n", result.stdout)
        else:
            print("Script execution failed:", result.stderr)
    except Exception as e:
        print("Error:", e)

if __name__ == "__main__":
    generate_maze_creator()
```

```python LangChain
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from runloop_api_client import Runloop
import os

# Initialize client
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

def generate_maze_creator():
    try:
        # Generate code using OpenAI
        llm = ChatOpenAI(model="gpt-4o")
        # "{input}" is filled in by chain.invoke({"input": ...}) below
        prompt_template = ChatPromptTemplate.from_messages([
            ("system", system_prompt),
            ("human", "{input}")
        ])
        output_parser = StrOutputParser()

        # Create and run the chain
        chain = prompt_template | llm | output_parser
        maze_generation_script = chain.invoke({"input": prompt})

        # Execute the script in a Devbox
        devbox = runloop.devboxes.create_and_await_running()
        print("Devbox ID:", devbox.id)
        runloop.devboxes.write_file_contents(devbox.id,
            file_path="gen_maze.py",
            contents=maze_generation_script
        )
        result = runloop.devboxes.execute_sync(devbox.id,
            command="python gen_maze.py --size 10"
        )
        if not result.exit_status:
            print("Maze generated successfully\n", result.stdout)
        else:
            print("Script execution failed:", result.stderr)
    except Exception as e:
        print("Error:", e)

if __name__ == "__main__":
    generate_maze_creator()
```

```python LlamaIndex
from llama_index.core.llms import ChatMessage
from llama_index.llms.openai import OpenAI
from runloop_api_client import Runloop
import os

# Initialize client
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

def generate_maze_creator():
    try:
        # Generate code using OpenAI through LlamaIndex
        llm = OpenAI(model="gpt-4o")
        response = llm.chat([
            ChatMessage(role="system", content=system_prompt),
            ChatMessage(role="user", content=prompt),
        ])
        maze_generation_script = response.message.content
        print(maze_generation_script)

        # Execute the script in a Devbox
        devbox = runloop.devboxes.create_and_await_running()
        print("Devbox ID:", devbox.id)
        runloop.devboxes.write_file_contents(devbox.id,
            file_path="gen_maze.py",
            contents=maze_generation_script
        )
        result = runloop.devboxes.execute_sync(devbox.id,
            command="python gen_maze.py --size 10"
        )
        if not result.exit_status:
            print("Maze generated successfully\n", result.stdout)
        else:
            print("Script execution failed:", result.stderr)
    except Exception as e:
        print("Error:", e)

if __name__ == "__main__":
    generate_maze_creator()
```

```python Mistral
import os
from mistralai import Mistral
from runloop_api_client import Runloop

# Initialize clients
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))
runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

def generate_maze_creator():
    try:
        # Generate code using Mistral
        response = client.chat.complete(
            model="codestral-latest",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": prompt}])
        maze_generation_script = response.choices[0].message.content

        # Execute the script in a Devbox
        devbox = runloop.devboxes.create_and_await_running()
        print("Devbox ID:", devbox.id)
        runloop.devboxes.write_file_contents(devbox.id,
            file_path="gen_maze.py",
            contents=maze_generation_script
        )
        result = runloop.devboxes.execute_sync(devbox.id,
            command="python gen_maze.py --size 10"
        )
        if not result.exit_status:
            print("Maze generated successfully\n", result.stdout)
        else:
            print("Script execution failed:", result.stderr)
    except Exception as e:
        print("Error:", e)

if __name__ == "__main__":
    generate_maze_creator()
```

```python OpenAI
import os
from openai import OpenAI
from runloop_api_client import Runloop

#
Initialize clients openai = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) def generate_maze_creator(): try: # Generate code using OpenAI response = openai.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": prompt}], temperature=0 ) maze_generation_script = response.choices[0].message.content # Execute the script in a Devbox devbox = runloop.devboxes.create_and_await_running() print("Devbox ID:", devbox.id) runloop.devboxes.write_file_contents(devbox.id, file_path= "gen_maze.py", contents= maze_generation_script ) result = runloop.devboxes.execute_sync(devbox.id, command= "python gen_maze.py --size 10" ) if not result.exit_status: print("Maze generated successfully\n", result.stdout) else: print("Script execution failed:", result.stderr) except Exception as e: print("Error:", e) if __name__ == "__main__": generate_maze_creator() ``` ```python CrewAI from crewai.tools import tool from crewai import Agent, Task, Crew, LLM from runloop_api_client import Runloop import os @tool("Tool to generate Python code using LLM") def generate_code(prompt: str) -> str: """ Generate Python code based on a given prompt using the LLM. """ try: # Generate code using the LLM llm = LLM(model="gpt-4o") response = llm.chat(messages=[{"role": "user", "content": prompt}]) generated_code = response['choices'][0]['message']['content'] return generated_code except Exception as e: print("LLM Exception occurred:", e) return str(e) @tool("Tool to execute Python code on Runloop") def execute_code_on_runloop(code: str, size: int): """ Execute Python code on a Runloop Devbox. 
""" try: # Initialize client runloop = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) # Execute the script in a Devbox devbox = runloop.devboxes.create_and_await_running() print("Devbox ID:", devbox.id) runloop.devboxes.write_file_contents( devbox.id, file_path="gen_maze.py", contents=code ) result = runloop.devboxes.execute_sync( devbox.id, command=f"python gen_maze.py --size {size}" ) if not result.exit_status: print("Maze generated successfully\n", result.stdout) return result.stdout else: print("Script execution failed:", result.stderr) return result.stderr except Exception as e: print("Runloop Exception occurred:", e) return str(e) # Define the agent code_writer_executor = Agent( role='Python Code Writer and Executor', goal='Write Python scripts based on prompts, execute them, and return the results.', backstory='You are an expert Python programmer capable of writing, executing code, and returning results.', tools=[generate_code, execute_code_on_runloop], llm=LLM(model="gpt-4o") ) # Define the task generate_maze_task = Task( description="Generate and execute a Python script that creates a maze.", agent=code_writer_executor, expected_output="A successfully generated and executed maze of size 11.", inputs={ "prompt": prompt, "size": 11 } ) # Create the crew maze_generation_crew = Crew( agents=[code_writer_executor], tasks=[generate_maze_task], verbose=True, ) # Run the crew result = maze_generation_crew.kickoff() print(result) ``` # Quickstart: Giving Agents a Development Environment Source: https://docs.runloop.ai/overview/quickstart ## Running AI generated code securely with Runloop Runloop Devboxes are a secure and isolated environment for running AI-generated code. Let's see how we can use Devboxes to safely run AI generated code to generate mazes. Set up your API keys as environment variables: ```bash export RUNLOOP_API_KEY= export OPENAI_API_KEY= ``` Replace the placeholders with your actual API keys. 
You can get your Runloop API key from the [Runloop Dashboard](https://platform.runloop.ai/manage/keys).

First, we'll use the OpenAI API to generate Python code that generates a maze.

````bash curl
response=$(curl "https://api.openai.com/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "system", "content": "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that." },
      { "role": "user", "content": "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`." }
    ]
  }')
python_script=$(echo "$response" | jq -r '.choices[0].message.content')
````

````python Python
import os

import openai

openai.api_key = os.environ.get("OPENAI_API_KEY")

response = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that."},
        {"role": "user", "content": "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`."}
    ]
)
python_script = response.choices[0].message.content
````

````typescript TypeScript
import OpenAI from "openai";

const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content: "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that."
    },
    {
      role: "user",
      content: "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`.",
    },
  ],
});
console.log(completion.choices[0].message);
const pythonScript = completion.choices[0].message?.content;
````

Now, let's create a Devbox to use as our sandbox environment. When a Devbox is created, Runloop automatically provisions a secure microVM that can be used to load and run any coding project. A Devbox starts in the 'provisioning' state. Once it is ready, it transitions to the 'running' state, at which point we can begin using it.

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{}'
```

```python Python
import os
from runloop_api_client import Runloop

# ... previous code ...

runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

# create the devbox and wait for it to be ready
devbox = runloop_client.devboxes.create_and_await_running()
print(devbox.id)
```

```typescript TypeScript
import Runloop from '@runloop/api-client';

// ... previous code ...

const runloopClient = new Runloop({
  bearerToken: process.env.RUNLOOP_API_KEY,
});

// create the devbox and wait for it to be ready
async function runGenerateMazeProgram(program: string) {
  const devbox = await runloopClient.devboxes.createAndAwaitRunning();
  console.log(devbox.id);
}

// pythonScript comes from the OpenAI call above and may be null
if (pythonScript) runGenerateMazeProgram(pythonScript);
```

This command will return a Devbox ID (e.g. `dbx_1234567890`). This ID will be used to perform operations on the Devbox.

Now, let's upload the Python script to the Devbox so we can run it securely.

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes//write_file_contents' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d "{\"file_path\": \"maze.py\", \"contents\": \"$python_script\"}"
```

```python Python
# ... previous code ...
runloop_client.devboxes.write_file_contents(devbox.id, file_path="maze.py", contents=python_script)
```

```typescript TypeScript
async function runGenerateMazeProgram(program: string) {
  // ... previous code ...
  await runloopClient.devboxes.writeFileContents(devbox.id, {
    file_path: 'maze.py',
    contents: program,
  });
}
```

With the script uploaded, we can run it on the Devbox:

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes//execute_sync' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"command": "python maze.py"}'
```

```python Python
exec_result = runloop_client.devboxes.execute_sync("", command="python maze.py")

# Print stdout
print(exec_result.stdout)
```

```typescript TypeScript
const execResult = await runloopClient.devboxes.executeSync('', {
  command: 'python maze.py',
});
console.log(execResult.stdout);
```

Once we are done generating mazes, we can shut down the Devbox to free up resources:

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes//shutdown' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"
```

```python Python
runloop_client.devboxes.shutdown(devbox.id)
```

```typescript TypeScript
await runloopClient.devboxes.shutdown(devbox.id);
```

By default, Devboxes automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to shut down after a specified idle period.

## Giving agents a secure development environment via Tools

Beyond using the Runloop API to upload and run code manually, you can use Runloop Tools to give an agent full access to a Devbox. For example, let's make a simple coding agent that can generate Python code, and ask it to write a command-line script that prints its command-line arguments as ASCII words.

Let's create a Devbox to use as a development environment for the agent.
```python Python import os from runloop_api_client import Runloop # Initialize the Runloop client runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) # Initialize a devbox and retrieve the devbox id devbox = runloop_client.devboxes.create_and_await_running() print(devbox.id) ``` ```typescript TypeScript import { Runloop } from "@runloop/api-client"; // Initialize the Runloop client const runloop = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY }); // Create a devbox and await running const devbox = await runloop.devboxes.create(); await runloop.devboxes.awaitRunning(devbox.id); console.log(devbox.id); ``` This command will return a Devbox ID (e.g.`dbx_1234567890`). This ID will be used to perform operations on the Devbox. Next, let's create tools bound to the Devbox so the agent can use it. In the examples below, we will use: * **Python**: Utilize the Ell framework to create tools and run an agent. * **TypeScript**: Define each tool and pass them as a `ChatCompletionTool[]`. However, the tools can easily be created in any language and any framework! Check out our [examples repository](https://github.com/runloopai/examples.git) for more examples in your favorite language or framework. 
```python Python import ell from runloop_api_client import Runloop @ell.tool() def execute_shell_command(command: str): """Run a shell command in the devbox.""" return runloop_client.devboxes.execute_sync(devbox.id, command=command).stdout @ell.tool() def read_file(filename: str): """Reads a file on the devbox.""" return runloop_client.devboxes.read_file_contents(devbox.id, file_path=filename) @ell.tool() def write_file(filename: str, contents: str): """Writes a file on the devbox.""" runloop_client.devboxes.write_file_contents( devbox.id, file_path=filename, contents=contents ) ``` ```typescript TypeScript const tools = [ { type: "function", function: { name: "executeShellCommand", description: "Run a shell command in the devbox", parameters: { type: "object", properties: { command: { type: "string", description: "The shell command to execute.", }, }, required: ["command"], additionalProperties: false, }, strict: true, }, } as const, ]; ``` By default, Devboxes are configured to automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to automatically shut down after a specified idle period. Now we can give the tools to the agent and ask it to generate a script and run the program in the devbox. 
```python Python
from typing import List

from ell import Message

@ell.complex(
    model="gpt-4-turbo",
    tools=[execute_shell_command, read_file, write_file]
)
def invoke_agent(message_history: List[Message]):
    """Calls the LLM to generate the program."""
    messages = [
        ell.system(SYSTEM_PROMPT),
        ell.user(USER_PROMPT),
    ] + message_history
    return messages
```

```typescript TypeScript
import type { ChatCompletionMessageParam } from "openai/resources/chat/completions";

// The system prompt uses the "system" role; the user prompt follows it
const messageHistory: ChatCompletionMessageParam[] = [
  { role: "system", content: SYSTEM_PROMPT },
  { role: "user", content: USER_PROMPT },
];

let response = await openai.chat.completions.create({
  messages: messageHistory,
  model: "gpt-4-turbo",
  tools: tools,
  tool_choice: "auto",
});
```

Once the agent has finished its work, we can shut down the Devbox:

```bash curl
curl -X POST \
  'https://api.runloop.ai/v1/devboxes//shutdown' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"
```

```python Python
runloop_client.devboxes.shutdown(devbox.id)
```

```typescript TypeScript
await runloop.devboxes.shutdown(devbox.id);
```
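The snippets in this section define tools and make a model call, but they do not show how a tool call returned by the model gets routed back to the Devbox. One possible shape is a small dispatch table, sketched below under stated assumptions. `DevboxTools`, `dispatch`, and `FakeDevboxes` are hypothetical names introduced for illustration, and the fake stands in for `runloop_client.devboxes` so the example runs offline. Only `execute_sync`, `read_file_contents`, `write_file_contents`, and `shutdown` mirror calls used earlier in this guide.

```python
import json
from types import SimpleNamespace


class FakeDevboxes:
    """Offline stand-in for `runloop_client.devboxes` (illustration only)."""

    def __init__(self):
        self.files = {}
        self.shut_down = False

    def execute_sync(self, devbox_id, command):
        # A real Devbox would run the command; the fake just echoes it back.
        return SimpleNamespace(exit_status=0, stdout=f"ran: {command}", stderr="")

    def read_file_contents(self, devbox_id, file_path):
        return self.files[file_path]

    def write_file_contents(self, devbox_id, file_path, contents):
        self.files[file_path] = contents

    def shutdown(self, devbox_id):
        self.shut_down = True


class DevboxTools:
    """Maps tool names to Devbox operations for a single Devbox ID."""

    def __init__(self, devboxes, devbox_id):
        self.devboxes = devboxes
        self.devbox_id = devbox_id
        self.handlers = {
            "execute_shell_command": self.execute_shell_command,
            "read_file": self.read_file,
            "write_file": self.write_file,
        }

    def execute_shell_command(self, command):
        return self.devboxes.execute_sync(self.devbox_id, command=command).stdout

    def read_file(self, filename):
        return self.devboxes.read_file_contents(self.devbox_id, file_path=filename)

    def write_file(self, filename, contents):
        self.devboxes.write_file_contents(
            self.devbox_id, file_path=filename, contents=contents
        )
        return f"wrote {filename}"

    def dispatch(self, name, arguments_json):
        """Run one tool call; `arguments_json` is the JSON string of its arguments."""
        handler = self.handlers.get(name)
        if handler is None:
            return f"unknown tool: {name}"
        return handler(**json.loads(arguments_json))


devboxes = FakeDevboxes()
tools = DevboxTools(devboxes, "dbx_1234567890")
try:
    tools.dispatch(
        "write_file",
        json.dumps({"filename": "hello.py", "contents": "print('hi')"}),
    )
    output = tools.dispatch(
        "execute_shell_command", json.dumps({"command": "python hello.py"})
    )
    print(output)
finally:
    # Shut the Devbox down even if a tool call raised.
    devboxes.shutdown(tools.devbox_id)
```

With a real client, `FakeDevboxes()` would be replaced by `runloop_client.devboxes` and the ID by `devbox.id`. Returning a message for unknown tool names, rather than raising, lets the agent see the error and try again.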
Explore the [Runloop Examples Repository](https://github.com/runloopai/examples.git) to see how to build AI agents that run on Runloop.
# What is Runloop?

Source: https://docs.runloop.ai/overview/what-is-runloop

Runloop is the *batteries included* platform designed for building and optimizing AI-driven software engineering agents. With the Runloop platform, you get:

* Fast, isolated, snapshottable virtual machines for executing agents & agent tools ([Devboxes](/devboxes/overview)).
* Team-shareable templates for launching new devboxes with custom configuration ([Blueprints](/devboxes/blueprints)), and persistent disk state ([Snapshots](/devboxes/snapshots)).
* Zero-configuration code repository integration ([Code Mounts](/devboxes/code-mounts)) and a fully-featured, ready-to-use language server{/* ([Code Understanding APIs](/overview/what-is-runloop#code-understanding-apis))*/}.
* Turnkey benchmarking ([Benchmarks](/benchmarks/overview)) and evaluation services for fine-tuning your agent's behavior{/* ([Code Scenarios](/overview/what-is-runloop#code-scenarios))*/}.

Whether you are trying to build an AI agent that can respond to pull requests, or an AI agent that can generate new UI components, Runloop makes it possible to get from zero to POC in just a few lines of code.

## Why Runloop?

Our mission at Runloop is to keep you focused on the things that differentiate your AI agent. Leave the building blocks to us and spend time on what actually matters. As your agent evolves, your needs will evolve too. Runloop is designed for builders at all stages:

| Stage | Why Runloop |
| ----------- | ----------- |
| Prototyping | • Zero infrastructure worries using managed, instant-on devboxes.<br />• Build, deploy, learn, and iterate quickly. |
| Production | • Team-shared blueprints and projects.<br />• 24/7/365 managed platform and on-call team.<br />• SOC2 compliant. |
| Growth | • Benchmarking and evaluation stack to monitor and fine-tune your agent's performance. |

### Use cases

Our customers are already leveraging Runloop to build AI agents that can:

* Respond to Pull Requests and enhance the code review process
* Enable users to chat with and navigate their codebase
* Generate new test cases for existing codebases
* Act as pair programmers
* Generate new UI components for their frontend

Have a use case that we didn't cover? Send us an email at [support@runloop.ai](mailto:support@runloop.ai) to learn more about how Runloop can help you build AI agents.

## Core Components of Runloop

### Devboxes

Devboxes are isolated, cloud-based development environments that can be controlled by AI agents via the Runloop API. You can give agents access to a devbox to let them run and test code in a safe, isolated environment.

```typescript TypeScript
import Runloop from '@runloop/api-client';
import { openai } from '@ai-sdk/openai';
import { generateText, tool } from 'ai';

// Initialize the Runloop client
const runloopClient = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY });

// Create an isolated Devbox for the agent to use
const devbox = await runloopClient.devboxes.createAndAwaitRunning();

// Get the Runloop Tool Representation for the Devbox and convert them to Vercel AI Tools
const runloopDevboxShellTools = runloopClient.devboxes.tools.shellTools(devbox.id)
const runloopDevboxFileTools = runloopClient.devboxes.tools.fileTools(devbox.id)

// Use VercelAI SDK to create a simple agent that uses the Devbox to code a game
const { text: answer } = await generateText({
  model: openai('gpt-4o-2024-08-06'),
  tools: { ...runloopDevboxShellTools, ...runloopDevboxFileTools },
  maxSteps: 10,
  system: 'You are an expert python coder that specializes in making CLI games.',
  prompt: 'Create a CLI game that is a guessing game where the user has to guess a number between 1 and 100. Write the python script in the file `game.py`. The program should be callable from the command line via `python game.py`. Once you have generated the program, run it and print the output to stdout.'
}); console.log(`ANSWER: ${answer}`); ``` {/* Removing APIs until they're ready for public access ### Code Understanding APIs Code Understanding APIs are currently in beta. Please contact us at [support@runloop.ai](mailto:support@runloop.ai) to get access. A critical part of making AI SWE agents work reliably is giving them the right context to solve the problem. In many cases, this means extracting context from the existing codebase such as function signatures, finding tests that cover a specific segment of code, or understanding which files are often edited together. For example, one common heuristic that helps AI agents navigate codebases is [the Repository Map used by Aider](https://aider.chat/docs/repomap.html). Writing your own Repository Map style heuristic can be difficult as it requires static analysis of the codebase. Other heuristics can be even harder to create and rely on gathering information from the runtime dataflow of the codebase. The Code Understanding APIs aim to make it possible to create these types of heuristics in just a few lines of code. ```typescript TypeScript [expandable] import Runloop from '@runloop/api-client'; const runloopClient = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); // First create a repository connection for Runloop to preindex the codebase we will run on: const repositoryConnectionView = await client.repositories.createAndAwaitIndexing({ name: 'repository-name', owner: 'repository-owner' }); // Now let's use the repository connection to recreate the repository map heuristic: // 1. We list all the code files in the repository const codeFiles = await client.repositories.codeFiles(repositoryConnectionView.id); // 2. 
We can use the special file viewer and query syntax to only view the files with method signatures const files_with_method_signatures_only = await client.repositories.fileViewer(repositoryConnectionView.id, { files: codeFiles, // We make our query against the AST such that we only get class and method nodes and only include the signature and comments query: 'class(signatures, comments) || method(signatures, comments)' }); // Or we can simply use the built in repository map heuristic: const repositoryMap = await client.repositories.repositoryMap(repositoryConnectionView.id); ``` ```python Python [expandable] import os from runloop_api_client import Runloop # Create a Runloop client runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY")) # First create a repository connection for Runloop to preindex the codebase we will run on: repository_connection_view = runloop_client.repositories.create( name='repository-name', owner='repository-owner' ) # Now let's use the repository connection to recreate the repository map heuristic: # 1. We list all the code files in the repository code_files = runloop_client.repositories.code_files(repository_connection_view.id) # 2. We can use the special file viewer and query syntax to only view the files with method signatures files_with_method_signatures_only = runloop_client.repositories.file_viewer( repository_connection_view.id, files=code_files, # We make our query against the AST such that we only get class and method nodes and only include the signature and comments query='class(signatures, comments) || method(signatures, comments)' ) # Or we can simply use the built in repository map heuristic: repository_map = runloop_client.repositories.repository_map(repository_connection_view.id) ``` ### Code Scenarios Code Scenarios APIs are currently in beta. Please contact us at [support@runloop.ai](mailto:support@runloop.ai) to get access. Tuning the behavior of your AI agent is a critical part of making it work reliably. 
However, going from POC to production is where most AI agents fail. Code Scenarios is a set of Benchmarking and Eval Tools that help you understand and improve your AI agent's behavior in a methodical way. For example, with Code Scenarios: - You can run your agent against well known benchmarks such as SWE-bench or create your own custom benchmarks - You can record live production agent traces and monitor the Agent performance or use the traces to create new benchmarks - You can create custom Reward Models based on production traces and benchmark data to fine tune your agent's behavior ```typescript TypeScript [expandable] import Runloop from '@runloop/api-client'; const runloopClient = new Runloop({ bearerToken: process.env.RUNLOOP_API_KEY, }); // 1. Create new Benchmarks or use existing ones like SWE-bench const myBenchmark = await runloop.benchmarks.create({ benchmark: 'UI Component Generation', testCases: [ { name: 'Create a Login Page', problemStatement: 'Create a login page that allows users to login to the application. Call the component `LoginPage` and export it from the file `src/components/LoginPage.tsx`.', // Configure specific starting Devbox environemnts for your Agent to use as part of a benchmark test environment: 'DEFAULT_DEVBOX', outputContractRules: [ { type: 'typescript', files: ["expected_snapshot.png"] validate: (output) => { // Validate the storybook snapshot is updated and roughly matches the expected_snapshot.png } } ] } ] }); // 2. 
Run your agent against the benchmark const benchmarkRun = await runloop.benchmarks.beginTestRun(myBenchmark.id); // For each test case run our agent and report the completion for (const testCase of myBenchmark.testCases) { const agentOutput = await myAgent.run({ prompt: testCase.problemStatement, devbox: testCase.devbox }); await runloop.benchmarks.reportTestCaseRun(testCaseRun.id, { output: agentOutput }); } ``` */} # Support for AI tools Source: https://docs.runloop.ai/tools/ai-tools Add context about the Runloop API to your LLMs #### Runloop provides first-class context for AI tools to integrate with our APIs [/llms.txt specification list here.](/llms.txt) [Full specification is available here.](/llms-full.txt) # Runloop CLI Source: https://docs.runloop.ai/tools/cli Explore, experiment with, and test the Runloop API using the Runloop CLI. ## Key Features * Easy interaction with Runloop API * Helpful for blueprint testing and devbox management * Debugging tool for running devboxes * Potential tool for AI agents to interact with Runloop ## Setting Up the CLI The Runloop CLI is written in Python and wraps the Runloop Python SDK. To set up the Runloop CLI: 1. Visit the [Runloop CLI GitHub page](https://github.com/runloopai/rl-cli) 2. Follow the installation instructions provided in the repository ## Essential CLI Commands Once set up and validated, the CLI offers a range of helpful commands: ### Create a Blueprint ```bash rl blueprint create --system_setup_commands "sudo apt install cowsay -y" --name cowsay-devbox ``` This command creates a new blueprint named `cowsay-devbox` with the `cowsay` package installed. The response from the cli will include an `id` for the blueprint. ### Create a Devbox ```bash rl devbox create --entrypoint "/usr/games/cowsay runloop" --blueprint_id "" ``` This creates a devbox using a specified blueprint and sets an entrypoint command. 
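When scripting the two commands above, building the argument list in code can help avoid shell-quoting mistakes. The sketch below only assembles the `rl` invocations shown in this section and prints them; it does not run the CLI. The helper names are hypothetical, and `example-blueprint-id` is a placeholder for the `id` returned by the blueprint command.

```python
import shlex


def blueprint_create_args(name, system_setup_commands):
    """Build argv for `rl blueprint create` using the flags shown above."""
    return [
        "rl", "blueprint", "create",
        "--system_setup_commands", system_setup_commands,
        "--name", name,
    ]


def devbox_create_args(entrypoint, blueprint_id):
    """Build argv for `rl devbox create` using the flags shown above."""
    return [
        "rl", "devbox", "create",
        "--entrypoint", entrypoint,
        "--blueprint_id", blueprint_id,
    ]


blueprint_cmd = blueprint_create_args("cowsay-devbox", "sudo apt install cowsay -y")
# Placeholder ID; substitute the `id` from the blueprint creation response.
devbox_cmd = devbox_create_args("/usr/games/cowsay runloop", "example-blueprint-id")

# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(blueprint_cmd))
print(shlex.join(devbox_cmd))
```

Once the CLI is installed, each list could be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string means arguments containing spaces need no manual quoting.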
### Monitor Devbox Logs

```bash
watch rl devbox logs --id
```

This command allows you to watch the logs of a running devbox in real time.

### SSH into a Devbox

For more advanced use cases, you can use the CLI to SSH into your Devbox. For detailed instructions, refer to our [SSH Access guide](/devboxes/debugging-agent-output-with-ssh).

## Contributing

The Runloop CLI is open-source, and we enthusiastically welcome contributions and feedback from our community. To contribute:

1. Visit our [GitHub repository](https://github.com/runloopai/rl-cli)
2. Fork the repository
3. Make your changes
4. Submit a pull request

We appreciate all forms of contribution, from bug reports and feature requests to code contributions and documentation improvements.

## Getting Help

If you encounter any issues or have questions about using the Runloop CLI:

1. Check the [CLI documentation](https://github.com/runloopai/rl-cli/blob/main/README.md) for detailed usage instructions
2. Open an issue on the GitHub repository for bug reports or feature requests
3. Reach out to our team for additional assistance

# Cursor Rules
Source: https://docs.runloop.ai/tools/cursor-files

## Download .mdc Files

We recommend using the [Cursor rules](https://docs.cursor.com/context/rules) below for working with the Runloop SDK. You can download the `.mdc` files below:

* Python File
* TypeScript File

These files should be added to `.cursor/rules` in your project directory.

## Recommended rules for working with Runloop on Cursor

```
You are working with runloop_api_client, a Python SDK for deploying and managing remote devboxes for AI agents. Use this guide to properly interact with the SDK.

Devbox Overview
A Devbox is a virtual development environment designed for running AI-generated or arbitrary code in an isolated sandbox. It provides configurable compute resources, supports execution of shell commands, and can be managed via API.
Each Devbox has a unique ID, a status (e.g., running, suspended, or shutdown), and metadata for user-defined settings. It includes launch parameters for customizing resource size, execution behavior, and available ports.

Key attributes:
ID (string, required) – Unique identifier of the Devbox.
Status (enum, required) – Current state of the Devbox (e.g., provisioning, running, shutdown).
Create time (integer, required) – Timestamp (ms) when the Devbox was created.
Launch parameters (object, required) – Includes startup commands, resource configuration, idle timeout, and network settings.
Capabilities (list, required) – Defines supported tools such as computer usage APIs, browser usage, and language servers.
Blueprint/Snapshot ID (string, optional) – Identifier if created from a predefined Blueprint or Snapshot.
Failure/Shutdown reason (enum, optional) – Reason for termination if applicable (e.g., out of memory, idle shutdown).

Devboxes can be started, suspended, resumed, or shut down, and they support file operations, shell command execution, and network tunneling.

CORE SDK USAGE:

Initialize client:
from runloop_api_client import Runloop
client = Runloop(bearer_token="YOUR_API_KEY")

DEVBOX LIFECYCLE:

Create a new Devbox:
devbox_view = client.devboxes.create(
    name="devbox_name",
    blueprint_id="blueprint_id",
    snapshot_id="snapshot_id",
    launch_parameters={}
)
client.devboxes.await_running(devbox_view.id)

Retrieve an existing Devbox:
devbox_view = client.devboxes.retrieve(id="devbox_id")

List Devboxes:
page = client.devboxes.list(status="running", limit=10)

Suspend a Devbox:
devbox_view = client.devboxes.suspend(id="devbox_id")

Resume a Devbox:
devbox_view = client.devboxes.resume(id="devbox_id")

Shutdown a Devbox:
devbox_view = client.devboxes.shutdown(id="devbox_id")

Keep a Devbox alive:
response = client.devboxes.keep_alive(id="devbox_id")

DEVBOX FILE OPERATIONS:

Write file contents:
response = client.devboxes.write_file_contents(
    id="devbox_id",
    file_path="path/to/file.txt",
    contents="Hello, world!"
)

Read file contents:
response = client.devboxes.read_file_contents(
    id="devbox_id",
    file_path="path/to/file.txt"
)

Upload a file:
response = client.devboxes.upload_file(id="devbox_id", path="path/to/upload")

Download a file:
response = client.devboxes.download_file(id="devbox_id", path="path/to/file.txt")
content = response.read()

DEVBOX EXECUTION:

Execute command synchronously:
execution_detail = client.devboxes.execute_sync(id="devbox_id", command="ls -la")

Execute command asynchronously:
async_execution = client.devboxes.execute_async(id="devbox_id", command="ls -la")

Retrieve execution status:
execution_status = client.devboxes.executions.retrieve(
    devbox_id="devbox_id",
    execution_id="execution_id"
)

DEVBOX NETWORKING:

Create a tunnel:
tunnel = client.devboxes.create_tunnel(id="devbox_id", port=8080)

Remove a tunnel:
tunnel = client.devboxes.remove_tunnel(id="devbox_id", port=8080)

Create an SSH key:
ssh_key = client.devboxes.create_ssh_key(id="devbox_id")

DEVBOX SNAPSHOT MANAGEMENT:

List disk snapshots:
page = client.devboxes.list_disk_snapshots(devbox_id="devbox_id", limit=10)

Create a disk snapshot:
snapshot = client.devboxes.snapshot_disk(id="devbox_id", name="snapshot_name")

Delete a disk snapshot:
response = client.devboxes.delete_disk_snapshot(id="snapshot_id")

DEVBOX LOGGING:

Retrieve logs:
logs = client.devboxes.logs.list(id="devbox_id", execution_id="execution_id")

REPOSITORY OBJECT
A Repository in Runloop represents a connection to a remote repository and facilitates its automated analysis. This enables users to manage and inspect repositories efficiently.

ATTRIBUTES
- id (string, required) - Unique identifier of the Repository.
- name (string, required) - The name of the Repository.
- owner (string, required) - The account owner associated with the Repository.
- status (enum, required) - The current state of the Repository. Available options: `pending`, `failure`, `active`.
- failure_reason (string | null, optional) - The reason for failure, if the repository has a `failure` status.

Repositories can be created, retrieved, listed, and deleted using the Runloop API.

REPOSITORY OPERATIONS

List repositories:
repositories = client.repositories.list(limit=10)

Create a repository connection:
repository = client.repositories.create(name="repo_name", owner="repo_owner")

Retrieve repository details:
repository = client.repositories.retrieve(id="repo_id")

Delete a repository:
response = client.repositories.delete(id="repo_id")

List repository versions:
repo_versions = client.repositories.versions(id="repo_id")

Using Blueprints in Runloop SDK
Blueprints provide a way to create customized starting points for Devboxes, caching environment setups to improve boot times. They allow pre-configured development environments to be reused efficiently.

Blueprint Object Attributes
ID (string, required) – Unique identifier of the Blueprint.
Name (string, required) – The name of the Blueprint.
Status (enum, required) – Current state of the Blueprint build (e.g., provisioning, building, build_complete, failed).
Create time (integer, required) – Timestamp (ms) when the Blueprint was created.
Parameters (object, required) – Configuration used to create the Blueprint.
Failure reason (enum, optional) – Reason for failure if the build failed (e.g., out_of_memory, out_of_disk, build_failed).

BLUEPRINT OPERATIONS

List all Blueprints:
page = client.blueprints.list()

Create a new Blueprint:
blueprint_view = client.blueprints.create(name="custom_blueprint")

Retrieve a Blueprint by ID:
blueprint_view = client.blueprints.retrieve(id="blueprint_id")

Retrieve build logs for a Blueprint:
blueprint_logs = client.blueprints.logs(id="blueprint_id")

Preview a Blueprint before creation:
blueprint_preview = client.blueprints.preview(name="custom_blueprint")

Blueprints accelerate Devbox setup by caching pre-configured environments, making them ideal for reproducible and scalable development workflows.

EXECUTION GUIDELINES:
Always shut down Devboxes after use to free up resources.
Use execute_async for non-blocking commands.
Handle API errors using try/except blocks.
Use tunnels for exposing services running inside Devboxes.
Use snapshots to persist Devbox states.
```

```.cursorrules
You are working with @runloop/api-client, a TypeScript SDK for deploying and managing remote devboxes for AI agents. Use this guide to properly interact with the SDK.

Devbox Overview
A Devbox is a virtual development environment designed for running AI-generated or arbitrary code in an isolated sandbox. It provides configurable compute resources, supports execution of shell commands, and can be managed via API.
Each Devbox has a unique ID, a status (e.g., running, suspended, or shutdown), and metadata for user-defined settings. It includes launch parameters for customizing resource size, execution behavior, and available ports.

Key attributes:
ID (string, required) – Unique identifier of the Devbox.
Status (enum, required) – Current state of the Devbox (e.g., provisioning, running, shutdown).
Create time (integer, required) – Timestamp (ms) when the Devbox was created.
Launch parameters (object, required) – Includes startup commands, resource configuration, idle timeout, and network settings.
Capabilities (list, required) – Defines supported tools such as computer usage APIs, browser usage, and language servers.
Blueprint/Snapshot ID (string, optional) – Identifier if created from a predefined Blueprint or Snapshot.
Failure/Shutdown reason (enum, optional) – Reason for termination if applicable (e.g., out of memory, idle shutdown).

Devboxes can be started, suspended, resumed, or shut down, and they support file operations, shell command execution, and network tunneling.

CORE SDK USAGE

Initialize clients:
import Runloop from '@runloop/api-client';

const client = new Runloop({
  bearerToken: process.env['RUNLOOP_API_KEY'],
});

BLUEPRINT OPERATIONS

List Blueprints:
for await (const blueprintView of client.blueprints.list()) {
  console.log(blueprintView.id);
}

Create a Blueprint:
const blueprintView = await client.blueprints.create({ name: 'name' });
console.log(blueprintView.id);

Retrieve a Blueprint:
const blueprintView = await client.blueprints.retrieve('id');
console.log(blueprintView.id);

Get Blueprint Logs:
const blueprintBuildLogsListView = await client.blueprints.logs('id');
console.log(blueprintBuildLogsListView.blueprint_id);

Preview a Blueprint:
const blueprintPreviewView = await client.blueprints.preview({ name: 'name' });
console.log(blueprintPreviewView.dockerfile);

DEVBOX OPERATIONS

List Devboxes:
for await (const devboxView of client.devboxes.list()) {
  console.log(devboxView.id);
}

Create a Devbox:
const devboxView = await client.devboxes.create();
await client.devboxes.awaitRunning(devboxView.id);
console.log(devboxView.id);

Retrieve a Devbox:
const devboxView = await client.devboxes.retrieve('id');
console.log(devboxView.id);

Suspend a Devbox:
const devboxView = await client.devboxes.suspend('id');
console.log(devboxView.id);

Resume a Devbox:
const devboxView = await client.devboxes.resume('id');
console.log(devboxView.id);

Shutdown a Devbox:
const devboxView = await client.devboxes.shutdown('id');
console.log(devboxView.id);

Keep Devbox Alive:
const response = await client.devboxes.keepAlive('id');
console.log(response);

FILE OPERATIONS

Write File Contents:
const devboxExecutionDetailView = await client.devboxes.writeFileContents('id', {
  contents: 'contents',
  file_path: 'file_path',
});
console.log(devboxExecutionDetailView.devbox_id);

Read File Contents:
const response = await client.devboxes.readFileContents('id', { file_path: 'file_path' });
console.log(response);

Upload a File:
const response = await client.devboxes.uploadFile('id', { path: 'path' });
console.log(response);

Download a File:
const response = await client.devboxes.downloadFile('id', { path: 'path' });
console.log(response);
const content = await response.blob();
console.log(content);

SHELL COMMAND EXECUTION

Execute Command Synchronously:
const devboxExecutionDetailView = await client.devboxes.executeSync('id', { command: 'command' });
console.log(devboxExecutionDetailView.devbox_id);

Execute Command Asynchronously:
const devboxAsyncExecutionDetailView = await client.devboxes.executeAsync('id', { command: 'command' });
console.log(devboxAsyncExecutionDetailView.devbox_id);

Retrieve Execution Status:
const devboxAsyncExecutionDetailView = await client.devboxes.executions.retrieve(
  'devbox_id',
  'execution_id',
);
console.log(devboxAsyncExecutionDetailView.devbox_id);

NETWORK OPERATIONS

Create a Devbox Tunnel:
const devboxTunnelView = await client.devboxes.createTunnel('id', { port: 0 });
console.log(devboxTunnelView.devbox_id);

Remove a Devbox Tunnel:
const devboxTunnelView = await client.devboxes.removeTunnel('id', { port: 0 });
console.log(devboxTunnelView.devbox_id);

Create SSH Key:
const response = await client.devboxes.createSSHKey('id');
console.log(response.id);

DEVBOX PERSISTENCE TOOLS

List Disk Snapshots:
for await (const devboxSnapshotView of client.devboxes.listDiskSnapshots()) {
  console.log(devboxSnapshotView.id);
}

Create a Snapshot:
const devboxSnapshotView = await client.devboxes.snapshotDisk('id');
console.log(devboxSnapshotView.id);

Delete a Snapshot:
const response = await client.devboxes.deleteDiskSnapshot('id');
console.log(response);

DEVBOX OBSERVABILITY TOOLS

Retrieve Devbox Logs:
const devboxLogsListView = await client.devboxes.logs.list('id');
console.log(devboxLogsListView.logs);

Fetch Logs via API:
const options = { method: 'GET', headers: { Authorization: 'Bearer ' } };

fetch('https://api.runloop.ai/v1/devboxes/{id}/logs/tail', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));

REPOSITORY OBJECT
A Repository in Runloop represents a connection to a remote repository and facilitates its automated analysis. This enables users to manage and inspect repositories efficiently.

ATTRIBUTES
- id (string, required) - Unique identifier of the Repository.
- name (string, required) - The name of the Repository.
- owner (string, required) - The account owner associated with the Repository.
- status (enum, required) - The current state of the Repository. Available options: `pending`, `failure`, `active`.
- failure_reason (string | null, optional) - The reason for failure, if the repository has a `failure` status.

Repositories can be created, retrieved, listed, and deleted using the Runloop API.

REPOSITORY OPERATIONS

List Repositories:
for await (const repositoryConnectionView of client.repositories.list()) {
  console.log(repositoryConnectionView.id);
}

Create a Repository:
const repositoryConnectionView = await client.repositories.create({ name: 'name', owner: 'owner' });
console.log(repositoryConnectionView.id);

Retrieve a Repository:
const repositoryConnectionView = await client.repositories.retrieve('id');
console.log(repositoryConnectionView.id);

Delete a Repository:
const repository = await client.repositories.delete('id');
console.log(repository);

Inspect Latest Repository State via API:
const options = { method: 'POST', headers: { Authorization: 'Bearer ' } };

fetch('https://api.runloop.ai/v1/repositories/{id}/inspect_latest', options)
  .then(response => response.json())
  .then(response => console.log(response))
  .catch(err => console.error(err));

List Repository Versions:
const repositoryVersionListView = await client.repositories.versions('id');
console.log(repositoryVersionListView.analyzed_versions);

EXECUTION GUIDELINES:
Always shut down Devboxes after use to free up resources.
Use executeAsync for non-blocking commands.
Handle API errors using try/catch blocks.
Use tunnels for exposing services running inside Devboxes.
Use snapshots to persist Devbox states.
```

## Download .cursorrules Files (Legacy)

Legacy `.cursorrules` files for working with the Runloop SDK are also available below:

* Python File
* TypeScript File

# Runloop Dashboard
Source: https://docs.runloop.ai/tools/dashboard

Manage, monitor, and optimize your AI-powered coding environments with the Runloop Dashboard.

The Runloop Dashboard is a powerful web-based interface designed to help developers manage, monitor, and optimize their AI-powered coding environments. It serves as a central command center for your Devboxes, offering intuitive tools for deployment, monitoring, and troubleshooting.

## Getting Started

1. Log in to your Runloop account at [https://platform.runloop.ai](https://platform.runloop.ai)
2. Navigate through the sidebar to access different tools and features

## Key Features

1. **Runloop Shell**: An in-browser command-line interface for interacting with `running` Devboxes.
2. **Comprehensive Search**: Quickly find specific Devboxes using metadata and status filters.
3. **Log Viewer**: Deep dive into Devbox logs with real-time streaming and querying.
4. **Resource Monitoring**: Track and optimize CPU, memory, and storage usage across your Devboxes.

## Essential Dashboard Tools

### Runloop Shell

The Runloop Shell allows you to manage active Devboxes, execute commands, and troubleshoot issues without leaving your browser.

### Advanced Search

Use the filter functionality to find the right Devboxes:

* By status: `status:running`
* By metadata: `metadata.project:ai-refactor`
* By time range: `created_after:2023-01-01`

### Log Analysis

Access and analyze logs for any Devbox:

1. Select a Devbox from the dashboard
2. Navigate to the "Logs" tab
3. Use built-in filters to isolate specific log entries
4. Enable real-time streaming for active monitoring

### Resource Optimization (Coming Soon)

Monitor resource utilization:

1. View historical usage graphs
2. Receive optimization recommendations

# SDKs
Source: https://docs.runloop.ai/tools/sdks

Use the Runloop SDKs to interact with the Runloop API.

# Runloop SDKs

Runloop provides SDKs in common languages to interact with the Runloop API. These SDKs allow you to create, manage, and interact with Devboxes, Blueprints, and other Runloop resources programmatically.

* If you are using Runloop from Python, use the [Runloop Python SDK](https://github.com/runloopai/api-client-python)
* If you are using Runloop from Node, use the [Runloop TypeScript SDK](https://github.com/runloopai/api-client-ts)

Please reach out if you need SDKs in other languages.
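
As a rough illustration of managing a Devbox programmatically, the sketch below wraps the create, wait, execute, and shutdown calls from the Python cursor rules in a single function. It assumes `client` is an already initialized Python SDK client; `run_in_devbox` is a hypothetical helper and not part of the SDK.

```python
from typing import Any


def run_in_devbox(client: Any, command: str) -> Any:
    """Create a Devbox, run one command synchronously, and always shut it down.

    `client` is assumed to be an initialized Runloop Python SDK client.
    This function is a hypothetical helper, not an SDK call.
    """
    devbox = client.devboxes.create()
    try:
        # Block until the Devbox reports that it is running.
        client.devboxes.await_running(devbox.id)
        # Run the command and return the SDK's execution detail object as-is.
        return client.devboxes.execute_sync(id=devbox.id, command=command)
    finally:
        # Release the Devbox even if waiting or execution raised.
        client.devboxes.shutdown(id=devbox.id)
```

Putting the shutdown in `finally` follows the guideline to shut down Devboxes after use, so a failed command does not leave a Devbox running.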