Quickstart: Controlling a Browser in a Runloop Devbox

Introduction

This guide shows how to use the Runloop SDK to control a browser running inside a Runloop Devbox. The Runloop API provides a browser-ready Devbox that AI agents can use to interact with web pages programmatically.

1

Set Up Your Environment

Export your Runloop API key as an environment variable:

export RUNLOOP_API_KEY="your-api-key"

2

Install and Initialize the Runloop SDK

First, install the Runloop SDK if you haven’t already:

pip install runloop_api_client

Then, import and initialize the SDK:

import os

from runloop_api_client import Runloop

# Read the key exported in step 1 instead of hard-coding it
client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

This client object allows interaction with the Runloop API.
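
Because the key comes from the environment, it can help to fail early with a readable message when the variable is missing. The sketch below shows one way to do that; `load_api_key` is a hypothetical helper written for this guide, not part of the Runloop SDK.

```python
import os


def load_api_key(env_var: str = "RUNLOOP_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for this guide; not part of the Runloop SDK.
    """
    # Treat an unset variable and a blank value the same way
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set. Export it before creating the client."
        )
    return api_key
```

The returned value can then be passed as `bearer_token` when constructing the `Runloop` client.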

3

Create a Devbox and Start the Browser

Set up your browser-ready Devbox and obtain the connection details:

# Create a Devbox with a browser instance
browser = client.devboxes.browsers.create()

# Wait for the Devbox to be fully running
client.devboxes.await_running(browser.devbox.id)

# Open this URL to watch the remote browser live
print(browser.live_view_url)

# Pass this URL to your automation tool to connect to the browser
print(browser.connection_url)

4

Connect to the Browser using Playwright

To interact with the browser, you can use automation tools like Selenium, Puppeteer, or Playwright. Here’s an example that connects with Playwright over the Chrome DevTools Protocol (CDP):

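from playwright.async_api import async_playwright

# Run this inside an async function (or a notebook that supports top-level await)

# Start Playwright
playwright = await async_playwright().start()

# Connect to the remote browser using the connection URL from step 3
cdp_browser = await playwright.chromium.connect_over_cdp(browser.connection_url)

# Reuse the browser's existing context if there is one; otherwise create one
context = cdp_browser.contexts[0] if cdp_browser.contexts else await cdp_browser.new_context()

# Reuse the first open page if there is one; otherwise open a new page
page = context.pages[0] if context.pages else await context.new_page()

5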

Defining Tools for AI Agents

You can create custom tools for AI agents to interact with the browser programmatically. Here’s an example of a navigation tool using Playwright:

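from typing import Optional

from playwright.async_api import async_playwright

class NavigateTool:
    """A tool for navigating to a URL using Playwright."""

    def __init__(self, connection_url: Optional[str] = None):
        # Pass browser.connection_url from step 3 to drive the Devbox browser;
        # without it, the tool falls back to launching a local Chromium.
        self.connection_url = connection_url

    async def __call__(self, *, url: str):
        async with async_playwright() as p:
            if self.connection_url:
                browser = await p.chromium.connect_over_cdp(self.connection_url)
            else:
                browser = await p.chromium.launch()
            page = await browser.new_page()
            await page.goto(url)
            content = await page.content()
            await browser.close()
            # Return only the first 500 characters of the page HTML
            return {"output": f"Navigated to {url}", "content": content[:500]}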

    def to_params(self):
        return {
            "name": "navigate_tool",
            "description": "Navigates to a URL and retrieves content.",
            "input_schema": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        }

6

Passing Tools to an AI Agent

Now, you can pass this tool to an AI agent, enabling it to use the browser autonomously:

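from anthropic import Anthropic

# messages.create with input_schema-style tools matches the Anthropic SDK, so this
# example assumes that provider. A separate name keeps the Runloop `client` intact.
llm_client = Anthropic()

tool_instance = NavigateTool()

response = llm_client.messages.create(
    model=model,            # model name for your provider
    max_tokens=max_tokens,
    messages=messages,
    tools=[tool_instance.to_params()],
)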

Different LLM providers have their own specific formats and requirements for defining and passing tools. Make sure to reference your LLM provider’s documentation for the correct implementation details of tool schemas and function calling.
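
When the model decides to use a tool, your code still has to run it and send the result back. The following is a minimal sketch of that dispatch step, assuming Anthropic-style response blocks (objects with `type`, `id`, `name`, and `input`); `run_tool_calls` and `tools_by_name` are hypothetical names, and other providers will need a different shape.

```python
async def run_tool_calls(content_blocks, tools_by_name):
    """Run each tool_use block and return matching tool_result blocks.

    Assumes Anthropic-style blocks with .type, .id, .name, and .input;
    other providers use different shapes.
    """
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) != "tool_use":
            continue  # skip text and other non-tool blocks
        tool = tools_by_name.get(block.name)
        if tool is None:
            # Report unknown tools back to the model instead of raising
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Unknown tool: {block.name}",
                "is_error": True,
            })
            continue
        # Tools in this guide are async callables taking keyword arguments
        output = await tool(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(output),
        })
    return results
```

With the tool from step 5, `tools_by_name` would be `{"navigate_tool": tool_instance}` and `content_blocks` would be `response.content`. The returned list is then sent back to the model as the content of the next user message.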

7

Properly Freeing Resources

To ensure efficient resource management, always shut down the Devbox when you’re done:

client.devboxes.shutdown(browser.devbox.id)
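
If the automation code raises before reaching the shutdown call, the Devbox keeps running. One way to guard against that is a context manager that always shuts the Devbox down on exit. The sketch below is a hypothetical helper (`browser_devbox` is not part of the SDK) that only wraps the three SDK calls already used in this guide.

```python
from contextlib import contextmanager


@contextmanager
def browser_devbox(client):
    """Yield a running browser Devbox and shut it down on exit.

    Hypothetical helper; it only wraps the SDK calls used in this guide.
    """
    browser = client.devboxes.browsers.create()
    try:
        client.devboxes.await_running(browser.devbox.id)
        yield browser
    finally:
        # Runs even if the body raises, so the Devbox is not left running
        client.devboxes.shutdown(browser.devbox.id)
```

Used as `with browser_devbox(client) as browser:`, the automation code from the earlier steps goes inside the `with` body.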
