Introduction
This guide walks you through using the Runloop SDK to control a browser inside a Runloop Devbox. The Runloop API provides a browser-ready Devbox, enabling AI agents to interact with web pages programmatically.
Set Up Your Environment
Follow the instructions in the Runloop Quickstart to set up your environment to use the Runloop SDK.
Create a Devbox and Start the Browser
Set up your browser-ready Devbox and obtain the connection details:
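A minimal sketch of this step in Python. It assumes the `runloop_api_client` package exposes a `devboxes.browsers.create()` call whose result carries `devbox.id`, `connection_url`, and `live_view_url`; treat those names as assumptions and check them against the API reference. The `BrowserSession` dataclass and the `start_browser_devbox` helper are illustrative and not part of the SDK.

```python
import os
from dataclasses import dataclass


@dataclass
class BrowserSession:
    """Connection details for a browser-ready Devbox (illustrative type)."""
    devbox_id: str
    cdp_url: str
    live_view_url: str


def start_browser_devbox(client) -> BrowserSession:
    """Create a browser-ready Devbox and collect its connection details.

    Assumption: `client.devboxes.browsers.create()` returns an object with
    `devbox.id`, `connection_url`, and `live_view_url` attributes.
    """
    browser = client.devboxes.browsers.create()
    return BrowserSession(
        devbox_id=browser.devbox.id,
        cdp_url=browser.connection_url,
        live_view_url=browser.live_view_url,
    )


def main() -> None:
    # Imported lazily so the helper above can be exercised without the SDK.
    from runloop_api_client import Runloop

    # Assumption: the API key is stored in the RUNLOOP_API_KEY variable.
    client = Runloop(bearer_token=os.environ["RUNLOOP_API_KEY"])
    session = start_browser_devbox(client)
    print("Devbox ID:", session.devbox_id)
    print("CDP URL:", session.cdp_url)
    print("Live view URL:", session.live_view_url)
```

Calling `main()` with `RUNLOOP_API_KEY` set would create the Devbox and print the three values.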
Both URLs point to localhost by default and are visible only inside the Devbox. This is all you need if the AI agent controlling the browser runs inside the Devbox. To access either URL remotely, you also need to configure a tunnel.
Programmatically Controlling the Browser
To interact with the browser, you can use automation tools like Selenium, Puppeteer, or Playwright. Here’s an example that uses Playwright to connect over the Chrome DevTools Protocol (CDP):
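A minimal sketch using Playwright's synchronous API, assuming you already hold the browser's CDP URL. `connect_over_cdp`, `contexts`, `new_page`, `goto`, and `title` are Playwright calls; the `fetch_page_title` helper and the `https://example.com` target are illustrative.

```python
def fetch_page_title(playwright, cdp_url: str, page_url: str) -> str:
    """Attach to a running Chromium over CDP and return a page's title.

    `playwright` is the object yielded by `sync_playwright()`.
    """
    browser = playwright.chromium.connect_over_cdp(cdp_url)
    try:
        # Reuse the browser's existing context if it has one.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        try:
            page.goto(page_url)
            return page.title()
        finally:
            page.close()
    finally:
        # Release this client's connection to the browser.
        browser.close()


def main(cdp_url: str) -> None:
    # Imported lazily so the helper can be exercised without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        print(fetch_page_title(p, cdp_url, "https://example.com"))
```

Passing the Playwright object in as a parameter keeps the helper easy to test with a stand-in; `main` shows how it would be wired to a real connection.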
Defining a Browser Tool for Your AI Agent
You can create custom tools for AI agents to interact with the browser programmatically. Here’s an example of a navigation tool using Playwright:
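One possible shape for such a tool, sketched under assumptions: `page` is a Playwright page already attached to the Devbox browser, and `NAVIGATE_TOOL_SPEC` is a hypothetical, provider-neutral description that would need adapting to your LLM provider's schema. The tool returns error text instead of raising so the agent can read the failure and try something else.

```python
from typing import Any, Dict

# Hypothetical, provider-neutral description of the tool (JSON Schema style).
NAVIGATE_TOOL_SPEC: Dict[str, Any] = {
    "name": "navigate",
    "description": "Open a URL in the Devbox browser and report the page title.",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "Absolute http(s) URL"},
        },
        "required": ["url"],
    },
}


def navigate(page, url: str) -> str:
    """Navigate a Playwright `page` to `url` and return a short text result."""
    if not url.startswith(("http://", "https://")):
        return f"Error: unsupported URL {url!r}"
    try:
        page.goto(url)
    except Exception as exc:  # e.g. timeouts or network errors during navigation
        return f"Error: could not load {url}: {exc}"
    return f"Loaded {url} (title: {page.title()})"
```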
Passing Tools to an AI Agent
Now you can pass this tool to an AI agent, enabling it to use the browser autonomously:
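How tools are handed to the model depends on the provider, but the application side usually comes down to a registry of callables plus a dispatcher for the model's tool calls. The sketch below is provider-neutral; `make_dispatcher` and the `echo_url` stand-in are hypothetical names, not part of any SDK, and the sketch assumes the model supplies a tool name with JSON-encoded arguments.

```python
import json
from typing import Callable, Dict


def make_dispatcher(tools: Dict[str, Callable[..., str]]) -> Callable[[str, str], str]:
    """Return a function that executes a model-requested tool call."""

    def dispatch(name: str, arguments_json: str) -> str:
        tool = tools.get(name)
        if tool is None:
            return f"Error: unknown tool {name!r}"
        try:
            arguments = json.loads(arguments_json)
        except json.JSONDecodeError as exc:
            return f"Error: invalid arguments: {exc}"
        if not isinstance(arguments, dict):
            return "Error: arguments must be a JSON object"
        try:
            return tool(**arguments)
        except TypeError as exc:
            # Missing or unexpected keyword arguments end up here.
            return f"Error: bad arguments for {name}: {exc}"

    return dispatch


def echo_url(url: str) -> str:
    # Stand-in for a real browser tool, so the sketch runs without a browser.
    return f"would navigate to {url}"


dispatch = make_dispatcher({"navigate": echo_url})
print(dispatch("navigate", '{"url": "https://example.com"}'))
```

In a real agent loop, the result string would be sent back to the model as the tool's output, and the stand-in would be replaced by a function bound to a live browser page.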
Different LLM providers have their own specific formats and requirements for defining and passing tools. Make sure to reference your LLM provider’s documentation for the correct implementation details of tool schemas and function calling.
Properly Freeing Resources
To ensure efficient resource management, always shut down the Devbox when you’re done:
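One way to make the shutdown hard to forget is a context manager, sketched here under the assumption that the SDK client exposes `devboxes.browsers.create()` and `devboxes.shutdown(devbox_id)`; verify both names against the API reference. The `browser_devbox` helper itself is illustrative.

```python
from contextlib import contextmanager


@contextmanager
def browser_devbox(client):
    """Yield a browser view and always shut its Devbox down afterwards.

    Assumptions: `client.devboxes.browsers.create()` returns an object with a
    `devbox.id` attribute, and `client.devboxes.shutdown(devbox_id)` stops it.
    """
    browser = client.devboxes.browsers.create()
    try:
        yield browser
    finally:
        # Runs on normal exit and when the body raises.
        client.devboxes.shutdown(browser.devbox.id)
```

With this helper, `with browser_devbox(client) as browser:` would wrap the automation code, and the Devbox would be shut down even if that code fails.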
Additional Resources
- Runloop GitHub Repository - Explore more examples.
- Runloop API Documentation - Official API reference.
