Running AI-generated code securely with Runloop

Runloop Devboxes provide a secure, isolated environment for running AI-generated code.

Let’s see how we can use Devboxes to safely run AI-generated code that generates mazes.

1

Set Up Your Environment

Set up your API keys as environment variables:

export RUNLOOP_API_KEY=<your_runloop_api_key_here>
export OPENAI_API_KEY=<your_openai_api_key_here>
Replace the placeholders with your actual API keys. You can get your Runloop API key from the Runloop Dashboard.
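Before making any API calls, it can help to confirm that both keys are actually set. The following is a minimal sketch, not part of the Runloop tooling; the helper name `missing_keys` is hypothetical, and only the two variable names come from the step above.

```python
from typing import List, Mapping

# The two environment variables exported in the step above.
REQUIRED_KEYS = ("RUNLOOP_API_KEY", "OPENAI_API_KEY")


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the names of required API keys that are unset or blank."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys(os.environ)` (after `import os`) returns an empty list when both keys are present.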
2

Use AI to generate a maze generator program

First, we’ll use the OpenAI API to generate Python code that generates a maze.

response=$(curl "https://api.openai.com/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
    "model": "gpt-4o-mini",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful Python code generating assistant. You output code only, no other text. Do NOT wrap your code in ```python or ```bash or anything like that."
        },
        {
            "role": "user",
            "content": "Write a Python script that generates an ASCII art maze and prints it to stdout. It should be 10x10 in size and callable from the command line via `python maze.py`."
        }
    ]
}')

python_script=$(echo "$response" | jq -r '.choices[0].message.content')
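Even with a system prompt that forbids it, a model may still wrap its answer in a Markdown code fence. The sketch below shows one defensive way to extract the script in Python; it reads the same `choices[0].message.content` path as the `jq` command above, and the helper names `strip_code_fences` and `extract_script` are hypothetical.

```python
# Built programmatically so this example does not contain a literal fence.
FENCE = "`" * 3


def strip_code_fences(text: str) -> str:
    """Remove one surrounding Markdown code fence, if present."""
    lines = text.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]
    return "\n".join(lines) + "\n"


def extract_script(response: dict) -> str:
    """Pull the generated code out of a parsed chat completion response."""
    content = response["choices"][0]["message"]["content"]
    return strip_code_fences(content)
```

Unfenced responses pass through unchanged apart from a normalized trailing newline.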
3

Create a Devbox to securely run the AI-generated code

Now, let’s create a Devbox to use as our sandbox environment. When you create a Devbox, Runloop automatically provisions a secure microVM that can load and run your code.

A Devbox starts in the ‘provisioning’ state. Once it is ready, it transitions to the ‘running’ state, at which point we can begin using it.

curl -X POST \
  'https://api.runloop.ai/v1/devboxes' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{}'
The response includes a Devbox ID (e.g. dbx_1234567890). Use this ID in place of <YOUR_DEVBOX_ID> in the commands below.
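Because a new Devbox starts in ‘provisioning’ and only later reaches ‘running’, a client that creates one over raw HTTP needs to wait before using it. The following is a rough sketch of such a polling loop. The `get_status` callable is a hypothetical stand-in for however you fetch the Devbox state, and the timeout values are arbitrary; any state other than ‘running’ is simply treated as "not ready yet".

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 120.0,
    poll_interval_s: float = 2.0,
) -> None:
    """Poll get_status() until it returns 'running' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "running":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("Devbox did not reach 'running' in time")
        time.sleep(poll_interval_s)
```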
4

Upload the Maze Generator Program to the Devbox

Now, let’s upload the Python script to the Devbox so we can run it securely.

# Build the JSON body with jq so newlines and quotes in the script are escaped correctly
curl -X POST \
  'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/write_file_contents' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --arg contents "$python_script" '{file_path: "maze.py", contents: $contents}')"
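Generated code almost always contains newlines and quotes, so the request body has to be JSON-escaped rather than assembled by string interpolation. As an illustration, the same payload can be built in Python with the standard `json` module. The field names `file_path` and `contents` come from the request above; the helper name `build_write_file_payload` is hypothetical.

```python
import json


def build_write_file_payload(file_path: str, contents: str) -> str:
    """Serialize the upload request body, escaping quotes and newlines."""
    return json.dumps({"file_path": file_path, "contents": contents})
```

The resulting string is valid JSON on a single line and can be sent as the body of the `write_file_contents` request.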
5

Run the Maze Generator Program

curl -X POST \
  'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/execute_sync' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"command": "python maze.py"}'
6

Shut down the Devbox

Once we are done generating mazes, we can shut down the Devbox to free up resources:

curl -X POST \
  'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/shutdown' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"

By default, Devboxes automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to shut down after a specified idle period.
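Rather than relying on the idle timeout, a script can guarantee that the Devbox is shut down even when an error occurs partway through. One possible pattern is a context manager, sketched below. The `create` and `shutdown` callables are hypothetical wrappers that you would implement around the create and shutdown requests shown above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def devbox_session(
    create: Callable[[], str],
    shutdown: Callable[[str], None],
) -> Iterator[str]:
    """Create a Devbox, yield its ID, and always shut it down afterwards."""
    devbox_id = create()
    try:
        yield devbox_id
    finally:
        # Runs on normal exit and when the body raises.
        shutdown(devbox_id)
```

Inside a `with devbox_session(...) as devbox_id:` block, any exception still triggers the shutdown call before propagating.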

Giving agents a secure development environment via Tools

Besides manually uploading and running code through the Runloop API, you can use Runloop Tools to give an agent full access to a Devbox.

For example, let’s make a simple coding agent that can generate Python code, and ask it to write a command-line script that prints its command-line arguments as ASCII words.

1

Create a Devbox for the agent to use

Let’s create a Devbox to use as a development environment for the agent.

import os
from runloop_api_client import Runloop

# Initialize the Runloop client
runloop_client = Runloop(bearer_token=os.environ.get("RUNLOOP_API_KEY"))

# Initialize a devbox and retrieve the devbox id
devbox = runloop_client.devboxes.create_and_await_running()
print(devbox.id)
This script prints the Devbox ID (e.g. dbx_1234567890). The ID is used to perform operations on the Devbox.
2

Create tools so the agent can use the Devbox

Next, let’s create tools bound to the Devbox so the agent can use it. In the examples below, we will use:

  • Python: Utilize the Ell framework to create tools and run an agent.
  • TypeScript: Define each tool and pass them as a ChatCompletionTool[].

However, the tools can easily be created in any language and any framework! Check out our examples repository for more examples in your favorite language or framework.

import ell
from runloop_api_client import Runloop

@ell.tool()
def execute_shell_command(command: str):
    """Run a shell command in the devbox."""
    return runloop_client.devboxes.execute_sync(devbox.id, command=command).stdout

@ell.tool()
def read_file(filename: str):
    """Reads a file on the devbox."""
    return runloop_client.devboxes.read_file_contents(devbox.id, file_path=filename)

@ell.tool()
def write_file(filename: str, contents: str):
    """Writes a file on the devbox."""
    runloop_client.devboxes.write_file_contents(
        devbox.id, file_path=filename, contents=contents
    )
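Outside of Ell, the same three operations can be exposed through a plain dictionary of functions that your own agent loop dispatches to. The sketch below assumes a `devboxes` object offering the `execute_sync`, `read_file_contents`, and `write_file_contents` calls used above; `make_tools` and `dispatch` are hypothetical helper names, not part of the Runloop SDK.

```python
from typing import Any, Callable, Dict


def make_tools(devboxes: Any, devbox_id: str) -> Dict[str, Callable[..., Any]]:
    """Bind the three Devbox operations to one Devbox ID."""

    def execute_shell_command(command: str) -> str:
        return devboxes.execute_sync(devbox_id, command=command).stdout

    def read_file(filename: str) -> str:
        return devboxes.read_file_contents(devbox_id, file_path=filename)

    def write_file(filename: str, contents: str) -> None:
        devboxes.write_file_contents(
            devbox_id, file_path=filename, contents=contents
        )

    return {
        "execute_shell_command": execute_shell_command,
        "read_file": read_file,
        "write_file": write_file,
    }


def dispatch(tools: Dict[str, Callable[..., Any]], name: str, arguments: dict) -> Any:
    """Run the tool the model asked for, rejecting unknown tool names."""
    if name not in tools:
        raise KeyError(f"Unknown tool: {name}")
    return tools[name](**arguments)
```

With the client from the previous step, `make_tools(runloop_client.devboxes, devbox.id)` would produce the bound tools.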

By default, Devboxes are configured to automatically shut down after 60 minutes of inactivity. They can also be configured to run for any amount of time or to automatically shut down after a specified idle period.

3

Run the agent with the tools

Now we can give the tools to the agent and ask it to generate a script and run it in the Devbox.

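from typing import List

import ell
from ell import Message

# SYSTEM_PROMPT and USER_PROMPT are strings you define with the agent's
# instructions and the task to perform.

@ell.complex(
    model="gpt-4-turbo", tools=[execute_shell_command, read_file, write_file]
)
def invoke_agent(message_history: List[Message]):
    """Calls the LLM to generate the program."""
    messages = [
        ell.system(SYSTEM_PROMPT),
        ell.user(USER_PROMPT),
    ] + message_history
    return messages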
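Defining `invoke_agent` only describes a single LLM call; something still has to call it repeatedly, run any requested tools, and feed the results back. The loop below is a sketch of that driver. It is duck-typed: it assumes each response exposes `tool_calls` and `call_tools_and_collect_as_message()`, which is an assumption about Ell's message API to verify against the Ell documentation. The name `run_agent` and the `max_turns` limit are hypothetical.

```python
from typing import Any, Callable, List


def run_agent(invoke: Callable[[List[Any]], Any], max_turns: int = 10) -> Any:
    """Call the agent until it answers without requesting a tool."""
    message_history: List[Any] = []
    for _ in range(max_turns):
        response = invoke(message_history)
        message_history.append(response)
        if not response.tool_calls:
            return response  # Final answer: no more tool calls requested.
        # Execute the requested tools and hand their output back to the model.
        message_history.append(response.call_tools_and_collect_as_message())
    raise RuntimeError(f"Agent did not finish within {max_turns} turns")
```

Under those assumptions, `run_agent(invoke_agent)` would drive the agent defined above; the turn limit guards against a model that never stops calling tools.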
4

Shut down the Devbox

Once the agent has finished its work, we can shut down the Devbox:

curl -X POST \
  'https://api.runloop.ai/v1/devboxes/<YOUR_DEVBOX_ID>/shutdown' \
  -H "Authorization: Bearer $RUNLOOP_API_KEY"

Explore the Runloop Examples Repository to see how to build AI agents that run on Runloop.
