Skip to main content
This is an optional extension of the Running Agents on Sandboxes tutorial. Complete the main tutorial first.

Overview

Create a turn-based system where an AI agent posts status updates to a GitHub pull request and executes tasks based on PR review comments. This enables collaborative workflows where reviewers guide the agent’s work just by leaving comments.
1

Set up the initial PR and post agent status

Create a pull request and post an initial status comment from the agent. This establishes the communication channel for turn-based interaction.
import os
from github import Github  # Requires PyGithub library

# Initialize GitHub client
g = Github(os.environ["GH_TOKEN"])
repo = g.get_repo("runloopai/sample-todo-nextjs")

# Push the branch and create a PR (or use existing PR)
# For this example, assume we have a PR number
pr_number = 123  # Replace with your actual PR number
pr = repo.get_pull(pr_number)

# Post initial agent status
initial_comment = """🤖 **Agent Status: Ready**

I'm ready to help with this PR. I'll monitor this thread for instructions and update you on my progress.

You can comment with tasks like:
- "Update the color scheme to use purple accents"
- "Add error handling to the API routes"
- "Refactor the component structure"

I'll respond with my progress and results.
"""
pr.create_issue_comment(initial_comment)
print(f"Posted initial status to PR #{pr_number}")
2

Monitor PR comments and process agent tasks

Set up a loop that monitors the PR for new comments, processes them as agent prompts, and posts progress updates. Use the -r flag to resume the most recent Claude Code conversation across multiple PR comments.
import os
import time
from github import Github

g = Github(os.environ["GH_TOKEN"])
repo = g.get_repo("runloopai/sample-todo-nextjs")
pr = repo.get_pull(pr_number)

# Track processed comments to avoid duplicates
processed_comment_ids = set()

# Create a named shell and navigate to the todo app directory
shell = devbox.shell("agent-shell")
await shell.exec("cd ~/sample-todo-nextjs")

print("Monitoring PR for comments...")
# Note: This loop runs indefinitely; in production, add your own exit criteria or lifecycle controls.
while True:
    # Get all comments on the PR
    comments = pr.get_issue_comments()
    
    for comment in comments:
        # Skip if we've already processed this comment
        if comment.id in processed_comment_ids:
            continue
        
        # Skip bot comments (including our own)
        if comment.user.type == "Bot":
            continue
        
        # Mark as processed
        processed_comment_ids.add(comment.id)
        
        # Extract the task from the comment
        task = comment.body.strip()
        commenter = comment.user.login
        
        print(f"New task from @{commenter}: {task}")
        
        # Post "working" status
        working_comment = f"""🤖 **Agent Status: Working**

Processing task from @{commenter}:
> {task}

Starting work now...
"""
        status_comment = pr.create_issue_comment(working_comment)
        
        # Execute the task using Claude Code
        # Use -r flag to resume the most recent session (or start new if none exists)
        result = await shell.exec(f'claude -r -p "{task}"')
        
        output = await result.stdout()
        
        # Post results
        result_comment = f"""✅ **Agent Status: Complete**

Task from @{commenter}:
> {task}

**Results:**
\`\`\`
{output[:1000]}
\`\`\`

**Next Steps:**
- Review the changes in the PR
- Comment with additional tasks or feedback
- The agent will continue monitoring for new instructions
"""
        pr.create_issue_comment(result_comment)
        
        # Stage and commit the changes (shell maintains working directory)
        await shell.exec("git add .")
        await shell.exec(f'git commit -m "Agent: {task[:50]}"')
        await shell.exec("git push")
        
        print(f"Completed task and updated PR")
    
    # Wait before checking again
    time.sleep(10)  # Check every 10 seconds
Named Shells: We use a named shell (devbox.shell("agent-shell")) to maintain the working directory state across commands. After the initial cd ~/sample-todo-nextjs, all subsequent commands run in that directory without needing to cd each time. Learn more about named shells.Session Resumption: The -r flag without a session ID will resume the most recent Claude Code conversation. On the first call, it starts a new session, and on subsequent calls, it automatically resumes the previous conversation, maintaining context across multiple PR comments.Production Webhooks: In production, you should use GitHub webhooks to receive real-time notifications when comments are added, rather than polling. This polling approach is shown for simplicity, but webhooks are more efficient and responsive.
3

Add error handling and status updates

Enhance the workflow with better error handling and more detailed status updates to keep reviewers informed.
import os
import time
from github import Github

g = Github(os.environ["GH_TOKEN"])
repo = g.get_repo("runloopai/sample-todo-nextjs")
pr = repo.get_pull(pr_number)

processed_comment_ids = set()

# Create a named shell and navigate to the todo app directory
shell = devbox.shell("agent-shell")
await shell.exec("cd ~/sample-todo-nextjs")

while True:
    comments = pr.get_issue_comments()
    
    for comment in comments:
        if comment.id in processed_comment_ids or comment.user.type == "Bot":
            continue
        
        processed_comment_ids.add(comment.id)
        task = comment.body.strip()
        commenter = comment.user.login
        
        # Post working status
        working_comment = f"""🤖 **Agent Status: Working**

Processing task from @{commenter}:
> {task}

Starting work now...
"""
        pr.create_issue_comment(working_comment)
        
        try:
            # Execute the task using Claude Code
            # Use -r flag to resume the most recent session
            result = await shell.exec(f'claude -r -p "{task}"')
            
            output = await result.stdout()
            exit_code = result.exit_code
            
            if exit_code == 0:
                # Success - post results
                result_comment = f"""✅ **Agent Status: Complete**

Task from @{commenter}:
> {task}

**Results:**
\`\`\`
{output[:1500]}
\`\`\`

Changes have been committed and pushed to this PR.
"""
                pr.create_issue_comment(result_comment)
                
                # Commit changes (shell maintains working directory)
                await shell.exec("git add .")
                await shell.exec(f'git commit -m "Agent: {task[:50]}"')
                await shell.exec("git push")
            else:
                # Error - post failure message
                error_comment = f"""❌ **Agent Status: Error**

Task from @{commenter}:
> {task}

**Error:**
The agent encountered an error while processing this task.

**Output:**
\`\`\`
{output[:1500]}
\`\`\`

Please review the error and provide additional guidance or clarification.
"""
                pr.create_issue_comment(error_comment)
                
        except Exception as e:
            # Exception handling
            error_comment = f"""❌ **Agent Status: Exception**

Task from @{commenter}:
> {task}

**Error:**
An exception occurred: {str(e)}

Please check the agent logs for more details.
"""
            pr.create_issue_comment(error_comment)
    
    time.sleep(10)
4

Add command filtering and special instructions

Add support for special commands and filtering to make the interaction more controlled and useful.
# Helper function to check if comment is a command
def is_command(comment_body):
    commands = ["/status", "/stop", "/help"]
    return any(comment_body.strip().startswith(cmd) for cmd in commands)

async def handle_command(command, pr, shell):
    if command == "/status":
        # Get git status (shell maintains working directory)
        status_result = await shell.exec("git status")
        status_output = await status_result.stdout()
        
        status_comment = f"""📊 **Agent Status Report**

**Git Status:**
\`\`\`
{status_output}
\`\`\`
"""
        pr.create_issue_comment(status_comment)
        return True
    
    elif command == "/stop":
        stop_comment = """🛑 **Agent Status: Stopped**

The agent has stopped monitoring this PR. To resume, post a new comment with a task.
"""
        pr.create_issue_comment(stop_comment)
        return False  # Signal to stop monitoring
    
    elif command == "/help":
        help_comment = """â„šī¸ **Agent Commands**

Available commands:
- `/status` - Show current git status
- `/stop` - Stop the agent from monitoring this PR
- `/help` - Show this help message

To give the agent a task, simply comment with your instruction (no command prefix needed).
"""
        pr.create_issue_comment(help_comment)
        return True
    
    return True

# In the main loop, check for commands first
for comment in comments:
    if comment.id in processed_comment_ids or comment.user.type == "Bot":
        continue
    
    processed_comment_ids.add(comment.id)
    comment_body = comment.body.strip()
    
    if is_command(comment_body):
        should_continue = await handle_command(comment_body, pr, shell)
        if not should_continue:
            break  # Stop monitoring
        continue
    
    # Process as regular task...
    # (rest of the task processing code)

Best Practices

  • Use webhooks in production: Replace polling with GitHub webhooks for real-time notifications
  • Rate limiting: Be mindful of GitHub API rate limits when polling frequently; authenticating with a GitHub token or GitHub App increases your available quota
  • Error recovery: Implement retry logic for transient failures
  • Security: Validate and sanitize user input and secrets before passing them to the agent
  • Logging: Keep detailed logs of agent actions for debugging and auditing
  • Session management: Use claude -r -p "query" to resume the most recent conversation. The -r flag without a session ID automatically resumes the most recent session, maintaining context across multiple PR comments
  • Claude Code flags:
    • Use -p flag for print mode (non-interactive, SDK usage)
    • Use -r or --resume without a session ID to resume the most recent session
    • Use -r "<session-id>" or --resume "<session-id>" to resume a specific session by ID
    • Use -c to continue the most recent conversation in the current directory
    • For more production examples of webhook-based workflows and secret management, see the Runloop tutorials and Account Secrets documentation.

Next Steps