Quickstart

This guide walks you through connecting a Steel cloud browser session with the browser-use framework, enabling an AI agent to interact with websites.

Prerequisites

Ensure you have the following:

Python 3.11 or higher
Steel API key (sign up at app.steel.dev)
OpenAI API key (sign up at platform.openai.com)

Step 1: Set up your environment

First, set up a virtual environment and install the required packages:

Terminal

$ uv venv
$ source .venv/bin/activate
$ uv pip install steel-sdk browser-use python-dotenv

Create a .env file with your API keys:

ENV

.env

STEEL_API_KEY=your_steel_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
TASK=Go to Wikipedia and search for machine learning
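
Before the script makes any network calls, it can help to confirm that the keys were actually loaded. The following is a minimal sketch that uses only the standard library. `missing_env_vars` is a hypothetical helper name, not part of the Steel SDK or browser-use, and the rule that any value starting with "your" is a leftover placeholder is an assumption based on the template above.

```python
# Variable names taken from the .env file above. TASK is left out on the
# assumption that the script falls back to a default task when it is unset.
REQUIRED_VARS = ("STEEL_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(env, required=REQUIRED_VARS):
    """Return the required names that are absent, blank, or still placeholders."""
    missing = []
    for name in required:
        value = (env.get(name) or "").strip()
        # Assumption: template values such as "your_steel_api_key_here"
        # always start with "your", so treat them as unset.
        if not value or value.startswith("your"):
            missing.append(name)
    return missing
```

A script could call `missing_env_vars(os.environ)` right after `load_dotenv()` and exit with a message if the returned list is not empty.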

Step 2: Create a Steel browser session

Use the Steel SDK to start a new browser session for your agent:

Python

main.py

import os
from steel import Steel
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
STEEL_API_KEY = os.getenv("STEEL_API_KEY") or "your-steel-api-key-here"

# Validate API key
if STEEL_API_KEY == "your-steel-api-key-here":
    print("⚠️  WARNING: Please replace with your actual Steel API key")
    print("   Get your API key at: https://app.steel.dev/settings/api-keys")
    # Exit early: a session cannot be created without a valid key
    raise SystemExit(1)

# Create a Steel browser session
client = Steel(steel_api_key=STEEL_API_KEY)
session = client.sessions.create()

print("✅ Steel browser session started!")
print(f"View live session at: {session.session_viewer_url}")

This creates a new browser session in Steel's cloud. The session_viewer_url lets you watch your agent's actions in real time.

Step 3: Define your browser session

Connect the browser-use BrowserSession class to your Steel session using the CDP URL:

Python

main.py

from browser_use import Agent, BrowserSession

# Connect browser-use to the Steel session
cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"
browser_session = BrowserSession(cdp_url=cdp_url)
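
The CDP URL carries your API key as a query parameter, so it is worth building it with proper escaping and keeping it out of logs. The sketch below uses only the standard library. `build_cdp_url` and `redact_api_key` are hypothetical helper names introduced for illustration; the base URL and the `apiKey` and `sessionId` parameter names are the ones used in the step above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

STEEL_CONNECT_URL = "wss://connect.steel.dev"  # base URL from the step above


def build_cdp_url(api_key, session_id, base_url=STEEL_CONNECT_URL):
    """Build the CDP WebSocket URL, percent-encoding both query values."""
    query = urlencode({"apiKey": api_key, "sessionId": session_id})
    return f"{base_url}?{query}"


def redact_api_key(url):
    """Return the same URL with the apiKey value masked, for safe logging."""
    parts = urlsplit(url)
    pairs = [
        (key, "REDACTED" if key == "apiKey" else value)
        for key, value in parse_qsl(parts.query)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

With these helpers, `BrowserSession(cdp_url=build_cdp_url(STEEL_API_KEY, session.id))` would replace the f-string, and `redact_api_key(cdp_url)` is what you would print if you need to log the connection target.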

Step 4: Define your AI agent

Now bring it all together by defining the agent with the task to perform, the LLM to reason with, and the browser session to drive.

Python

main.py

# After setting up the browser session
from browser_use import Agent
from browser_use.llm import ChatOpenAI

# Create a ChatOpenAI model for agent reasoning
model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.3,
    api_key=os.getenv('OPENAI_API_KEY')
)

# Define the task for the agent
task = os.getenv("TASK") or "Go to Wikipedia and search for machine learning"

# Create the agent with the task, model, and browser session
agent = Agent(
    task=task,
    llm=model,
    browser_session=browser_session,
)

This configures the AI agent with:

An OpenAI model for reasoning
The browser session instance from Step 3
A specific task to perform

Models: This example uses GPT-4o, but you can use any browser-use compatible model, such as those from Anthropic, DeepSeek, or Gemini. See the browser-use documentation for the full list of supported models.

Step 5: Run your agent

Python

main.py

import asyncio
import time

# Define the main function with the agent execution
async def main():
    try:
        start_time = time.time()

        print(f"🎯 Executing task: {task}")
        print("=" * 60)

        # Run the agent
        result = await agent.run()

        duration = f"{(time.time() - start_time):.1f}"

        print("\n" + "=" * 60)
        print("🎉 TASK EXECUTION COMPLETED")
        print("=" * 60)
        print(f"⏱️  Duration: {duration} seconds")
        print(f"🎯 Task: {task}")
        if result:
            print(f"📋 Result:\n{result}")
        print("=" * 60)

    except Exception as e:
        print(f"❌ Task execution failed: {e}")
    finally:
        # Clean up resources
        if session:
            print("Releasing Steel session...")
            client.sessions.release(session.id)
            print(f"Session completed. View replay at {session.session_viewer_url}")
        print("Done!")

# Run the async main function
if __name__ == '__main__':
    asyncio.run(main())

The agent connects to the Steel browser session created in Step 2 and interacts with it to complete the task. Whether the task succeeds or fails, release the Steel session afterwards, as the finally block does, so that it does not stay open.
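
One way to make the release step hard to forget is to wrap session creation and release in an async context manager. This is a minimal sketch, not the pattern used by the Steel SDK itself: `steel_session` is a hypothetical helper, and the `Fake*` classes are stand-ins that mimic only the `client.sessions.create()` and `client.sessions.release(session.id)` calls shown above, so the example runs without network access.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def steel_session(client):
    """Create a session and always release it when the block exits."""
    session = client.sessions.create()
    try:
        yield session
    finally:
        # Runs whether the block finished normally or raised an exception.
        client.sessions.release(session.id)


# --- Hypothetical stand-ins so the sketch runs offline ---
class FakeSession:
    id = "fake-session-id"


class FakeSessions:
    def __init__(self):
        self.released = []

    def create(self):
        return FakeSession()

    def release(self, session_id):
        self.released.append(session_id)


class FakeClient:
    def __init__(self):
        self.sessions = FakeSessions()


async def demo(client):
    """Simulate a failing agent run and report which sessions were released."""
    try:
        async with steel_session(client) as session:
            raise RuntimeError(f"simulated agent failure in {session.id}")
    except RuntimeError:
        pass  # the session has already been released at this point
    return client.sessions.released
```

In main.py you would pass the real `client` from Step 2, build the agent inside the `async with` block, and call `await agent.run()` there.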

Complete example

Here's the complete script that puts all steps together:

Python

main.py

"""
AI-powered browser automation using browser-use library with Steel browsers.
https://github.com/steel-dev/steel-cookbook/tree/main/examples/steel-browser-use-starter
"""

import os
import time
import asyncio
from dotenv import load_dotenv
from steel import Steel
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatOpenAI

load_dotenv()

# Replace with your own API keys
STEEL_API_KEY = os.getenv("STEEL_API_KEY") or "your-steel-api-key-here"
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") or "your-openai-api-key-here"

# Replace with your own task
TASK = os.getenv("TASK") or "Go to Wikipedia and search for machine learning"


async def main():
    print("🚀 Steel + Browser Use Assistant")
    print("=" * 60)

    if STEEL_API_KEY == "your-steel-api-key-here":
        print("⚠️  WARNING: Please replace 'your-steel-api-key-here' with your actual Steel API key")
        print("   Get your API key at: https://app.steel.dev/settings/api-keys")
        return

    if OPENAI_API_KEY == "your-openai-api-key-here":
        print("⚠️  WARNING: Please replace 'your-openai-api-key-here' with your actual OpenAI API key")
        print("   Get your API key at: https://platform.openai.com/api-keys")
        return

    print("\nStarting Steel browser session...")

    client = Steel(steel_api_key=STEEL_API_KEY)

    try:
        session = client.sessions.create()
        print("✅ Steel browser session started!")
        print(f"View live session at: {session.session_viewer_url}")

        print(
            f"\033[1;93mSteel Session created!\033[0m\n"
            f"View session at \033[1;37m{session.session_viewer_url}\033[0m\n"
        )

        cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"

        model = ChatOpenAI(model="gpt-4o", temperature=0.3, api_key=OPENAI_API_KEY)
        agent = Agent(task=TASK, llm=model, browser_session=BrowserSession(cdp_url=cdp_url))

        start_time = time.time()

        print(f"🎯 Executing task: {TASK}")
        print("=" * 60)

        try:
            result = await agent.run()

            duration = f"{(time.time() - start_time):.1f}"

            print("\n" + "=" * 60)
            print("🎉 TASK EXECUTION COMPLETED")
            print("=" * 60)
            print(f"⏱️  Duration: {duration} seconds")
            print(f"🎯 Task: {TASK}")
            if result:
                print(f"📋 Result:\n{result}")
            print("=" * 60)

        except Exception as e:
            print(f"❌ Task execution failed: {e}")
        finally:
            if session:
                print("Releasing Steel session...")
                client.sessions.release(session.id)
                print(f"Session completed. View replay at {session.session_viewer_url}")
            print("Done!")

    except Exception as e:
        print(f"❌ Failed to start Steel browser: {e}")
        print("Please check your STEEL_API_KEY and internet connection.")


if __name__ == "__main__":
    asyncio.run(main())

Save this as main.py and run it with:

Terminal

$ python main.py

Customizing your agent's task

Try modifying the task in your .env file to make your agent perform different actions. Each example below is an alternative, so keep only one TASK line at a time:

ENV

.env

# Search for weather information
TASK="Go to https://weather.com, search for 'San Francisco', and tell me today's forecast."

# Research product information
TASK="Go to https://www.amazon.com, search for 'wireless headphones', and summarize the features of the first product."

# Visit a documentation site
TASK="Go to https://docs.steel.dev, find information about the Steel API, and summarize the key features."
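
If you switch tasks often, editing .env each time can get tedious. The sketch below shows one possible approach: take the task from the command line when one is given, then fall back to the TASK variable, then to the default from the script. `resolve_task` is a hypothetical helper and is not part of browser-use or the Steel SDK.

```python
DEFAULT_TASK = "Go to Wikipedia and search for machine learning"


def resolve_task(argv, env, default=DEFAULT_TASK):
    """Pick the task from CLI arguments, then the TASK variable, then the default."""
    # Join all arguments so the task works with or without shell quotes.
    cli_task = " ".join(argv).strip()
    if cli_task:
        return cli_task
    env_task = (env.get("TASK") or "").strip()
    return env_task or default
```

In main.py this could be wired up as `TASK = resolve_task(sys.argv[1:], os.environ)` after `load_dotenv()`, which would let you run `python main.py "Go to https://docs.steel.dev and summarize the key features"` without touching .env.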

Congratulations! You've successfully connected a Steel browser session with browser-use to automate a task with AI.