Quickstart
A step-by-step guide to connecting Steel with Browser-use.
This guide walks you through connecting a Steel cloud browser session with the browser-use framework, enabling an AI agent to interact with websites.
Prerequisites
Ensure you have the following:
- Python 3.11 or higher
- Steel API key (sign up at app.steel.dev)
- OpenAI API key (sign up at platform.openai.com)
Step 1: Set up your environment
First, create a project directory, set up a virtual environment, and install the required packages:
# Create a project directory
mkdir steel-browser-use-agent
cd steel-browser-use-agent
# Recommended: Create and activate a virtual environment
uv venv
source .venv/bin/activate # On Windows, use: .venv\Scripts\activate
# Install required packages
# (uv venv does not add pip to the environment by default, so use uv's pip interface)
uv pip install steel-sdk browser-use langchain-openai python-dotenv
Create a .env file with your API keys:
STEEL_API_KEY=your_steel_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
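A missing or blank key in .env tends to surface later as an authentication error that is harder to trace. As a minimal sketch, a fail-fast check could look like the following. The helper name missing_keys is hypothetical and not part of the Steel or browser-use APIs; it uses only the standard library.

```python
import os

# The two keys this guide expects to find in .env
REQUIRED_KEYS = ("STEEL_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are absent or blank in `env`.

    `env` defaults to os.environ; pass a plain dict to check other mappings.
    """
    env = os.environ if env is None else env
    return [key for key in required if not (env.get(key) or "").strip()]
```

Calling missing_keys() right after load_dotenv() and exiting when the returned list is non-empty would stop the script before any Steel session is created.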
Step 2: Create a Steel browser session
Use the Steel SDK to start a new browser session for your agent:
import os

from steel import Steel
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
STEEL_API_KEY = os.getenv("STEEL_API_KEY")
# Create a Steel browser session
client = Steel(steel_api_key=STEEL_API_KEY)
session = client.sessions.create()
print(f"View live session at: {session.session_viewer_url}")
This creates a new browser session in Steel's cloud. The session_viewer_url allows you to watch your agent's actions in real time.
Step 3: Define Your Browser and Browser Context
Connect the browser-use browser & context to your Steel session using the CDP URL:
from browser_use.browser.browser import Browser, BrowserConfig
from browser_use.browser.context import BrowserContext
# Connect browser-use to the Steel session
cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"
browser = Browser(config=BrowserConfig(cdp_url=cdp_url))
browser_context = BrowserContext(browser=browser)
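The CDP URL above embeds the API key as a query parameter, so it helps to build it with proper escaping and to mask the key before logging it. The sketch below is one way to do that with the standard library; build_cdp_url and redact_api_key are hypothetical helpers, and only the wss://connect.steel.dev host and the apiKey and sessionId parameter names come from this guide.

```python
from urllib.parse import quote_plus, urlencode

STEEL_CONNECT_URL = "wss://connect.steel.dev"  # host used in this guide


def build_cdp_url(api_key: str, session_id: str) -> str:
    """Build the CDP URL with percent-encoded query parameters."""
    query = urlencode({"apiKey": api_key, "sessionId": session_id})
    return f"{STEEL_CONNECT_URL}?{query}"


def redact_api_key(url: str, api_key: str) -> str:
    """Return `url` with the (encoded) API key masked, for safe logging."""
    if not api_key:
        return url
    # urlencode escapes values with quote_plus, so match the escaped form.
    return url.replace(quote_plus(api_key), "***")
```

The result of build_cdp_url(STEEL_API_KEY, session.id) can be passed as cdp_url exactly as in the snippet above, while redact_api_key keeps the key out of console output and log files.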
Step 4: Define your AI Agent
Here we bring it all together by giving the agent its task, LLM, browser, and browser context.
# After setting up the browser
from browser_use import Agent
from langchain_openai import ChatOpenAI
# Create a ChatOpenAI model for agent reasoning
model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.3,
    api_key=os.getenv('OPENAI_API_KEY')
)
# Define the task for the agent
task = "Go to docs.steel.dev, open the changelog, and tell me what's new."
# Create the agent with the task, model, browser, and browser context
agent = Agent(
    task=task,
    llm=model,
    browser=browser,
    browser_context=browser_context,
)
This configures the AI agent with:
- An OpenAI model for reasoning
- The browser instance from Step 3
- A specific task to perform
Supported Models:
This example uses GPT-4o, but you can use any browser-use-compatible model, such as those from Anthropic, DeepSeek, or Gemini. See the full list of supported models here.
Step 5: Run your Agent
Execute the agent and handle cleanup:
import asyncio

# Define the main function with the agent execution
async def main():
    try:
        # Run the agent
        print("Running the agent...")
        await agent.run()
        print("Task completed!")
    finally:
        # Clean up resources, even if the agent raised an error
        await browser.close()
        client.sessions.release(session.id)
        print("Resources cleaned up")

# Run the async main function
if __name__ == '__main__':
    asyncio.run(main())
Run using the following command: python main.py
The agent connects to the Steel browser session created in Step 2 and interacts with it to complete the task. When the run ends, whether it succeeded or failed, close the browser and release the Steel session so the session does not keep running.
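To see why the cleanup belongs in a finally block, the sketch below simulates a failing run with stand-in objects. FakeAgent, FakeBrowser, and run_with_cleanup are hypothetical names used only for illustration; they are not browser-use or Steel classes, and no network access is involved.

```python
import asyncio


class FakeAgent:
    """Stand-in for an agent whose run() fails partway through."""

    async def run(self):
        raise RuntimeError("simulated agent failure")


class FakeBrowser:
    """Stand-in for a browser that records when it is closed."""

    def __init__(self, log):
        self.log = log

    async def close(self):
        self.log.append("browser closed")


async def run_with_cleanup(agent, browser, release_session, log):
    """Run the agent, then always close the browser and release the session."""
    try:
        await agent.run()
        log.append("task completed")
    except Exception as exc:
        log.append(f"error: {exc}")
    finally:
        try:
            await browser.close()
        finally:
            # Nested so the session is released even if close() raises.
            release_session()


def demo():
    """Run the simulation and return the recorded events in order."""
    log = []
    asyncio.run(
        run_with_cleanup(
            FakeAgent(),
            FakeBrowser(log),
            lambda: log.append("session released"),
            log,
        )
    )
    return log
```

In the real script, release_session would wrap client.sessions.release(session.id). The nested try/finally is a design choice, so that a failure while closing the browser does not leave the cloud session open.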
Complete example
Here's the complete script that puts all steps together:
import os
import asyncio
from dotenv import load_dotenv
from browser_use import Agent
from steel import Steel
from langchain_openai import ChatOpenAI
from browser_use.browser.browser import Browser, BrowserConfig
from browser_use.browser.context import BrowserContext
# Load environment variables
load_dotenv()
STEEL_API_KEY = os.getenv('STEEL_API_KEY')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
# Initialize the Steel client
client = Steel(steel_api_key=STEEL_API_KEY)
# Create a Steel session
print("Creating Steel session...")
session = client.sessions.create()
print(f"Session created at {session.session_viewer_url}")
# Connect browser-use to Steel
cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"
browser = Browser(config=BrowserConfig(cdp_url=cdp_url))
browser_context = BrowserContext(browser=browser)
# Create and configure the AI agent
model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.3,
    api_key=OPENAI_API_KEY
)
task = "Go to docs.steel.dev, open the changelog, and tell me what's new."
agent = Agent(
    task=task,
    llm=model,
    browser=browser,
    browser_context=browser_context,
)

async def main():
    try:
        # Run the agent
        print("Running the agent...")
        await agent.run()
        print("Task completed!")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Clean up resources
        if browser:
            await browser.close()
            print("Browser closed")
        if session:
            client.sessions.release(session.id)
            print("Session released")
        print("Done!")

# Run the async main function
if __name__ == '__main__':
    asyncio.run(main())
Save this as main.py and run it with:
python main.py
Customizing your agent's task
Try modifying the task to make your agent perform different actions:
# Search for weather information
task = "Go to https://weather.com, search for 'San Francisco', and tell me today's forecast."
# Research product information
task = "Go to https://www.amazon.com, search for 'wireless headphones', and summarize the features of the first product."
# Visit a documentation site
task = "Go to https://docs.steel.dev, find information about the Steel API, and summarize the key features."
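Rather than editing the script for each new task, the task string could be read from the command line. This is a minimal sketch using the standard library's argparse; parse_task is a hypothetical helper and the default is the task used earlier in this guide.

```python
import argparse

DEFAULT_TASK = "Go to docs.steel.dev, open the changelog, and tell me what's new."


def parse_task(argv=None):
    """Return the task given on the command line, or the default task.

    `argv` defaults to sys.argv[1:] when left as None.
    """
    parser = argparse.ArgumentParser(description="Run a browser agent task.")
    parser.add_argument(
        "task",
        nargs="?",  # optional positional argument
        default=DEFAULT_TASK,
        help="natural-language instruction for the agent",
    )
    return parser.parse_args(argv).task
```

With task = parse_task() in main.py, a run such as python main.py "Go to https://weather.com and tell me today's forecast." would use the supplied instruction, and a bare python main.py would fall back to the default.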
Congratulations! You've successfully connected a Steel browser session with browser-use to create an AI web agent.