Quickstart
A step-by-step guide to connecting Steel with Browser-use.
This guide walks you through connecting a Steel cloud browser session with the browser-use framework, enabling an AI agent to interact with websites.
Prerequisites
Ensure you have the following:
- Python 3.11 or higher
- Steel API key (sign up at app.steel.dev)
- OpenAI API key (sign up at platform.openai.com)
Step 1: Set up your environment
First, create a project directory, set up a virtual environment, and install the required packages:
# Create a project directory
mkdir steel-browser-use-agent
cd steel-browser-use-agent

# Recommended: Create and activate a virtual environment
uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate

# Install required packages
pip install steel-sdk browser-use python-dotenv
Create a .env file with your API keys:
STEEL_API_KEY=your_steel_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
TASK=Go to Wikipedia and search for machine learning
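Before moving on, it can help to confirm that both keys are actually visible to Python. The following is a minimal sketch, not part of the Steel or browser-use APIs: `missing_env_vars` is a hypothetical helper, and it inspects only the process environment, so call `load_dotenv()` first if you rely on the .env file.

```python
import os

# Variable names taken from the .env file above.
REQUIRED_VARS = ("STEEL_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Checks the process environment only; values in .env are picked up
    # only after python-dotenv's load_dotenv() has been called.
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the check easy to try without touching your real environment.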
Step 2: Create a Steel browser session
Use the Steel SDK to start a new browser session for your agent:
import os
from steel import Steel
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
STEEL_API_KEY = os.getenv("STEEL_API_KEY") or "your-steel-api-key-here"

# Validate API key
if STEEL_API_KEY == "your-steel-api-key-here":
    print("⚠️ WARNING: Please replace with your actual Steel API key")
    print(" Get your API key at: https://app.steel.dev/settings/api-keys")
    # Stop here until a real key is configured
    raise SystemExit(1)

# Create a Steel browser session
client = Steel(steel_api_key=STEEL_API_KEY)
session = client.sessions.create()

print("✅ Steel browser session started!")
print(f"View live session at: {session.session_viewer_url}")
This creates a new browser session in Steel's cloud. The session_viewer_url lets you watch your agent's actions in real time.
Step 3: Define your browser session
Connect the browser-use BrowserSession class to your Steel session using the CDP URL:
from browser_use import Agent, BrowserSession

# Connect browser-use to the Steel session
cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"
browser_session = BrowserSession(cdp_url=cdp_url)
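The CDP URL above is built with plain string interpolation, which works as long as the key and session ID contain no characters that need escaping. As a more defensive sketch, the query string can be assembled with the standard library instead. `build_cdp_url` is a hypothetical helper, not part of the Steel SDK; the host and the `apiKey` and `sessionId` parameter names are taken from the snippet above.

```python
from urllib.parse import quote, urlencode


def build_cdp_url(api_key, session_id, base="wss://connect.steel.dev"):
    """Build the Steel CDP URL with percent-encoded query parameters.

    Hypothetical helper for illustration; the host and parameter names
    mirror the f-string used in Step 3.
    """
    query = urlencode(
        {"apiKey": api_key, "sessionId": session_id},
        quote_via=quote,  # encode spaces as %20 rather than "+"
    )
    return f"{base}?{query}"


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(build_cdp_url("your-steel-api-key-here", "your-session-id"))
```

Because the resulting URL embeds your API key, avoid printing or logging it outside of local debugging.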
Step 4: Define your AI Agent
Now bring it all together: define the agent with the browser session, task, and LLM it should use.
# After setting up the browser session
from browser_use import Agent
from browser_use.llm import ChatOpenAI

# Create a ChatOpenAI model for agent reasoning
model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.3,
    api_key=os.getenv('OPENAI_API_KEY')
)

# Define the task for the agent
task = os.getenv("TASK") or "Go to Wikipedia and search for machine learning"

# Create the agent with the task, model, and browser session
agent = Agent(
    task=task,
    llm=model,
    browser_session=browser_session,
)
This configures the AI agent with:
- An OpenAI model for reasoning
- The browser session instance from Step 3
- A specific task to perform
Models: This example uses GPT-4o, but you can use any browser-use compatible model, such as those from Anthropic, DeepSeek, or Gemini. See the browser-use documentation for the full list of supported models.
Step 5: Run your Agent
import asyncio
import time

# Define the main function with the agent execution
async def main():
    try:
        start_time = time.time()

        print(f"🎯 Executing task: {task}")
        print("=" * 60)

        # Run the agent
        result = await agent.run()

        duration = f"{(time.time() - start_time):.1f}"

        print("\n" + "=" * 60)
        print("🎉 TASK EXECUTION COMPLETED")
        print("=" * 60)
        print(f"⏱️ Duration: {duration} seconds")
        print(f"🎯 Task: {task}")
        if result:
            print(f"📋 Result:\n{result}")
        print("=" * 60)

    except Exception as e:
        print(f"❌ Task execution failed: {e}")
    finally:
        # Clean up resources
        if session:
            print("Releasing Steel session...")
            client.sessions.release(session.id)
            print(f"Session completed. View replay at {session.session_viewer_url}")
        print("Done!")

# Run the async main function
if __name__ == '__main__':
    asyncio.run(main())
The agent connects to the Steel browser session and interacts with it to complete the task. Whether the task succeeds or fails, the finally block releases the Steel session so it does not stay open after the script exits.
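One way to make the release step harder to forget is to wrap session creation in a context manager. This is a sketch under stated assumptions rather than a feature of the Steel SDK: `steel_session` is a hypothetical helper that relies only on the `client.sessions.create()` and `client.sessions.release(session.id)` calls used in this guide. A stand-in client is included so the example runs without an API key.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def steel_session(client):
    """Create a session and always release it, even if the body raises.

    Hypothetical helper; it assumes only the create/release calls shown
    in this guide.
    """
    session = client.sessions.create()
    try:
        yield session
    finally:
        client.sessions.release(session.id)


# Stand-in client so the sketch runs without an API key. Replace it with
# the real client from Step 2: Steel(steel_api_key=STEEL_API_KEY).
released = []
fake_client = SimpleNamespace(
    sessions=SimpleNamespace(
        create=lambda: SimpleNamespace(id="demo-session"),
        release=released.append,
    )
)

with steel_session(fake_client) as session:
    print(f"Working with session {session.id}")

print(f"Released sessions: {released}")
```

The same `with` block can be used inside an async function, so `await agent.run()` can sit in its body.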
Complete example
Here's the complete script that puts all steps together:
"""
AI-powered browser automation using browser-use library with Steel browsers.
https://github.com/steel-dev/steel-cookbook/tree/main/examples/steel-browser-use-starter
"""

import os
import time
import asyncio
from dotenv import load_dotenv
from steel import Steel
from browser_use import Agent, BrowserSession
from browser_use.llm import ChatOpenAI

load_dotenv()

# Replace with your own API keys
STEEL_API_KEY = os.getenv("STEEL_API_KEY") or "your-steel-api-key-here"
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") or "your-openai-api-key-here"

# Replace with your own task
TASK = os.getenv("TASK") or "Go to Wikipedia and search for machine learning"


async def main():
    print("🚀 Steel + Browser Use Assistant")
    print("=" * 60)

    if STEEL_API_KEY == "your-steel-api-key-here":
        print("⚠️ WARNING: Please replace 'your-steel-api-key-here' with your actual Steel API key")
        print(" Get your API key at: https://app.steel.dev/settings/api-keys")
        return

    if OPENAI_API_KEY == "your-openai-api-key-here":
        print("⚠️ WARNING: Please replace 'your-openai-api-key-here' with your actual OpenAI API key")
        print(" Get your API key at: https://platform.openai.com/api-keys")
        return

    print("\nStarting Steel browser session...")

    client = Steel(steel_api_key=STEEL_API_KEY)

    try:
        session = client.sessions.create()
        print("✅ Steel browser session started!")
        print(f"View live session at: {session.session_viewer_url}")

        print(
            f"\033[1;93mSteel Session created!\033[0m\n"
            f"View session at \033[1;37m{session.session_viewer_url}\033[0m\n"
        )

        cdp_url = f"wss://connect.steel.dev?apiKey={STEEL_API_KEY}&sessionId={session.id}"

        model = ChatOpenAI(model="gpt-4o", temperature=0.3, api_key=OPENAI_API_KEY)
        agent = Agent(task=TASK, llm=model, browser_session=BrowserSession(cdp_url=cdp_url))

        start_time = time.time()

        print(f"🎯 Executing task: {TASK}")
        print("=" * 60)

        try:
            result = await agent.run()

            duration = f"{(time.time() - start_time):.1f}"

            print("\n" + "=" * 60)
            print("🎉 TASK EXECUTION COMPLETED")
            print("=" * 60)
            print(f"⏱️ Duration: {duration} seconds")
            print(f"🎯 Task: {TASK}")
            if result:
                print(f"📋 Result:\n{result}")
            print("=" * 60)

        except Exception as e:
            print(f"❌ Task execution failed: {e}")
        finally:
            if session:
                print("Releasing Steel session...")
                client.sessions.release(session.id)
                print(f"Session completed. View replay at {session.session_viewer_url}")
            print("Done!")

    except Exception as e:
        print(f"❌ Failed to start Steel browser: {e}")
        print("Please check your STEEL_API_KEY and internet connection.")


if __name__ == "__main__":
    asyncio.run(main())
Save this as main.py and run it with:
python main.py
Customizing your agent's task
Try modifying the task to make your agent perform different actions:
# Search for weather information
TASK = "Go to https://weather.com, search for 'San Francisco', and tell me today's forecast."

# Research product information
TASK = "Go to https://www.amazon.com, search for 'wireless headphones', and summarize the features of the first product."

# Visit a documentation site
TASK = "Go to https://docs.steel.dev, find information about the Steel API, and summarize the key features."
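If you switch tasks often, editing the script each time gets tedious. As a minimal sketch, the task could instead be taken from the command line, falling back to the TASK environment variable and then to the default used in this guide. `resolve_task` and `DEFAULT_TASK` are hypothetical names introduced here for illustration.

```python
import os
import sys

# Default task string used throughout this guide.
DEFAULT_TASK = "Go to Wikipedia and search for machine learning"


def resolve_task(argv, env):
    """Pick the task: command-line arguments first, then TASK, then the default."""
    if argv:
        # Join the arguments so the task can be passed with or without quotes.
        return " ".join(argv)
    return env.get("TASK") or DEFAULT_TASK


if __name__ == "__main__":
    TASK = resolve_task(sys.argv[1:], os.environ)
    print(f"Task: {TASK}")
```

With this in place of the TASK assignment in main.py, the script could be run as `python main.py "Go to https://docs.steel.dev and summarize the key features."`.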
Congratulations! You've successfully connected a Steel browser session with browser-use to automate a task with AI.