Build a browser agent with Browser Use
Integrate Steel with the browser-use framework for AI-driven web automation.
Browser Use is an agent framework: you give it an LLM, a browser, and a natural-language task, and it runs a perception-plan-act loop until the task is done. It doesn't need selectors or scripted steps. The model reads the page, decides the next action, and executes it against the browser you give it. The browser in this recipe is a Steel session, so the agent runs on managed cloud Chrome with stealth, proxies, and a live viewer instead of local Chromium.
session = client.sessions.create()cdp_url = f"{session.websocket_url}&apiKey={STEEL_API_KEY}"model = ChatOpenAI(model="gpt-5", api_key=OPENAI_API_KEY)agent = Agent(task=TASK,llm=model,browser_session=BrowserSession(cdp_url=cdp_url),)result = await agent.run()
BrowserSession(cdp_url=...) is the entire integration. Browser Use attaches to whatever Chrome is speaking the Chrome DevTools Protocol at that URL. No launcher, no playwright.chromium.launch(), no local browser binary.
Run it
cd examples/browser-usecp .env.example .env # set STEEL_API_KEY and OPENAI_API_KEYuv run main.py
Keys from app.steel.dev and platform.openai.com. A session viewer URL prints as the script starts. Open it in another tab to watch the agent click through the task in real time.
Your output varies. Structure looks like this:
Starting Steel browser session...Steel Session created!View session at https://app.steel.dev/sessions/ab12cd34…Executing task: Go to Wikipedia and search for machine learning============================================================INFO [Agent] Step 1: navigate to https://www.wikipedia.orgINFO [Agent] Step 2: input "machine learning" into search boxINFO [Agent] Step 3: click search buttonINFO [Agent] Step 4: done============================================================TASK EXECUTION COMPLETEDDuration: 38.2 secondsReleasing Steel session...Session completed. View replay at https://app.steel.dev/sessions/ab12cd34…
A run costs a few cents of Steel session time plus OpenAI tokens for each step the agent takes (screenshots go to the model on every iteration, so token usage scales with task length). The finally block that calls client.sessions.release() isn't optional. Steel bills per session-minute and skipping release keeps the browser running until the default 5-minute timeout.
Make it yours
- Change the task. Set
TASKin.envor edit the default inmain.py. Any sentence works: "log in to example.com and download the latest invoice PDF", "compare prices for the top 3 vacuum cleaners on Amazon", "fill out the contact form at acme.com with these fields". Long tasks are fine; the agent breaks them into steps on its own. - Turn on stealth. Add
use_proxy=True,solve_captcha=True, orsession_timeout=1800000to thesessions.create()call for sites with anti-bot. Tasks that navigate logged-in areas usually need longer timeouts than the 5-minute default. - Swap the model.
ChatOpenAI(model="gpt-5", ...)is the default; Browser Use also shipsChatAnthropic,ChatGoogle, and others. Change the import and themodelarg passed toAgent. - Persist login. Reuse cookies and local storage across runs via credentials so the agent doesn't have to sign in every time.
Related
Related recipes
Solve reCAPTCHA v2 manually with Browser Use
Manually solve reCAPTCHA v2 using Steel's CAPTCHA API with the browser-use framework.
Solve CAPTCHAs automatically in a Browser Use agent
Build an AI agent with browser-use and Steel that solves CAPTCHAs automatically.
Build a browser agent with the Claude Agent SDK
Use Steel with the Claude Agent SDK (TypeScript) to build a tool-using browser agent on Anthropic's first-party agent loop.