Build an AI browser agent with Magnitude
Use Steel with Magnitude for AI-powered browser automation.
Scaffolds a starter project locally. Requires the Steel CLI.
Magnitude grew out of end-to-end testing and kept the bias: an agent loop that narrates each turn, a CDP-level browser hookup, and LLM-backed primitives designed to intermix navigation, action, and typed readback. startBrowserAgent() hands you a BrowserAgent with a small surface this recipe exercises:
- agent.extract(instruction, schema): describe what to pull off the page, pass a Zod schema, get a typed result.
- agent.act(instruction): describe an interaction in natural language. The agent plans, clicks, types, retries.
- agent.stop(): flush and tear down. Pair with client.sessions.release() in a finally.
    const agent = await startBrowserAgent({
      url: "https://github.com/steel-dev/leaderboard",
      narrate: true,
      telemetry: false,
      llm: {
        provider: "anthropic",
        options: {
          model: "claude-sonnet-4-6",
          apiKey: ANTHROPIC_API_KEY,
        },
      },
      browser: {
        cdp: `${session.websocketUrl}&apiKey=${STEEL_API_KEY}`,
      },
    });
browser.cdp is the whole wiring. narrate: true streams a log of what the agent is doing between screenshot turns. The url option does the first navigation, so there is no separate goto call in main().
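The CDP string can be built in one place so that a missing key fails early rather than at connect time. This is a minimal sketch: buildCdpUrl is a hypothetical helper, not part of Steel or Magnitude, and it assumes (as the snippet above does) that session.websocketUrl usually already carries a query string. URL-encoding the key is an addition of this sketch.

```typescript
// Hypothetical helper: not part of the Steel or Magnitude APIs.
// Builds the value passed as browser.cdp from a Steel session's
// websocket URL and the Steel API key.
function buildCdpUrl(websocketUrl: string, steelApiKey: string): string {
  if (!steelApiKey) {
    throw new Error("STEEL_API_KEY is not set");
  }
  // The recipe appends with "&" because the URL already has a query string;
  // fall back to "?" if it does not.
  const separator = websocketUrl.includes("?") ? "&" : "?";
  return `${websocketUrl}${separator}apiKey=${encodeURIComponent(steelApiKey)}`;
}
```

With this in place, the browser option would read browser: { cdp: buildCdpUrl(session.websocketUrl, STEEL_API_KEY) }.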
What the demo does
main() walks a three-step flow against Steel's public leaderboard repo:
1. Extract the user behind the most recent commit:

       const mostRecentCommitter = await agent.extract(
         "Find the user with the most recent commit",
         z.object({
           user: z.string(),
           commit: z.string(),
         }),
       );

2. Act to open the pull request that produced that commit:

       await agent.act(
         "Find the pull request behind the most recent commit if there is one",
       );

3. Extract a prose summary of what the PR changed.
The act call sits in try / catch because the leaderboard head commit is not always tied to a merged PR.
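One way to express that optional step is a small wrapper that turns a failed act into a boolean instead of aborting the run. This is a sketch, not the recipe's actual code: tryAct and the ActCapable interface are hypothetical names, and the interface assumes only that act takes an instruction string and returns a promise.

```typescript
// Minimal shape of the one method this sketch needs. Magnitude's real
// BrowserAgent has a larger surface; this interface is an assumption.
interface ActCapable {
  act(instruction: string): Promise<void>;
}

// Hypothetical helper: runs one optional step and reports whether it
// succeeded, so the caller can decide whether the next step makes sense.
async function tryAct(agent: ActCapable, instruction: string): Promise<boolean> {
  try {
    await agent.act(instruction);
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`Optional step skipped: ${reason}`);
    return false;
  }
}
```

A caller could then run the final extract only when tryAct resolves to true, and otherwise report that no pull request was found.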
Run it
    cd examples/magnitude
    cp .env.example .env  # set STEEL_API_KEY and ANTHROPIC_API_KEY
    npm install
    npm start
Steel keys live at app.steel.dev/settings/api-keys; Anthropic keys at console.anthropic.com.
Your output varies. Structure looks like this:
    Steel + Magnitude Node Starter
    ============================================================
    Creating Steel session...
    Steel Session created!
    View session at https://app.steel.dev/sessions/ab12cd34...
    Connected to browser via Magnitude
    Looking for commits
    [narrate] taking screenshot of github.com/steel-dev/leaderboard
    [narrate] extracting: Find the user with the most recent commit
    Most recent committer:
    alice-dev has the most recent commit
    Looking for pull request behind the most recent commit
    [narrate] clicking commit SHA link
    [narrate] navigating to pull/482
    Found pull request!
    Adds a tie-breaker rule when two contributors have identical scores.
    Automation completed successfully!
    Stopping Magnitude agent...
    Releasing Steel session...
    Steel session released successfully
A full run takes ~45 seconds. The finally block stops the agent first, then releases the session. Reverse that order and Magnitude can try to screenshot a browser Steel already tore down.
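The ordering can be captured in a single teardown function called from the finally block. The sketch below is illustrative only: teardown, Stoppable, and SessionReleaser are hypothetical names that mirror agent.stop() and client.sessions.release() structurally. Swallowing a failed stop so the session is still released is a design choice of this sketch, not something the recipe prescribes.

```typescript
// Structural stand-ins for the two objects involved. The real types come
// from Magnitude and the Steel SDK; these interfaces are assumptions.
interface Stoppable {
  stop(): Promise<void>;
}
interface SessionReleaser {
  release(sessionId: string): Promise<unknown>;
}

// Hypothetical helper: stop the agent first, then release the session.
// Either argument may be null if setup failed before it was created.
async function teardown(
  agent: Stoppable | null,
  sessions: SessionReleaser,
  sessionId: string | null,
): Promise<void> {
  if (agent) {
    try {
      await agent.stop();
    } catch (err) {
      // Log and continue so the session below is not leaked.
      const reason = err instanceof Error ? err.message : String(err);
      console.warn(`Agent stop failed: ${reason}`);
    }
  }
  if (sessionId) {
    await sessions.release(sessionId);
  }
}
```

In main(), the finally block would then reduce to one call such as await teardown(agent, client.sessions, session?.id ?? null), assuming agent and session are declared outside the try.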
Make it yours
- Swap the schema and prompt. extract() is schema-driven: forms, tables, invoices, search results.
- Chain act calls for multi-step flows. Login, filter, paginate, export. Each step is one natural-language instruction.
- Switch models. llm.provider accepts "anthropic" (used here) among others. Point model and apiKey at a different provider in startBrowserAgent().
- Turn on stealth. Uncomment useProxy, solveCaptcha, or sessionTimeout in client.sessions.create() for sites with anti-bot protection.
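Chaining act calls can be as simple as a loop over instructions. The following is a rough sketch under assumptions: runSteps and the Actor interface are hypothetical, the step strings are invented for illustration, and the only thing assumed about the agent is an act method that takes one instruction and returns a promise.

```typescript
// Assumed minimal shape of the agent; Magnitude's BrowserAgent offers more.
interface Actor {
  act(instruction: string): Promise<void>;
}

// Hypothetical helper: runs instructions strictly in order. A rejected
// step propagates, so later steps never run against an unexpected page.
async function runSteps(agent: Actor, steps: readonly string[]): Promise<void> {
  for (const step of steps) {
    await agent.act(step);
  }
}

// Illustrative instructions only; adapt them to the target site.
const exportFlow: readonly string[] = [
  "Log in with the saved credentials",
  "Filter the table to the last 30 days",
  "Go to the next page of results",
  "Click the export button",
];
```

Calling await runSteps(agent, exportFlow) would then perform the four interactions one after another.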
Related recipes
Deep research with Claude Agent SDK subagents
Lead orchestrator dispatches parallel researcher subagents, each driving its own Steel browser, and synthesizes findings into a cited Markdown report.
Build a browser agent with the Claude Agent SDK
Use Steel with the Claude Agent SDK (TypeScript) to build a tool-using browser agent on Anthropic's first-party agent loop.
Build a typed browser agent with Pydantic AI
Use Steel with Pydantic AI to build typed, provider-agnostic browser agents with dependency injection.