Stream a browser agent into a Next.js chat app

A Next.js App Router chat app where a Vercel AI SDK agent drives a Steel cloud browser with embedded Live View.

examples/vercel-ai-sdk-nextjs

Scaffolds a starter project locally. Requires the Steel CLI.

A Next.js chat app where an AI SDK v6 agent drives a Steel cloud browser server-side and streams every tool call back into the UI. useChat on the client posts to /api/chat; that route calls streamText with four Steel-backed tools (openSession, navigate, snapshot, extract). Each tool call surfaces as a typed tool-* part on the message stream, and a Live View iframe on the right lights up the moment the agent opens a session.
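To make the typed tool-* parts concrete, here is a minimal sketch of how a page could turn them into status lines and decide when to show the Live View. The ToolPart type is a simplified, hand-written stand-in for the SDK's own part types (the SDK may define further states), and describePart, findLiveViewUrl, and the liveViewUrl output field are hypothetical names chosen for illustration.

```typescript
// Simplified stand-in for the SDK's tool UI parts; the real types live in the "ai" package.
type ToolState = "input-streaming" | "input-available" | "output-available" | "output-error";

interface ToolPart {
  type: `tool-${string}`;
  state: ToolState;
  input?: unknown;
  output?: unknown;
  errorText?: string;
}

interface TextPart {
  type: "text";
  text: string;
}

type Part = ToolPart | TextPart;

function isToolPart(part: Part): part is ToolPart {
  return part.type.startsWith("tool-");
}

// One status line per tool call, suitable for a tool-call panel.
function describePart(part: ToolPart): string {
  const name = part.type.slice("tool-".length);
  switch (part.state) {
    case "input-streaming":
    case "input-available":
      return `${name}: running`;
    case "output-available":
      return `${name}: done`;
    case "output-error":
      return `${name}: failed (${part.errorText ?? "unknown error"})`;
  }
}

// Assumption: openSession returns { liveViewUrl } in its output (hypothetical field name).
function findLiveViewUrl(parts: Part[]): string | null {
  for (const part of parts) {
    if (isToolPart(part) && part.type === "tool-openSession" && part.state === "output-available") {
      const output = part.output as { liveViewUrl?: string } | undefined;
      if (output?.liveViewUrl) return output.liveViewUrl;
    }
  }
  return null;
}
```

A page component could call findLiveViewUrl over the latest assistant message's parts and set the iframe src once it returns a non-null value.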

app/
├── api/chat/route.ts   # streamText + Steel tools, Node runtime
├── page.tsx            # useChat, tool-call rendering, Live View iframe
├── layout.tsx          # Geist fonts, dark theme
└── globals.css

app/api/chat/route.ts pins runtime = "nodejs" (Playwright will not run on Edge) and maxDuration = 120. next.config.mjs lists playwright, playwright-core, and steel-sdk under serverExternalPackages so Next skips bundling them into the server build.
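As a sketch, the settings described above might look like the following. The values come from the text; the surrounding file contents in the example repo may differ, and the two snippets belong to separate files even though they are shown together here.

```typescript
// app/api/chat/route.ts: route segment config
export const runtime = "nodejs"; // Playwright needs Node APIs, so the Edge runtime is out
export const maxDuration = 120; // upper bound, in seconds, on how long the function may run

// next.config.mjs: keep these packages out of the server bundle
const nextConfig = {
  serverExternalPackages: ["playwright", "playwright-core", "steel-sdk"],
};

export default nextConfig;
```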

The POST handler builds a per-request closure around three variables (session, browser, page) that every tool's execute shares, and streamText runs for at most 15 steps (stopWhen: stepCountIs(15)). Both onFinish and onAbort call cleanup(), which closes the browser and releases the session with steel.sessions.release(session.id).
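The closure-plus-cleanup pattern can be sketched without any SDK imports. In the version below, SessionLike and BrowserLike are stand-ins for the Steel session and Playwright browser, createRequestState is a hypothetical helper, and page is left out for brevity. Making cleanup idempotent is an assumption of this sketch, not something the example is confirmed to do.

```typescript
// Stand-ins for the Steel session and Playwright browser; only the members used here.
interface SessionLike {
  id: string;
}

interface BrowserLike {
  close(): Promise<void>;
}

// Creates the state one POST request would share across its tools.
function createRequestState(releaseSession: (id: string) => Promise<void>) {
  let session: SessionLike | null = null;
  let browser: BrowserLike | null = null;
  let cleanedUp = false;

  return {
    // Called by openSession once both handles exist.
    attach(s: SessionLike, b: BrowserLike): void {
      session = s;
      browser = b;
    },

    // Safe to call from both onFinish and onAbort, and when no session was ever opened.
    async cleanup(): Promise<void> {
      if (cleanedUp) return;
      cleanedUp = true;
      try {
        await browser?.close();
      } finally {
        // Release the cloud session even if closing the browser threw.
        if (session) await releaseSession(session.id);
      }
    },
  };
}
```

In the route, releaseSession would wrap steel.sessions.release, and the same cleanup function would be handed to both onFinish and onAbort.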

A fifth tool, submitForm, carries needsApproval: true. Its body only runs if your approval UI confirms the call. It ships as a demo hook; wire it up when you add real destructive actions.

Phase-gating with prepareStep

Tool misuse by the model is a real failure mode. A second openSession mid-run would leak a browser; a navigate before any session exists throws. prepareStep constrains the active tool set per step:

prepareStep: async ({ stepNumber, steps }) => {
  const sessionOpened = steps.some((s) =>
    s.toolCalls?.some((tc) => tc.toolName === "openSession")
  );
  if (stepNumber === 0 || !sessionOpened) {
    return { activeTools: ["openSession"] };
  }
  return { activeTools: ["navigate", "snapshot", "extract", "submitForm"] };
},
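The gating rule is easier to test when pulled out into a pure function. The sketch below mirrors the logic above; activeToolsFor and the StepLike type are hypothetical names, and StepLike models only the one field the gate reads.

```typescript
type ToolName = "openSession" | "navigate" | "snapshot" | "extract" | "submitForm";

// Only the field the gate reads; real step results carry much more.
interface StepLike {
  toolCalls?: { toolName: string }[];
}

// Phase 1: only openSession is allowed until some earlier step has called it.
// Phase 2: openSession is withdrawn and the page-level tools become available.
function activeToolsFor(stepNumber: number, steps: StepLike[]): ToolName[] {
  const sessionOpened = steps.some((s) =>
    s.toolCalls?.some((tc) => tc.toolName === "openSession")
  );
  if (stepNumber === 0 || !sessionOpened) {
    return ["openSession"];
  }
  return ["navigate", "snapshot", "extract", "submitForm"];
}

// Possible wiring inside streamText:
//   prepareStep: async ({ stepNumber, steps }) => ({
//     activeTools: activeToolsFor(stepNumber, steps),
//   }),
```

Because openSession never appears in the second phase, the model cannot open a second browser mid-run, and because nothing else appears in the first, navigate cannot run before a session exists.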

Run it

cd examples/vercel-ai-sdk-nextjs
cp .env.example .env # set STEEL_API_KEY and ANTHROPIC_API_KEY
npm install
npx playwright install chromium
npm run dev

Get keys at app.steel.dev/settings/api-keys and console.anthropic.com. Open the local URL printed by npm run dev (Next.js defaults to http://localhost:3000) and try one of the seeded prompts:

Go to https://github.com/trending/python and tell me the top 3 AI/ML repos.

A typical run takes ~20 seconds: openSession (~3s), navigate (~2s), snapshot (~1s), extract (~1s), then the model writes its reply. The server console logs each step:

step: openSession | 412 tokens
step: navigate | 1083 tokens
step: snapshot | 2847 tokens
step: extract | 3104 tokens
step: (text) | 3298 tokens
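Log lines in this shape could be produced from a per-step callback. The sketch below assumes each step exposes its toolCalls and a usage.totalTokens count; StepLogInput is a simplified stand-in for the SDK's step result, and formatStepLog is a hypothetical helper, not the example's actual logging code.

```typescript
// Only the fields the formatter reads.
interface StepLogInput {
  toolCalls: { toolName: string }[];
  usage: { totalTokens?: number };
}

// Formats one step as "step: <tool names or (text)> | <n> tokens".
function formatStepLog(step: StepLogInput): string {
  const label =
    step.toolCalls.length > 0
      ? step.toolCalls.map((tc) => tc.toolName).join(", ")
      : "(text)"; // a step with no tool calls is the model writing its reply
  return `step: ${label} | ${step.usage.totalTokens ?? 0} tokens`;
}

// Possible wiring inside streamText:
//   onStepFinish: (step) => console.log(formatStepLog(step)),
```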

Deploying to Vercel

Push to GitHub, import the repo into Vercel, and add STEEL_API_KEY and ANTHROPIC_API_KEY as environment variables. Playwright's Chromium has to be downloaded during the build, so set the Build Command to:

npx playwright install chromium && next build

The /api/chat route already declares maxDuration = 120 and runtime = "nodejs".

Make it yours

  • Change the model. Swap anthropic("claude-haiku-4-5") for any model in @ai-sdk/*. The Zod tool schemas stay the same.
  • Add a screenshot tool. await page.screenshot({ type: "png" }) returns a Buffer; return it base64-encoded and render it as an <img> in the tool-call panel.
  • Stream a plan step. Add a plan tool with no side effects and a string input. The model can narrate its intent before executing.
  • Turn on stealth. Pass useProxy, solveCaptcha, or sessionTimeout options to steel.sessions.create() inside openSession.
  • Wire the approval UI. needsApproval: true on submitForm pauses execution and surfaces the call as a tool-submitForm part in state: "approval-requested". Render an Approve/Reject pair and call addToolApprovalResponse, returned by useChat from @ai-sdk/react, to resume.
