Quickstart (TypeScript)

Build a browser agent with the OpenAI Agents SDK for TypeScript and Steel. The agent opens a Steel session, navigates and snapshots the page, optionally extracts structured rows, and returns a Zod-validated final report.

Scroll to the bottom for the full example.

Requirements

Steel API key
OpenAI API key
Node.js 20+

Step 1: Project Setup

Terminal

mkdir steel-openai-agents && \
cd steel-openai-agents && \
npm init -y && \
npm install -D typescript @types/node ts-node && \
npx tsc --init && \
npm pkg set scripts.start="ts-node index.ts" && \
touch index.ts .env

Step 2: Install Dependencies

Terminal

npm install @openai/agents steel-sdk playwright zod dotenv

Step 3: Environment Variables

ENV

.env

STEEL_API_KEY=your-steel-api-key-here
OPENAI_API_KEY=your-openai-api-key-here
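The code in Step 4 reads STEEL_API_KEY with a non-null assertion (`!`), so a missing value is not caught on that line. A minimal sketch of a fail-fast check follows; `requireEnv` is a hypothetical helper, not part of the original example or of either SDK.

```typescript
// Hypothetical helper (an assumption, not from the quickstart): read a required
// variable from an env-like object and throw a clear error if it is unset or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}. Add it to .env before running.`);
  }
  return value;
}

// In index.ts this could stand in for the non-null assertion, for example:
//   const STEEL_API_KEY = requireEnv("STEEL_API_KEY", process.env);
```

Passing the env object in as a parameter keeps the helper easy to test without touching the real process environment.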

Step 4: Define Steel tools

Each tool is a typed tool() with a Zod parameters schema. Browser state (the Steel session + Playwright page) lives in a closure so every tool call sees the same page.

Two Zod gotchas with OpenAI strict mode

The Agents SDK sends tool schemas in strict JSON Schema mode. Two things get rejected that otherwise look fine:

Use .nullable(), not .optional(). Every property must be in required. z.string().optional() marks the field not-required and is rejected; z.string().nullable() keeps it required but lets the model pass null.
Skip .url() on tool params. Zod emits "format": "uri" for .url(), and strict mode rejects uri (supported formats are date-time, date, time, duration, email, hostname, ipv4, ipv6, uuid). Use plain z.string() and validate inside execute if needed.
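Since the url parameter is a plain z.string(), any validation has to happen inside execute. One way to do that, sketched with the standard WHATWG URL class; `parseHttpUrl` is a hypothetical name, and limiting the check to http and https is an assumption about what the agent should be allowed to open.

```typescript
// Hypothetical helper (an assumption, not from the quickstart): validate a URL
// that arrived as a plain string and return its normalized form.
function parseHttpUrl(raw: string): string {
  let parsed: URL;
  try {
    // The URL constructor throws on anything that is not an absolute URL.
    parsed = new URL(raw.trim());
  } catch {
    throw new Error(`Not a valid absolute URL: ${raw}`);
  }
  // Assumption: only web pages are valid navigation targets.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href;
}
```

In the navigate tool below, this could be called at the top of execute, passing its result to page.goto in place of the raw string.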

TypeScript

index.ts

import * as dotenv from "dotenv";
import { Agent, run, tool } from "@openai/agents";
import { chromium, type Browser, type Page } from "playwright";
import Steel from "steel-sdk";
import { z } from "zod";

dotenv.config();

const STEEL_API_KEY = process.env.STEEL_API_KEY!;
const steel = new Steel({ steelAPIKey: STEEL_API_KEY });

// Shared browser state: every tool below reads and writes these variables.
let session: Awaited<ReturnType<typeof steel.sessions.create>> | null = null;
let browser: Browser | null = null;
let page: Page | null = null;

const openSession = tool({
  name: "open_session",
  description:
    "Open a Steel cloud browser session. Call exactly once, before anything else.",
  parameters: z.object({}),
  execute: async () => {
    session = await steel.sessions.create({});
    // Attach Playwright to the remote browser over CDP.
    browser = await chromium.connectOverCDP(
      `${session.websocketUrl}&apiKey=${STEEL_API_KEY}`
    );
    // Reuse the existing context and its first page, or open a new page.
    const ctx = browser.contexts()[0];
    page = ctx.pages()[0] ?? (await ctx.newPage());
    return { sessionId: session.id, liveViewUrl: session.sessionViewerUrl };
  },
});

const navigate = tool({
  name: "navigate",
  description: "Navigate the open session to a URL and wait for it to load.",
  // OpenAI strict JSON Schema rejects "uri" format, so use plain z.string() here.
  parameters: z.object({ url: z.string() }),
  execute: async ({ url }) => {
    if (!page) throw new Error("open_session first.");
    await page.goto(url, { waitUntil: "domcontentloaded", timeout: 45_000 });
    return { url: page.url(), title: await page.title() };
  },
});

const snapshot = tool({
  name: "snapshot",
  description:
    "Return a readable snapshot of the current page: title, URL, visible text (capped), and a list of links. Call BEFORE extract so you never have to guess CSS selectors.",
  parameters: z.object({
    maxChars: z.number().int().positive().max(10_000).default(4_000),
    maxLinks: z.number().int().positive().max(200).default(50),
  }),
  execute: async ({ maxChars, maxLinks }) => {
    if (!page) throw new Error("open_session first.");
    // Runs inside the page: cap the visible text and collect labeled links.
    return (await page.evaluate(
      ({ maxChars, maxLinks }: { maxChars: number; maxLinks: number }) => {
        const text = (document.body.innerText || "").slice(0, maxChars);
        const links = Array.from(document.querySelectorAll("a[href]"))
          .slice(0, maxLinks)
          .map((a) => {
            const anchor = a as HTMLAnchorElement;
            const t = (anchor.innerText || anchor.textContent || "").trim().slice(0, 120);
            return { text: t, href: anchor.href };
          })
          .filter((l) => l.text && l.href);
        return { url: location.href, title: document.title, text, links };
      },
      { maxChars, maxLinks }
    )) as { url: string; title: string; text: string; links: { text: string; href: string }[] };
  },
});

const extract = tool({
  name: "extract",
  description:
    "Extract structured rows from the current page using CSS selectors. Prefer calling snapshot() first.",
  parameters: z.object({
    rowSelector: z.string(),
    fields: z.array(z.object({
      name: z.string(),
      selector: z.string(),
      attr: z.string().nullable(),
    })).min(1).max(10),
    limit: z.number().int().positive().max(20).default(10),
  }),
  execute: async ({ rowSelector, fields, limit }) => {
    if (!page) throw new Error("open_session first.");
    const items = (await page.evaluate(
      ({ rowSelector, fields, limit }: {
        rowSelector: string;
        fields: { name: string; selector: string; attr: string | null }[];
        limit: number;
      }) => {
        const rows = Array.from(document.querySelectorAll(rowSelector)).slice(0, limit);
        return rows.map((row) => {
          const item: Record<string, string> = {};
          for (const f of fields) {
            // An empty selector means "read from the row element itself".
            const el = f.selector ? (row.querySelector(f.selector) as Element | null) : row;
            if (!el) { item[f.name] = ""; continue; }
            // Read the named attribute when attr is set, otherwise the text.
            item[f.name] = f.attr
              ? (el.getAttribute(f.attr) ?? "").trim()
              : (((el as HTMLElement).innerText ?? el.textContent ?? "")).trim();
          }
          return item;
        });
      },
      { rowSelector, fields, limit }
    )) as Record<string, string>[];
    return { count: items.length, items };
  },
});
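The navigate, snapshot, and extract tools each repeat the same `if (!page) throw ...` check on the shared state. A small generic guard could centralize it; this is a sketch only, and `requireState` is a hypothetical helper that is not part of the original example.

```typescript
// Hypothetical helper (an assumption, not from the quickstart): return the
// value if it has been set, otherwise throw with a hint for the model.
function requireState<T>(value: T | null, hint: string): T {
  // Compare against null explicitly so falsy but valid values still pass.
  if (value === null) throw new Error(hint);
  return value;
}

// Inside a tool's execute, the repeated check could then read:
//   const current = requireState(page, "open_session first.");
//   await current.goto(url, { waitUntil: "domcontentloaded" });
```

Because the helper returns the narrowed value, TypeScript treats `current` as a non-null Page for the rest of the function.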

Step 5: Build the Agent

Give the agent instructions, tools, a model, and an outputType (a Zod schema) for the final answer. Unlike some providers that force JSON-only mode when you ask for structured output, OpenAI supports outputType and tools together: the agent calls tools freely and still returns a validated final answer.

TypeScript

index.ts

const FinalReport = z.object({
  summary: z.string().describe("One-paragraph summary of what these repos have in common."),
  repos: z.array(z.object({
    name: z.string(),
    url: z.string(),
    stars: z.string().nullable(),
    description: z.string().nullable(),
  })).min(1).max(5),
});

const agent = new Agent({
  name: "SteelResearch",
  instructions: [
    "You operate a Steel cloud browser via tools.",
    "Workflow: (1) open_session, (2) navigate to the target URL,",
    "(3) snapshot to see the page's text and links,",
    "(4) only call extract when you need structured rows beyond snapshot,",
    "(5) return the final FinalReport.",
    "Prefer snapshot's links list over guessing selectors. Do not invent data.",
  ].join(" "),
  model: "gpt-5-mini",
  tools: [openSession, navigate, snapshot, extract],
  outputType: FinalReport,
});
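Because stars and description are nullable in FinalReport, any code that consumes the final answer has to handle null. A minimal sketch of a plain-text formatter follows. The `Report` type is a hand-written mirror of the schema so the snippet runs without Zod (in index.ts, `z.infer<typeof FinalReport>` would give the same shape), and `formatReport` and its fallback strings are hypothetical.

```typescript
// Local mirror of the FinalReport shape, written by hand for this sketch.
type Report = {
  summary: string;
  repos: {
    name: string;
    url: string;
    stars: string | null;
    description: string | null;
  }[];
};

// Hypothetical formatter (an assumption, not from the quickstart): render the
// report as plain text, substituting placeholders for null fields.
function formatReport(report: Report): string {
  const lines = report.repos.map((r, i) => {
    const stars = r.stars ?? "n/a";
    const desc = r.description ?? "(no description)";
    return `${i + 1}. ${r.name} (${stars} stars) ${r.url}\n   ${desc}`;
  });
  // Summary first, then a blank line, then one entry per repository.
  return [report.summary, "", ...lines].join("\n");
}
```

This could be applied to the run's final output in place of the JSON.stringify call when a human-readable report is preferred.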

Step 6: Run and clean up

TypeScript

index.ts

async function main() {
  try {
    const result = await run(
      agent,
      "Go to https://github.com/trending/python?since=daily and return the top 3 AI/ML-related repositories. For each, give name (owner/repo), GitHub URL, star count as shown, and the repo description.",
      { maxTurns: 15 }
    );
    console.log(JSON.stringify(result.finalOutput, null, 2));
  } finally {
    // Always clean up, even if the run throws; ignore errors during teardown.
    if (browser) await browser.close().catch(() => {});
    if (session) await steel.sessions.release(session.id).catch(() => {});
  }
}

main().catch((e) => { console.error(e); process.exit(1); });

Run It

Terminal

npm start

Swap the model

gpt-5-mini is the default here because it's fast enough for interactive iteration. Switch to gpt-5 when you need higher-quality reasoning on harder pages; expect 15-40s per turn because of its reasoning stage.

const agent = new Agent({ /* ...same options as in Step 5... */ model: "gpt-5" }); // slower, better reasoning

Next Steps

OpenAI Agents SDK (TS): https://openai.github.io/openai-agents-js/
Python quickstart: /integrations/openai-agents-sdk/quickstart-python
Steel Sessions API: /overview/sessions-api/overview
This example on GitHub: https://github.com/steel-dev/steel-cookbook/tree/main/examples/steel-openai-agents-node-starter