Quickstart (TypeScript)

This guide shows how to build an AI agent that uses Claude's computer use capabilities to drive a Steel browser and complete web tasks autonomously.

Prerequisites

Node.js 20+
A Steel API key (sign up here)
An Anthropic API key with access to Claude models

Step 1: Setup and Dependencies

First, create a project directory and install the required packages:

Terminal

# Create a project directory
mkdir steel-claude-computer-use
cd steel-claude-computer-use

# Initialize package.json
npm init -y

# Install required packages
npm install steel-sdk @anthropic-ai/sdk playwright dotenv
npm install -D @types/node typescript ts-node

Create a .env file with your API keys:

ENV

.env

STEEL_API_KEY=your_steel_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
TASK=Go to Wikipedia and search for machine learning
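
It can help to fail fast when a key is missing instead of discovering it on the first API call. The following is a minimal sketch, not part of the Steel or Anthropic SDKs: `requireEnv` is a hypothetical helper name, and it takes the environment as a plain object so it can be exercised without touching `process.env`.

```typescript
// Hypothetical helper (not part of any SDK): read required variables from an
// environment-like object and report every missing one in a single error.
type Env = Record<string, string | undefined>;

function requireEnv<K extends string>(
  env: Env,
  names: readonly K[]
): Record<K, string> {
  // Treat undefined, empty, and whitespace-only values as missing.
  const missing = names.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const result = {} as Record<K, string>;
  for (const name of names) {
    result[name] = env[name] as string;
  }
  return result;
}

// Usage, assuming dotenv.config() has already run:
//   const { STEEL_API_KEY, ANTHROPIC_API_KEY } = requireEnv(process.env, [
//     "STEEL_API_KEY",
//     "ANTHROPIC_API_KEY",
//   ]);
```

Reporting all missing names at once avoids fixing the `.env` file one variable at a time.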

Step 2: Create Helper Functions

TypeScript

utils.ts

import { chromium } from "playwright";
import type { Browser, Page } from "playwright";
import { Steel } from "steel-sdk";
import * as dotenv from "dotenv";
import Anthropic from "@anthropic-ai/sdk";
import type {
  MessageParam,
  ToolResultBlockParam,
  Message,
} from "@anthropic-ai/sdk/resources/messages";

dotenv.config();

// Replace with your own API keys
export const STEEL_API_KEY =
  process.env.STEEL_API_KEY || "your-steel-api-key-here";
export const ANTHROPIC_API_KEY =
  process.env.ANTHROPIC_API_KEY || "your-anthropic-api-key-here";

// Replace with your own task
export const TASK =
  process.env.TASK || "Go to Wikipedia and search for machine learning";

export const SYSTEM_PROMPT = `You are an expert browser automation assistant operating in an iterative execution loop. Your goal is to efficiently complete tasks using a Chrome browser with full internet access.

<CAPABILITIES>
* You control a Chrome browser tab and can navigate to any website
* You can click, type, scroll, take screenshots, and interact with web elements
* You have full internet access and can visit any public website
* You can read content, fill forms, search for information, and perform complex multi-step tasks
* After each action, you receive a screenshot showing the current state

<COORDINATE_SYSTEM>
* The browser viewport has specific dimensions that you must respect
* All coordinates (x, y) must be within the viewport bounds
* X coordinates must be between 0 and the display width (inclusive)
* Y coordinates must be between 0 and the display height (inclusive)
* Always ensure your click, move, scroll, and drag coordinates are within these bounds
* If you're unsure about element locations, take a screenshot first to see the current state

<AUTONOMOUS_EXECUTION>
* Work completely independently - make decisions and act immediately without asking questions
* Never request clarification, present options, or ask for permission
* Make intelligent assumptions based on task context
* If something is ambiguous, choose the most logical interpretation and proceed
* Take immediate action rather than explaining what you might do
* When the task objective is achieved, immediately declare "TASK_COMPLETED:" - do not provide commentary or ask questions

<REASONING_STRUCTURE>
For each step, you must reason systematically:
* Analyze your previous action's success/failure and current state
* Identify what specific progress has been made toward the goal
* Determine the next immediate objective and how to achieve it
* Choose the most efficient action sequence to make progress

<EFFICIENCY_PRINCIPLES>
* Combine related actions when possible rather than single-step execution
* Navigate directly to relevant websites without unnecessary exploration
* Use screenshots strategically to understand page state before acting
* Be persistent with alternative approaches if initial attempts fail
* Focus on the specific information or outcome requested

<COMPLETION_CRITERIA>
* MANDATORY: When you complete the task, your final message MUST start with "TASK_COMPLETED: [brief summary]"
* MANDATORY: If technical issues prevent completion, your final message MUST start with "TASK_FAILED: [reason]"
* MANDATORY: If you abandon the task, your final message MUST start with "TASK_ABANDONED: [explanation]"
* Do not write anything after completing the task except the required completion message
* Do not ask questions, provide commentary, or offer additional help after task completion
* The completion message is the end of the interaction - nothing else should follow

<CRITICAL_REQUIREMENTS>
* This is fully automated execution - work completely independently
* Start by taking a screenshot to understand the current state
* Never click on browser UI elements
* Always respect coordinate boundaries - invalid coordinates will fail
* Recognize when the stated objective has been achieved and declare completion immediately
* Focus on the explicit task given, not implied or potential follow-up tasks

Remember: Be thorough but focused. Complete the specific task requested efficiently and provide clear results.`;

export const BLOCKED_DOMAINS = [
  "maliciousbook.com",
  "evilvideos.com",
  "darkwebforum.com",
  "shadytok.com",
  "suspiciouspins.com",
  "ilanbigio.com",
];

export const MODEL_CONFIGS = {
  "claude-3-5-sonnet-20241022": {
    toolType: "computer_20241022",
    betaFlag: "computer-use-2024-10-22",
    description: "Stable Claude 3.5 Sonnet (recommended)",
  },
  "claude-3-7-sonnet-20250219": {
    toolType: "computer_20250124",
    betaFlag: "computer-use-2025-01-24",
    description: "Claude 3.7 Sonnet (newer)",
  },
  "claude-sonnet-4-20250514": {
    toolType: "computer_20250124",
    betaFlag: "computer-use-2025-01-24",
    description: "Claude 4 Sonnet (newest)",
  },
  "claude-opus-4-20250514": {
    toolType: "computer_20250124",
    betaFlag: "computer-use-2025-01-24",
    description: "Claude 4 Opus (newest)",
  },
};

// Maps key names emitted by the model (lowercase aliases and X11-style
// keysyms) to the key names Playwright's keyboard API expects.
export const CUA_KEY_TO_PLAYWRIGHT_KEY: Record<string, string> = {
  "/": "Slash",
  "\\": "Backslash",
  alt: "Alt",
  arrowdown: "ArrowDown",
  arrowleft: "ArrowLeft",
  arrowright: "ArrowRight",
  arrowup: "ArrowUp",
  backspace: "Backspace",
  capslock: "CapsLock",
  cmd: "Meta",
  ctrl: "Control",
  delete: "Delete",
  end: "End",
  enter: "Enter",
  esc: "Escape",
  home: "Home",
  insert: "Insert",
  option: "Alt",
  pagedown: "PageDown",
  pageup: "PageUp",
  shift: "Shift",
  space: " ",
  super: "Meta",
  tab: "Tab",
  win: "Meta",
  Return: "Enter",
  KP_Enter: "Enter",
  Escape: "Escape",
  BackSpace: "Backspace",
  Delete: "Delete",
  Tab: "Tab",
  ISO_Left_Tab: "Shift+Tab",
  Up: "ArrowUp",
  Down: "ArrowDown",
  Left: "ArrowLeft",
  Right: "ArrowRight",
  Page_Up: "PageUp",
  Page_Down: "PageDown",
  Home: "Home",
  End: "End",
  Insert: "Insert",
  F1: "F1",
  F2: "F2",
  F3: "F3",
  F4: "F4",
  F5: "F5",
  F6: "F6",
  F7: "F7",
  F8: "F8",
  F9: "F9",
  F10: "F10",
  F11: "F11",
  F12: "F12",
  Shift_L: "Shift",
  Shift_R: "Shift",
  Control_L: "Control",
  Control_R: "Control",
  Alt_L: "Alt",
  Alt_R: "Alt",
  Meta_L: "Meta",
  Meta_R: "Meta",
  Super_L: "Meta",
  Super_R: "Meta",
  minus: "-",
  equal: "=",
  bracketleft: "[",
  bracketright: "]",
  semicolon: ";",
  apostrophe: "'",
  grave: "`",
  comma: ",",
  period: ".",
  slash: "/",
};

type ModelName = keyof typeof MODEL_CONFIGS;

interface ModelConfig {
  toolType: string;
  betaFlag: string;
  description: string;
}

// Split a string into consecutive pieces of at most chunkSize characters.
export function chunks(s: string, chunkSize: number): string[] {
  const result: string[] = [];
  for (let i = 0; i < s.length; i += chunkSize) {
    result.push(s.slice(i, i + chunkSize));
  }
  return result;
}

// Pretty-print any JSON-serializable value.
export function pp(obj: any): void {
  console.log(JSON.stringify(obj, null, 2));
}

// Throw if the URL's hostname is a blocked domain or one of its subdomains.
export function checkBlocklistedUrl(url: string): void {
  try {
    const hostname = new URL(url).hostname || "";
    const isBlocked = BLOCKED_DOMAINS.some(
      (blocked) => hostname === blocked || hostname.endsWith(`.${blocked}`)
    );
    if (isBlocked) {
      throw new Error(`Blocked URL: ${url}`);
    }
  } catch (error) {
    // Re-throw only blocklist hits; URLs that fail to parse are ignored.
    if (error instanceof Error && error.message.startsWith("Blocked URL:")) {
      throw error;
    }
  }
}
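
The blocklist check above matches a blocked domain exactly or as a parent of a subdomain, and it does not match lookalike hostnames that merely contain the blocked name. The sketch below is a standalone illustration of that rule, operating on hostnames rather than full URLs. `isBlockedHostname` and `SAMPLE_BLOCKLIST` are hypothetical names, and the lowercasing step is an added assumption (`URL.hostname` already returns lowercase).

```typescript
// Standalone copy of the matching rule, for illustration only.
const SAMPLE_BLOCKLIST = ["evilvideos.com", "shadytok.com"];

function isBlockedHostname(hostname: string, blocklist: string[]): boolean {
  const host = hostname.toLowerCase();
  return blocklist.some(
    // Exact match, or a subdomain ending in ".<blocked domain>".
    (blocked) => host === blocked || host.endsWith(`.${blocked}`)
  );
}

console.log(isBlockedHostname("evilvideos.com", SAMPLE_BLOCKLIST)); // true
console.log(isBlockedHostname("cdn.evilvideos.com", SAMPLE_BLOCKLIST)); // true
console.log(isBlockedHostname("notevilvideos.com", SAMPLE_BLOCKLIST)); // false
console.log(isBlockedHostname("evilvideos.com.example.org", SAMPLE_BLOCKLIST)); // false
```

Requiring the leading dot in the suffix test is what keeps `notevilvideos.com` from being treated as a subdomain of `evilvideos.com`.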

Step 3: Create Steel Browser Integration

TypeScript

steelBrowser.ts

import { chromium } from "playwright";
import type { Browser, Page } from "playwright";
import { Steel } from "steel-sdk";
// Helpers defined in utils.ts (Step 2)
import {
  CUA_KEY_TO_PLAYWRIGHT_KEY,
  checkBlocklistedUrl,
  chunks,
} from "./utils";

const TYPING_DELAY_MS = 12;
const TYPING_GROUP_SIZE = 50;

export class SteelBrowser {
  private client: Steel;
  private session: any;
  private browser: Browser | null = null;
  private page: Page | null = null;
  private dimensions: [number, number];
  private proxy: boolean;
  private solveCaptcha: boolean;
  private virtualMouse: boolean;
  private sessionTimeout: number;
  private adBlocker: boolean;
  private startUrl: string;
  private lastMousePosition: [number, number] | null = null;

  constructor(
    width: number = 1024,
    height: number = 768,
    proxy: boolean = false,
    solveCaptcha: boolean = false,
    virtualMouse: boolean = true,
    sessionTimeout: number = 900000,
    adBlocker: boolean = true,
    startUrl: string = "https://www.google.com"
  ) {
    this.client = new Steel({
      steelAPIKey: process.env.STEEL_API_KEY!,
    });
    this.dimensions = [width, height];
    this.proxy = proxy;
    this.solveCaptcha = solveCaptcha;
    this.virtualMouse = virtualMouse;
    this.sessionTimeout = sessionTimeout;
    this.adBlocker = adBlocker;
    this.startUrl = startUrl;
  }

  getDimensions(): [number, number] {
    return this.dimensions;
  }

  getCurrentUrl(): string {
    return this.page?.url() || "";
  }

  async initialize(): Promise<void> {
    const [width, height] = this.dimensions;
    const sessionParams = {
      useProxy: this.proxy,
      solveCaptcha: this.solveCaptcha,
      apiTimeout: this.sessionTimeout,
      blockAds: this.adBlocker,
      dimensions: { width, height },
    };

    this.session = await this.client.sessions.create(sessionParams);
    console.log("Steel Session created successfully!");
    console.log(`View live session at: ${this.session.sessionViewerUrl}`);

    const cdpUrl = `${this.session.websocketUrl}&apiKey=${process.env.STEEL_API_KEY}`;

    this.browser = await chromium.connectOverCDP(cdpUrl, {
      timeout: 60000,
    });

    const context = this.browser.contexts()[0];

    // Abort any request whose URL is on the blocklist
    await context.route("**/*", async (route, request) => {
      const url = request.url();
      try {
        checkBlocklistedUrl(url);
        await route.continue();
      } catch (error) {
        console.log(`Blocking URL: ${url}`);
        await route.abort();
      }
    });

    if (this.virtualMouse) {
      await context.addInitScript(`
        if (window.self === window.top) {
          function initCursor() {
            const CURSOR_ID = '__cursor__';
            if (document.getElementById(CURSOR_ID)) return;

            const cursor = document.createElement('div');
            cursor.id = CURSOR_ID;
            Object.assign(cursor.style, {
              position: 'fixed',
              top: '0px',
              left: '0px',
              width: '20px',
              height: '20px',
              backgroundImage: 'url("data:image/svg+xml;utf8,<svg width=\\'16\\' height=\\'16\\' viewBox=\\'0 0 20 20\\' fill=\\'black\\' outline=\\'white\\' xmlns=\\'http://www.w3.org/2000/svg\\'><path d=\\'M15.8089 7.22221C15.9333 7.00888 15.9911 6.78221 15.9822 6.54221C15.9733 6.29333 15.8978 6.06667 15.7555 5.86221C15.6133 5.66667 15.4311 5.52445 15.2089 5.43555L1.70222 0.0888888C1.47111 0 1.23555 -0.0222222 0.995555 0.0222222C0.746667 0.0755555 0.537779 0.186667 0.368888 0.355555C0.191111 0.533333 0.0755555 0.746667 0.0222222 0.995555C-0.0222222 1.23555 0 1.47111 0.0888888 1.70222L5.43555 15.2222C5.52445 15.4445 5.66667 15.6267 5.86221 15.7689C6.06667 15.9111 6.28888 15.9867 6.52888 15.9955H6.58221C6.82221 15.9955 7.04445 15.9333 7.24888 15.8089C7.44445 15.6845 7.59555 15.52 7.70221 15.3155L10.2089 10.2222L15.3022 7.70221C15.5155 7.59555 15.6845 7.43555 15.8089 7.22221Z\\' ></path></svg>")',
              backgroundSize: 'cover',
              pointerEvents: 'none',
              zIndex: '99999',
              transform: 'translate(-2px, -2px)',
            });

            document.body.appendChild(cursor);

            document.addEventListener("mousemove", (e) => {
              cursor.style.top = e.clientY + "px";
              cursor.style.left = e.clientX + "px";
            });
          }

          function checkBody() {
            if (document.body) {
              initCursor();
            } else {
              requestAnimationFrame(checkBody);
            }
          }
          requestAnimationFrame(checkBody);
        }
      `);
    }

    this.page = context.pages()[0];

    const [viewportWidth, viewportHeight] = this.dimensions;
    await this.page.setViewportSize({
      width: viewportWidth,
      height: viewportHeight,
    });

    await this.page.goto(this.startUrl);
  }

  async cleanup(): Promise<void> {
    if (this.page) {
      await this.page.close();
    }
    if (this.browser) {
      await this.browser.close();
    }
    if (this.session) {
      console.log("Releasing Steel session...");
      await this.client.sessions.release(this.session.id);
      console.log(
        `Session completed. View replay at ${this.session.sessionViewerUrl}`
      );
    }
  }

  async screenshot(): Promise<string> {
    if (!this.page) throw new Error("Page not initialized");

    try {
      const [width, height] = this.dimensions;
      const buffer = await this.page.screenshot({
        fullPage: false,
        clip: { x: 0, y: 0, width, height },
      });
      return buffer.toString("base64");
    } catch (error) {
      console.log(`Screenshot failed, trying CDP fallback: ${error}`);
      try {
        const cdpSession = await this.page.context().newCDPSession(this.page);
        const result = await cdpSession.send("Page.captureScreenshot", {
          format: "png",
          fromSurface: false,
        });
        await cdpSession.detach();
        return result.data;
      } catch (cdpError) {
        console.log(`CDP screenshot also failed: ${cdpError}`);
        throw error;
      }
    }
  }

  private validateAndGetCoordinates(
    coordinate: [number, number] | number[]
  ): [number, number] {
    if (!Array.isArray(coordinate) || coordinate.length !== 2) {
      throw new Error(`${coordinate} must be a tuple or list of length 2`);
    }
    if (!coordinate.every((i) => typeof i === "number" && i >= 0)) {
      throw new Error(
        `${coordinate} must be a tuple/list of non-negative numbers`
      );
    }

    const [x, y] = this.clampCoordinates(coordinate[0], coordinate[1]);
    return [x, y];
  }

  private clampCoordinates(x: number, y: number): [number, number] {
    const [width, height] = this.dimensions;
    const clampedX = Math.max(0, Math.min(x, width - 1));
    const clampedY = Math.max(0, Math.min(y, height - 1));

    if (x !== clampedX || y !== clampedY) {
      console.log(
        `⚠️  Coordinate clamped: (${x}, ${y}) → (${clampedX}, ${clampedY})`
      );
    }

    return [clampedX, clampedY];
  }

  async executeComputerAction(
    action: string,
    text?: string,
    coordinate?: [number, number] | number[],
    scrollDirection?: "up" | "down" | "left" | "right",
    scrollAmount?: number,
    duration?: number,
    key?: string
  ): Promise<string> {
    if (!this.page) throw new Error("Page not initialized");

    if (action === "left_mouse_down" || action === "left_mouse_up") {
      if (coordinate !== undefined) {
        throw new Error(`coordinate is not accepted for ${action}`);
      }

      if (action === "left_mouse_down") {
        await this.page.mouse.down();
      } else {
        await this.page.mouse.up();
      }

      return this.screenshot();
    }

    if (action === "scroll") {
      if (
        !scrollDirection ||
        !["up", "down", "left", "right"].includes(scrollDirection)
      ) {
        throw new Error(
          "scroll_direction must be 'up', 'down', 'left', or 'right'"
        );
      }
      if (scrollAmount === undefined || scrollAmount < 0) {
        throw new Error("scroll_amount must be a non-negative number");
      }

      if (coordinate !== undefined) {
        const [x, y] = this.validateAndGetCoordinates(coordinate);
        await this.page.mouse.move(x, y);
        this.lastMousePosition = [x, y];
      }

      if (text) {
        let modifierKey = text;
        if (modifierKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
          modifierKey = CUA_KEY_TO_PLAYWRIGHT_KEY[modifierKey];
        }
        await this.page.keyboard.down(modifierKey);
      }

      const scrollMapping = {
        down: [0, 100 * scrollAmount],
        up: [0, -100 * scrollAmount],
        right: [100 * scrollAmount, 0],
        left: [-100 * scrollAmount, 0],
      };
      const [deltaX, deltaY] = scrollMapping[scrollDirection];
      await this.page.mouse.wheel(deltaX, deltaY);

      if (text) {
        let modifierKey = text;
        if (modifierKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
          modifierKey = CUA_KEY_TO_PLAYWRIGHT_KEY[modifierKey];
        }
        await this.page.keyboard.up(modifierKey);
      }

      return this.screenshot();
    }

    if (action === "hold_key" || action === "wait") {
      if (duration === undefined || duration < 0) {
        throw new Error("duration must be a non-negative number");
      }
      if (duration > 100) {
        throw new Error("duration is too long");
      }

      if (action === "hold_key") {
        if (text === undefined) {
          throw new Error("text is required for hold_key");
        }

        let holdKey = text;
        if (holdKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
          holdKey = CUA_KEY_TO_PLAYWRIGHT_KEY[holdKey];
        }

        await this.page.keyboard.down(holdKey);
        await new Promise((resolve) => setTimeout(resolve, duration * 1000));
        await this.page.keyboard.up(holdKey);
      } else if (action === "wait") {
        await new Promise((resolve) => setTimeout(resolve, duration * 1000));
      }

      return this.screenshot();
    }

    if (
      [
        "left_click",
        "right_click",
        "double_click",
        "triple_click",
        "middle_click",
      ].includes(action)
    ) {
      if (text !== undefined) {
        throw new Error(`text is not accepted for ${action}`);
      }

      let clickX: number, clickY: number;
      if (coordinate !== undefined) {
        const [x, y] = this.validateAndGetCoordinates(coordinate);
        await this.page.mouse.move(x, y);
        this.lastMousePosition = [x, y];
        clickX = x;
        clickY = y;
      } else if (this.lastMousePosition) {
        [clickX, clickY] = this.lastMousePosition;
      } else {
        const [width, height] = this.dimensions;
        clickX = Math.floor(width / 2);
        clickY = Math.floor(height / 2);
      }

      if (key) {
        let modifierKey = key;
        if (modifierKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
          modifierKey = CUA_KEY_TO_PLAYWRIGHT_KEY[modifierKey];
        }
        await this.page.keyboard.down(modifierKey);
      }

      if (action === "left_click") {
        await this.page.mouse.click(clickX, clickY);
      } else if (action === "right_click") {
        await this.page.mouse.click(clickX, clickY, { button: "right" });
      } else if (action === "double_click") {
        await this.page.mouse.dblclick(clickX, clickY);
      } else if (action === "triple_click") {
        // One click with clickCount 3 registers as a triple click;
        // three separate single clicks would not.
        await this.page.mouse.click(clickX, clickY, { clickCount: 3 });
      } else if (action === "middle_click") {
        await this.page.mouse.click(clickX, clickY, { button: "middle" });
      }

      if (key) {
        let modifierKey = key;
        if (modifierKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
          modifierKey = CUA_KEY_TO_PLAYWRIGHT_KEY[modifierKey];
        }
        await this.page.keyboard.up(modifierKey);
      }

      return this.screenshot();
    }

    if (action === "mouse_move" || action === "left_click_drag") {
      if (coordinate === undefined) {
        throw new Error(`coordinate is required for ${action}`);
      }
      if (text !== undefined) {
        throw new Error(`text is not accepted for ${action}`);
      }

      const [x, y] = this.validateAndGetCoordinates(coordinate);

      if (action === "mouse_move") {
        await this.page.mouse.move(x, y);
        this.lastMousePosition = [x, y];
      } else if (action === "left_click_drag") {
        // Drag from the current pointer position to (x, y)
        await this.page.mouse.down();
        await this.page.mouse.move(x, y);
        await this.page.mouse.up();
        this.lastMousePosition = [x, y];
      }

      return this.screenshot();
    }

    if (action === "key" || action === "type") {
      if (text === undefined) {
        throw new Error(`text is required for ${action}`);
      }
      if (coordinate !== undefined) {
        throw new Error(`coordinate is not accepted for ${action}`);
      }

      if (action === "key") {
        let pressKey = text;

        if (pressKey.includes("+")) {
          // Key combination such as "ctrl+a": map modifiers, then the main key
          const keyParts = pressKey.split("+");
          const modifierKeys = keyParts.slice(0, -1);
          const mainKey = keyParts[keyParts.length - 1];

          const playwrightModifiers: string[] = [];
          for (const mod of modifierKeys) {
            if (["ctrl", "control"].includes(mod.toLowerCase())) {
              playwrightModifiers.push("Control");
            } else if (mod.toLowerCase() === "shift") {
              playwrightModifiers.push("Shift");
            } else if (["alt", "option"].includes(mod.toLowerCase())) {
              playwrightModifiers.push("Alt");
            } else if (["cmd", "meta", "super"].includes(mod.toLowerCase())) {
              playwrightModifiers.push("Meta");
            } else {
              playwrightModifiers.push(mod);
            }
          }

          let finalMainKey = mainKey;
          if (mainKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
            finalMainKey = CUA_KEY_TO_PLAYWRIGHT_KEY[mainKey];
          }

          pressKey = [...playwrightModifiers, finalMainKey].join("+");
        } else {
          if (pressKey in CUA_KEY_TO_PLAYWRIGHT_KEY) {
            pressKey = CUA_KEY_TO_PLAYWRIGHT_KEY[pressKey];
          }
        }

        await this.page.keyboard.press(pressKey);
      } else if (action === "type") {
        // Type in small groups with a short pause between them
        for (const chunk of chunks(text, TYPING_GROUP_SIZE)) {
          await this.page.keyboard.type(chunk, { delay: TYPING_DELAY_MS });
          await new Promise((resolve) => setTimeout(resolve, 10));
        }
      }

      return this.screenshot();
    }

    if (action === "screenshot" || action === "cursor_position") {
      if (text !== undefined) {
        throw new Error(`text is not accepted for ${action}`);
      }
      if (coordinate !== undefined) {
        throw new Error(`coordinate is not accepted for ${action}`);
      }

      return this.screenshot();
    }

    throw new Error(`Invalid action: ${action}`);
  }
}
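
The `key` action above translates model-style key names such as `ctrl+a` or `Return` into the names Playwright expects. The sketch below isolates that translation as a pure function so it can be checked without a browser. It is a simplified variant rather than the class's exact logic: `toPlaywrightKeys` and `KEY_ALIASES` are hypothetical names, the alias table is a small subset of the map in utils.ts, and every part of the combination (not only the modifiers) falls back to a lowercase lookup.

```typescript
// Small subset of the key map from utils.ts, repeated here so the sketch
// runs on its own.
const KEY_ALIASES = new Map<string, string>([
  ["ctrl", "Control"],
  ["cmd", "Meta"],
  ["alt", "Alt"],
  ["shift", "Shift"],
  ["Return", "Enter"],
  ["Page_Down", "PageDown"],
]);

// Hypothetical helper: convert "ctrl+shift+Return" to "Control+Shift+Enter".
// Parts without an alias pass through unchanged.
function toPlaywrightKeys(combo: string): string {
  return combo
    .split("+")
    .map((part) => {
      const name = part.trim();
      return (
        KEY_ALIASES.get(name) ?? KEY_ALIASES.get(name.toLowerCase()) ?? name
      );
    })
    .join("+");
}

console.log(toPlaywrightKeys("ctrl+a")); // Control+a
console.log(toPlaywrightKeys("Return")); // Enter
console.log(toPlaywrightKeys("CTRL+shift+Page_Down")); // Control+Shift+PageDown
```

Because the combination is split on `+`, a combination whose main key is the plus sign itself would need separate handling.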

Step 4: Create the Agent Class

TypeScript

claudeAgent.ts

import Anthropic from "@anthropic-ai/sdk";
import type {
  MessageParam,
  ToolResultBlockParam,
  Message,
} from "@anthropic-ai/sdk/resources/messages";
// Constants from utils.ts (Step 2) and the browser wrapper from Step 3
import { MODEL_CONFIGS, SYSTEM_PROMPT } from "./utils";
import { SteelBrowser } from "./steelBrowser";

type ModelName = keyof typeof MODEL_CONFIGS;

interface ModelConfig {
  toolType: string;
  betaFlag: string;
  description: string;
}

export class ClaudeAgent {
  private client: Anthropic;
  private computer: SteelBrowser;
  private messages: MessageParam[];
  private model: ModelName;
  private modelConfig: ModelConfig;
  private tools: any[];
  private systemPrompt: string;
  private viewportWidth: number;
  private viewportHeight: number;

  constructor(
    computer: SteelBrowser,
    model: ModelName = "claude-3-5-sonnet-20241022"
  ) {
    this.client = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY!,
    });
    this.computer = computer;
    this.model = model;
    this.messages = [];

    if (!(model in MODEL_CONFIGS)) {
      throw new Error(
        `Unsupported model: ${model}. Available models: ${Object.keys(
          MODEL_CONFIGS
        )}`
      );
    }

    this.modelConfig = MODEL_CONFIGS[model];

    const [width, height] = computer.getDimensions();
    this.viewportWidth = width;
    this.viewportHeight = height;

    // Inject the actual viewport size into the coordinate section of the prompt
    this.systemPrompt = SYSTEM_PROMPT.replace(
      "<COORDINATE_SYSTEM>",
      `<COORDINATE_SYSTEM>
* The browser viewport dimensions are ${width}x${height} pixels
* The browser viewport has specific dimensions that you must respect`
    );

    this.tools = [
      {
        type: this.modelConfig.toolType,
        name: "computer",
        display_width_px: width,
        display_height_px: height,
        display_number: 1,
      },
    ];
  }

  getViewportInfo(): any {
    return {
      innerWidth: this.viewportWidth,
      innerHeight: this.viewportHeight,
      devicePixelRatio: 1.0,
      screenWidth: this.viewportWidth,
      screenHeight: this.viewportHeight,
      scrollX: 0,
      scrollY: 0,
    };
  }

  validateScreenshotDimensions(screenshotBase64: string): any {
    try {
      const imageBuffer = Buffer.from(screenshotBase64, "base64");

      if (imageBuffer.length === 0) {
        console.log("⚠️  Empty screenshot data");
        return {};
      }

      const viewportInfo = this.getViewportInfo();

      const scalingInfo = {
        screenshot_size: ["unknown", "unknown"],
        viewport_size: [this.viewportWidth, this.viewportHeight],
        actual_viewport: [viewportInfo.innerWidth, viewportInfo.innerHeight],
        device_pixel_ratio: viewportInfo.devicePixelRatio,
        width_scale: 1.0,
        height_scale: 1.0,
      };

      return scalingInfo;
    } catch (e) {
      console.log(`⚠️  Error validating screenshot dimensions: ${e}`);
      return {};
    }
  }

  async processResponse(message: Message): Promise<string> {
    let responseText = "";

    for (const block of message.content) {
      if (block.type === "text") {
        responseText += block.text;
        console.log(block.text);
      } else if (block.type === "tool_use") {
        const toolName = block.name;
        const toolInput = block.input as any;

        console.log(`🔧 ${toolName}(${JSON.stringify(toolInput)})`);

        if (toolName === "computer") {
          const action = toolInput.action;
          const params = {
            text: toolInput.text,
            coordinate: toolInput.coordinate,
            scrollDirection: toolInput.scroll_direction,
            scrollAmount: toolInput.scroll_amount,
            duration: toolInput.duration,
            key: toolInput.key,
          };

          try {
            // Run the action in the browser; the result is a base64 screenshot
            const screenshotBase64 = await this.computer.executeComputerAction(
              action,
              params.text,
              params.coordinate,
              params.scrollDirection,
              params.scrollAmount,
              params.duration,
              params.key
            );

            if (action === "screenshot") {
              this.validateScreenshotDimensions(screenshotBase64);
            }

            const toolResult: ToolResultBlockParam = {
              type: "tool_result",
              tool_use_id: block.id,
              content: [
                {
                  type: "image",
                  source: {
                    type: "base64",
                    media_type: "image/png",
                    data: screenshotBase64,
                  },
                },
              ],
            };

            // Record the tool call and its result, then ask Claude for the next step
            this.messages.push({
              role: "assistant",
              content: [block],
            });
            this.messages.push({
              role: "user",
              content: [toolResult],
            });

            return this.getClaudeResponse();
          } catch (error) {
            // Report the failure back to Claude as an error tool result
            console.log(`❌ Error executing ${action}: ${error}`);
            const toolResult: ToolResultBlockParam = {
              type: "tool_result",
              tool_use_id: block.id,
              content: `Error executing ${action}: ${String(error)}`,
              is_error: true,
            };

            this.messages.push({
              role: "assistant",
              content: [block],
            });
            this.messages.push({
              role: "user",
              content: [toolResult],
            });

            return this.getClaudeResponse();
          }
        }
      }
    }

    // Text-only response: store it as the assistant's turn
    if (
      responseText &&
      !message.content.some((block) => block.type === "tool_use")
    ) {
      this.messages.push({
        role: "assistant",
        content: responseText,
      });
    }

    return responseText;
  }

  async getClaudeResponse(): Promise<string> {
    try {
      const response = await this.client.beta.messages.create(
        {
          model: this.model,
          max_tokens: 4096,
          messages: this.messages,
          tools: this.tools,
        },
        {
          headers: {
            "anthropic-beta": this.modelConfig.betaFlag,
          },
        }
      );

      return this.processResponse(response);
    } catch (error) {
      const errorMsg = `Error communicating with Claude: ${error}`;
      console.log(`❌ ${errorMsg}`);
      return errorMsg;
    }
  }

  async executeTask(
    task: string,
    printSteps: boolean = true,
    debug: boolean = false,
    maxIterations: number = 50
  ): Promise<string> {
    this.messages = [
      {
        role: "user",
        content: this.systemPrompt,
      },
      {
        role: "user",
        content: task,
      },
    ];

    let iterations = 0;
    let consecutiveNoActions = 0;
    let lastAssistantMessages: string[] = [];

    console.log(`🎯 Executing task: ${task}`);
    console.log("=".repeat(60));

    const isTaskComplete = (
      content: string
    ): { completed: boolean; reason?: string } => {
      if (content.includes("TASK_COMPLETED:")) {
        return { completed: true, reason: "explicit_completion" };
      }
      if (
        content.includes("TASK_FAILED:") ||
        content.includes("TASK_ABANDONED:")
      ) {
        return { completed: true, reason: "explicit_failure" };
      }

      const completionPatterns = [
        /task\s+(completed|finished|done|accomplished)/i,
        /successfully\s+(completed|finished|found|gathered)/i,
        /here\s+(is|are)\s+the\s+(results?|information|summary)/i,
        /to\s+summarize/i,
        /in\s+conclusion/i,
        /final\s+(answer|result|summary)/i,
      ];

      const failurePatterns = [
        /cannot\s+(complete|proceed|access|continue)/i,
        /unable\s+to\s+(complete|access|find|proceed)/i,
        /blocked\s+by\s+(captcha|security|authentication)/i,
        /giving\s+up/i,
        /no\s+longer\s+able/i,
        /have\s+tried\s+multiple\s+approaches/i,
      ];

      if (completionPatterns.some((pattern) => pattern.test(content))) {
        return { completed: true, reason: "natural_completion" };
      }

      if (failurePatterns.some((pattern) => pattern.test(content))) {
        return { completed: true, reason: "natural_failure" };
      }

      return { completed: false };
    };

    const detectRepetition = (newMessage: string): boolean => {
      if (lastAssistantMessages.length < 2) return false;

      const similarity = (str1: string, str2: string): number => {
        const words1 = str1.toLowerCase().split(/\s+/);
        const words2 = str2.toLowerCase().split(/\s+/);
        const commonWords = words1.filter((word) => words2.includes(word));
        return commonWords.length / Math.max(words1.length, words2.length);
301      };
302
303      return lastAssistantMessages.some(
304        (prevMessage) => similarity(newMessage, prevMessage) > 0.8
305      );
306    };
307
308    while (iterations < maxIterations) {
309      iterations++;
310      let hasActions = false;
311
312      if (this.messages.length > 0) {
313        const lastMessage = this.messages[this.messages.length - 1];
314        if (
315          lastMessage?.role === "assistant" &&
316          typeof lastMessage.content === "string"
317        ) {
318          const content = lastMessage.content;
319
320          const completion = isTaskComplete(content);
321          if (completion.completed) {
322            console.log(`✅ Task completed (${completion.reason})`);
323            break;
324          }
325
326          if (detectRepetition(content)) {
327            console.log("🔄 Repetition detected - stopping execution");
328            lastAssistantMessages.push(content);
329            break;
330          }
331
332          lastAssistantMessages.push(content);
333          if (lastAssistantMessages.length > 3) {
334            lastAssistantMessages.shift();
335          }
336        }
337      }
338
339      if (debug) {
340        pp(this.messages);
341      }
342
343      try {
344        const response = await this.client.beta.messages.create(
345          {
346            model: this.model,
347            max_tokens: 4096,
348            messages: this.messages,
349            tools: this.tools,
350          },
351          {
352            headers: {
353              "anthropic-beta": this.modelConfig.betaFlag,
354            },
355          }
356        );
357
358        if (debug) {
359          pp(response);
360        }
361
362        for (const block of response.content) {
363          if (block.type === "tool_use") {
364            hasActions = true;
365          }
366        }
367
368        await this.processResponse(response);
369
370        if (!hasActions) {
371          consecutiveNoActions++;
372          if (consecutiveNoActions >= 3) {
373            console.log(
374              "⚠️  No actions for 3 consecutive iterations - stopping"
375            );
376            break;
377          }
378        } else {
379          consecutiveNoActions = 0;
380        }
381      } catch (error) {
382        console.error(`❌ Error during task execution: ${error}`);
383        throw error;
384      }
385    }
386
387    if (iterations >= maxIterations) {
388      console.warn(
389        `⚠️  Task execution stopped after ${maxIterations} iterations`
390      );
391    }
392
393    const assistantMessages = this.messages.filter(
394      (item) => item.role === "assistant"
395    );
396    const finalMessage = assistantMessages[assistantMessages.length - 1];
397
398    if (finalMessage && typeof finalMessage.content === "string") {
399      return finalMessage.content;
400    }
401
402    return "Task execution completed (no final message)";
403  }
404}
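
The stopping heuristics in executeTask are defined as closures, so they cannot be exercised without running the agent. As a minimal sketch, the word-overlap score that detectRepetition relies on can be pulled into a standalone function and checked against plain strings. The names wordOverlap and isRepetition are illustrative and not part of the guide's code; the 0.8 threshold and the two-message minimum are taken from the listing above.

```typescript
// Standalone copy of the similarity measure used by detectRepetition.
// `wordOverlap` and `isRepetition` are hypothetical names for this sketch.
function wordOverlap(a: string, b: string): number {
  const wordsA = a.toLowerCase().split(/\s+/);
  const wordsB = b.toLowerCase().split(/\s+/);
  // Count words of `a` that occur anywhere in `b` (duplicates in `a` count each time).
  const common = wordsA.filter((word) => wordsB.includes(word));
  return common.length / Math.max(wordsA.length, wordsB.length);
}

function isRepetition(
  newMessage: string,
  history: string[],
  threshold = 0.8
): boolean {
  // Mirrors the guard in the listing: fewer than two stored messages never count.
  if (history.length < 2) return false;
  return history.some((prev) => wordOverlap(newMessage, prev) > threshold);
}

const history = [
  "I will click the search box now",
  "The page is still loading",
];

console.log(isRepetition("I will click the search box now", history));
console.log(isRepetition("Here is something entirely different", history));
```

Because the score divides by the length of the longer message, a short message that reappears inside a much longer one scores low and is not flagged. Whether that is the desired behavior depends on how verbose the model's replies are for your task.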

Step 5: Create the Main Script

Typescript

main.ts

// Module paths below are assumptions; adjust them to match the files from the earlier steps.
import { SteelBrowser } from "./steelBrowser";
import { ClaudeAgent } from "./claudeAgent";
import { STEEL_API_KEY, ANTHROPIC_API_KEY, TASK } from "./utils";

async function main(): Promise<void> {
  console.log("🚀 Steel + Claude Computer Use Assistant");
  console.log("=".repeat(60));

  // Refuse to run while the placeholder keys from utils.ts are still in place.
  if (STEEL_API_KEY === "your-steel-api-key-here") {
    console.warn(
      "⚠️  WARNING: Please replace 'your-steel-api-key-here' with your actual Steel API key"
    );
    console.warn(
      "   Get your API key at: https://app.steel.dev/settings/api-keys"
    );
    return;
  }

  if (ANTHROPIC_API_KEY === "your-anthropic-api-key-here") {
    console.warn(
      "⚠️  WARNING: Please replace 'your-anthropic-api-key-here' with your actual Anthropic API key"
    );
    console.warn("   Get your API key at: https://console.anthropic.com/");
    return;
  }

  console.log("\nStarting Steel browser session...");

  const computer = new SteelBrowser();

  try {
    await computer.initialize();
    console.log("✅ Steel browser session started!");

    const agent = new ClaudeAgent(computer, "claude-3-5-sonnet-20241022");

    const startTime = Date.now();

    try {
      const result = await agent.executeTask(TASK, true, false, 50);

      const duration = ((Date.now() - startTime) / 1000).toFixed(1);

      console.log("\n" + "=".repeat(60));
      console.log("🎉 TASK EXECUTION COMPLETED");
      console.log("=".repeat(60));
      console.log(`⏱️  Duration: ${duration} seconds`);
      console.log(`🎯 Task: ${TASK}`);
      console.log(`📋 Result:\n${result}`);
      console.log("=".repeat(60));
    } catch (error) {
      console.error(`❌ Task execution failed: ${error}`);
      // Set the exit code instead of calling process.exit() so the
      // finally block below still runs and releases the session.
      process.exitCode = 1;
    }
  } catch (error) {
    console.log(`❌ Failed to start Steel browser: ${error}`);
    console.log("Please check your STEEL_API_KEY and internet connection.");
    process.exitCode = 1;
  } finally {
    await computer.cleanup();
  }
}

main().catch(console.error);

Running Your Agent

Execute your script:

Terminal

npx ts-node main.ts

You'll see the session URL printed in the console. Open this URL to view the live browser session.

The agent will execute the task defined in the TASK environment variable or the default task.

You can modify the task by setting the environment variable:

Terminal

export TASK="Research the latest developments in artificial intelligence"
npx ts-node main.ts
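
If exporting a variable before each run is inconvenient, one option is to let a command-line argument take precedence over TASK. The sketch below assumes a hypothetical helper, resolveTask, that is not part of the guide's code; the default string is the one defined in utils.ts.

```typescript
// Hypothetical helper: pick the task from CLI arguments, then TASK, then a default.
const DEFAULT_TASK = "Go to Wikipedia and search for machine learning";

function resolveTask(
  args: string[],
  env: Record<string, string | undefined>
): string {
  // Join all arguments so an unquoted multi-word task still works.
  const fromArgs = args.join(" ").trim();
  if (fromArgs.length > 0) return fromArgs;

  const fromEnv = (env.TASK ?? "").trim();
  if (fromEnv.length > 0) return fromEnv;

  return DEFAULT_TASK;
}

// In main.ts this would be called as resolveTask(process.argv.slice(2), process.env).
console.log(resolveTask(["Find", "the", "weather", "in", "Paris"], {}));
console.log(resolveTask([], { TASK: "Research AI news" }));
console.log(resolveTask([], {}));
```

With the result passed to agent.executeTask in place of TASK, a run such as npx ts-node main.ts "Find the weather in Paris" would override both the .env value and the default.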

Customizing your agent's task

Try modifying the task to make your agent perform different actions:

ENV

.env

# Keep only one TASK line uncommented at a time.

# Research specific topics
TASK="Go to https://arxiv.org, search for 'machine learning', and summarize the latest papers."

# E-commerce tasks
# TASK="Go to https://www.amazon.com, search for 'wireless headphones', and compare the top 3 results."

# Information gathering
# TASK="Go to https://docs.anthropic.com, find information about Claude's capabilities, and provide a summary."

Supported Models: This example uses Claude 3.5 Sonnet, but you can use any supported Claude model, including Claude 3.7 Sonnet, Claude Sonnet 4, or Claude Opus 4. To switch models, change the model ID passed as the second argument to the ClaudeAgent constructor in main.ts.
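
One way to switch models without editing main.ts is to read the model ID from the environment. The sketch below assumes a hypothetical CLAUDE_MODEL variable and a hypothetical pickModel helper; neither appears in the guide's code. The IDs listed are Anthropic's published identifiers for the models named above, and whether ClaudeAgent accepts each one depends on the model configuration defined in Step 4.

```typescript
// Hypothetical allow-list; check it against the models your ClaudeAgent is configured for.
const KNOWN_MODELS = [
  "claude-3-5-sonnet-20241022",
  "claude-3-7-sonnet-20250219",
  "claude-sonnet-4-20250514",
  "claude-opus-4-20250514",
] as const;

type KnownModel = (typeof KNOWN_MODELS)[number];

const DEFAULT_MODEL: KnownModel = "claude-3-5-sonnet-20241022";

// CLAUDE_MODEL is an assumed variable name, not one the guide defines.
function pickModel(env: Record<string, string | undefined>): KnownModel {
  const requested = env.CLAUDE_MODEL?.trim();
  if (!requested) return DEFAULT_MODEL;

  const match = KNOWN_MODELS.find((model) => model === requested);
  if (!match) {
    throw new Error(
      `Unknown model "${requested}". Expected one of: ${KNOWN_MODELS.join(", ")}`
    );
  }
  return match;
}

console.log(pickModel({}));
console.log(pickModel({ CLAUDE_MODEL: "claude-sonnet-4-20250514" }));
```

In main.ts the agent could then be created with new ClaudeAgent(computer, pickModel(process.env)). Failing early on an unrecognized ID avoids starting a browser session that the first API call would reject anyway.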

Next Steps

Explore the Steel API documentation for more advanced features
Check out the Anthropic documentation for more information about Claude's computer use capabilities
Add features such as session recording or multi-session management