Quickstart (Python)
How to use Claude Computer Use with Steel
This guide shows you how to use Claude models with computer use capabilities and Steel browsers to create AI agents that navigate the web.
We'll build a Claude Computer Use loop that enables autonomous web task execution through iterative screenshot analysis and action planning.
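The loop described above can be sketched, independent of any SDK, as a function that alternates between asking a model for the next action and executing it. This is only an illustration under simplifying assumptions, not the implementation built later in this guide: `run_loop`, `decide`, and `act` are hypothetical names, and the model and browser are replaced by plain callables.

```python
from typing import Callable, List, Optional


def run_loop(
    decide: Callable[[str], Optional[str]],
    act: Callable[[str], str],
    first_screenshot: str,
    max_iterations: int = 10,
) -> List[str]:
    """Alternate between planning (decide) and acting (act).

    decide: receives the latest screenshot, returns the next action,
            or None when the task is finished.
    act:    performs an action and returns a fresh screenshot.
    Returns the list of actions that were executed.
    """
    screenshot = first_screenshot
    executed: List[str] = []
    for _ in range(max_iterations):
        action = decide(screenshot)
        if action is None:  # the planner signals that the task is done
            break
        screenshot = act(action)  # every action yields a new observation
        executed.append(action)
    return executed


# Stand-ins for the model and the browser, for illustration only.
plan = iter(["screenshot", "left_click", "type"])
history = run_loop(
    decide=lambda shot: next(plan, None),
    act=lambda action: f"screenshot-after-{action}",
    first_screenshot="initial",
)
print(history)
```

The `max_iterations` cap plays the same role as the iteration limit used later in the guide: it bounds the loop even if the planner never reports completion.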
Prerequisites
- Python 3.11+
- A Steel API key (available at https://app.steel.dev/settings/api-keys)
- An Anthropic API key with access to Claude models
Step 1: Setup and Dependencies
First, create a project directory, set up a virtual environment, and install the required packages:
    # Create a project directory
    mkdir steel-claude-computer-use
    cd steel-claude-computer-use

    # Recommended: create and activate a virtual environment
    python -m venv venv
    source venv/bin/activate  # On Windows, use: venv\Scripts\activate

    # Install required packages
    pip install steel-sdk anthropic playwright python-dotenv pillow
Create a .env file with your API keys:

    STEEL_API_KEY=your_steel_api_key_here
    ANTHROPIC_API_KEY=your_anthropic_api_key_here
    TASK=Go to Wikipedia and search for machine learning
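Before starting a session it can help to confirm that both keys were actually filled in. The check below is a small sketch, not part of the guide's script: `missing_keys` is a hypothetical helper, and treating any value that starts with "your" as a placeholder is an assumption based on the sample values above.

```python
from typing import List, Mapping

REQUIRED_KEYS = ("STEEL_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are unset, empty, or still placeholders."""
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        # Assumption: placeholder values start with "your", as in the sample .env.
        if not value or value.lower().startswith("your"):
            missing.append(name)
    return missing


example_env = {
    "STEEL_API_KEY": "your_steel_api_key_here",  # placeholder left unchanged
    "ANTHROPIC_API_KEY": "sk-ant-example",       # made-up value for illustration
}
print(missing_keys(example_env))
```

In a real script you would pass `os.environ` after calling `load_dotenv()` instead of a hand-built dictionary.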
Step 2: Create Helper Functions
    import os
    import time
    import base64
    import json
    import re
    from typing import List, Dict
    from urllib.parse import urlparse

    from dotenv import load_dotenv
    from PIL import Image
    from io import BytesIO
    from playwright.sync_api import sync_playwright, Error as PlaywrightError
    from steel import Steel
    from anthropic import Anthropic
    from anthropic.types.beta import BetaMessageParam


    load_dotenv(override=True)

    # Replace with your own API keys
    STEEL_API_KEY = os.getenv("STEEL_API_KEY") or "your-steel-api-key-here"
    ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY") or "your-anthropic-api-key-here"

    # Replace with your own task
    TASK = os.getenv("TASK") or "Go to Wikipedia and search for machine learning"

    SYSTEM_PROMPT = """You are an expert browser automation assistant operating in an iterative execution loop. Your goal is to efficiently complete tasks using a Chrome browser with full internet access.

    <CAPABILITIES>
    * You control a Chrome browser tab and can navigate to any website
    * You can click, type, scroll, take screenshots, and interact with web elements
    * You have full internet access and can visit any public website
    * You can read content, fill forms, search for information, and perform complex multi-step tasks
    * After each action, you receive a screenshot showing the current state

    <COORDINATE_SYSTEM>
    * The browser viewport has specific dimensions that you must respect
    * All coordinates (x, y) must be within the viewport bounds
    * X coordinates must be between 0 and the display width (inclusive)
    * Y coordinates must be between 0 and the display height (inclusive)
    * Always ensure your click, move, scroll, and drag coordinates are within these bounds
    * If you're unsure about element locations, take a screenshot first to see the current state

    <AUTONOMOUS_EXECUTION>
    * Work completely independently - make decisions and act immediately without asking questions
    * Never request clarification, present options, or ask for permission
    * Make intelligent assumptions based on task context
    * If something is ambiguous, choose the most logical interpretation and proceed
    * Take immediate action rather than explaining what you might do
    * When the task objective is achieved, immediately declare "TASK_COMPLETED:" - do not provide commentary or ask questions

    <REASONING_STRUCTURE>
    For each step, you must reason systematically:
    * Analyze your previous action's success/failure and current state
    * Identify what specific progress has been made toward the goal
    * Determine the next immediate objective and how to achieve it
    * Choose the most efficient action sequence to make progress

    <EFFICIENCY_PRINCIPLES>
    * Combine related actions when possible rather than single-step execution
    * Navigate directly to relevant websites without unnecessary exploration
    * Use screenshots strategically to understand page state before acting
    * Be persistent with alternative approaches if initial attempts fail
    * Focus on the specific information or outcome requested

    <COMPLETION_CRITERIA>
    * MANDATORY: When you complete the task, your final message MUST start with "TASK_COMPLETED: [brief summary]"
    * MANDATORY: If technical issues prevent completion, your final message MUST start with "TASK_FAILED: [reason]"
    * MANDATORY: If you abandon the task, your final message MUST start with "TASK_ABANDONED: [explanation]"
    * Do not write anything after completing the task except the required completion message
    * Do not ask questions, provide commentary, or offer additional help after task completion
    * The completion message is the end of the interaction - nothing else should follow

    <CRITICAL_REQUIREMENTS>
    * This is fully automated execution - work completely independently
    * Start by taking a screenshot to understand the current state
    * Never click on browser UI elements
    * Always respect coordinate boundaries - invalid coordinates will fail
    * Recognize when the stated objective has been achieved and declare completion immediately
    * Focus on the explicit task given, not implied or potential follow-up tasks

    Remember: Be thorough but focused. Complete the specific task requested efficiently and provide clear results."""

    BLOCKED_DOMAINS = [
        "maliciousbook.com",
        "evilvideos.com",
        "darkwebforum.com",
        "shadytok.com",
        "suspiciouspins.com",
        "ilanbigio.com",
    ]

    MODEL_CONFIGS = {
        "claude-3-5-sonnet-20241022": {
            "tool_type": "computer_20241022",
            "beta_flag": "computer-use-2024-10-22",
            "description": "Stable Claude 3.5 Sonnet (recommended)"
        },
        "claude-3-7-sonnet-20250219": {
            "tool_type": "computer_20250124",
            "beta_flag": "computer-use-2025-01-24",
            "description": "Claude 3.7 Sonnet (newer)"
        },
        "claude-sonnet-4-20250514": {
            "tool_type": "computer_20250124",
            "beta_flag": "computer-use-2025-01-24",
            "description": "Claude 4 Sonnet (newest)"
        },
        "claude-opus-4-20250514": {
            "tool_type": "computer_20250124",
            "beta_flag": "computer-use-2025-01-24",
            "description": "Claude 4 Opus (newest)"
        }
    }

    CUA_KEY_TO_PLAYWRIGHT_KEY = {
        "/": "Divide",
        "\\": "Backslash",
        "alt": "Alt",
        "arrowdown": "ArrowDown",
        "arrowleft": "ArrowLeft",
        "arrowright": "ArrowRight",
        "arrowup": "ArrowUp",
        "backspace": "Backspace",
        "capslock": "CapsLock",
        "cmd": "Meta",
        "ctrl": "Control",
        "delete": "Delete",
        "end": "End",
        "enter": "Enter",
        "esc": "Escape",
        "home": "Home",
        "insert": "Insert",
        "option": "Alt",
        "pagedown": "PageDown",
        "pageup": "PageUp",
        "shift": "Shift",
        "space": " ",
        "super": "Meta",
        "tab": "Tab",
        "win": "Meta",
        "Return": "Enter",
        "KP_Enter": "Enter",
        "Escape": "Escape",
        "BackSpace": "Backspace",
        "Delete": "Delete",
        "Tab": "Tab",
        "ISO_Left_Tab": "Shift+Tab",
        "Up": "ArrowUp",
        "Down": "ArrowDown",
        "Left": "ArrowLeft",
        "Right": "ArrowRight",
        "Page_Up": "PageUp",
        "Page_Down": "PageDown",
        "Home": "Home",
        "End": "End",
        "Insert": "Insert",
        "F1": "F1", "F2": "F2", "F3": "F3", "F4": "F4",
        "F5": "F5", "F6": "F6", "F7": "F7", "F8": "F8",
        "F9": "F9", "F10": "F10", "F11": "F11", "F12": "F12",
        "Shift_L": "Shift", "Shift_R": "Shift",
        "Control_L": "Control", "Control_R": "Control",
        "Alt_L": "Alt", "Alt_R": "Alt",
        "Meta_L": "Meta", "Meta_R": "Meta",
        "Super_L": "Meta", "Super_R": "Meta",
        "minus": "-",
        "equal": "=",
        "bracketleft": "[",
        "bracketright": "]",
        "semicolon": ";",
        "apostrophe": "'",
        "grave": "`",
        "comma": ",",
        "period": ".",
        "slash": "/",
    }


    def chunks(s: str, chunk_size: int) -> List[str]:
        return [s[i : i + chunk_size] for i in range(0, len(s), chunk_size)]


    def pp(obj):
        print(json.dumps(obj, indent=2))


    def show_image(base_64_image):
        image_data = base64.b64decode(base_64_image)
        image = Image.open(BytesIO(image_data))
        image.show()


    def check_blocklisted_url(url: str) -> None:
        hostname = urlparse(url).hostname or ""
        if any(
            hostname == blocked or hostname.endswith(f".{blocked}")
            for blocked in BLOCKED_DOMAINS
        ):
            raise ValueError(f"Blocked URL: {url}")
Step 3: Create Steel Browser Integration
    class SteelBrowser:
        def __init__(
            self,
            width: int = 1024,
            height: int = 768,
            proxy: bool = False,
            solve_captcha: bool = False,
            virtual_mouse: bool = True,
            session_timeout: int = 900000,
            ad_blocker: bool = True,
            start_url: str = "https://www.google.com",
        ):
            self.client = Steel(
                steel_api_key=os.getenv("STEEL_API_KEY"),
            )
            self.dimensions = (width, height)
            self.proxy = proxy
            self.solve_captcha = solve_captcha
            self.virtual_mouse = virtual_mouse
            self.session_timeout = session_timeout
            self.ad_blocker = ad_blocker
            self.start_url = start_url
            self.session = None
            self._playwright = None
            self._browser = None
            self._page = None
            self._last_mouse_position = None

        def get_dimensions(self):
            return self.dimensions

        def get_current_url(self) -> str:
            return self._page.url if self._page else ""

        def __enter__(self):
            width, height = self.dimensions
            session_params = {
                "use_proxy": self.proxy,
                "solve_captcha": self.solve_captcha,
                "api_timeout": self.session_timeout,
                "block_ads": self.ad_blocker,
                "dimensions": {"width": width, "height": height}
            }
            self.session = self.client.sessions.create(**session_params)

            print("Steel Session created successfully!")
            print(f"View live session at: {self.session.session_viewer_url}")

            self._playwright = sync_playwright().start()
            browser = self._playwright.chromium.connect_over_cdp(
                f"{self.session.websocket_url}&apiKey={os.getenv('STEEL_API_KEY')}",
                timeout=60000
            )
            self._browser = browser
            context = browser.contexts[0]

            def handle_route(route, request):
                url = request.url
                try:
                    check_blocklisted_url(url)
                    route.continue_()
                except ValueError:
                    print(f"Blocking URL: {url}")
                    route.abort()

            if self.virtual_mouse:
                context.add_init_script("""
                if (window.self === window.top) {
                    function initCursor() {
                        const CURSOR_ID = '__cursor__';
                        if (document.getElementById(CURSOR_ID)) return;

                        const cursor = document.createElement('div');
                        cursor.id = CURSOR_ID;
                        Object.assign(cursor.style, {
                            position: 'fixed',
                            top: '0px',
                            left: '0px',
                            width: '20px',
                            height: '20px',
                            backgroundImage: 'url("data:image/svg+xml;utf8,<svg width=\\'16\\' height=\\'16\\' viewBox=\\'0 0 20 20\\' fill=\\'black\\' outline=\\'white\\' xmlns=\\'http://www.w3.org/2000/svg\\'><path d=\\'M15.8089 7.22221C15.9333 7.00888 15.9911 6.78221 15.9822 6.54221C15.9733 6.29333 15.8978 6.06667 15.7555 5.86221C15.6133 5.66667 15.4311 5.52445 15.2089 5.43555L1.70222 0.0888888C1.47111 0 1.23555 -0.0222222 0.995555 0.0222222C0.746667 0.0755555 0.537779 0.186667 0.368888 0.355555C0.191111 0.533333 0.0755555 0.746667 0.0222222 0.995555C-0.0222222 1.23555 0 1.47111 0.0888888 1.70222L5.43555 15.2222C5.52445 15.4445 5.66667 15.6267 5.86221 15.7689C6.06667 15.9111 6.28888 15.9867 6.52888 15.9955H6.58221C6.82221 15.9955 7.04445 15.9333 7.24888 15.8089C7.44445 15.6845 7.59555 15.52 7.70221 15.3155L10.2089 10.2222L15.3022 7.70221C15.5155 7.59555 15.6845 7.43555 15.8089 7.22221Z\\' ></path></svg>")',
                            backgroundSize: 'cover',
                            pointerEvents: 'none',
                            zIndex: '99999',
                            transform: 'translate(-2px, -2px)',
                        });

                        document.body.appendChild(cursor);

                        document.addEventListener("mousemove", (e) => {
                            cursor.style.top = e.clientY + "px";
                            cursor.style.left = e.clientX + "px";
                        });
                    }

                    requestAnimationFrame(function checkBody() {
                        if (document.body) {
                            initCursor();
                        } else {
                            requestAnimationFrame(checkBody);
                        }
                    });
                }
                """)

            self._page = context.pages[0]
            self._page.route("**/*", handle_route)

            self._page.set_viewport_size({"width": width, "height": height})

            self._page.goto(self.start_url)

            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            if self._page:
                self._page.close()
            if self._browser:
                self._browser.close()
            if self._playwright:
                self._playwright.stop()

            if self.session:
                print("Releasing Steel session...")
                self.client.sessions.release(self.session.id)
                print(f"Session completed. View replay at {self.session.session_viewer_url}")

        def screenshot(self) -> str:
            try:
                width, height = self.dimensions
                png_bytes = self._page.screenshot(
                    full_page=False,
                    clip={"x": 0, "y": 0, "width": width, "height": height}
                )
                return base64.b64encode(png_bytes).decode("utf-8")
            except PlaywrightError as error:
                print(f"Screenshot failed, trying CDP fallback: {error}")
                try:
                    cdp_session = self._page.context.new_cdp_session(self._page)
                    result = cdp_session.send(
                        "Page.captureScreenshot", {"format": "png", "fromSurface": False}
                    )
                    return result["data"]
                except PlaywrightError as cdp_error:
                    print(f"CDP screenshot also failed: {cdp_error}")
                    raise error

        def validate_and_get_coordinates(self, coordinate):
            if not isinstance(coordinate, (list, tuple)) or len(coordinate) != 2:
                raise ValueError(f"{coordinate} must be a tuple or list of length 2")
            if not all(isinstance(i, int) and i >= 0 for i in coordinate):
                raise ValueError(f"{coordinate} must be a tuple/list of non-negative ints")

            x, y = self.clamp_coordinates(coordinate[0], coordinate[1])
            return x, y

        def clamp_coordinates(self, x: int, y: int):
            width, height = self.dimensions
            clamped_x = max(0, min(x, width - 1))
            clamped_y = max(0, min(y, height - 1))

            if x != clamped_x or y != clamped_y:
                print(f"⚠️ Coordinate clamped: ({x}, {y}) → ({clamped_x}, {clamped_y})")

            return clamped_x, clamped_y

        def execute_computer_action(
            self,
            action: str,
            text: str = None,
            coordinate = None,
            scroll_direction: str = None,
            scroll_amount: int = None,
            duration = None,
            key: str = None,
            **kwargs
        ) -> str:

            if action in ("left_mouse_down", "left_mouse_up"):
                if coordinate is not None:
                    raise ValueError(f"coordinate is not accepted for {action}")

                if action == "left_mouse_down":
                    self._page.mouse.down()
                elif action == "left_mouse_up":
                    self._page.mouse.up()

                return self.screenshot()

            if action == "scroll":
                if scroll_direction is None or scroll_direction not in ("up", "down", "left", "right"):
                    raise ValueError("scroll_direction must be 'up', 'down', 'left', or 'right'")
                if scroll_amount is None or not isinstance(scroll_amount, int) or scroll_amount < 0:
                    raise ValueError("scroll_amount must be a non-negative int")

                if coordinate is not None:
                    x, y = self.validate_and_get_coordinates(coordinate)
                    self._page.mouse.move(x, y)
                    self._last_mouse_position = (x, y)

                if text:
                    modifier_key = text
                    if modifier_key in CUA_KEY_TO_PLAYWRIGHT_KEY:
                        modifier_key = CUA_KEY_TO_PLAYWRIGHT_KEY[modifier_key]
                    self._page.keyboard.down(modifier_key)

                scroll_mapping = {
                    "down": (0, 100 * scroll_amount),
                    "up": (0, -100 * scroll_amount),
                    "right": (100 * scroll_amount, 0),
                    "left": (-100 * scroll_amount, 0)
                }
                delta_x, delta_y = scroll_mapping[scroll_direction]
                self._page.mouse.wheel(delta_x, delta_y)

                if text:
                    self._page.keyboard.up(modifier_key)

                return self.screenshot()

            if action in ("hold_key", "wait"):
                if duration is None or not isinstance(duration, (int, float)):
                    raise ValueError("duration must be a number")
                if duration < 0:
                    raise ValueError("duration must be non-negative")
                if duration > 100:
                    raise ValueError("duration is too long")

                if action == "hold_key":
                    if text is None:
                        raise ValueError("text is required for hold_key")

                    hold_key = text
                    if hold_key in CUA_KEY_TO_PLAYWRIGHT_KEY:
                        hold_key = CUA_KEY_TO_PLAYWRIGHT_KEY[hold_key]

                    self._page.keyboard.down(hold_key)
                    time.sleep(duration)
                    self._page.keyboard.up(hold_key)

                elif action == "wait":
                    time.sleep(duration)

                return self.screenshot()

            if action in ("left_click", "right_click", "double_click", "triple_click", "middle_click"):
                if text is not None:
                    raise ValueError(f"text is not accepted for {action}")

                if coordinate is not None:
                    x, y = self.validate_and_get_coordinates(coordinate)
                    self._page.mouse.move(x, y)
                    self._last_mouse_position = (x, y)
                    click_x, click_y = x, y
                elif self._last_mouse_position:
                    click_x, click_y = self._last_mouse_position
                else:
                    width, height = self.dimensions
                    click_x, click_y = width // 2, height // 2

                if key:
                    modifier_key = key
                    if modifier_key in CUA_KEY_TO_PLAYWRIGHT_KEY:
                        modifier_key = CUA_KEY_TO_PLAYWRIGHT_KEY[modifier_key]
                    self._page.keyboard.down(modifier_key)

                if action == "left_click":
                    self._page.mouse.click(click_x, click_y)
                elif action == "right_click":
                    self._page.mouse.click(click_x, click_y, button="right")
                elif action == "double_click":
                    self._page.mouse.dblclick(click_x, click_y)
                elif action == "triple_click":
                    for _ in range(3):
                        self._page.mouse.click(click_x, click_y)
                elif action == "middle_click":
                    self._page.mouse.click(click_x, click_y, button="middle")

                if key:
                    self._page.keyboard.up(modifier_key)

                return self.screenshot()

            if action in ("mouse_move", "left_click_drag"):
                if coordinate is None:
                    raise ValueError(f"coordinate is required for {action}")
                if text is not None:
                    raise ValueError(f"text is not accepted for {action}")

                x, y = self.validate_and_get_coordinates(coordinate)

                if action == "mouse_move":
                    self._page.mouse.move(x, y)
                    self._last_mouse_position = (x, y)
                elif action == "left_click_drag":
                    self._page.mouse.down()
                    self._page.mouse.move(x, y)
                    self._page.mouse.up()
                    self._last_mouse_position = (x, y)

                return self.screenshot()

            if action in ("key", "type"):
                if text is None:
                    raise ValueError(f"text is required for {action}")
                if coordinate is not None:
                    raise ValueError(f"coordinate is not accepted for {action}")

                if action == "key":
                    press_key = text

                    if "+" in press_key:
                        key_parts = press_key.split("+")
                        modifier_keys = key_parts[:-1]
                        main_key = key_parts[-1]

                        playwright_modifiers = []
                        for mod in modifier_keys:
                            if mod.lower() in ("ctrl", "control"):
                                playwright_modifiers.append("Control")
                            elif mod.lower() in ("shift",):
                                playwright_modifiers.append("Shift")
                            elif mod.lower() in ("alt", "option"):
                                playwright_modifiers.append("Alt")
                            elif mod.lower() in ("cmd", "meta", "super"):
                                playwright_modifiers.append("Meta")
                            else:
                                playwright_modifiers.append(mod)

                        if main_key in CUA_KEY_TO_PLAYWRIGHT_KEY:
                            main_key = CUA_KEY_TO_PLAYWRIGHT_KEY[main_key]

                        press_key = "+".join(playwright_modifiers + [main_key])
                    else:
                        if press_key in CUA_KEY_TO_PLAYWRIGHT_KEY:
                            press_key = CUA_KEY_TO_PLAYWRIGHT_KEY[press_key]

                    self._page.keyboard.press(press_key)
                elif action == "type":
                    for chunk in chunks(text, 50):
                        self._page.keyboard.type(chunk, delay=12)
                        time.sleep(0.01)

                return self.screenshot()

            if action in ("screenshot", "cursor_position"):
                if text is not None:
                    raise ValueError(f"text is not accepted for {action}")
                if coordinate is not None:
                    raise ValueError(f"coordinate is not accepted for {action}")

                return self.screenshot()

            raise ValueError(f"Invalid action: {action}")
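The `key` branch of `execute_computer_action` turns combinations such as `ctrl+a` into the `Control+a` form that Playwright's `keyboard.press` accepts. The sketch below reproduces that translation as a standalone function so it can be exercised without a browser. `to_playwright_combo` is a hypothetical name, and `MODIFIERS` and `KEYS` are deliberately tiny stand-ins for the full `CUA_KEY_TO_PLAYWRIGHT_KEY` table.

```python
from typing import Dict

# Reduced lookup tables for illustration; the guide's script uses a larger map.
MODIFIERS: Dict[str, str] = {
    "ctrl": "Control", "control": "Control",
    "shift": "Shift",
    "alt": "Alt", "option": "Alt",
    "cmd": "Meta", "meta": "Meta", "super": "Meta",
}
KEYS: Dict[str, str] = {
    "Return": "Enter",
    "enter": "Enter",
    "esc": "Escape",
    "Page_Down": "PageDown",
}


def to_playwright_combo(combo: str) -> str:
    """Translate 'ctrl+shift+Return' style input into 'Control+Shift+Enter'."""
    *modifiers, main_key = combo.split("+")
    # Modifiers are matched case-insensitively; unknown ones pass through unchanged.
    translated = [MODIFIERS.get(mod.lower(), mod) for mod in modifiers]
    return "+".join(translated + [KEYS.get(main_key, main_key)])


print(to_playwright_combo("ctrl+a"))
print(to_playwright_combo("Return"))
print(to_playwright_combo("cmd+shift+Page_Down"))
```

Only the last segment is treated as the main key, which mirrors how the script splits `key_parts` into modifiers and a final key.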
Step 4: Create the Agent Class
    class ClaudeAgent:
        def __init__(self, computer = None, model: str = "claude-3-5-sonnet-20241022"):
            # Validate the model first so that model_config is always defined.
            if model not in MODEL_CONFIGS:
                raise ValueError(f"Unsupported model: {model}. Available models: {list(MODEL_CONFIGS.keys())}")

            self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
            self.computer = computer
            self.messages: List[BetaMessageParam] = []
            self.model = model
            self.model_config = MODEL_CONFIGS[model]

            if computer:
                width, height = computer.get_dimensions()
                self.viewport_width = width
                self.viewport_height = height

                # Prepend the concrete viewport size to the coordinate section.
                self.system_prompt = SYSTEM_PROMPT.replace(
                    '<COORDINATE_SYSTEM>',
                    f'<COORDINATE_SYSTEM>\n* The browser viewport dimensions are {width}x{height} pixels'
                )

                self.tools = [{
                    "type": self.model_config["tool_type"],
                    "name": "computer",
                    "display_width_px": width,
                    "display_height_px": height,
                    "display_number": 1,
                }]
            else:
                self.viewport_width = 1024
                self.viewport_height = 768
                self.system_prompt = SYSTEM_PROMPT
                self.tools = []  # no browser attached, so no computer tool is offered

        def get_viewport_info(self) -> dict:
            if not self.computer or not self.computer._page:
                return {}

            try:
                return self.computer._page.evaluate("""
                    () => ({
                        innerWidth: window.innerWidth,
                        innerHeight: window.innerHeight,
                        devicePixelRatio: window.devicePixelRatio,
                        screenWidth: window.screen.width,
                        screenHeight: window.screen.height,
                        scrollX: window.scrollX,
                        scrollY: window.scrollY
                    })
                """)
            except Exception:
                return {}

        def validate_screenshot_dimensions(self, screenshot_base64: str) -> dict:
            try:
                image_data = base64.b64decode(screenshot_base64)
                image = Image.open(BytesIO(image_data))
                screenshot_width, screenshot_height = image.size

                viewport_info = self.get_viewport_info()

                scaling_info = {
                    "screenshot_size": (screenshot_width, screenshot_height),
                    "viewport_size": (self.viewport_width, self.viewport_height),
                    "actual_viewport": (viewport_info.get('innerWidth', 0), viewport_info.get('innerHeight', 0)),
                    "device_pixel_ratio": viewport_info.get('devicePixelRatio', 1.0),
                    "width_scale": screenshot_width / self.viewport_width if self.viewport_width > 0 else 1.0,
                    "height_scale": screenshot_height / self.viewport_height if self.viewport_height > 0 else 1.0
                }

                if scaling_info["width_scale"] != 1.0 or scaling_info["height_scale"] != 1.0:
                    print("⚠️ Screenshot scaling detected:")
                    print(f"   Screenshot: {screenshot_width}x{screenshot_height}")
                    print(f"   Expected viewport: {self.viewport_width}x{self.viewport_height}")
                    print(f"   Actual viewport: {viewport_info.get('innerWidth', 'unknown')}x{viewport_info.get('innerHeight', 'unknown')}")
                    print(f"   Scale factors: {scaling_info['width_scale']:.3f}x{scaling_info['height_scale']:.3f}")

                return scaling_info
            except Exception as e:
                print(f"⚠️ Error validating screenshot dimensions: {e}")
                return {}

        def execute_task(
            self,
            task: str,
            print_steps: bool = True,
            debug: bool = False,
            max_iterations: int = 50
        ) -> str:

            input_items = [
                {
                    "role": "user",
                    "content": task,
                },
            ]

            new_items = []
            iterations = 0
            consecutive_no_actions = 0
            last_assistant_messages = []

            print(f"🎯 Executing task: {task}")
            print("=" * 60)

            def is_task_complete(content: str) -> dict:
                if "TASK_COMPLETED:" in content:
                    return {"completed": True, "reason": "explicit_completion"}
                if "TASK_FAILED:" in content or "TASK_ABANDONED:" in content:
                    return {"completed": True, "reason": "explicit_failure"}

                completion_patterns = [
                    r'task\s+(completed|finished|done|accomplished)',
                    r'successfully\s+(completed|finished|found|gathered)',
                    r'here\s+(is|are)\s+the\s+(results?|information|summary)',
                    r'to\s+summarize',
                    r'in\s+conclusion',
                    r'final\s+(answer|result|summary)'
                ]

                failure_patterns = [
                    r'cannot\s+(complete|proceed|access|continue)',
                    r'unable\s+to\s+(complete|access|find|proceed)',
                    r'blocked\s+by\s+(captcha|security|authentication)',
                    r'giving\s+up',
                    r'no\s+longer\s+able',
                    r'have\s+tried\s+multiple\s+approaches'
                ]

                for pattern in completion_patterns:
                    if re.search(pattern, content, re.IGNORECASE):
                        return {"completed": True, "reason": "natural_completion"}

                for pattern in failure_patterns:
                    if re.search(pattern, content, re.IGNORECASE):
                        return {"completed": True, "reason": "natural_failure"}

                return {"completed": False}

            def detect_repetition(new_message: str) -> bool:
                if len(last_assistant_messages) < 2:
                    return False

                def similarity(str1: str, str2: str) -> float:
                    words1 = str1.lower().split()
                    words2 = str2.lower().split()
                    longest = max(len(words1), len(words2))
                    if longest == 0:
                        # Two empty messages: avoid dividing by zero.
                        return 0.0
                    common_words = [word for word in words1 if word in words2]
                    return len(common_words) / longest

                return any(similarity(new_message, prev_message) > 0.8
                           for prev_message in last_assistant_messages)

            while iterations < max_iterations:
                iterations += 1
                has_actions = False

                # Stop if the previous assistant turn signalled completion or repeats itself.
                if new_items and new_items[-1].get("role") == "assistant":
                    last_message = new_items[-1]
                    if last_message.get("content") and len(last_message["content"]) > 0:
                        content = last_message["content"][0].get("text", "")

                        completion = is_task_complete(content)
                        if completion["completed"]:
                            print(f"✅ Task completed ({completion['reason']})")
                            break

                        if detect_repetition(content):
                            print("🔄 Repetition detected - stopping execution")
                            last_assistant_messages.append(content)
                            break

                        last_assistant_messages.append(content)
                        if len(last_assistant_messages) > 3:
                            last_assistant_messages.pop(0)

                if debug:
                    pp(input_items + new_items)

                try:
                    response = self.client.beta.messages.create(
                        model=self.model,
                        max_tokens=4096,
                        system=self.system_prompt,
                        messages=input_items + new_items,
                        tools=self.tools,
                        betas=[self.model_config["beta_flag"]]
                    )

                    if debug:
                        # The response is a model object, not a dict, so serialize it directly.
                        print(response.model_dump_json(indent=2))

                    for block in response.content:
                        if block.type == "text":
                            print(block.text)
                            new_items.append({
                                "role": "assistant",
                                "content": [
                                    {
                                        "type": "text",
                                        "text": block.text
                                    }
                                ]
                            })
                        elif block.type == "tool_use":
                            has_actions = True
                            if block.name == "computer":
                                tool_input = block.input
                                action = tool_input.get("action")

                                print(f"🔧 {action}({tool_input})")

                                screenshot_base64 = self.computer.execute_computer_action(
                                    action=action,
                                    text=tool_input.get("text"),
                                    coordinate=tool_input.get("coordinate"),
                                    scroll_direction=tool_input.get("scroll_direction"),
                                    scroll_amount=tool_input.get("scroll_amount"),
                                    duration=tool_input.get("duration"),
                                    key=tool_input.get("key")
                                )

                                if action == "screenshot":
                                    self.validate_screenshot_dimensions(screenshot_base64)

                                new_items.append({
                                    "role": "assistant",
                                    "content": [
                                        {
                                            "type": "tool_use",
                                            "id": block.id,
                                            "name": block.name,
                                            "input": tool_input
                                        }
                                    ]
                                })

                                current_url = self.computer.get_current_url()
                                check_blocklisted_url(current_url)

                                # Return the post-action screenshot as the tool result.
                                new_items.append({
                                    "role": "user",
                                    "content": [
                                        {
                                            "type": "tool_result",
                                            "tool_use_id": block.id,
                                            "content": [
                                                {
                                                    "type": "image",
                                                    "source": {
                                                        "type": "base64",
                                                        "media_type": "image/png",
                                                        "data": screenshot_base64
                                                    }
                                                }
                                            ]
                                        }
                                    ]
                                })

                    if not has_actions:
                        consecutive_no_actions += 1
                        if consecutive_no_actions >= 3:
                            print("⚠️ No actions for 3 consecutive iterations - stopping")
                            break
                    else:
                        consecutive_no_actions = 0

                except Exception as error:
                    print(f"❌ Error during task execution: {error}")
                    raise

            if iterations >= max_iterations:
                print(f"⚠️ Task execution stopped after {max_iterations} iterations")

            assistant_messages = [item for item in new_items if item.get("role") == "assistant"]
            if assistant_messages:
                final_message = assistant_messages[-1]
                content = final_message.get("content")
                if isinstance(content, list) and len(content) > 0:
                    for block in content:
                        if isinstance(block, dict) and block.get("type") == "text":
                            return block.get("text", "Task execution completed (no final message)")

            return "Task execution completed (no final message)"
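The system prompt asks the model to begin its final message with `TASK_COMPLETED:`, `TASK_FAILED:`, or `TASK_ABANDONED:`, and `execute_task` returns that message as plain text. If the calling code needs a structured status, the marker can be split off afterwards. The helper below is a sketch of one way to do that; `parse_final_message` and the status labels are hypothetical and are not part of the agent class.

```python
from typing import Optional, Tuple

# Marker strings come from the system prompt; the status labels are illustrative.
MARKERS = {
    "TASK_COMPLETED:": "completed",
    "TASK_FAILED:": "failed",
    "TASK_ABANDONED:": "abandoned",
}


def parse_final_message(text: str) -> Tuple[Optional[str], str]:
    """Split a final message into (status, summary).

    status is None when no completion marker is present.
    """
    for marker, status in MARKERS.items():
        index = text.find(marker)
        if index != -1:
            # Everything after the marker is treated as the summary.
            return status, text[index + len(marker):].strip()
    return None, text.strip()


print(parse_final_message("TASK_COMPLETED: Found the machine learning article."))
print(parse_final_message("TASK_FAILED: The page did not load."))
print(parse_final_message("Still scrolling through results"))
```

A `None` status corresponds to the cases where the loop stopped for another reason, such as the iteration limit or the heuristic phrase patterns.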
Step 5: Create the Main Script
    def main():
        print("🚀 Steel + Claude Computer Use Assistant")
        print("=" * 60)

        if STEEL_API_KEY == "your-steel-api-key-here":
            print("⚠️ WARNING: Please replace 'your-steel-api-key-here' with your actual Steel API key")
            print("   Get your API key at: https://app.steel.dev/settings/api-keys")
            return

        if ANTHROPIC_API_KEY == "your-anthropic-api-key-here":
            print("⚠️ WARNING: Please replace 'your-anthropic-api-key-here' with your actual Anthropic API key")
            print("   Get your API key at: https://console.anthropic.com/")
            return

        print("\nStarting Steel browser session...")

        try:
            with SteelBrowser() as computer:
                print("✅ Steel browser session started!")

                agent = ClaudeAgent(
                    computer=computer,
                    model="claude-3-5-sonnet-20241022",
                )

                start_time = time.time()

                try:
                    result = agent.execute_task(
                        TASK,
                        print_steps=True,
                        debug=False,
                        max_iterations=50,
                    )

                    duration = f"{(time.time() - start_time):.1f}"

                    print("\n" + "=" * 60)
                    print("🎉 TASK EXECUTION COMPLETED")
                    print("=" * 60)
                    print(f"⏱️ Duration: {duration} seconds")
                    print(f"🎯 Task: {TASK}")
                    print(f"📋 Result:\n{result}")
                    print("=" * 60)

                except Exception as error:
                    print(f"❌ Task execution failed: {error}")
                    exit(1)

        except Exception as e:
            print(f"❌ Failed to start Steel browser: {e}")
            print("Please check your STEEL_API_KEY and internet connection.")
            exit(1)


    if __name__ == "__main__":
        main()
Running Your Agent
Save the code from Steps 2 through 5 in a single file named main.py, then run it:

    python main.py

The session URL is printed in the console. Open it to watch the live browser session. The agent executes the task defined in the TASK environment variable, or the default task if TASK is not set.
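You can change the task by setting the TASK environment variable. Because the script calls load_dotenv(override=True), a TASK entry in .env takes precedence over an exported variable, so edit or remove that line in .env first:

    export TASK="Search for the latest developments in artificial intelligence"
    python main.py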
Customizing your agent's task
Try modifying the task to make your agent perform different actions:
    # Research specific topics
    TASK = "Go to https://arxiv.org, search for 'computer vision', and summarize the latest papers."

    # E-commerce tasks
    TASK = "Go to https://www.amazon.com, search for 'mechanical keyboards', and compare the top 3 results."

    # Information gathering
    TASK = "Go to https://docs.anthropic.com, find information about Claude's capabilities, and provide a summary."
Supported Models: This example uses Claude 3.5 Sonnet (claude-3-5-sonnet-20241022), but any model listed in MODEL_CONFIGS works, including Claude 3.7 Sonnet, Claude Sonnet 4, and Claude Opus 4. To switch models, change the model argument passed to the ClaudeAgent constructor in main().
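One way to switch models without editing the script is to read the model ID from the environment and validate it before constructing the agent. The sketch below assumes a hypothetical `CLAUDE_MODEL` variable, which the guide's script does not read; `pick_model` is likewise an illustrative helper, and `SUPPORTED_MODELS` simply repeats the keys of `MODEL_CONFIGS`.

```python
from typing import Mapping, Sequence

DEFAULT_MODEL = "claude-3-5-sonnet-20241022"
# Same model IDs as the keys of MODEL_CONFIGS in Step 2.
SUPPORTED_MODELS = (
    "claude-3-5-sonnet-20241022",
    "claude-3-7-sonnet-20250219",
    "claude-sonnet-4-20250514",
    "claude-opus-4-20250514",
)


def pick_model(
    env: Mapping[str, str],
    supported: Sequence[str] = SUPPORTED_MODELS,
    default: str = DEFAULT_MODEL,
) -> str:
    """Return the model named by CLAUDE_MODEL (hypothetical variable), or the default."""
    model = env.get("CLAUDE_MODEL", default)
    if model not in supported:
        raise ValueError(f"Unsupported model: {model}. Choose one of: {list(supported)}")
    return model


print(pick_model({}))
print(pick_model({"CLAUDE_MODEL": "claude-sonnet-4-20250514"}))
```

The returned string could then be passed as the `model` argument to `ClaudeAgent`, for example `ClaudeAgent(computer=computer, model=pick_model(os.environ))`.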
Next Steps
- Explore the Steel API documentation for more advanced features
- Check out the Anthropic documentation for more information about Claude's computer use capabilities
- Add additional features like session recording or multi-session management