Gemini Computer Use

Wire Gemini 3's computer-use tool into a Steel browser session.

The Gemini Computer Use integration runs Gemini 3's vision-based agent loop on a Steel browser session. Gemini takes screenshots through Steel, decides the next action (click, type, scroll), and Steel executes it, so you can automate complex web tasks without writing custom selectors.
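The loop described above can be sketched in a few lines. This is a minimal illustration, not the integration's actual code: `Action`, `BrowserDriver`, `VisionModel`, and `runAgentLoop` are hypothetical names introduced here, and the two interfaces stand in for Steel and Gemini respectively.

```typescript
// Illustrative types only: these are not Steel or Gemini SDK types.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; deltaY: number }
  | { kind: "done"; summary: string };

// Stand-in for the browser side (Steel): capture the screen, run an action.
interface BrowserDriver {
  screenshot(): Promise<string>; // base64-encoded image
  execute(action: Action): Promise<void>;
}

// Stand-in for the model side (Gemini): pick the next action from a screenshot.
interface VisionModel {
  nextAction(screenshot: string, history: Action[]): Promise<Action>;
}

async function runAgentLoop(
  browser: BrowserDriver,
  model: VisionModel,
  maxSteps = 20,
): Promise<{ summary: string | null; history: Action[] }> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screenshot = await browser.screenshot();
    const action = await model.nextAction(screenshot, history);
    // The model signals completion instead of returning another UI action.
    if (action.kind === "done") {
      return { summary: action.summary, history };
    }
    await browser.execute(action);
    history.push(action);
  }
  // Step budget exhausted without the model reporting completion.
  return { summary: null, history };
}
```

The step budget is a design choice worth keeping in any real loop: without it, a model that never reports completion would keep consuming screenshots and session time.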

It pairs well with Steel's anti-bot capabilities, proxy support, and sandboxed environments.

Requirements

  • Gemini API Key: With access to a Gemini 3 model that supports computer use
  • Steel API Key: Requires an active Steel subscription
  • Runtime: Python 3.10+ or Node.js 20+

Connect Steel to Gemini

Steel's sessions.computer API takes screenshots and executes actions; pair it with Gemini 3's built-in computer use:

TypeScript
import { Steel } from "steel-sdk";

// Read the API key from the environment rather than an undefined global.
const steel = new Steel({ steelAPIKey: process.env.STEEL_API_KEY });

// Fix the viewport size so screenshot coordinates are predictable.
const session = await steel.sessions.create({
  dimensions: { width: 1280, height: 800 },
});

// Take a screenshot via Steel
const { base64_image } = await steel.sessions.computer(session.id, {
  action: "take_screenshot",
});

// Send to Gemini with the computer-use tool, then route Gemini's
// returned actions back through `steel.sessions.computer(session.id, { action: ... })`.
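
Routing Gemini's actions back to the browser usually involves a coordinate conversion first. The sketch below assumes Gemini reports positions on a normalized 0-999 grid and uses the function names `click_at` and `type_text_at`, as Gemini's computer-use documentation describes; verify both against the model version you use. `ModelCall`, `PixelAction`, `toPixels`, and `toPixelAction` are hypothetical names for illustration, and the output shape is deliberately not a Steel payload.

```typescript
// Assumption: the model reports positions on a 0-999 grid regardless of the
// real viewport, so they must be scaled before being replayed.
const GRID = 1000;

interface Viewport { width: number; height: number; }

// Hypothetical, simplified view of a function call returned by the model.
interface ModelCall { name: string; args: Record<string, unknown>; }

// Illustrative output shape; map it onto the real Steel action payload yourself.
type PixelAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; x: number; y: number; text: string };

// Scale a normalized point to pixel coordinates inside the viewport.
function toPixels(nx: number, ny: number, viewport: Viewport): { x: number; y: number } {
  return {
    x: Math.floor((nx / GRID) * viewport.width),
    y: Math.floor((ny / GRID) * viewport.height),
  };
}

function toPixelAction(call: ModelCall, viewport: Viewport): PixelAction {
  const point = () => toPixels(Number(call.args.x), Number(call.args.y), viewport);
  switch (call.name) {
    case "click_at":
      return { kind: "click", ...point() };
    case "type_text_at":
      return { kind: "type", ...point(), text: String(call.args.text ?? "") };
    default:
      // Fail loudly rather than silently dropping an action the model requested.
      throw new Error(`Unsupported action: ${call.name}`);
  }
}
```

With the 1280×800 session from the snippet above, a `click_at` call at normalized (500, 500) resolves to pixel (640, 400), the center of the viewport.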

Full runnable starter: Steel + Gemini Computer Use recipe →

Resources