Browser tool

Learn how to use Browser tool with Composio

Overview

SLUG: BROWSER_TOOL

Description

Composio enables AI Agents and LLMs to authenticate and integrate with various tools via function calling.

Tools

Executing tools

To prototype you can execute some tools to see the responses and working on the [Browser tool toolkit’s playground](https://app.composio.dev/app/Browser tool)

Python
1from composio import Composio
2from openai import OpenAI
3import json
4
5openai = OpenAI()
6composio = Composio()
7
8# User ID must be a valid UUID format
9user_id = "0000-0000-0000" # Replace with actual user UUID from your database
10
11tools = composio.tools.get(user_id=user_id, toolkits=["BROWSER_TOOL"])
12
13print("[!] Tools:")
14print(json.dumps(tools))
15
16def invoke_llm(task = "What can you do?"):
17 completion = openai.chat.completions.create(
18 model="gpt-4o",
19 messages=[
20 {
21 "role": "user",
22 "content": task, # Your task here!
23 },
24 ],
25 tools=tools,
26 )
27
28 # Handle Result from tool call
29 result = composio.provider.handle_tool_calls(user_id=user_id, response=completion)
30 print(f"[!] Completion: {completion}")
31 print(f"[!] Tool call result: {result}")
32
33invoke_llm()

Tool List

Tool Name: Copy Selected Text

Description

Copy currently selected text on the page to clipboard - ideal for extracting highlighted content, copying form data, or harvesting visible text selections.

Action Parameters

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Drag and Drop

Description

Execute precise drag and drop operations - essential for file uploads, list reordering, element moving, and complex ui interactions that require drag-based manipulation.

Action Parameters

button
stringDefaults to left
endX
integerRequired
endY
integerRequired
startX
integerRequired
startY
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Fetch Webpage Content

Description

Decision brain: essential for analysis, planning, and information gathering. primary uses: understand page state, locate elements, verify changes, plan next actions fallback role: when screenshot blocked/fails, use this diligently for information format: html=element discovery & coordinates | markdown=content analysis hybrid workflow: use for decisions → ai agent for precision clicking

Action Parameters

format
stringDefaults to markdown
idleTtlSec
integer
newPage
boolean
returnPartialOnTimeout
booleanDefaults to True
url
string
wait
integerDefaults to 1000

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Get Clipboard Content

Description

Read current content from the system clipboard - essential for data transfer workflows, extracting copied text, and reading user-copied data for processing.

Action Parameters

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Keyboard Shortcut

Description

Execute keyboard shortcuts and key combinations - essential for copy/paste, navigation, and application commands that agents need for efficient browser automation.

Action Parameters

holdTime
integer
keys
arrayRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Mouse Click

Description

Manual precision: coordinate-based clicking - less efficient than ai agent for complex clicks. when to use: simple single clicks when you have exact coordinates better alternative: use performwebtask ai agent for complex/precise clicking pattern: fetchwebpage(html) → estimate coordinates → click → verify hints: center ~(640,350) | header ~y=150 | content ~y=300-500 | success rate: 85%

Action Parameters

button
stringDefaults to left
x
integerRequired
y
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Mouse Double Click

Description

Execute a precise double click at specified screen coordinates - ideal for opening files, selecting text, or activating ui elements that require double click gestures.

Action Parameters

button
stringDefaults to left
x
integerRequired
y
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Mouse Down (Press and Hold)

Description

Press and hold mouse button at coordinates - use for starting custom drag operations, text selections, or long-press interactions. must be followed by mouseup action to complete.

Action Parameters

button
stringDefaults to left
x
integerRequired
y
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Mouse Move

Description

Move mouse cursor to precise coordinates without clicking - perfect for triggering hover effects, revealing tooltips, and positioning for subsequent interactions.

Action Parameters

x
integerRequired
y
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Mouse Up (Release Button)

Description

Release mouse button at coordinates - completes drag operations, text selections, and long-press interactions. should be used after mousedown to finish mouse button sequences.

Action Parameters

button
stringDefaults to left
x
integerRequired
y
integerRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Navigate to URL

Description

Session foundation: always start here - creates browser session and navigates to url. workflow: navigate() → fetchwebpage() → analysis/planning → ai agent or manual interactions → verify provides: live debugurl for user, initial page snapshot, persistent session tip: always show debugurl to user before proceeding

Action Parameters

idleTtlSec
integer
url
stringRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Paste Text

Description

Paste text content at the current cursor position - perfect for filling forms, inserting data into text fields, or quick content insertion at focused elements.

Action Parameters

text
stringRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: AI Perform Web Task

Description

Ai agent: best for precise clicking and complex interactions. preferred for: button clicks, form interactions, precise targeting (coordinates handled well) hybrid workflow: use after fetchwebpage analysis for informed ai interactions when to use: any clicking task, multi-step workflows, dynamic content strategy: ai handles precision → regular tools for verification

Action Parameters

aiAgent
stringDefaults to browser-use
aiModel
stringDefaults to gpt-5-nano
aiProvider
stringDefaults to openai
highlightElements
booleanDefaults to True
idleTtlSec
integer
outputSchema
object
prompt
stringRequired
url
string

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Screenshot Webpage

Description

Capture high-quality screenshot of any webpage with extensive customization options - perfect for archiving, visual documentation, full-page captures, and cross-device viewport testing.

Action Parameters

captureFullHeight
boolean
height
integerDefaults to 720
idleTtlSec
integer
imageQuality
integerDefaults to 80
scrollAllContent
boolean
url
stringRequired
wait
integerDefaults to 1000
width
integerDefaults to 1280

Action Response

data
stringRequired
error
string
mimeType
stringRequired
successful
booleanRequired

Tool Name: Scroll Page

Description

Page navigation: smooth scrolling. use: when target element not visible after fetchwebpage() distance: 200px=fine | 400px=sections | 800px=quick traverse always: scroll → fetchwebpage() → verify | success rate: 99%

Action Parameters

deltaX
integer
deltaY
integerRequired
steps
integerDefaults to 1
useOs
booleanDefaults to True
x
integerDefaults to 640
y
integerDefaults to 360

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Set Clipboard Content

Description

Store text content in the system clipboard for later paste operations - perfect for preparing data transfers, staging content for forms, or cross-application data sharing.

Action Parameters

text
stringRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Take Screenshot

Description

Visual verification: capture screenshot - must be called alone in multi execute. 🚨 critical: screenshot cannot be combined with other tools in same multi execute call fallback: if screenshot blocked/fails, use fetchwebpage diligently for information use: debug ui, verify states, document results after page changes renders: inline visual feedback | success rate: 99%

Action Parameters

Action Response

data
objectRequired
error
string
successful
booleanRequired

Tool Name: Type Text

Description

Controlled input: human-like typing. pattern: click to focus → typetext() → verify speed: delay=0 (fast) | delay=50 (human-like) | delay=100+ (careful) must focus input field first | success rate: 95%

Action Parameters

delay
integer
text
stringRequired

Action Response

data
objectRequired
error
string
successful
booleanRequired