Webscraping ai

Learn how to use Webscraping ai with Composio

Overview

SLUG: WEBSCRAPING_AI

Description

WebScraping.AI provides an API for web scraping with features like Chrome JS rendering, rotating proxies, and HTML parsing.

Authentication Details

generic_api_key (string, required)

Connecting to Webscraping ai

Create an auth config

Use the dashboard to create an auth config for the Webscraping ai toolkit. This allows you to connect multiple Webscraping ai accounts to Composio for agents to use.

1. Select App

Navigate to [Webscraping ai](https://platform.composio.dev?next_page=/marketplace/Webscraping ai).

2. Configure Auth Config Settings

Select one of the supported auth schemes for Webscraping ai and configure it here.

3. Create and Get auth config ID

Click “Create Webscraping ai Auth Config”. After creation, copy the displayed ID, which starts with ac_. This is your auth config ID. It is not sensitive, so you can save it in environment variables or a database. This ID will be used to create connections to the toolkit for a given user.
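Since the auth config ID is not sensitive, one option is to keep it in an environment variable and read it at startup. The sketch below assumes a hypothetical variable name, `WEBSCRAPING_AI_AUTH_CONFIG_ID`, and only checks the `ac_` prefix mentioned above. The helper `load_auth_config_id` is illustrative and not part of the Composio SDK.

```python
import os

# Hypothetical environment variable name; pick whatever fits your deployment.
ENV_VAR = "WEBSCRAPING_AI_AUTH_CONFIG_ID"


def load_auth_config_id(env=None) -> str:
    """Return the auth config ID from the environment, checking the ac_ prefix."""
    env = os.environ if env is None else env
    value = env.get(ENV_VAR, "").strip()
    if not value:
        raise RuntimeError(f"{ENV_VAR} is not set")
    if not value.startswith("ac_"):
        raise ValueError(f"{ENV_VAR} should start with 'ac_', got {value!r}")
    return value
```

Passing a plain dict as `env` makes the helper easy to exercise in tests without touching the real environment.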

Connect Your Account

Using API Key

from composio import Composio

# Replace these with your actual values
webscraping_ai_auth_config_id = "ac_YOUR_WEBSCRAPING_AI_CONFIG_ID"  # Auth config ID created above
user_id = "0000-0000-0000"  # UUID from database/app

composio = Composio()

def authenticate_toolkit(user_id: str, auth_config_id: str):
    # Replace this with a method to retrieve an API key from the user.
    # Or supply your own.
    user_api_key = input("[!] Enter API key")

    connection_request = composio.connected_accounts.initiate(
        user_id=user_id,
        auth_config_id=auth_config_id,
        config={"auth_scheme": "API_KEY", "val": {"generic_api_key": user_api_key}}
    )

    # API Key authentication is immediate - no redirect needed
    print(f"Successfully connected Webscraping ai for user {user_id}")
    print(f"Connection status: {connection_request.status}")

    return connection_request.id


connection_id = authenticate_toolkit(user_id, webscraping_ai_auth_config_id)

# You can verify the connection using:
connected_account = composio.connected_accounts.get(connection_id)
print(f"Connected account: {connected_account}")

Tools

Executing tools

To prototype, you can execute tools and inspect their responses in the [Webscraping ai toolkit’s playground](https://app.composio.dev/app/Webscraping ai).

Python
from composio import Composio
from openai import OpenAI
import json

openai = OpenAI()
composio = Composio()

# User ID must be a valid UUID format
user_id = "0000-0000-0000"  # Replace with actual user UUID from your database

tools = composio.tools.get(user_id=user_id, toolkits=["WEBSCRAPING_AI"])

print("[!] Tools:")
print(json.dumps(tools))

def invoke_llm(task="What can you do?"):
    completion = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": task,  # Your task here!
            },
        ],
        tools=tools,
    )

    # Handle Result from tool call
    result = composio.provider.handle_tool_calls(user_id=user_id, response=completion)
    print(f"[!] Completion: {completion}")
    print(f"[!] Tool call result: {result}")

invoke_llm()

Tool List

Tool Name: Get account usage and quota

Description

Tool to retrieve the account's API call quota and usage. Use when checking remaining requests and subscription details.

Action Parameters

Action Response

data (object, required)
error (string)
successful (boolean, required)
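The response fields above (`data`, `error`, `successful`) are the same for the other tools in this list, so a small helper can turn that envelope into either a payload or an exception. This is a minimal sketch that assumes the result is available as a plain Python dict with exactly those keys. The names `unwrap_tool_result` and `ToolExecutionError` are illustrative and not part of the Composio SDK, and the sample dicts are made-up placeholders rather than real API output.

```python
from typing import Any


class ToolExecutionError(Exception):
    """Raised when a tool result reports successful == False (illustrative)."""


def unwrap_tool_result(result: dict[str, Any]) -> dict[str, Any]:
    """Return result["data"] on success, otherwise raise with the error text."""
    if not result.get("successful", False):
        raise ToolExecutionError(result.get("error") or "unknown error")
    return result.get("data", {})


# Placeholder envelopes, only to show the shape of the three fields.
ok = {"data": {"placeholder": 1}, "error": None, "successful": True}
failed = {"data": {}, "error": "placeholder failure", "successful": False}

print(unwrap_tool_result(ok))
try:
    unwrap_tool_result(failed)
except ToolExecutionError as exc:
    print(f"tool failed: {exc}")
```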

Tool Name: Retrieve HTML Content

Description

Tool to retrieve the HTML content of a web page. Use when you need raw page HTML, optionally rendered with JavaScript.

Action Parameters

cookies (object)
device (string, defaults to desktop)
headers (object)
js (boolean)
proxy (string)
url (string, required)

Action Response

data (object, required)
error (string)
successful (boolean, required)
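To make the Retrieve HTML Content parameters listed above concrete, the following sketch assembles an arguments dict from them. Only `url` is required and `device` defaults to `desktop`, as documented. The function name `build_retrieve_html_arguments` is hypothetical, and how the resulting dict is passed to the tool depends on your Composio setup.

```python
from typing import Any, Optional


def build_retrieve_html_arguments(
    url: str,
    js: Optional[bool] = None,
    device: str = "desktop",
    proxy: Optional[str] = None,
    headers: Optional[dict[str, str]] = None,
    cookies: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Assemble an arguments dict, leaving out optional values that were not set."""
    if not url:
        raise ValueError("url is required")
    candidate = {
        "url": url,
        "js": js,
        "device": device,
        "proxy": proxy,
        "headers": headers,
        "cookies": cookies,
    }
    # Drop unset optionals so the tool falls back to its own defaults.
    return {name: value for name, value in candidate.items() if value is not None}


print(build_retrieve_html_arguments("https://example.com", js=True))
```

Omitting unset values instead of sending `None` keeps the request minimal and avoids guessing at how the tool treats explicit nulls.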

Tool Name: Get Rendered HTML

Description

Tool to retrieve the fully rendered HTML of a web page. Use when JavaScript-generated content must be included.

Action Parameters

cookies (string)
device (string)
disable_images (boolean)
headers (object)
js (string)
locale (string)
proxy_type (string)
referer (string)
timeout (integer)
url (string, required)
user_agent (string)
wait (integer)

Action Response

data (object, required)
error (string)
successful (boolean, required)

Tool Name: Get Text

Description

Tool to retrieve the raw text content of a specified web page. Use when you need plain text extraction from a URL.

Action Parameters

locale (string)
proxy (string)
render_js (boolean)
session (string)
timeout (integer)
url (string, required)

Action Response

data (object, required)
error (string)
successful (boolean, required)
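Because the tools differ in parameter names and types (Get Text uses `render_js` as a boolean, for example), it can help to check arguments locally before calling a tool. The sketch below encodes the Get Text parameters from the listing above as a name-to-type table and reports mismatches. The validator `validate_get_text_arguments` is a hypothetical helper, and the mapping from the documented types to Python types (`string` to `str`, `integer` to `int`, `boolean` to `bool`) is an assumption.

```python
# Parameter names and types copied from the Get Text listing above.
GET_TEXT_PARAMETERS = {
    "locale": str,
    "proxy": str,
    "render_js": bool,
    "session": str,
    "timeout": int,
    "url": str,
}
REQUIRED = {"url"}


def validate_get_text_arguments(arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    for name in sorted(REQUIRED - set(arguments)):
        problems.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        expected = GET_TEXT_PARAMETERS.get(name)
        if expected is None:
            problems.append(f"unknown parameter: {name}")
        elif isinstance(value, bool) and expected is not bool:
            # bool is a subclass of int in Python, so reject it explicitly.
            problems.append(f"{name} should be {expected.__name__}")
        elif not isinstance(value, expected):
            problems.append(f"{name} should be {expected.__name__}")
    return problems


print(validate_get_text_arguments({"url": "https://example.com", "timeout": 10}))
print(validate_get_text_arguments({"timeout": "10"}))
```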