Scrapingant

Learn how to use Scrapingant with Composio

Overview

SLUG: SCRAPINGANT

Description

ScrapingAnt is a web scraping API that provides tools for data extraction, including features like Chrome page rendering, low latency rotating proxies, JavaScript execution, and unlimited parallel requests.

Authentication Details

generic_api_key (string, required)

Connecting to Scrapingant

Create an auth config

Use the dashboard to create an auth config for the Scrapingant toolkit. This allows you to connect multiple Scrapingant accounts to Composio for agents to use.

1. Select App

Navigate to the Scrapingant toolkit page and click “Setup Integration”.

2. Configure Auth Config Settings

Select among the supported auth schemes and configure them here.

3. Create and Get auth config ID

Click “Create Integration”. After creation, copy the displayed ID starting with ac_. This is your auth config ID. It is not a sensitive value, so you can store it in environment variables or a database. This ID will be used to create connections to the toolkit for a given user.
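
If you saved the auth config ID in an environment variable, you can read it at runtime instead of hardcoding it. A minimal sketch, assuming the variable name SCRAPINGANT_AUTH_CONFIG_ID (the name is arbitrary):

Python
import os

# Assumed variable name; set it to the ac_... ID copied from the dashboard.
scrapingant_auth_config_id = os.environ["SCRAPINGANT_AUTH_CONFIG_ID"]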

Connect Your Account

Using API Key

from composio import Composio
from composio.types import auth_scheme

# Replace these with your actual values
scrapingant_auth_config_id = "ac_YOUR_SCRAPINGANT_CONFIG_ID"  # Auth config ID created above
user_id = "0000-0000-0000"  # UUID from database/app

composio = Composio()

def authenticate_toolkit(user_id: str, auth_config_id: str):
    # Replace this with a method to retrieve an API key from the user.
    # Or supply your own.
    user_api_key = input("[!] Enter API key")

    connection_request = composio.connected_accounts.initiate(
        user_id=user_id,
        auth_config_id=auth_config_id,
        config={"auth_scheme": "API_KEY", "val": user_api_key},
    )

    # API key authentication is immediate - no redirect needed
    print(f"Successfully connected Scrapingant for user {user_id}")
    print(f"Connection status: {connection_request.status}")

    return connection_request.id


connection_id = authenticate_toolkit(user_id, scrapingant_auth_config_id)

# You can verify the connection using:
connected_account = composio.connected_accounts.get(connection_id)
print(f"Connected account: {connected_account}")

Tools

Executing tools

To prototype, you can execute some tools and see how they respond in the Scrapingant toolkit’s playground.

Python
from composio import Composio
from openai import OpenAI
import json

openai = OpenAI()
composio = Composio()

# User ID must be a valid UUID format
user_id = "0000-0000-0000"  # Replace with actual user UUID from your database

tools = composio.tools.get(user_id=user_id, toolkits=["SCRAPINGANT"])

print("[!] Tools:")
print(json.dumps(tools))

def invoke_llm(task="What can you do?"):
    completion = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": task,  # Your task here!
            },
        ],
        tools=tools,
    )

    # Handle the result from the tool call
    result = composio.provider.handle_tool_calls(user_id=user_id, response=completion)
    print(f"[!] Completion: {completion}")
    print(f"[!] Tool call result: {result}")

invoke_llm()
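
You can also execute a tool directly, without routing the call through an LLM. The sketch below assumes direct execution via composio.tools.execute with a tool slug, user ID, and arguments; the slug SCRAPINGANT_SCRAPE_WEB_PAGE is an assumption here, so confirm the exact slugs against the composio.tools.get output above.

Python
from composio import Composio

composio = Composio()
user_id = "0000-0000-0000"  # Same user UUID used when connecting the account

# Slug assumed for illustration; verify it against the slugs listed by composio.tools.get.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WEB_PAGE",
    user_id=user_id,
    arguments={"url": "https://example.com"},
)
print(result)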

Tool List

Tool Name: Extract Content as Markdown

Description

This tool extracts content from a given URL and converts it into Markdown format. It is particularly useful for preparing text for large language models (LLMs) and retrieval-augmented generation (RAG) systems. It supports GET, POST, PUT, and DELETE methods.

Action Parameters

block_resource (array)
browser (boolean)
cookies (string)
js_snippet (string)
method (string, defaults to get)
proxy_country (string)
proxy_type (string)
return_page_source (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
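
A minimal sketch of calling this tool directly, assuming composio.tools.execute; the slug SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN and the example arguments are assumptions, so verify the exact slug via composio.tools.get.

Python
from composio import Composio

composio = Composio()

# Slug and argument values assumed for illustration.
result = composio.tools.execute(
    "SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN",
    user_id="0000-0000-0000",  # your user's UUID
    arguments={"url": "https://example.com", "browser": True},
)
print(result)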

Tool Name: Extract Data with AI

Description

This tool allows you to extract structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing what data you want to extract, and the tool returns the extracted data in a structured format. It supports additional parameters for browser rendering, proxies, and cookies to handle dynamic content and localization.

Action Parameters

cookies (string)
enable_javascript (boolean)
extract_properties (string, required)
proxy_country (string)
proxy_type (string)
return_text (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
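
A minimal sketch for this tool, again assuming composio.tools.execute; the slug SCRAPINGANT_EXTRACT_DATA_WITH_AI is an assumption, and the extract_properties value is just an example prompt.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; extract_properties describes what to pull from the page.
result = composio.tools.execute(
    "SCRAPINGANT_EXTRACT_DATA_WITH_AI",
    user_id="0000-0000-0000",
    arguments={
        "url": "https://example.com",
        "extract_properties": "page title and the first paragraph of body text",
    },
)
print(result)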

Tool Name: Get API Credits Usage

Description

This tool retrieves the current API credit usage status for the authenticated ScrapingAnt account. It enables users to monitor their consumption of API credits, check their current usage against the subscription limits, and manage their API credits effectively.

Action Parameters

This tool takes no parameters.

Action Response

data (object, required)
error (string)
successful (boolean, required)
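
Since this tool takes no parameters, a direct call only needs the user ID. A minimal sketch; the slug SCRAPINGANT_GET_API_CREDITS_USAGE is an assumption.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; no arguments are required.
result = composio.tools.execute(
    "SCRAPINGANT_GET_API_CREDITS_USAGE",
    user_id="0000-0000-0000",
    arguments={},
)
print(result)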

Tool Name: Scrape Web Page

Description

This tool scrapes a web page using the ScrapingAnt API. It fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.

Action Parameters

block_resource (array)
browser (boolean, defaults to True)
cookies (string)
js_snippet (string)
proxy_country (string)
proxy_type (string)
return_page_source (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
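
A minimal sketch showing a few of the customization parameters described above; the slug SCRAPINGANT_SCRAPE_WEB_PAGE and the argument values are assumptions.

Python
from composio import Composio

composio = Composio()

# Slug and argument values assumed for illustration.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WEB_PAGE",
    user_id="0000-0000-0000",
    arguments={
        "url": "https://example.com",
        "browser": True,               # render the page in a headless browser
        "wait_for_selector": "#main",  # wait until this element appears (example selector)
    },
)
print(result)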

Tool Name: Scrape with Extended JSON Output

Description

This tool scrapes a target URL and returns an extended JSON response. It utilizes ScrapingAnt's /v2/extended endpoint, providing richer information than the standard scraping tool, including page HTML, cookies, headers, and additional details.

Action Parameters

url (string, required)

Action Response

data (object, required)
error (string)
successful (boolean, required)
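
A minimal sketch; the slug SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT is an assumption, and url is the only documented parameter.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; the extended response includes page HTML, cookies, and headers.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT",
    user_id="0000-0000-0000",
    arguments={"url": "https://example.com"},
)
print(result)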