Scrapingant

Learn how to use Scrapingant with Composio

Overview

SLUG: SCRAPINGANT

Description

ScrapingAnt is a web scraping API that provides tools for data extraction, including features like Chrome page rendering, low latency rotating proxies, JavaScript execution, and unlimited parallel requests.

Authentication Details

generic_api_key (string, required)

Connecting to Scrapingant

Create an auth config

Use the dashboard to create an auth config for the Scrapingant toolkit. This allows you to connect multiple Scrapingant accounts to Composio for agents to use.

1. Select App

Navigate to the Scrapingant toolkit page and click “Setup Integration”.

2. Configure Auth Config Settings

Select among the supported auth schemes and configure them here.

3. Create and Get auth config ID

Click “Create Integration”. After creation, copy the displayed ID starting with ac_. This is your auth config ID. It is not a sensitive value, so you can store it in environment variables or a database. This ID will be used to create connections to the toolkit for a given user.
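
If you saved the auth config ID in an environment variable, you can read it at runtime instead of hardcoding it. A minimal sketch, assuming the variable name SCRAPINGANT_AUTH_CONFIG_ID (the name is arbitrary):

Python
import os

# Assumed variable name; set it to the ac_... ID copied from the dashboard.
scrapingant_auth_config_id = os.environ["SCRAPINGANT_AUTH_CONFIG_ID"]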

Connect Your Account

Using API Key

from composio import Composio
from composio.types import auth_scheme

# Replace these with your actual values
scrapingant_auth_config_id = "ac_YOUR_SCRAPINGANT_CONFIG_ID"  # Auth config ID created above
user_id = "0000-0000-0000"  # UUID from database/app

composio = Composio()

def authenticate_toolkit(user_id: str, auth_config_id: str):
    # Replace this with a method to retrieve an API key from the user.
    # Or supply your own.
    user_api_key = input("[!] Enter API key")

    connection_request = composio.connected_accounts.initiate(
        user_id=user_id,
        auth_config_id=auth_config_id,
        config={"auth_scheme": "API_KEY", "val": user_api_key},
    )

    # API key authentication is immediate - no redirect needed
    print(f"Successfully connected Scrapingant for user {user_id}")
    print(f"Connection status: {connection_request.status}")

    return connection_request.id


connection_id = authenticate_toolkit(user_id, scrapingant_auth_config_id)

# You can verify the connection using:
connected_account = composio.connected_accounts.get(connection_id)
print(f"Connected account: {connected_account}")

Tools

Executing tools

To prototype, you can execute some tools and see how they respond in the Scrapingant toolkit’s playground.

Python
from composio import Composio
from openai import OpenAI
import json

openai = OpenAI()
composio = Composio()

# User ID must be a valid UUID format
user_id = "0000-0000-0000"  # Replace with actual user UUID from your database

tools = composio.tools.get(user_id=user_id, toolkits=["SCRAPINGANT"])

print("[!] Tools:")
print(json.dumps(tools))

def invoke_llm(task="What can you do?"):
    completion = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": task,  # Your task here!
            },
        ],
        tools=tools,
    )

    # Handle the result from the tool call
    result = composio.provider.handle_tool_calls(user_id=user_id, response=completion)
    print(f"[!] Completion: {completion}")
    print(f"[!] Tool call result: {result}")

invoke_llm()
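
You can also execute a tool directly, without routing the call through an LLM. The sketch below assumes direct execution via composio.tools.execute with a tool slug, user ID, and arguments; the slug SCRAPINGANT_SCRAPE_WEB_PAGE is an assumption here, so confirm the exact slugs against the composio.tools.get output above.

Python
from composio import Composio

composio = Composio()
user_id = "0000-0000-0000"  # Same user UUID used when connecting the account

# Slug assumed for illustration; verify it against the slugs listed by composio.tools.get.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WEB_PAGE",
    user_id=user_id,
    arguments={"url": "https://example.com"},
)
print(result)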

Tool List

Tool Name: Extract Content as Markdown

Description

This tool extracts content from a given URL and converts it into Markdown format. It is particularly useful for preparing text for large language models (LLMs) and retrieval-augmented generation (RAG) systems. It supports GET, POST, PUT, and DELETE methods.

Action Parameters

block_resource (array)
browser (boolean)
cookies (string)
js_snippet (string)
method (string, defaults to get)
proxy_country (string)
proxy_type (string)
return_page_source (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
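
A minimal sketch of calling this tool directly, assuming composio.tools.execute; the slug SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN and the example arguments are assumptions, so verify the exact slug via composio.tools.get.

Python
from composio import Composio

composio = Composio()

# Slug and argument values assumed for illustration.
result = composio.tools.execute(
    "SCRAPINGANT_EXTRACT_CONTENT_AS_MARKDOWN",
    user_id="0000-0000-0000",  # your user's UUID
    arguments={"url": "https://example.com", "browser": True},
)
print(result)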

Tool Name: Extract Data with AI

Description

This tool allows you to extract structured data from a web page using ScrapingAnt's AI-powered extraction capabilities. You provide a URL and an AI query (prompt) describing what data you want to extract, and the tool returns the extracted data in a structured format. It supports additional parameters for browser rendering, proxies, and cookies to handle dynamic content and localization.

Action Parameters

cookies (string)
enable_javascript (boolean)
extract_properties (string, required)
proxy_country (string)
proxy_type (string)
return_text (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
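
A minimal sketch for this tool, again assuming composio.tools.execute; the slug SCRAPINGANT_EXTRACT_DATA_WITH_AI is an assumption, and the extract_properties value is just an example prompt.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; extract_properties describes what to pull from the page.
result = composio.tools.execute(
    "SCRAPINGANT_EXTRACT_DATA_WITH_AI",
    user_id="0000-0000-0000",
    arguments={
        "url": "https://example.com",
        "extract_properties": "page title and the first paragraph of body text",
    },
)
print(result)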

Tool Name: Get API Credits Usage

Description

This tool retrieves the current API credit usage status for the authenticated ScrapingAnt account. It enables users to monitor their consumption of API credits, check their current usage against the subscription limits, and manage their API credits effectively.

Action Parameters

This tool takes no parameters.

Action Response

data (object, required)
error (string)
successful (boolean, required)
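
Since this tool takes no parameters, a direct call only needs the user ID. A minimal sketch; the slug SCRAPINGANT_GET_API_CREDITS_USAGE is an assumption.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; no arguments are required.
result = composio.tools.execute(
    "SCRAPINGANT_GET_API_CREDITS_USAGE",
    user_id="0000-0000-0000",
    arguments={},
)
print(result)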

Tool Name: Scrape Web Page

Description

This tool scrapes a web page using the ScrapingAnt API. It fetches the HTML content of the specified URL. Users can customize the scraping behavior by enabling a headless browser, using proxies, waiting for specific elements, executing JavaScript, passing cookies, and blocking certain resources.

Action Parameters

block_resource (array)
browser (boolean, defaults to True)
cookies (string)
js_snippet (string)
proxy_country (string)
proxy_type (string)
return_page_source (boolean)
url (string, required)
wait_for_selector (string)

Action Response

data (object, required)
error (string)
successful (boolean, required)
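
A minimal sketch showing a few of the customization parameters described above; the slug SCRAPINGANT_SCRAPE_WEB_PAGE and the argument values are assumptions.

Python
from composio import Composio

composio = Composio()

# Slug and argument values assumed for illustration.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WEB_PAGE",
    user_id="0000-0000-0000",
    arguments={
        "url": "https://example.com",
        "browser": True,               # render the page in a headless browser
        "wait_for_selector": "#main",  # wait until this element appears (example selector)
    },
)
print(result)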

Tool Name: Scrape with Extended JSON Output

Description

This tool scrapes a target URL and returns an extended JSON response. It utilizes ScrapingAnt's /v2/extended endpoint, providing richer information than the standard scraping tool, including page HTML, cookies, headers, and additional details.

Action Parameters

url (string, required)

Action Response

data (object, required)
error (string)
successful (boolean, required)
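
A minimal sketch; the slug SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT is an assumption, and url is the only documented parameter.

Python
from composio import Composio

composio = Composio()

# Slug assumed for illustration; the extended response includes page HTML, cookies, and headers.
result = composio.tools.execute(
    "SCRAPINGANT_SCRAPE_WITH_EXTENDED_JSON_OUTPUT",
    user_id="0000-0000-0000",
    arguments={"url": "https://example.com"},
)
print(result)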