diff --git a/mcp-server.mdx b/mcp-server.mdx
index 4e41cb84..c0b00bc1 100644
--- a/mcp-server.mdx
+++ b/mcp-server.mdx
@@ -402,7 +402,37 @@ Check the status of a batch operation.
 }
 ```
 
-### 4. Search Tool (`firecrawl_search`)
+### 4. Map Tool (`firecrawl_map`)
+
+Map a website to discover all indexed URLs on the site.
+
+```json
+{
+  "name": "firecrawl_map",
+  "arguments": {
+    "url": "https://example.com",
+    "search": "blog",
+    "sitemap": "include",
+    "includeSubdomains": false,
+    "limit": 100,
+    "ignoreQueryParameters": true
+  }
+}
+```
+
+#### Map Tool Options:
+
+- `url`: The base URL of the website to map
+- `search`: Optional search term to filter URLs
+- `sitemap`: Control sitemap usage: `"include"`, `"skip"`, or `"only"`
+- `includeSubdomains`: Whether to include subdomains in the mapping
+- `limit`: Maximum number of URLs to return
+- `ignoreQueryParameters`: Whether to ignore query parameters when mapping
+
+**Best for:** Discovering URLs on a website before deciding what to scrape; finding specific sections of a website.
+**Returns:** Array of URLs found on the site.
+
+### 5. Search Tool (`firecrawl_search`)
 
 Search the web and optionally extract content from search results.
 
@@ -422,7 +452,7 @@ Search the web and optionally extract content from search results.
 }
 ```
 
-### 5. Crawl Tool (`firecrawl_crawl`)
+### 6. Crawl Tool (`firecrawl_crawl`)
 
 Start an asynchronous crawl with advanced options.
 
@@ -439,7 +469,22 @@ Start an asynchronous crawl with advanced options.
 }
 ```
 
-### 6. Extract Tool (`firecrawl_extract`)
+### 7. Check Crawl Status (`firecrawl_check_crawl_status`)
+
+Check the status of a crawl job.
+
+```json
+{
+  "name": "firecrawl_check_crawl_status",
+  "arguments": {
+    "id": "550e8400-e29b-41d4-a716-446655440000"
+  }
+}
+```
+
+**Returns:** Status and progress of the crawl job, including results if available.
+
+### 8. Extract Tool (`firecrawl_extract`)
 
 Extract structured information from web pages using LLM capabilities.
 Supports both cloud AI and self-hosted LLM extraction.
@@ -496,58 +541,6 @@ Example response:
 
 When using a self-hosted instance, the extraction will use your configured LLM. For cloud API, it uses Firecrawl's managed LLM service.
 
-### 7. Deep Research Tool (firecrawl_deep_research)
-
-Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.
-
-```json
-{
-  "name": "firecrawl_deep_research",
-  "arguments": {
-    "query": "how does carbon capture technology work?",
-    "maxDepth": 3,
-    "timeLimit": 120,
-    "maxUrls": 50
-  }
-}
-```
-
-Arguments:
-
-- query (string, required): The research question or topic to explore.
-- maxDepth (number, optional): Maximum recursive depth for crawling/search (default: 3).
-- timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
-- maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).
-
-Returns:
-
-- Final analysis generated by an LLM based on research. (data.finalAnalysis)
-- May also include structured activities and sources used in the research process.
-
-### 8. Generate LLMs.txt Tool (firecrawl_generate_llmstxt)
-
-Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
-
-```json
-{
-  "name": "firecrawl_generate_llmstxt",
-  "arguments": {
-    "url": "https://example.com",
-    "maxUrls": 20,
-    "showFullText": true
-  }
-}
-```
-
-Arguments:
-
-- url (string, required): The base URL of the website to analyze.
-- maxUrls (number, optional): Max number of URLs to include (default: 10).
-- showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
-
-Returns:
-
-- Generated llms.txt file contents and optionally the llms-full.txt (data.llmstxt and/or data.llmsfulltxt)
 
 ## Logging System
 
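For reference, each `arguments` block in the tool docs this patch touches is carried inside an MCP `tools/call` JSON-RPC 2.0 request. A minimal sketch of that envelope, using the map tool's example arguments (the `make_tool_call` helper and the request id are illustrative, not part of the server):

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 "tools/call" request as an MCP client would send it."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Same arguments as the firecrawl_map example above.
request = make_tool_call(1, "firecrawl_map", {
    "url": "https://example.com",
    "search": "blog",
    "limit": 100,
})

print(json.dumps(request, indent=2))
```

The documented JSON blocks are exactly what ends up under `params` here; the surrounding envelope is supplied by whatever MCP client library issues the call.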