# Tavily > ## Documentation Index --- # Source: https://docs.tavily.com/documentation/partnerships/IBM.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # IBM watsonx Orchestrate > Integrate Tavily's AI-powered research capabilities with IBM watsonx Orchestrate ## Overview Tavily offers two services on IBM watsonx Orchestrate: * **Tavily Research Agent** — An AI-powered research agent that conducts comprehensive web research using coordinated parallel sub-agents to deliver detailed, citation-backed reports on complex topics. * **Tavily Search API** — Real-time web search optimized for AI agents and LLMs. Both services are available through the IBM Cloud catalog and can be procured using IBM credits. ## Setup Guide ### Step 1: Create a Tavily Instance on IBM Cloud 1. Navigate to [IBM Cloud](https://cloud.ibm.com/) 2. In the search bar, type "Tavily" to find the available services Search for Tavily in IBM Cloud 3. Select either **Tavily Search API** or **Tavily Research Agent** depending on your needs 4. Click **Create** to provision a new instance Create Tavily instance ### Step 2: Copy Your Bearer Token Once your instance is created, copy the bearer token from the credentials section. You'll need this to connect the agent in watsonx Orchestrate. Copy bearer token ### Step 3: Add Tavily to watsonx Orchestrate 1. Navigate to [watsonx Orchestrate](https://dl.watson-orchestrate.ibm.com/chat) 2. Create a new agent Create agent in watsonx Orchestrate 3. Name your agent Name your agent 4. Add a collaborator agent Add collaborator agent 5. Select **Tavily Research Agent** from the partner agents list Select Tavily agent 6. Review the agent details and click **Add as collaborator** Add Tavily as collaborator 7. Enter your bearer token (from Step 2) in the **Bearer token** field and click **Register and add** Register agent with bearer token 8. The Tavily Research Agent will now appear in your agent's **Toolset** under the Agents section Tavily agent loaded in toolset ### Step 4: Try It Out Ask a question in the chat that requires real-time web research, and watsonx Orchestrate will automatically hand off to the Tavily Research Agent. Tavily Research Agent handoff example Your Tavily Research Agent is now ready to use within watsonx Orchestrate. ## Resources * [IBM watsonx Orchestrate Documentation](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-adding-orchestration#adding-a-collaborator-agent) * [Partner Agents Catalog](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=catalog-partner-agents) --- # Source: https://docs.tavily.com/documentation/about.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # About > Welcome to Tavily! Looking for a step-by-step tutorial to get started in under 5 minutes? Head to our [Quickstart guide](/guides/quickstart) and start coding! ## Who are we? We're a team of AI researchers and developers passionate about helping you build the next generation of AI assistants. Our mission is to empower individuals and organizations with accurate, unbiased, and factual information. ## What is the Tavily Search Engine? Building an AI agent that leverages realtime online information is not a simple task. 
Scraping doesn't scale and requires expertise to refine. Current search engine APIs don't provide explicit answers to queries but simply return potentially related articles (which are not always relevant), and they are not very customizable for AI agent needs. This is why we're excited to introduce the first search engine for AI agents - [Tavily](https://app.tavily.com). Tavily is a search engine optimized for LLMs, aimed at efficient, quick and persistent search results. Unlike other search APIs such as Serp or Google, Tavily focuses on optimizing search for AI developers and autonomous AI agents. We take care of all the burden of searching, scraping, filtering and extracting the most relevant information from online sources. All in a single API call! To try the API in action, you can now use our hosted version on our [API Playground](https://app.tavily.com/playground). If you're an AI developer looking to integrate your application with our API, or seeking increased API limits, [please reach out!](mailto:support@tavily.com) ## Why choose Tavily? Tavily shines where others fail, with a Search API optimized for LLMs. Tailored just for LLM agents, we ensure the search results are optimized for RAG. We take care of all the burden of searching, scraping, filtering and extracting information from online sources. All in a single API call! Simply pass the returned search results as context to your LLM. Beyond just fetching results, the Tavily Search API offers precision. With customizable search depths, domain management, and HTML content parsing controls, you're in the driver's seat. Committed to speed and efficiency, our API guarantees real-time and trusted information. Our team works hard to improve Tavily's performance over time. We appreciate the essence of adaptability. That's why integrating our API with your existing setup is a breeze. You can choose our [Python library](https://pypi.org/project/tavily-python/), [JavaScript package](https://www.npmjs.com/package/@tavily/core) or a simple API call. You can also use Tavily through any of our supported partners such as [LangChain](/integrations/langchain) and [LlamaIndex](/integrations/llamaindex). Our detailed documentation ensures you're never left in the dark. From setup basics to nuanced features, we've got you covered. ## How does the Search API work? Traditional search APIs such as Google, Serp and Bing retrieve search results based on a user query. However, the results are sometimes irrelevant to the goal of the search, and they return only URLs and content snippets that are not always useful. Because of this, a developer would then need to scrape the sites to extract relevant content, filter out irrelevant information, optimize the content to fit LLM context limits, and more. This task is a burden and requires a lot of time and effort to complete. The Tavily Search API takes care of all of this for you in a single API call. The Tavily Search API aggregates up to 20 sites in a single API call, and uses proprietary AI to score, filter and rank the most relevant sources and content for your task, query or goal. In addition, Tavily allows developers to add custom fields such as context and limit response tokens to enable the optimal search experience for LLMs. Tavily can also help your AI agent make better decisions by including a short answer for cross-agent communication. Because LLMs are prone to hallucination, it's crucial to optimize RAG with the right context and information. This is where Tavily comes in, delivering accurate and precise information for your RAG applications.
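To make the single-call workflow concrete, here is a minimal sketch using the `tavily-python` SDK (the query and parameter values are illustrative; `include_answer` requests the short answer mentioned above, and the response fields follow the shapes documented in the API reference):

```python
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-YOUR_API_KEY")

# One call: Tavily searches, scrapes, filters, and ranks sources for you.
response = client.search(
    query="What are the latest developments in AI agent frameworks?",
    search_depth="advanced",  # deeper retrieval and ranking (2 API credits)
    max_results=5,            # return only the top-ranked sources
    include_answer=True,      # also return a short, LLM-generated answer
)

print(response["answer"])
for result in response["results"]:
    print(result["title"], result["url"], result["score"])
```

Pass `response["results"]` (or just their content fields) directly to your LLM as context.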
## Getting started [Sign up](https://app.tavily.com) for Tavily to get your API key. You get **1,000 free API Credits every month**. No credit card required. Head to our [API Playground](https://app.tavily.com/playground) to familiarize yourself with our API. To get started with Tavily's APIs and SDKs using code, head to our [Quickstart Guide](/guides/quickstart) and follow the steps. Got questions? Stumbled upon an issue? Simply intrigued? Don't hesitate! Our support team is always on standby, eager to assist. Join us, dive deep, and redefine your search experience! [Contact us!](mailto:support@tavily.com) --- # Source: https://docs.tavily.com/documentation/integrations/agent-builder.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # OpenAI Agent Builder > Integrate OpenAI's Agent Builder with Tavily's MCP server to empower your AI agents with real-time web access. ## Getting Started Before you begin, make sure you have: * A [Tavily API key](https://app.tavily.com/home) (sign up for free if you don't have one) * An OpenAI account with [organization verification](https://help.openai.com/en/articles/10910291-api-organization-verification) Navigate to [Agent Builder](https://platform.openai.com/agent-builder) and click **Create New Workflow** to begin building your AI agent. Create New Workflow Click on the agent node in your workflow canvas to open the configuration panel. Agent Block In the configuration panel, locate and click on **Tools** in the sidebar to add external capabilities to your agent. Tools Panel In the MCP configuration section, paste the Tavily MCP server URL: ```bash theme={null} https://mcp.tavily.com/mcp/?tavilyApiKey=YOUR_API_KEY ``` Remember to replace `YOUR_API_KEY` with your actual Tavily API key. Need an API key? Get one instantly from your [Tavily dashboard](https://app.tavily.com/home). Click **Connect** to establish the connection to Tavily. Tavily MCP Configuration Once connected, you'll see Tavily's suite of tools available: * **tavily\_search** - Execute a search query. * **tavily\_extract** - Extract web page content from one or more specified URLs. * **tavily\_map** - Traverse websites like a graph, exploring hundreds of paths in parallel with intelligent discovery to generate comprehensive site maps. * **tavily\_crawl** - Traversal tool that can explore hundreds of paths in parallel with built-in extraction and intelligent discovery. Select the tools you want to activate for this agent, then click **Add** to integrate them. Tavily Tools Available Now configure your agent: * **Name**: Choose a descriptive name for your agent * **Instructions**: Define the agent's role and how it should use Tavily's tools * **Reasoning**: Set the appropriate reasoning effort level * Click **Preview** to test the configuration **Sample instructions:** ``` You are a research assistant that uses Tavily to search the web for up-to-date information. When the user asks questions that require current information, use Tavily to find relevant and recent sources. ``` Agent Configuration Panel Test your agent with queries that require real-time information to verify everything is working as expected.
Agent Testing Interface ## Real-World Applications ### Market Research Agents Build agents that continuously monitor industry trends, competitor activities, and market sentiment by searching for and analyzing relevant business information. ### Content Curation Systems Create agents that automatically find, extract, and summarize content from multiple sources based on your specific criteria and preferences. ### Competitive Intelligence Develop agents that crawl competitor websites, map their content strategies, and extract pricing, features, and positioning information. ### News & Event Monitors Build agents that track breaking news on specific topics by leveraging Tavily's news search mode, providing real-time updates with citations. --- # Source: https://docs.tavily.com/documentation/agent-skills.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Tavily Agent Skills > Official skills that define best practices for working with the Tavily API. Useful for AI agents like Claude Code, Codex, or Cursor. `/tavily-ai/skills` Sign up at tavily.com ## Why Use These Skills? These official skills define best practices for working with the Tavily API, going beyond just using the endpoints. They give AI agents low-level control to build custom web tooling directly in your development environment. These skills bring Tavily's services (search, extract, crawl, research) right where you work. The real-time context these tools provide significantly enhances your agent's capabilities for development tasks. Most importantly, the **tavily-best-practices** skill turns your AI agent into a true Tavily expert. Instead of reading API docs, just ask your agent how to integrate Tavily into your project. All API best practices are baked in, dramatically accelerating your build process. ## What You Can Build Copy-paste these prompts into your AI agent and start building: Build a chatbot that can answer questions about current events and up-to-date information. **Try these prompts:** ``` /tavily-best-practices Build a chatbot that integrates Tavily search to answer questions with up-to-date web information ``` ``` /tavily-best-practices Add Tavily search to my internal company chatbot so it can answer questions about our competitors ``` Create a live news dashboard that tracks topics and analyzes sentiment. **Try these prompts:** ``` /tavily-best-practices Build a website that refreshes daily with Tesla news and gives a sentiment score on each article ``` ``` /tavily-best-practices Create a news monitoring dashboard that tracks AI industry news and sends daily Slack summaries ``` Build tools that automatically enrich leads with company data from the web. **Try these prompts:** ``` /tavily-best-practices Build a lead enrichment tool that uses Tavily to find company information from their website ``` ``` /tavily-best-practices Create a script that takes a list of company URLs and extracts key business information ``` Build an autonomous agent that monitors competitors and surfaces insights. **Try these prompts:** ``` /tavily-best-practices Build a market research tool that crawls competitor documentation and pricing pages ``` ``` /tavily-best-practices Create an agent that monitors competitor product launches and generates weekly reports ``` The `/tavily-best-practices` skill is your fastest path to production. 
Describe what you want to build and your agent generates working code with best practices baked in. ## Installation ### Prerequisites * [Tavily API key](https://app.tavily.com/home) - Sign up for free * An AI agent that supports skills (Claude Code, Codex, Cursor, etc.) ### Step 1: Configure Your API Key Add your Tavily API key to your agent's environment. For Claude Code, add it to your settings file: ```bash macOS theme={null} # Open your Claude settings file open -e "$HOME/.claude/settings.json" # Or with VS Code code "$HOME/.claude/settings.json" ``` ```bash Linux theme={null} # Open your Claude settings file nano "$HOME/.claude/settings.json" # Or with VS Code code "$HOME/.claude/settings.json" ``` ```bash Windows theme={null} code %USERPROFILE%\.claude\settings.json ``` Add the following configuration: ```json theme={null} { "env": { "TAVILY_API_KEY": "tvly-YOUR_API_KEY" } } ``` Replace `tvly-YOUR_API_KEY` with your actual Tavily API key from [app.tavily.com](https://app.tavily.com/home) ### Step 2: Install the Skills Run this command in your terminal: ```bash theme={null} npx skills add tavily-ai/skills ``` ### Step 3: Restart Your Agent After installation, restart your AI agent to load the skills. ## Available Skills Build production-ready Tavily integrations with best practices baked in. Reference documentation for implementing web search, content extraction, crawling, and research in agentic workflows, RAG systems, or autonomous agents. **Invoke explicitly:** ``` /tavily-best-practices ``` **Example prompts:** * "Add Tavily search to my internal company chatbot so it can answer questions about our competitors" * "Build a lead enrichment tool that uses Tavily to find company information from their website" * "Create a news monitoring agent that tracks mentions of our brand using Tavily search" * "Implement a RAG pipeline that uses Tavily extract to pull content from industry reports" Search the web using Tavily's LLM-optimized search API. Returns relevant results with content snippets, scores, and metadata. **Invoke explicitly:** ``` /search ``` **Example prompts:** * "Search for the latest news on AI regulations" * "/search current React best practices" * "Search for Python async patterns" Get AI-synthesized research on any topic with citations. Supports structured JSON output for integration into pipelines. **Invoke explicitly:** ``` /research ``` **Example prompts:** * "Research the latest developments in quantum computing" * "/research AI agent frameworks and save to report.json" * "Research the competitive landscape for AI coding assistants" Crawl any website and save pages as local markdown files. Ideal for downloading documentation, knowledge bases, or web content for offline access or analysis. **Invoke explicitly:** ``` /crawl ``` **Example prompts:** * "Crawl the Stripe API docs and save them locally" * "/crawl [https://docs.example.com](https://docs.example.com)" * "Download the Next.js documentation for offline reference" Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages. **Invoke explicitly:** ``` /extract ``` **Example prompts:** * "Extract the content from this article URL" * "/extract [https://example.com/blog/post](https://example.com/blog/post)" * "Extract content from these three documentation pages" ## Usage Examples ### Automatic Skill Invocation Your AI agent will automatically use Tavily skills when appropriate. 
Simply describe what you need: ``` Research the latest developments in AI agents and summarize the key trends ``` ``` Search for the latest news on AI regulations ``` ``` Crawl the Stripe API docs and save them locally ``` ### Explicit Skill Invocation You can also invoke skills directly using slash commands: ``` /research AI agent frameworks and save to report.json ``` ``` /search current React best practices ``` ``` /crawl https://docs.example.com ``` ``` /extract https://example.com/blog/post ``` ``` /tavily-best-practices ``` ## Claude Code Plugin If you're using Claude Code specifically, you can also install the skills as a plugin. ### Step 1: Configure Your API Key Add your Tavily API key to your Claude Code settings file: ```bash theme={null} code ~/.claude/settings.json ``` Add the following configuration: ```json theme={null} { "env": { "TAVILY_API_KEY": "tvly-YOUR_API_KEY" } } ``` ### Step 2: Install the Skills Run these commands inside Claude Code: ``` /plugin marketplace add tavily-ai/skills ``` ``` /plugin install tavily@skills ``` ### Step 3: Restart Claude Code Clear your session and restart to load the plugin: ``` /clear ``` Then press `Ctrl+C` to restart. --- # Source: https://docs.tavily.com/documentation/integrations/agno.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Agno > Tavily is now available for integration through Agno. ## Introduction Integrate [Tavily with Agno](https://docs.agno.com/tools/toolkits/search/tavily#tavily) to enhance your AI agents with powerful web search capabilities. Agno provides a lightweight library for building agents with memory, knowledge, tools, and reasoning, making it easy to incorporate real-time web search and data extraction into your AI applications. 
## Step-by-Step Integration Guide ### Step 1: Install Required Packages Install the necessary Python packages: ```bash theme={null} pip install agno tavily-python ``` ### Step 2: Set Up API Keys * **Tavily API Key:** [Get your Tavily API key here](https://app.tavily.com/home) * **OpenAI API Key:** [Get your OpenAI API key here](https://platform.openai.com/account/api-keys) Set these as environment variables in your terminal or add them to your environment configuration file: ```bash theme={null} export TAVILY_API_KEY=your_tavily_api_key export OPENAI_API_KEY=your_openai_api_key ``` ### Step 3: Initialize Agno Agent with Tavily Tools ```python theme={null} from agno.agent import Agent from agno.tools.tavily import TavilyTools import os # Initialize the agent with Tavily tools agent = Agent( tools=[TavilyTools( search=True, # Enable search functionality max_tokens=8000, # Increase max tokens for more detailed results search_depth="advanced", # Use advanced search for comprehensive results format="markdown" # Format results as markdown )], show_tool_calls=True ) ``` ### Step 4: Example Use Cases ```python theme={null} # Example 1: Basic search with default parameters agent.print_response("Latest developments in quantum computing", markdown=True) # Example 2: Market research with multiple parameters agent.print_response( "Analyze the competitive landscape of AI-powered customer service solutions in 2024, " "focusing on market leaders and emerging trends", markdown=True ) # Example 3: Technical documentation search agent.print_response( "Find the latest documentation and tutorials about Python async programming, " "focusing on asyncio and FastAPI", markdown=True ) # Example 4: News aggregation agent.print_response( "Gather the latest news about artificial intelligence from tech news websites " "published in the last week", markdown=True ) ``` ## Additional Use Cases 1. **Content Curation**: Gather and organize information from multiple sources 2. **Real-time Data Integration**: Keep your AI agents up-to-date with the latest information 3. **Technical Documentation**: Search and analyze technical documentation 4. **Market Analysis**: Conduct comprehensive market research and analysis --- # Source: https://docs.tavily.com/documentation/integrations/anthropic.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Anthropic > Integrate Tavily with Anthropic Claude to enhance your AI applications with real-time web search capabilities. ## Installation Install the required packages: ```bash theme={null} pip install anthropic tavily-python ``` ## Setup Set up your API keys: ```python theme={null} import os # Set your API keys os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key" os.environ["TAVILY_API_KEY"] = "your-tavily-api-key" ``` ## Using Tavily with Anthropic tool calling ```python theme={null} import json from anthropic import Anthropic from tavily import TavilyClient # Initialize clients client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]) tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"]) MODEL_NAME = "claude-sonnet-4-20250514" ``` ## Implementation ### System prompt Define a system prompt to guide Claude's behavior: ```python theme={null} SYSTEM_PROMPT = ( "You are a research assistant. Use the tavily_search tool when needed.
" "After tools run and tool results are provided back to you, produce a concise, well-structured summary " "with a short bullet list of key points and a 'Sources' section listing the URLs. " ) ``` ### Tool schema Define the Tavily search tool for Claude with enhanced parameters: ```python theme={null} tools = [ { "name": "tavily_search", "description": "Search the web using Tavily. Return relevant links & summaries.", "input_schema": { "type": "object", "properties": { "query": {"type": "string", "description": "Search query string."}, "max_results": {"type": "integer", "default": 5}, "search_depth": {"type": "string", "enum": ["basic", "advanced"]}, }, "required": ["query"] } } ] ``` Scroll to the bottom to find the full json schema for search, extract, map and crawl ### Tool execution Create optimized functions to handle Tavily searches: ```python theme={null} def tavily_search(**kwargs): return tavily_client.search(**kwargs) def process_tool_call(name, args): if name == "tavily_search": return tavily_search(**args) raise ValueError(f"Unknown tool: {name}") ``` ### Main chat function The main function that handles the two-step conversation with Claude: ```python theme={null} def chat_with_claude(user_message: str): print(f"\n{'='*50}\nUser Message: {user_message}\n{'='*50}") # ---- Call 1: allow tools so Claude can ask for searches ---- initial_response = client.messages.create( model=MODEL_NAME, max_tokens=4096, system=SYSTEM_PROMPT, messages=[{"role": "user", "content": [{"type": "text", "text": user_message}]}], tools=tools, ) print("\nInitial Response stop_reason:", initial_response.stop_reason) print("Initial content:", initial_response.content) # If Claude already answered in text, return it if initial_response.stop_reason != "tool_use": final_text = next((b.text for b in initial_response.content if getattr(b, "type", None) == "text"), None) print("\nFinal Response:", final_text) return final_text # ---- Execute ALL tool_use blocks from Call 1 ---- tool_result_blocks = [] for block in initial_response.content: if getattr(block, "type", None) == "tool_use": result = process_tool_call(block.name, block.input) tool_result_blocks.append({ "type": "tool_result", "tool_use_id": block.id, "content": [{"type": "text", "text": json.dumps(result)}], }) # ---- Call 2: NO tools; ask for the final summary from tool results ---- final_response = client.messages.create( model=MODEL_NAME, max_tokens=4096, system=SYSTEM_PROMPT, messages=[ {"role": "user", "content": [{"type": "text", "text": user_message}]}, {"role": "assistant", "content": initial_response.content}, # Claude's tool requests {"role": "user", "content": tool_result_blocks}, # Your tool results {"role": "user", "content": [{"type": "text", "text": "Please synthesize the final answer now based on the tool results above. " "Include 3–7 bullets and a 'Sources' section with URLs."}]}, ], ) final_text = next((b.text for b in final_response.content if getattr(b, "type", None) == "text"), None) print("\nFinal Response:", final_text) return final_text ``` ### Usage example ```python theme={null} # Example usage chat_with_claude("What is trending now in the agents space in 2025?") ``` ```python theme={null} import os import json from anthropic import Anthropic from tavily import TavilyClient client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]) tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"]) MODEL_NAME = "claude-sonnet-4-20250514" SYSTEM_PROMPT = ( "You are a research assistant. Use the tavily_search tool when needed. 
" "After tools run and tool results are provided back to you, produce a concise, well-structured summary " "with a short bullet list of key points and a 'Sources' section listing the URLs. " ) # ---- Define your client-side tool schema for Anthropic ---- tools = [ { "name": "tavily_search", "description": "Search the web using Tavily. Return relevant links & summaries.", "input_schema": { "type": "object", "properties": { "query": {"type": "string", "description": "Search query string."}, "max_results": {"type": "integer", "default": 5}, "search_depth": {"type": "string", "enum": ["basic", "advanced"]}, }, "required": ["query"] } } ] # ---- Your local tool executor ---- def tavily_search(**kwargs): return tavily_client.search(**kwargs) def process_tool_call(name, args): if name == "tavily_search": return tavily_search(**args) raise ValueError(f"Unknown tool: {name}") def chat_with_claude(user_message: str): print(f"\n{'='*50}\nUser Message: {user_message}\n{'='*50}") # ---- Call 1: allow tools so Claude can ask for searches ---- initial_response = client.messages.create( model=MODEL_NAME, max_tokens=4096, system=SYSTEM_PROMPT, messages=[{"role": "user", "content": [{"type": "text", "text": user_message}]}], tools=tools, ) print("\nInitial Response stop_reason:", initial_response.stop_reason) print("Initial content:", initial_response.content) # If Claude already answered in text, return it if initial_response.stop_reason != "tool_use": final_text = next((b.text for b in initial_response.content if getattr(b, "type", None) == "text"), None) print("\nFinal Response:", final_text) return final_text # ---- Execute ALL tool_use blocks from Call 1 ---- tool_result_blocks = [] for block in initial_response.content: if getattr(block, "type", None) == "tool_use": result = process_tool_call(block.name, block.input) tool_result_blocks.append({ "type": "tool_result", "tool_use_id": block.id, "content": [{"type": "text", "text": json.dumps(result)}], }) # ---- Call 2: NO tools; ask for the final summary from tool results ---- final_response = client.messages.create( model=MODEL_NAME, max_tokens=4096, system=SYSTEM_PROMPT, messages=[ {"role": "user", "content": [{"type": "text", "text": user_message}]}, {"role": "assistant", "content": initial_response.content}, # Claude's tool requests {"role": "user", "content": tool_result_blocks}, # Your tool results {"role": "user", "content": [{"type": "text", "text": "Please synthesize the final answer now based on the tool results above. " "Include 3–7 bullets and a 'Sources' section with URLs."}]}, ], ) final_text = next((b.text for b in final_response.content if getattr(b, "type", None) == "text"), None) print("\nFinal Response:", final_text) return final_text # Example usage chat_with_claude("What is trending now in the agents space in 2025?") ``` ## Tavily endpoints schema for Anthropic tool definition > **Note:** When using these schemas, you can customize which parameters are exposed to the model based on your specific use case. For example, if you are building a finance application, you might set `topic`: `"finance"` for all queries without exposing the `topic` parameter. This way, the LLM can focus on deciding other parameters, such as `time_range`, `country`, and so on, based on the user's request. Feel free to modify these schemas as needed and only pass the parameters that are relevant to your application. > **API Format:** The schemas below are for Anthropic's tool format. 
Each tool uses the `input_schema` structure with `type`, `properties`, and `required` fields.
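If you register several of these tools at once, your local executor needs to route each `tool_use` block to the matching Tavily client method. Here is a minimal sketch extending the `process_tool_call` helper from the example above (it assumes your installed `tavily-python` version exposes `extract`, `crawl`, and `map` methods alongside `search`):

```python
def process_tool_call(name, args):
    # Route each Anthropic tool_use block to the matching tavily-python method.
    handlers = {
        "tavily_search": tavily_client.search,
        "tavily_extract": tavily_client.extract,
        "tavily_crawl": tavily_client.crawl,
        "tavily_map": tavily_client.map,
    }
    if name not in handlers:
        raise ValueError(f"Unknown tool: {name}")
    return handlers[name](**args)
```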
```python theme={null} tools = [ { "name": "tavily_search", "description": "A powerful web search tool that provides comprehensive, real-time results using Tavily's AI search engine. Returns relevant web content with customizable parameters for result count, content type, and domain filtering. Ideal for gathering current information, news, and detailed web content analysis.", "input_schema": { "type": "object", "required": ["query"], "properties": { "query": { "type": "string", "description": "Search query" }, "auto_parameters": { "type": "boolean", "default": False, "description": "Auto-tune parameters based on the query. Explicit values you pass still win." }, "topic": { "type": "string", "enum": ["general", "news","finance"], "default": "general", "description": "The category of the search. This will determine which of our agents will be used for the search" }, "search_depth": { "type": "string", "enum": ["basic", "advanced"], "default": "basic", "description": "The depth of the search. It can be 'basic' or 'advanced'" }, "chunks_per_source": { "type": "integer", "minimum": 1, "maximum": 3, "default": 3, "description": "Chunks are short content snippets (maximum 500 characters each) pulled directly from the source." }, "max_results": { "type": "integer", "minimum": 0, "maximum": 20, "default": 5, "description": "The maximum number of search results to return" }, "time_range": { "type": "string", "enum": ["day", "week", "month", "year"], "description": "The time range back from the current date to include in the search results. This feature is available for both 'general' and 'news' search topics" }, "start_date": { "type": "string", "format": "date", "description": "Will return all results after the specified start date. Required to be written in the format YYYY-MM-DD." }, "end_date": { "type": "string", "format": "date", "description": "Will return all results before the specified end date. Required to be written in the format YYYY-MM-DD" }, "include_answer": { "description": "Include an LLM-generated answer. 
'basic' is brief; 'advanced' is more detailed.", "oneOf": [ {"type": "boolean"}, {"type": "string", "enum": ["basic", "advanced"]} ], "default": False }, "include_raw_content": { "description": "Include the cleaned and parsed HTML content of each search result", "oneOf": [ {"type": "boolean"}, {"type": "string", "enum": ["markdown", "text"]} ], "default": False }, "include_images": { "type": "boolean", "default": False, "description": "Include a list of query-related images in the response" }, "include_image_descriptions": { "type": "boolean", "default": False, "description": "Include a list of query-related images and their descriptions in the response" }, "include_favicon": { "type": "boolean", "default": False, "description": "Whether to include the favicon URL for each result" }, "include_usage": { "type": "boolean", "default": False, "description": "Whether to include credit usage information in the response" }, "include_domains": { "type": "array", "items": {"type": "string"}, "maxItems": 300, "description": "A list of domains to specifically include in the search results, if the user asks to search on specific sites set this to the domain of the site" }, "exclude_domains": { "type": "array", "items": {"type": "string"}, "maxItems": 150, "description": "List of domains to specifically exclude, if the user asks to exclude a domain set this to the domain of the site" }, "country": { "type": "string", "enum": ["afghanistan", "albania", "algeria", "andorra", "angola", "argentina", "armenia", "australia", "austria", "azerbaijan", "bahamas", "bahrain", "bangladesh", "barbados", "belarus", "belgium", "belize", "benin", "bhutan", "bolivia", "bosnia and herzegovina", "botswana", "brazil", "brunei", "bulgaria", "burkina faso", "burundi", "cambodia", "cameroon", "canada", "cape verde", "central african republic", "chad", "chile", "china", "colombia", "comoros", "congo", "costa rica", "croatia", "cuba", "cyprus", "czech republic", "denmark", "djibouti", "dominican republic", "ecuador", "egypt", "el salvador", "equatorial guinea", "eritrea", "estonia", "ethiopia", "fiji", "finland", "france", "gabon", "gambia", "georgia", "germany", "ghana", "greece", "guatemala", "guinea", "haiti", "honduras", "hungary", "iceland", "india", "indonesia", "iran", "iraq", "ireland", "israel", "italy", "jamaica", "japan", "jordan", "kazakhstan", "kenya", "kuwait", "kyrgyzstan", "latvia", "lebanon", "lesotho", "liberia", "libya", "liechtenstein", "lithuania", "luxembourg", "madagascar", "malawi", "malaysia", "maldives", "mali", "malta", "mauritania", "mauritius", "mexico", "moldova", "monaco", "mongolia", "montenegro", "morocco", "mozambique", "myanmar", "namibia", "nepal", "netherlands", "new zealand", "nicaragua", "niger", "nigeria", "north korea", "north macedonia", "norway", "oman", "pakistan", "panama", "papua new guinea", "paraguay", "peru", "philippines", "poland", "portugal", "qatar", "romania", "russia", "rwanda", "saudi arabia", "senegal", "serbia", "singapore", "slovakia", "slovenia", "somalia", "south africa", "south korea", "south sudan", "spain", "sri lanka", "sudan", "sweden", "switzerland", "syria", "taiwan", "tajikistan", "tanzania", "thailand", "togo", "trinidad and tobago", "tunisia", "turkey", "turkmenistan", "uganda", "ukraine", "united arab emirates", "united kingdom", "united states", "uruguay", "uzbekistan", "venezuela", "vietnam", "yemen", "zambia", "zimbabwe"], "description": "Boost search results from a specific country. 
This will prioritize content from the selected country in the search results. Available only if topic is general. Country names MUST be written in lowercase, plain English, with spaces and no underscores." } } } } ] ```
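As the note above suggests, you can expose only a subset of these parameters to the model and pin the rest inside your executor. Here is a hedged sketch for a finance-focused app, where `topic` and `max_results` are fixed in code rather than left to the model (the trimmed schema and the `tavily_finance_search` name are illustrative, not part of the official schemas):

```python
finance_search_tool = {
    "name": "tavily_finance_search",
    "description": "Search recent financial news and data using Tavily.",
    "input_schema": {
        "type": "object",
        "required": ["query"],
        "properties": {
            "query": {"type": "string", "description": "Search query"},
            "time_range": {
                "type": "string",
                "enum": ["day", "week", "month", "year"],
                "description": "How far back to search",
            },
        },
    },
}

def run_finance_search(args):
    # topic and max_results are pinned here instead of being model-controlled.
    return tavily_client.search(topic="finance", max_results=5, **args)
```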
```python theme={null} tools = [ { "name": "tavily_extract", "description": "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.", "input_schema": { "type": "object", "required": ["urls"], "properties": { "urls": { "type": "array", "items": {"type": "string"}, "description": "List of URLs to extract content from" }, "include_images": { "type": "boolean", "default": False, "description": "Include a list of images extracted from the urls in the response" }, "include_favicon": { "type": "boolean", "default": False, "description": "Whether to include the favicon URL for each result" }, "include_usage": { "type": "boolean", "default": False, "description": "Whether to include credit usage information in the response" }, "extract_depth": { "type": "string", "enum": ["basic", "advanced"], "default": "basic", "description": "Depth of extraction - 'basic' or 'advanced'. Use 'advanced' for LinkedIn URLs or when explicitly told to use advanced" }, "timeout": { "type": "number", "minimum": 1, "maximum": 60, "default": None, "description": "Maximum time in seconds to wait for the URL extraction before timing out. Must be between 1.0 and 60.0 seconds. If not specified, default timeouts are applied based on extract_depth: 10 seconds for basic extraction and 30 seconds for advanced extraction" }, "format": { "type": "string", "enum": ["markdown", "text"], "default": "markdown", "description": "The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency." } } } } ] ``` ```python theme={null} tools = [ { "name": "tavily_map", "description": "A powerful web mapping tool that creates a structured map of website URLs, allowing you to discover and analyze site structure, content organization, and navigation paths. Perfect for site audits, content discovery, and understanding website architecture.", "input_schema": { "type": "object", "required": ["url"], "properties": { "url": { "type": "string", "description": "The root URL to begin the mapping" }, "instructions": { "type": "string", "description": "Natural language instructions for the crawler" }, "max_depth": { "type": "integer", "minimum": 1, "maximum": 5, "default": 1, "description": "Max depth of the mapping. Defines how far from the base URL the crawler can explore" }, "max_breadth": { "type": "integer", "minimum": 1, "default": 20, "description": "Max number of links to follow per level of the tree (i.e., per page)" }, "limit": { "type": "integer", "minimum": 1, "default": 50, "description": "Total number of links the crawler will process before stopping" }, "select_paths": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to select only URLs with specific path patterns (e.g., /docs/.*, /api/v1.*)" }, "select_domains": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to select crawling to specific domains or subdomains (e.g., ^docs\\.example\\.com$)" }, "exclude_paths": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to exclude URLs with specific path patterns (e.g., /admin/.*)."
}, "exclude_domains": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to exclude specific domains or subdomains" }, "allow_external": { "type": "boolean", "default": True, "description": "Whether to allow following links that go to external domains" }, "categories": { "type": "array", "items": { "type": "string", "enum": ["Documentation", "Blog", "Careers", "About", "Pricing", "Community", "Developers", "Contact", "Media"] }, "description": "Filter URLs using predefined categories like documentation, blog, api, etc" }, "include_usage": { "type": "boolean", "default": False, "description": "Whether to include credit usage information in the response" } } } } ] ``` ```python theme={null} tools = [ { "name": "tavily_crawl", "description": "A powerful web crawler that initiates a structured web crawl starting from a specified base URL. The crawler expands from that point like a tree, following internal links across pages. You can control how deep and wide it goes, and guide it to focus on specific sections of the site.", "input_schema": { "type": "object", "required": ["url"], "properties": { "url": { "type": "string", "description": "The root URL to begin the crawl" }, "instructions": { "type": "string", "description": "Natural language instructions for the crawler" }, "max_depth": { "type": "integer", "minimum": 1, "maximum": 5, "default": 1, "description": "Max depth of the crawl. Defines how far from the base URL the crawler can explore." }, "max_breadth": { "type": "integer", "minimum": 1, "default": 20, "description": "Max number of links to follow per level of the tree (i.e., per page)" }, "limit": { "type": "integer", "minimum": 1, "default": 50, "description": "Total number of links the crawler will process before stopping" }, "select_paths": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to select only URLs with specific path patterns (e.g., /docs/.*, /api/v1.*)" }, "select_domains": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to select crawling to specific domains or subdomains (e.g., ^docs\\.example\\.com$)" }, "exclude_paths": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to exclude paths (e.g., /private/.*, /admin/.*)" }, "exclude_domains": { "type": "array", "items": {"type": "string"}, "description": "Regex patterns to exclude domains/subdomains (e.g., ^private\\.example\\.com$)" }, "allow_external": { "type": "boolean", "default": True, "description": "Whether to allow following links that go to external domains" }, "include_images": { "type": "boolean", "default": False, "description": "Include images discovered during the crawl" }, "categories": { "type": "array", "items": { "type": "string", "enum": ["Careers", "Blog", "Documentation", "About", "Pricing", "Community", "Developers", "Contact", "Media"] }, "description": "Filter URLs using predefined categories like documentation, blog, api, etc" }, "extract_depth": { "type": "string", "enum": ["basic", "advanced"], "default": "basic", "description": "Advanced extraction retrieves more data, including tables and embedded content, with higher success but may increase latency" }, "format": { "type": "string", "enum": ["markdown", "text"], "default": "markdown", "description": "The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency."
}, "include_favicon": { "type": "boolean", "default": False, "description": "Whether to include the favicon URL for each result" }, "include_usage": { "type": "boolean", "default": False, "description": "Whether to include credit usage information in the response" } } } } ] ``` For more information about Tavily's capabilities, check out our [API documentation](/documentation/api-reference/introduction) and [best practices](/documentation/best-practices/best-practices-search). --- # Source: https://docs.tavily.com/documentation/api-credits.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Credits & Pricing > Learn how to get and manage your Tavily API Credits. ## Free API Credits You get 1,000 free API Credits every month. **No credit card required.** ## Pricing Overview Tavily operates on a simple, credit-based model: * **Free**: 1,000 credits/month * **Pay-as-you-go**: \$0.008 per credit (allows you to be charged per credit once your plan’s credit limit is reached). * **Monthly plans**: \$0.0075 - \$0.005 per credit * **Enterprise**: Custom pricing and volume |
**Plan**
| **Credits per month** | **Monthly price** | **Price per credit** | | -------------------------------- | --------------------- | ----------------- | -------------------- | | **Researcher** | 1,000 | Free | - | | **Project** | 4,000 | \$30 | \$0.0075 | | **Bootstrap** | 15,000 | \$100 | \$0.0067 | | **Startup** | 38,000 | \$220 | \$0.0058 | | **Growth** | 100,000 | \$500 | \$0.005 | | **Pay as you go** | Per usage | \$0.008 / Credit | \$0.008 | | **Enterprise** | Custom | Custom | Custom | Head to [billing](https://app.tavily.com/billing) to explore our different options and manage your plan. ## API Credits Costs ### Tavily Search Your [search depth](/api-reference/endpoint/search#body-search-depth) determines the cost of your request. * **Basic Search (`basic`):** Each request costs **1 API credit**. * **Advanced Search (`advanced`):** Each request costs **2 API credits**. ### Tavily Extract The number of successful URL extractions and your [extraction depth](/api-reference/endpoint/extract#body-extract-depth) determines the cost of your request. You never get charged if a URL extraction fails. * **Basic Extract (`basic`):** Every 5 successful URL extractions cost **1 API credit** * **Advanced Extract (`advanced`):** Every 5 successful URL extractions cost **2 API credits** ### Tavily Map The number of pages mapped and whether or not natural-language [instructions](/documentation/api-reference/endpoint/map#instructions) are specified determines the cost of your request. You never get charged if a map request fails. * **Regular Mapping:** Every 10 successful pages returned cost **1 API credit** * **Map with (`instructions`):** Every 10 successful pages returned cost **2 API credits** ### Tavily Crawl Tavily Crawl combines both mapping and extraction operations, so the cost is the sum of both: * **Crawl Cost = Mapping Cost + Extraction Cost** For example: * If you crawl 10 pages with basic extraction depth, you'll be charged **1 credit for mapping** (10 pages) + **2 credits for extraction** (10 successful extractions ÷ 5) = **3 total credits** * If you crawl 10 pages with advanced extraction depth, you'll be charged **1 credit for mapping** + **4 credits for extraction** = **5 total credits** ### Tavily Research Tavily Research follows a dynamic pricing model with minimum and maximum credit consumption boundaries associated with each request. The minimum and maximum boundaries differ based on if the request uses `model=mini` or `model=pro`. | Request Cost Boundaries | model=pro | model=mini | | ----------------------- | ----------- | ----------- | | Per-request minimum | 15 credits | 4 credits | | Per-request maximum | 250 credits | 110 credits | --- # Source: https://docs.tavily.com/documentation/best-practices/api-key-management.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # API Key Management > Learn how to handle API key leaks and best practices for key rotation. ## What to do if your API key leaks If you suspect or know that your API key has been leaked (e.g., committed to a public repository, shared in a screenshot, or exposed in client-side code), **immediate action is required** to protect your account and quota. Follow these steps immediately: 1. **Log in to your account**: Go to the [Tavily Dashboard](https://app.tavily.com). 2. **Revoke the leaked key**: Navigate to the API Keys section. 
Identify the compromised key and delete or revoke it immediately. This will stop any unauthorized usage. 3. **Generate a new key**: Create a new API key to replace the compromised one. 4. **Update your applications**: Replace the old key with the new one in your environment variables, secrets management systems, and application code. If you notice any unusual activity or usage spikes associated with the leaked key before you revoked it, please contact [support@tavily.com](mailto:support@tavily.com) for assistance. ## Rotating your API keys As a general security best practice, we recommend rotating your API keys periodically (e.g., every 90 days). This minimizes the impact if a key is ever compromised without your knowledge. ### How to rotate your keys safely To rotate your keys without downtime: 1. **Generate a new key**: Create a new API key in the [Tavily Dashboard](https://app.tavily.com) while keeping the old one active. 2. **Update your application**: Deploy your application with the new API key. 3. **Verify functionality**: Ensure your application is working correctly with the new key. 4. **Revoke the old key**: Once you are confirmed that the new key is in use and everything is functioning as expected, delete the old API key from the dashboard. Never hardcode API keys in your source code. Always use environment variables or a secure secrets manager to store your credentials. --- # Source: https://docs.tavily.com/documentation/best-practices/best-practices-crawl.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Best Practices for Crawl > Learn how to optimize crawl parameters, focus your crawls, and efficiently extract content from websites. ## Crawl vs Map Understanding when to use each API: | Feature | Crawl | Map | | ---------------------- | ---------------------------- | ------------------------ | | **Content extraction** | Full content | URLs only | | **Use case** | Deep content analysis | Site structure discovery | | **Speed** | Slower (extracts content) | Faster (URLs only) | | **Best for** | RAG, analysis, documentation | Sitemap generation | ### Use Crawl when you need: * Full content extraction from pages * Deep content analysis * Processing of paginated or nested content * Extraction of specific content patterns * Integration with RAG systems ### Use Map when you need: * Quick site structure discovery * URL collection without content extraction * Sitemap generation * Path pattern matching * Domain structure analysis ## Crawl Parameters ### Instructions Guide the crawl with natural language to focus on relevant content: ```json theme={null} { "url": "example.com", "max_depth": 2, "instructions": "Find all documentation pages about Python" } ``` **When to use instructions:** * To focus crawling on specific topics or content types * When you need semantic filtering of pages * For agentic use cases where relevance is critical ### Chunks per Source Control the amount of content returned per page to prevent context window explosion: ```json theme={null} { "url": "example.com", "instructions": "Find all documentation about authentication", "chunks_per_source": 3 } ``` **Key benefits:** * Returns only relevant content snippets (max 500 characters each) instead of full page content * Prevents context window from exploding in agentic use cases * Chunks appear in `raw_content` as: ` [...] [...] 
` > `chunks_per_source` is only available when instructions are provided. ### Depth and breadth | Parameter | Description | Impact | | ------------- | ----------------------------------------------- | -------------------------- | | `max_depth` | How many levels deep to crawl from starting URL | Exponential latency growth | | `max_breadth` | Maximum links to follow per page | Horizontal spread | | `limit` | Total maximum pages to crawl | Hard cap on pages | **Performance tip:** Each level of depth increases crawl time exponentially. Start with `max_depth=1` and increase as needed. ```json theme={null} // Conservative crawl { "url": "example.com", "max_depth": 1, "max_breadth": 20, "limit": 20 } // Comprehensive crawl { "url": "example.com", "max_depth": 3, "max_breadth": 100, "limit": 500 } ``` ## Filtering and Focusing ### Path patterns Use regex patterns to include or exclude specific paths: ```json theme={null} // Target specific sections { "url": "example.com", "select_paths": ["/blog/.*", "/docs/.*", "/guides/.*"], "exclude_paths": ["/private/.*", "/admin/.*", "/test/.*"] } // Paginated content { "url": "example.com/blog", "max_depth": 2, "select_paths": ["/blog/.*", "/blog/page/.*"], "exclude_paths": ["/blog/tag/.*"] } ``` ### Domain filtering Control which domains to crawl: ```json theme={null} // Stay within subdomain { "url": "docs.example.com", "select_domains": ["^docs.example.com$"], "max_depth": 2 } // Exclude specific domains { "url": "example.com", "exclude_domains": ["^ads.example.com$", "^tracking.example.com$"], "max_depth": 2 } ``` ### Extract depth Controls extraction quality vs. speed. | Depth | When to use | | ----------------- | -------------------------------------- | | `basic` (default) | Simple content, faster processing | | `advanced` | Complex pages, tables, structured data | ```json theme={null} { "url": "docs.example.com", "max_depth": 2, "extract_depth": "advanced", "select_paths": ["/docs/.*"] } ``` ## Use Cases ### 1. Deep or Unlinked Content Many sites have content that's difficult to access through standard means: * Deeply nested pages not in main navigation * Paginated archives (old blog posts, changelogs) * Internal search-only content **Best Practice:** ```json theme={null} { "url": "example.com", "max_depth": 3, "max_breadth": 50, "limit": 200, "select_paths": ["/blog/.*", "/changelog/.*"], "exclude_paths": ["/private/.*", "/admin/.*"] } ``` ### 2. Structured but Nonstandard Layouts For content that's structured but not marked up in schema.org: * Documentation * Changelogs * FAQs **Best Practice:** ```json theme={null} { "url": "docs.example.com", "max_depth": 2, "extract_depth": "advanced", "select_paths": ["/docs/.*"] } ``` ### 3. Multi-modal Information Needs When you need to combine information from multiple sections: * Cross-referencing content * Finding related information * Building comprehensive knowledge bases **Best Practice:** ```json theme={null} { "url": "example.com", "max_depth": 2, "instructions": "Find all documentation pages that link to API reference docs", "extract_depth": "advanced" } ``` ### 4. Rapidly Changing Content For content that updates frequently: * API documentation * Product announcements * News sections **Best Practice:** ```json theme={null} { "url": "api.example.com", "max_depth": 1, "max_breadth": 100 } ``` ### 5. 
Behind Auth / Paywalls For content requiring authentication: * Internal knowledge bases * Customer help centers * Gated documentation **Best Practice:** ```json theme={null} { "url": "help.example.com", "max_depth": 2, "select_domains": ["^help.example.com$"], "exclude_domains": ["^public.example.com$"] } ``` ### 6. Complete Coverage / Auditing For comprehensive content analysis: * Legal compliance checks * Security audits * Policy verification **Best Practice:** ```json theme={null} { "url": "example.com", "max_depth": 3, "max_breadth": 100, "limit": 1000, "extract_depth": "advanced", "instructions": "Find all mentions of GDPR and data protection policies" } ``` ### 7. Semantic Search or RAG Integration For feeding content into LLMs or search systems: * RAG systems * Enterprise search * Knowledge bases **Best Practice:** ```json theme={null} { "url": "docs.example.com", "max_depth": 2, "extract_depth": "advanced", "include_images": true } ``` ### 8. Known URL Patterns When you have specific paths to crawl: * Sitemap-based crawling * Section-specific extraction * Pattern-based content collection **Best Practice:** ```json theme={null} { "url": "example.com", "max_depth": 1, "select_paths": ["/docs/.*", "/api/.*", "/guides/.*"], "exclude_paths": ["/private/.*", "/admin/.*"] } ``` ## Performance Optimization ### Depth vs. Performance * Each level of depth increases crawl time exponentially * Start with max\_depth: 1 and increase as needed * Use max\_breadth to control horizontal expansion * Set appropriate limit to prevent excessive crawling ### Rate Limiting * Respect site's robots.txt * Implement appropriate delays between requests * Monitor API usage and limits * Use appropriate error handling for rate limits ## Integration with Map Consider using Map before Crawl to: 1. Discover site structure 2. Identify relevant paths 3. Plan crawl strategy 4. Validate URL patterns **Example workflow:** 1. Use Map to get site structure 2. Analyze paths and patterns 3. Configure Crawl with discovered paths 4. 
Execute focused crawl **Benefits:** * Discover site structure before crawling * Identify relevant path patterns * Avoid unnecessary crawling * Validate URL patterns work correctly ## Common Pitfalls ### Excessive depth * **Problem:** Setting `max_depth=4` or higher * **Impact:** Exponential crawl time, unnecessary pages * **Solution:** Start with 1-2 levels, increase only if needed ### Unfocused crawling * **Problem:** No `instructions` provided, crawling entire site * **Impact:** Wasted resources, irrelevant content, context explosion * **Solution:** Use instructions to focus the crawl semantically ### Missing limits * **Problem:** No `limit` parameter set * **Impact:** Runaway crawls, unexpected costs * **Solution:** Always set a reasonable `limit` value ### Ignoring failed results * **Problem:** Not checking which pages failed extraction * **Impact:** Incomplete data, missed content * **Solution:** Monitor failed results and adjust parameters ## Summary * Use instructions and chunks\_per\_source for focused, relevant results in agentic use cases * Start with conservative parameters (`max_depth=1, max_breadth=20`) * Use path patterns to focus crawling on relevant content * Choose appropriate extract\_depth based on content complexity * Set reasonable limits to prevent excessive crawling * Monitor failed results and adjust patterns accordingly * Use Map first to understand site structure * Implement error handling for rate limits and failures * Respect robots.txt and site policies * Optimize for your use case (speed vs. completeness) * Process results incrementally rather than waiting for full crawl > Crawling is powerful but resource-intensive. Focus your crawls, start small, monitor results, and scale gradually based on actual needs. --- # Source: https://docs.tavily.com/documentation/best-practices/best-practices-extract.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Best Practices for Extract > Learn how to optimize content extraction, choose the right approach, and configure parameters for better performance. ## Extract Parameters ### Query Use query to rerank extracted content chunks based on relevance: ```python theme={null} await tavily_client.extract( urls=["https://example.com/article"], query="machine learning applications in healthcare" ) ``` **When to use query:** * To extract only relevant portions of long documents * When you need focused content instead of full page extraction * For targeted information retrieval from specific URLs > When `query` is provided, chunks are reranked based on relevance to the query. ### Chunks Per Source Control the amount of content returned per URL to prevent context window explosion: ```python theme={null} await tavily_client.extract( urls=["https://example.com/article"], query="machine learning applications in healthcare", chunks_per_source=3 ) ``` **Key benefits:** * Returns only relevant content snippets (max 500 characters each) instead of full page content * Prevents context window from exploding * Chunks appear in `raw_content` as: ` [...] [...] ` * Must be between 1 and 5 chunks per source > `chunks_per_source` is only available when `query` is provided. 
**Example with multiple URLs:**

```python theme={null}
await tavily_client.extract(
    urls=[
        "https://example.com/ml-healthcare",
        "https://example.com/ai-diagnostics",
        "https://example.com/medical-ai"
    ],
    query="AI diagnostic tools accuracy",
    chunks_per_source=2
)
```

This returns the 2 most relevant chunks from each URL, giving you focused, relevant content without overwhelming your context window.

## Extraction Approaches

### Search with include\_raw\_content

Enable include\_raw\_content=true in Search API calls to retrieve both search results and extracted content simultaneously.

```python theme={null}
response = await tavily_client.search(
    query="AI healthcare applications",
    include_raw_content=True,
    max_results=5
)
```

**When to use:**

* Quick prototyping
* Simple queries where search results are likely relevant
* Single API call convenience

### Direct Extract API

Use the Extract API when you want control over which specific URLs to extract from.

```python theme={null}
await tavily_client.extract(
    urls=["https://example.com/article1", "https://example.com/article2"],
    query="machine learning applications",
    chunks_per_source=3
)
```

**When to use:**

* You already have specific URLs to extract from
* You want to filter or curate URLs before extraction
* You need targeted extraction with query and chunks\_per\_source

**Key difference:** The main distinction is control: with Extract, you choose exactly which URLs to extract from, while Search with `include_raw_content` extracts from all search results.

## Extract Depth

The `extract_depth` parameter controls extraction comprehensiveness:

| Depth             | Use case                                       |
| ----------------- | ---------------------------------------------- |
| `basic` (default) | Simple text extraction, faster processing      |
| `advanced`        | Complex pages, tables, structured data, media  |

### Using `extract_depth=advanced`

Best for content requiring detailed extraction:

```python theme={null}
await tavily_client.extract(
    urls=["https://example.com/complex-page"],
    extract_depth="advanced"
)
```

**When to use advanced:**

* Dynamic content or JavaScript-rendered pages
* Tables and structured information
* Embedded media and rich content
* Higher extraction success rates needed

`extract_depth=advanced` provides better accuracy but increases latency and cost. Use `basic` for simple content.
## Advanced Filtering Strategies Beyond query-based filtering, consider these approaches for curating URLs before extraction: | Strategy | When to use | | ------------ | ---------------------------------------------- | | Re-ranking | Use dedicated re-ranking models for precision | | LLM-based | Let an LLM assess relevance before extraction | | Clustering | Group similar documents, extract from clusters | | Domain-based | Filter by trusted domains before extracting | | Score-based | Filter search results by relevance score | ### Example: Score-based filtering ```python theme={null} import asyncio from tavily import AsyncTavilyClient tavily_client = AsyncTavilyClient(api_key="tvly-YOUR_API_KEY") async def filtered_extraction(): # Search first response = await tavily_client.search( query="AI healthcare applications", search_depth="advanced", max_results=20 ) # Filter by relevance score (>0.5) relevant_urls = [ result['url'] for result in response.get('results', []) if result.get('score', 0) > 0.5 ] # Extract from filtered URLs with targeted query extracted_data = await tavily_client.extract( urls=relevant_urls, query="machine learning diagnostic tools", chunks_per_source=3, extract_depth="advanced" ) return extracted_data asyncio.run(filtered_extraction()) ``` ## Integration with Search ### Optimal workflow * **Search** to discover relevant URLs * **Filter** by relevance score, domain, or content snippet * **Re-rank** if needed using specialized models * **Extract** from top-ranked sources with query and chunks\_per\_source * **Validate** extracted content quality * **Process** for your RAG or AI application ### Example end-to-end pipeline ```python theme={null} async def content_pipeline(topic): # 1. Search with sub-queries queries = generate_subqueries(topic) responses = await asyncio.gather( *[tavily_client.search(**q) for q in queries] ) # 2. Filter and aggregate urls = [] for response in responses: urls.extend([ r['url'] for r in response['results'] if r['score'] > 0.5 ]) # 3. Deduplicate urls = list(set(urls))[:20] # Top 20 unique URLs # 4. Extract with error handling extracted = await asyncio.gather( *(tavily_client.extract(url, extract_depth="advanced") for url in urls), return_exceptions=True ) # 5. Filter successful extractions return [e for e in extracted if not isinstance(e, Exception)] ``` ## Summary 1. **Use query and chunks\_per\_source** for targeted, focused extraction 2. **Choose Extract API** when you need control over which URLs to extract from 3. **Filter URLs** before extraction using scores, re-ranking, or domain trust 4. **Choose appropriate extract\_depth** based on content complexity 5. **Process URLs concurrently** with async operations for better performance 6. **Implement error handling** to manage failed extractions gracefully 7. **Validate extracted content** before downstream processing 8. **Optimize costs** by extracting only necessary content with chunks\_per\_source > Start with query and chunks\_per\_source for targeted extraction. Filter URLs strategically, extract with appropriate depth, and handle errors gracefully for production-ready pipelines. --- # Source: https://docs.tavily.com/documentation/best-practices/best-practices-research.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. 
# Best Practices for Research > Learn how to write effective prompts, choose the right model, and configure output formats for better research results. ## Prompting Define a **clear goal** with all **details** and **direction**. * **Be specific when you can.** If you already know important details, include them.
(E.g. Target market or industry, key competitors, customer segments, geography, or constraints)
* **Only stay open-ended if you don't know details and want discovery.** If you're exploring broadly, make that explicit (e.g., "tell me about the most impactful AI innovations in healthcare in 2025").
* **Avoid contradictions.** Don't include conflicting information, constraints, or goals in your prompt.
* **Share what's already known.** Include prior assumptions, existing decisions, or baseline knowledge—so the research doesn't repeat what you already have.
* **Keep the prompt clean and directed.** Use a clear task statement + essential context + desired output format. Avoid messy background dumps.

### Example Queries

```text theme={null}
"Research the company ____ and its 2026 outlook. Provide a brief overview of the company, its products, services, and market position."
```

```text theme={null}
"Conduct a competitive analysis of ____ in 2026. Identify their main competitors, compare market positioning, and analyze key differentiators."
```

```text theme={null}
"We're evaluating Notion as a potential partner. We already know they primarily serve SMB and mid-market teams, expanded their AI features significantly in 2025, and most often compete with Confluence and ClickUp. Research Notion's 2026 outlook, including market position, growth risks, and where a partnership could be most valuable. Include citations."
```

## Model

| Model  | Best For                                                              |
| ------ | --------------------------------------------------------------------- |
| `pro`  | Comprehensive, multi-agent research for complex, multi-domain topics  |
| `mini` | Targeted, efficient research for narrow or well-scoped questions      |
| `auto` | When you're unsure how complex research will be                       |

### Pro

Provides comprehensive, multi-agent research suited for complex topics that span multiple subtopics or domains. Use when you want deeper analysis, more thorough reports, or maximum accuracy.

```json theme={null}
{
  "input": "Analyze the competitive landscape for ____ in the SMB market, including key competitors, positioning, pricing models, customer segments, recent product moves, and where ____ has defensible advantages or risks over the next 2–3 years.",
  "model": "pro"
}
```

### Mini

Optimized for targeted, efficient research. Works best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis, but don't need extensive depth.

```json theme={null}
{
  "input": "What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
  "model": "mini"
}
```

## Structured Output vs. Report

* **Structured Output** - Best for data enrichment, pipelines, or powering UIs with specific fields.
* **Report** — Best for reading, sharing, or displaying verbatim (e.g., chat interfaces, briefs, newsletters).

### Formatting Your Schema

* **Write clear field descriptions.** In 1–3 sentences, say exactly what the field should contain and what to look for. This makes it easier for our models to interpret what you're looking for.
* **Match the structure you actually need.** Use the right types (arrays, objects, enums) instead of packing multiple values into one string (e.g., `competitors: string[]`, not `"A, B, C"`).
* **Avoid duplicate or overlapping fields.** Keep each field unique and specific - contradictions or redundancy can confuse our models.

## Streaming vs. Polling

* **Streaming** - Best for user interfaces where you want real-time updates.
* **Polling** - Best for background processes where you check status periodically.
See streaming in action with the [live demo](https://chat-research.tavily.com/). --- # Source: https://docs.tavily.com/documentation/best-practices/best-practices-search.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Best Practices for Search > Learn how to optimize your queries, refine search filters, and leverage advanced parameters for better performance. ## Query Optimization ### Keep your query under 400 characters Keep queries concise—under **400 characters**. Think of it as a query for an agent performing web search, not long-form prompts. ### Break complex queries into sub-queries For complex or multi-topic queries, send separate focused requests: ```json theme={null} // Instead of one massive query, break it down: { "query": "Competitors of company ABC." } { "query": "Financial performance of company ABC." } { "query": "Recent developments of company ABC." } ``` ## Search Depth The `search_depth` parameter controls the tradeoff between latency and relevance: Latency vs Relevance by Search Depth *This chart is a heuristic and is not to scale.* | Depth | Latency | Relevance | Content Type | | ------------ | ------- | --------- | ------------ | | `ultra-fast` | Lowest | Lower | Content | | `fast` | Low | Good | Chunks | | `basic` | Medium | High | Content | | `advanced` | Higher | Highest | Chunks | ### Content types | Type | Description | | ----------- | --------------------------------------------------------- | | **Content** | NLP-based summary of the page, providing general context | | **Chunks** | Short snippets reranked by relevance to your search query | Use **chunks** when you need highly targeted information aligned with your query. Use **content** when a general page summary is sufficient. ### Fast + Ultra-Fast | Depth | When to use | | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `ultra-fast` | When latency is absolutely crucial. Delivers near-instant results, prioritizing speed over relevance. Ideal for real-time applications where response time is critical. | | `fast` | When latency is more important than relevance, but you want results in reranked chunks format. Good for applications that need quick, targeted snippets. | | `basic` | A solid balance between relevance and latency. Best for general-purpose searches where you need quality results without the overhead of advanced processing. | | `advanced` | When you need the highest relevance and are willing to trade off latency. Best for queries seeking specific, detailed information. | ### Using `search_depth=advanced` Best for queries seeking specific information: ```json theme={null} { "query": "How many countries use Monday.com?", "search_depth": "advanced", "chunks_per_source": 3, "include_raw_content": true } ``` ## Filtering Results ### By date | Parameter | Description | | ------------------------- | ------------------------------------------------------- | | `time_range` | Filter by relative time: `day`, `week`, `month`, `year` | | `start_date` / `end_date` | Filter by specific date range (format: `YYYY-MM-DD`) | ```json theme={null} { "query": "latest ML trends", "time_range": "month" } { "query": "AI news", "start_date": "2025-01-01", "end_date": "2025-02-01" } ``` ### By topic Use `topic` to filter by content type. 
Set to `news` for news sources (includes `published_date` metadata):

```json theme={null}
{ "query": "What happened today in NY?", "topic": "news" }
```

### By domain

| Parameter         | Description                           |
| ----------------- | ------------------------------------- |
| `include_domains` | Limit to specific domains             |
| `exclude_domains` | Filter out specific domains           |
| `country`         | Boost results from a specific country |

```json theme={null}
// Restrict to LinkedIn profiles
{ "query": "CEO background at Google", "include_domains": ["linkedin.com/in"] }

// Exclude irrelevant domains
{ "query": "US economy trends", "exclude_domains": ["espn.com", "vogue.com"] }

// Boost results from a country
{ "query": "tech startup funding", "country": "united states" }

// Wildcard: limit to .com, exclude specific site
{ "query": "AI news", "include_domains": ["*.com"], "exclude_domains": ["example.com"] }
```

Keep domain lists short and relevant for best results.

## Response Content

### `max_results`

Limits results returned (default: `5`). Setting too high may return lower-quality results.

### `include_raw_content`

Returns full extracted page content. For comprehensive extraction, consider a two-step process:

1. Search to retrieve relevant URLs
2. Use [Extract API](/documentation/best-practices/best-practices-extract#2-two-step-process-search-then-extract) to get content

### `auto_parameters`

Tavily automatically configures parameters based on query intent. Your explicit values override automatic ones.

```json theme={null}
{
  "query": "impact of AI in education policy",
  "auto_parameters": true,
  "search_depth": "basic" // Override to control cost
}
```

`auto_parameters` may set `search_depth` to `advanced` (2 credits). Set it manually to control cost.

## Async & Performance

Use async calls for concurrent requests:

```python theme={null}
import asyncio
from tavily import AsyncTavilyClient

tavily_client = AsyncTavilyClient("tvly-YOUR_API_KEY")

async def fetch_and_gather():
    queries = ["latest AI trends", "future of quantum computing"]
    responses = await asyncio.gather(
        *(tavily_client.search(q) for q in queries),
        return_exceptions=True
    )
    for response in responses:
        if isinstance(response, Exception):
            print(f"Failed: {response}")
        else:
            print(response)

asyncio.run(fetch_and_gather())
```

## Post-Processing

### Using metadata

Leverage response metadata to refine results:

| Field         | Use case                           |
| ------------- | ---------------------------------- |
| `score`       | Filter/rank by relevance score     |
| `title`       | Keyword filtering on headlines     |
| `content`     | Quick relevance check              |
| `raw_content` | Deep analysis and regex extraction |

### Score-based filtering

The `score` indicates relevance between query and content. Higher is better, but the ideal threshold depends on your use case.

```python theme={null}
# Filter results with score > 0.7
filtered = [r for r in results if r['score'] > 0.7]
```

### Regex extraction

Extract structured data from `raw_content`:

```python theme={null}
import re

# Extract location
text = "Company: Tavily, Location: New York"
match = re.search(r"Location: (.+)", text)
location = match.group(1) if match else None  # "New York"

# Extract all emails
text = "Contact: john@example.com, support@tavily.com"
emails = re.findall(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", text)
```

--- # Source: https://docs.tavily.com/changelog.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further.
# Changelog
Track API usage by project with the new X-Project-ID header

  • You can now attach a Project ID to your API requests to organize and track usage by project. This is useful when a single API key is used across multiple projects or applications.
  • HTTP Header: Add X-Project-ID: your-project-id to any API request
  • Python SDK: Pass project\_id="your-project-id" when instantiating the client, or set the TAVILY\_PROJECT environment variable
  • JavaScript SDK: Pass projectId: "your-project-id" when instantiating the client, or set the TAVILY\_PROJECT environment variable
  • An API key can be associated with multiple projects
  • Filter requests by project in the /logs endpoint and platform usage dashboard to keep track of where requests originate from
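
For illustration, here's a minimal sketch of attaching a Project ID in Python, either as a raw `X-Project-ID` header or via the SDK's `project_id` option described above (the project name used here is hypothetical):

```python theme={null}
import requests
from tavily import TavilyClient

# Option 1: raw HTTP request with the X-Project-ID header
response = requests.post(
    "https://api.tavily.com/search",
    headers={
        "Authorization": "Bearer tvly-YOUR_API_KEY",
        "X-Project-ID": "marketing-bot",  # hypothetical project name
    },
    json={"query": "latest AI trends"},
)
print(response.json())

# Option 2: pass project_id when instantiating the Python SDK client
client = TavilyClient(api_key="tvly-YOUR_API_KEY", project_id="marketing-bot")
print(client.search("latest AI trends"))
```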

search\_depth parameter - New options: fast and ultra-fast

  • fast (BETA)
    • Optimized for low latency while maintaining high relevance to the user query
    • Cost: 1 API Credit
  • ultra-fast (BETA)
    • Optimized strictly for latency
    • Cost: 1 API Credit
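
For example, a brief sketch with the Python SDK, assuming your installed `tavily-python` version already accepts the new beta values:

```python theme={null}
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-YOUR_API_KEY")

# Low-latency search that still returns reranked chunks
fast_results = client.search("latest AI chip announcements", search_depth="fast")

# Near-instant search when latency matters more than relevance
ultra_fast_results = client.search("latest AI chip announcements", search_depth="ultra-fast")
```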

query and chunks\_per\_source parameters for Extract and Crawl

  • query (Extract)
    • Type: string
    • User intent for reranking extracted content chunks. When provided, chunks are reranked based on relevance to this query.
  • chunks\_per\_source (Extract & Crawl)
    • Type: integer
    • Range: 1 to 5
    • Default: 3
    • Chunks are short content snippets (maximum 500 characters each) pulled directly from the source.
    • Use chunks\_per\_source to define the maximum number of relevant chunks returned per source and to control the raw\_content length.
    • Chunks will appear in the raw\_content field as the top-ranked chunks joined by a \[...] separator.
    • Available only when query is provided (Extract) or instructions are provided (Crawl).
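
A minimal sketch of the Crawl variant over HTTP using these fields (endpoint and field names follow the Crawl API reference; the instructions string is just an example):

```python theme={null}
import requests

response = requests.post(
    "https://api.tavily.com/crawl",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={
        "url": "docs.tavily.com",
        "instructions": "Find all pages about the Python SDK",
        "chunks_per_source": 2,  # at most 2 relevant chunks per crawled page
    },
)
for result in response.json()["results"]:
    print(result["url"])
```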

include\_usage parameter

  • You can now include credit usage information in the API response for the Search, Extract, Crawl, and Map endpoints.
  • Set the include\_usage parameter to true to receive credit usage information in the API response.
  • Type: boolean
  • Default: false
  • When enabled, the response includes a usage object with credits information, making it easy to track API credit consumption for each request.
  • Note: The value may be 0 if the total successful calls have not yet reached the minimum threshold. See our Credits & Pricing documentation for details.
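
For example, a short sketch of reading the usage object back from a Search request (the same flag applies to Extract, Crawl, and Map):

```python theme={null}
import requests

data = requests.post(
    "https://api.tavily.com/search",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={"query": "latest ML trends", "include_usage": True},
).json()

# The usage object reports the credits consumed by this request
print(data.get("usage"))  # e.g. {"credits": 1}
```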

Tavily is now integrated with Vercel AI SDK v5

  • We've released a new @tavily/ai-sdk package that provides pre-built AI SDK tools for Vercel's AI SDK v5.
  • Easily add real-time web search, content extraction, intelligent crawling, and site mapping to your AI SDK project with ready-to-use tools.
  • Available Tools: tavilySearch, tavilyExtract, tavilyCrawl, and tavilyMap
  • Full TypeScript support with proper type definitions and seamless integration with Vercel AI SDK v5.
  • Check out our integration guide to get started.

timeout parameter for Crawl and timeout parameter for Map

  • You can now specify a custom timeout for the Crawl and Map endpoints to control how long to wait for the operation before timing out.
  • Type: float
  • Range: Between 10 and 150 seconds
  • Default: 150 seconds
  • This gives you fine-grained control over crawl and map operation timeouts, allowing you to balance between reliability and speed based on your specific use case.
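
As an illustration, a minimal HTTP sketch that caps a Map request at 30 seconds; Crawl accepts the same field:

```python theme={null}
import requests

response = requests.post(
    "https://api.tavily.com/map",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={
        "url": "docs.tavily.com",
        "timeout": 30,  # give up after 30 seconds instead of the 150-second default
    },
)
print(response.json())
```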

Role options: Owner, Admin, Member

You can now assign roles to team members, giving you more control over access and permissions. Each team has one owner, while there can be multiple admins and multiple members. The key distinction between roles is in their permissions for Billing and Settings:

  • Owner
    • Full access to all Settings
    • Access and ownership of the Billing account
  • Admin
    • Full access to Settings except ownership transfer
    • No access to Billing
  • Member
    • Limited Settings access (view members only)
    • No access to Billing

timeout parameter

  • You can now specify a custom timeout for the Extract endpoint to control how long to wait for URL extraction before timing out.
  • Type: number (float)
  • Range: Between 1.0 and 60.0 seconds
  • Default behavior: If not specified, automatic timeouts are applied based on extract\_depth: 10 seconds for basic extraction and 30 seconds for advanced extraction.
  • This gives you fine-grained control over extraction timeouts, allowing you to balance between reliability and speed based on your specific use case.
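
For example, a minimal HTTP sketch that overrides the automatic timeout for a single advanced extraction:

```python theme={null}
import requests

response = requests.post(
    "https://api.tavily.com/extract",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={
        "urls": ["https://example.com/slow-page"],
        "extract_depth": "advanced",
        "timeout": 45.0,  # wait up to 45 seconds instead of the 30-second advanced default
    },
)
print(response.json())
```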

start\_date and end\_date parameters

  • You can now use both the start\_date and end\_date parameters in the Search endpoints.
  • start\_date will return all results after the specified start date. Required to be written in the format YYYY-MM-DD.
  • end\_date will return all results before the specified end date. Required to be written in the format YYYY-MM-DD.
  • Set start\_date to 2025-01-01 and end\_date to 2025-04-01 to receive results strictly from this time range.

Log in to your account to view the usage dashboard


The usage dashboard provides the following features to paid users/teams:
  • The Usage Graph offers a breakdown of daily usage across all Tavily endpoints with historical data to enable month over month usage and spend comparison.
  • The Logs Table offers granular insight into each API request to ensure visibility and traceability with every Tavily interaction.

include\_favicon parameter

  • You can now include the favicon URL for each result in the Search, Extract, and Crawl endpoints.
  • Set the include\_favicon parameter to true to receive the favicon URL (if available) for each result in the API response.
  • This makes it easy to display website icons alongside your search, extraction, or crawl results, improving the visual context and user experience in your application.
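
For example, a small sketch that requests favicons alongside search results so they can be displayed next to each link:

```python theme={null}
import requests

data = requests.post(
    "https://api.tavily.com/search",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={"query": "Tavily documentation", "include_favicon": True},
).json()

for result in data["results"]:
    # favicon is only present when one could be resolved for the source
    print(result["title"], result.get("favicon"))
```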
Tavily Search
auto\_parameters

  • Type: boolean
  • Default: false
  • When auto\_parameters is enabled, Tavily automatically configures search parameters based on your query's content and intent. You can still set other parameters manually, and your explicit values will override the automatic ones.
  • The parameters include\_answer, include\_raw\_content, and max\_results must always be set manually, as they directly affect response size.
  • Note: search\_depth may be automatically set to advanced when it's likely to improve results. This uses 2 API credits per request. To avoid the extra cost, you can explicitly set search\_depth to basic.
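
A short sketch of the override behavior with the Python SDK, assuming your SDK version exposes `auto_parameters` (otherwise pass it in a raw HTTP request body):

```python theme={null}
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-YOUR_API_KEY")

# Let Tavily infer most parameters from the query, but pin search_depth
# to "basic" so the request never escalates to the 2-credit advanced depth.
response = client.search(
    "impact of AI in education policy",
    auto_parameters=True,
    search_depth="basic",
)
print(response["results"][0]["title"])
```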
/usage endpoint
    Easily check your API usage and plan limits.
    Just GET [https://api.tavily.com/usage](https://api.tavily.com/usage) with your API key to monitor your account in real time.
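
For example, assuming the endpoint accepts the same bearer-token authentication as the other endpoints:

```python theme={null}
import requests

response = requests.get(
    "https://api.tavily.com/usage",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
)
print(response.json())  # current usage and plan limits for the key's account
```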
Tavily Search
country parameter

Boost search results from a specific country.

    This will prioritize content from the selected country in the search results. Available only if topic is general.
Make & n8n Integrations
Tavily Extract
format parameter
  • Type: enum
  • Default: markdown
  • The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency.
  • Available options: markdown, text
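
A minimal HTTP sketch requesting plain text instead of the default markdown:

```python theme={null}
import requests

data = requests.post(
    "https://api.tavily.com/extract",
    headers={"Authorization": "Bearer tvly-YOUR_API_KEY"},
    json={
        "urls": ["https://example.com/article"],
        "format": "text",  # plain text; "markdown" is the default
    },
).json()

for result in data["results"]:
    print(result["raw_content"][:200])
```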
Tavily Search
search\_depth and chunks\_per\_source parameters
  • search\_depth
    • Type: enum
    • Default: basic
    • The depth of the search. advanced search is tailored to retrieve the most relevant sources and content snippets for your query, while basic search provides generic content snippets from each source.
    • A basic search costs 1 API Credit, while an advanced search costs 2 API Credits.
    • Available options: basic, advanced
  • chunks\_per\_source
    • Chunks are short content snippets (maximum 500 characters each) pulled directly from the source.
    • Use chunks\_per\_source to define the maximum number of relevant chunks returned per source and to control the content length.
    • Chunks will appear in the content field as the top-ranked chunks joined by a \[...] separator.
    • Available only when search\_depth is advanced.
    • Required range: 1 to 3
Tavily Crawl
  • Tavily Crawl enables you to traverse a website like a graph, starting from a base URL and automatically discovering and extracting content from multiple linked pages. With Tavily Crawl, you can:
    • Specify the starting URL and let the crawler intelligently follow links to map out the site structure.
    • Control the depth and breadth of the crawl, allowing you to focus on specific sections or perform comprehensive site-wide analysis.
    • Apply filters and custom instructions to target only the most relevant pages or content types.
    • Aggregate extracted content for further analysis, reporting, or integration into your workflows.
    • Seamlessly integrate with your automation tools or use the API directly for flexible, programmatic access.
    Tavily Crawl is ideal for use cases such as large-scale content aggregation, competitive research, knowledge base creation, and more.
    For full details and API usage examples, see the Tavily Crawl API reference.
--- # Source: https://docs.tavily.com/examples/use-cases/chat.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further.

# Chat

> Build a conversational chat agent with real-time web search, crawl, and extract capabilities using Tavily's API

Tavily Chatbot Demo

## Try Our Chatbot

### Step 1: Get Your API Key

### Step 2: Chat with Tavily

### Step 3: Read The Open Source Code

## Features

1. **Fast Results**: Tavily's API delivers quick responses essential for real-time chat experiences.
2. **Intelligent Parameter Selection**: Dynamically select API parameters based on conversation context using LangChain integration. Specifically designed for agentic systems. All you need is a natural language input; there's no need to configure structured JSON for our API.
3. **Content Snippets**: Tavily provides compact summaries of search results in the `content` field, best for maintaining small context sizes in low latency, multi-turn applications.
4. **Source Attribution**: All search, extract, and crawl results include URLs, enabling easy implementation of citations for transparency and credibility in responses.

## How Does It Work?

The chatbot uses a simple ReAct architecture to manage conversation flow and decision-making. Here's how the core components work together:

The workflow consists of several key components:

The chatbot uses LangGraph MemorySaver to manage conversation flow. The graph structure controls how messages are processed and routed.

This code snippet is not meant to run standalone. View the full implementation in our [GitHub repository](https://github.com/tavily-ai/tavily-chat).

```python theme={null}
class WebAgent:
    def __init__(
        self,
    ):
        self.llm = ChatOpenAI(
            model="gpt-4.1-nano", api_key=os.getenv("OPENAI_API_KEY")
        ).with_config({"tags": ["streaming"]})

        # Define the LangChain search tool
        self.search = TavilySearch(
            max_results=10,
            topic="general",
            api_key=os.getenv("TAVILY_API_KEY")
        )

        # Define the LangChain extract tool
        self.extract = TavilyExtract(
            extract_depth="advanced",
            api_key=os.getenv("TAVILY_API_KEY")
        )

        # Define the LangChain crawl tool
        self.crawl = TavilyCrawl(api_key=os.getenv("TAVILY_API_KEY"))

        self.prompt = PROMPT
        self.checkpointer = MemorySaver()

    def build_graph(self):
        """
        Build and compile the LangGraph workflow.
        """
        return create_react_agent(
            prompt=self.prompt,
            model=self.llm,
            tools=[self.search, self.extract, self.crawl],
            checkpointer=self.checkpointer,
        )
```

The router decides whether to use base knowledge or perform a Tavily web search, extract, or crawl based on:

* Question complexity
* Need for current information
* Available conversation context

The chatbot maintains conversation history using a memory system that:

* Preserves context across multiple exchanges
* Stores relevant search results for future reference
* Manages system prompts and initialization

When Tavily access is needed, the chatbot:

* Performs targeted web search, extract, or crawl using the LangChain integration
* Includes source citations

Users receive real-time updates on:

* Search progress
* Response generation
* Source processing

--- # Source: https://docs.tavily.com/examples/use-cases/company-research.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Company Research > Perform in-depth company research with Tavily Search and Extract.
Company Research Demo ## Try Our Company Researcher ### Step 1: Get Your API Key ### Step 2: Try the Company Researcher ### Step 3: Read The Open Source Code ## Why Use Tavily for company research? This is one of the most popular use cases for Tavily. Our powerful APIs can easily be integrated with agentic workflows to perform in-depth, accurate company research. Tavily offers several advantages for conducting in-depth company research: 1. **Comprehensive Data Gathering**: Tavily's advanced search algorithms pull relevant information from a wide range of online sources, providing a robust foundation for in-depth company research. 2. **Flexible Agentic Search**: When Tavily is integrated into agentic workflows, such as those powered by frameworks like LangGraph, it allows AI agents to dynamically tailor their search strategies. The agents can decide to perform either a news or general search depending on the context, retrieve raw content for more in-depth analysis, or simply pull summaries when high-level insights are sufficient. This adaptability ensures that the research process is optimized according to the specific requirements of the task and the nature of the data available, bringing a new level of autonomy and intelligence to the research process. 3. **Real-time Data Retrieval**: Tavily ensures that the data used for research is up-to-date by querying live sources. This is crucial for company research where timely information can impact the accuracy and relevance of the analysis. 4. **Efficient and Scalable**: Tavily handles multiple queries simultaneously, making it capable of processing large datasets quickly. This efficiency reduces the time needed for comprehensive research, allowing for faster decision-making. --- # Source: https://docs.tavily.com/documentation/integrations/composio.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Composio > Tavily is now available for integration through Composio. ## Introduction Integrate Tavily with Composio to enhance your AI workflows with powerful web search capabilities. Composio provides a platform to connect your AI agents to external tools like Tavily, making it easy to incorporate real-time web search and data extraction into your applications. 
## Step-by-Step Integration Guide ### Step 1: Install Required Packages Install the necessary Python packages: ```bash theme={null} pip install composio composio-openai openai python-dotenv ``` ### Step 2: Set Up API Keys * **OpenAI API Key:** [Get your OpenAI API key here](https://platform.openai.com/account/api-keys) * **Composio API Key:** [Get your Composio API key here](https://app.composio.dev/dashboard) Set these as environment variables in your terminal or add them to your environment configuration file: ```bash theme={null} export OPENAI_API_KEY=your_openai_api_key export COMPOSIO_API_KEY=your_composio_api_key ``` ### Step 3: Connect Tavily to Composio ```python theme={null} from composio import Composio from dotenv import load_dotenv load_dotenv() composio = Composio() # Use composio managed auth auth_config = composio.auth_configs.create( toolkit="tavily", options={ "type": "use_custom_auth", "auth_scheme": "API_KEY", "credentials": {} } ) print(auth_config) auth_config_id = auth_config.id user_id = "your-user-id" connection_request = composio.connected_accounts.link(user_id, auth_config_id) print(connection_request.redirect_url) ``` ### Step 4: Example Use Case ```python theme={null} from composio import Composio from composio_openai import OpenAIProvider from openai import OpenAI import os from dotenv import load_dotenv load_dotenv() # Initialize OpenAI client with API key client = OpenAI() # Initialize Composio toolset composio = Composio( api_key=os.getenv("COMPOSIO_API_KEY"), provider=OpenAIProvider() ) user_id = "your-user-id" # Get the Tavily tool with all available parameters tools = composio.tools.get(user_id, toolkits=['TAVILY'] ) # Define the market research task with specific parameters task = { "query": "Analyze the competitive landscape of AI-powered customer service solutions in 2024", "search_depth": "advanced", "include_answer": True, "max_results": 10, # Focus on relevant industry sources "include_domains": [ "techcrunch.com", "venturebeat.com", "forbes.com", "gartner.com", "marketsandmarkets.com" ], } # Send request to LLM messages = [{"role": "user", "content": str(task)}] response = client.chat.completions.create( model="gpt-4.1", messages=messages, tools=tools, tool_choice="auto" ) # Handle tool call via Composio execution_result = None response_message = response.choices[0].message if response_message.tool_calls: execution_result = composio.provider.handle_tool_calls(user_id,response) print("Execution Result:", execution_result) messages.append(response_message) # Add tool response messages for tool_call, result in zip(response_message.tool_calls, execution_result): messages.append({ "role": "tool", "content": str(result["data"]), "tool_call_id": tool_call.id }) # Get final response from LLM final_response = client.chat.completions.create( model="gpt-4.1", messages=messages ) print("\nMarket Research Summary:") print(final_response.choices[0].message.content) else: print("LLM responded directly (no tool used):", response_message.content) ``` ## Additional Use Cases 1. **Research Automation**: Automate the collection and summarization of research data 2. **Content Curation**: Gather and organize information from multiple sources 3. **Real-time Data Integration**: Keeping your AI models up-to-date with the latest information. 
--- # Source: https://docs.tavily.com/examples/quick-tutorials/cookbook.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further.

# Cookbook

> A collection of guided examples and code snippets for using Tavily.

## Fundamentals

Search, Extract, and Crawl the Web
Build a Web Research Agent
Combine Internal Data with Web Data

## Search

Track product updates from any company

## Research

Asynchronous polling for background research requests
Stream real-time progress and answers during research
Get results in custom schema formats
Refine user prompts through multi-turn clarification before research
Combine Tavily research with your internal data

## Crawl

Crawl websites and turn content into a searchable knowledge base
Intelligent web research agent that autonomously gathers and synthesizes information
Collect data from websites and export the results as organized PDF files

--- # Source: https://docs.tavily.com/examples/use-cases/crawl-to-rag.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further.

# Crawl to RAG

> Turn Any Website into a Searchable Knowledge Base using Tavily and MongoDB.

## The system operates through a two-step process:

### 1. Website Crawling & Vectorization:

Use Tavily's crawling endpoint to extract and sitemap content from a webpage URL, then embed it into a MongoDB Atlas vector index for retrieval.

Vectorize

### 2. Intelligent Q\&A Interface:

Query your crawled data through a conversational agent that provides citation-backed answers while maintaining conversation history and context. The agent intelligently distinguishes between informational questions (requiring vector search) and conversational queries (using general knowledge).

Chat with vector

## Try Our Crawl to RAG Use Case

### Step 1: Get Your API Key

### Step 2: Chat with Tavily

### Step 3: Read The Open Source Code

## Features

1. **Advanced Web Crawling**: Deep website content extraction using Tavily's crawling API
2. **Vector Search**: MongoDB Atlas vector search with OpenAI embeddings for semantic content retrieval
3. **Smart Question Routing**: Automatic detection of informational vs. conversational queries
4. **Persistent Memory**: Conversation history and context preservation using LangGraph-MongoDB checkpointing
5. **Session Management**: Thread-based conversational persistence and vector store management

--- # Source: https://docs.tavily.com/documentation/api-reference/endpoint/crawl.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Tavily Crawl > Tavily Crawl is a graph-based website traversal tool that can explore hundreds of paths in parallel with built-in extraction and intelligent discovery. ## OpenAPI ````yaml POST /crawl openapi: 3.0.3 info: title: Tavily Search and Extract API description: >- Our REST API provides seamless access to Tavily Search, a powerful search engine for LLM agents, and Tavily Extract, an advanced web scraping solution optimized for LLMs.
version: 1.0.0 servers: - url: https://api.tavily.com/ security: [] tags: - name: Search - name: Extract - name: Crawl - name: Map - name: Research - name: Usage paths: /crawl: post: summary: Initiate a web crawl from a base URL description: >- Tavily Crawl is a graph-based website traversal tool that can explore hundreds of paths in parallel with built-in extraction and intelligent discovery. requestBody: description: Parameters for the Tavily Crawl request. required: true content: application/json: schema: type: object properties: url: type: string description: The root URL to begin the crawl. example: docs.tavily.com instructions: type: string description: >- Natural language instructions for the crawler. When specified, the mapping cost increases to 2 API credits per 10 successful pages instead of 1 API credit per 10 pages. example: Find all pages about the Python SDK chunks_per_source: type: integer description: >- Chunks are short content snippets (maximum 500 characters each) pulled directly from the source. Use `chunks_per_source` to define the maximum number of relevant chunks returned per source and to control the `raw_content` length. Chunks will appear in the `raw_content` field as: ` [...] [...] `. Available only when `instructions` are provided. Must be between 1 and 5. minimum: 1 maximum: 5 default: 3 max_depth: type: integer description: >- Max depth of the crawl. Defines how far from the base URL the crawler can explore. default: 1 minimum: 1 maximum: 5 max_breadth: type: integer description: >- Max number of links to follow per level of the tree (i.e., per page). default: 20 minimum: 1 maximum: 500 limit: type: integer description: >- Total number of links the crawler will process before stopping. default: 50 minimum: 1 select_paths: type: array description: >- Regex patterns to select only URLs with specific path patterns (e.g., `/docs/.*`, `/api/v1.*`). items: type: string default: null select_domains: type: array description: >- Regex patterns to select crawling to specific domains or subdomains (e.g., `^docs\.example\.com$`). items: type: string default: null exclude_paths: type: array description: >- Regex patterns to exclude URLs with specific path patterns (e.g., `/private/.*`, `/admin/.*`). items: type: string default: null exclude_domains: type: array description: >- Regex patterns to exclude specific domains or subdomains from crawling (e.g., `^private\.example\.com$`). items: type: string default: null allow_external: type: boolean description: >- Whether to include external domain links in the final results list. default: true include_images: type: boolean description: Whether to include images in the crawl results. default: false extract_depth: type: string description: >- Advanced extraction retrieves more data, including tables and embedded content, with higher success but may increase latency. `basic` extraction costs 1 credit per 5 successful extractions, while `advanced` extraction costs 2 credits per 5 successful extractions. enum: - basic - advanced default: basic format: type: string description: >- The format of the extracted web page content. `markdown` returns content in markdown format. `text` returns plain text and may increase latency. enum: - markdown - text default: markdown include_favicon: type: boolean description: Whether to include the favicon URL for each result. default: false timeout: type: number format: float description: >- Maximum time in seconds to wait for the crawl operation before timing out. Must be between 10 and 150 seconds. 
minimum: 10 maximum: 150 default: 150 include_usage: type: boolean description: >- Whether to include credit usage information in the response. `NOTE:`The value may be 0 if the total use of /extract and /map have not yet reached minimum requirements. See our [Credits & Pricing documentation](https://docs.tavily.com/documentation/api-credits) for details. default: false required: - url responses: '200': description: Crawl results returned successfully content: application/json: schema: type: object properties: base_url: type: string description: The base URL that was crawled. example: docs.tavily.com results: type: array description: A list of extracted content from the crawled URLs. items: type: object properties: url: type: string description: The URL that was crawled. example: https://docs.tavily.com raw_content: type: string description: >- The full content extracted from the page. When `query` is provided, contains the top-ranked chunks joined by `[...]` separator. favicon: type: string description: The favicon URL for the result. example: >- https://mintlify.s3-us-west-1.amazonaws.com/tavilyai/_generated/favicon/apple-touch-icon.png?v=3 example: - url: https://docs.tavily.com/welcome raw_content: >- Welcome - Tavily Docs [Tavily Docs home page![light logo](https://mintlify.s3.us-west-1.amazonaws.com/tavilyai/logo/light.svg)![dark logo](https://mintlify.s3.us-west-1.amazonaws.com/tavilyai/logo/dark.svg)](https://tavily.com/) Search or ask... Ctrl K - [Support](mailto:support@tavily.com) - [Get an API key](https://app.tavily.com) - [Get an API key](https://app.tavily.com) Search... Navigation [Home](/welcome)[Documentation](/documentation/about)[SDKs](/sdk/python/quick-start)[Examples](/examples/use-cases/data-enrichment)[FAQ](/faq/faq) Explore our docs Your journey to state-of-the-art web search starts right here. [## Quickstart Start searching with Tavily in minutes](documentation/quickstart)[## API Reference Start using Tavily's powerful APIs](documentation/api-reference/endpoint/search)[## API Credits Overview Learn how to get and manage your Tavily API Credits](documentation/api-credits)[## Rate Limits Learn about Tavily's API rate limits for both development and production environments](documentation/rate-limits)[## Python Get started with our Python SDK, `tavily-python`](sdk/python/quick-start)[## Playground Explore Tavily's APIs with our interactive playground](https://app.tavily.com/playground) favicon: >- https://mintlify.s3-us-west-1.amazonaws.com/tavilyai/_generated/favicon/apple-touch-icon.png?v=3 - url: https://docs.tavily.com/documentation/api-credits raw_content: >- Credits & Pricing - Tavily Docs [Tavily Docs home page![light logo](https://mintlify.s3.us-west-1.amazonaws.com/tavilyai/logo/light.svg)![dark logo](https://mintlify.s3.us-west-1.amazonaws.com/tavilyai/logo/dark.svg)](https://tavily.com/) Search or ask... Ctrl K - [Support](mailto:support@tavily.com) - [Get an API key](https://app.tavily.com) - [Get an API key](https://app.tavily.com) Search... 
Navigation Overview Credits & Pricing [Home](/welcome)[Documentation](/documentation/about)[SDKs](/sdk/python/quick-start)[Examples](/examples/use-cases/data-enrichment)[FAQ](/faq/faq) - [API Playground](https://app.tavily.com/playground) - [Community](https://community.tavily.com) - [Blog](https://blog.tavily.com) ##### Overview - [About](/documentation/about) - [Quickstart](/documentation/quickstart) - [Credits & Pricing](/documentation/api-credits) - [Rate Limits](/documentation/rate-limits) ##### API Reference - [Introduction](/documentation/api-reference/introduction) - [POST Tavily Search](/documentation/api-reference/endpoint/search) - [POST Tavily Extract](/documentation/api-reference/endpoint/extract) - [POST Tavily Crawl](/documentation/api-reference/endpoint/crawl) - [POST Tavily Map](/documentation/api-reference/endpoint/map) ##### Best Practices - [Best Practices for Search](/documentation/best-practices/best-practices-search) - [Best Practices for Extract](/documentation/best-practices/best-practices-extract) ##### Tavily MCP Server - [Tavily MCP Server](/documentation/mcp) ##### Integrations - [LangChain](/documentation/integrations/langchain) - [LlamaIndex](/documentation/integrations/llamaindex) - [Zapier](/documentation/integrations/zapier) - [Dify](/documentation/integrations/dify) - [Composio](/documentation/integrations/composio) - [Make](/documentation/integrations/make) - [Agno](/documentation/integrations/agno) - [Pydantic AI](/documentation/integrations/pydantic-ai) - [FlowiseAI](/documentation/integrations/flowise) ##### Legal - [Security & Compliance](https://trust.tavily.com) - [Privacy Policy](https://tavily.com/privacy) ##### Help - [Help Center](https://help.tavily.com) ##### Tavily Search Crawler - [Tavily Search Crawler](/documentation/search-crawler) Overview # Credits & Pricing Learn how to get and manage your Tavily API Credits. ## [​](#free-api-credits) Free API Credits [## Get your free API key You get 1,000 free API Credits every month. **No credit card required.**](https://app.tavily.com) ## [​](#pricing-overview) Pricing Overview Tavily operates on a simple, credit-based model: - **Free**: 1,000 credits/month - **Pay-as-you-go**: $0.008 per credit (allows you to be charged per credit once your plan's credit limit is reached). - **Monthly plans**: $0.0075 - $0.005 per credit - **Enterprise**: Custom pricing and volume | **Plan** | **Credits per month** | **Monthly price** | **Price per credit** | | --- | --- | --- | --- | | **Researcher** | 1,000 | Free | - | | **Project** | 4,000 | $30 | $0.0075 | | **Bootstrap** | 15,000 | $100 | $0.0067 | | **Startup** | 38,000 | $220 | $0.0058 | | **Growth** | 100,000 | $500 | $0.005 | | **Pay as you go** | Per usage | $0.008 / Credit | $0.008 | | **Enterprise** | Custom | Custom | Custom | Head to [my plan](https://app.tavily.com/account/plan) to explore our different options and manage your plan. ## [​](#api-credits-costs) API Credits Costs ### [​](#tavily-search) Tavily Search Your [search depth](/api-reference/endpoint/search#body-search-depth) determines the cost of your request. - **Basic Search (`basic`):** Each request costs **1 API credit**. - **Advanced Search (`advanced`):** Each request costs **2 API credits**. ### [​](#tavily-extract) Tavily Extract The number of successful URL extractions and your [extraction depth](/api-reference/endpoint/extract#body-extract-depth) determines the cost of your request. You never get charged if a URL extraction fails. 
- **Basic Extract (`basic`):** Every 5 successful URL extractions cost **1 API credit** - **Advanced Extract (`advanced`):** Every 5 successful URL extractions cost **2 API credits** [Quickstart](/documentation/quickstart)[Rate Limits](/documentation/rate-limits) [x](https://x.com/tavilyai)[github](https://github.com/tavily-ai)[linkedin](https://linkedin.com/company/tavily)[website](https://tavily.com) [Powered by Mintlify](https://mintlify.com/preview-request?utm_campaign=poweredBy&utm_medium=docs&utm_source=docs.tavily.com) On this page - [Free API Credits](#free-api-credits) - [Pricing Overview](#pricing-overview) - [API Credits Costs](#api-credits-costs) - [Tavily Search](#tavily-search) - [Tavily Extract](#tavily-extract) favicon: >- https://mintlify.s3-us-west-1.amazonaws.com/tavilyai/_generated/favicon/apple-touch-icon.png?v=3 - url: https://docs.tavily.com/documentation/about raw_content: >- Who are we? ----------- We're a team of AI researchers and developers passionate about helping you build the next generation of AI assistants. Our mission is to empower individuals and organizations with accurate, unbiased, and factual information. What is the Tavily Search Engine? --------------------------------- Building an AI agent that leverages realtime online information is not a simple task. Scraping doesn't scale and requires expertise to refine, current search engine APIs don't provide explicit information to queries but simply potential related articles (which are not always related), and are not very customziable for AI agent needs. This is why we're excited to introduce the first search engine for AI agents - [Tavily](https://app.tavily.com/). Tavily is a search engine optimized for LLMs, aimed at efficient, quick and persistent search results. Unlike other search APIs such as Serp or Google, Tavily focuses on optimizing search for AI developers and autonomous AI agents. We take care of all the burden of searching, scraping, filtering and extracting the most relevant information from online sources. All in a single API call! To try the API in action, you can now use our hosted version on our [API Playground](https://app.tavily.com/playground). Why choose Tavily? ------------------ Tavily shines where others fail, with a Search API optimized for LLMs. How does the Search API work? ----------------------------- Traditional search APIs such as Google, Serp and Bing retrieve search results based on a user query. However, the results are sometimes irrelevant to the goal of the search, and return simple URLs and snippets of content which are not always relevant. Because of this, any developer would need to then scrape the sites to extract relevant content, filter irrelevant information, optimize the content to fit LLM context limits, and more. This task is a burden and requires a lot of time and effort to complete. The Tavily Search API takes care of all of this for you in a single API call. The Tavily Search API aggregates up to 20 sites per a single API call, and uses proprietary AI to score, filter and rank the top most relevant sources and content to your task, query or goal. In addition, Tavily allows developers to add custom fields such as context and limit response tokens to enable the optimal search experience for LLMs. Tavily can also help your AI agent make better decisions by including a short answer for cross-agent communication. Getting started --------------- [Sign up](https://app.tavily.com/) for Tavily to get your API key. You get **1,000 free API Credits every month**. 
No credit card required. [Get your free API key --------------------- You get 1,000 free API Credits every month. **No credit card required.**](https://app.tavily.com/)Head to our [API Playground](https://app.tavily.com/playground) to familiarize yourself with our API. To get started with Tavily's APIs and SDKs using code, head to our [Quickstart Guide](https://docs.tavily.com/guides/quickstart) and follow the steps. favicon: >- https://mintlify.s3-us-west-1.amazonaws.com/tavilyai/_generated/favicon/apple-touch-icon.png?v=3 response_time: type: number format: float description: Time in seconds it took to complete the request. example: 1.23 usage: type: object description: Credit usage details for the request. example: credits: 1 request_id: type: string description: >- A unique request identifier you can share with customer support to help resolve issues with specific requests. example: 123e4567-e89b-12d3-a456-426614174111 '400': description: Bad Request - Your request is invalid. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: '[400] No starting url provided' '401': description: Unauthorized - Your API key is wrong or missing. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: 'Unauthorized: missing or invalid API key.' '403': description: Forbidden - URL is not supported. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: '[403] URL is not supported' '429': description: Too many requests - Rate limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- Your request has been blocked due to excessive requests. Please reduce rate of requests. '432': description: Key limit or Plan Limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- This request exceeds your plan's set usage limit. Please upgrade your plan or contact support@tavily.com '433': description: PayGo limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- This request exceeds the pay-as-you-go limit. You can increase your limit on the Tavily dashboard. '500': description: Internal Server Error - We had a problem with our server. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: '[500] Internal server error' security: - bearerAuth: [] components: securitySchemes: bearerAuth: type: http scheme: bearer bearerFormat: JWT description: >- Bearer authentication header in the form Bearer , where is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY). ```` --- # Source: https://docs.tavily.com/documentation/integrations/crewai.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # CrewAI > Integrate Tavily with CrewAI to build powerful AI agents that can search the web. ## Introduction This guide shows you how to integrate Tavily with CrewAI to create sophisticated AI agents that can search the web and extract content. 
By combining CrewAI's multi-agent framework with Tavily's real-time web search capabilities, you can build AI systems that research, analyze, and process web information autonomously. ## Prerequisites Before you begin, make sure you have: * An OpenAI API key from [OpenAI Platform](https://platform.openai.com/) * A Tavily API key from [Tavily Dashboard](https://app.tavily.com/sign-in) ## Installation Install the required packages: > **Note:** The stable python versions to use with CrewAI are `Python >=3.10 and Python <3.13` . ```bash theme={null} pip install 'crewai[tools]' pip install pydantic ``` ## Setup Set up your API keys: ```python theme={null} import os # Set your API keys os.environ["OPENAI_API_KEY"] = "your-openai-api-key" os.environ["TAVILY_API_KEY"] = "your-tavily-api-key" ``` ## Using Tavily Search with CrewAI CrewAI provides built-in Tavily tools that make it easy to integrate web search capabilities into your AI agents. The `TavilySearchTool` allows your agents to search the web for real-time information. ```python theme={null} import os from crewai import Agent, Task, Crew from crewai_tools import TavilySearchTool ``` ```python theme={null} # Initialize the Tavily search tool tavily_tool = TavilySearchTool() ``` ```python theme={null} # Create an agent that uses the tool researcher = Agent( role='News Researcher', goal='Find trending information about AI agents', backstory='An expert News researcher specializing in technology, focused on AI.', tools=[tavily_tool], verbose=True ) ``` ```python theme={null} # Create a task for the agent research_task = Task( description='Search for the top 3 Agentic AI trends in 2025.', expected_output='A JSON report summarizing the top 3 AI trends found.', agent=researcher ) ``` ```python theme={null} # Form the crew and execute the task crew = Crew( agents=[researcher], tasks=[research_task], verbose=True ) result = crew.kickoff() print(result) ``` ### Customizing search tool parameters **Example:** ```python theme={null} from crewai_tools import TavilySearchTool # You can configure the tool with specific parameters tavily_search_tool = TavilySearchTool( search_depth="advanced", max_results=10, include_answer=True ) ``` You can customize the search tool by passing parameters to configure its behavior.Below are available parameters in crewai integration: **Available Parameters:** * `query` (str): Required. The search query string. * `search_depth` (Literal\["basic", "advanced"], optional): The depth of the search. Defaults to "basic". * `topic` (Literal\["general", "news", "finance"], optional): The topic to focus the search on. Defaults to "general". * `time_range` (Literal\["day", "week", "month", "year"], optional): The time range for the search. Defaults to None. * `max_results` (int, optional): The maximum number of search results to return. Defaults to 5. * `include_domains` (Sequence\[str], optional): A list of domains to prioritize in the search. Defaults to None. * `exclude_domains` (Sequence\[str], optional): A list of domains to exclude from the search. Defaults to None. * `include_answer` (Union\[bool, Literal\["basic", "advanced"]], optional): Whether to include a direct answer synthesized from the search results. Defaults to False. * `include_raw_content` (bool, optional): Whether to include the raw HTML content of the searched pages. Defaults to False. * `include_images` (bool, optional): Whether to include image results. Defaults to False. * `timeout` (int, optional): The request timeout in seconds. Defaults to 60. 
> **Explore More Parameters**: For a complete list of available parameters and their descriptions, visit our [API documentation](/documentation/api-reference/endpoint/search) to discover all the customization options available for search operations. ```python theme={null} import os from crewai import Agent, Task, Crew from crewai_tools import TavilySearchTool # Set up environment variables os.environ["OPENAI_API_KEY"] = "your-openai-api-key" os.environ["TAVILY_API_KEY"] = "your-tavily-api-key" # Initialize the tool tavily_tool = TavilySearchTool() # Create an agent that uses the tool researcher = Agent( role='News Researcher', goal='Find trending information about AI agents', backstory='An expert News researcher specializing in technology, focused on AI.', tools=[tavily_tool], verbose=True ) # Create a task for the agent research_task = Task( description='Search for the top 3 Agentic AI trends in 2025.', expected_output='A JSON report summarizing the top 3 AI trends found.', agent=researcher ) # Form the crew and kick it off crew = Crew( agents=[researcher], tasks=[research_task], verbose=True ) result = crew.kickoff() print(result) ``` ## Using Tavily Extract with CrewAI The `TavilyExtractorTool` allows your CrewAI agents to extract and process content from specific web pages. This is particularly useful for content analysis, data collection, and research tasks. ```python theme={null} import os from crewai import Agent, Task, Crew from crewai_tools import TavilyExtractorTool ``` ```python theme={null} # Initialize the Tavily extractor tool tavily_tool = TavilyExtractorTool() ``` ```python theme={null} # Create an agent that uses the tool extractor_agent = Agent( role='Web Page Content Extractor', goal='Extract key information from the given web pages', backstory='You are an expert at extracting relevant content from websites using the Tavily Extract.', tools=[tavily_tool], verbose=True ) ``` ```python theme={null} # Define a task for the agent extract_task = Task( description='Extract the main content from the URL https://en.wikipedia.org/wiki/Lionel_Messi .', expected_output='A JSON string containing the extracted content from the URL.', agent=extractor_agent ) ``` ```python theme={null} # Create and run the crew crew = Crew( agents=[extractor_agent], tasks=[extract_task], verbose=False ) result = crew.kickoff() print(result) ``` ### Customizing extract tool parameters **Example:** ```python theme={null} from crewai_tools import TavilyExtractorTool # You can configure the tool with specific parameters tavily_extract_tool = TavilyExtractorTool( extract_depth="advanced", include_images=True, timeout=45 ) ``` You can customize the extract tool by passing parameters to configure its behavior. Below are available parameters in crewai integration: **Available Parameters:** * `urls` (Union\[List\[str], str]): Required. A single URL string or a list of URL strings to extract data from. * `include_images` (Optional\[bool]): Whether to include images in the extraction results. Defaults to False. * `extract_depth` (Literal\["basic", "advanced"]): The depth of extraction. Use "basic" for faster, surface-level extraction or "advanced" for more comprehensive extraction. Defaults to "basic". * `timeout` (int): The maximum time in seconds to wait for the extraction request to complete. Defaults to 60. 
> **Explore More Parameters**: For a complete list of available parameters and their descriptions, visit our [API documentation](/documentation/api-reference/endpoint/extract) to discover all the customization options available for extract operations. ```python theme={null} import os from crewai import Agent, Task, Crew from crewai_tools import TavilyExtractorTool # Set up environment variables os.environ["OPENAI_API_KEY"] = "your-openai-api-key" os.environ["TAVILY_API_KEY"] = "your-tavily-api-key" # Initialize the Tavily extractor tool tavily_tool = TavilyExtractorTool() # Create an agent that uses the tool extractor_agent = Agent( role='Web Page Content Extractor', goal='Extract key information from the given web pages', backstory='You are an expert at extracting relevant content from websites using the Tavily Extract.', tools=[tavily_tool], verbose=True ) # Define a task for the agent extract_task = Task( description='Extract the main content from the URL https://en.wikipedia.org/wiki/Lionel_Messi .', expected_output='A JSON string containing the extracted content from the URL.', agent=extractor_agent ) # Create and execute the crew crew = Crew( agents=[extractor_agent], tasks=[extract_task], verbose=True ) # Run the extraction result = crew.kickoff() print("Extraction Results:") print(result) ``` For more information about Tavily's capabilities, check out our [API documentation](/documentation/api-reference/introduction) and [best practices](/documentation/best-practices/best-practices-search). --- # Source: https://docs.tavily.com/examples/use-cases/data-enrichment.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Data Enrichment > Enhance datasets with Tavily's APIs. #### Fill in spreadsheet columns Enrichment1 Demo #### Enrich your spreadsheet Enrichment2 Demo #### Export as CSV Enrichment3 Demo ## Try Our Data Enrichment Agent ### Step 1: Get Your API Key ### Step 2: Try the Data Enrichment Agent ### Step 3: Read The Open Source Code --- # Source: https://docs.tavily.com/documentation/integrations/dify.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Dify > Tavily is now available for no-code integration through Dify. ## Introduction Integrate Tavily with Dify to enhance your AI workflows without writing any code. Dify is a no-code platform that allows you to build and deploy AI applications using various tools, including the **Tavily Search API** and **Tavily Extract API**. This integration enables access to real-time web data, improving the capabilities of your AI applications. ## How to set up Tavily with Dify Follow these steps to integrate Tavily with Dify: Go to [Dify](https://dify.ai/) and log in to your account. Go to the [Tavily Dashboard](https://app.tavily.com/home) to obtain your **API key**. Install the **Tavily tool** from the [Plugin Marketplace](https://marketplace.dify.ai/plugins/langgenius/tavily) to enable integration with your Dify workflows. In **Dify**, navigate to **Tools > Tavily > To Authorize** and enter your **Tavily API key** to connect your Dify instance to Tavily. 
## Using the Tavily tool in Dify Tavily can be utilized in various Dify application types: ### Chatflow / Workflow Applications Dify’s Chatflow and Workflow applications support Tavily tool nodes, which include: * **Tavily Search API** – Perform dynamic web searches and retrieve up-to-date information. * **Tavily Extract API** – Extract raw content from web pages. These nodes allow you to automate tasks such as research, content curation, and real-time data integration into your workflows. ### Agent Applications In Agent applications, you can integrate the Tavily tool to access web data in real time. Use this to: * Retrieve structured and relevant search results. * Extract raw content for further processing. * Provide accurate, context-aware answers to user queries. defy ## Example use case: automated deep research Use **Tavily Search API** within **Dify** to conduct automated, multi-step searches, iterating through multiple queries to gather, refine, and summarize insights for comprehensive reports. For a detailed walkthrough, check out this blog post: [DeepResearch: Building a Research Automation App with Dify](https://dify.ai/blog/deepresearch-building-a-research-automation-app-with-dify) ## Best practices for using Tavily in Dify * **Design Concise Queries** – Use focused queries to maximize the relevance of search results. * **Utilize Domain Filtering** – Use the `include_domains` parameter to narrow search results to specific domains. * **Enable an Agentic Workflow** – Leverage an LLM to dynamically generate and refine queries for Tavily. *** --- # Source: https://docs.tavily.com/documentation/api-reference/endpoint/extract.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Tavily Extract > Extract web page content from one or more specified URLs using Tavily Extract. ## OpenAPI ````yaml POST /extract openapi: 3.0.3 info: title: Tavily Search and Extract API description: >- Our REST API provides seamless access to Tavily Search, a powerful search engine for LLM agents, and Tavily Extract, an advanced web scraping solution optimized for LLMs. version: 1.0.0 servers: - url: https://api.tavily.com/ security: [] tags: - name: Search - name: Extract - name: Crawl - name: Map - name: Research - name: Usage paths: /extract: post: summary: Retrieve raw web content from specified URLs description: >- Extract web page content from one or more specified URLs using Tavily Extract. requestBody: description: Parameters for the Tavily Extract request. required: true content: application/json: schema: type: object properties: urls: oneOf: - type: string description: The URL to extract content from. example: https://en.wikipedia.org/wiki/Artificial_intelligence - type: array items: type: string description: A list of URLs to extract content from. example: - https://en.wikipedia.org/wiki/Artificial_intelligence - https://en.wikipedia.org/wiki/Machine_learning - https://en.wikipedia.org/wiki/Data_science query: type: string description: >- User intent for reranking extracted content chunks. When provided, chunks are reranked based on relevance to this query. chunks_per_source: type: integer description: >- Chunks are short content snippets (maximum 500 characters each) pulled directly from the source. Use `chunks_per_source` to define the maximum number of relevant chunks returned per source and to control the `raw_content` length. 
Chunks will appear in the `raw_content` field as: ` [...] [...] `. Available only when `query` is provided. Must be between 1 and 5. minimum: 1 maximum: 5 default: 3 extract_depth: type: string description: >- The depth of the extraction process. `advanced` extraction retrieves more data, including tables and embedded content, with higher success but may increase latency.`basic` extraction costs 1 credit per 5 successful URL extractions, while `advanced` extraction costs 2 credits per 5 successful URL extractions. enum: - basic - advanced default: basic include_images: type: boolean description: >- Include a list of images extracted from the URLs in the response. Default is false. default: false include_favicon: type: boolean description: Whether to include the favicon URL for each result. default: false format: type: string description: >- The format of the extracted web page content. `markdown` returns content in markdown format. `text` returns plain text and may increase latency. enum: - markdown - text default: markdown timeout: type: number format: float description: >- Maximum time in seconds to wait for the URL extraction before timing out. Must be between 1.0 and 60.0 seconds. If not specified, default timeouts are applied based on extract_depth: 10 seconds for basic extraction and 30 seconds for advanced extraction. minimum: 1 maximum: 60 default: None include_usage: type: boolean description: >- Whether to include credit usage information in the response. `NOTE:`The value may be 0 if the total successful URL extractions has not yet reached 5 calls. See our [Credits & Pricing documentation](https://docs.tavily.com/documentation/api-credits) for details. default: false required: - urls responses: '200': description: Extraction results returned successfully content: application/json: schema: type: object properties: results: type: array description: A list of extracted content from the provided URLs. items: type: object properties: url: type: string description: The URL from which the content was extracted. example: >- https://en.wikipedia.org/wiki/Artificial_intelligence raw_content: type: string description: >- The full content extracted from the page. When `query` is provided, contains the top-ranked chunks joined by `[...]` separator. example: >- "Jump to content\nMain menu\nSearch\nAppearance\nDonate\nCreate account\nLog in\nPersonal tools\n Photograph your local culture, help Wikipedia and win!\nToggle the table of contents\nArtificial intelligence\n161 languages\nArticle\nTalk\nRead\nView source\nView history\nTools\nFrom Wikipedia, the free encyclopedia\n\"AI\" redirects here. For other uses, see AI (disambiguation) and Artificial intelligence (disambiguation).\nPart of a series on\nArtificial intelligence (AI)\nshow\nMajor goals\nshow\nApproaches\nshow\nApplications\nshow\nPhilosophy\nshow\nHistory\nshow\nGlossary\nvte\nArtificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. 
It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals.[1] Such machines may be called AIs.\nHigh-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go)................... images: type: array example: [] description: >- This is only available if `include_images` is set to `true`. A list of image URLs extracted from the page. items: type: string favicon: type: string description: The favicon URL for the result. example: >- https://en.wikipedia.org/static/favicon/wikipedia.ico failed_results: type: array example: [] description: A list of URLs that could not be processed. items: type: object properties: url: type: string description: The URL that failed to be processed. error: type: string description: >- An error message describing why the URL couldn't be processed. response_time: type: number format: float description: Time in seconds it took to complete the request. example: 0.02 usage: type: object description: Credit usage details for the request. example: credits: 1 request_id: type: string description: >- A unique request identifier you can share with customer support to help resolve issues with specific requests. example: 123e4567-e89b-12d3-a456-426614174111 '400': description: Bad Request content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: <400 Bad Request, (e.g Max 20 URLs are allowed.)> '401': description: Unauthorized - Your API key is wrong or missing. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: 'Unauthorized: missing or invalid API key.' '429': description: Too many requests - Rate limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- Your request has been blocked due to excessive requests. Please reduce rate of requests. '432': description: Key limit or Plan Limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- <432 Custom Forbidden Error (e.g This request exceeds your plan's set usage limit. Please upgrade your plan or contact support@tavily.com)> '433': description: PayGo limit exceeded content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: >- This request exceeds the pay-as-you-go limit. You can increase your limit on the Tavily dashboard. '500': description: Internal Server Error - We had a problem with our server. content: application/json: schema: type: object properties: detail: type: object properties: error: type: string example: detail: error: Internal Server Error security: - bearerAuth: [] components: securitySchemes: bearerAuth: type: http scheme: bearer bearerFormat: JWT description: >- Bearer authentication header in the form Bearer , where is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY). 
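# Illustrative request example (not part of the OpenAPI schema above): a minimal
# /extract call using the bearer authentication format described in this spec.
# Replace tvly-YOUR_API_KEY with your own Tavily API key.
#
#   curl -X POST https://api.tavily.com/extract \
#     -H "Authorization: Bearer tvly-YOUR_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d '{"urls": ["https://en.wikipedia.org/wiki/Artificial_intelligence"], "extract_depth": "basic"}'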
```` --- # Source: https://docs.tavily.com/faq/faq.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Frequently Asked Questions Tavily allows your AI agent to access the web, securely, and at scale. Supercharge your AI agent with real-time search, scraping, and structured data retrieval in a single API call. Tavily simplifies the process of integrating dynamic web information into AI-driven solutions. Tavily offers three different endpoints: * **Tavily Search API** - A search engine designed for AI agents, combining search and scraping capabilities. * **Tavily Extract API** - Scrape up to 20 URLs in a single API call. * **Tavily Crawl API** - Map and crawl domains efficiently. Tavily Search API is a specialized search engine designed for LLMs and AI agents. It provides real-time, customizable, and RAG-ready search results and extracted content, enabling AI applications to retrieve and process data efficiently. **Traditional Search APIs:** Unlike Bing, Google, or SerpAPI, Tavily dynamically searches the web, reviews multiple sources, and extracts the most relevant content, delivering concise, ready-to-use information optimized for AI applications. **AI Answer Engine APIs:** Unlike Perplexity Sonar API or OpenAI Web Search API, Tavily focuses on delivering high-quality, customizable search results. Developers control search depth, domain targeting, and content extraction. LLM-generated answers are optional, making Tavily a flexible, search-first solution adaptable to different use cases. #### Features & Benefits * **Built for AI** – Designed for AI workflows like Retrieval-Augmented Generation (RAG) with structured and customizable search. * **Customizable** – Control search depth, target specific domains, extract full page content, and get an LLM-generated response in one API call. * **Real-time & Reliable** – Delivers up-to-date and real-time results. * **Easy Integration** – Simple API setup with support for Python, JavaScript, LangChain, and LlamaIndex. * **Secure & Scalable** – SOC 2 certified, zero data retention, and built to handle high-volume workloads. Tavily uses advanced algorithms and NLP techniques to gather data from trusted, authoritative sources. Users can also prioritize preferred sources to enhance relevance. Tavily prioritizes speed and typically returns results within seconds. Complex queries involving extensive data retrieval may take slightly longer. #### Pricing & Plans Yes! Tavily offers a free plan with limited monthly API calls, allowing you to test its capabilities before committing to a paid plan. No credit card is required. * **Free**: 1,000 credits/month * **Pay-as-you-go**: \$0.008 per credit * **Monthly plans**: \$0.0075 - \$0.005 per credit * **Enterprise**: Custom pricing and volume Your API credits reset on the first day of each month, regardless of the billing date. This ensures you start each month with a clean slate of credits to use for your searches. When upgrading or downgrading your plan, charges are typically **prorated**. This means: * **Upgrading**: If you upgrade mid-cycle, you'll only pay the difference for the remaining days in your billing period. * **Downgrading**: Downgrades take effect at the start of the next billing cycle, and you will continue on your current plan until the cycle ends. Yes! Tavily offers free access for students. 
Contact [support@tavily.com](mailto:support@tavily.com) for eligibility details. #### Integration & Usage Tavily supports Python, Node.js, and cURL. The API is simple to set up—just sign up, [get your API key](https://app.tavily.com/home), and integrate it within minutes. Visit our [SDKs](/sdk) and [API Reference](/documentation/api-reference/introduction) for more guidance and information. GPT Researcher is an open-source, autonomous research agent powered by Tavily’s Search API. It automates the research process by retrieving, filtering, and synthesizing data from over 20 web sources per task. #### Support & Privacy * **Paid Subscriptions** – Email support via [support@tavily.com](mailto:support@tavily.com). * **Enterprise Plan** – White-glove support including: * Personal Slack channel * Dedicated account manager * AI engineer for technical assistance and optimizations * Uptime and support SLAs Tavily's privacy policy is available [here](https://tavily.com/privacy), outlining how data is handled and ensuring compliance with global regulations. The [Tavily Help Center](https://help.tavily.com/) is a comprehensive knowledge base with detailed guides on how to use Tavily. You can search for the information you need, explore tutorials, and find answers to common questions. #### Getting Started 1. [Sign up for an account](https://tavily.com/) 2. [Get your API key](https://app.tavily.com/home) 3. Integrate it into your application using our Python or Node.js SDK. 4. Start retrieving real-time search results! --- # Source: https://docs.tavily.com/documentation/integrations/flowise.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # FlowiseAI > Tavily is now available for integration through Flowise. ## Introduction Integrate [Tavily with FlowiseAI](https://docs.flowiseai.com/integrations/langchain/tools/tavily-ai) to enhance your AI workflows with powerful web search capabilities. Flowise provides a no-code platform for building AI applications, and the Tavily integration offers real-time, accurate search results tailored for LLMs and RAG (Retrieval-Augmented Generation) systems. Set up Tavily in Flowise to create chatflows or agent flows that can automate research, track news, or feed relevant data into your connected applications. ## How to set up Tavily with Flowise Follow these steps to integrate Tavily with Flowise:
[Log in](https://flowiseai.com/) to your Flowise account.

Create a new flow in Flowise:

  1. Click "Create New Flow"
  2. Select either "Chat Flow" or "Agent Flow" as the type
  3. Name your flow (e.g., "Research Assistant")

Add the Tavily node to your flow:

For Chat Flow:

  1. Click on the (+) button
  2. Navigate to LangChain > Tools > Tavily API
  3. Drag the Tavily node into your flow

For Agent Flow:

  1. Click on the (+) button
  2. Navigate to Tools > Tavily API
  3. Drag the Tavily node into your flow

Configure the Tavily node with your credentials and parameters:

  1. Enter your Tavily API key in the credentials section
  2. Configure additional parameters, for example:
    • Search Depth: Choose between 'basic' and 'advanced'
    • Max Results: Set the number of results to return
    • Include Domains: Specify domains to include in search
    • Exclude Domains: Specify domains to exclude from search

Connect the Tavily node to other nodes in your flow:

  1. Connect to any node that accepts tool inputs
  2. Connect to an LLM node for query processing
  3. Connect to a Response node to format results
## Using Tavily in Flowise Tavily can be utilized in various Flowise application types: ### Chatflow Applications Flowise's Chatflow applications support Tavily tool node. This node allows you to automate tasks such as research, content curation, and real-time data integration into your workflows. ### Agent Applications In Agent applications, you can integrate the Tavily tool to access web data in real time. Use this to: * Retrieve structured and relevant search results * Extract raw content for further processing * Provide accurate, context-aware answers to user queries Flowise Tavily Integration --- # Source: https://docs.tavily.com/documentation/integrations/google-adk.md > ## Documentation Index > Fetch the complete documentation index at: https://docs.tavily.com/llms.txt > Use this file to discover all available pages before exploring further. # Google ADK > Connect your Google ADK agent to Tavily's AI-focused search, extraction, and crawling platform for real-time web intelligence. ## Introduction The Tavily MCP Server connects your ADK agent to Tavily's AI-focused search, extraction, and crawling platform. This gives your agent the ability to perform real-time web searches, intelligently extract specific data from web pages, and crawl or create structured maps of websites. ## Prerequisites Before you begin, make sure you have: * Python 3.9 or later * pip for installing packages * A [Tavily API key](https://app.tavily.com/home) (sign up for free if you don't have one) * A [Gemini API key](https://aistudio.google.com/app/apikey) for Google AI Studio ## Installation Install ADK by running: ```bash theme={null} pip install google-adk mcp ``` ## Building Your Agent ### Step 1: Create an Agent Project Run the `adk create` command to start a new agent project: ```bash theme={null} adk create my_agent ``` This creates a new directory with the following structure: ``` my_agent/ agent.py # main agent code .env # API keys or project IDs __init__.py ``` ### Step 2: Update Your Agent Code Edit the `my_agent/agent.py` file to integrate Tavily. Choose either **Remote MCP Server** or **Local MCP Server**: ```python Remote MCP Server theme={null} from google.adk.agents import Agent from google.adk.tools.mcp_tool.mcp_session_manager import StreamableHTTPServerParams from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset import os # Get API key from environment TAVILY_API_KEY = os.getenv("TAVILY_API_KEY") root_agent = Agent( model="gemini-2.5-pro", name="tavily_agent", instruction="You are a helpful assistant that uses Tavily to search the web, extract content, and explore websites. 
Use Tavily's tools to provide up-to-date information to users.", tools=[ MCPToolset( connection_params=StreamableHTTPServerParams( url="https://mcp.tavily.com/mcp/", headers={ "Authorization": f"Bearer {TAVILY_API_KEY}", }, ), ) ], ) ``` ```python Local MCP Server theme={null} from google.adk.agents import Agent from google.adk.tools.mcp_tool.mcp_session_manager import StdioConnectionParams from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset from mcp import StdioServerParameters import os # Get API key from environment TAVILY_API_KEY = os.getenv("TAVILY_API_KEY") root_agent = Agent( model="gemini-2.5-pro", name="tavily_agent", instruction="You are a helpful assistant that uses Tavily to search the web, extract content, and explore websites.", tools=[ MCPToolset( connection_params=StdioConnectionParams( server_params=StdioServerParameters( command="npx", args=[ "-y", "tavily-mcp@latest", ], env={ "TAVILY_API_KEY": TAVILY_API_KEY, } ), timeout=30, ), ) ], ) ``` ### Step 3: Set Your API Keys Update the `my_agent/.env` file with your API keys: ```bash theme={null} echo 'GOOGLE_API_KEY="YOUR_GEMINI_API_KEY"' >> my_agent/.env echo 'TAVILY_API_KEY="YOUR_TAVILY_API_KEY"' >> my_agent/.env ``` Or manually edit the `.env` file: ``` GOOGLE_API_KEY="your_gemini_api_key_here" TAVILY_API_KEY="your_tavily_api_key_here" ``` ### Step 4: Run Your Agent You can run your ADK agent in two ways: #### Run with Command-Line Interface Run your agent using the `adk run` command: ```bash theme={null} adk run my_agent ``` This starts an interactive command-line interface where you can chat with your agent and test Tavily's capabilities. #### Run with Web Interface Start the ADK web interface for a visual testing experience: ```bash theme={null} adk web --port 8000 ``` **Note:** Run this command from the parent directory that contains your `my_agent/` folder. For example, if your agent is inside `agents/my_agent/`, run `adk web` from the `agents/` directory. This starts a web server with a chat interface. Access it at `http://localhost:8000`, select your agent from the dropdown, and start chatting. ## Example Usage Once your agent is set up and running, you can interact with it through the command-line interface or web interface. Here's a simple example: **User Query:** ``` Find all documentation pages on tavily.com and provide instructions on how to get started with Tavily ``` The agent automatically combines multiple Tavily tools to provide comprehensive answers, making it easy to explore websites and gather information without manual navigation. Tavily-ADK ## Available Tools Once connected, your agent gains access to Tavily's powerful web intelligence tools: ### tavily-search Execute a search query to find relevant information across the web. ### tavily-extract Extract structured data from any web page. Extract text, links, and images from single pages or batch process multiple URLs efficiently. ### tavily-map Traverses websites like a graph and can explore hundreds of paths in parallel with intelligent discovery to generate comprehensive site maps. ### tavily-crawl Traversal tool that can explore hundreds of paths in parallel with built-in extraction and intelligent discovery. --- # Source: https://docs.tavily.com/examples/open-sources/gpt-researcher.md # GPT Researcher ## Multi Agent Frameworks We are strong advocates for the future of AI agents, envisioning a world where autonomous agents communicate and collaborate as a cohesive team to undertake and complete complex tasks. 
We hold the belief that research is a pivotal element in successfully tackling these complex tasks, ensuring superior outcomes. Consider the scenario of developing a coding agent responsible for coding tasks using the latest API documentation and best practices. It would be wise to integrate an agent specializing in research to curate the most recent and relevant documentation, before crafting a technical design that would subsequently be handed off to the coding assistant tasked with generating the code. This approach is applicable across various sectors, including finance, business analysis, healthcare, marketing, and legal, among others. One multi-agent framework that we're excited about is [LangGraph](https://langchain-ai.github.io/langgraph/), built by the team at [Langchain](https://www.langchain.com/). LangGraph is a Python library for building stateful, multi-actor applications with LLMs. It extends the [LangChain Expression Language](https://python.langchain.com/docs/concepts/lcel/) with the ability to coordinate multiple chains (or actors) across multiple steps of computation. What's great about LangGraph is that it follows a DAG architecture, enabling each specialized agent to communicate with one another, and subsequently trigger actions among other agents within the graph. We've added an example for leveraging [GPT Researcher with LangGraph](https://github.com/assafelovic/gpt-researcher/tree/master/multi_agents) which can be found in `/multi_agents`. The example demonstrates a generic use case for an editorial agent team that works together to complete a research report on a given task. ### The Multi Agent Team The research team is made up of 7 AI agents: 1. Chief Editor - Oversees the research process and manages the team. This is the "master" agent that coordinates the other agents using Langgraph. 2. Researcher (gpt-researcher) - A specialized autonomous agent that conducts in depth research on a given topic. 3. Editor - Responsible for planning the research outline and structure. 4. Reviewer - Validates the correctness of the research results given a set of criteria. 5. Revisor - Revises the research results based on the feedback from the reviewer. 6. Writer - Responsible for compiling and writing the final report. 7. Publisher - Responsible for publishing the final report in various formats. ### How it works Generally, the process is based on the following stages: 1. Planning stage 2. Data collection and analysis 3. Writing and submission 4. Review and revision 5. Publication ### Architecture ### Steps More specifically (as seen in the architecture diagram) the process is as follows: 1. Browser (gpt-researcher) - Browses the internet for initial research based on the given research task. 2. Editor - Plans the report outline and structure based on the initial research. 3. For each outline topic (in parallel): 4. Researcher (gpt-researcher) - Runs an in depth research on the subtopics and writes a draft. 5. Reviewer - Validates the correctness of the draft given a set of criteria and provides feedback. 6. Revisor - Revises the draft until it is satisfactory based on the reviewer feedback. 7. Writer - Compiles and writes the final report including an introduction, conclusion and references section from the given research findings. 8. Publisher - Publishes the final report to multi formats such as PDF, Docx, Markdown, etc. ### How to run 1. Install required packages: ```python theme={null} pip install -r requirements.txt ``` 2. 
Run the application: ```python theme={null} python main.py ``` ### Usage To change the research query and customize the report, edit the `task.json` file in the main directory. ## Customization The config.py enables you to customize GPT Researcher to your specific needs and preferences. Thanks to our amazing community and contributions, GPT Researcher supports multiple LLMs and Retrievers. In addition, GPT Researcher can be tailored to various report formats (such as APA), word count, research iterations depth, etc. GPT Researcher defaults to our recommended suite of integrations: [OpenAI](https://platform.openai.com/docs/overview) for LLM calls and [Tavily API](https://app.tavily.com/home) for retrieving realtime online information. As seen below, OpenAI still stands as the superior LLM. We assume it will stay this way for some time, and that prices will only continue to decrease, while performance and speed increase over time. It may not come as a surprise that our default search engine is Tavily. We're aimed at building our search engine to tailor the exact needs of searching and aggregating for the most factual and unbiased information for research tasks. We highly recommend using it with GPT Researcher, and more generally with LLM applications that are built with RAG. Here is an example of the default config.py file found in `/gpt_researcher/config/`: ```python theme={null} def __init__(self, config_file: str = None): self.config_file = config_file self.retriever = "tavily" self.llm_provider = "openai" self.fast_llm_model = "gpt-3.5-turbo" self.smart_llm_model = "gpt-4o" self.fast_token_limit = 2000 self.smart_token_limit = 4000 self.browse_chunk_max_length = 8192 self.summary_token_limit = 700 self.temperature = 0.6 self.user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)" \ " Chrome/119.0.0.0 Safari/537.36 Edg/119.0.0.0" self.memory_backend = "local" self.total_words = 1000 self.report_format = "apa" self.max_iterations = 1 self.load_config_file() ``` Please note that you can also include your own external JSON file by adding the path in the config\_file param. To learn more about additional LLM support you can check out the [Langchain supported LLMs documentation](https://python.langchain.com/docs/integrations/llms/). Simply pass different provider names in the `llm_provider` config param. You can also change the search engine by modifying the retriever param to others such as `duckduckgo`, `googleAPI`, `googleSerp`, `searx` and more. Please note that you might need to sign up and obtain an API key for any of the other supported retrievers and LLM providers. ## Agent Example If you're interested in using GPT Researcher as a standalone agent, you can easily import it into any existing Python project. Below, is an example of calling the agent to generate a research report: ```python theme={null} from gpt_researcher import GPTResearcher import asyncio # It is best to define global constants at the top of your script QUERY = "What happened in the latest burning man floods?" REPORT_TYPE = "research_report" async def fetch_report(query, report_type): """ Fetch a research report based on the provided query and report type. """ researcher = GPTResearcher(query=query, report_type=report_type, config_path=None) await researcher.conduct_research() report = await researcher.write_report() return report async def generate_research_report(): """ This is a sample script that executes an async main function to run a research report. 
""" report = await fetch_report(QUERY, REPORT_TYPE) print(report) if __name__ == "__main__": asyncio.run(generate_research_report()) ``` You can further enhance this example to use the returned report as context for generating valuable content such as news article, marketing content, email templates, newsletters, etc. You can also use GPT Researcher to gather information about code documentation, business analysis, financial information and more. All of which can be used to complete much more complex tasks that require factual and high quality realtime information. ## Getting Started **Step 0** - Install Python 3.11 or later. [See here](https://www.tutorialsteacher.com/python/install-python) for a step-by-step guide. **Step 1** - Download the project and navigate to its directory ```python theme={null} $ git clone https://github.com/assafelovic/gpt-researcher.git $ cd gpt-researcher ``` **Step 2** - Set up API keys using two methods: exporting them directly or storing them in a `.env` file. For Linux/Temporary Windows Setup, use the export method: ```python theme={null} export OPENAI_API_KEY={Your OpenAI API Key here} export TAVILY_API_KEY={Your Tavily API Key here} ``` For a more permanent setup, create a `.env` file in the current gpt-researcher folder and input the keys as follows: ```python theme={null} OPENAI_API_KEY={Your OpenAI API Key here} TAVILY_API_KEY={Your Tavily API Key here} ``` For LLM, we recommend [OpenAI GPT](https://platform.openai.com/docs/guides/text-generation), but you can use any other LLM model (including open sources), simply change the llm model and provider in config/config.py. For search engine, we recommend [Tavily Search API](https://app.tavily.com/home), but you can also refer to other search engines of your choice by changing the search provider in config/config.py to `duckduckgo`, `googleAPI`, `googleSerp`, `searx`, or `bing`. Then add the corresponding env API key as seen in the config.py file. ### Quickstart **Step 1** - Install dependencies ```python theme={null} $ pip install -r requirements.txt ``` **Step 2** - Run the agent with FastAPI ```python theme={null} $ uvicorn main:app --reload ``` **Step 3** - Go to [http://localhost:8000](http://localhost:8000) on any browser and enjoy researching! ### Using Virtual Environment or Poetry Select either based on your familiarity with each: ### Virtual Environment Establishing the Virtual Environment with Activate/Deactivate configuration Create a virtual environment using the `venv` package with the environment name ``, for example, `env`. Execute the following command in the PowerShell/CMD terminal: ```python theme={null} python -m venv env ``` To activate the virtual environment, use the following activation script in PowerShell/CMD terminal: ```python theme={null} .\env\Scripts\activate ``` To deactivate the virtual environment, run the following deactivation script in PowerShell/CMD terminal: ```python theme={null} deactivate ``` Install the dependencies for a Virtual environment After activating the `env` environment, install dependencies using the `requirements.txt` file with the following command: ```python theme={null} python -m pip install -r requirements.txt ``` ### Poetry Establishing the Poetry dependencies and virtual environment with Poetry version `~1.7.1` Install project dependencies and simultaneously create a virtual environment for the specified project. 
By executing this command, Poetry reads the project's "pyproject.toml" file to determine the required dependencies and their versions, ensuring a consistent and isolated development environment. The virtual environment allows for a clean separation of project-specific dependencies, preventing conflicts with system-wide packages and enabling more straightforward dependency management throughout the project's lifecycle. ```python theme={null} poetry install ``` Activate the virtual environment associated with a Poetry project By running this command, the user enters a shell session within the isolated environment associated with the project, providing a dedicated space for development and execution. This virtual environment ensures that the project dependencies are encapsulated, avoiding conflicts with system-wide packages. Activating the Poetry shell is essential for seamlessly working on a project, as it ensures that the correct versions of dependencies are used and provides a controlled environment conducive to efficient development and testing. ```python theme={null} poetry shell ``` ### Run the app Launch the FastAPI application agent on a Virtual Environment or Poetry setup by executing the following command: ```python theme={null} python -m uvicorn main:app --reload ``` Visit [http://localhost:8000](http://localhost:8000) in any web browser and explore your research! ### Try it with Docker **Step 1** - Install Docker Follow the instructions [here](https://docs.docker.com/engine/install/) **Step 2** - Create `.env` file with your OpenAI Key or simply export it ```python theme={null} $ export OPENAI_API_KEY={Your API Key here} $ export TAVILY_API_KEY={Your Tavily API Key here} ``` **Step 3** - Run the application ```python theme={null} $ docker-compose up ``` **Step 4** - Go to [http://localhost:8000](http://localhost:8000) on any browser and enjoy researching! ## Introduction [GPT Researcher](https://gptr.dev/) is an autonomous agent designed for comprehensive online research on a variety of tasks. The agent can produce detailed, factual and unbiased research reports, with customization options for focusing on relevant resources, outlines, and lessons. Inspired by the recent [Plan-and-Solve](https://arxiv.org/abs/2305.04091) and [RAG](https://arxiv.org/abs/2005.11401) papers, GPT Researcher addresses issues of speed, determinism and reliability, offering a more stable performance and increased speed through parallelized agent work, as opposed to synchronous operations. ### Why GPT Researcher? 1. To form objective conclusions for manual research tasks can take time, sometimes weeks to find the right resources and information. 2. Current LLMs are trained on past and outdated information, with heavy risks of hallucinations, making them almost irrelevant for research tasks. 3. Solutions that enable web search (such as ChatGPT + Web Plugin), only consider limited resources and content that in some cases result in superficial conclusions or biased answers. 4. Using only a selection of resources can create bias in determining the right conclusions for research questions or tasks. ### Architecture The main idea is to run "planner" and "execution" agents, whereas the planner generates questions to research, and the execution agents seek the most related information based on each generated research question. Finally, the planner filters and aggregates all related information and creates a research report. 
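
To make this planner/executor flow concrete, here is a minimal, illustrative asyncio sketch. It is not GPT Researcher's actual implementation; `plan_questions`, `search_and_summarize`, and `write_report` are hypothetical placeholders for the LLM and search calls described above.

```python theme={null}
import asyncio

# Hypothetical placeholders for the planner and execution agents described above;
# in GPT Researcher these steps are backed by LLM calls and Tavily searches.
async def plan_questions(task: str) -> list[str]:
    # Planner: generate research questions for the task.
    return [f"{task} - question {i}" for i in range(1, 4)]

async def search_and_summarize(question: str) -> str:
    # Execution agent: research one question and return a sourced summary.
    return f"Summary for: {question}"

async def write_report(task: str, summaries: list[str]) -> str:
    # Planner: filter, aggregate, and compile summaries into a final report.
    return f"# Report: {task}\n" + "\n".join(summaries)

async def research(task: str) -> str:
    questions = await plan_questions(task)
    # Execution agents run in parallel, one per research question.
    summaries = await asyncio.gather(*(search_and_summarize(q) for q in questions))
    return await write_report(task, list(summaries))

if __name__ == "__main__":
    print(asyncio.run(research("Latest developments in autonomous AI agents")))
```

The key point is that the per-question execution agents run concurrently via `asyncio.gather`, which is what gives the parallelized speed-up mentioned earlier.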
The agents leverage both gpt-3.5-turbo and gpt-4-turbo (128K context) to complete a research task; we optimize for cost by using each model only when necessary. The average research task takes around 3 minutes to complete and costs \~\$0.1. More specifically:

1. Create a domain-specific agent based on the research query or task.
2. Generate a set of research questions that together form an objective opinion on the given task.
3. For each research question, trigger a crawler agent that scrapes online resources for information relevant to the given task.
4. Summarize each scraped resource based on the relevant information and keep track of its sources.
5. Finally, filter and aggregate all summarized sources and generate a final research report.

### Demo