---
sidebar_position: 2
slug: /what-is-agent-context-engine
---

# What is an Agent context engine?

In 2025, a silent revolution began beneath the dazzling surface of AI Agents. While the world marveled at agents that could write code, analyze data, and automate workflows, a fundamental bottleneck emerged: why do even the most advanced agents still stumble on simple questions, forget previous conversations, or misuse available tools?

The answer lies not in the intelligence of the Large Language Model (LLM) itself, but in the quality of the Context it receives. An LLM, no matter how powerful, is only as good as the information we feed it. Today's cutting-edge agents are often crippled by a cumbersome, manual, and error-prone process of context assembly—a process known as Context Engineering.

This is where the Agent Context Engine comes in. It is not merely an incremental improvement but a foundational shift, representing the evolution of RAG from a singular technique into the core data and intelligence substrate for the entire Agent ecosystem.

## Beyond the hype: The reality of today's "intelligent" Agents

Today, the "intelligence" behind most AI Agents hides a mountain of human labor. Developers must:

- Hand-craft elaborate prompt templates
- Hard-code document-retrieval logic for every task
- Juggle tool descriptions, conversation history, and knowledge snippets inside a tiny context window
- Repeat the whole process for each new scenario

This pattern is called Context Engineering. It is deeply tied to expert know-how, almost impossible to scale, and prohibitively expensive to maintain. When an enterprise needs to keep dozens of distinct agents alive, the artisanal workshop model collapses under its own weight. The mission of an Agent Context Engine is to turn Context Engineering from an "art" into an industrial-grade science.

## Deconstructing the Agent Context Engine

So, what exactly is an Agent Context Engine? It is a unified, intelligent, and automated platform responsible for the end-to-end process of assembling the optimal context for an LLM or Agent at the moment of inference. It moves from artisanal crafting to industrialized production.

At its core, an Agent Context Engine is built on a triumvirate of next-generation retrieval capabilities, seamlessly integrated into a single service layer:

1. The Knowledge Core (Advanced RAG): This is the evolution of traditional RAG. It moves beyond simple chunk-and-embed to intelligently process static, private enterprise knowledge. Techniques like TreeRAG (building LLM-generated document outlines for "locate-then-expand" retrieval) and GraphRAG (extracting entity networks to find semantically distant connections) work to close the "semantic gap." The engine's Ingestion Pipeline acts as the ETL for unstructured data, parsing multi-format documents and using LLMs to enrich content with summaries, metadata, and structure before indexing.

2. The Memory Layer: An Agent's intelligence is defined by its ability to learn from interaction. The Memory Layer is a specialized retrieval system for dynamic, episodic data: conversation history, user preferences, and the agent's own internal state (e.g., "waiting for human input"). It manages the lifecycle of this data—storing raw dialogue, triggering summarization into semantic memory, and retrieving relevant past interactions to provide continuity and personalization. Technologically, it is a close sibling to RAG, but focused on a temporal stream of data.
3. The Tool Orchestrator: As MCP (Model Context Protocol) enables the connection of hundreds of internal services as tools, a new problem arises: tool selection. The Context Engine solves this with Tool Retrieval. Instead of dumping all tool descriptions into the prompt, it maintains an index of tools and—critically—an index of Playbooks or Guidelines (best practices on when and how to use tools). For a given task, it retrieves only the most relevant tools and instructions, transforming the LLM's job from "searching a haystack" to "following a recipe." A minimal sketch of this idea follows the list.
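To make tool retrieval concrete, here is a deliberately minimal Python sketch. Everything in it is illustrative rather than RAGFlow's actual implementation: `embed()` is a toy hashed bag-of-words stand-in for a real embedding model, and `Tool`, `build_index`, and `select_tools` are hypothetical names invented for this example.

```python
import hashlib
from dataclasses import dataclass, field

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a real embedding model: a hashed bag-of-words."""
    v = [0.0] * dim
    for token in text.lower().split():
        v[int(hashlib.md5(token.encode()).hexdigest(), 16) % dim] += 1.0
    return v

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sum(x * x for x in a) ** 0.5 * sum(y * y for y in b) ** 0.5
    return dot / norm if norm else 0.0

@dataclass
class Tool:
    name: str
    description: str        # what the tool does and when to use it
    playbook: str = ""      # guidelines / best practices for using it
    vector: list[float] = field(default_factory=list)

def build_index(tools: list[Tool]) -> list[Tool]:
    """Index each tool over its description *and* its playbook."""
    for tool in tools:
        tool.vector = embed(tool.description + " " + tool.playbook)
    return tools

def select_tools(task: str, tools: list[Tool], k: int = 3) -> list[Tool]:
    """Retrieve only the k most task-relevant tools (plus their playbooks)
    for the prompt, instead of dumping every tool description into it."""
    query = embed(task)
    return sorted(tools, key=lambda t: cosine(query, t.vector), reverse=True)[:k]
```

The point is the shape of the interface rather than the scoring: the engine turns "which of our hundreds of tools matter right now?" into an ordinary retrieval query.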
## Why do we need a dedicated engine? The case for a unified substrate

The necessity of an Agent Context Engine becomes clear when we examine the alternative: siloed, manually wired components.

- The Data Silo Problem: Knowledge, memory, and tools reside in separate systems, requiring complex integration for each new agent.
- The Assembly Line Bottleneck: Developers spend more time on context plumbing than on agent logic, slowing innovation to a crawl.
- The "Context Ownership" Dilemma: In manually engineered systems, context logic is buried in code, owned by developers, and opaque to business users. An Engine makes context a configurable, observable, and customer-owned asset.

The shift from Context Engineering to a Context Platform/Engine marks the maturation of enterprise AI, as summarized in the table below:

| Dimension | Context engineering (present) | Context engine/Platform (future) |
| ------------------- | --------------------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| Context creation | Manual, artisanal work by developers and prompt engineers. | Automated, driven by intelligent ingestion pipelines and configurable rules. |
| Context delivery | Hard-coded prompts and static retrieval logic embedded in agent workflows. | Dynamic, real-time retrieval and assembly based on the agent's live state and intent. |
| Context maintenance | A development and operational burden, logic locked in code. | A manageable platform function, with visibility and control returned to the business. |

## RAGFlow: A resolute march toward the context engine of Agents

This is the future RAGFlow is forging. We left behind the label of "yet another RAG system" long ago. From DeepDoc—our deeply optimized, multimodal document parser—to the bleeding-edge architectures that bridge semantic chasms in complex RAG scenarios, all the way to a full-blown, enterprise-grade ingestion pipeline, every evolutionary step RAGFlow takes is a deliberate stride toward the ultimate form: an Agentic Context Engine.

We believe tomorrow's enterprise AI advantage will hinge not on who owns the largest model, but on who can feed that model the highest-quality, most real-time, and most relevant context. An Agentic Context Engine is the critical infrastructure that turns this vision into reality. In the paradigm shift from "hand-crafted prompts" to "intelligent context," RAGFlow is determined to be the most steadfast propeller and enabler.

We invite every developer, enterprise, and researcher who cares about the future of AI agents to follow RAGFlow's journey—so together we can witness and build the cornerstone of the next-generation AI stack.

---

---
sidebar_position: 1
slug: /what-is-rag
---

# What is Retrieval-Augmented Generation (RAG)?

Since large language models (LLMs) became the focus of technology, their ability to handle general knowledge has been astonishing. However, when questions shift to internal corporate documents, proprietary knowledge bases, or real-time data, the limitations of LLMs become glaringly apparent: they cannot access private information outside their training data.

Retrieval-Augmented Generation (RAG) was born precisely to address this core need. Before an LLM generates an answer, it first retrieves the most relevant context from an external knowledge base and inputs it as "reference material" to the LLM, thereby guiding it to produce accurate answers. In short, RAG elevates LLMs from "relying on memory" to "having evidence to rely on," significantly improving their accuracy and trustworthiness in specialized fields and real-time information queries.

## Why is RAG important?

Although LLMs excel in language understanding and generation, they have inherent limitations:

- Static Knowledge: The model's knowledge is based on a data snapshot from its training time and cannot be automatically updated, making it difficult to perceive the latest information.
- Blind Spot to External Data: They cannot directly access corporate private documents, real-time information streams, or domain-specific content.
- Hallucination Risk: When lacking accurate evidence, they may still fabricate plausible-sounding but false answers to maintain conversational fluency.

The introduction of RAG provides LLMs with real-time, credible "factual grounding." Its core mechanism is divided into two stages:

- Retrieval Stage: Based on the user's question, quickly retrieve the most relevant documents or data fragments from an external knowledge base.
- Generation Stage: The LLM organizes and generates the final answer by incorporating the retrieved information as context, combined with its own linguistic capabilities.

This upgrades LLMs from "speaking from memory" to "speaking with documentation," significantly enhancing reliability in professional and enterprise-level applications.

## How does RAG work?

Retrieval-Augmented Generation enables LLMs to generate higher-quality responses by leveraging real-time, external, or private data sources through the introduction of an information retrieval mechanism. Its workflow can be divided into the following key steps:

### Data processing and vectorization

The knowledge required by RAG comes from unstructured data in various formats, such as documents, database records, or API return content. This data typically needs to be chunked, then transformed into vectors via an embedding model, and stored in a vector database.

Why is chunking needed? Indexing entire documents directly faces the following problems:

- Decreased Retrieval Precision: Vectorizing long documents leads to semantic "averaging," losing details.
- Context Length Limitation: LLMs have a finite context window, requiring filtering of the most relevant parts for input.
- Cost and Efficiency: Embedding computation and retrieval costs are higher for long texts.

Therefore, an intelligent chunking strategy is key to balancing information integrity, retrieval granularity, and computational efficiency.

### Retrieve relevant information

The user's query is also converted into a vector to perform semantic relevance searches (e.g., calculating cosine similarity) in the vector database, matching and recalling the most relevant text fragments.

### Context construction and answer generation

The retrieved relevant content is added to the LLM's context as factual grounding, and the LLM finally generates the answer. The sketch below ties these three steps together.
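The following is a deliberately minimal sketch, not production code: `embed()` is a toy hashed bag-of-words standing in for a real embedding model and vector database, and `llm` is any text-completion callable you supply.

```python
import hashlib

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy stand-in for a real embedding model: a hashed bag-of-words."""
    v = [0.0] * dim
    for token in text.lower().split():
        v[int(hashlib.md5(token.encode()).hexdigest(), 16) % dim] += 1.0
    return v

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sum(x * x for x in a) ** 0.5 * sum(y * y for y in b) ** 0.5
    return dot / norm if norm else 0.0

def chunk(text: str, max_words: int = 200) -> list[str]:
    """Step 1: naive fixed-size chunking (real systems use smarter strategies)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def retrieve(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Step 2: recall the k chunks most similar to the query vector."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def answer(question: str, corpus: str, llm) -> str:
    """Step 3: assemble the retrieved chunks as factual grounding, then generate."""
    context = "\n\n".join(retrieve(question, chunk(corpus)))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```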
Therefore, RAG can be seen as Context Engineering 1.0 for automated context construction.

## Deep dive into existing RAG architecture: beyond vector retrieval

An industrial-grade RAG system is far from being as simple as "vector search + LLM"; its complexity and challenges are primarily embedded in the retrieval process.

### Data complexity: multimodal document processing

Core Challenge: Corporate knowledge mostly exists in the form of multimodal documents containing text, charts, tables, and formulas. Simple OCR extraction loses a large amount of semantic information.

Advanced Practice: Leading solutions, such as RAGFlow, tend to use Visual Language Models (VLMs) or specialized parsing models like DeepDoc to "translate" multimodal documents into unimodal text rich in structural and semantic information. Converting multimodal information into high-quality unimodal text has become standard practice for advanced RAG.

### The complexity of chunking: the trade-off between precision and context

A simple "chunk-embed-retrieve" pipeline has an inherent contradiction:

- Semantic Matching requires small text chunks to ensure clear semantic focus.
- Context Understanding requires large text chunks to ensure complete and coherent information.

This forces system design into a difficult trade-off between "precise but fragmented" and "complete but vague."

Advanced Practice: Leading solutions, such as RAGFlow, employ semantic enhancement techniques like constructing semantic tables of contents and knowledge graphs. These not only address semantic fragmentation caused by physical chunking but also enable the discovery of relevant content across documents based on entity-relationship networks.

### Why is a vector database insufficient for serving RAG?

Vector databases excel at semantic similarity search, but RAG requires precise and reliable answers, demanding more capabilities from the retrieval system:

- Hybrid Search: Relying solely on vector retrieval may miss exact keyword matches (e.g., product codes, regulation numbers). Hybrid search, combining vector retrieval with keyword retrieval (BM25), ensures both semantic breadth and keyword precision.
- Tensor or Multi-Vector Representation: To support cross-modal data, employing tensor or multi-vector representation has become an important trend.
- Metadata Filtering: Filtering based on attributes like date, department, and type is a rigid requirement in business scenarios.

Therefore, the retrieval layer of RAG is a composite system based on vector search but must integrate capabilities like full-text search, re-ranking, and metadata filtering.

## RAG and memory: Retrieval from the same source but different streams

Within the agent framework, the essence of the memory mechanism is the same as RAG: both retrieve relevant information from storage based on current needs. The key difference lies in the data source:

- RAG: Targets pre-existing static or dynamic private data provided by the user in advance (e.g., documents, databases).
- Memory: Targets dynamic data generated or perceived by the agent in real-time during interaction (e.g., conversation history, environmental state, tool execution results).

They are highly consistent at the technical base (e.g., vector retrieval, keyword matching) and can be seen as the same retrieval capability applied in different scenarios ("existing knowledge" vs. "interaction memory"). A complete agent system often includes both a RAG module for inherent knowledge and a Memory module for interaction history. The sketch below illustrates both points at once: a single hybrid retriever applied to two different streams.
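This sketch reuses the toy `embed()` and `cosine()` helpers from the earlier example; the blending weight `alpha`, the corpora, and all names are made up for illustration, and `keyword_score()` is a crude stand-in for a real BM25 implementation.

```python
def keyword_score(query: str, doc: str) -> float:
    """Crude keyword signal (BM25 stand-in): fraction of query terms present."""
    terms = set(query.lower().split())
    return len(terms & set(doc.lower().split())) / (len(terms) or 1)

def hybrid_retrieve(query: str, corpus: list[str], k: int = 3, alpha: float = 0.7) -> list[str]:
    """Blend semantic similarity with exact keyword matching."""
    q = embed(query)
    def score(doc: str) -> float:
        return alpha * cosine(q, embed(doc)) + (1 - alpha) * keyword_score(query, doc)
    return sorted(corpus, key=score, reverse=True)[:k]

# The same retriever serves both streams:
knowledge_chunks = ["Regulation EU-2024-17 applies to product X ...", "..."]      # pre-ingested documents (RAG)
memory_events = ["User: please answer in French.", "Tool `search` returned ..."]  # interaction history (Memory)

grounding = hybrid_retrieve("Which regulation applies to product X?", knowledge_chunks)
recalled = hybrid_retrieve("What language does the user prefer?", memory_events)
```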
## RAG applications

RAG has demonstrated clear value in several typical scenarios:

1. Enterprise Knowledge Q&A and Internal Search: By vectorizing corporate private data and combining it with an LLM, RAG can directly return natural language answers based on authoritative sources, rather than document lists. While meeting intelligent Q&A needs, it inherently aligns with corporate requirements for data security, access control, and compliance.

2. Complex Document Understanding and Professional Q&A: For structurally complex documents like contracts and regulations, the value of RAG lies in its ability to generate accurate, verifiable answers while maintaining context integrity. Its system accuracy largely depends on text chunking and semantic understanding strategies.

3. Dynamic Knowledge Fusion and Decision Support: In business scenarios requiring the synthesis of information from multiple sources, RAG evolves into a knowledge orchestration and reasoning support system for business decisions. Through a multi-path recall mechanism, it fuses knowledge from different systems and formats, maintaining factual consistency and logical controllability during the generation phase.

## The future of RAG

The evolution of RAG is unfolding along several clear paths:

1. RAG as the data foundation for Agents: RAG and agents stand in an architecture-to-scenario relationship. For agents to achieve autonomous and reliable decision-making and execution, they must rely on accurate and timely knowledge. RAG provides them with a standardized capability to access private domain knowledge and is an inevitable choice for building knowledge-aware agents.

2. Advanced RAG: using LLMs to optimize retrieval itself: The core feature of next-generation RAG is fully utilizing the reasoning capabilities of LLMs to optimize the retrieval process, such as rewriting queries, summarizing or fusing results, or implementing intelligent routing. Empowering every aspect of retrieval with LLMs is key to breaking through current performance bottlenecks.

3. Towards context engineering 2.0: Current RAG can be viewed as Context Engineering 1.0, whose core is assembling static knowledge context for single Q&A tasks. The forthcoming Context Engineering 2.0 will extend with RAG technology at its core, becoming a system that automatically and dynamically assembles comprehensive context for agents. The context fused by this system will come not only from documents but also include interaction memory, available tools/skills, and real-time environmental information. This marks the transition of agent development from a "handicraft workshop" model to the industrial starting point of automated context engineering.

The essence of RAG is to build a dedicated, efficient, and trustworthy external data interface for large language models; its core is Retrieval, not Generation. Starting from the practical need to solve private data access, its technical depth is reflected in the optimization of retrieval for complex unstructured data. With its deep integration into agent architectures and its development towards automated context engineering, RAG is evolving from a technology that improves Q&A quality into the core infrastructure for building the next generation of trustworthy, controllable, and scalable intelligent applications.

---

---
sidebar_position: 1
slug: /configurations
---

# Configuration

Configurations for deploying RAGFlow via Docker.
## Guidelines

When it comes to system configurations, you will need to manage the following files:

- [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env): Contains important environment variables for Docker.
- [service_conf.yaml.template](https://github.com/infiniflow/ragflow/blob/main/docker/service_conf.yaml.template): Configures the back-end services. It specifies the system-level configuration for RAGFlow and is used by its API server and task executor. Upon container startup, the `service_conf.yaml` file will be generated based on this template file. This process replaces any environment variables within the template, allowing for dynamic configuration tailored to the container's environment.
- [docker-compose.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose.yml): The Docker Compose file for starting up the RAGFlow service.

To update the default HTTP serving port (80), go to [docker-compose.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose.yml) and change `80:80` to `<YOUR_SERVING_PORT>:80`.

:::tip NOTE
Updates to the above configurations require a reboot of all containers to take effect:

```bash
docker compose -f docker/docker-compose.yml up -d
```
:::

## Docker Compose

- **docker-compose.yml**
  Sets up the environment for RAGFlow and its dependencies.
- **docker-compose-base.yml**
  Sets up the environment for RAGFlow's dependencies: Elasticsearch/[Infinity](https://github.com/infiniflow/infinity), MySQL, MinIO, and Redis.

:::danger IMPORTANT
We do not actively maintain **docker-compose-CN-oc9.yml** or **docker-compose-macos.yml**, so use them at your own risk. However, you are welcome to file a pull request to improve them.
:::

## Docker environment variables

The [.env](https://github.com/infiniflow/ragflow/blob/main/docker/.env) file contains important environment variables for Docker.

### Elasticsearch

- `STACK_VERSION`
  The version of Elasticsearch. Defaults to `8.11.3`.
- `ES_PORT`
  The port used to expose the Elasticsearch service to the host machine, allowing **external** access to the service running inside the Docker container. Defaults to `1200`.
- `ELASTIC_PASSWORD`
  The password for Elasticsearch.

### Kibana

- `KIBANA_PORT`
  The port used to expose the Kibana service to the host machine, allowing **external** access to the service running inside the Docker container. Defaults to `6601`.
- `KIBANA_USER`
  The username for Kibana. Defaults to `rag_flow`.
- `KIBANA_PASSWORD`
  The password for Kibana. Defaults to `infini_rag_flow`.

### Resource management

- `MEM_LIMIT`
  The maximum amount of memory, in bytes, that *a specific* Docker container can use while running. Defaults to `8073741824`.

### MySQL

- `MYSQL_PASSWORD`
  The password for MySQL.
- `MYSQL_PORT`
  The port used to expose the MySQL service to the host machine, allowing **external** access to the MySQL database running inside the Docker container. Defaults to `5455`.

### MinIO

RAGFlow utilizes MinIO as its object storage solution, leveraging its scalability to store and manage all uploaded files.

- `MINIO_CONSOLE_PORT`
  The port used to expose the MinIO console interface to the host machine, allowing **external** access to the web-based console running inside the Docker container. Defaults to `9001`.
- `MINIO_PORT`
  The port used to expose the MinIO API service to the host machine, allowing **external** access to the MinIO object storage service running inside the Docker container. Defaults to `9000`.
- `MINIO_USER`
  The username for MinIO.
- `MINIO_PASSWORD`
  The password for MinIO.
### Redis

- `REDIS_PORT`
  The port used to expose the Redis service to the host machine, allowing **external** access to the Redis service running inside the Docker container. Defaults to `6379`.
- `REDIS_USERNAME`
  Optional Redis ACL username when using Redis 6+ authentication.
- `REDIS_PASSWORD`
  The password for Redis.

### RAGFlow

- `SVR_HTTP_PORT`
  The port used to expose RAGFlow's HTTP API service to the host machine, allowing **external** access to the service running inside the Docker container. Defaults to `9380`.
- `RAGFLOW_IMAGE`
  The Docker image edition. Defaults to `infiniflow/ragflow:v0.23.1` (the RAGFlow Docker image without embedding models).

:::tip NOTE
If you cannot download the RAGFlow Docker image, try the following mirrors.

- For the `nightly` edition:
  - `RAGFLOW_IMAGE=swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow:nightly` or,
  - `RAGFLOW_IMAGE=registry.cn-hangzhou.aliyuncs.com/infiniflow/ragflow:nightly`.
:::

### Embedding service

- `TEI_MODEL`
  The embedding model served by text-embeddings-inference. Allowed values are `Qwen/Qwen3-Embedding-0.6B` (default), `BAAI/bge-m3`, and `BAAI/bge-small-en-v1.5`.
- `TEI_PORT`
  The port used to expose the text-embeddings-inference service to the host machine, allowing **external** access to the text-embeddings-inference service running inside the Docker container. Defaults to `6380`.

### Timezone

- `TZ`
  The local time zone. Defaults to `Asia/Shanghai`.

### Hugging Face mirror site

- `HF_ENDPOINT`
  The mirror site for huggingface.co. It is disabled by default. You can uncomment this line if you have limited access to the primary Hugging Face domain.

### macOS

- `MACOS`
  Optimizations for macOS. It is disabled by default. You can uncomment this line if your OS is macOS.

### User registration

- `REGISTER_ENABLED`
  - `1`: (Default) Enable user registration.
  - `0`: Disable user registration.

## Service configuration

[service_conf.yaml.template](https://github.com/infiniflow/ragflow/blob/main/docker/service_conf.yaml.template) specifies the system-level configuration for RAGFlow and is used by its API server and task executor.

### `ragflow`

- `host`: The API server's IP address inside the Docker container. Defaults to `0.0.0.0`.
- `port`: The API server's serving port inside the Docker container. Defaults to `9380`.

### `mysql`

- `name`: The MySQL database name. Defaults to `rag_flow`.
- `user`: The username for MySQL.
- `password`: The password for MySQL.
- `port`: The MySQL serving port inside the Docker container. Defaults to `3306`.
- `max_connections`: The maximum number of concurrent connections to the MySQL database. Defaults to `100`.
- `stale_timeout`: Timeout in seconds.

### `minio`

- `user`: The username for MinIO.
- `password`: The password for MinIO.
- `host`: The MinIO serving IP *and* port inside the Docker container. Defaults to `minio:9000`.

### `redis`

- `host`: The Redis serving IP *and* port inside the Docker container. Defaults to `redis:6379`.
- `db`: The Redis database index to use. Defaults to `1`.
- `username`: Optional Redis ACL username (Redis 6+).
- `password`: The password for the specified Redis user.

### `oauth`

The OAuth configuration for signing up or signing in to RAGFlow using a third-party account.

- `<channel>`: Custom channel ID.
- `type`: Authentication type, options include `oauth2`, `oidc`, `github`. Default is `oauth2`; when the `issuer` parameter is provided, defaults to `oidc`.
- `icon`: Icon ID, options include `github`, `sso`, default is `sso`.
- `display_name`: Channel name, defaults to the Title Case format of the channel ID.
- `client_id`: Required, unique identifier assigned to the client application.
- `client_secret`: Required, secret key for the client application, used for communication with the authentication server.
- `authorization_url`: Base URL for obtaining user authorization.
- `token_url`: URL for exchanging the authorization code and obtaining an access token.
- `userinfo_url`: URL for obtaining user information (username, email, etc.).
- `issuer`: Base URL of the identity provider. OIDC clients can dynamically obtain the identity provider's metadata (`authorization_url`, `token_url`, `userinfo_url`) through `issuer`.
- `scope`: Requested permission scope, a space-separated string. For example, `openid profile email`.
- `redirect_uri`: Required, URI to which the authorization server redirects during the authentication flow to return results. Must match the callback URI registered with the authentication server. Format: `https://your-app.com/v1/user/oauth/callback/<channel>`. For local configuration, you can directly use `http://127.0.0.1:80/v1/user/oauth/callback/<channel>`.

:::tip NOTE
The following are best practices for configuring various third-party authentication methods. You can configure one or multiple third-party authentication methods for RAGFlow:

```yaml
oauth:
  oauth2:
    display_name: "OAuth2"
    client_id: "your_client_id"
    client_secret: "your_client_secret"
    authorization_url: "https://your-oauth-provider.com/oauth/authorize"
    token_url: "https://your-oauth-provider.com/oauth/token"
    userinfo_url: "https://your-oauth-provider.com/oauth/userinfo"
    redirect_uri: "https://your-app.com/v1/user/oauth/callback/oauth2"
  oidc:
    display_name: "OIDC"
    client_id: "your_client_id"
    client_secret: "your_client_secret"
    issuer: "https://your-oauth-provider.com/oidc"
    scope: "openid email profile"
    redirect_uri: "https://your-app.com/v1/user/oauth/callback/oidc"
  github: # https://docs.github.com/en/apps/oauth-apps/building-oauth-apps/creating-an-oauth-app
    type: "github"
    icon: "github"
    display_name: "Github"
    client_id: "your_client_id"
    client_secret: "your_client_secret"
    redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
```
:::

### `user_default_llm`

The default LLM to use for a new RAGFlow user. It is disabled by default. To enable this feature, uncomment the corresponding lines in **service_conf.yaml.template**.

- `factory`: The LLM supplier. Available options:
  - `"OpenAI"`
  - `"DeepSeek"`
  - `"Moonshot"`
  - `"Tongyi-Qianwen"`
  - `"VolcEngine"`
  - `"ZHIPU-AI"`
- `api_key`: The API key for the specified LLM. You will need to apply for your model API key online.
- `allowed_factories`: If this is set, users will be allowed to add only the factories in this list.
  - `"OpenAI"`
  - `"DeepSeek"`
  - `"Moonshot"`

:::tip NOTE
If you do not set the default LLM here, configure the default LLM on the **Settings** page in the RAGFlow UI.
:::

---

---
sidebar_position: 1
slug: /contributing
---

# Contribution guidelines

General guidelines for RAGFlow's community contributors.

---

This document offers guidelines and major considerations for submitting your contributions to RAGFlow.

- To report a bug, file a [GitHub issue](https://github.com/infiniflow/ragflow/issues/new/choose) with us.
- For further questions, you can explore existing discussions or initiate a new one in [Discussions](https://github.com/orgs/infiniflow/discussions).

## What you can contribute

The list below mentions some contributions you can make, but it is not a complete list.
- Proposing or implementing new features
- Fixing a bug
- Adding test cases or demos
- Posting a blog or tutorial
- Updating existing documentation, code, or annotations
- Suggesting more user-friendly error codes

## File a pull request (PR)

### General workflow

1. Fork our GitHub repository.
2. Clone your fork to your local machine: `git clone git@github.com:<your_github_username>/ragflow.git`
3. Create a local branch: `git checkout -b my-branch`
4. Commit changes to your local branch, providing sufficient information in your commit message: `git commit -m 'Provide sufficient info in your commit message'`
5. Push your changes to GitHub: `git push origin my-branch`
6. Submit a pull request for review.

### Before filing a PR

- Consider splitting a large PR into multiple smaller, standalone PRs to keep a traceable development history.
- Ensure that your PR addresses just one issue, or keep any unrelated changes small.
- Add test cases when contributing new features. They demonstrate that your code functions correctly and protect against potential issues from future changes.

### Describing your PR

- Ensure that your PR title is concise and clear, providing all the required information.
- Refer to a corresponding GitHub issue in your PR description if applicable.
- Include sufficient design details for *breaking changes* or *API changes* in your description.

### Reviewing & merging a PR

Ensure that your PR passes all Continuous Integration (CI) tests before merging it.

---

---
sidebar_position: 4
slug: /acquire_ragflow_api_key
---

# Acquire RAGFlow API key

An API key is required for the RAGFlow server to authenticate your HTTP/Python or MCP requests. This document provides instructions on obtaining a RAGFlow API key.

1. Click your avatar in the top right corner of the RAGFlow UI to access the configuration page.
2. Click **API** to switch to the **API** page.
3. Obtain a RAGFlow API key:

![ragflow_api_key](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_api_key.jpg)

:::tip NOTE
See the [RAGFlow HTTP API reference](../references/http_api_reference.md) or the [RAGFlow Python API reference](../references/python_api_reference.md) for a complete reference of RAGFlow's HTTP or Python APIs.
:::
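With the key in hand, you can authenticate programmatic access. Below is a minimal sketch using the RAGFlow Python SDK; the API key and base URL are placeholders, and the [Python API reference](../references/python_api_reference.md) remains the authoritative source for signatures.

```python
from ragflow_sdk import RAGFlow

# Placeholders: substitute your own API key and server address.
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_HOST>:9380")

# A quick sanity check that the key is accepted:
for dataset in rag_object.list_datasets():
    print(dataset.name)
```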
---

---
sidebar_position: 2
slug: /launch_ragflow_from_source
---

# Launch service from source

A guide explaining how to set up a RAGFlow service from its source code. By following this guide, you'll be able to debug using the source code.

## Target audience

Developers who have added new features or modified existing code and wish to debug using the source code, *provided that* their machine has the target deployment environment set up.

## Prerequisites

- CPU ≥ 4 cores
- RAM ≥ 16 GB
- Disk ≥ 50 GB
- Docker ≥ 24.0.0 & Docker Compose ≥ v2.26.1

:::tip NOTE
If you have not installed Docker on your local machine (Windows, Mac, or Linux), see the [Install Docker Engine](https://docs.docker.com/engine/install/) guide.
:::

## Launch a service from source

To launch a RAGFlow service from source code:

### Clone the RAGFlow repository

```bash
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
```

### Install Python dependencies

1. Install uv:

   ```bash
   pipx install uv
   ```

2. Install the RAGFlow service's Python dependencies:

   ```bash
   uv sync --python 3.12 --frozen
   ```

   *A virtual environment named `.venv` is created, and all Python dependencies are installed into the new environment.*

   If you need to run tests against the RAGFlow service, install the test dependencies:

   ```bash
   uv sync --python 3.12 --group test --frozen && uv pip install sdk/python --group test
   ```

### Launch third-party services

The following command launches the 'base' services (MinIO, Elasticsearch, Redis, and MySQL) using Docker Compose:

```bash
docker compose -f docker/docker-compose-base.yml up -d
```

### Update `host` and `port` settings for third-party services

1. Add the following line to `/etc/hosts` to resolve all hosts specified in **docker/service_conf.yaml.template** to `127.0.0.1`:

   ```
   127.0.0.1       es01 infinity mysql minio redis
   ```

2. In **docker/service_conf.yaml.template**, update the MySQL port to `5455` and the Elasticsearch port to `1200`, as specified in **docker/.env**.

### Launch the RAGFlow backend service

1. Comment out the `nginx` line in **docker/entrypoint.sh**:

   ```
   # /usr/sbin/nginx
   ```

2. Activate the Python virtual environment:

   ```bash
   source .venv/bin/activate
   export PYTHONPATH=$(pwd)
   ```

3. **Optional:** If you cannot access HuggingFace, set the `HF_ENDPOINT` environment variable to use a mirror site:

   ```bash
   export HF_ENDPOINT=https://hf-mirror.com
   ```

4. Check the configuration in **conf/service_conf.yaml**, ensuring all hosts and ports are correctly set.

5. Run the **entrypoint.sh** script to launch the backend service:

   ```shell
   JEMALLOC_PATH=$(pkg-config --variable=libdir jemalloc)/libjemalloc.so;
   LD_PRELOAD=$JEMALLOC_PATH python rag/svr/task_executor.py 1;
   ```

   ```shell
   python api/ragflow_server.py;
   ```

### Launch the RAGFlow frontend service

1. Navigate to the `web` directory and install the frontend dependencies:

   ```bash
   cd web
   npm install
   ```

2. Update `proxy.target` in **.umirc.ts** to `http://127.0.0.1:9380`:

   ```bash
   vim .umirc.ts
   ```

3. Start up the RAGFlow frontend service:

   ```bash
   npm run dev
   ```

   *The following message appears, showing the IP address and port number of your frontend service:*

   ![](https://github.com/user-attachments/assets/0daf462c-a24d-4496-a66f-92533534e187)

### Access the RAGFlow service

In your web browser, enter `http://127.0.0.1:<PORT>/`, ensuring the port number matches that shown in the screenshot above.

### Stop the RAGFlow service when the development is done

1. Stop the RAGFlow frontend service:

   ```bash
   pkill npm
   ```

2. Stop the RAGFlow backend service:

   ```bash
   pkill -f "docker/entrypoint.sh"
   ```

---

---
sidebar_position: 1
slug: /launch_mcp_server
---

# Launch RAGFlow MCP server

Launch an MCP server from source or via Docker.

---

A RAGFlow Model Context Protocol (MCP) server is designed as an independent component to complement the RAGFlow server. Note that an MCP server must operate alongside a properly functioning RAGFlow server.

An MCP server can start up in either self-host mode (default) or host mode:

- **Self-host mode**: When launching an MCP server in self-host mode, you must provide an API key to authenticate the MCP server with the RAGFlow server. In this mode, the MCP server can access *only* the datasets of a specified tenant on the RAGFlow server.
- **Host mode**: In host mode, each MCP client can access their own datasets on the RAGFlow server. However, each client request must include a valid API key to authenticate the client with the RAGFlow server.
Once a connection is established, an MCP server communicates with its client in MCP HTTP+SSE (Server-Sent Events) mode, unidirectionally pushing responses from the RAGFlow server to its client in real time.

## Prerequisites

1. Ensure RAGFlow is upgraded to v0.18.0 or later.
2. Have your RAGFlow API key ready. See [Acquire a RAGFlow API key](../acquire_ragflow_api_key.md).

:::tip INFO
If you wish to try out our MCP server without upgrading RAGFlow, community contributor [yiminghub2024](https://github.com/yiminghub2024) 👏 shares their recommended steps [here](#launch-an-mcp-server-without-upgrading-ragflow).
:::

## Launch an MCP server

You can start an MCP server either from source code or via Docker.

### Launch from source code

1. Ensure that a RAGFlow server v0.18.0+ is properly running.
2. Launch the MCP server:

   ```bash
   # To launch the MCP server to work in self-host mode, run either of the following:
   uv run mcp/server/server.py --host=127.0.0.1 --port=9382 --base-url=http://127.0.0.1:9380 --api-key=ragflow-xxxxx
   # uv run mcp/server/server.py --host=127.0.0.1 --port=9382 --base-url=http://127.0.0.1:9380 --mode=self-host --api-key=ragflow-xxxxx

   # To launch the MCP server to work in host mode, run the following instead:
   # uv run mcp/server/server.py --host=127.0.0.1 --port=9382 --base-url=http://127.0.0.1:9380 --mode=host
   ```

   Where:

   - `host`: The MCP server's host address.
   - `port`: The MCP server's listening port.
   - `base_url`: The address of the running RAGFlow server.
   - `mode`: The launch mode.
     - `self-host`: (default) self-host mode.
     - `host`: host mode.
   - `api_key`: Required in self-host mode to authenticate the MCP server with the RAGFlow server. See [here](../acquire_ragflow_api_key.md) for instructions on acquiring an API key.

### Transports

The RAGFlow MCP server supports two transports: the legacy SSE transport (served at `/sse`), introduced on November 5, 2024 and deprecated on March 26, 2025, and the streamable-HTTP transport (served at `/mcp`). The legacy SSE transport and the streamable-HTTP transport with JSON responses are enabled by default. To disable either transport, use the flag `--no-transport-sse-enabled` or `--no-transport-streamable-http-enabled`. To disable JSON responses for the streamable-HTTP transport, use the `--no-json-response` flag. Once the server is up, you can verify it from the client side, as sketched below.
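The sketch below uses the official `mcp` Python SDK to connect over the legacy SSE endpoint and list the available tools. The URL and API key are placeholders; passing the key as an `api_key` header is how a client authenticates per request in host mode, while in self-host mode the MCP server was already launched with its own key.

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Placeholder URL and key: point these at your own MCP server.
    async with sse_client(
        "http://127.0.0.1:9382/sse",
        headers={"api_key": "ragflow-xxxxx"},  # needed in host mode
    ) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```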
### Launch from Docker

#### 1. Enable MCP server

The MCP server is designed as an optional component that complements the RAGFlow server and is disabled by default. To enable the MCP server:

1. Navigate to **docker/docker-compose.yml**.
2. Uncomment the `services.ragflow.command` section as shown below:

   ```yaml {6-13}
   services:
     ragflow:
       ...
       image: ${RAGFLOW_IMAGE}
       # Example configuration to set up an MCP server:
       command:
         - --enable-mcpserver
         - --mcp-host=0.0.0.0
         - --mcp-port=9382
         - --mcp-base-url=http://127.0.0.1:9380
         - --mcp-script-path=/ragflow/mcp/server/server.py
         - --mcp-mode=self-host
         - --mcp-host-api-key=ragflow-xxxxxxx
         # Optional transport flags for the RAGFlow MCP server.
         # If you set `mcp-mode` to `host`, you must add the --no-transport-streamable-http-enabled flag, because the streamable-HTTP transport is not yet supported in host mode.
         # The legacy SSE transport and the streamable-HTTP transport with JSON responses are enabled by default.
         # To disable a specific transport or JSON responses for the streamable-HTTP transport, use the corresponding flag(s):
         # - --no-transport-sse-enabled              # Disables the legacy SSE endpoint (/sse)
         # - --no-transport-streamable-http-enabled  # Disables the streamable-HTTP transport (served at the /mcp endpoint)
         # - --no-json-response                      # Disables JSON responses for the streamable-HTTP transport
   ```

   Where:

   - `mcp-host`: The MCP server's host address.
   - `mcp-port`: The MCP server's listening port.
   - `mcp-base-url`: The address of the running RAGFlow server.
   - `mcp-script-path`: The file path to the MCP server's main script.
   - `mcp-mode`: The launch mode.
     - `self-host`: (default) self-host mode.
     - `host`: host mode.
   - `mcp-host-api-key`: Required in self-host mode to authenticate the MCP server with the RAGFlow server. See [here](../acquire_ragflow_api_key.md) for instructions on acquiring an API key.

:::tip INFO
If you set `mcp-mode` to `host`, you must add the `--no-transport-streamable-http-enabled` flag, because the streamable-HTTP transport is not yet supported in host mode.
:::

#### 2. Launch a RAGFlow server with an MCP server

Run `docker compose -f docker-compose.yml up` to launch the RAGFlow server together with the MCP server.

*The following ASCII art confirms a successful launch:*

```bash
docker-ragflow-cpu-1  | Starting MCP Server on 0.0.0.0:9382 with base URL http://127.0.0.1:9380...
docker-ragflow-cpu-1  | Starting 1 task executor(s) on host 'dd0b5e07e76f'...
docker-ragflow-cpu-1  | 2025-04-18 15:41:18,816 INFO     27 ragflow_server log path: /ragflow/logs/ragflow_server.log, log levels: {'peewee': 'WARNING', 'pdfminer': 'WARNING', 'root': 'INFO'}
docker-ragflow-cpu-1  |
docker-ragflow-cpu-1  |  __  __  ____ ____    ____  _____ ______     _______ ____
docker-ragflow-cpu-1  | |  \/  |/ ___|  _ \  / ___|| ____|  _ \ \   / / ____|  _ \
docker-ragflow-cpu-1  | | |\/| | |   | |_) | \___ \|  _| | |_) \ \ / /|  _| | |_) |
docker-ragflow-cpu-1  | | |  | | |___|  __/   ___) | |___|  _ <  \ V / | |___|  _ <
docker-ragflow-cpu-1  | |_|  |_|\____|_|     |____/|_____|_| \_\  \_/  |_____|_| \_\
docker-ragflow-cpu-1  |
docker-ragflow-cpu-1  | MCP launch mode: self-host
docker-ragflow-cpu-1  | MCP host: 0.0.0.0
docker-ragflow-cpu-1  | MCP port: 9382
docker-ragflow-cpu-1  | MCP base_url: http://127.0.0.1:9380
docker-ragflow-cpu-1  | INFO:     Started server process [26]
docker-ragflow-cpu-1  | INFO:     Waiting for application startup.
docker-ragflow-cpu-1  | INFO:     Application startup complete.
docker-ragflow-cpu-1  | INFO:     Uvicorn running on http://0.0.0.0:9382 (Press CTRL+C to quit)
docker-ragflow-cpu-1  | 2025-04-18 15:41:20,469 INFO     27 found 0 gpus
docker-ragflow-cpu-1  | 2025-04-18 15:41:23,263 INFO     27 init database on cluster mode successfully
docker-ragflow-cpu-1  | 2025-04-18 15:41:25,318 INFO     27 load_model /ragflow/rag/res/deepdoc/det.onnx uses CPU
docker-ragflow-cpu-1  | 2025-04-18 15:41:25,367 INFO     27 load_model /ragflow/rag/res/deepdoc/rec.onnx uses CPU
docker-ragflow-cpu-1  |     ____  ___   ______ ______ __
docker-ragflow-cpu-1  |    / __ \/   | / ____// ____// /____  _      __
docker-ragflow-cpu-1  |   / /_/ // /| |/ / __ / /_   / // __ \| | /| / /
docker-ragflow-cpu-1  |  / _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ /
docker-ragflow-cpu-1  | /_/ |_|/_/  |_|\____//_/   /_/ \____/ |__/|__/
docker-ragflow-cpu-1  |
docker-ragflow-cpu-1  | 2025-04-18 15:41:29,088 INFO     27 RAGFlow version: v0.18.0-285-gb2c299fa full
docker-ragflow-cpu-1  | 2025-04-18 15:41:29,088 INFO     27 project base: /ragflow
docker-ragflow-cpu-1  | 2025-04-18 15:41:29,088 INFO     27 Current configs, from /ragflow/conf/service_conf.yaml:
docker-ragflow-cpu-1  | ragflow: {'host': '0.0.0.0', 'http_port': 9380}
...
docker-ragflow-cpu-1  |  * Running on all addresses (0.0.0.0)
docker-ragflow-cpu-1  |  * Running on http://127.0.0.1:9380
docker-ragflow-cpu-1  |  * Running on http://172.19.0.6:9380
docker-ragflow-cpu-1  |   ______           __   ______                     __
docker-ragflow-cpu-1  |  /_  __/___ ______/ /__ / ____/  _____  _______  __/ /_____  _____
docker-ragflow-cpu-1  |   / / / __ `/ ___/ //_/ / __/ | |/_/ _ \/ ___/ / / / __/ __ \/ ___/
docker-ragflow-cpu-1  |  / / / /_/ (__  ) ,< / /___ ...
```

---

`SHOW SERVICE <id>;`

- Shows detailed status information for the service identified by **id**.
- [Example](#example-show-service)

`SHOW VERSION;`

- Shows the RAGFlow version.
- [Example](#example-show-version)

### User Management Commands

`LIST USERS;`

- Lists all users known to the system.
- [Example](#example-list-users)

`SHOW USER <email>;`

- Shows details and permissions for the user specified by **email**. The email must be enclosed in single or double quotes.
- [Example](#example-show-user)

`CREATE USER <email> <password>;`

- Creates a user with the specified email and password. Both must be enclosed in single or double quotes.
- [Example](#example-create-user)

`DROP USER <email>;`

- Removes the specified user from the system. Use with caution.
- [Example](#example-drop-user)

`ALTER USER PASSWORD <email> <new_password>;`

- Changes the password for the specified user.
- [Example](#example-alter-user-password)

`ALTER USER ACTIVE <email> <on|off>;`

- Changes the user to active or inactive.
- [Example](#example-alter-user-active)

### Data and Agent Commands

`LIST DATASETS OF <email>;`

- Lists the datasets associated with the specified user.
- [Example](#example-list-datasets-of-user)

`LIST AGENTS OF <email>;`

- Lists the agents associated with the specified user.
- [Example](#example-list-agents-of-user)

### Meta-Commands

- `\?` or `\help`: Shows help information for the available commands.
- `\q` or `\quit`: Exits the CLI application.
- [Example](#example-meta-commands)

### Examples

- List all available services.
  ```
  admin> list services;
  command: list services;
  Listing all services
  +--------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
  | extra                                                                                      | host      | id | name          | port  | service_type   | status  |
  +--------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
  | {}                                                                                         | 0.0.0.0   | 0  | ragflow_0     | 9380  | ragflow_server | Timeout |
  | {'meta_type': 'mysql', 'password': 'infini_rag_flow', 'username': 'root'}                  | localhost | 1  | mysql         | 5455  | meta_data      | Alive   |
  | {'password': 'infini_rag_flow', 'store_type': 'minio', 'user': 'rag_flow'}                 | localhost | 2  | minio         | 9000  | file_store     | Alive   |
  | {'password': 'infini_rag_flow', 'retrieval_type': 'elasticsearch', 'username': 'elastic'}  | localhost | 3  | elasticsearch | 1200  | retrieval      | Alive   |
  | {'db_name': 'default_db', 'retrieval_type': 'infinity'}                                    | localhost | 4  | infinity      | 23817 | retrieval      | Timeout |
  | {'database': 1, 'mq_type': 'redis', 'password': 'infini_rag_flow'}                         | localhost | 5  | redis         | 6379  | message_queue  | Alive   |
  +--------------------------------------------------------------------------------------------+-----------+----+---------------+-------+----------------+---------+
  ```

- Show ragflow_server.

  ```
  admin> show service 0;
  command: show service 0;
  Showing service: 0
  Service ragflow_0 is alive. Detail:
  Confirm elapsed: 26.0 ms.
  ```

- Show mysql.

  ```
  admin> show service 1;
  command: show service 1;
  Showing service: 1
  Service mysql is alive. Detail:
  +---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
  | command | db       | host             | id   | info             | state                  | time  | user            |
  +---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
  | Daemon  | None     | localhost        | 5    | None             | Waiting on empty queue | 16111 | event_scheduler |
  | Sleep   | rag_flow | 172.18.0.1:40046 | 1610 | None             |                        | 2     | root            |
  | Query   | rag_flow | 172.18.0.1:35882 | 1629 | SHOW PROCESSLIST | init                   | 0     | root            |
  +---------+----------+------------------+------+------------------+------------------------+-------+-----------------+
  ```

- Show minio.

  ```
  admin> show service 2;
  command: show service 2;
  Showing service: 2
  Service minio is alive. Detail:
  Confirm elapsed: 2.1 ms.
  ```

- Show elasticsearch.

  ```
  admin> show service 3;
  command: show service 3;
  Showing service: 3
  Service elasticsearch is alive.
  Detail:
  +----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
  | cluster_name   | docs | docs_deleted | indices | indices_shards | jvm_heap_max | jvm_heap_used | jvm_versions | mappings_deduplicated_fields | mappings_deduplicated_size | mappings_fields | nodes | nodes_version | os_mem  | os_mem_used | os_mem_used_percent | status | store_size | total_dataset_size |
  +----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
  | docker-cluster | 717  | 86           | 37      | 42             | 3.76 GB      | 1.74 GB       | 21.0.1+12-29 | 6575                         | 48.0 KB                    | 8521            | 1     | ['8.11.3']    | 7.52 GB | 4.55 GB     | 61                  | green  | 4.60 MB    | 4.60 MB            |
  +----------------+------+--------------+---------+----------------+--------------+---------------+--------------+------------------------------+----------------------------+-----------------+-------+---------------+---------+-------------+---------------------+--------+------------+--------------------+
  ```

- Show infinity.

  ```
  admin> show service 4;
  command: show service 4;
  Showing service: 4
  Fail to show service, code: 500, message: Infinity is not in use.
  ```

- Show redis.

  ```
  admin> show service 5;
  command: show service 5;
  Showing service: 5
  Service redis is alive. Detail:
  +-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
  | blocked_clients | connected_clients | instantaneous_ops_per_sec | mem_fragmentation_ratio | redis_version | server_mode | total_commands_processed | total_system_memory | used_memory |
  +-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
  | 0               | 2                 | 1                         | 10.41                   | 7.2.4         | standalone  | 10446                    | 30.84G              | 1.10M       |
  +-----------------+-------------------+---------------------------+-------------------------+---------------+-------------+--------------------------+---------------------+-------------+
  ```

- Show RAGFlow version.

  ```
  admin> show version;
  +-----------------------+
  | version               |
  +-----------------------+
  | v0.21.0-241-gc6cf58d5 |
  +-----------------------+
  ```

- List all users.

  ```
  admin> list users;
  command: list users;
  Listing all users
  +-------------------------------+----------------------+-----------+----------+
  | create_date                   | email                | is_active | nickname |
  +-------------------------------+----------------------+-----------+----------+
  | Mon, 22 Sep 2025 10:59:04 GMT | admin@ragflow.io     | 1         | admin    |
  | Sun, 14 Sep 2025 17:36:27 GMT | lynn_inf@hotmail.com | 1         | Lynn     |
  +-------------------------------+----------------------+-----------+----------+
  ```

- Show the specified user.
  ```
  admin> show user "admin@ragflow.io";
  command: show user "admin@ragflow.io";
  Showing user: admin@ragflow.io
  +-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
  | create_date                   | email            | is_active | is_anonymous | is_authenticated | is_superuser | language | last_login_time | login_channel | status | update_date                   |
  +-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
  | Mon, 22 Sep 2025 10:59:04 GMT | admin@ragflow.io | 1         | 0            | 1                | True         | Chinese  | None            | None          | 1      | Mon, 22 Sep 2025 10:59:04 GMT |
  +-------------------------------+------------------+-----------+--------------+------------------+--------------+----------+-----------------+---------------+--------+-------------------------------+
  ```

- Create a new user.

  ```
  admin> create user "example@ragflow.io" "psw";
  command: create user "example@ragflow.io" "psw";
  Create user: example@ragflow.io, password: psw, role: user
  +----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
  | access_token                     | email              | id                               | is_superuser | login_channel | nickname |
  +----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
  | 5cdc6d1e9df111f099b543aee592c6bf | example@ragflow.io | 5cdc6ca69df111f099b543aee592c6bf | False        | password      |          |
  +----------------------------------+--------------------+----------------------------------+--------------+---------------+----------+
  ```

- Alter a user's password.

  ```
  admin> alter user password "example@ragflow.io" "newpsw";
  command: alter user password "example@ragflow.io" "newpsw";
  Alter user: example@ragflow.io, password: newpsw
  Password updated successfully!
  ```

- Alter a user's active status (turn off).

  ```
  admin> alter user active "example@ragflow.io" off;
  command: alter user active "example@ragflow.io" off;
  Alter user example@ragflow.io activate status, turn off.
  Turn off user activate status successfully!
  ```

- Drop a user. The user's data is deleted at the same time.

  ```
  admin> Drop user "example@ragflow.io";
  command: Drop user "example@ragflow.io";
  Drop user: example@ragflow.io
  Successfully deleted user. Details:
  Start to delete owned tenant.
  - Deleted 2 tenant-LLM records.
  - Deleted 0 langfuse records.
  - Deleted 1 tenant.
  - Deleted 1 user-tenant records.
  - Deleted 1 user.
  Delete done!
  ```

- List the specified user's datasets.
  ```
  admin> list datasets of "lynn_inf@hotmail.com";
  command: list datasets of "lynn_inf@hotmail.com";
  Listing all datasets of user: lynn_inf@hotmail.com
  +-----------+-------------------------------+---------+----------+---------------+------------+--------+-----------+-------------------------------+
  | chunk_num | create_date                   | doc_num | language | name          | permission | status | token_num | update_date                   |
  +-----------+-------------------------------+---------+----------+---------------+------------+--------+-----------+-------------------------------+
  | 29        | Mon, 15 Sep 2025 11:56:59 GMT | 12      | Chinese  | test_dataset  | me         | 1      | 12896     | Fri, 19 Sep 2025 17:50:58 GMT |
  | 4         | Sun, 28 Sep 2025 11:49:31 GMT | 6       | Chinese  | dataset_share | team       | 1      | 1121      | Sun, 28 Sep 2025 14:41:03 GMT |
  +-----------+-------------------------------+---------+----------+---------------+------------+--------+-----------+-------------------------------+
  ```

- List the specified user's agents.

  ```
  admin> list agents of "lynn_inf@hotmail.com";
  command: list agents of "lynn_inf@hotmail.com";
  Listing all agents of user: lynn_inf@hotmail.com
  +-----------------+-------------+------------+-----------------+
  | canvas_category | canvas_type | permission | title           |
  +-----------------+-------------+------------+-----------------+
  | agent           | None        | team       | research_helper |
  +-----------------+-------------+------------+-----------------+
  ```

- Show help information.

  ```
  admin> \help
  command: \help
  Commands:
    LIST SERVICES
    SHOW SERVICE <id>
    STARTUP SERVICE <id>
    SHUTDOWN SERVICE <id>
    RESTART SERVICE <id>
    LIST USERS
    SHOW USER <email>
    DROP USER <email>
    CREATE USER <email> <password>
    ALTER USER PASSWORD <email> <new_password>
    ALTER USER ACTIVE <email> <on|off>
    LIST DATASETS OF <email>
    LIST AGENTS OF <email>
  Meta Commands:
    \?, \h, \help     Show this help
    \q, \quit, \exit  Quit the CLI
  ```

- Exit.

  ```
  admin> \q
  command: \q
  Goodbye!
  ```

---

---
sidebar_position: 0
slug: /admin_service
---

# Admin Service

The Admin Service is the core backend management service of the RAGFlow system, providing comprehensive system administration capabilities through centralized API interfaces for managing and controlling the entire platform. Adopting a client-server architecture, it supports access and operations via both a Web UI and an Admin CLI, ensuring flexible and efficient execution of administrative tasks.

The core functions of the Admin Service include real-time monitoring of the operational status of the RAGFlow server and its critical dependent components—such as MySQL, Elasticsearch, Redis, and MinIO—along with full-featured user management. In administrator mode, it enables key operations such as viewing user information, creating users, updating passwords, modifying activation status, and performing complete user data deletion. These functions remain accessible via the Admin CLI even when the web management interface is disabled, ensuring the system stays under control at all times.

With its unified interface design, the Admin Service combines the convenience of visual administration with the efficiency and stability of command-line operations, serving as a crucial foundation for the reliable operation and secure management of the RAGFlow system.

## Starting the Admin Service

### Launching from source code

1. Before starting the Admin Service, make sure the RAGFlow system is already running.
2. Launch from source code:

   ```bash
   python admin/server/admin_server.py
   ```

   The service will start and listen for incoming connections from the CLI on the configured port.

### Using docker image
1. Before startup, configure the `docker-compose.yml` file to enable the admin server:

   ```yaml
   command:
     - --enable-adminserver
   ```

2. Start the containers. The service will start and listen for incoming connections from the CLI on the configured port.

---

---
sidebar_position: 1
slug: /admin_ui
---

# Admin UI

The RAGFlow Admin UI is a web-based interface that provides comprehensive system status monitoring and user management capabilities.

## Accessing the Admin UI

To access the RAGFlow admin UI, append `/admin` to the web UI's address, e.g. `http://[RAGFLOW_WEB_UI_ADDR]/admin`, replacing `[RAGFLOW_WEB_UI_ADDR]` with your actual RAGFlow web UI address.

### Default Credentials

| Username           | Password |
|--------------------|----------|
| `admin@ragflow.io` | `admin`  |

## Admin UI Overview

### Service status

The service status page displays the status of all services within the RAGFlow system.

- **Service List**: View all services in a table.
- **Filtering**: Use the filter button to filter services by **Service Type**.
- **Search**: Use the search bar to quickly find services by **Name** or **Service Type**.
- **Actions** (hover over a row to see action buttons):
  - **Extra Info**: Display additional configuration information of a service in a dialog.
  - **Service Details**: Display detailed status information of a service in a dialog. Depending on the service type, status information can be displayed as plain text, a key-value list, a data table, or a bar chart.

### User management

The user management page provides comprehensive tools for managing all users in the RAGFlow system.

- **User List**: View all users in a table.
- **Search Users**: Use the search bar to find users by email or nickname.
- **Filter Users**: Click the filter icon to filter by **Status**.
- Click the **"New User"** button to create a new user account in a dialog.
- Activate or deactivate a user by using the switch toggle in the **Enable** column; changes take effect immediately.
- **Actions** (hover over a row to see action buttons):
  - **View Details**: Navigate to the user detail page to see comprehensive user information.
  - **Change Password**: Force reset the user's password.
  - **Delete User**: Remove the user from the system with confirmation.

### User detail

The user detail page displays a user's detailed information and all resources created or owned by the user, categorized by type (e.g. Dataset, Agent).

---

---
sidebar_position: 31
slug: /chunker_title_component
---

# Title chunker component

A component that splits texts into chunks by heading level.

---

A **Title chunker** component is a text splitter that uses a specified heading level as the delimiter to define chunk boundaries and create chunks.

## Scenario

A **Title chunker** component is optional, usually placed immediately after **Parser**.

:::caution WARNING
Placing a **Title chunker** after a **Token chunker** is invalid and will cause an error. Please note that this restriction is not currently system-enforced and requires your attention.
:::

## Configurations

### Hierarchy

Specifies the heading level to define chunk boundaries:

- H1
- H2
- H3 (Default)
- H4

Click **+ Add** to add heading levels here or update the corresponding **Regular Expressions** fields for custom heading patterns. A sketch of the underlying splitting logic follows the output description below.

### Output

The global variable name for the output of the **Title chunker** component, which can be referenced by subsequent components in the ingestion pipeline.

- Default: `chunks`
- Type: `Array`
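As referenced above, the following is a simplified illustration of heading-level splitting, not RAGFlow's actual implementation. It splits Markdown-style text at every heading of the configured level or higher, so deeper headings stay inside their parent chunk.

```python
import re

def title_chunks(markdown: str, level: int = 3) -> list[str]:
    """Split text at H1..H<level> headings; deeper headings stay in their chunk."""
    # `#{1,level}` followed by whitespace matches headings up to the cut-off;
    # an H4 line is not matched when level=3 because no run of 1-3 leading
    # hashes in "####" is followed by whitespace.
    heading = re.compile(rf"^#{{1,{level}}}\s", re.MULTILINE)
    starts = [m.start() for m in heading.finditer(markdown)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first heading
    bounds = starts + [len(markdown)]
    return [markdown[a:b].strip() for a, b in zip(bounds, bounds[1:]) if markdown[a:b].strip()]

doc = "# Guide\nintro\n## Setup\nsteps\n#### Note\ndetail\n## Usage\nmore"
print(title_chunks(doc, level=3))  # 3 chunks: Guide, Setup (keeping its H4 note), Usage
```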
---

---
sidebar_position: 32
slug: /chunker_token_component
---

# Token chunker component

A component that splits texts into chunks, respecting a maximum token limit and using delimiters to find optimal breakpoints.

---

A **Token chunker** component is a text splitter that creates chunks by respecting a recommended maximum token length, using delimiters to ensure logical chunk breakpoints. It splits long texts into appropriately sized, semantically related chunks.

## Scenario

A **Token chunker** component is optional, usually placed immediately after **Parser** or **Title chunker**.

## Configurations

### Recommended chunk size

The recommended maximum token limit for each created chunk. The **Token chunker** component creates chunks at specified delimiters. If this token limit is reached before a delimiter, a chunk is created at that point.

### Overlapped percent (%)

This defines the overlap percentage between chunks. An appropriate degree of overlap ensures semantic coherence without creating excessive, redundant tokens for the LLM.

- Default: 0
- Maximum: 30%

### Delimiters

Defaults to `\n`. Click the right-hand **Recycle bin** button to remove it, or click **+ Add** to add a delimiter.

### Output

The global variable name for the output of the **Token chunker** component, which can be referenced by subsequent components in the ingestion pipeline.

- Default: `chunks`
- Type: `Array`
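As a rough illustration of the idea (a hypothetical sketch, not the component's actual implementation), the following splits text at a delimiter while respecting a token budget, approximating tokens by whitespace-separated words:

```python
def chunk_by_tokens(text: str, max_tokens: int = 128, delimiter: str = "\n") -> list[str]:
    """Greedily pack delimiter-separated segments into chunks of at most ~max_tokens."""
    def count(s: str) -> int:
        return len(s.split())  # crude whitespace token estimate; real tokenizers differ

    chunks: list[str] = []
    current = ""
    for segment in text.split(delimiter):
        candidate = f"{current}{delimiter}{segment}" if current else segment
        if count(candidate) <= max_tokens:
            current = candidate            # segment still fits within the budget
        else:
            if current:
                chunks.append(current)     # budget reached: close the chunk at a delimiter
            current = segment              # start a new chunk (oversized segments stay whole)
    if current:
        chunks.append(current)
    return chunks
```

An **Overlapped percent** setting would additionally copy the tail of each chunk into the head of the next; that step is omitted here for brevity.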
---

---
sidebar_position: 35
slug: /docs_generator
---

# Docs Generator component

A component that generates downloadable PDF, DOCX, or TXT documents from markdown-style content with full Unicode support.

---

The **Docs Generator** component enables you to create professional documents directly within your agent workflow. It accepts markdown-formatted text and converts it into downloadable files, making it ideal for generating reports, summaries, or any structured document output.

## Key features

- **Multiple output formats**: PDF, DOCX, and TXT
- **Full Unicode support**: Automatic font switching for CJK (Chinese, Japanese, Korean), Arabic, Hebrew, and other non-Latin scripts
- **Rich formatting**: Headers, lists, tables, code blocks, and more
- **Customizable styling**: Fonts, margins, page size, and orientation
- **Document extras**: Logo, watermark, page numbers, and timestamps
- **Direct download**: Generates a download button for the chat interface

## Prerequisites

- Content to be converted into a document (typically from an **Agent** or other text-generating component).

## Examples

You can pair an **Agent** component with the **Docs Generator** to create dynamic documents based on user queries. The **Agent** generates the content, and the **Docs Generator** converts it into a downloadable file. Connect the output to a **Message** component to display the download button in the chat.

A typical workflow looks like:

```
Begin → Agent → Docs Generator → Message
```

In the **Message** component, reference the `download` output variable from the **Docs Generator** to display a download button in the chat interface.

## Configurations

### Content

The main text content to include in the document. Supports Markdown formatting:

- **Bold**: `**text**` or `__text__`
- **Italic**: `*text*` or `_text_`
- **Inline code**: `` `code` ``
- **Headings**: `# Heading 1`, `## Heading 2`, `### Heading 3`
- **Bullet lists**: `- item` or `* item`
- **Numbered lists**: `1. item`
- **Tables**: `| Column 1 | Column 2 |`
- **Horizontal lines**: `---`
- **Code blocks**: ` ``` code ``` `

:::tip NOTE
Click **(x)** or type `/` to insert variables from upstream components.
:::

### Title

Optional. The document title displayed at the top of the generated file.

### Subtitle

Optional. A subtitle displayed below the title.

### Output format

The file format for the generated document:

- **PDF** (default): Portable Document Format with full styling support.
- **DOCX**: Microsoft Word format.
- **TXT**: Plain text format.

### Logo image

Optional. A logo image to display at the top of the document. You can either:

- Upload an image file using the file picker
- Paste an image path, URL, or base64-encoded data

### Logo position

The horizontal position of the logo:

- **left** (default)
- **center**
- **right**

### Logo dimensions

- **Logo width**: Width in inches (default: `2.0`)
- **Logo height**: Height in inches (default: `1.0`)

### Font family

The font used throughout the document:

- **Helvetica** (default)
- **Times-Roman**
- **Courier**
- **Helvetica-Bold**
- **Times-Bold**

### Font size

The base font size in points. Defaults to `12`.

### Title font size

The font size for the document title. Defaults to `24`.

### Page size

The paper size for the document:

- **A4** (default)
- **Letter**

### Orientation

The page orientation:

- **Portrait** (default)
- **Landscape**

### Margins

Page margins in inches:

- **Margin top**: Defaults to `1.0`
- **Margin bottom**: Defaults to `1.0`
- **Margin left**: Defaults to `1.0`
- **Margin right**: Defaults to `1.0`

### Filename

Optional. A custom filename for the generated document. If left empty, a filename is auto-generated with a timestamp.

### Output directory

The server directory where generated documents are saved. Defaults to `/tmp/pdf_outputs`.

### Add page numbers

When enabled, page numbers are added to the footer of each page. Defaults to `true`.

### Add timestamp

When enabled, a generation timestamp is added to the document footer. Defaults to `true`.

### Watermark text

Optional. Text to display as a diagonal watermark across each page. Useful for marking documents as "Draft", "Confidential", etc.

## Output

The **Docs Generator** component provides the following output variables:

| Variable name | Type | Description |
|---------------|-----------|--------------------------------------------------------------|
| `file_path` | `string` | The server path where the generated document is saved. |
| `pdf_base64` | `string` | The document content encoded in base64 format. |
| `download` | `string` | JSON containing download information for the chat interface. |
| `success` | `boolean` | Indicates whether the document was generated successfully. |

### Displaying the download button

To display a download button in the chat, add a **Message** component after the **Docs Generator** and reference the `download` variable:

1. Connect the **Docs Generator** output to a **Message** component.
2. In the **Message** component's content field, type `/` and select `{Docs Generator_0@download}`.
3. When the agent runs, a download button will appear in the chat, allowing users to download the generated document.

The download button automatically handles:

- File type detection (PDF, DOCX, TXT)
- Proper MIME type for browser downloads
- Base64 decoding for direct file delivery

## Unicode and multi-language support

The **Docs Generator** includes intelligent font handling for international content:

### How it works
1. **Content analysis**: The component scans the text for non-Latin characters.
2. **Automatic font switching**: When CJK or other complex scripts are detected, the system automatically switches to a compatible CID font (STSong-Light for Chinese, HeiseiMin-W3 for Japanese, HYSMyeongJo-Medium for Korean).
3. **Latin content**: For documents containing only Latin characters (including extended Latin, Cyrillic, and Greek), the user-selected font family is used.

### Supported scripts

| Script | Unicode Range | Font Used |
|------------------------------|---------------|--------------------|
| Chinese (CJK) | U+4E00–U+9FFF | STSong-Light |
| Japanese (Hiragana/Katakana) | U+3040–U+30FF | HeiseiMin-W3 |
| Korean (Hangul) | U+AC00–U+D7AF | HYSMyeongJo-Medium |
| Arabic | U+0600–U+06FF | CID font fallback |
| Hebrew | U+0590–U+05FF | CID font fallback |
| Devanagari (Hindi) | U+0900–U+097F | CID font fallback |
| Thai | U+0E00–U+0E7F | CID font fallback |

### Font installation

For full multi-language support in self-hosted deployments, ensure Unicode fonts are installed:

**Linux (Debian/Ubuntu):**

```bash
apt-get install fonts-freefont-ttf fonts-noto-cjk
```

**Docker:**

The official RAGFlow Docker image includes these fonts. For custom images, add the font packages to your Dockerfile:

```dockerfile
RUN apt-get update && apt-get install -y fonts-freefont-ttf fonts-noto-cjk
```

:::tip NOTE
CID fonts (STSong-Light, HeiseiMin-W3, etc.) are built into ReportLab and do not require additional installation. They are used automatically when CJK content is detected.
:::

## Troubleshooting

### Characters appear as boxes or question marks

This indicates missing font support. Ensure:

1. The content contains supported Unicode characters.
2. For self-hosted deployments, Unicode fonts are installed on the server.
3. The document is being viewed in a PDF reader that supports embedded fonts.

### Download button not appearing

Ensure:

1. The **Message** component is connected after the **Docs Generator**.
2. The `download` variable is correctly referenced using `/` (which appears as `{Docs Generator_0@download}` when copied).
3. The document generation completed successfully (check the `success` output).

### Large tables not rendering correctly

For tables with many columns or large cell content:

- The component automatically converts wide tables to a definition list format for better readability.
- Consider splitting large tables into multiple smaller tables.
- Use landscape orientation for wide tables.

---

---
sidebar_position: 25
slug: /execute_sql
---

# Execute SQL tool

A tool that executes SQL queries on a specified relational database.

---

The **Execute SQL** tool enables you to connect to a relational database and run SQL queries, whether entered directly or generated by the system's Text2SQL capability via an **Agent** component.

## Prerequisites

- A database instance properly configured and running.
- The database must be one of the following types:
  - MySQL
  - PostgreSQL
  - MariaDB
  - Microsoft SQL Server

## Examples

You can pair an **Agent** component with the **Execute SQL** tool, with the **Agent** generating SQL statements and the **Execute SQL** tool handling database connection and query execution.
An example of this setup can be found in the **SQL Assistant** Agent template shown below:

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/exeSQL.jpg)

## Configurations

### SQL statement

This text input field allows you to write static SQL queries, such as `SELECT * FROM my_table`, and dynamic SQL queries using variables.

:::tip NOTE
Click **(x)** or type `/` to insert variables.
:::

For dynamic SQL queries, you can include variables in your SQL statements, such as `SELECT * FROM /sys.query`. If an **Agent** component is paired with the **Execute SQL** tool to generate SQL tasks (see the [Examples](#examples) section), you can directly insert that **Agent**'s output, `content`, into this field.

### Database type

The supported database type. Currently, the following database types are available:

- MySQL
- PostgreSQL
- MariaDB
- Microsoft SQL Server (Mssql)

### Database

The name of the database to connect to.

### Username

The username with access privileges to the database.

### Host

The IP address of the database server.

### Port

The port number on which the database server is listening.

### Password

The password for the database user.

### Max records

The maximum number of records returned by the SQL query, to control response size and improve efficiency. Defaults to `1024`.

### Output

The **Execute SQL** tool provides two output variables:

- `formalized_content`: A string. If you reference this variable in a **Message** component, the returned records are displayed as a table.
- `json`: An object array. If you reference this variable in a **Message** component, the returned records will be presented as key-value pairs.
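For instance, a query returning two records might populate `json` with an object array shaped like this (field names and values are illustrative only):

```json
[
  { "name": "Alice", "total": 42 },
  { "name": "Bob", "total": 17 }
]
```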
:::

#### Example setting

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/http_settings.png)

#### Example response

```json
{
  "args": {
    "App": "RAGFlow",
    "Query": "How to do?",
    "Userid": "241ed25a8e1011f0b979424ebc5b108b"
  },
  "headers": {
    "Accept": "/",
    "Accept-Encoding": "gzip, deflate, br, zstd",
    "Cache-Control": "no-cache",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.32.2",
    "X-Amzn-Trace-Id": "Root=1-68c9210c-5aab9088580c130a2f065523"
  },
  "origin": "185.36.193.38",
  "url": "https://httpbin.org/get?Userid=241ed25a8e1011f0b979424ebc5b108b&App=RAGFlow&Query=How+to+do%3F"
}
```

### Output

The global variable name for the output of the **HTTP request** component, which can be referenced by other components in the workflow.

- `Result`: `string`. The response returned by the remote service.

## Example

This is a usage example: a workflow sends a GET request from the **Begin** component to `https://httpbin.org/get` via the **HTTP Request_0** component, passes parameters to the server, and finally outputs the result through the **Message_0** component.

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/http_usage.PNG)

---

---
sidebar_position: 40
slug: /indexer_component
---

# Indexer component

A component that defines how chunks are indexed.

---

An **Indexer** component indexes chunks and configures their storage formats in the document engine.

## Scenario

An **Indexer** component is the mandatory ending component for all ingestion pipelines.

## Configurations

### Search method

This setting configures how chunks are stored in the document engine: as full-text, embeddings, or both.

### Filename embedding weight

This setting defines the filename's contribution to the final embedding, which is a weighted combination of both the chunk content and the filename. Essentially, a higher value gives the filename more influence in the final *composite* embedding.

- 0.1: Filename contributes 10% (chunk content 90%)
- 0.5 (maximum): Filename contributes 50% (chunk content 50%)
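In other words, the composite embedding is a linear blend of the two vectors. A minimal sketch of the idea, mirroring the description above rather than RAGFlow's exact implementation:

```python
def composite_embedding(chunk_vec: list[float], filename_vec: list[float], w: float = 0.1) -> list[float]:
    """Blend the chunk and filename embeddings; w is the filename's share (0.0 to 0.5)."""
    return [(1.0 - w) * c + w * f for c, f in zip(chunk_vec, filename_vec)]

# w = 0.1: the filename contributes 10%, the chunk content 90%.
print(composite_embedding([0.2, 0.4, 0.1], [0.9, 0.0, 0.3], w=0.1))
# ≈ [0.27, 0.36, 0.12]
```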
---

---
sidebar_position: 30
slug: /parser_component
---

# Parser component

A component that sets the parsing rules for your dataset.

---

A **Parser** component is autopopulated on the ingestion pipeline canvas and required in all ingestion pipeline workflows.

Just like the **Extract** stage in the traditional ETL process, a **Parser** component in an ingestion pipeline defines how various file types are parsed into structured data. Click the component to display its configuration panel, where you set the parsing rules for various file types.

## Configurations

Within the configuration panel, you can add multiple parsers and set the corresponding parsing rules, or remove unwanted parsers. Ensure your set of parsers covers all required file types; otherwise, an error will occur when you select this ingestion pipeline on your dataset's **Files** page.

The **Parser** component supports parsing the following file types:

| File type | File format |
|---------------|--------------------------|
| PDF | PDF |
| Spreadsheet | XLSX, XLS, CSV |
| Image | PNG, JPG, JPEG, GIF, TIF |
| Email | EML |
| Text & Markup | TXT, MD, MDX, HTML, JSON |
| Word | DOCX |
| PowerPoint | PPTX, PPT |
| Audio | MP3, WAV |
| Video | MP4, AVI, MKV |

### PDF parser

The output of a PDF parser is `json`. In the PDF parser, you select the parsing method that works best with your PDFs:

- DeepDoc: (Default) The default visual model performing OCR, TSR, and DLR tasks on complex PDFs, which can be time-consuming.
- Naive: Skips OCR, TSR, and DLR tasks; use it if *all* your PDFs are plain text.
- [MinerU](https://github.com/opendatalab/MinerU): (Experimental) An open-source tool that converts PDFs into machine-readable formats.
- [Docling](https://github.com/docling-project/docling): (Experimental) An open-source document processing tool for gen AI.
- A third-party visual model from a specific model provider.

:::danger IMPORTANT
Starting from v0.22.0, RAGFlow includes MinerU (≥ 2.6.3) as an optional PDF parser with multiple backends. Note that RAGFlow acts only as a *remote client* for MinerU, calling the MinerU API to parse documents and reading the returned files. See the steps below to use this feature.
:::

1. Prepare a reachable MinerU API service (FastAPI server).
2. In the **.env** file or from the **Model providers** page in the UI, configure RAGFlow as a remote client to MinerU:
   - `MINERU_APISERVER`: The MinerU API endpoint (e.g., `http://mineru-host:8886`).
   - `MINERU_BACKEND`: The MinerU backend:
     - `"pipeline"` (default)
     - `"vlm-http-client"`
     - `"vlm-transformers"`
     - `"vlm-vllm-engine"`
     - `"vlm-mlx-engine"`
     - `"vlm-vllm-async-engine"`
     - `"vlm-lmdeploy-engine"`
   - `MINERU_SERVER_URL`: (Optional) The downstream vLLM HTTP server (e.g., `http://vllm-host:30000`). Applicable when `MINERU_BACKEND` is set to `"vlm-http-client"`.
   - `MINERU_OUTPUT_DIR`: (Optional) The local directory for holding the outputs of the MinerU API service (zip/JSON) before ingestion.
   - `MINERU_DELETE_OUTPUT`: Whether to delete temporary output when a temporary directory is used:
     - `1`: Delete.
     - `0`: Retain.
3. In the web UI, navigate to your dataset's **Configuration** page and find the **Ingestion pipeline** section:
   - If you decide to use a chunking method from the **Built-in** dropdown, ensure it supports PDF parsing, then select **MinerU** from the **PDF parser** dropdown.
   - If you use a custom ingestion pipeline instead, select **MinerU** in the **PDF parser** section of the **Parser** component.

:::note
All MinerU environment variables are optional. When set, these values are used to auto-provision a MinerU OCR model for the tenant on first use. To avoid auto-provisioning, skip the environment variable settings and only configure MinerU from the **Model providers** page in the UI.
:::
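For reference, a hypothetical **.env** fragment for a pipeline-backend setup might look like this (hosts, ports, and paths are placeholders):

```bash
MINERU_APISERVER=http://mineru-host:8886    # the MinerU FastAPI endpoint
MINERU_BACKEND=pipeline                     # or one of the vlm-* backends listed above
# MINERU_SERVER_URL=http://vllm-host:30000  # only when MINERU_BACKEND=vlm-http-client
MINERU_OUTPUT_DIR=/tmp/mineru_outputs       # staging area for returned zip/JSON files
MINERU_DELETE_OUTPUT=1                      # 1: delete temporary output; 0: retain
```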
:::caution WARNING
Third-party visual models are marked **Experimental** because we have not fully tested them for the aforementioned data extraction tasks.
:::

### Spreadsheet parser

A spreadsheet parser outputs `html`, preserving the original layout and table structure. You may remove this parser if your dataset contains no spreadsheets.

### Image parser

An Image parser uses a native OCR model for text extraction by default. You may select an alternative VLM, provided that you have properly configured it on the **Model providers** page.

### Email parser

With the Email parser, you select the fields to parse from emails, such as **subject** and **body**. The parser will then extract text from these specified fields.

### Text & Markup parser

A Text & Markup parser automatically removes all formatting tags (e.g., those from HTML and Markdown files) to output clean, plain text only.

### Word parser

A Word parser outputs `json`, preserving the original document structure, including titles, paragraphs, tables, headers, and footers.

### PowerPoint (PPT) parser

A PowerPoint parser extracts content from PowerPoint files into `json`, processing each slide individually and distinguishing between its title, body text, and notes.

### Audio parser

An Audio parser transcribes audio files to text. To use this parser, you must first configure an ASR model on the **Model providers** page.

### Video parser

A Video parser transcribes video files to text. To use this parser, you must first configure a VLM on the **Model providers** page.

## Output

The global variable names for the output of the **Parser** component, which can be referenced by subsequent components in the ingestion pipeline.

| Variable name | Type |
|---------------|-----------------|
| `markdown` | `string` |
| `text` | `string` |
| `html` | `string` |
| `json` | `Array` |

---

---
sidebar_position: 37
slug: /transformer_component
---

# Transformer component

A component that uses an LLM to extract insights from the chunks.

---

A **Transformer** component uses an LLM to extract insights from chunks and enrich them. It *typically* precedes the **Indexer** in the ingestion pipeline, but you can also chain multiple **Transformer** components in sequence.

## Scenario

A **Transformer** component is essential when you need the LLM to extract new information, such as keywords, questions, metadata, and summaries, from the original chunks.

## Configurations

### Model

Click the dropdown menu of **Model** to show the model configuration window.

- **Model**: The chat model to use.
  - Ensure you set the chat model correctly on the **Model providers** page.
  - You can use different models for different components to increase flexibility or improve overall performance.
- **Creativity**: A shortcut to the **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. Each preset corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. This parameter has three options:
  - **Improvise**: Produces more creative responses.
  - **Precise**: (Default) Produces more conservative responses.
  - **Balance**: A middle ground between **Improvise** and **Precise**.
- **Temperature**: The randomness level of the model's output. Defaults to 0.1.
  - Lower values lead to more deterministic and predictable outputs.
  - Higher values lead to more creative and varied outputs.
  - A temperature of zero results in the same output for the same prompt.
- **Top P**: Nucleus sampling.
  - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
  - Defaults to 0.3.
- **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.
  - A higher **presence penalty** value makes the model more likely to generate tokens that have not yet appeared in the generated text.
  - Defaults to 0.4.
- **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
  - A higher **frequency penalty** value makes the model more conservative in its use of repeated tokens.
  - Defaults to 0.7.
- **Max tokens**: Sets the maximum length of the model's output, measured in tokens (words or pieces of words). It is disabled by default, allowing the model to determine the number of tokens in its responses.
:::tip NOTE
- It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
- If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Creativity**.
:::

### Result destination

Select the type of output to be generated by the LLM:

- Summary
- Keywords
- Questions
- Metadata

### System prompt

Typically, you use the system prompt to describe the task for the LLM, specify how it should respond, and outline other miscellaneous requirements. We will not elaborate on this topic here, as it can be as extensive as prompt engineering itself.

:::tip NOTE
The system prompt here automatically updates to match your selected **Result destination**.
:::

### User prompt

The user-defined prompt. You can type `/` or click **(x)** to insert variables of preceding components in the ingestion pipeline as the LLM's input.

### Output

The global variable name for the output of the **Transformer** component, which can be referenced by subsequent **Transformer** components in the ingestion pipeline.

- Default: `chunks`
- Type: `Array`
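Conceptually, a **Transformer** step behaves like the following hypothetical sketch, with `generate` standing in for whichever chat model you configured. This illustrates the data flow only, not RAGFlow's implementation:

```python
from typing import Callable

def transform_chunks(chunks: list[str], generate: Callable[[str], str]) -> list[dict]:
    """Enrich each chunk with LLM-extracted keywords (one possible Result destination)."""
    task = "Extract 3-5 comma-separated keywords from the following text."
    return [
        {"content": chunk, "keywords": generate(f"{task}\n\n{chunk}")}
        for chunk in chunks
    ]

# `generate` would wrap your configured chat model; a stub suffices for testing:
enriched = transform_chunks(
    ["RAGFlow ships an enterprise-grade ingestion pipeline."],
    lambda prompt: "RAGFlow, ingestion, pipeline",
)
```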
---

---
sidebar_position: 1
slug: /agent_introduction
---

# Introduction to agents

Key concepts, basic operations, and a quick view of the agent editor.

---

:::danger DEPRECATED!
A new version is coming soon.
:::

## Key concepts

Agents and RAG are complementary techniques, each enhancing the other's capabilities in business applications. RAGFlow v0.8.0 introduces an agent mechanism, featuring a no-code workflow editor on the front end and a comprehensive graph-based task orchestration framework on the back end. This mechanism is built on top of RAGFlow's existing RAG solutions and aims to orchestrate search technologies such as query intent classification, conversation leading, and query rewriting to:

- Provide higher-quality retrieval, and
- Accommodate more complex scenarios.

## Create an agent

:::tip NOTE
Before proceeding, ensure that:

1. You have properly set the LLM to use. See the guides on [Configure your API key](../models/llm_api_key_setup.md) or [Deploy a local LLM](../models/deploy_local_llm.mdx) for more information.
2. You have a dataset configured and the corresponding files properly parsed. See the guide on [Configure a dataset](../dataset/configure_knowledge_base.md) for more information.
:::

Click the **Agent** tab in the middle top of the page to show the **Agent** page. As shown in the screenshot below, the cards on this page represent the created agents, which you can continue to edit.

![Agent_list](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/agent_list.jpg)

We also provide templates catered to different business scenarios. You can either generate your agent from one of our agent templates or create one from scratch:

1. Click **+ Create agent** to show the **agent template** page:

   ![agent_template](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/agent_template_list.jpg)

2. To create an agent from scratch, click **Create Agent**. Alternatively, to create an agent from one of our templates, click the desired card, such as **Deep Research**, name your agent in the pop-up dialogue, and click **OK** to confirm. *You are now taken to the **no-code workflow editor** page.*

   ![add_component](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/add_component.jpg)

3. Click the **+** button on the **Begin** component to select the desired components in your workflow.
4. Click **Save** to apply changes to your agent.

---

---
sidebar_position: 1
slug: /accelerate_agent_question_answering
---

# Accelerate answering

A checklist to speed up question answering.

---

Some of your settings may consume a significant amount of time. If you often find that question answering is time-consuming, here is a checklist to consider:

## Balance task complexity with an Agent's performance and speed

An Agent's response time generally depends on many factors, e.g., the LLM's capabilities and the prompt, the latter reflecting task complexity. When using an Agent, you should always balance task demands with the LLM's ability.

- For simple tasks, such as retrieval, rewriting, formatting, or structured data extraction, use concise prompts, remove planning or reasoning instructions, enforce output length limits, and select smaller or Turbo-class models. This significantly reduces latency and cost with minimal impact on quality.
- For complex tasks, like multistep reasoning, cross-document synthesis, or tool-based workflows, maintain or enhance prompts that include planning, reflection, and verification steps.
- In multi-Agent orchestration systems, delegate simple subtasks to sub-Agents using smaller, faster models, and reserve more powerful models for the lead Agent to handle complexity and uncertainty.

:::tip KEY INSIGHT
Focus on minimizing output tokens (through summarization, bullet points, or explicit length limits), as this has far greater impact on reducing latency than optimizing input size.
:::

## Disable Reasoning

Disabling the **Reasoning** toggle reduces the LLM's thinking time. For a model like Qwen3, you also need to add `/no_think` to the system prompt to disable reasoning, as shown below.
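For example, a Qwen3 system prompt might end with the soft switch on its own line (the prompt text itself is illustrative):

```
You are a concise assistant. Answer strictly based on the retrieved context.
/no_think
```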
## Disable Rerank model

- Leaving the **Rerank model** field empty (in the corresponding **Retrieval** component) will significantly decrease retrieval time.
- When using a rerank model, ensure you have a GPU for acceleration; otherwise, the reranking process will be *prohibitively* slow.

:::tip NOTE
Rerank models are essential in certain scenarios. There is always a trade-off between speed and performance; you must weigh the pros against the cons for your specific case.
:::

## Check the time taken for each task

Click the light bulb icon above the *current* dialogue and scroll down the popup window to view the time taken for each task:

| Item name | Description |
|-------------------|-----------------------------------------------------------------------------------------------|
| Total | Total time spent on this conversation round, including chunk retrieval and answer generation. |
| Check LLM | Time to validate the specified LLM. |
| Create retriever | Time to create a chunk retriever. |
| Bind embedding | Time to initialize an embedding model instance. |
| Bind LLM | Time to initialize an LLM instance. |
| Tune question | Time to optimize the user query using the context of the multi-turn conversation. |
| Bind reranker | Time to initialize a reranker model instance for chunk retrieval. |
| Generate keywords | Time to extract keywords from the user query. |
| Retrieval | Time to retrieve the chunks. |
| Generate answer | Time to generate the answer. |

---

---
sidebar_position: 3
slug: /embed_agent_into_webpage
---

# Embed agent into webpage

You can use an iframe to embed an agent into a third-party webpage.

1. Before proceeding, you must [acquire an API key](../models/llm_api_key_setup.md); otherwise, an error message would appear.
2. On the **Agent** page, click an intended agent to access its editing page.
3. Click **Management > Embed into webpage** on the top right corner of the canvas to show the **iframe** window.
4. Copy the iframe and embed it into your webpage.

![Embed_agent](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/embed_agent_into_webpage.jpg)

---

---
sidebar_position: 20
slug: /sandbox_quickstart
---

# Sandbox quickstart

A secure, pluggable code execution backend designed for RAGFlow and other applications requiring isolated code execution environments.

## Features

- Seamless RAGFlow Integration — works out of the box with the code component of RAGFlow.
- High Security — uses gVisor for syscall-level sandboxing to isolate execution.
- Customisable Sandboxing — modify seccomp profiles easily to tailor syscall restrictions.
- Pluggable Runtime Support — extendable to support any programming language runtime.
- Developer Friendly — quick setup with a convenient Makefile.

## Architecture

The architecture consists of isolated Docker base images for each supported language runtime, managed by the executor manager service. The executor manager orchestrates sandboxed code execution using gVisor for syscall interception and optional seccomp profiles for enhanced syscall filtering.

## Prerequisites

- A Linux distribution compatible with gVisor.
- gVisor installed and configured.
- Docker version 25.0 or higher (API 1.44+). Ensure your executor manager image ships with Docker CLI `29.1.0` or higher to stay compatible with the latest Docker daemons.
- Docker Compose version 2.26.1 or higher (similar to RAGFlow requirements).
- The uv package and project manager installed.
- (Optional) GNU Make for simplified command-line management.

:::tip NOTE
The error message `client version 1.43 is too old. Minimum supported API version is 1.44` indicates that your executor manager image's built-in Docker CLI version is lower than the `29.1.0` required by the Docker daemon in use. To solve this issue, pull the latest `infiniflow/sandbox-executor-manager:latest` from Docker Hub or rebuild it in `./sandbox/executor_manager`.
:::

## Build Docker base images

The sandbox uses isolated base images for secure containerised execution environments. Build the base images manually:

```bash
docker build -t sandbox-base-python:latest ./sandbox_base_image/python
docker build -t sandbox-base-nodejs:latest ./sandbox_base_image/nodejs
```

Alternatively, build all base images at once using the Makefile:

```bash
make build
```

Next, build the executor manager image:

```bash
docker build -t sandbox-executor-manager:latest ./executor_manager
```

## Running with RAGFlow

1. Verify that gVisor is properly installed and operational.
2. Configure the `.env` file located at `docker/.env`:
   - Uncomment sandbox-related environment variables.
   - Enable the sandbox profile at the bottom of the file.
3. Add the following entry to your `/etc/hosts` file to resolve the executor manager service:

   ```bash
   127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager
   ```

4. Start the RAGFlow service as usual.

## Running standalone

### Manual setup

1. Initialize the environment variables:

   ```bash
   cp .env.example .env
   ```

2. Launch the sandbox services with Docker Compose:

   ```bash
   docker compose -f docker-compose.yml up
   ```
3. Test the sandbox setup:

   ```bash
   source .venv/bin/activate
   export PYTHONPATH=$(pwd)
   uv pip install -r executor_manager/requirements.txt
   uv run tests/sandbox_security_tests_full.py
   ```

### Using Makefile

Run all setup, build, launch, and test steps with a single command:

```bash
make
```

### Monitoring

To follow the logs of the executor manager container:

```bash
docker logs -f sandbox-executor-manager
```

Or use the Makefile shortcut:

```bash
make logs
```

---

---
sidebar_position: 2
slug: /ai_search
---

# Search

Conduct an AI search.

---

An AI search is a single-turn AI conversation using a predefined retrieval strategy (a hybrid search of weighted keyword similarity and weighted vector similarity) and the system's default chat model. It does not involve advanced RAG strategies like knowledge graph, auto-keyword, or auto-question. The related chunks are listed below the chat model's response in descending order based on their similarity scores.

![Create search app](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/create_search_app.jpg)

![Search view](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/search_view.jpg)

:::tip NOTE
When debugging your chat assistant, you can use AI search as a reference to verify your model settings and retrieval strategy.
:::

## Prerequisites

- Ensure that you have configured the system's default models on the **Model providers** page.
- Ensure that the intended datasets are properly configured and the intended documents have finished file parsing.

## Frequently asked questions

### What is the key difference between an AI search and an AI chat?

A chat is a multi-turn AI conversation where you can define your retrieval strategy (a weighted reranking score can replace the weighted vector similarity in a hybrid search) and choose your chat model. In an AI chat, you can configure advanced RAG strategies, such as knowledge graphs, auto-keyword, and auto-question, for your specific case. Retrieved chunks are not displayed along with the answer.

---

---
sidebar_position: 3
slug: /implement_deep_research
---

# Implement deep research

Implements deep research for agentic reasoning.

---

From v0.17.0 onward, RAGFlow supports integrating agentic reasoning into an AI chat. The following diagram illustrates the workflow of RAGFlow's deep research:

![Image](https://github.com/user-attachments/assets/f65d4759-4f09-4d9d-9549-c0e1fe907525)

To activate this feature:

1. Enable the **Reasoning** toggle in **Chat setting**.

   ![chat_reasoning](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_reasoning.jpg)

2. Enter the correct Tavily API key to leverage Tavily-based web search:

   ![chat_tavily](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_tavily.jpg)

*The following is a screenshot of a conversation that integrates Deep Research:*

![Image](https://github.com/user-attachments/assets/165b88ff-1f5d-4fb8-90e2-c836b25e32e9)

---

---
sidebar_position: 4
slug: /set_chat_variables
---

# Set variables

Set variables to be used together with the system prompt for your LLM.

---

When configuring the system prompt for a chat model, variables play an important role in enhancing flexibility and reusability. With variables, you can dynamically adjust the system prompt to be sent to your model.
In the context of RAGFlow, if you have defined variables in **Chat setting** (other than the system's reserved variable `{knowledge}`), you are required to pass in their values through RAGFlow's [HTTP API](../../references/http_api_reference.md#converse-with-chat-assistant) or its [Python SDK](../../references/python_api_reference.md#converse-with-chat-assistant).

:::danger IMPORTANT
In RAGFlow, variables are closely linked with the system prompt. When you add a variable in the **Variable** section, include it in the system prompt. Conversely, when deleting a variable, ensure it is removed from the system prompt; otherwise, an error would occur.
:::

## Where to set variables

![set_variables](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_variables.jpg)

## 1. Manage variables

In the **Variable** section, you add, remove, or update variables.

### `{knowledge}` - a reserved variable

`{knowledge}` is the system's reserved variable, representing the chunks retrieved from the dataset(s) specified by **Knowledge bases** under the **Assistant settings** tab. If your chat assistant is associated with certain datasets, you can keep it as is.

:::info NOTE
It currently makes no difference whether `{knowledge}` is set as optional or mandatory, but note that this design will be updated in due course.
:::

From v0.17.0 onward, you can start an AI chat without specifying datasets. In this case, we recommend removing the `{knowledge}` variable to prevent unnecessary references, and keeping the **Empty response** field empty to avoid errors.

### Custom variables

Besides `{knowledge}`, you can also define your own variables to pair with the system prompt. To use these custom variables, you must pass in their values through RAGFlow's official APIs. The **Optional** toggle determines whether these variables are required in the corresponding APIs:

- **Disabled** (Default): The variable is mandatory and must be provided.
- **Enabled**: The variable is optional and can be omitted if not needed.

## 2. Update system prompt

After you add or remove variables in the **Variable** section, ensure your changes are reflected in the system prompt to avoid inconsistencies or errors. Here's an example:

```
You are an intelligent assistant. Please answer the question by summarizing chunks from the specified dataset(s)...

Your answers should follow a professional and {style} style.

...

Here is the knowledge base:

{knowledge}

The above is the knowledge base.
```

:::tip NOTE
If you have removed `{knowledge}`, ensure that you thoroughly review and update the entire system prompt to achieve optimal results.
:::

## APIs

The *only* way to pass in values for the custom variables defined in the **Chat Configuration** dialogue is to call RAGFlow's [HTTP API](../../references/http_api_reference.md#converse-with-chat-assistant) or its [Python SDK](../../references/python_api_reference.md#converse-with-chat-assistant).

### HTTP API

See [Converse with chat assistant](../../references/http_api_reference.md#converse-with-chat-assistant). Here's an example:

```bash {9}
curl --request POST \
     --url http://{address}/api/v1/chats/{chat_id}/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data-binary '
     {
          "question": "xxxxxxxxx",
          "stream": true,
          "style": "hilarious"
     }'
```

### Python API

See [Converse with chat assistant](../../references/python_api_reference.md#converse-with-chat-assistant).
Here's an example:

```python {17}
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()

print("\n==================== Miss R =====================\n")
print("Hello. What can I do for you?")

while True:
    question = input("\n==================== User =====================\n> ")
    style = input("Please enter your preferred style (e.g., formal, informal, hilarious): ")
    print("\n==================== Miss R =====================\n")

    cont = ""
    for ans in session.ask(question, stream=True, style=style):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```

---

---
sidebar_position: 1
slug: /start_chat
---

# Start AI chat

Initiate an AI-powered chat with a configured chat assistant.

---

Chats in RAGFlow are based on one or multiple datasets. Once you have created your dataset, finished file parsing, and [run a retrieval test](../dataset/run_retrieval_test.md), you can go ahead and start an AI conversation.

## Start an AI chat

You start an AI conversation by creating an assistant.

1. Click the **Chat** tab in the middle top of the page **>** **Create an assistant** to show the **Chat Configuration** dialogue *for your next dialogue*.

   > RAGFlow offers you the flexibility of choosing a different chat model for each dialogue, while allowing you to set the default models in **System Model Settings**.

2. Update Assistant-specific settings:

   - **Assistant name** is the name of your chat assistant. Each assistant corresponds to a dialogue with a unique combination of datasets, prompts, hybrid search configurations, and large model settings.
   - **Empty response**:
     - If you wish to *confine* RAGFlow's answers to your datasets, leave a response here. Then, when it doesn't retrieve an answer, it *uniformly* responds with what you set here.
     - If you wish RAGFlow to *improvise* when it doesn't retrieve an answer from your datasets, leave it blank, which may give rise to hallucinations.
   - **Show quote**: This is a key feature of RAGFlow and is enabled by default. RAGFlow does not work like a black box; instead, it clearly shows the sources of information that its responses are based on.
   - Select the corresponding datasets. You can select one or multiple datasets, but ensure that they use the same embedding model; otherwise, an error would occur.

3. Update Prompt-specific settings:

   - In **System**, you fill in the prompts for your LLM; you can also leave the default prompt as-is to begin with.
   - **Similarity threshold** sets the similarity "bar" for each chunk of text. The default is 0.2. Text chunks with lower similarity scores are filtered out of the final response.
   - **Vector similarity weight** is set to 0.3 by default. RAGFlow uses a hybrid score system to evaluate the relevance of different text chunks. This value sets the weight assigned to the vector similarity component in the hybrid score (see the worked example at the end of this section).
     - If **Rerank model** is left empty, the hybrid score system uses keyword similarity and vector similarity, and the default weight assigned to the keyword similarity component is 1 - 0.3 = 0.7.
     - If **Rerank model** is selected, the hybrid score system uses keyword similarity and the reranker score, and the default weight assigned to the reranker score is 1 - 0.7 = 0.3.
   - **Top N** determines the *maximum* number of chunks to feed to the LLM. In other words, even if more chunks are retrieved, only the top N chunks are provided as input.
   - **Multi-turn optimization** enhances user queries using existing context in a multi-round conversation. It is enabled by default. When enabled, it will consume additional LLM tokens and significantly increase the time to generate answers.
   - **Use knowledge graph** indicates whether to use knowledge graph(s) in the specified dataset(s) during retrieval for multi-hop question answering. When enabled, this involves iterative searches across entity, relationship, and community report chunks, greatly increasing retrieval time.
   - **Reasoning** indicates whether to generate answers through reasoning processes like DeepSeek-R1 or OpenAI o1. Once enabled, the chat model autonomously integrates Deep Research during question answering when encountering an unknown topic, dynamically searching external knowledge and generating final answers through reasoning.
   - **Rerank model** sets the reranker model to use. It is left empty by default.
     - If **Rerank model** is left empty, the hybrid score system uses keyword similarity and vector similarity, and the default weight assigned to the vector similarity component is 1 - 0.7 = 0.3.
     - If **Rerank model** is selected, the hybrid score system uses keyword similarity and the reranker score, and the default weight assigned to the reranker score is 1 - 0.7 = 0.3.
   - [Cross-language search](../../references/glossary.mdx#cross-language-search): Optional. Select one or more target languages from the dropdown menu. The system's default chat model will then translate your query into the selected target language(s). This translation ensures accurate semantic matching across languages, allowing you to retrieve relevant results regardless of language differences.
     - When selecting target languages, ensure that these languages are present in the dataset to guarantee an effective search.
     - If no target language is selected, the system will search only in the language of your query, which may cause relevant information in other languages to be missed.
   - **Variable** refers to the variables (keys) to be used in the system prompt. `{knowledge}` is a reserved variable. Click **Add** to add more variables for the system prompt.
     - If you are uncertain about the logic behind **Variable**, leave it *as-is*.
     - As of v0.17.2, if you add custom variables here, the only way you can pass in their values is to call:
       - the HTTP method [Converse with chat assistant](../../references/http_api_reference.md#converse-with-chat-assistant), or
       - the Python method [Converse with chat assistant](../../references/python_api_reference.md#converse-with-chat-assistant).

4. Update Model-specific settings:

   - In **Model**, you select the chat model. Though you have selected the default chat model in **System Model Settings**, RAGFlow allows you to choose an alternative chat model for your dialogue.
   - **Creativity**: A shortcut to the **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. Each preset corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. This parameter has three options:
     - **Improvise**: Produces more creative responses.
     - **Precise**: (Default) Produces more conservative responses.
     - **Balance**: A middle ground between **Improvise** and **Precise**.
   - **Temperature**: The randomness level of the model's output. Defaults to 0.1.
     - Lower values lead to more deterministic and predictable outputs.
     - Higher values lead to more creative and varied outputs.
     - A temperature of zero results in the same output for the same prompt.
   - **Top P**: Nucleus sampling.
     - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
     - Defaults to 0.3.
   - **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.
     - A higher **presence penalty** value makes the model more likely to generate tokens that have not yet appeared in the generated text.
     - Defaults to 0.4.
   - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
     - A higher **frequency penalty** value makes the model more conservative in its use of repeated tokens.
     - Defaults to 0.7.

5. Now, let's start the show:

   ![chat_thermal_solution](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_thermal_solution.jpg)

:::tip NOTE

1. Click the light bulb icon above the answer to view the expanded system prompt:

   ![prompt_display](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/prompt_display.jpg)

   *The light bulb icon is available only for the current dialogue.*

2. Scroll down the expanded prompt to view the time consumed for each task:

   ![time_elapsed](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/time_elapsed.jpg)
:::
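As a worked example of the hybrid scoring described in step 3 (a sketch of the weighting logic using the default weights; RAGFlow's actual implementation may differ in details such as score normalization):

```python
def hybrid_score(keyword_sim: float, second_sim: float, second_weight: float = 0.3) -> float:
    """second_sim is the vector similarity, or the reranker score when a rerank model is set."""
    return (1 - second_weight) * keyword_sim + second_weight * second_sim

# Defaults: keyword similarity weight 0.7, vector similarity weight 0.3.
print(hybrid_score(0.6, 0.8))  # 0.7 * 0.6 + 0.3 * 0.8 ≈ 0.66
```

Chunks scoring below the **Similarity threshold** (default 0.2) are filtered out before this ranking is applied.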
## Update settings of an existing chat assistant

![chat_setting](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_setting.jpg)

## Integrate chat capabilities into your application or webpage

RAGFlow offers HTTP and Python APIs for you to integrate RAGFlow's capabilities into your applications. Read the following documents for more information:

- [Acquire a RAGFlow API key](../../develop/acquire_ragflow_api_key.md)
- [HTTP API reference](../../references/http_api_reference.md)
- [Python API reference](../../references/python_api_reference.md)

You can use an iframe to embed the created chat assistant into a third-party webpage:

1. Before proceeding, you must [acquire an API key](../../develop/acquire_ragflow_api_key.md); otherwise, an error message would appear.
2. Hover over an intended chat assistant **>** **Edit** to show the **iframe** window:

   ![chat-embed](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/embed_chat_into_webpage.jpg)

3. Copy the iframe and embed it into your webpage.

   ![chat-embed](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/embedded_chat_app.jpg)

---

---
sidebar_position: 3
slug: /add_google_drive
---

# Add Google Drive

## 1. Create a Google Cloud Project

You can either create a dedicated project for RAGFlow or use an existing Google Cloud external project.

**Steps:**

1. Open the project creation page:\
   `https://console.cloud.google.com/projectcreate`

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image1.jpeg?raw=true)

2. Select **External** as the Audience.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image2.png?raw=true)

3. Click **Create**.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image3.jpeg?raw=true)

------------------------------------------------------------------------

## 2. Configure OAuth Consent Screen

1. Go to **APIs & Services → OAuth consent screen**.
2. Ensure **User Type = External**.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image4.jpeg?raw=true)

3. Add your test users under **Test Users** by entering email addresses.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image5.jpeg?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image6.jpeg?raw=true)

------------------------------------------------------------------------

## 3. Create OAuth Client Credentials

1. Navigate to:\
   `https://console.cloud.google.com/auth/clients`
2. Create a **Web Application**.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image7.png?raw=true)

3. Enter a name for the client.
4. Add the following **Authorized Redirect URIs**:

   ```
   http://localhost:9380/v1/connector/google-drive/oauth/web/callback
   ```

   - If using Docker deployment, set the **Authorized JavaScript origin** to:

     ```
     http://localhost:80
     ```

     ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image8.png?raw=true)

   - If running from source, set the **Authorized JavaScript origin** to:

     ```
     http://localhost:9222
     ```

     ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image9.png?raw=true)

5. After saving, click **Download JSON**. This file will later be uploaded into RAGFlow.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image10.png?raw=true)

------------------------------------------------------------------------

## 4. Add Scopes

1. Open **Data Access → Add or remove scopes**.
2. Paste and add the following entries:

   ```
   https://www.googleapis.com/auth/drive.readonly
   https://www.googleapis.com/auth/drive.metadata.readonly
   https://www.googleapis.com/auth/admin.directory.group.readonly
   https://www.googleapis.com/auth/admin.directory.user.readonly
   ```

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image11.jpeg?raw=true)

3. Update and save your changes.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image12.jpeg?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image13.jpeg?raw=true)

------------------------------------------------------------------------

## 5. Enable Required APIs
Navigate to the Google API Library:\
`https://console.cloud.google.com/apis/library`

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image14.png?raw=true)

Enable the following APIs:

- Google Drive API
- Admin SDK API
- Google Sheets API
- Google Docs API

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image15.png?raw=true)

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image16.png?raw=true)

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image17.png?raw=true)

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image18.png?raw=true)

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image19.png?raw=true)

![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image21.png?raw=true)

------------------------------------------------------------------------

## 6. Add Google Drive As a Data Source in RAGFlow

1. Go to **Data Sources** inside RAGFlow.
2. Select **Google Drive**.
3. Upload the previously downloaded JSON credentials.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image22.jpeg?raw=true)

4. Enter the link of the shared Google Drive folder (from `https://drive.google.com/drive`), for example:

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image23.png?raw=true)

5. Click **Authorize with Google**. A browser window will appear. In it, click:

   - **Continue**
   - **Select All → Continue**
   - Authorization should succeed.
   - Select **OK** to add the data source.

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image25.jpeg?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image26.jpeg?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image27.jpeg?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image28.png?raw=true)

   ![placeholder-image](https://github.com/infiniflow/ragflow-docs/blob/040e4acd4c1eac6dc73dc44e934a6518de78d097/images/google_drive/image29.png?raw=true)

---

---
sidebar_position: -6
slug: /auto_metadata
---

# Auto-extract metadata

Automatically extract metadata from uploaded files.

---

RAGFlow v0.23.0 introduces the Auto-metadata feature, which uses large language models to automatically extract and generate metadata for files, eliminating the need for manual entry.

In a typical RAG pipeline, metadata serves two key purposes:

- During the retrieval stage: Filters out irrelevant documents, narrowing the search scope to improve retrieval accuracy.
- During the generation stage: If a text chunk is retrieved, its associated metadata is also passed to the LLM, providing richer contextual information about the source document to aid answer generation.

:::danger WARNING
Enabling auto-metadata requires significant memory, computational resources, and tokens.
:::

## Procedure

1. On your dataset's **Configuration** page, select an indexing model, which will be used by the knowledge graph, RAPTOR, auto-metadata, auto-keyword, and auto-question features for this dataset.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/indexing_model.png)
2. Click **Auto metadata** **>** **Settings** to go to the configuration page for automatic metadata generation rules.
   _The configuration page for rules on automatically generating metadata appears._
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/auto_metadata_settings.png)
3. Click **+** to add a new field and open its configuration page.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/metadata_field_settings.png)
4. Enter a field name, such as Author, and add a description and examples in the **Description** section. This provides context to the large language model (LLM) for more accurate value extraction. If left blank, the LLM will extract values based only on the field name.
5. To restrict the LLM to generating metadata from a predefined list, enable the **Restrict to defined values** mode and manually add the allowed values. The LLM will then only generate results from this preset range.
6. Once configured, turn on the **Auto-metadata** switch on the **Configuration** page. All newly uploaded files will have these rules applied during parsing. For files that have already been processed, you must re-parse them to trigger metadata generation. You can then use the filter function to check the metadata generation status of your files.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/enable_auto_metadata.png)
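To make the **Restrict to defined values** mode concrete, the following minimal sketch (with hypothetical field rules and hypothetical LLM outputs) shows its effect: values outside the allowed list are dropped rather than stored.

```python
# A toy illustration of the "Restrict to defined values" mode.
# The field rules and extracted values below are hypothetical; in RAGFlow
# they are configured in the UI and produced by the LLM during parsing.

field_rules = {
    "Author": {"description": "The person who wrote the document.", "allowed": None},
    "Department": {"description": "Owning department.", "allowed": ["Legal", "Finance", "HR"]},
}

def apply_rules(extracted: dict) -> dict:
    """Keep a value only if the field has no allow-list or the value is on it."""
    result = {}
    for field, value in extracted.items():
        rule = field_rules.get(field)
        if rule is None:
            continue  # no rule configured for this field, so nothing is stored
        if rule["allowed"] is None or value in rule["allowed"]:
            result[field] = value
    return result

# Suppose the LLM returned these values during parsing:
print(apply_rules({"Author": "J. Smith", "Department": "Engineering"}))
# {'Author': 'J. Smith'}: "Engineering" is outside the preset range and is dropped
```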
---

---
sidebar_position: -4
slug: /configure_child_chunking_strategy
---

# Configure child chunking strategy

Set a parent-child chunking strategy to improve retrieval.

---

A persistent challenge in practical RAG applications lies in a structural tension within the traditional "chunk-embed-retrieve" pipeline: a single text chunk is tasked with both semantic matching (recall) and contextual understanding (utilization)—two inherently conflicting objectives. Recall demands fine-grained, precise chunks, while answer generation requires coherent, informationally complete context.

To resolve this tension, RAGFlow previously introduced the Table of Contents (TOC) enhancement feature, which uses a large language model (LLM) to generate document structure and automatically supplements missing context during retrieval based on that TOC. In version 0.23.0, this capability has been systematically integrated into the Ingestion Pipeline, and a novel parent-child chunking mechanism has been introduced. Under this mechanism, a document is first segmented into larger parent chunks, each maintaining a relatively complete semantic unit to ensure logical and background integrity. Each parent chunk can then be further subdivided into multiple child chunks for precise recall. During retrieval, the system first locates the most relevant text segments based on the child chunks while automatically associating and recalling their parent chunk.

This approach maintains high recall relevance while providing ample semantic background for the generation phase. For instance, when processing a *Compliance Handbook*, a user query about "liability for breach" might precisely retrieve a child chunk stating, "The penalty for breach is 20% of the total contract value," but without context, it cannot clarify whether this clause applies to "minor breach" or "material breach." Leveraging the parent-child chunking mechanism, the system returns this child chunk along with its parent chunk, which contains the complete section of the clause. This allows the LLM to make accurate judgments based on broader context, avoiding misinterpretation. Through this dual-layer structure of "precise localization + contextual supplementation," RAGFlow ensures retrieval accuracy while significantly enhancing the reliability and completeness of generated answers.

## Procedure

1. On your dataset's **Configuration** page, find the **Child chunk are used for retrieval** toggle:
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/child_chunking.png)
2. Set the delimiter for child chunks.
3. If you use a custom ingestion pipeline, this configuration maps to the **Chunker** component:
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/child_chunking_parser.png)
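The retrieval side of the mechanism described above can be pictured with a short, self-contained sketch. This is *not* RAGFlow's implementation (the word-overlap score below merely stands in for embedding search), but it shows how matching on a fine-grained child chunk while returning its enclosing parent works:

```python
# Parent-child retrieval in miniature: match on child chunks, return parents.
# The scoring function is a stand-in for real embedding similarity.

parents = {
    "P1": "Section 7 Breach of Contract. For a material breach, the penalty "
          "for breach is 20% of the total contract value. For a minor breach, "
          "the parties shall first attempt remediation within 30 days.",
}
children = [
    ("P1", "The penalty for breach is 20% of the total contract value."),
    ("P1", "For a minor breach, the parties shall first attempt remediation."),
]

def score(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q)  # crude word-overlap similarity

def retrieve(query: str) -> str:
    best_parent, _ = max(
        ((pid, score(query, child)) for pid, child in children),
        key=lambda pair: pair[1],
    )
    return parents[best_parent]  # the parent chunk supplies the full context

print(retrieve("liability for breach penalty"))
```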
---

---
sidebar_position: -10
slug: /configure_knowledge_base
---

# Configure dataset

Most of RAGFlow's chat assistants and Agents are based on datasets. Each of RAGFlow's datasets serves as a knowledge source, *parsing* files uploaded from your local machine, as well as file references generated in RAGFlow's File system, into the 'knowledge' that powers future AI chats. This guide demonstrates some basic usages of the dataset feature, covering the following topics:

- Create a dataset
- Configure a dataset
- Search for a dataset
- Delete a dataset

## Create dataset

With multiple datasets, you can build more flexible, diversified question answering. To create your first dataset:

![create dataset](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/create_knowledge_base.jpg)

_Each time a dataset is created, a folder with the same name is generated in the **root/.knowledgebase** directory._

## Configure dataset

The following screenshot shows the configuration page of a dataset. A proper configuration of your dataset is crucial for future AI chats. For example, choosing the wrong embedding model or chunking method would cause unexpected semantic loss or mismatched answers in chats.

![dataset configuration](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/configure_knowledge_base.jpg)

This section covers the following topics:

- Select chunking method
- Select embedding model
- Upload file
- Parse file
- Intervene with file parsing results
- Run retrieval testing

### Select chunking method

RAGFlow offers multiple built-in chunking templates to facilitate chunking files of different layouts and ensure semantic integrity. From the **Built-in** chunking method dropdown under **Parse type**, you can choose the default template that suits the layouts and formats of your files.

The following table shows the descriptions and the compatible file formats of each supported chunking template:

| **Template** | Description | File format |
|--------------|-------------|-------------|
| General | Files are consecutively chunked based on a preset chunk token number. | MD, MDX, DOCX, XLSX, XLS (Excel 97-2003), PPT, PDF, TXT, JPEG, JPG, PNG, TIF, GIF, CSV, JSON, EML, HTML |
| Q&A | Retrieves relevant information and generates answers to respond to questions. | XLSX, XLS (Excel 97-2003), CSV/TXT |
| Resume | Enterprise edition only. You can also try it out on demo.ragflow.io. | DOCX, PDF, TXT |
| Manual | | PDF |
| Table | The table mode uses TSI technology for efficient data parsing. | XLSX, XLS (Excel 97-2003), CSV/TXT |
| Paper | | PDF |
| Book | | DOCX, PDF, TXT |
| Laws | | DOCX, PDF, TXT |
| Presentation | | PDF, PPTX |
| Picture | | JPEG, JPG, PNG, TIF, GIF |
| One | Each document is chunked in its entirety (as one). | DOCX, XLSX, XLS (Excel 97-2003), PDF, TXT |
| Tag | The dataset functions as a tag set for the others. | XLSX, CSV/TXT |

You can also change a file's chunking method on the **Files** page.

![change chunking method](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/change_chunking_method.jpg)
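As a rough mental model of the **General** template (a simplification, not RAGFlow's actual implementation), the sketch below chunks text consecutively against a preset token budget, with whitespace-separated words standing in for real tokens:

```python
# A simplified picture of "General" chunking: consecutive chunks capped by a
# preset chunk token number. Real tokenization differs; words stand in here.

def general_chunk(text: str, chunk_token_num: int = 128) -> list[str]:
    words = text.split()
    return [
        " ".join(words[i : i + chunk_token_num])
        for i in range(0, len(words), chunk_token_num)
    ]

doc = "word " * 300  # a 300-"token" document
chunks = general_chunk(doc, chunk_token_num=128)
print(len(chunks), [len(c.split()) for c in chunks])  # 3 chunks: 128, 128, 44
```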
From v0.21.0 onward, RAGFlow supports ingestion pipelines for customized data ingestion and cleansing workflows. To use a customized data pipeline:

1. On the **Agent** page, click **+ Create agent** > **Create from blank**.
2. Select **Ingestion pipeline** and name your data pipeline in the popup, then click **Save** to show the data pipeline canvas.
3. After updating your data pipeline, click **Save** on the top right of the canvas.
4. Navigate to the **Configuration** page of your dataset and select **Choose pipeline** in **Ingestion pipeline**.
   *Your saved data pipeline will appear in the dropdown menu below.*
### Select embedding model

An embedding model converts chunks into embeddings. It cannot be changed once the dataset has chunks. To switch to a different embedding model, you must delete all existing chunks in the dataset. The reason is that all files in a dataset *must* be converted to embeddings using the *same* embedding model, so that their chunks are compared in the same embedding space.

:::danger IMPORTANT
Some embedding models are optimized for specific languages, so performance may be compromised if you use them to embed documents in other languages.
:::

### Upload file

- RAGFlow's File system allows you to link a file to multiple datasets, in which case each target dataset holds a reference to the file.
- In **Knowledge Base**, you are also given the option of uploading a single file or a folder of files (bulk upload) from your local machine to a dataset, in which case the dataset holds file copies.

While uploading files directly to a dataset seems more convenient, we *highly* recommend uploading files to RAGFlow's File system and then linking them to the target datasets. This way, you avoid permanently losing files when they are removed from a dataset.

### Parse file

File parsing is a crucial topic in dataset configuration. The meaning of file parsing in RAGFlow is twofold: chunking files based on file layout and building embedding and full-text (keyword) indexes on these chunks. After selecting the chunking method and embedding model, you can start parsing a file:

![parse file](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/parse_file.jpg)

- As shown above, RAGFlow allows you to use a different chunking method for a particular file, offering flexibility beyond the default method.
- As shown above, RAGFlow allows you to enable or disable individual files, offering finer control over dataset-based AI chats.

### Intervene with file parsing results

RAGFlow features visibility and explainability, allowing you to view the chunking results and intervene where necessary. To do so:

1. Click a file that has completed parsing to view its chunking results:
   _You are taken to the **Chunk** page:_
   ![chunks](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/file_chunks.jpg)
2. Hover over each snapshot for a quick view of each chunk.
3. Double-click the chunked texts to add keywords, questions, tags, or make *manual* changes where necessary:
   ![update chunk](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/add_keyword_question.jpg)

:::caution NOTE
You can add keywords to a file chunk to increase its ranking for queries containing those keywords. This action increases its keyword weight and can improve its position in the search list.
:::

4. In Retrieval testing, ask a quick question in **Test text** to double-check whether your configurations work:
   _As you can tell from the following, RAGFlow responds with truthful citations._
   ![retrieval test](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/retrieval_test.jpg)

### Run retrieval testing

RAGFlow uses multi-way recall, combining full-text search and vector search, in its chats. Prior to setting up an AI chat, consider adjusting the following parameters to ensure that the intended information always turns up in answers:

- Similarity threshold: Chunks with similarities below the threshold will be filtered out. By default, it is set to 0.2.
- Vector similarity weight: The percentage by which vector similarity contributes to the overall score. By default, it is set to 0.3.

See [Run retrieval test](./run_retrieval_test.md) for details.
## Search for dataset

As of RAGFlow v0.23.1, the search feature is still rudimentary, supporting only dataset search by name.

![search dataset](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/search_datasets.jpg)

## Delete dataset

You are allowed to delete a dataset. Hover your mouse over the three-dot icon of the intended dataset card and the **Delete** option appears. Once you delete a dataset, the associated folder under the **root/.knowledgebase** directory is AUTOMATICALLY REMOVED. The consequences are:

- The files uploaded directly to the dataset are gone;
- The file references, which you created from within RAGFlow's File system, are gone, but the associated files still exist.

![delete dataset](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/delete_datasets.jpg)

---

---
sidebar_position: 8
slug: /construct_knowledge_graph
---

# Construct knowledge graph

Generate a knowledge graph for your dataset.

---

To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunking method.

![Image](https://github.com/user-attachments/assets/1ec21d8e-f255-4d65-9918-69b72dfa142b)

From v0.16.0 onward, RAGFlow supports constructing a knowledge graph on a dataset, allowing you to construct a *unified* graph across multiple files within your dataset. When a newly uploaded file starts parsing, the generated graph will automatically update.

:::danger WARNING
Constructing a knowledge graph requires significant memory, computational resources, and tokens.
:::

## Scenarios

Knowledge graphs are especially useful for multi-hop question-answering involving *nested* logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.

:::tip NOTE
RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) can also be used for multi-hop question-answering tasks. See [Enable RAPTOR](./enable_raptor.md) for details. You may use either approach or both, but ensure you understand the memory, computational, and token costs involved.
:::

## Prerequisites

The system's default chat model is used to generate the knowledge graph. Before proceeding, ensure that you have a chat model properly configured:

![Set default models](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_default_models.jpg)

## Configurations

### Entity types (*Required*)

The types of the entities to extract from your dataset. The default types are: **organization**, **person**, **event**, and **category**. Add or remove types to suit your specific dataset.

### Method

The method to use to construct the knowledge graph:

- **General**: Use prompts provided by [GraphRAG](https://github.com/microsoft/graphrag) to extract entities and relationships.
- **Light**: (Default) Use prompts provided by [LightRAG](https://github.com/HKUDS/LightRAG) to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.

### Entity resolution

Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more effective graph.

- (Default) Disable entity resolution.
- Enable entity resolution. This option consumes more tokens.
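As a toy illustration of what this switch accomplishes (the real merging is LLM-driven; the alias table below is made up), entity resolution collapses alias spellings onto one canonical node:

```python
# Entity resolution in miniature: map alias spellings of the same real-world
# entity onto one canonical node. RAGFlow performs this step with the LLM;
# the alias table below is purely illustrative.

aliases = {
    "the year of 2025": "2025",
    "IT": "Information Technology",
}

def resolve(entities: list[str]) -> list[str]:
    canonical = [aliases.get(e, e) for e in entities]
    return sorted(set(canonical))  # duplicate mentions collapse into one node

print(resolve(["2025", "the year of 2025", "IT", "Information Technology"]))
# ['2025', 'Information Technology']: four mentions, two graph nodes
```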
### Community reports

In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See [here](https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/) for more information. This option indicates whether to generate community reports:

- Generate community reports. This option consumes more tokens.
- (Default) Do not generate community reports.

## Quickstart

1. Navigate to the **Configuration** page of your dataset and update:
   - Entity types: *Required* - Specifies the entity types in the knowledge graph to generate. You don't have to stick with the defaults; customize them to suit your documents.
   - Method: *Optional*
   - Entity resolution: *Optional*
   - Community reports: *Optional*
   *The default knowledge graph configurations for your dataset are now set.*
2. Navigate to the **Files** page of your dataset, click the **Generate** button on the top right corner of the page, then select **Knowledge graph** from the dropdown to initiate the knowledge graph generation process.
   *You can click the pause button in the dropdown to halt the build process when necessary.*
3. Go back to the **Configuration** page:
   *Once a knowledge graph is generated, the **Knowledge graph** field changes from `Not generated` to `Generated at a specific timestamp`. You can delete it by clicking the recycle bin button to the right of the field.*
4. To use the created knowledge graph, do either of the following:
   - In the **Chat setting** panel of your chat app, switch on the **Use knowledge graph** toggle.
   - If you are using an agent, click the **Retrieval** agent component to specify the dataset(s) and switch on the **Use knowledge graph** toggle.

## Frequently asked questions

### Does the knowledge graph automatically update when I remove a related file?

Nope. The knowledge graph does *not* update *until* you regenerate a knowledge graph for your dataset.

### How do I remove a generated knowledge graph?

On the **Configuration** page of your dataset, find the **Knowledge graph** field and click the recycle bin button to the right of the field.

### Where is the created knowledge graph stored?

All chunks of the created knowledge graph are stored in RAGFlow's document engine: either Elasticsearch or [Infinity](https://github.com/infiniflow/infinity).

### Can I export a created knowledge graph?

Nope. Exporting a created knowledge graph is not supported. If you still consider this feature essential, please [raise an issue](https://github.com/infiniflow/ragflow/issues) explaining your use case and its importance.

---

---
sidebar_position: 4
slug: /enable_excel2html
---

# Enable Excel2HTML

Convert complex Excel spreadsheets into HTML tables.

---

When using the **General** chunking method, you can enable the **Excel to HTML** toggle to convert spreadsheet files into HTML tables. If it is disabled, spreadsheet tables will be represented as key-value pairs. For complex tables that cannot be simply represented this way, you must enable this feature.

:::caution WARNING
The feature is disabled by default. If your dataset contains spreadsheets with complex tables and you do not enable this feature, RAGFlow will not throw an error, but your tables are likely to be garbled.
:::

## Scenarios

Works with complex tables that cannot be represented as key-value pairs. Examples include spreadsheet tables with multiple columns, tables with merged cells, or multiple tables within one sheet. In such cases, consider converting these spreadsheet tables into HTML tables.
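To see why the representation matters, the following sketch (assuming `pandas` is installed) renders the same small sheet both ways; with merged cells or multi-level headers, only the HTML form can preserve the structure:

```python
# Key-value vs. HTML representation of a spreadsheet table.
# Requires: pip install pandas
import pandas as pd

df = pd.DataFrame(
    {"Quarter": ["Q1", "Q2"], "Revenue": [120, 135], "Costs": [80, 90]}
)

# Key-value pairs: one "column: value" pair per cell, row by row.
for _, row in df.iterrows():
    print("; ".join(f"{col}: {row[col]}" for col in df.columns))

# HTML table: the column structure survives, which also accommodates merged
# cells and multi-level headers that key-value pairs cannot express.
print(df.to_html(index=False))
```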
## Considerations

- The Excel2HTML feature applies only to spreadsheet files (XLSX, or XLS (Excel 97-2003)).
- This feature is associated with the **General** chunking method. In other words, it is available *only when* you select the **General** chunking method.
- When this feature is enabled, spreadsheet tables with more than 12 rows will be split into chunks of 12 rows each.

## Procedure

1. On your dataset's **Configuration** page, select **General** as the chunking method.
   _The **Excel to HTML** toggle appears._
2. Enable **Excel to HTML** if your dataset contains complex spreadsheet tables that cannot be represented as key-value pairs.
3. Leave **Excel to HTML** disabled if your dataset has no spreadsheet tables or if its spreadsheet tables can be represented as key-value pairs.
4. If question-answering regarding complex tables is unsatisfactory, check whether **Excel to HTML** is enabled.

## Frequently asked questions

### Should I enable this feature for PDFs with complex tables?

Nope. This feature applies to spreadsheet files only. Enabling **Excel to HTML** does not affect your PDFs.

---

---
sidebar_position: 7
slug: /enable_raptor
---

# Enable RAPTOR

A recursive abstractive method used in long-context knowledge retrieval and summarization, balancing broad semantic understanding with fine details.

---

RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) is an enhanced document preprocessing technique introduced in a [2024 paper](https://arxiv.org/html/2401.18059v1). Designed to tackle multi-hop question-answering issues, RAPTOR performs recursive clustering and summarization of document chunks to build a hierarchical tree structure. This enables more context-aware retrieval across lengthy documents.

RAGFlow v0.6.0 integrates RAPTOR for document clustering as part of its data preprocessing pipeline between data extraction and indexing, as illustrated below.

![document_clustering](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/document_clustering_as_preprocessing.jpg)

Our tests with this new approach demonstrate state-of-the-art (SOTA) results on question-answering tasks requiring complex, multistep reasoning. By combining RAPTOR retrieval with our built-in chunking methods and/or other retrieval-augmented generation (RAG) approaches, you can further improve your question-answering accuracy.

:::danger WARNING
Enabling RAPTOR requires significant memory, computational resources, and tokens.
:::

## Basic principles

After the original documents are divided into chunks, the chunks are clustered by semantic similarity rather than by their original order in the text. Clusters are then summarized into higher-level chunks by your system's default chat model. This process is applied recursively, forming a tree structure with various levels of summarization from the bottom up. As illustrated in the figure below, the initial chunks form the leaf nodes (shown in blue) and are recursively summarized into a root node (shown in orange).

![raptor](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/clustering_and_summarizing.jpg)

The recursive clustering and summarization capture a broad understanding (by the root node) as well as fine details (by the leaf nodes) necessary for multi-hop question-answering.
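The shape of the algorithm can be captured in a few lines (grouping and summarization are stubbed out below; RAGFlow clusters by embedding similarity and summarizes with your default chat model):

```python
# RAPTOR's recursive loop in miniature: cluster chunks, summarize each
# cluster into a higher-level chunk, and repeat until one root remains.
# group() and summarize() are stubs for embedding-based clustering and
# LLM summarization.

def group(chunks: list[str], size: int = 2) -> list[list[str]]:
    return [chunks[i : i + size] for i in range(0, len(chunks), size)]

def summarize(cluster: list[str]) -> str:
    return f"summary({' + '.join(cluster)})"  # an LLM call in RAGFlow

def raptor(chunks: list[str]) -> list[list[str]]:
    levels = [chunks]                       # leaf nodes
    while len(levels[-1]) > 1:
        levels.append([summarize(c) for c in group(levels[-1])])
    return levels                           # levels[-1][0] is the root node

tree = raptor(["c1", "c2", "c3", "c4"])
for depth, level in enumerate(tree):
    print(depth, level)
# 0 ['c1', 'c2', 'c3', 'c4']
# 1 ['summary(c1 + c2)', 'summary(c3 + c4)']
# 2 ['summary(summary(c1 + c2) + summary(c3 + c4))']
```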
## Scenarios

For multi-hop question-answering tasks involving complex, multistep reasoning, a semantic gap often exists between the question and its answer. As a result, searching with the question often fails to retrieve the relevant chunks that contribute to the correct answer. RAPTOR addresses this challenge by providing the chat model with richer, more context-aware, and more relevant chunks, enabling a holistic understanding without losing granular details.

:::tip NOTE
Knowledge graphs can also be used for multi-hop question-answering tasks. See [Construct knowledge graph](./construct_knowledge_graph.md) for details. You may use either approach or both, but ensure you understand the memory, computational, and token costs involved.
:::

## Prerequisites

The system's default chat model is used to summarize clustered content. Before proceeding, ensure that you have a chat model properly configured:

![Set default models](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_default_models.jpg)

## Configurations

The RAPTOR feature is disabled by default. To enable it, manually switch on the **Use RAPTOR to enhance retrieval** toggle on your dataset's **Configuration** page.

### Prompt

The following prompt will be applied *recursively* for cluster summarization, with `{cluster_content}` serving as an internal parameter. We recommend that you keep it as-is for now. The design will be updated in due course.

```
Please summarize the following paragraphs... Paragraphs as following:
{cluster_content}
The above is the content you need to summarize.
```

### Max token

The maximum number of tokens per generated summary chunk. Defaults to 256, with a maximum limit of 2048.

### Threshold

In RAPTOR, chunks are clustered by their semantic similarity. The **Threshold** parameter sets the minimum similarity required for chunks to be grouped together. It defaults to 0.1, with a maximum limit of 1. A higher **Threshold** means fewer chunks in each cluster, while a lower one means more.

### Max cluster

The maximum number of clusters to create. Defaults to 64, with a maximum limit of 1024.

### Random seed

A random seed. Click **+** to change the seed value.

## Quickstart

1. Navigate to the **Configuration** page of your dataset and update:
   - Prompt: *Optional* - We recommend that you keep it as-is until you understand the mechanism behind it.
   - Max token: *Optional*
   - Threshold: *Optional*
   - Max cluster: *Optional*
2. Navigate to the **Files** page of your dataset, click the **Generate** button on the top right corner of the page, then select **RAPTOR** from the dropdown to initiate the RAPTOR build process.
   *You can click the pause button in the dropdown to halt the build process when necessary.*
3. Go back to the **Configuration** page:
   *The **RAPTOR** field changes from `Not generated` to `Generated at a specific timestamp` when a RAPTOR hierarchical tree structure is generated. You can delete it by clicking the recycle bin button to the right of the field.*
4. Once a RAPTOR hierarchical tree structure is generated, your chat assistant and **Retrieval** agent component will use it for retrieval by default.

---

---
sidebar_position: 4
slug: /enable_table_of_contents
---

# Extract table of contents

Extract table of contents (TOC) from documents to provide long-context RAG and improve retrieval.

---

During indexing, this technique uses an LLM to extract and generate chapter information, which is added to each chunk to provide sufficient global context. At the retrieval stage, it first uses the chunks matched by search, then supplements missing chunks based on the table of contents structure. This addresses issues caused by chunk fragmentation and insufficient context, improving answer quality.
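A condensed sketch of the retrieval-stage behavior just described (the chunk-to-chapter mapping is hypothetical; RAGFlow generates it with the LLM during indexing):

```python
# TOC-based supplementation: after normal search, pull in the sibling
# chunks that belong to the same chapter as each hit.

toc = {  # chunk id -> chapter path generated at indexing time
    "c1": "1. Scope", "c2": "2. Penalties", "c3": "2. Penalties",
}

def supplement(hits: list[str]) -> list[str]:
    chapters = {toc[h] for h in hits}
    extras = [cid for cid, ch in toc.items() if ch in chapters and cid not in hits]
    return hits + extras  # search hits first, then same-chapter context

print(supplement(["c2"]))  # ['c2', 'c3']: c3 shares the "2. Penalties" chapter
```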
:::danger WARNING
Enabling TOC extraction requires significant memory, computational resources, and tokens.
:::

## Prerequisites

The system's default chat model is used to extract and generate the table of contents. Before proceeding, ensure that you have a chat model properly configured:

![Set default models](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_default_models.jpg)

## Quickstart

1. Navigate to the **Configuration** page.
2. Enable **TOC Enhance**.
3. To use this technique during retrieval, do either of the following:
   - In the **Chat setting** panel of your chat app, switch on the **TOC Enhance** toggle.
   - If you are using an agent, click the **Retrieval** agent component to specify the dataset(s) and switch on the **TOC Enhance** toggle.

## Frequently asked questions

### Will previously parsed files be searched using the TOC enhancement feature once I enable `TOC Enhance`?

No. Only files parsed after you enable **TOC Enhance** will be searched using the TOC enhancement feature. To apply this feature to files parsed before enabling **TOC Enhance**, you must re-parse them.

---

---
sidebar_position: -5
slug: /manage_metadata
---

# Manage metadata

Manage metadata for your dataset and for your individual documents.

---

From v0.23.0 onwards, RAGFlow allows you to manage metadata both at the dataset level and for individual files.

## Procedure

1. Click on **Metadata** within your dataset to access the **Manage Metadata** page.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/click_metadata.png)
2. On the **Manage Metadata** page, you can do either of the following:
   - Edit Values: You can modify existing values. If you rename two values to be identical, they will be automatically merged.
   - Delete: You can delete specific values or entire fields.
   These changes will apply to all associated files.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/manage_metadata.png)
3. To manage metadata for a single file, navigate to the file's details page as shown below. Click on the parsing method (e.g., **General**), then select **Set Metadata** to view or edit the file's metadata. Here, you can add, delete, or modify metadata fields for this specific file. Any edits made here will be reflected in the global statistics on the main metadata management page for the dataset.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_metadata.png)
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/edit_metadata.png)
4. The filtering function operates at two levels: dataset management and retrieval. Within the dataset, click the **Filter** button to view the number of files associated with each value under existing metadata fields. By selecting specific values, you can display all linked files.
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/filter_metadata.png)
5. Metadata filtering is also supported during the retrieval stage.
In Chat, for example, you can set metadata filtering rules after configuring a knowledge base:

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/metadata_filtering_rules.png)

- **Automatic** Mode: The system automatically filters documents based on the user's query and the existing metadata in the knowledge base.
- **Semi-automatic** Mode: Users first define the filtering scope at the field level (e.g., for **Author**), and then the system automatically filters within that preset range.
- **Manual** Mode: Users manually set precise, value-specific filter conditions, supported by operators such as **Equals**, **Not equals**, **In**, **Not in**, and more.

---

---
sidebar_position: 10
slug: /run_retrieval_test
---

# Run retrieval test

Conduct a retrieval test on your dataset to check whether the intended chunks can be retrieved.

---

After your files are uploaded and parsed, it is recommended that you run a retrieval test before proceeding with the chat assistant configuration. Running a retrieval test is *not* a superfluous step! Just like fine-tuning a precision instrument, RAGFlow requires careful tuning to deliver optimal question-answering performance. Your dataset settings, chat assistant configurations, and the specified large and small models can all significantly impact the final results. Running a retrieval test verifies whether the intended chunks can be retrieved, allowing you to quickly identify areas for improvement or pinpoint any issue that needs addressing. For instance, when debugging your question-answering system, if you know that the correct chunks can be retrieved, you can focus your efforts elsewhere. For example, in issue [#5627](https://github.com/infiniflow/ragflow/issues/5627), the problem was found to be due to the LLM's limitations.

During a retrieval test, chunks created from your specified chunking method are retrieved using a hybrid search. This search combines weighted keyword similarity with either weighted vector cosine similarity or a weighted reranking score, depending on your settings:

- If no rerank model is selected, weighted keyword similarity will be combined with weighted vector cosine similarity.
- If a rerank model is selected, weighted keyword similarity will be combined with weighted vector reranking score.

In contrast, chunks created from [knowledge graph construction](./construct_knowledge_graph.md) are retrieved solely using vector cosine similarity.

## Prerequisites

- Your files are uploaded and successfully parsed before running a retrieval test.
- A knowledge graph must be successfully built before enabling **Use knowledge graph**.

## Configurations

### Similarity threshold

This sets the bar for retrieving chunks: chunks with similarities below the threshold will be filtered out. By default, the threshold is set to 0.2. This means that only chunks with a hybrid similarity score of 20 or higher (on a 0-100 scale) will be retrieved.

### Vector similarity weight

This sets the weight of vector similarity in the composite similarity score, whether used with vector cosine similarity or a reranking score. By default, it is set to 0.3, making the weight of the other component 0.7 (1 - 0.3).

### Rerank model

- If left empty, RAGFlow will use a combination of weighted keyword similarity and weighted vector cosine similarity.
- If a rerank model is selected, weighted keyword similarity will be combined with weighted vector reranking score.

:::danger IMPORTANT
Using a rerank model will significantly increase the time to receive a response.
:::
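Whichever combination applies, the final score is the same weighted sum, as this minimal sketch shows (the weights follow the defaults described above):

```python
# The two hybrid-score combinations used in a retrieval test.
# vector_weight is the "Vector similarity weight" setting (default 0.3).

def hybrid_score(keyword_sim: float, other_sim: float,
                 vector_weight: float = 0.3) -> float:
    """other_sim is the vector cosine similarity, or the reranking score
    when a rerank model is selected."""
    return keyword_sim * (1 - vector_weight) + other_sim * vector_weight

print(hybrid_score(25.17, 36.49))  # prints ~28.566, shown as 28.56 in the worked example later on this page
```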
### Use knowledge graph

In a knowledge graph, an entity description, a relationship description, or a community report each exists as an independent chunk. This switch indicates whether to add these chunks to the retrieval. The switch is disabled by default.

When enabled, RAGFlow performs the following during a retrieval test:

1. Extract entities and entity types from your query using the LLM.
2. Retrieve the top N entities from the graph, based on their PageRank values, using the extracted entity types.
3. Find similar entities and their N-hop relationships from the graph using the embeddings of the extracted query entities.
4. Retrieve similar relationships from the graph using the query embedding.
5. Rank these retrieved entities and relationships by multiplying each one's PageRank value with its similarity score to the query, returning the top N as the final retrieval (see the sketch at the end of this section).
6. Retrieve the report for the community involving the most entities in the final retrieval.

*The retrieved entity descriptions, relationship descriptions, and the top-1 community report are sent to the LLM for content generation.*

:::danger IMPORTANT
Using a knowledge graph in a retrieval test will significantly increase the time to receive a response.
:::

### Cross-language search

To perform a [cross-language search](../../references/glossary.mdx#cross-language-search), select one or more target languages from the dropdown menu. The system's default chat model will then translate your query entered in the Test text field into the selected target language(s). This translation ensures accurate semantic matching across languages, allowing you to retrieve relevant results regardless of language differences.

:::tip NOTE
- When selecting target languages, please ensure that these languages are present in the dataset to guarantee an effective search.
- If no target language is selected, the system will search only in the language of your query, which may cause relevant information in other languages to be missed.
:::

### Test text

This field is where you enter your test query.

## Procedure

1. Navigate to the **Retrieval testing** page of your dataset, enter your query in **Test text**, and click **Testing** to run the test.
2. If the results are unsatisfactory, tune the options listed in the Configuration section and rerun the test.

*The following is a screenshot of a retrieval test conducted without using a knowledge graph. It demonstrates a hybrid search combining weighted keyword similarity and weighted vector cosine similarity. The overall hybrid similarity score is 28.56, calculated as 25.17 (term similarity score) x 0.7 + 36.49 (vector similarity score) x 0.3:*

![Image](https://github.com/user-attachments/assets/541554d4-3f3e-44e1-954b-0ae77d7372c6)

*The following is a screenshot of a retrieval test conducted using a knowledge graph. It shows that only vector similarity is used for knowledge graph-generated chunks:*

![Image](https://github.com/user-attachments/assets/30a03091-0f7b-4058-901a-f4dc5ca5aa6b)

:::caution WARNING
If you have adjusted the default settings, such as keyword similarity weight or similarity threshold, to achieve the optimal results, be aware that these changes will not be automatically saved. You must apply them to your chat assistant settings or the **Retrieval** agent component settings.
:::
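For reference, step 5's ranking rule from the **Use knowledge graph** procedure above amounts to the following toy computation (entity names, PageRank values, and similarities are made up):

```python
# Knowledge-graph retrieval ranking: PageRank multiplied by similarity
# to the query, keeping the top n. All values below are illustrative only.

candidates = [  # (name, pagerank, similarity to query)
    ("Acme Corp", 0.9, 0.80),
    ("2025",      0.4, 0.95),
    ("Berlin",    0.7, 0.30),
]

def top_n(items, n=2):
    ranked = sorted(items, key=lambda e: e[1] * e[2], reverse=True)
    return ranked[:n]

print(top_n(candidates))
# [('Acme Corp', 0.9, 0.8), ('2025', 0.4, 0.95)]: 0.72 and 0.38 beat 0.21
```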
## Frequently asked questions

### Is an LLM used when the Use Knowledge Graph switch is enabled?

Yes, your LLM will be used to analyze your query and extract the related entities and relationships from the knowledge graph. This also explains why additional tokens and time are consumed.

---

---
sidebar_position: -3
slug: /select_pdf_parser
---

# Select PDF parser

Select a visual model for parsing your PDFs.

---

RAGFlow isn't one-size-fits-all. It is built for flexibility and supports deeper customization to accommodate more complex use cases. From v0.17.0 onwards, RAGFlow decouples DeepDoc-specific data extraction tasks from chunking methods **for PDF files**. This separation enables you to autonomously select a visual model for OCR (Optical Character Recognition), TSR (Table Structure Recognition), and DLR (Document Layout Recognition) tasks that balances speed and performance to suit your specific use cases. If your PDFs contain only plain text, you can opt to skip these tasks by selecting the **Naive** option, to reduce the overall parsing time.

![data extraction](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/data_extraction.jpg)

## Prerequisites

- The PDF parser dropdown menu appears only when you select a chunking method compatible with PDFs, including:
  - **General**
  - **Manual**
  - **Paper**
  - **Book**
  - **Laws**
  - **Presentation**
  - **One**
- To use a third-party visual model for parsing PDFs, ensure you have set a default VLM under **Set default models** on the **Model providers** page.

## Quickstart

1. On your dataset's **Configuration** page, select a chunking method, say **General**.
   _The **PDF parser** dropdown menu appears._
2. Select the option that works best with your scenario:
   - DeepDoc: (Default) The default visual model performing OCR, TSR, and DLR tasks on PDFs, but can be time-consuming.
   - Naive: Skip OCR, TSR, and DLR tasks if *all* your PDFs are plain text.
   - [MinerU](https://github.com/opendatalab/MinerU): (Experimental) An open-source tool that converts PDFs into machine-readable formats.
   - [Docling](https://github.com/docling-project/docling): (Experimental) An open-source document processing tool for gen AI.
   - A third-party visual model from a specific model provider.

:::danger IMPORTANT
Starting from v0.22.0, RAGFlow includes MinerU (≥ 2.6.3) as an optional PDF parser with multiple backends. Please note that RAGFlow acts only as a *remote client* for MinerU, calling the MinerU API to parse documents and reading the returned files.
:::

To use this feature:

1. Prepare a reachable MinerU API service (FastAPI server).
2. In the **.env** file or from the **Model providers** page in the UI, configure RAGFlow as a remote client to MinerU:
   - `MINERU_APISERVER`: The MinerU API endpoint (e.g., `http://mineru-host:8886`).
   - `MINERU_BACKEND`: The MinerU backend:
     - `"pipeline"` (default)
     - `"vlm-http-client"`
     - `"vlm-transformers"`
     - `"vlm-vllm-engine"`
     - `"vlm-mlx-engine"`
     - `"vlm-vllm-async-engine"`
     - `"vlm-lmdeploy-engine"`
   - `MINERU_SERVER_URL`: (Optional) The downstream vLLM HTTP server (e.g., `http://vllm-host:30000`). Applicable when `MINERU_BACKEND` is set to `"vlm-http-client"`.
   - `MINERU_OUTPUT_DIR`: (Optional) The local directory for holding the outputs of the MinerU API service (zip/JSON) before ingestion.
   - `MINERU_DELETE_OUTPUT`: Whether to delete temporary output when a temporary directory is used:
     - `1`: Delete.
     - `0`: Retain.
3. In the web UI, navigate to your dataset's **Configuration** page and find the **Ingestion pipeline** section:
   - If you decide to use a chunking method from the **Built-in** dropdown, ensure it supports PDF parsing, then select **MinerU** from the **PDF parser** dropdown.
   - If you use a custom ingestion pipeline instead, select **MinerU** in the **PDF parser** section of the **Parser** component.

:::note
All MinerU environment variables are optional. When set, these values are used to auto-provision a MinerU OCR model for the tenant on first use. To avoid auto-provisioning, skip the environment variable settings and only configure MinerU from the **Model providers** page in the UI.
:::

:::caution WARNING
Third-party visual models are marked **Experimental** because we have not fully tested these models for the aforementioned data extraction tasks.
:::

## Frequently asked questions

### When should I select DeepDoc or a third-party visual model as the PDF parser?

Use a visual model to extract data if your PDFs contain formatted or image-based text rather than plain text. DeepDoc is the default visual model but can be time-consuming. You can also choose a lightweight or high-performance VLM, depending on your needs and hardware capabilities.

### Can I select a visual model to parse my DOCX files?

No, you cannot. This dropdown menu is for PDFs only. To use this feature, convert your DOCX files to PDF first.

---

---
sidebar_position: -8
slug: /set_context_window
---

# Set context window size

Set context window size for images and tables to improve long-context RAG performance.

---

RAGFlow leverages built-in DeepDoc, along with external document models like MinerU and Docling, to parse document layouts. In previous versions, images and tables extracted based on document layout were treated as independent chunks. Consequently, if a search query did not directly match the content of an image or table, these elements would not be retrieved. However, real-world documents frequently interweave charts and tables with surrounding text, which often describes them. Therefore, recalling charts based on this contextual text is an essential capability.

To address this, RAGFlow 0.23.0 introduces the **Image & table context window** feature. Inspired by key principles of the research-focused, open-source multimodal RAG project RAG-Anything, this functionality allows surrounding text and adjacent visuals to be grouped into a single chunk based on a user-configurable window size. This ensures they are retrieved together, significantly improving the recall accuracy for charts and tables.

## Procedure

1. On your dataset's **Configuration** page, find the **Image & table context window** slider:
   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/image_table_context_window.png)
2. Adjust the number of context tokens according to your needs.
   *The number in the red box indicates that approximately **N tokens** of text from above and below the image/table will be captured and inserted into the image or table chunk as contextual information. The capture process intelligently optimizes boundaries at punctuation marks to preserve semantic integrity.*

---

---
sidebar_position: -7
slug: /set_metadata
---

# Set metadata

Manually add metadata to an uploaded file.

---

On the **Dataset** page of your dataset, you can add metadata to any uploaded file. This approach enables you to 'tag' additional information like URL, author, date, and more to an existing file. In an AI-powered chat, such information will be sent to the LLM with the retrieved chunks for content generation. For example, if you have a dataset of HTML files and want the LLM to cite the source URL when responding to your query, add a `"url"` parameter to each file's metadata.
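Because RAGFlow only applies metadata that is valid JSON (see the note below), it can be worth validating your metadata before pasting it in. A minimal check (the fields shown are examples only):

```python
# Validate a metadata snippet before pasting it into RAGFlow.
# The keys below are illustrative; use whatever fields your use case needs.
import json

metadata = '{"url": "https://example.com/post", "author": "Jane Doe", "date": "2025-01-01"}'

try:
    parsed = json.loads(metadata)
    print("valid JSON:", parsed)
except json.JSONDecodeError as err:
    print("invalid JSON, fix before pasting:", err)
```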
![Set metadata](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_metadata.jpg)

:::tip NOTE
Ensure that your metadata is in JSON format; otherwise, your updates will not be applied.
:::

![Input metadata](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/input_metadata.jpg)

## Related APIs

[Retrieve chunks](../../references/http_api_reference.md#retrieve-chunks)

## Frequently asked questions

### Can I set metadata for multiple documents at once?

From v0.23.0 onwards, you can set metadata for each document individually or have the LLM auto-generate metadata for multiple files. See [Extract metadata](./auto_metadata.md) for details.

---

---
sidebar_position: -2
slug: /set_page_rank
---

# Set page rank

Create a step-retrieval strategy using page rank.

---

## Scenario

In an AI-powered chat, you can configure a chat assistant or an agent to respond using knowledge retrieved from multiple specified datasets, provided that they employ the same embedding model. In situations where you prefer information from certain dataset(s) to take precedence or to be retrieved first, you can use RAGFlow's page rank feature to increase the ranking of chunks from these datasets. For example, if you have configured a chat assistant to draw from two datasets, dataset A for 2024 news and dataset B for 2023 news, but wish to prioritize news from the year 2024, this feature is particularly useful.

:::info NOTE
It is important to note that this 'page rank' feature operates at the level of the entire dataset rather than on individual files or documents.
:::

## Configuration

On the **Configuration** page of your dataset, drag the slider under **Page rank** to set the page rank value for your dataset. You can also enter the intended page rank value in the field next to the slider.

:::info NOTE
The page rank value must be an integer. Range: [0,100]

- 0: Disabled (Default)
- A specific value: Enabled
:::

:::tip NOTE
If you set the page rank value to a non-integer, say 1.7, it will be rounded down to the nearest integer, which in this case is 1.
:::

## Scoring mechanism

If you configure a chat assistant's **similarity threshold** to 0.2, only chunks with a hybrid score greater than 0.2 x 100 = 20 will be retrieved and sent to the chat model for content generation. This initial filtering step is crucial for narrowing down relevant information.

If you have assigned a page rank of 1 to dataset A (2024 news) and 0 to dataset B (2023 news), the final hybrid scores of the retrieved chunks will be adjusted accordingly. A chunk retrieved from dataset A with an initial score of 50 will receive a boost of 1 x 100 = 100 points, resulting in a final score of 50 + 1 x 100 = 150. In this way, chunks retrieved from dataset A will always precede chunks from dataset B.
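Restating the worked example above in code (the multiplier simply mirrors the arithmetic given on this page):

```python
# The page-rank boost from the example above: final = hybrid + page_rank * 100.

def final_score(hybrid: float, page_rank: int) -> float:
    return hybrid + page_rank * 100

print(final_score(50, page_rank=1))  # dataset A: 150
print(final_score(50, page_rank=0))  # dataset B: 50, always ranked below A
```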
---

---
sidebar_position: 6
slug: /use_tag_sets
---

# Use tag set

Use a tag set to auto-tag chunks in your datasets.

---

Retrieval accuracy is the touchstone for a production-ready RAG framework. In addition to retrieval-enhancing approaches like auto-keyword, auto-question, and knowledge graph, RAGFlow introduces an auto-tagging feature to address semantic gaps.

The auto-tagging feature automatically maps tags in the user-defined tag sets to relevant chunks within your dataset based on similarity with each chunk. This automation mechanism allows you to apply an additional "layer" of domain-specific knowledge to existing datasets, which is particularly useful when dealing with a large number of chunks. To use this feature, ensure you have at least one properly configured tag set, specify the tag set(s) on the **Configuration** page of your dataset, and then re-parse your documents to initiate the auto-tagging process. During this process, each chunk in your dataset is compared with every entry in the specified tag set(s), and tags are automatically applied based on similarity.

## Scenarios

Auto-tagging applies in situations where chunks are so similar to each other that the intended chunks cannot be distinguished from the rest. For example, when you have a few chunks about the iPhone and a majority about iPhone cases or iPhone accessories, it becomes difficult to retrieve those iPhone chunks without additional information.

## 1. Create tag set

You can consider a tag set as a closed set: the tags attached to the chunks in your dataset come *exclusively* from the specified tag set. You use a tag set to "inform" RAGFlow which chunks to tag and which tags to apply.

### Prepare a tag table file

A tag set can comprise one or multiple table files in XLSX, CSV, or TXT formats. Each table file in the tag set contains two columns, **Description** and **Tag**:

- The first column provides descriptions of the tags listed in the second column. These descriptions can be example chunks or example queries. Similarity will be calculated between each entry in this column and every chunk in your dataset.
- The **Tag** column includes tags to pair with the description entries. Multiple tags should be separated by a comma (,).

:::tip NOTE
As a rule of thumb, consider including the following entries in your tag table:

- Descriptions of intended chunks, along with their corresponding tags.
- User queries that fail to retrieve the correct responses using other methods, ensuring their tags match the intended chunks in your dataset.
:::

### Create a tag set

:::danger IMPORTANT
A tag set is *not* involved in document indexing or retrieval. Do not specify a tag set when configuring your chat assistant or agent.
:::

1. Click **+ Create dataset** to create a dataset.
2. Navigate to the **Configuration** page of the created dataset, select **Built-in** in **Ingestion pipeline**, then choose **Tag** as the default chunking method from the **Built-in** dropdown menu.
3. Go back to the **Files** page and upload and parse your table file in XLSX, CSV, or TXT format.
   _A tag cloud appears under the **Tag view** section, indicating the tag set is created:_
   ![Image](https://github.com/user-attachments/assets/abefbcbf-c130-4abe-95e1-267b0d2a0505)
4. Click the **Table** tab to view the tag frequency table:
   ![Image](https://github.com/user-attachments/assets/af91d10c-5ea5-491f-ab21-3803d5ebf59f)

## 2. Tag chunks

Once a tag set is created, you can apply it to your dataset:

1. Navigate to the **Configuration** page of your dataset.
2. Select the tag set from the **Tag sets** dropdown and click **Save** to confirm.

:::tip NOTE
If the tag set is missing from the dropdown, check that it has been created or configured correctly.
:::

3. Re-parse your documents to start the auto-tagging process.
_In an AI chat scenario using auto-tagged datasets, each query will be tagged using the corresponding tag set(s), and chunks with these tags will have a higher chance of being retrieved._

## 3. Update tag set

Creating a tag set is *not* a one-off task. Oftentimes, you may find it necessary to update or delete existing tags or add new entries.

- You can update the existing tag set in the tag frequency table.
- To add new entries, you can add and parse new table files in XLSX, CSV, or TXT formats.

### Update tag set in tag frequency table

1. Navigate to the **Configuration** page in your tag set.
2. Click the **Table** tab under **Tag view** to view the tag frequency table, where you can update tag names or delete tags.

:::danger IMPORTANT
When a tag set is updated, you must re-parse the documents in your dataset so that their tags can be updated accordingly.
:::

### Add new table files

1. Navigate to the **Configuration** page in your tag set.
2. Navigate to the **Dataset** page and upload and parse your table file in XLSX, CSV, or TXT format.

:::danger IMPORTANT
If you add new table files to your tag set, it is at your own discretion whether to re-parse your documents in your datasets.
:::

## Frequently asked questions

### Can I reference more than one tag set?

Yes, you can. Usually one tag set suffices. When using multiple tag sets, ensure they are independent of each other; otherwise, consider merging your tag sets.

### Difference between a tag set and a standard dataset?

A standard dataset serves as a knowledge source: it is searched by RAGFlow's document engine, and the retrieved chunks are fed to the LLM. In contrast, a tag set is used solely to attach tags to chunks within your dataset. It does not directly participate in the retrieval process, and you should not choose a tag set when selecting datasets for your chat assistant or agent.

### Difference between auto-tag and auto-keyword?

Both features enhance retrieval in RAGFlow. The auto-keyword feature relies on the LLM and consumes a significant number of tokens, whereas the auto-tag feature is based on vector similarity and predefined tag set(s). You can view the keywords applied in the auto-keyword feature as an open set, as they are generated by the LLM. In contrast, a tag set can be considered a user-defined closed set, requiring you to upload tag set(s) in the specified formats before use.

---

---
sidebar_position: 6
slug: /manage_files
---

# Files

RAGFlow's file management allows you to upload files individually or in bulk. You can then link an uploaded file to multiple target datasets. This guide showcases some basic usages of the file management feature.

:::info IMPORTANT
Compared to uploading files directly to various datasets, uploading them to RAGFlow's file management and then linking them to different datasets is *not* an unnecessary step, particularly when you want to delete some parsed files or an entire dataset but retain the original files.
:::

## Create folder

RAGFlow's file management allows you to establish your file system with nested folder structures. To create a folder in the root directory of RAGFlow:

![create new folder](https://github.com/infiniflow/ragflow/assets/93570324/3a37a5f4-43a6-426d-a62a-e5cd2ff7a533)

:::caution NOTE
Each dataset in RAGFlow has a corresponding folder under the **root/.knowledgebase** directory. You are not allowed to create a subfolder within it.
:::

## Upload file

RAGFlow's file management supports file uploads from your local machine, allowing both individual and bulk uploads:

![upload file](https://github.com/infiniflow/ragflow/assets/93570324/5d7ded14-ce2b-4703-8567-9356a978f45c)

![bulk upload](https://github.com/infiniflow/ragflow/assets/93570324/def0db55-824c-4236-b809-a98d8c8674e3)

## Preview file

RAGFlow's file management supports previewing files in the following formats:

- Documents (PDF, DOCX)
- Tables (XLSX)
- Pictures (JPEG, JPG, PNG, TIF, GIF)

![preview](https://github.com/infiniflow/ragflow/assets/93570324/2e931362-8bbf-482c-ac86-b68b09d331bc)

## Link file to datasets

RAGFlow's file management allows you to *link* an uploaded file to multiple datasets, creating a file reference in each target dataset. Therefore, deleting a file in your file management will AUTOMATICALLY REMOVE all related file references across the datasets.

![link knowledgebase](https://github.com/infiniflow/ragflow/assets/93570324/6c6b8db4-3269-4e35-9434-6089887e3e3f)

You can link your file to one dataset or to multiple datasets at a time:

![link multiple kb](https://github.com/infiniflow/ragflow/assets/93570324/6c508803-fb1f-435d-b688-683066fd7fff)

## Move file to a specific folder

![move files](https://github.com/user-attachments/assets/3a2db469-6811-4ea0-be80-403b61ffe257)

## Search files or folders

**File Management** only supports filtering by file name or folder name in the current directory (files or folders in subdirectories will not be retrieved).

![search file](https://github.com/infiniflow/ragflow/assets/93570324/77ffc2e5-bd80-4ed1-841f-068e664efffe)

## Rename file or folder

RAGFlow's file management allows you to rename a file or folder:

![rename_file](https://github.com/infiniflow/ragflow/assets/93570324/5abb0704-d9e9-4b43-9ed4-5750ccee011f)

## Delete files or folders

RAGFlow's file management allows you to delete files or folders individually or in bulk.

To delete a file or folder:

![delete file](https://github.com/infiniflow/ragflow/assets/93570324/85872728-125d-45e9-a0ee-21e9d4cedb8b)

To bulk delete files or folders:

![bulk delete](https://github.com/infiniflow/ragflow/assets/93570324/519b99ab-ec7f-4c8a-8cea-e0b6dcb3cb46)

> - You are not allowed to delete the **root/.knowledgebase** folder.
> - Deleting files that have been linked to datasets will **AUTOMATICALLY REMOVE** all associated file references across the datasets.

## Download uploaded file

RAGFlow's file management allows you to download an uploaded file:

![download_file](https://github.com/infiniflow/ragflow/assets/93570324/cf3b297f-7d9b-4522-bf5f-4f45743e4ed5)

> As of RAGFlow v0.23.1, bulk download is not supported, nor can you download an entire folder.

---

# Data Migration Guide

A common scenario is processing large datasets on a powerful instance (e.g., with a GPU) and then migrating the entire RAGFlow service to a different production environment (e.g., a CPU-only server). This guide explains how to safely back up and restore your data using our provided migration script.

## Identifying Your Data

By default, RAGFlow uses Docker volumes to store all persistent data, including your database, uploaded files, and search indexes. You can see these volumes by running:

```bash
docker volume ls
```

The output will look similar to this:

```text
DRIVER    VOLUME NAME
local     docker_esdata01
local     docker_minio_data
local     docker_mysql_data
local     docker_redis_data
```

These volumes contain all the data you need to migrate.
## Step 1: Stop RAGFlow Services

Before starting the migration, you must stop all running RAGFlow services on the **source machine**. Navigate to the project's root directory and run:

```bash
docker-compose -f docker/docker-compose.yml down
```

**Important:** Do **not** use the `-v` flag (e.g., `docker-compose down -v`), as this will delete all your data volumes. The migration script includes a check and will refuse to run while services are active.

## Step 2: Back Up Your Data

We provide a convenient script to package all your data volumes into a single backup folder. For a quick reference of the script's commands and options, run:

```bash
bash docker/migration.sh help
```

To create a backup, run the following command from the project's root directory:

```bash
bash docker/migration.sh backup
```

This will create a `backup/` folder in your project root containing compressed archives of your data volumes. You can also specify a custom name for your backup folder:

```bash
bash docker/migration.sh backup my_ragflow_backup
```

This will create a folder named `my_ragflow_backup/` instead.

## Step 3: Transfer the Backup Folder

Copy the entire backup folder (e.g., `backup/` or `my_ragflow_backup/`) from your source machine to the RAGFlow project directory on your **target machine**. You can use tools like `scp`, `rsync`, or a physical drive for the transfer.

## Step 4: Restore Your Data

On the **target machine**, ensure that RAGFlow services are not running. Then, use the migration script to restore your data from the backup folder. If your backup folder is named `backup/`, run:

```bash
bash docker/migration.sh restore
```

If you used a custom name, specify it in the command:

```bash
bash docker/migration.sh restore my_ragflow_backup
```

The script will automatically create the necessary Docker volumes and unpack the data.

**Note:** If the script detects that Docker volumes with the same names already exist on the target machine, it will warn you that restoring will overwrite the existing data and ask for confirmation before proceeding.

## Step 5: Start RAGFlow Services

Once the restore process is complete, you can start the RAGFlow services on your new machine:

```bash
docker-compose -f docker/docker-compose.yml up -d
```

**Note:** If you have previously started RAGFlow with docker-compose on the target machine, back up its existing data first (as described in Step 2), then remove the old volumes and start fresh:

```bash
# Back up first with `bash docker/migration.sh backup backup_dir_name`
# before running the next line.
# WARNING: the -v flag deletes the existing Docker volumes.
docker-compose -f docker/docker-compose.yml down -v
docker-compose -f docker/docker-compose.yml up -d
```

Your RAGFlow instance is now running with all the data from your original machine.

---

---
sidebar_position: 1
slug: /llm_api_key_setup
---

# Configure model API key

An API key is required for RAGFlow to interact with an online AI model. This guide provides information about setting your model API key in RAGFlow.

## Get model API key

RAGFlow supports most mainstream LLMs. Please refer to [Supported Models](../../references/supported_models.mdx) for a complete list of supported models. You will need to apply for your model API key online. Note that most LLM providers grant newly created accounts either trial credit that expires within a couple of months or a promotional amount of free quota.

:::note
If you find your online LLM is not on the list, don't feel disheartened.
The list is expanding, and you can [file a feature request](https://github.com/infiniflow/ragflow/issues/new?assignees=&labels=feature+request&projects=&template=feature_request.yml&title=%5BFeature+Request%5D%3A+) with us! Alternatively, if you have customized or locally-deployed models, you can [bind them to RAGFlow using Ollama, Xinference, or LocalAI](./deploy_local_llm.mdx).
:::

## Configure model API key

You have two options for configuring your model API key:

- Configure it in **service_conf.yaml.template** before starting RAGFlow.
- Configure it on the **Model providers** page after logging into RAGFlow.

### Configure model API key before starting up RAGFlow

1. Navigate to **./docker/ragflow**.
2. Find entry **user_default_llm**:
   - Update `factory` with your chosen LLM.
   - Update `api_key` with your API key.
   - Update `base_url` if you use a proxy to connect to the remote service.
3. Reboot your system for your changes to take effect.
4. Log into RAGFlow.

_After logging into RAGFlow, you will find your chosen model appears under **Added models** on the **Model providers** page._

### Configure model API key after logging into RAGFlow

:::caution WARNING
After logging into RAGFlow, configuring your model API key through the **service_conf.yaml.template** file will no longer take effect.
:::

After logging into RAGFlow, you can *only* configure the API key on the **Model providers** page:

1. Click on your avatar in the top right corner of the page **>** **Model providers**.
2. Find your model card under **Models to be added** and click **Add the model**.
3. Paste your model API key.
4. Fill in your base URL if you use a proxy to connect to the remote service.
5. Click **OK** to confirm your changes.

---

---
sidebar_position: 3
slug: /join_or_leave_team
---

# Join or leave a team

Accept an invitation to join a team, decline an invitation, or leave a team.

---

Once you join a team, you can do the following:

- Upload documents to the team owner's shared datasets.
- Parse documents in the team owner's shared datasets.
- Use the team owner's shared Agents.

:::tip NOTE
You cannot invite users to a team unless you are its owner.
:::

## Prerequisites

1. Ensure that the email address at which you received the team invitation is associated with a RAGFlow user account.
2. The team owner should share their datasets by setting their **Permission** to **Team**.

## Accept or decline team invite

1. You will see a notification in the top right corner of the page when you receive an invitation to join a team.
2. Click on your avatar in the top right corner of the page, then select **Team** in the left-hand panel to access the **Team** page.

_On the **Team** page, you can view the information about members of your team and the teams you have joined._

_After accepting the team invite, you should be able to view and update the team owner's datasets whose **Permission** is set to **Team**._

## Leave a joined team

---

---
sidebar_position: 2
slug: /manage_team_members
---

# Manage team members

Invite or remove team members.

---

By default, each RAGFlow user is assigned a single team named after them. RAGFlow allows you to invite RAGFlow users to your team. Your team members can help you:

- Upload documents to your shared datasets.
- Parse documents in your shared datasets.
- Use your shared Agents.

:::tip NOTE
- Your team members are currently *not* allowed to invite users to your team; only you, the team owner, are permitted to do so.
- Sharing added models with team members is only available in RAGFlow's Enterprise edition.
:::

## Prerequisites

1. Ensure that the invited team member's email address is associated with a RAGFlow user account.
2. To allow your team members to view and update your dataset, ensure that you change **Permissions** on its **Configuration** page from **Only me** to **Team**.

## Invite team members

Click on your avatar in the top right corner of the page, then select **Team** in the left-hand panel to access the **Team** page.

![team_view](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/team_view.jpg)

_On the **Team** page, you can view the information about members of your team and the teams you have joined._

You are, by default, the owner of your own team and the only person permitted to invite users to join your team or remove team members.

![invite_user](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/invite_user.jpg)

## Remove team members

![delete_invite](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/delete_invite.jpg)

---

---
sidebar_position: 6
slug: /share_agent
---

# Share Agent

Share an Agent with your team members.

---

When ready, you may share your Agents with your team members so that they can use them. Please note that your Agents are not shared automatically; you must manually enable sharing by selecting the corresponding **Permissions** radio button:

1. Click the intended Agent to open its editing canvas.
2. Click **Management** > **Settings** to show the **Agent settings** dialogue.
3. Change **Permissions** from **Only me** to **Team**.
4. Click **Save** to apply your changes.

*When completed, your team members will see your shared Agents.*

---

---
sidebar_position: 5
slug: /share_chat_assistant
---

# Share chat assistant

Sharing chat assistants is currently exclusive to RAGFlow Enterprise but will be made available in due course.

---

---
sidebar_position: 4
slug: /share_datasets
---

# Share dataset

Share a dataset with team members.

---

When ready, you may share your datasets with your team members so that they can upload and parse files in them. Please note that your datasets are not shared automatically; you must manually enable sharing by selecting the appropriate **Permissions** radio button:

1. Navigate to the dataset's **Configuration** page.
2. Change **Permissions** from **Only me** to **Team**.
3. Click **Save** to apply your changes.

*Once completed, your team members will see your shared datasets.*

---

---
sidebar_position: 7
slug: /share_model
---

# Share models

Sharing models is currently exclusive to RAGFlow Enterprise.

---

---
sidebar_position: 4
slug: /http_api_reference
---

# HTTP API

A complete reference for RAGFlow's RESTful API. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).

---

## ERROR CODES

---

| Code | Message               | Description                |
|------|-----------------------|----------------------------|
| 400  | Bad Request           | Invalid request parameters |
| 401  | Unauthorized          | Unauthorized access        |
| 403  | Forbidden             | Access denied              |
| 404  | Not Found             | Resource not found         |
| 500  | Internal Server Error | Server internal error      |
| 1001 | Invalid Chunk ID      | Invalid Chunk ID           |
| 1002 | Chunk Update Failed   | Chunk update failed        |

---

## OpenAI-Compatible API

---

### Create chat completion

**POST** `/api/v1/chats_openai/{chat_id}/chat/completions`

Creates a model response for a given chat conversation. This API follows the same request and response format as OpenAI's API.
It allows you to interact with the model in a manner similar to how you would with [OpenAI's API](https://platform.openai.com/docs/api-reference/chat/create).

#### Request

- Method: POST
- URL: `/api/v1/chats_openai/{chat_id}/chat/completions`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"model"`: `string`
  - `"messages"`: `object list`
  - `"stream"`: `boolean`
  - `"extra_body"`: `object` (optional)

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/chats_openai/{chat_id}/chat/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
       "model": "model",
       "messages": [{"role": "user", "content": "Say this is a test!"}],
       "stream": true,
       "extra_body": {
         "reference": true,
         "metadata_condition": {
           "logic": "and",
           "conditions": [
             { "name": "author", "comparison_operator": "is", "value": "bob" }
           ]
         }
       }
     }'
```

##### Request parameters

- `model` (*Body parameter*), `string`, *Required*
  The model used to generate the response. The server will parse this automatically, so you can set it to any value for now.
- `messages` (*Body parameter*), `list[object]`, *Required*
  A list of historical chat messages used to generate the response. This must contain at least one message with the `user` role.
- `stream` (*Body parameter*), `boolean`
  Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream.
- `extra_body` (*Body parameter*), `object`
  Extra request parameters:
  - `reference`: `boolean` Whether to include reference information in the final chunk (stream) or in the final message (non-stream).
  - `metadata_condition`: `object` Metadata filter conditions applied to retrieval results.

#### Response

Stream:

```json
data:{ "id": "chatcmpl-3b0397f277f511f0b47f729e3aa55728", "choices": [ { "delta": { "content": "Hello! It seems like you're just greeting me.
If you have a specific", "role": "assistant", "function_call": null, "tool_calls": null, "reasoning_content": null }, "finish_reason": null, "index": 0, "logprobs": null } ], "created": 1755084508, "model": "model", "object": "chat.completion.chunk", "system_fingerprint": "", "usage": null } data:{"id": "chatcmpl-3b0397f277f511f0b47f729e3aa55728", "choices": [{"delta": {"content": " question or need information, feel free to ask, and I'll do my best", "role": "assistant", "function_call": null, "tool_calls": null, "reasoning_content": null}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1755084508, "model": "model", "object": "chat.completion.chunk", "system_fingerprint": "", "usage": null} data:{"id": "chatcmpl-3b0397f277f511f0b47f729e3aa55728", "choices": [{"delta": {"content": " to assist you based on the knowledge base provided.", "role": "assistant", "function_call": null, "tool_calls": null, "reasoning_content": null}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1755084508, "model": "model", "object": "chat.completion.chunk", "system_fingerprint": "", "usage": null} data:{"id": "chatcmpl-3b0397f277f511f0b47f729e3aa55728", "choices": [{"delta": {"content": null, "role": "assistant", "function_call": null, "tool_calls": null, "reasoning_content": null}, "finish_reason": "stop", "index": 0, "logprobs": null}], "created": 1755084508, "model": "model", "object": "chat.completion.chunk", "system_fingerprint": "", "usage": {"prompt_tokens": 5, "completion_tokens": 188, "total_tokens": 193}} data:[DONE] ``` Non-stream: ```json { "choices": [ { "finish_reason": "stop", "index": 0, "logprobs": null, "message": { "content": "Hello! I'm your smart assistant. What can I do for you?", "role": "assistant" } } ], "created": 1755084403, "id": "chatcmpl-3b0397f277f511f0b47f729e3aa55728", "model": "model", "object": "chat.completion", "usage": { "completion_tokens": 55, "completion_tokens_details": { "accepted_prediction_tokens": 55, "reasoning_tokens": 5, "rejected_prediction_tokens": 0 }, "prompt_tokens": 5, "total_tokens": 60 } } ``` Failure: ```json { "code": 102, "message": "The last content of this conversation is not from user." } ``` --- ### Create agent completion **POST** `/api/v1/agents_openai/{agent_id}/chat/completions` Creates a model response for a given chat conversation. This API follows the same request and response format as OpenAI's API. It allows you to interact with the model in a manner similar to how you would with [OpenAI's API](https://platform.openai.com/docs/api-reference/chat/create). #### Request - Method: POST - URL: `/api/v1/agents_openai/{agent_id}/chat/completions` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"model"`: `string` - `"messages"`: `object list` - `"stream"`: `boolean` ##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/agents_openai/{agent_id}/chat/completions \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data '{ "model": "model", "messages": [{"role": "user", "content": "Say this is a test!"}], "stream": true }' ``` ##### Request Parameters - `model` (*Body parameter*) `string`, *Required* The model used to generate the response. The server will parse this automatically, so you can set it to any value for now. - `messages` (*Body parameter*) `list[object]`, *Required* A list of historical chat messages used to generate the response. This must contain at least one message with the `user` role. 
- `stream` (*Body parameter*) `boolean` Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream. - `session_id` (*Body parameter*) `string` Agent session id. #### Response Stream: ```json ... data: { "id": "c39f6f9c83d911f0858253708ecb6573", "object": "chat.completion.chunk", "model": "d1f79142831f11f09cc51795b9eb07c0", "choices": [ { "delta": { "content": " terminal" }, "finish_reason": null, "index": 0 } ] } data: { "id": "c39f6f9c83d911f0858253708ecb6573", "object": "chat.completion.chunk", "model": "d1f79142831f11f09cc51795b9eb07c0", "choices": [ { "delta": { "content": "." }, "finish_reason": null, "index": 0 } ] } data: { "id": "c39f6f9c83d911f0858253708ecb6573", "object": "chat.completion.chunk", "model": "d1f79142831f11f09cc51795b9eb07c0", "choices": [ { "delta": { "content": "", "reference": { "chunks": { "20": { "id": "4b8935ac0a22deb1", "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.", "document_id": "4bdd2ff65e1511f0907f09f583941b45", "document_name": "INSTALL22.md", "dataset_id": "456ce60c5e1511f0907f09f583941b45", "image_id": "", "positions": [ [ 12, 11, 11, 11, 11 ] ], "url": null, "similarity": 0.5697155305154673, "vector_similarity": 0.7323851005515574, "term_similarity": 0.5000000005, "doc_type": "" } }, "doc_aggs": { "INSTALL22.md": { "doc_name": "INSTALL22.md", "doc_id": "4bdd2ff65e1511f0907f09f583941b45", "count": 3 }, "INSTALL.md": { "doc_name": "INSTALL.md", "doc_id": "4bd7fdd85e1511f0907f09f583941b45", "count": 2 }, "INSTALL(1).md": { "doc_name": "INSTALL(1).md", "doc_id": "4bdfb42e5e1511f0907f09f583941b45", "count": 2 }, "INSTALL3.md": { "doc_name": "INSTALL3.md", "doc_id": "4bdab5825e1511f0907f09f583941b45", "count": 1 } } } }, "finish_reason": null, "index": 0 } ] } data: [DONE] ``` Non-stream: ```json { "choices": [ { "finish_reason": "stop", "index": 0, "logprobs": null, "message": { "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For Windows:\n1. **Download from GitHub**: \n - Visit the [Neovim releases page](https://github.com/neovim/neovim/releases)\n - Download the latest Windows installer (nvim-win64.msi)\n - Run the installer and follow the prompts\n\n2. 
**Using winget** (Windows Package Manager):\n...", "reference": { "chunks": { "20": { "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.", "dataset_id": "456ce60c5e1511f0907f09f583941b45", "doc_type": "", "document_id": "4bdd2ff65e1511f0907f09f583941b45", "document_name": "INSTALL22.md", "id": "4b8935ac0a22deb1", "image_id": "", "positions": [ [ 12, 11, 11, 11, 11 ] ], "similarity": 0.5697155305154673, "term_similarity": 0.5000000005, "url": null, "vector_similarity": 0.7323851005515574 } }, "doc_aggs": { "INSTALL(1).md": { "count": 2, "doc_id": "4bdfb42e5e1511f0907f09f583941b45", "doc_name": "INSTALL(1).md" }, "INSTALL.md": { "count": 2, "doc_id": "4bd7fdd85e1511f0907f09f583941b45", "doc_name": "INSTALL.md" }, "INSTALL22.md": { "count": 3, "doc_id": "4bdd2ff65e1511f0907f09f583941b45", "doc_name": "INSTALL22.md" }, "INSTALL3.md": { "count": 1, "doc_id": "4bdab5825e1511f0907f09f583941b45", "doc_name": "INSTALL3.md" } } }, "role": "assistant" } } ], "created": null, "id": "c39f6f9c83d911f0858253708ecb6573", "model": "d1f79142831f11f09cc51795b9eb07c0", "object": "chat.completion", "param": null, "usage": { "completion_tokens": 415, "completion_tokens_details": { "accepted_prediction_tokens": 0, "reasoning_tokens": 0, "rejected_prediction_tokens": 0 }, "prompt_tokens": 6, "total_tokens": 421 } } ``` Failure: ```json { "code": 102, "message": "The last content of this conversation is not from user." } ``` ## DATASET MANAGEMENT --- ### Create dataset **POST** `/api/v1/datasets` Creates a dataset. #### Request - Method: POST - URL: `/api/v1/datasets` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"name"`: `string` - `"avatar"`: `string` - `"description"`: `string` - `"embedding_model"`: `string` - `"permission"`: `string` - `"chunk_method"`: `string` - `"parser_config"`: `object` - `"parse_type"`: `int` - `"pipeline_id"`: `string` ##### A basic request example ```bash curl --request POST \ --url http://{address}/api/v1/datasets \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data '{ "name": "test_1" }' ``` ##### A request example specifying ingestion pipeline :::caution WARNING You must *not* include `"chunk_method"` or `"parser_config"` when specifying an ingestion pipeline. ::: ```bash curl --request POST \ --url http://{address}/api/v1/datasets \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data '{ "name": "test-sdk", "parse_type": , "pipeline_id": "" }' ``` ##### Request parameters - `"name"`: (*Body parameter*), `string`, *Required* The unique name of the dataset to create. It must adhere to the following requirements: - Basic Multilingual Plane (BMP) only - Maximum 128 characters - Case-insensitive - `"avatar"`: (*Body parameter*), `string` Base64 encoding of the avatar. - Maximum 65535 characters - `"description"`: (*Body parameter*), `string` A brief description of the dataset to create. - Maximum 65535 characters - `"embedding_model"`: (*Body parameter*), `string` The name of the embedding model to use. For example: `"BAAI/bge-large-zh-v1.5@BAAI"` - Maximum 255 characters - Must follow `model_name@model_factory` format - `"permission"`: (*Body parameter*), `string` Specifies who can access the dataset to create. Available options: - `"me"`: (Default) Only you can manage the dataset. - `"team"`: All team members can manage the dataset. 
- `"chunk_method"`: (*Body parameter*), `enum` The default chunk method of the dataset to create. Mutually exclusive with `"parse_type"` and `"pipeline_id"`. If you set `"chunk_method"`, do not include `"parse_type"` or `"pipeline_id"`. Available options: - `"naive"`: General (default) - `"book"`: Book - `"email"`: Email - `"laws"`: Laws - `"manual"`: Manual - `"one"`: One - `"paper"`: Paper - `"picture"`: Picture - `"presentation"`: Presentation - `"qa"`: Q&A - `"table"`: Table - `"tag"`: Tag - `"parser_config"`: (*Body parameter*), `object` The configuration settings for the dataset parser. The attributes in this JSON object vary with the selected `"chunk_method"`: - If `"chunk_method"` is `"naive"`, the `"parser_config"` object contains the following attributes: - `"auto_keywords"`: `int` - Defaults to `0` - Minimum: `0` - Maximum: `32` - `"auto_questions"`: `int` - Defaults to `0` - Minimum: `0` - Maximum: `10` - `"chunk_token_num"`: `int` - Defaults to `512` - Minimum: `1` - Maximum: `2048` - `"delimiter"`: `string` - Defaults to `"\n"`. - `"html4excel"`: `bool` - Whether to convert Excel documents into HTML format. - Defaults to `false` - `"layout_recognize"`: `string` - Defaults to `DeepDOC` - `"tag_kb_ids"`: `array` - IDs of datasets to be parsed using the ​​Tag chunk method. - Before setting this, ensure a tag set is created and properly configured. For details, see [Use tag set](https://ragflow.io/docs/dev/use_tag_sets). - `"task_page_size"`: `int` - For PDFs only. - Defaults to `12` - Minimum: `1` - `"raptor"`: `object` RAPTOR-specific settings. - Defaults to: `{"use_raptor": false}` - `"graphrag"`: `object` GRAPHRAG-specific settings. - Defaults to: `{"use_graphrag": false}` - If `"chunk_method"` is `"qa"`, `"manuel"`, `"paper"`, `"book"`, `"laws"`, or `"presentation"`, the `"parser_config"` object contains the following attribute: - `"raptor"`: `object` RAPTOR-specific settings. - Defaults to: `{"use_raptor": false}`. - If `"chunk_method"` is `"table"`, `"picture"`, `"one"`, or `"email"`, `"parser_config"` is an empty JSON object. - `"parse_type"`: (*Body parameter*), `int` The ingestion pipeline parse type identifier, i.e., the number of parsers in your **Parser** component. - Required (along with `"pipeline_id"`) if specifying an ingestion pipeline. - Must not be included when `"chunk_method"` is specified. - `"pipeline_id"`: (*Body parameter*), `string` The ingestion pipeline ID. Can be found in the corresponding URL in the RAGFlow UI. - Required (along with `"parse_type"`) if specifying an ingestion pipeline. - Must be a 32-character lowercase hexadecimal string, e.g., `"d0bebe30ae2211f0970942010a8e0005"`. - Must not be included when `"chunk_method"` is specified. :::caution WARNING You can choose either of the following ingestion options when creating a dataset, but *not* both: - Use a built-in chunk method -- specify `"chunk_method"` (optionally with `"parser_config"`). - Use an ingestion pipeline -- specify both `"parse_type"` and `"pipeline_id"`. If none of `"chunk_method"`, `"parse_type"`, or `"pipeline_id"` are provided, the system defaults to `chunk_method = "naive"`. 
:::

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "avatar": null,
        "chunk_count": 0,
        "chunk_method": "naive",
        "create_date": "Mon, 28 Apr 2025 18:40:41 GMT",
        "create_time": 1745836841611,
        "created_by": "3af81804241d11f0a6a79f24fc270c7f",
        "description": null,
        "document_count": 0,
        "embedding_model": "BAAI/bge-large-zh-v1.5@BAAI",
        "id": "3b4de7d4241d11f0a6a79f24fc270c7f",
        "language": "English",
        "name": "RAGFlow example",
        "pagerank": 0,
        "parser_config": {
            "chunk_token_num": 128,
            "delimiter": "\\n!?;。;!?",
            "html4excel": false,
            "layout_recognize": "DeepDOC",
            "raptor": {
                "use_raptor": false
            }
        },
        "permission": "me",
        "similarity_threshold": 0.2,
        "status": "1",
        "tenant_id": "3af81804241d11f0a6a79f24fc270c7f",
        "token_num": 0,
        "update_date": "Mon, 28 Apr 2025 18:40:41 GMT",
        "update_time": 1745836841611,
        "vector_similarity_weight": 0.3
    }
}
```

Failure:

```json
{
    "code": 101,
    "message": "Dataset name 'RAGFlow example' already exists"
}
```

---

### Delete datasets

**DELETE** `/api/v1/datasets`

Deletes datasets by ID.

#### Request

- Method: DELETE
- URL: `/api/v1/datasets`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"ids"`: `list[string]` or `null`

##### Request example

```bash
curl --request DELETE \
     --url http://{address}/api/v1/datasets \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "ids": ["d94a8dc02c9711f0930f7fbc369eab6d", "e94a8dc02c9711f0930f7fbc369eab6e"]
     }'
```

##### Request parameters

- `"ids"`: (*Body parameter*), `list[string]` or `null`, *Required*
  Specifies the datasets to delete:
  - If `null`, all datasets will be deleted.
  - If an array of IDs, only the specified datasets will be deleted.
  - If an empty array, no datasets will be deleted.

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "You don't own the dataset."
}
```

---

### Update dataset

**PUT** `/api/v1/datasets/{dataset_id}`

Updates configurations for a specified dataset.

#### Request

- Method: PUT
- URL: `/api/v1/datasets/{dataset_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"avatar"`: `string`
  - `"description"`: `string`
  - `"embedding_model"`: `string`
  - `"permission"`: `string`
  - `"chunk_method"`: `string`
  - `"pagerank"`: `int`
  - `"parser_config"`: `object`

##### Request example

```bash
curl --request PUT \
     --url http://{address}/api/v1/datasets/{dataset_id} \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
     "name": "updated_dataset"
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the dataset to update.
- `"name"`: (*Body parameter*), `string`
  The revised name of the dataset.
  - Basic Multilingual Plane (BMP) only
  - Maximum 128 characters
  - Case-insensitive
- `"avatar"`: (*Body parameter*), `string`
  The updated base64 encoding of the avatar.
  - Maximum 65535 characters
- `"embedding_model"`: (*Body parameter*), `string`
  The updated embedding model name.
  - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
  - Maximum 255 characters
  - Must follow `model_name@model_factory` format
- `"permission"`: (*Body parameter*), `string`
  The updated dataset permission. Available options:
  - `"me"`: (Default) Only you can manage the dataset.
  - `"team"`: All team members can manage the dataset.
- `"pagerank"`: (*Body parameter*), `int` refer to [Set page rank](https://ragflow.io/docs/dev/set_page_rank) - Default: `0` - Minimum: `0` - Maximum: `100` - `"chunk_method"`: (*Body parameter*), `enum` The chunking method for the dataset. Available options: - `"naive"`: General (default) - `"book"`: Book - `"email"`: Email - `"laws"`: Laws - `"manual"`: Manual - `"one"`: One - `"paper"`: Paper - `"picture"`: Picture - `"presentation"`: Presentation - `"qa"`: Q&A - `"table"`: Table - `"tag"`: Tag - `"parser_config"`: (*Body parameter*), `object` The configuration settings for the dataset parser. The attributes in this JSON object vary with the selected `"chunk_method"`: - If `"chunk_method"` is `"naive"`, the `"parser_config"` object contains the following attributes: - `"auto_keywords"`: `int` - Defaults to `0` - Minimum: `0` - Maximum: `32` - `"auto_questions"`: `int` - Defaults to `0` - Minimum: `0` - Maximum: `10` - `"chunk_token_num"`: `int` - Defaults to `512` - Minimum: `1` - Maximum: `2048` - `"delimiter"`: `string` - Defaults to `"\n"`. - `"html4excel"`: `bool` Indicates whether to convert Excel documents into HTML format. - Defaults to `false` - `"layout_recognize"`: `string` - Defaults to `DeepDOC` - `"tag_kb_ids"`: `array` refer to [Use tag set](https://ragflow.io/docs/dev/use_tag_sets) - Must include a list of dataset IDs, where each dataset is parsed using the ​​Tag Chunking Method - `"task_page_size"`: `int` For PDF only. - Defaults to `12` - Minimum: `1` - `"raptor"`: `object` RAPTOR-specific settings. - Defaults to: `{"use_raptor": false}` - `"graphrag"`: `object` GRAPHRAG-specific settings. - Defaults to: `{"use_graphrag": false}` - If `"chunk_method"` is `"qa"`, `"manuel"`, `"paper"`, `"book"`, `"laws"`, or `"presentation"`, the `"parser_config"` object contains the following attribute: - `"raptor"`: `object` RAPTOR-specific settings. - Defaults to: `{"use_raptor": false}`. - If `"chunk_method"` is `"table"`, `"picture"`, `"one"`, or `"email"`, `"parser_config"` is an empty JSON object. #### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "Can't change tenant_id." } ``` --- ### List datasets **GET** `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}` Lists datasets. #### Request - Method: GET - URL: `/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request GET \ --url http://{address}/api/v1/datasets?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id} \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `page`: (*Filter parameter*) Specifies the page on which the datasets will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*) The number of datasets on each page. Defaults to `30`. - `orderby`: (*Filter parameter*) The field by which datasets should be sorted. Available options: - `create_time` (default) - `update_time` - `desc`: (*Filter parameter*) Indicates whether the retrieved datasets should be sorted in descending order. Defaults to `true`. - `name`: (*Filter parameter*) The name of the dataset to retrieve. - `id`: (*Filter parameter*) The ID of the dataset to retrieve. 
#### Response Success: ```json { "code": 0, "data": [ { "avatar": "", "chunk_count": 59, "create_date": "Sat, 14 Sep 2024 01:12:37 GMT", "create_time": 1726276357324, "created_by": "69736c5e723611efb51b0242ac120007", "description": null, "document_count": 1, "embedding_model": "BAAI/bge-large-zh-v1.5", "id": "6e211ee0723611efa10a0242ac120007", "language": "English", "name": "mysql", "chunk_method": "naive", "parser_config": { "chunk_token_num": 8192, "delimiter": "\\n", "entity_types": [ "organization", "person", "location", "event", "time" ] }, "permission": "me", "similarity_threshold": 0.2, "status": "1", "tenant_id": "69736c5e723611efb51b0242ac120007", "token_num": 12744, "update_date": "Thu, 10 Oct 2024 04:07:23 GMT", "update_time": 1728533243536, "vector_similarity_weight": 0.3 } ], "total": 1 } ``` Failure: ```json { "code": 102, "message": "The dataset doesn't exist" } ``` --- ### Get knowledge graph **GET** `/api/v1/datasets/{dataset_id}/knowledge_graph` Retrieves the knowledge graph of a specified dataset. #### Request - Method: GET - URL: `/api/v1/datasets/{dataset_id}/knowledge_graph` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request GET \ --url http://{address}/api/v1/datasets/{dataset_id}/knowledge_graph \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The ID of the target dataset. #### Response Success: ```json { "code": 0, "data": { "graph": { "directed": false, "edges": [ { "description": "The notice is a document issued to convey risk warnings and operational alerts.The notice is a specific instance of a notification document issued under the risk warning framework.", "keywords": ["9", "8"], "source": "notice", "source_id": ["8a46cdfe4b5c11f0a5281a58e595aa1c"], "src_id": "xxx", "target": "xxx", "tgt_id": "xxx", "weight": 17.0 } ], "graph": { "source_id": ["8a46cdfe4b5c11f0a5281a58e595aa1c", "8a7eb6424b5c11f0a5281a58e595aa1c"] }, "multigraph": false, "nodes": [ { "description": "xxx", "entity_name": "xxx", "entity_type": "ORGANIZATION", "id": "xxx", "pagerank": 0.10804906590624092, "rank": 3, "source_id": ["8a7eb6424b5c11f0a5281a58e595aa1c"] } ] }, "mind_map": {} } } ``` Failure: ```json { "code": 102, "message": "The dataset doesn't exist" } ``` --- ### Delete knowledge graph **DELETE** `/api/v1/datasets/{dataset_id}/knowledge_graph` Removes the knowledge graph of a specified dataset. #### Request - Method: DELETE - URL: `/api/v1/datasets/{dataset_id}/knowledge_graph` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request DELETE \ --url http://{address}/api/v1/datasets/{dataset_id}/knowledge_graph \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The ID of the target dataset. #### Response Success: ```json { "code": 0, "data": true } ``` Failure: ```json { "code": 102, "message": "The dataset doesn't exist" } ``` --- ### Construct knowledge graph **POST** `/api/v1/datasets/{dataset_id}/run_graphrag` Constructs a knowledge graph from a specified dataset. #### Request - Method: POST - URL: `/api/v1/datasets/{dataset_id}/run_graphrag` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/datasets/{dataset_id}/run_graphrag \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The ID of the target dataset. 
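Construction runs asynchronously: this call returns a task ID immediately, and you monitor progress with the `trace_graphrag` endpoint documented below. A minimal polling sketch (assuming `jq` is installed and the `{address}`, `{dataset_id}`, and `<YOUR_API_KEY>` placeholders are filled in):

```bash
# Kick off knowledge graph construction; the server responds right away
# with a task ID rather than waiting for the build to finish.
curl -s --request POST \
     --url "http://{address}/api/v1/datasets/{dataset_id}/run_graphrag" \
     --header 'Authorization: Bearer <YOUR_API_KEY>'

# Poll the status endpoint until the reported progress reaches 1.0.
until curl -s --request GET \
        --url "http://{address}/api/v1/datasets/{dataset_id}/trace_graphrag" \
        --header 'Authorization: Bearer <YOUR_API_KEY>' \
      | jq -e '.data.progress >= 1' > /dev/null; do
    echo "still building..."
    sleep 5
done
echo "Knowledge graph construction finished."
```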
#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "graphrag_task_id": "e498de54bfbb11f0ba028f704583b57b"
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "Invalid Dataset ID"
}
```

---

### Get knowledge graph construction status

**GET** `/api/v1/datasets/{dataset_id}/trace_graphrag`

Retrieves the knowledge graph construction status for a specified dataset.

#### Request

- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/trace_graphrag`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/datasets/{dataset_id}/trace_graphrag \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the target dataset.

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "begin_at": "Wed, 12 Nov 2025 19:36:56 GMT",
        "chunk_ids": "",
        "create_date": "Wed, 12 Nov 2025 19:36:56 GMT",
        "create_time": 1762947416350,
        "digest": "39e43572e3dcd84f",
        "doc_id": "44661c10bde211f0bc93c164a47ffc40",
        "from_page": 100000000,
        "id": "e498de54bfbb11f0ba028f704583b57b",
        "priority": 0,
        "process_duration": 2.45419,
        "progress": 1.0,
        "progress_msg": "19:36:56 created task graphrag\n19:36:57 Task has been received.\n19:36:58 [GraphRAG] doc:083661febe2411f0bc79456921e5745f has no available chunks, skip generation.\n19:36:58 [GraphRAG] build_subgraph doc:44661c10bde211f0bc93c164a47ffc40 start (chunks=1, timeout=10000000000s)\n19:36:58 Graph already contains 44661c10bde211f0bc93c164a47ffc40\n19:36:58 [GraphRAG] build_subgraph doc:44661c10bde211f0bc93c164a47ffc40 empty\n19:36:58 [GraphRAG] kb:33137ed0bde211f0bc93c164a47ffc40 no subgraphs generated successfully, end.\n19:36:58 Knowledge Graph done (0.72s)",
        "retry_count": 1,
        "task_type": "graphrag",
        "to_page": 100000000,
        "update_date": "Wed, 12 Nov 2025 19:36:58 GMT",
        "update_time": 1762947418454
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "Invalid Dataset ID"
}
```

---

### Construct RAPTOR

**POST** `/api/v1/datasets/{dataset_id}/run_raptor`

Constructs a RAPTOR tree from a specified dataset.

#### Request

- Method: POST
- URL: `/api/v1/datasets/{dataset_id}/run_raptor`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/datasets/{dataset_id}/run_raptor \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the target dataset.

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "raptor_task_id": "50d3c31cbfbd11f0ba028f704583b57b"
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "Invalid Dataset ID"
}
```

---

### Get RAPTOR construction status

**GET** `/api/v1/datasets/{dataset_id}/trace_raptor`

Retrieves the RAPTOR construction status for a specified dataset.

#### Request

- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/trace_raptor`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/datasets/{dataset_id}/trace_raptor \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the target dataset.
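RAPTOR construction is likewise asynchronous, and the polling pattern shown for GraphRAG above applies here as well. For a one-off look at the reported progress (a sketch, assuming `jq` and filled-in placeholders):

```bash
# Print just the progress fields from the task status payload.
curl -s --request GET \
     --url "http://{address}/api/v1/datasets/{dataset_id}/trace_raptor" \
     --header 'Authorization: Bearer <YOUR_API_KEY>' \
  | jq '{progress: .data.progress, message: .data.progress_msg}'
```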
#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "begin_at": "Wed, 12 Nov 2025 19:47:07 GMT",
        "chunk_ids": "",
        "create_date": "Wed, 12 Nov 2025 19:47:07 GMT",
        "create_time": 1762948027427,
        "digest": "8b279a6248cb8fc6",
        "doc_id": "44661c10bde211f0bc93c164a47ffc40",
        "from_page": 100000000,
        "id": "50d3c31cbfbd11f0ba028f704583b57b",
        "priority": 0,
        "process_duration": 0.948244,
        "progress": 1.0,
        "progress_msg": "19:47:07 created task raptor\n19:47:07 Task has been received.\n19:47:07 Processing...\n19:47:07 Processing...\n19:47:07 Indexing done (0.01s).\n19:47:07 Task done (0.29s)",
        "retry_count": 1,
        "task_type": "raptor",
        "to_page": 100000000,
        "update_date": "Wed, 12 Nov 2025 19:47:07 GMT",
        "update_time": 1762948027948
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "Invalid Dataset ID"
}
```

---

## FILE MANAGEMENT WITHIN DATASET

---

### Upload documents

**POST** `/api/v1/datasets/{dataset_id}/documents`

Uploads documents to a specified dataset.

#### Request

- Method: POST
- URL: `/api/v1/datasets/{dataset_id}/documents`
- Headers:
  - `'Content-Type: multipart/form-data'`
  - `'Authorization: Bearer '`
- Form:
  - `'file=@{FILE_PATH}'`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents \
     --header 'Content-Type: multipart/form-data' \
     --header 'Authorization: Bearer ' \
     --form 'file=@./test1.txt' \
     --form 'file=@./test2.pdf'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the dataset to which the documents will be uploaded.
- `'file'`: (*Body parameter*)
  A document to upload.

#### Response

Success:

```json
{
    "code": 0,
    "data": [
        {
            "chunk_method": "naive",
            "created_by": "69736c5e723611efb51b0242ac120007",
            "dataset_id": "527fa74891e811ef9c650242ac120006",
            "id": "b330ec2e91ec11efbc510242ac120004",
            "location": "1.txt",
            "name": "1.txt",
            "parser_config": {
                "chunk_token_num": 128,
                "delimiter": "\\n",
                "html4excel": false,
                "layout_recognize": true,
                "raptor": {
                    "use_raptor": false
                }
            },
            "run": "UNSTART",
            "size": 17966,
            "thumbnail": "",
            "type": "doc"
        }
    ]
}
```

Failure:

```json
{
    "code": 101,
    "message": "No file part!"
}
```

---

### Update document

**PUT** `/api/v1/datasets/{dataset_id}/documents/{document_id}`

Updates configurations for a specified document.

#### Request

- Method: PUT
- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"meta_fields"`: `object`
  - `"chunk_method"`: `string`
  - `"parser_config"`: `object`

##### Request example

```bash
curl --request PUT \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id} \
     --header 'Authorization: Bearer ' \
     --header 'Content-Type: application/json' \
     --data '
     {
     "name": "manual.txt",
     "chunk_method": "manual",
     "parser_config": {"chunk_token_num": 128}
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The ID of the associated dataset.
- `document_id`: (*Path parameter*)
  The ID of the document to update.
- `"name"`: (*Body parameter*), `string`
- `"meta_fields"`: (*Body parameter*), `dict[str, Any]`
  The meta fields of the document.
- `"chunk_method"`: (*Body parameter*), `string`
  The parsing method to apply to the document:
  - `"naive"`: General
  - `"manual"`: Manual
  - `"qa"`: Q&A
  - `"table"`: Table
  - `"paper"`: Paper
  - `"book"`: Book
  - `"laws"`: Laws
  - `"presentation"`: Presentation
  - `"picture"`: Picture
  - `"one"`: One
  - `"email"`: Email
- `"parser_config"`: (*Body parameter*), `object`
  The configuration settings for the dataset parser.
The attributes in this JSON object vary with the selected `"chunk_method"`:
  - If `"chunk_method"` is `"naive"`, the `"parser_config"` object contains the following attributes:
    - `"chunk_token_num"`: Defaults to `256`.
    - `"layout_recognize"`: Defaults to `true`.
    - `"html4excel"`: Indicates whether to convert Excel documents into HTML format. Defaults to `false`.
    - `"delimiter"`: Defaults to `"\n"`.
    - `"task_page_size"`: Defaults to `12`. For PDFs only.
    - `"raptor"`: RAPTOR-specific settings. Defaults to: `{"use_raptor": false}`.
  - If `"chunk_method"` is `"qa"`, `"manual"`, `"paper"`, `"book"`, `"laws"`, or `"presentation"`, the `"parser_config"` object contains the following attribute:
    - `"raptor"`: RAPTOR-specific settings. Defaults to: `{"use_raptor": false}`.
  - If `"chunk_method"` is `"table"`, `"picture"`, `"one"`, or `"email"`, `"parser_config"` is an empty JSON object.
- `"enabled"`: (*Body parameter*), `integer`
  Whether the document should be **available** in the knowledge base.
  - `1`: available
  - `0`: unavailable

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "id": "cd38dd72d4a611f0af9c71de94a988ef",
        "name": "large.md",
        "type": "doc",
        "suffix": "md",
        "size": 2306906,
        "location": "large.md",
        "source_type": "local",
        "status": "1",
        "run": "DONE",
        "dataset_id": "5f546a1ad4a611f0af9c71de94a988ef",
        "chunk_method": "naive",
        "chunk_count": 2,
        "token_count": 8126,
        "created_by": "eab7f446cb5a11f0ab334fbc3aa38f35",
        "create_date": "Tue, 09 Dec 2025 10:28:52 GMT",
        "create_time": 1765247332122,
        "update_date": "Wed, 17 Dec 2025 10:51:16 GMT",
        "update_time": 1765939876819,
        "process_begin_at": "Wed, 17 Dec 2025 10:33:55 GMT",
        "process_duration": 14.8615,
        "progress": 1.0,
        "progress_msg": [
            "10:33:58 Task has been received.",
            "10:33:59 Page(1~100000001): Start to parse.",
            "10:33:59 Page(1~100000001): Finish parsing.",
            "10:34:07 Page(1~100000001): Generate 2 chunks",
            "10:34:09 Page(1~100000001): Embedding chunks (2.13s)",
            "10:34:09 Page(1~100000001): Indexing done (0.31s).",
            "10:34:09 Page(1~100000001): Task done (11.68s)"
        ],
        "parser_config": {
            "chunk_token_num": 512,
            "delimiter": "\n",
            "auto_keywords": 0,
            "auto_questions": 0,
            "topn_tags": 3,
            "layout_recognize": "DeepDOC",
            "html4excel": false,
            "image_context_size": 0,
            "table_context_size": 0,
            "graphrag": {
                "use_graphrag": true,
                "method": "light",
                "entity_types": [
                    "organization",
                    "person",
                    "geo",
                    "event",
                    "category"
                ]
            },
            "raptor": {
                "use_raptor": true,
                "max_cluster": 64,
                "max_token": 256,
                "threshold": 0.1,
                "random_seed": 0,
                "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n {cluster_content}\nThe above is the content you need to summarize."
            }
        },
        "meta_fields": {},
        "pipeline_id": "",
        "thumbnail": ""
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "The dataset does not have the document."
}
```

---

### Download document

**GET** `/api/v1/datasets/{dataset_id}/documents/{document_id}`

Downloads a document from a specified dataset.

#### Request

- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}`
- Headers:
  - `'Authorization: Bearer '`
- Output:
  - `'{PATH_TO_THE_FILE}'`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id} \
     --header 'Authorization: Bearer ' \
     --output ./ragflow.txt
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `document_id`: (*Path parameter*)
  The ID of the document to download.
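Since a successful response is the raw file content itself rather than a JSON envelope (see below), a quick way to verify a download is to check the file size or a checksum against the original (a sketch; placeholders as above):

```bash
# Download the document to a local file.
curl -s --request GET \
     --url "http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}" \
     --header 'Authorization: Bearer <YOUR_API_KEY>' \
     --output ./ragflow.txt

# Quick integrity checks on the downloaded file.
ls -l ./ragflow.txt
sha256sum ./ragflow.txt
```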
#### Response

Success:

```text
This is a test to verify the file download feature.
```

Failure:

```json
{
    "code": 102,
    "message": "You do not own the dataset 7898da028a0511efbf750242ac1220005."
}
```

---

### List documents

**GET** `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp}&suffix={file_suffix}&run={run_status}&metadata_condition={json}`

Lists documents in a specified dataset.

#### Request

- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/documents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&keywords={keywords}&id={document_id}&name={document_name}&create_time_from={timestamp}&create_time_to={timestamp}&suffix={file_suffix}&run={run_status}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`

##### Request examples

**A basic request with pagination:**

```bash
curl --request GET \
     --url 'http://{address}/api/v1/datasets/{dataset_id}/documents?page=1&page_size=10' \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `keywords`: (*Filter parameter*), `string`
  The keywords used to match document titles.
- `page`: (*Filter parameter*), `integer`
  Specifies the page on which the documents will be displayed. Defaults to `1`.
- `page_size`: (*Filter parameter*), `integer`
  The maximum number of documents on each page. Defaults to `30`.
- `orderby`: (*Filter parameter*), `string`
  The field by which documents should be sorted. Available options:
  - `create_time` (default)
  - `update_time`
- `desc`: (*Filter parameter*), `boolean`
  Indicates whether the retrieved documents should be sorted in descending order. Defaults to `true`.
- `id`: (*Filter parameter*), `string`
  The ID of the document to retrieve.
- `create_time_from`: (*Filter parameter*), `integer`
  Unix timestamp for filtering documents created after this time. `0` means no filter. Defaults to `0`.
- `create_time_to`: (*Filter parameter*), `integer`
  Unix timestamp for filtering documents created before this time. `0` means no filter. Defaults to `0`.
- `suffix`: (*Filter parameter*), `array[string]`
  Filter by file suffix. Supports multiple values, e.g., `pdf`, `txt`, and `docx`. Defaults to all suffixes.
- `run`: (*Filter parameter*), `array[string]`
  Filter by document processing status. Supports numeric, text, and mixed formats:
  - Numeric format: `["0", "1", "2", "3", "4"]`
  - Text format: `[UNSTART, RUNNING, CANCEL, DONE, FAIL]`
  - Mixed format: `[UNSTART, 1, DONE]` (mixing numeric and text formats)
  - Status mapping:
    - `0` / `UNSTART`: Document not yet processed
    - `1` / `RUNNING`: Document is currently being processed
    - `2` / `CANCEL`: Document processing was cancelled
    - `3` / `DONE`: Document processing completed successfully
    - `4` / `FAIL`: Document processing failed

  Defaults to all statuses.
- `metadata_condition`: (*Filter parameter*), `object` (JSON in query)
  Optional metadata filter applied to documents when `document_ids` is not provided.
Uses the same structure as retrieval:
  - `logic`: `"and"` (default) or `"or"`
  - `conditions`: array of `{ "name": string, "comparison_operator": string, "value": string }`
  - `comparison_operator` supports: `is`, `not is`, `contains`, `not contains`, `in`, `not in`, `start with`, `end with`, `>`, `<`, `≥`, `≤`, `empty`, `not empty`

##### Usage examples

**A request with multiple filtering parameters:**

```bash
curl --request GET \
     --url 'http://{address}/api/v1/datasets/{dataset_id}/documents?suffix=pdf&run=DONE&page=1&page_size=10' \
     --header 'Authorization: Bearer '
```

**Filter by metadata (query JSON):**

```bash
curl -G \
     --url "http://{address}/api/v1/datasets/{dataset_id}/documents" \
     --header 'Authorization: Bearer ' \
     --data-urlencode 'metadata_condition={"logic":"and","conditions":[{"name":"tags","comparison_operator":"is","value":"bar"},{"name":"author","comparison_operator":"is","value":"alice"}]}'
```

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "docs": [
            {
                "chunk_count": 0,
                "create_date": "Mon, 14 Oct 2024 09:11:01 GMT",
                "create_time": 1728897061948,
                "created_by": "69736c5e723611efb51b0242ac120007",
                "id": "3bcfbf8a8a0c11ef8aba0242ac120006",
                "knowledgebase_id": "7898da028a0511efbf750242ac120005",
                "location": "Test_2.txt",
                "name": "Test_2.txt",
                "parser_config": {
                    "chunk_token_count": 128,
                    "delimiter": "\n",
                    "layout_recognize": true,
                    "task_page_size": 12
                },
                "chunk_method": "naive",
                "process_begin_at": null,
                "process_duration": 0.0,
                "progress": 0.0,
                "progress_msg": "",
                "run": "UNSTART",
                "size": 7,
                "source_type": "local",
                "status": "1",
                "thumbnail": null,
                "token_count": 0,
                "type": "doc",
                "update_date": "Mon, 14 Oct 2024 09:11:01 GMT",
                "update_time": 1728897061948
            }
        ],
        "total_datasets": 1
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "You don't own the dataset 7898da028a0511efbf750242ac1220005."
}
```

---

### Delete documents

**DELETE** `/api/v1/datasets/{dataset_id}/documents`

Deletes documents by ID.

#### Request

- Method: DELETE
- URL: `/api/v1/datasets/{dataset_id}/documents`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"ids"`: `list[string]`

##### Request example

```bash
curl --request DELETE \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
     "ids": ["id_1","id_2"]
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `"ids"`: (*Body parameter*), `list[string]`
  The IDs of the documents to delete. If it is not specified, all documents in the specified dataset will be deleted.

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "You do not own the dataset 7898da028a0511efbf750242ac1220005."
}
```

---

### Parse documents

**POST** `/api/v1/datasets/{dataset_id}/chunks`

Parses documents in a specified dataset.

#### Request

- Method: POST
- URL: `/api/v1/datasets/{dataset_id}/chunks`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"document_ids"`: `list[string]`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/datasets/{dataset_id}/chunks \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
     "document_ids": ["97a5f1c2759811efaa500242ac120004","97ad64b6759811ef9fc30242ac120004"]
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The dataset ID.
- `"document_ids"`: (*Body parameter*), `list[string]`, *Required* The IDs of the documents to parse. #### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "`document_ids` is required" } ``` --- ### Stop parsing documents **DELETE** `/api/v1/datasets/{dataset_id}/chunks` Stops parsing specified documents. #### Request - Method: DELETE - URL: `/api/v1/datasets/{dataset_id}/chunks` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"document_ids"`: `list[string]` ##### Request example ```bash curl --request DELETE \ --url http://{address}/api/v1/datasets/{dataset_id}/chunks \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "document_ids": ["97a5f1c2759811efaa500242ac120004","97ad64b6759811ef9fc30242ac120004"] }' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The associated dataset ID. - `"document_ids"`: (*Body parameter*), `list[string]`, *Required* The IDs of the documents for which the parsing should be stopped. #### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "`document_ids` is required" } ``` --- ## CHUNK MANAGEMENT WITHIN DATASET --- ### Add chunk **POST** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks` Adds a chunk to a specified document in a specified dataset. #### Request - Method: POST - URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"content"`: `string` - `"important_keywords"`: `list[string]` ##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "content": "" }' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The associated dataset ID. - `document_ids`: (*Path parameter*) The associated document ID. - `"content"`: (*Body parameter*), `string`, *Required* The text content of the chunk. - `"important_keywords`(*Body parameter*), `list[string]` The key terms or phrases to tag with the chunk. - `"questions"`(*Body parameter*), `list[string]` If there is a given question, the embedded chunks will be based on them #### Response Success: ```json { "code": 0, "data": { "chunk": { "content": "who are you", "create_time": "2024-12-30 16:59:55", "create_timestamp": 1735549195.969164, "dataset_id": "72f36e1ebdf411efb7250242ac120006", "document_id": "61d68474be0111ef98dd0242ac120006", "id": "12ccdc56e59837e5", "important_keywords": [], "questions": [] } } } ``` Failure: ```json { "code": 102, "message": "`content` is required" } ``` --- ### List chunks **GET** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={id}` Lists chunks in a specified document. #### Request - Method: GET - URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={chunk_id}` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request GET \ --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks?keywords={keywords}&page={page}&page_size={page_size}&id={chunk_id} \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `dataset_id`: (*Path parameter*) The associated dataset ID. - `document_id`: (*Path parameter*) The associated document ID. 
- `keywords`: (*Filter parameter*), `string`
  The keywords used to match chunk content.
- `page`: (*Filter parameter*), `integer`
  Specifies the page on which the chunks will be displayed. Defaults to `1`.
- `page_size`: (*Filter parameter*), `integer`
  The maximum number of chunks on each page. Defaults to `1024`.
- `id`: (*Filter parameter*), `string`
  The ID of the chunk to retrieve.

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "chunks": [
            {
                "available": true,
                "content": "This is a test content.",
                "docnm_kwd": "1.txt",
                "document_id": "b330ec2e91ec11efbc510242ac120004",
                "id": "b48c170e90f70af998485c1065490726",
                "image_id": "",
                "important_keywords": "",
                "positions": [
                    ""
                ]
            }
        ],
        "doc": {
            "chunk_count": 1,
            "chunk_method": "naive",
            "create_date": "Thu, 24 Oct 2024 09:45:27 GMT",
            "create_time": 1729763127646,
            "created_by": "69736c5e723611efb51b0242ac120007",
            "dataset_id": "527fa74891e811ef9c650242ac120006",
            "id": "b330ec2e91ec11efbc510242ac120004",
            "location": "1.txt",
            "name": "1.txt",
            "parser_config": {
                "chunk_token_num": 128,
                "delimiter": "\\n",
                "html4excel": false,
                "layout_recognize": true,
                "raptor": {
                    "use_raptor": false
                }
            },
            "process_begin_at": "Thu, 24 Oct 2024 09:56:44 GMT",
            "process_duration": 0.54213,
            "progress": 0.0,
            "progress_msg": "Task dispatched...",
            "run": "2",
            "size": 17966,
            "source_type": "local",
            "status": "1",
            "thumbnail": "",
            "token_count": 8,
            "type": "doc",
            "update_date": "Thu, 24 Oct 2024 11:03:15 GMT",
            "update_time": 1729767795721
        },
        "total": 1
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "You don't own the document 5c5999ec7be811ef9cab0242ac12000e5."
}
```

---

### Delete chunks

**DELETE** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks`

Deletes chunks by ID.

#### Request

- Method: DELETE
- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"chunk_ids"`: `list[string]`

##### Request example

```bash
curl --request DELETE \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
     "chunk_ids": ["test_1", "test_2"]
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `document_id`: (*Path parameter*)
  The associated document ID.
- `"chunk_ids"`: (*Body parameter*), `list[string]`
  The IDs of the chunks to delete. If it is not specified, all chunks of the specified document will be deleted.

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "`chunk_ids` is required"
}
```

---

### Update chunk

**PUT** `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks/{chunk_id}`

Updates content or configurations for a specified chunk.

#### Request

- Method: PUT
- URL: `/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks/{chunk_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"content"`: `string`
  - `"important_keywords"`: `list[string]`
  - `"available"`: `boolean`

##### Request example

```bash
curl --request PUT \
     --url http://{address}/api/v1/datasets/{dataset_id}/documents/{document_id}/chunks/{chunk_id} \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
     "content": "ragflow123",
     "important_keywords": []
     }'
```

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `document_id`: (*Path parameter*)
  The associated document ID.
- `chunk_id`: (*Path parameter*)
  The ID of the chunk to update.
- `"content"`: (*Body parameter*), `string`
  The text content of the chunk.
- `"important_keywords"`: (*Body parameter*), `list[string]`
  A list of key terms or phrases to tag with the chunk.
- `"available"`: (*Body parameter*), `boolean`
  The chunk's availability status in the dataset. Value options:
  - `true`: Available (default)
  - `false`: Unavailable

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "Can't find this chunk 29a2d9987e16ba331fb4d7d30d99b71d2"
}
```

---

### Retrieve a metadata summary from a dataset

**GET** `/api/v1/datasets/{dataset_id}/metadata/summary`

Aggregates metadata values across all documents in a dataset.

#### Request

- Method: GET
- URL: `/api/v1/datasets/{dataset_id}/metadata/summary`
- Headers:
  - `'Authorization: Bearer '`

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "summary": {
            "tags": [["bar", 2], ["foo", 1], ["baz", 1]],
            "author": [["alice", 2], ["bob", 1]]
        }
    }
}
```

---

### Update or delete metadata

**POST** `/api/v1/datasets/{dataset_id}/metadata/update`

Batch-updates or deletes document-level metadata within a specified dataset. If both `document_ids` and `metadata_condition` are omitted, all documents within the dataset are selected. When both are provided, their intersection is used.

#### Request

- Method: POST
- URL: `/api/v1/datasets/{dataset_id}/metadata/update`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"selector"`: `object`
  - `"updates"`: `list[object]`
  - `"deletes"`: `list[object]`

##### Request parameters

- `dataset_id`: (*Path parameter*)
  The associated dataset ID.
- `"selector"`: (*Body parameter*), `object`, *Optional*
  A document selector:
  - `"document_ids"`: `list[string]`, *Optional*
    The IDs of the documents to select.
  - `"metadata_condition"`: `object`, *Optional*
    - `"logic"`: Defines the logical relation between conditions when multiple conditions are provided. Options:
      - `"and"` (default)
      - `"or"`
    - `"conditions"`: `list[object]`, *Optional*
      Each object: `{ "name": string, "comparison_operator": string, "value": string }`
      - `"name"`: `string` The key name to search by.
      - `"comparison_operator"`: `string` Available options:
        - `"is"`
        - `"not is"`
        - `"contains"`
        - `"not contains"`
        - `"in"`
        - `"not in"`
        - `"start with"`
        - `"end with"`
        - `">"`
        - `"<"`
        - `"≥"`
        - `"≤"`
        - `"empty"`
        - `"not empty"`
      - `"value"`: `string` The key value to search by.
- `"updates"`: (*Body parameter*), `list[object]`, *Optional*
  Replaces metadata values of the selected documents. Each object: `{ "key": string, "match": string, "value": string }`.
  - `"key"`: `string` The name of the key to update.
  - `"match"`: `string`, *Optional* The current value of the key to update. When omitted, the corresponding keys are updated to `"value"` regardless of their current values.
  - `"value"`: `string` The new value to set for the specified keys.
- `"deletes"`: (*Body parameter*), `list[object]`, *Optional*
  Deletes metadata of the selected documents. Each object: `{ "key": string, "value": string }`.
  - `"key"`: `string` The name of the key to delete.
  - `"value"`: `string`, *Optional* The value of the key to delete.
    - When provided, only keys with a matching value are deleted.
    - When omitted, all specified keys are deleted.
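:::tip NOTE
When the selector contains both `"document_ids"` and `"metadata_condition"`, only documents matching *both* are affected. The following sketch (document IDs are placeholders) renames the tag `"foo"` to `"foo_new"` only for the two listed documents that are also authored by `alice`:

```bash
curl --request POST \
     --url http://{address}/api/v1/datasets/{dataset_id}/metadata/update \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
          "selector": {
               "document_ids": ["97a5f1c2759811efaa500242ac120004", "97ad64b6759811ef9fc30242ac120004"],
               "metadata_condition": {
                    "conditions": [
                         {"name": "author", "comparison_operator": "is", "value": "alice"}
                    ]
               }
          },
          "updates": [
               {"key": "tags", "match": "foo", "value": "foo_new"}
          ]
     }'
```
:::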
##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/datasets/{dataset_id}/metadata/update \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data '{ "selector": { "metadata_condition": { "logic": "and", "conditions": [ {"name": "author", "comparison_operator": "is", "value": "alice"} ] } }, "updates": [ {"key": "tags", "match": "foo", "value": "foo_new"} ], "deletes": [ {"key": "obsolete_key"}, {"key": "author", "value": "alice"} ] }' ``` ##### Response Success: ```json { "code": 0, "data": { "updated": 1, "matched_docs": 2 } } ``` --- ### Retrieve chunks **POST** `/api/v1/retrieval` Retrieves chunks from specified datasets. #### Request - Method: POST - URL: `/api/v1/retrieval` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"question"`: `string` - `"dataset_ids"`: `list[string]` - `"document_ids"`: `list[string]` - `"page"`: `integer` - `"page_size"`: `integer` - `"similarity_threshold"`: `float` - `"vector_similarity_weight"`: `float` - `"top_k"`: `integer` - `"rerank_id"`: `string` - `"keyword"`: `boolean` - `"highlight"`: `boolean` - `"cross_languages"`: `list[string]` - `"metadata_condition"`: `object` - `"use_kg"`: `boolean` - `"toc_enhance"`: `boolean` ##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/retrieval \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "question": "What is advantage of ragflow?", "dataset_ids": ["b2a62730759d11ef987d0242ac120004"], "document_ids": ["77df9ef4759a11ef8bdd0242ac120004"], "metadata_condition": { "logic": "and", "conditions": [ { "name": "author", "comparison_operator": "=", "value": "Toby" }, { "name": "url", "comparison_operator": "not contains", "value": "amd" } ] } }' ``` ##### Request parameter - `"question"`: (*Body parameter*), `string`, *Required* The user query or query keywords. - `"dataset_ids"`: (*Body parameter*) `list[string]` The IDs of the datasets to search. If you do not set this argument, ensure that you set `"document_ids"`. - `"document_ids"`: (*Body parameter*), `list[string]` The IDs of the documents to search. Ensure that all selected documents use the same embedding model. Otherwise, an error will occur. If you do not set this argument, ensure that you set `"dataset_ids"`. - `"page"`: (*Body parameter*), `integer` Specifies the page on which the chunks will be displayed. Defaults to `1`. - `"page_size"`: (*Body parameter*) The maximum number of chunks on each page. Defaults to `30`. - `"similarity_threshold"`: (*Body parameter*) The minimum similarity score. Defaults to `0.2`. - `"vector_similarity_weight"`: (*Body parameter*), `float` The weight of vector cosine similarity. Defaults to `0.3`. If x represents the weight of vector cosine similarity, then (1 - x) is the term similarity weight. - `"top_k"`: (*Body parameter*), `integer` The number of chunks engaged in vector cosine computation. Defaults to `1024`. - `"use_kg"`: (*Body parameter*), `boolean` Whether to search chunks related to the generated knowledge graph for multi-hop queries. Defaults to `False`. Before enabling this, ensure you have successfully constructed a knowledge graph for the specified datasets. See [here](https://ragflow.io/docs/dev/construct_knowledge_graph) for details. - `"toc_enhance"`: (*Body parameter*), `boolean` Whether to search chunks with extracted table of content. Defaults to `False`. 
Before enabling this, ensure you have enabled `TOC_Enhance` and successfully extracted the table of contents for the specified datasets. See [here](https://ragflow.io/docs/dev/enable_table_of_contents) for details.
- `"rerank_id"`: (*Body parameter*), `string`
  The ID of the rerank model.
- `"keyword"`: (*Body parameter*), `boolean`
  Indicates whether to enable keyword-based matching:
  - `true`: Enable keyword-based matching.
  - `false`: Disable keyword-based matching (default).
- `"highlight"`: (*Body parameter*), `boolean`
  Specifies whether to enable highlighting of matched terms in the results:
  - `true`: Enable highlighting of matched terms.
  - `false`: Disable highlighting of matched terms (default).
- `"cross_languages"`: (*Body parameter*), `list[string]`
  The target languages into which the query should be translated, enabling keyword retrieval across languages.
- `"metadata_condition"`: (*Body parameter*), `object`
  The metadata condition used for filtering chunks:
  - `"logic"`: (*Body parameter*), `string`
    - `"and"`: Return only results that satisfy *every* condition (default).
    - `"or"`: Return results that satisfy *any* condition.
  - `"conditions"`: (*Body parameter*), `array`
    A list of metadata filter conditions.
    - `"name"`: `string` The metadata field name to filter by, e.g., `"author"`, `"company"`, `"url"`. Ensure this metadata field is set before use. See [Set metadata](../guides/dataset/set_metadata.md) for details.
    - `"comparison_operator"`: `string` The comparison operator. Can be one of:
      - `"contains"`
      - `"not contains"`
      - `"start with"`
      - `"empty"`
      - `"not empty"`
      - `"="`
      - `"≠"`
      - `">"`
      - `"<"`
      - `"≥"`
      - `"≤"`
    - `"value"`: `string` The value to compare.

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "chunks": [
            {
                "content": "ragflow content",
                "content_ltks": "ragflow content",
                "document_id": "5c5999ec7be811ef9cab0242ac120005",
                "document_keyword": "1.txt",
                "highlight": "ragflow content",
                "id": "d78435d142bd5cf6704da62c778795c5",
                "image_id": "",
                "important_keywords": [
                    ""
                ],
                "kb_id": "c7ee74067a2c11efb21c0242ac120006",
                "positions": [
                    ""
                ],
                "similarity": 0.9669436601210759,
                "term_similarity": 1.0,
                "vector_similarity": 0.8898122004035864
            }
        ],
        "doc_aggs": [
            {
                "count": 1,
                "doc_id": "5c5999ec7be811ef9cab0242ac120005",
                "doc_name": "1.txt"
            }
        ],
        "total": 1
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "`datasets` is required."
}
```

---

## CHAT ASSISTANT MANAGEMENT

---

### Create chat assistant

**POST** `/api/v1/chats`

Creates a chat assistant.

#### Request

- Method: POST
- URL: `/api/v1/chats`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"avatar"`: `string`
  - `"dataset_ids"`: `list[string]`
  - `"llm"`: `object`
  - `"prompt"`: `object`

##### Request example

```shell
curl --request POST \
     --url http://{address}/api/v1/chats \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
        "dataset_ids": ["0b2cbc8c877f11ef89070242ac120005"],
        "name":"new_chat_1"
     }'
```

##### Request parameters

- `"name"`: (*Body parameter*), `string`, *Required*
  The name of the chat assistant.
- `"avatar"`: (*Body parameter*), `string`
  Base64 encoding of the avatar.
- `"dataset_ids"`: (*Body parameter*), `list[string]`
  The IDs of the associated datasets.
- `"llm"`: (*Body parameter*), `object`
  The LLM settings for the chat assistant to create. If it is not explicitly set, a JSON object with the following values will be generated as the default.
An `llm` JSON object contains the following attributes:

- `"model_name"`: `string`
  The chat model name. If not set, the user's default chat model will be used.

:::caution WARNING
`model_type` is an *internal* parameter, serving solely as a temporary workaround for the current model-configuration design limitations. Its main purpose is to let *multimodal* models (stored in the database as `"image2text"`) pass backend validation/dispatching. Be mindful that:

- Do *not* treat it as a stable public API.
- It is subject to change or removal in future releases.
:::

- `"model_type"`: `string`
  A model type specifier. Only `"chat"` and `"image2text"` are recognized; any other input, or omission, is treated as `"chat"`.
- `"temperature"`: `float`
  Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
- `"top_p"`: `float`
  Also known as "nucleus sampling", this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`.
- `"presence_penalty"`: `float`
  This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.4`.
- `"frequency_penalty"`: `float`
  Similar to the presence penalty, this reduces the model's tendency to repeat the same words frequently. Defaults to `0.7`.
- `"prompt"`: (*Body parameter*), `object`
  Instructions for the LLM to follow. If it is not explicitly set, a JSON object with the following values will be generated as the default.

A `prompt` JSON object contains the following attributes:

- `"similarity_threshold"`: `float`
  RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted reranking score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
- `"keywords_similarity_weight"`: `float`
  This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`.
- `"top_n"`: `int`
  This argument specifies the number of top chunks with similarity scores above the `similarity_threshold` that are fed to the LLM. The LLM will *only* access these 'top N' chunks. The default value is `6`.
- `"variables"`: `object[]`
  This argument lists the variables to use in the 'System' field of **Chat Configurations**. Note that:
  - `"knowledge"` is a reserved variable, which represents the retrieved chunks.
  - All the variables in 'System' should be enclosed in curly braces.
  - The default value is `[{"key": "knowledge", "optional": true}]`.
- `"rerank_model"`: `string`
  If it is not specified, vector cosine similarity will be used; otherwise, reranking score will be used.
- `"top_k"`: `int`
  The number of chunks engaged in vector cosine computation or reranking. Defaults to `1024`.
- `"empty_response"`: `string` If nothing is retrieved in the dataset for the user's question, this will be used as the response. To allow the LLM to improvise when nothing is found, leave this blank. - `"opener"`: `string` The opening greeting for the user. Defaults to `"Hi! I am your assistant, can I help you?"`. - `"show_quote`: `boolean` Indicates whether the source of text should be displayed. Defaults to `true`. - `"prompt"`: `string` The prompt content. #### Response Success: ```json { "code": 0, "data": { "avatar": "", "create_date": "Thu, 24 Oct 2024 11:18:29 GMT", "create_time": 1729768709023, "dataset_ids": [ "527fa74891e811ef9c650242ac120006" ], "description": "A helpful Assistant", "do_refer": "1", "id": "b1f2f15691f911ef81180242ac120003", "language": "English", "llm": { "frequency_penalty": 0.7, "model_name": "qwen-plus@Tongyi-Qianwen", "presence_penalty": 0.4, "temperature": 0.1, "top_p": 0.3 }, "name": "12234", "prompt": { "empty_response": "Sorry! No relevant content was found in the knowledge base!", "keywords_similarity_weight": 0.3, "opener": "Hi! I'm your assistant. What can I do for you?", "prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n ", "rerank_model": "", "similarity_threshold": 0.2, "top_n": 6, "variables": [ { "key": "knowledge", "optional": false } ] }, "prompt_type": "simple", "status": "1", "tenant_id": "69736c5e723611efb51b0242ac120007", "top_k": 1024, "update_date": "Thu, 24 Oct 2024 11:18:29 GMT", "update_time": 1729768709023 } } ``` Failure: ```json { "code": 102, "message": "Duplicated chat name in creating dataset." } ``` --- ### Update chat assistant **PUT** `/api/v1/chats/{chat_id}` Updates configurations for a specified chat assistant. #### Request - Method: PUT - URL: `/api/v1/chats/{chat_id}` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"name"`: `string` - `"avatar"`: `string` - `"dataset_ids"`: `list[string]` - `"llm"`: `object` - `"prompt"`: `object` ##### Request example ```bash curl --request PUT \ --url http://{address}/api/v1/chats/{chat_id} \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "name":"Test" }' ``` #### Parameters - `chat_id`: (*Path parameter*) The ID of the chat assistant to update. - `"name"`: (*Body parameter*), `string`, *Required* The revised name of the chat assistant. - `"avatar"`: (*Body parameter*), `string` Base64 encoding of the avatar. - `"dataset_ids"`: (*Body parameter*), `list[string]` The IDs of the associated datasets. - `"llm"`: (*Body parameter*), `object` The LLM settings for the chat assistant to create. If it is not explicitly set, a dictionary with the following values will be generated as the default. An `llm` object contains the following attributes: - `"model_name"`, `string` The chat model name. If not set, the user's default chat model will be used. - `"temperature"`: `float` Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`. 
- `"top_p"`: `float` Also known as “nucleus sampling”, this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3` - `"presence_penalty"`: `float` This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`. - `"frequency penalty"`: `float` Similar to the presence penalty, this reduces the model’s tendency to repeat the same words frequently. Defaults to `0.7`. - `"prompt"`: (*Body parameter*), `object` Instructions for the LLM to follow. A `prompt` object contains the following attributes: - `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted rerank score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`. - `"keywords_similarity_weight"`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`. - `"top_n"`: `int` This argument specifies the number of top chunks with similarity scores above the `similarity_threshold` that are fed to the LLM. The LLM will *only* access these 'top N' chunks. The default value is `8`. - `"variables"`: `object[]` This argument lists the variables to use in the 'System' field of **Chat Configurations**. Note that: - `"knowledge"` is a reserved variable, which represents the retrieved chunks. - All the variables in 'System' should be curly bracketed. - The default value is `[{"key": "knowledge", "optional": true}]` - `"rerank_model"`: `string` If it is not specified, vector cosine similarity will be used; otherwise, reranking score will be used. - `"empty_response"`: `string` If nothing is retrieved in the dataset for the user's question, this will be used as the response. To allow the LLM to improvise when nothing is found, leave this blank. - `"opener"`: `string` The opening greeting for the user. Defaults to `"Hi! I am your assistant, can I help you?"`. - `"show_quote`: `boolean` Indicates whether the source of text should be displayed. Defaults to `true`. - `"prompt"`: `string` The prompt content. #### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "Duplicated chat name in updating dataset." } ``` --- ### Delete chat assistants **DELETE** `/api/v1/chats` Deletes chat assistants by ID. #### Request - Method: DELETE - URL: `/api/v1/chats` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"ids"`: `list[string]` ##### Request example ```bash curl --request DELETE \ --url http://{address}/api/v1/chats \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "ids": ["test_1", "test_2"] }' ``` ##### Request parameters - `"ids"`: (*Body parameter*), `list[string]` The IDs of the chat assistants to delete. If it is not specified, all chat assistants in the system will be deleted. 
#### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "ids are required" } ``` --- ### List chat assistants **GET** `/api/v1/chats?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}` Lists chat assistants. #### Request - Method: GET - URL: `/api/v1/chats?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id}` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request GET \ --url http://{address}/api/v1/chats?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={chat_name}&id={chat_id} \ --header 'Authorization: Bearer ' ``` ##### Request parameters - `page`: (*Filter parameter*), `integer` Specifies the page on which the chat assistants will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*), `integer` The number of chat assistants on each page. Defaults to `30`. - `orderby`: (*Filter parameter*), `string` The attribute by which the results are sorted. Available options: - `create_time` (default) - `update_time` - `desc`: (*Filter parameter*), `boolean` Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `true`. - `id`: (*Filter parameter*), `string` The ID of the chat assistant to retrieve. - `name`: (*Filter parameter*), `string` The name of the chat assistant to retrieve. #### Response Success: ```json { "code": 0, "data": [ { "avatar": "", "create_date": "Fri, 18 Oct 2024 06:20:06 GMT", "create_time": 1729232406637, "description": "A helpful Assistant", "do_refer": "1", "id": "04d0d8e28d1911efa3630242ac120006", "dataset_ids": ["527fa74891e811ef9c650242ac120006"], "language": "English", "llm": { "frequency_penalty": 0.7, "model_name": "qwen-plus@Tongyi-Qianwen", "presence_penalty": 0.4, "temperature": 0.1, "top_p": 0.3 }, "name": "13243", "prompt": { "empty_response": "Sorry! No relevant content was found in the knowledge base!", "keywords_similarity_weight": 0.3, "opener": "Hi! I'm your assistant. What can I do for you?", "prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence \"The answer you are looking for is not found in the knowledge base!\" Answers need to consider chat history.\n", "rerank_model": "", "similarity_threshold": 0.2, "top_n": 6, "variables": [ { "key": "knowledge", "optional": false } ] }, "prompt_type": "simple", "status": "1", "tenant_id": "69736c5e723611efb51b0242ac120007", "top_k": 1024, "update_date": "Fri, 18 Oct 2024 06:20:06 GMT", "update_time": 1729232406638 } ] } ``` Failure: ```json { "code": 102, "message": "The chat doesn't exist" } ``` --- ## SESSION MANAGEMENT --- ### Create session with chat assistant **POST** `/api/v1/chats/{chat_id}/sessions` Creates a session with a chat assistant. 
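A session stores the message history that subsequent calls to **Converse with chat assistant** (documented below) build on. The following is a minimal sketch of the typical flow, assuming `jq` is available for extracting the session ID (the address, chat ID, and session name are placeholders):

```bash
# 1. Create a session and capture its ID from the response body (requires jq).
SESSION_ID=$(curl --silent --request POST \
     --url http://{address}/api/v1/chats/{chat_id}/sessions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{"name": "my session"}' | jq -r '.data.id')

# 2. Ask questions against that session so follow-up turns share its history.
curl --request POST \
     --url http://{address}/api/v1/chats/{chat_id}/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data "{\"question\": \"Who are you\", \"stream\": false, \"session_id\": \"$SESSION_ID\"}"
```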
#### Request

- Method: POST
- URL: `/api/v1/chats/{chat_id}/sessions`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"user_id"`: `string` (optional)

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/chats/{chat_id}/sessions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
          "name": "new session"
     }'
```

##### Request parameters

- `chat_id`: (*Path parameter*)
  The ID of the associated chat assistant.
- `"name"`: (*Body parameter*), `string`
  The name of the chat session to create.
- `"user_id"`: (*Body parameter*), `string`
  Optional user-defined ID.

#### Response

Success:

```json
{
    "code": 0,
    "data": {
        "chat_id": "2ca4b22e878011ef88fe0242ac120005",
        "create_date": "Fri, 11 Oct 2024 08:46:14 GMT",
        "create_time": 1728636374571,
        "id": "4606b4ec87ad11efbc4f0242ac120006",
        "messages": [
            {
                "content": "Hi! I am your assistant, can I help you?",
                "role": "assistant"
            }
        ],
        "name": "new session",
        "update_date": "Fri, 11 Oct 2024 08:46:14 GMT",
        "update_time": 1728636374571
    }
}
```

Failure:

```json
{
    "code": 102,
    "message": "Name cannot be empty."
}
```

---

### Update chat assistant's session

**PUT** `/api/v1/chats/{chat_id}/sessions/{session_id}`

Updates a session of a specified chat assistant.

#### Request

- Method: PUT
- URL: `/api/v1/chats/{chat_id}/sessions/{session_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"user_id"`: `string` (optional)

##### Request example

```bash
curl --request PUT \
     --url http://{address}/api/v1/chats/{chat_id}/sessions/{session_id} \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
          "name": ""
     }'
```

##### Request parameters

- `chat_id`: (*Path parameter*)
  The ID of the associated chat assistant.
- `session_id`: (*Path parameter*)
  The ID of the session to update.
- `"name"`: (*Body parameter*), `string`
  The revised name of the session.
- `"user_id"`: (*Body parameter*), `string`
  Optional user-defined ID.

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "Name cannot be empty."
}
```

---

### List chat assistant's sessions

**GET** `/api/v1/chats/{chat_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={session_name}&id={session_id}`

Lists sessions associated with a specified chat assistant.

#### Request

- Method: GET
- URL: `/api/v1/chats/{chat_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={session_name}&id={session_id}&user_id={user_id}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/chats/{chat_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={session_name}&id={session_id} \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `chat_id`: (*Path parameter*)
  The ID of the associated chat assistant.
- `page`: (*Filter parameter*), `integer`
  Specifies the page on which the sessions will be displayed. Defaults to `1`.
- `page_size`: (*Filter parameter*), `integer`
  The number of sessions on each page. Defaults to `30`.
- `orderby`: (*Filter parameter*), `string`
  The field by which sessions should be sorted. Available options:
  - `create_time` (default)
  - `update_time`
- `desc`: (*Filter parameter*), `boolean`
  Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `true`.
- `name`: (*Filter parameter*) `string` The name of the chat session to retrieve. - `id`: (*Filter parameter*), `string` The ID of the chat session to retrieve. - `user_id`: (*Filter parameter*), `string` The optional user-defined ID passed in when creating session. #### Response Success: ```json { "code": 0, "data": [ { "chat": "2ca4b22e878011ef88fe0242ac120005", "create_date": "Fri, 11 Oct 2024 08:46:43 GMT", "create_time": 1728636403974, "id": "578d541e87ad11ef96b90242ac120006", "messages": [ { "content": "Hi! I am your assistant, can I help you?", "role": "assistant" } ], "name": "new session", "update_date": "Fri, 11 Oct 2024 08:46:43 GMT", "update_time": 1728636403974 } ] } ``` Failure: ```json { "code": 102, "message": "The session doesn't exist" } ``` --- ### Delete chat assistant's sessions **DELETE** `/api/v1/chats/{chat_id}/sessions` Deletes sessions of a chat assistant by ID. #### Request - Method: DELETE - URL: `/api/v1/chats/{chat_id}/sessions` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"ids"`: `list[string]` ##### Request example ```bash curl --request DELETE \ --url http://{address}/api/v1/chats/{chat_id}/sessions \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data ' { "ids": ["test_1", "test_2"] }' ``` ##### Request Parameters - `chat_id`: (*Path parameter*) The ID of the associated chat assistant. - `"ids"`: (*Body Parameter*), `list[string]` The IDs of the sessions to delete. If it is not specified, all sessions associated with the specified chat assistant will be deleted. #### Response Success: ```json { "code": 0 } ``` Failure: ```json { "code": 102, "message": "The chat doesn't own the session" } ``` --- ### Converse with chat assistant **POST** `/api/v1/chats/{chat_id}/completions` Asks a specified chat assistant a question to start an AI-powered conversation. :::tip NOTE - In streaming mode, not all responses include a reference, as this depends on the system's judgement. - In streaming mode, the last message is an empty message: ```json data: { "code": 0, "data": true } ``` ::: #### Request - Method: POST - URL: `/api/v1/chats/{chat_id}/completions` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"question"`: `string` - `"stream"`: `boolean` - `"session_id"`: `string` (optional) - `"user_id`: `string` (optional) - `"metadata_condition"`: `object` (optional) ##### Request example ```bash curl --request POST \ --url http://{address}/api/v1/chats/{chat_id}/completions \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data-binary ' { }' ``` ```bash curl --request POST \ --url http://{address}/api/v1/chats/{chat_id}/completions \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data-binary ' { "question": "Who are you", "stream": true, "session_id":"9fa7691cb85c11ef9c5f0242ac120005", "metadata_condition": { "logic": "and", "conditions": [ { "name": "author", "comparison_operator": "is", "value": "bob" } ] } }' ``` ##### Request Parameters - `chat_id`: (*Path parameter*) The ID of the associated chat assistant. - `"question"`: (*Body Parameter*), `string`, *Required* The question to start an AI-powered conversation. - `"stream"`: (*Body Parameter*), `boolean` Indicates whether to output responses in a streaming way: - `true`: Enable streaming (default). - `false`: Disable streaming. - `"session_id"`: (*Body Parameter*) The ID of session. 
If it is not provided, a new session will be generated. - `"user_id"`: (*Body parameter*), `string` The optional user-defined ID. Valid *only* when no `session_id` is provided. - `"metadata_condition"`: (*Body parameter*), `object` Optional metadata filter conditions applied to retrieval results. - `logic`: `string`, one of `and` / `or` - `conditions`: `list[object]` where each condition contains: - `name`: `string` metadata key - `comparison_operator`: `string` (e.g. `is`, `not is`, `contains`, `not contains`, `start with`, `end with`, `empty`, `not empty`, `>`, `<`, `≥`, `≤`) - `value`: `string|number|boolean` (optional for `empty`/`not empty`) #### Response Success without `session_id`: ```json data:{ "code": 0, "message": "", "data": { "answer": "Hi! I'm your assistant. What can I do for you?", "reference": {}, "audio_binary": null, "id": null, "session_id": "b01eed84b85611efa0e90242ac120005" } } data:{ "code": 0, "message": "", "data": true } ``` Success with `session_id`: ```json data:{ "code": 0, "data": { "answer": "I am an intelligent assistant designed to help answer questions by summarizing content from a", "reference": {}, "audio_binary": null, "id": "a84c5dd4-97b4-4624-8c3b-974012c8000d", "session_id": "82b0ab2a9c1911ef9d870242ac120006" } } data:{ "code": 0, "data": { "answer": "I am an intelligent assistant designed to help answer questions by summarizing content from a knowledge base. My responses are based on the information available in the knowledge base and", "reference": {}, "audio_binary": null, "id": "a84c5dd4-97b4-4624-8c3b-974012c8000d", "session_id": "82b0ab2a9c1911ef9d870242ac120006" } } data:{ "code": 0, "data": { "answer": "I am an intelligent assistant designed to help answer questions by summarizing content from a knowledge base. My responses are based on the information available in the knowledge base and any relevant chat history.", "reference": {}, "audio_binary": null, "id": "a84c5dd4-97b4-4624-8c3b-974012c8000d", "session_id": "82b0ab2a9c1911ef9d870242ac120006" } } data:{ "code": 0, "data": { "answer": "I am an intelligent assistant designed to help answer questions by summarizing content from a knowledge base ##0$$. My responses are based on the information available in the knowledge base and any relevant chat history.", "reference": { "total": 1, "chunks": [ { "id": "faf26c791128f2d5e821f822671063bd", "content": "xxxxxxxx", "document_id": "dd58f58e888511ef89c90242ac120006", "document_name": "1.txt", "dataset_id": "8e83e57a884611ef9d760242ac120006", "image_id": "", "url": null, "similarity": 0.7, "vector_similarity": 0.0, "term_similarity": 1.0, "doc_type": [], "positions": [ "" ] } ], "doc_aggs": [ { "doc_name": "1.txt", "doc_id": "dd58f58e888511ef89c90242ac120006", "count": 1 } ] }, "prompt": "xxxxxxxxxxx", "created_at": 1755055623.6401553, "id": "a84c5dd4-97b4-4624-8c3b-974012c8000d", "session_id": "82b0ab2a9c1911ef9d870242ac120006" } } data:{ "code": 0, "data": true } ``` Failure: ```json { "code": 102, "message": "Please input your question." } ``` --- ### Create session with agent :::danger DEPRECATED This method is deprecated and not recommended. You can still call it but be mindful that calling `Converse with agent` will automatically generate a session ID for the associated agent. ::: **POST** `/api/v1/agents/{agent_id}/sessions` Creates a session with an agent. 
#### Request - Method: POST - URL: `/api/v1/agents/{agent_id}/sessions?user_id={user_id}` - Headers: - `'content-Type: application/json' - `'Authorization: Bearer '` - Body: - the required parameters:`str` - other parameters: The variables specified in the **Begin** component. ##### Request example If the **Begin** component in your agent does not take required parameters: ```bash curl --request POST \ --url http://{address}/api/v1/agents/{agent_id}/sessions \ --header 'Content-Type: application/json' \ --header 'Authorization: Bearer ' \ --data '{ }' ``` ##### Request parameters - `agent_id`: (*Path parameter*) The ID of the associated agent. - `user_id`: (*Filter parameter*) The optional user-defined ID for parsing docs (especially images) when creating a session while uploading files. #### Response Success: ```json { "code": 0, "data": { "agent_id": "dbb4ed366e8611f09690a55a6daec4ef", "dsl": { "components": { "Message:EightyJobsAsk": { "downstream": [], "obj": { "component_name": "Message", "params": { "content": [ "{begin@var1}{begin@var2}" ], "debug_inputs": {}, "delay_after_error": 2.0, "description": "", "exception_default_value": null, "exception_goto": null, "exception_method": null, "inputs": {}, "max_retries": 0, "message_history_window_size": 22, "outputs": { "content": { "type": "str", "value": null } }, "stream": true } }, "upstream": [ "begin" ] }, "begin": { "downstream": [ "Message:EightyJobsAsk" ], "obj": { "component_name": "Begin", "params": { "debug_inputs": {}, "delay_after_error": 2.0, "description": "", "enablePrologue": true, "enable_tips": true, "exception_default_value": null, "exception_goto": null, "exception_method": null, "inputs": { "var1": { "name": "var1", "optional": false, "options": [], "type": "line", "value": null }, "var2": { "name": "var2", "optional": false, "options": [], "type": "line", "value": null } }, "max_retries": 0, "message_history_window_size": 22, "mode": "conversational", "outputs": {}, "prologue": "Hi! I'm your assistant. What can I do for you?", "tips": "Please fill in the form" } }, "upstream": [] } }, "globals": { "sys.conversation_turns": 0, "sys.files": [], "sys.query": "", "sys.user_id": "" }, "graph": { "edges": [ { "data": { "isHovered": false }, "id": "xy-edge__beginstart-Message:EightyJobsAskend", "markerEnd": "logo", "source": "begin", "sourceHandle": "start", "style": { "stroke": "rgba(151, 154, 171, 1)", "strokeWidth": 1 }, "target": "Message:EightyJobsAsk", "targetHandle": "end", "type": "buttonEdge", "zIndex": 1001 } ], "nodes": [ { "data": { "form": { "enablePrologue": true, "inputs": { "var1": { "name": "var1", "optional": false, "options": [], "type": "line" }, "var2": { "name": "var2", "optional": false, "options": [], "type": "line" } }, "mode": "conversational", "prologue": "Hi! I'm your assistant. What can I do for you?" 
}, "label": "Begin", "name": "begin" }, "dragging": false, "id": "begin", "measured": { "height": 112, "width": 200 }, "position": { "x": 270.64098070942583, "y": -56.320928437811176 }, "selected": false, "sourcePosition": "left", "targetPosition": "right", "type": "beginNode" }, { "data": { "form": { "content": [ "{begin@var1}{begin@var2}" ] }, "label": "Message", "name": "Message_0" }, "dragging": false, "id": "Message:EightyJobsAsk", "measured": { "height": 57, "width": 200 }, "position": { "x": 279.5, "y": 190 }, "selected": true, "sourcePosition": "right", "targetPosition": "left", "type": "messageNode" } ] }, "history": [], "memory": [], "messages": [], "path": [], "retrieval": [], "task_id": "dbb4ed366e8611f09690a55a6daec4ef" }, "id": "0b02fe80780e11f084adcfdc3ed1d902", "message": [ { "content": "Hi! I'm your assistant. What can I do for you?", "role": "assistant" } ], "source": "agent", "user_id": "c3fb861af27a11efa69751e139332ced" } } ``` Failure: ```json { "code": 102, "message": "Agent not found." } ``` --- ### Converse with agent **POST** `/api/v1/agents/{agent_id}/completions` Asks a specified agent a question to start an AI-powered conversation. :::tip NOTE - In streaming mode, not all responses include a reference, as this depends on the system's judgement. - In streaming mode, the last message is an empty message: ``` [DONE] ``` - You can optionally return step-by-step trace logs (see `return_trace` below). ::: #### Request - Method: POST - URL: `/api/v1/agents/{agent_id}/completions` - Headers: - `'content-Type: application/json'` - `'Authorization: Bearer '` - Body: - `"question"`: `string` - `"stream"`: `boolean` - `"session_id"`: `string` (optional) - `"inputs"`: `object` (optional) - `"user_id"`: `string` (optional) - `"return_trace"`: `boolean` (optional, default `false`) — include execution trace logs. #### Streaming events to handle When `stream=true`, the server sends Server-Sent Events (SSE). Clients should handle these `event` types: - `message`: streaming content from Message components. - `message_end`: end of a Message component; may include `reference`/`attachment`. - `node_finished`: a component finishes; `data.inputs/outputs/error/elapsed_time` describe the node result. If `return_trace=true`, the trace is attached inside the same `node_finished` event (`data.trace`). The stream terminates with `[DONE]`. :::info IMPORTANT You can include custom parameters in the request body, but first ensure they are defined in the [Begin](../guides/agent/agent_component_reference/begin.mdx) component. 
:::

##### Request example

- If the **Begin** component does not take parameters:

```bash
curl --request POST \
     --url http://{address}/api/v1/agents/{agent_id}/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data-binary '
     {
          "question": "Hello",
          "stream": false
     }'
```

- If the **Begin** component takes parameters, include their values in the `"inputs"` field of the body as follows:

```bash
curl --request POST \
     --url http://{address}/api/v1/agents/{agent_id}/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data-binary '
     {
          "question": "Hello",
          "stream": false,
          "inputs": {
               "line_var": {
                    "type": "line",
                    "value": "I am line_var"
               },
               "int_var": {
                    "type": "integer",
                    "value": 1
               },
               "paragraph_var": {
                    "type": "paragraph",
                    "value": "a\nb\nc"
               },
               "option_var": {
                    "type": "options",
                    "value": "option 2"
               },
               "boolean_var": {
                    "type": "boolean",
                    "value": true
               }
          }
     }'
```

- The following request continues the conversation in an existing session:

```bash
curl --request POST \
     --url http://{address}/api/v1/agents/{agent_id}/completions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data-binary '
     {
          "question": "Hello",
          "stream": true,
          "session_id": "cb2f385cb86211efa36e0242ac120005"
     }'
```

##### Request parameters

- `agent_id`: (*Path parameter*), `string`
  The ID of the associated agent.
- `"question"`: (*Body parameter*), `string`, *Required*
  The question to start an AI-powered conversation.
- `"stream"`: (*Body parameter*), `boolean`
  Indicates whether to output responses in a streaming way:
  - `true`: Enable streaming (default).
  - `false`: Disable streaming.
- `"session_id"`: (*Body parameter*)
  The ID of the session. If it is not provided, a new session will be generated.
- `"inputs"`: (*Body parameter*)
  Variables specified in the **Begin** component.
- `"user_id"`: (*Body parameter*), `string`
  The optional user-defined ID. Valid *only* when no `session_id` is provided.

:::tip NOTE
For now, this method does *not* support a file type input/variable. As a workaround, use the following endpoint to upload a file to an agent: `http://{address}/v1/canvas/upload/{agent_id}`. *You will get a corresponding file ID from its response body.*
:::

#### Response

Success without `session_id` provided and with no variables specified in the **Begin** component:

Stream:

```json
...
data: {
    "event": "message",
    "message_id": "cecdcb0e83dc11f0858253708ecb6573",
    "created_at": 1756364483,
    "task_id": "d1f79142831f11f09cc51795b9eb07c0",
    "data": {
        "content": " themes"
    },
    "session_id": "cd097ca083dc11f0858253708ecb6573"
}

data: {
    "event": "message",
    "message_id": "cecdcb0e83dc11f0858253708ecb6573",
    "created_at": 1756364483,
    "task_id": "d1f79142831f11f09cc51795b9eb07c0",
    "data": {
        "content": "."
}, "session_id": "cd097ca083dc11f0858253708ecb6573" } data: { "event": "message_end", "message_id": "cecdcb0e83dc11f0858253708ecb6573", "created_at": 1756364483, "task_id": "d1f79142831f11f09cc51795b9eb07c0", "data": { "reference": { "chunks": { "20": { "id": "4b8935ac0a22deb1", "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.", "document_id": "4bdd2ff65e1511f0907f09f583941b45", "document_name": "INSTALL22.md", "dataset_id": "456ce60c5e1511f0907f09f583941b45", "image_id": "", "positions": [ [ 12, 11, 11, 11, 11 ] ], "url": null, "similarity": 0.5705525104787287, "vector_similarity": 0.7351750337624289, "term_similarity": 0.5000000005, "doc_type": "" } }, "doc_aggs": { "INSTALL22.md": { "doc_name": "INSTALL22.md", "doc_id": "4bdd2ff65e1511f0907f09f583941b45", "count": 3 }, "INSTALL.md": { "doc_name": "INSTALL.md", "doc_id": "4bd7fdd85e1511f0907f09f583941b45", "count": 2 }, "INSTALL(1).md": { "doc_name": "INSTALL(1).md", "doc_id": "4bdfb42e5e1511f0907f09f583941b45", "count": 2 }, "INSTALL3.md": { "doc_name": "INSTALL3.md", "doc_id": "4bdab5825e1511f0907f09f583941b45", "count": 1 } } } }, "session_id": "cd097ca083dc11f0858253708ecb6573" } data: { "event": "node_finished", "message_id": "cecdcb0e83dc11f0858253708ecb6573", "created_at": 1756364483, "task_id": "d1f79142831f11f09cc51795b9eb07c0", "data": { "inputs": { "sys.query": "how to install neovim?" }, "outputs": { "content": "xxxxxxx", "_created_time": 15294.0382, "_elapsed_time": 0.00017 }, "component_id": "Agent:EveryHairsChew", "component_name": "Agent_1", "component_type": "Agent", "error": null, "elapsed_time": 11.2091, "created_at": 15294.0382, "trace": [ { "component_id": "begin", "trace": [ { "inputs": {}, "outputs": { "_created_time": 15257.7949, "_elapsed_time": 0.00070 }, "component_id": "begin", "component_name": "begin", "component_type": "Begin", "error": null, "elapsed_time": 0.00085, "created_at": 15257.7949 } ] }, { "component_id": "Agent:WeakDragonsRead", "trace": [ { "inputs": { "sys.query": "how to install neovim?" }, "outputs": { "content": "xxxxxxx", "_created_time": 15257.7982, "_elapsed_time": 36.2382 }, "component_id": "Agent:WeakDragonsRead", "component_name": "Agent_0", "component_type": "Agent", "error": null, "elapsed_time": 36.2385, "created_at": 15257.7982 } ] }, { "component_id": "Agent:EveryHairsChew", "trace": [ { "inputs": { "sys.query": "how to install neovim?" }, "outputs": { "content": "xxxxxxxxxxxxxxxxx", "_created_time": 15294.0382, "_elapsed_time": 0.00017 }, "component_id": "Agent:EveryHairsChew", "component_name": "Agent_1", "component_type": "Agent", "error": null, "elapsed_time": 11.2091, "created_at": 15294.0382 } ] } ] }, "session_id": "cd097ca083dc11f0858253708ecb6573" } data:[DONE] ``` Non-stream: ```json { "code": 0, "data": { "created_at": 1756363177, "data": { "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For macOS:\nUsing Homebrew:\n```bash\nbrew install neovim\n```\n\n### For Linux (Debian/Ubuntu):\n```bash\nsudo apt update\nsudo apt install neovim\n```\n\nFor other Linux distributions, you can use their respective package managers or build from source.\n\n### For Windows:\n1. Download the latest Windows installer from the official Neovim GitHub releases page\n2. Run the installer and follow the prompts\n3. 
Add Neovim to your PATH if not done automatically\n\n### From source (Unix-like systems):\n```bash\ngit clone https://github.com/neovim/neovim.git\ncd neovim\nmake CMAKE_BUILD_TYPE=Release\nsudo make install\n```\n\nAfter installation, you can verify it by running `nvim --version` in your terminal.", "created_at": 18129.044975627, "elapsed_time": 10.0157331670016, "inputs": { "var1": { "value": "I am var1" }, "var2": { "value": "I am var2" } }, "outputs": { "_created_time": 18129.502422278, "_elapsed_time": 0.00013378599760471843, "content": "\nTo install Neovim, the process varies depending on your operating system:\n\n### For macOS:\nUsing Homebrew:\n```bash\nbrew install neovim\n```\n\n### For Linux (Debian/Ubuntu):\n```bash\nsudo apt update\nsudo apt install neovim\n```\n\nFor other Linux distributions, you can use their respective package managers or build from source.\n\n### For Windows:\n1. Download the latest Windows installer from the official Neovim GitHub releases page\n2. Run the installer and follow the prompts\n3. Add Neovim to your PATH if not done automatically\n\n### From source (Unix-like systems):\n```bash\ngit clone https://github.com/neovim/neovim.git\ncd neovim\nmake CMAKE_BUILD_TYPE=Release\nsudo make install\n```\n\nAfter installation, you can verify it by running `nvim --version` in your terminal." }, "reference": { "chunks": { "20": { "content": "```cd /usr/ports/editors/neovim/ && make install```## Android[Termux](https://github.com/termux/termux-app) offers a Neovim package.", "dataset_id": "456ce60c5e1511f0907f09f583941b45", "doc_type": "", "document_id": "4bdd2ff65e1511f0907f09f583941b45", "document_name": "INSTALL22.md", "id": "4b8935ac0a22deb1", "image_id": "", "positions": [ [ 12, 11, 11, 11, 11 ] ], "similarity": 0.5705525104787287, "term_similarity": 0.5000000005, "url": null, "vector_similarity": 0.7351750337624289 } }, "doc_aggs": { "INSTALL(1).md": { "count": 2, "doc_id": "4bdfb42e5e1511f0907f09f583941b45", "doc_name": "INSTALL(1).md" }, "INSTALL.md": { "count": 2, "doc_id": "4bd7fdd85e1511f0907f09f583941b45", "doc_name": "INSTALL.md" }, "INSTALL22.md": { "count": 3, "doc_id": "4bdd2ff65e1511f0907f09f583941b45", "doc_name": "INSTALL22.md" }, "INSTALL3.md": { "count": 1, "doc_id": "4bdab5825e1511f0907f09f583941b45", "doc_name": "INSTALL3.md" } } }, "trace": [ { "component_id": "begin", "trace": [ { "component_id": "begin", "component_name": "begin", "component_type": "Begin", "created_at": 15926.567517862, "elapsed_time": 0.0008189299987861887, "error": null, "inputs": {}, "outputs": { "_created_time": 15926.567517862, "_elapsed_time": 0.0006958619997021742 } } ] }, { "component_id": "Agent:WeakDragonsRead", "trace": [ { "component_id": "Agent:WeakDragonsRead", "component_name": "Agent_0", "component_type": "Agent", "created_at": 15926.569121755, "elapsed_time": 53.49016142000073, "error": null, "inputs": { "sys.query": "how to install neovim?" }, "outputs": { "_created_time": 15926.569121755, "_elapsed_time": 53.489981256001556, "content": "xxxxxxxxxxxxxx", "use_tools": [ { "arguments": { "query": "xxxx" }, "name": "search_my_dateset", "results": "xxxxxxxxxxx" } ] } } ] }, { "component_id": "Agent:EveryHairsChew", "trace": [ { "component_id": "Agent:EveryHairsChew", "component_name": "Agent_1", "component_type": "Agent", "created_at": 15980.060569101, "elapsed_time": 23.61718057500002, "error": null, "inputs": { "sys.query": "how to install neovim?" 
}, "outputs": { "_created_time": 15980.060569101, "_elapsed_time": 0.0003451630000199657, "content": "xxxxxxxxxxxx" } } ] }, { "component_id": "Message:SlickDingosHappen", "trace": [ { "component_id": "Message:SlickDingosHappen", "component_name": "Message_0", "component_type": "Message", "created_at": 15980.061302513, "elapsed_time": 23.61655923699982, "error": null, "inputs": { "Agent:EveryHairsChew@content": "xxxxxxxxx", "Agent:WeakDragonsRead@content": "xxxxxxxxxxx" }, "outputs": { "_created_time": 15980.061302513, "_elapsed_time": 0.0006695749998471001, "content": "xxxxxxxxxxx" } } ] } ] }, "event": "workflow_finished", "message_id": "c4692a2683d911f0858253708ecb6573", "session_id": "c39f6f9c83d911f0858253708ecb6573", "task_id": "d1f79142831f11f09cc51795b9eb07c0" } } ``` Success without `session_id` provided and with variables specified in the **Begin** component: Stream: ```json data:{ "event": "message", "message_id": "0e273472783711f0806e1a6272e682d8", "created_at": 1755083830, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": "Hello" }, "session_id": "0e0d1542783711f0806e1a6272e682d8" } data:{ "event": "message", "message_id": "0e273472783711f0806e1a6272e682d8", "created_at": 1755083830, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": "!" }, "session_id": "0e0d1542783711f0806e1a6272e682d8" } data:{ "event": "message", "message_id": "0e273472783711f0806e1a6272e682d8", "created_at": 1755083830, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": " How" }, "session_id": "0e0d1542783711f0806e1a6272e682d8" } ... data:[DONE] ``` Non-stream: ```json { "code": 0, "data": { "created_at": 1755083779, "data": { "created_at": 547400.868004651, "elapsed_time": 3.5037803899031132, "inputs": { "boolean_var": { "type": "boolean", "value": true }, "int_var": { "type": "integer", "value": 1 }, "line_var": { "type": "line", "value": "I am line_var" }, "option_var": { "type": "options", "value": "option 2" }, "paragraph_var": { "type": "paragraph", "value": "a\nb\nc" } }, "outputs": { "_created_time": 547400.869271305, "_elapsed_time": 0.0001251999055966735, "content": "Hello there! How can I assist you today?" } }, "event": "workflow_finished", "message_id": "effdad8c783611f089261a6272e682d8", "session_id": "efe523b6783611f089261a6272e682d8", "task_id": "99ee29d6783511f09c921a6272e682d8" } } ``` Success with variables specified in the **Begin** component: Stream: ```json data:{ "event": "message", "message_id": "5b62e790783711f0bc531a6272e682d8", "created_at": 1755083960, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": "Hello" }, "session_id": "979e450c781d11f095cb729e3aa55728" } data:{ "event": "message", "message_id": "5b62e790783711f0bc531a6272e682d8", "created_at": 1755083960, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": "!" }, "session_id": "979e450c781d11f095cb729e3aa55728" } data:{ "event": "message", "message_id": "5b62e790783711f0bc531a6272e682d8", "created_at": 1755083960, "task_id": "99ee29d6783511f09c921a6272e682d8", "data": { "content": " You" }, "session_id": "979e450c781d11f095cb729e3aa55728" } ... data:[DONE] ``` Non-stream: ```json { "code": 0, "data": { "created_at": 1755084029, "data": { "created_at": 547650.750818867, "elapsed_time": 1.6227330720284954, "inputs": {}, "outputs": { "_created_time": 547650.752800839, "_elapsed_time": 9.628792759031057e-05, "content": "Hello! It appears you've sent another \"Hello\" without additional context. 
I'm here and ready to respond to any requests or questions you may have. Is there something specific you'd like to discuss or learn about?" } }, "event": "workflow_finished", "message_id": "84eec534783711f08db41a6272e682d8", "session_id": "979e450c781d11f095cb729e3aa55728", "task_id": "99ee29d6783511f09c921a6272e682d8" } } ``` Failure: ```json { "code": 102, "message": "`question` is required." } ``` --- ### List agent sessions **GET** `/api/v1/agents/{agent_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&id={session_id}&user_id={user_id}&dsl={dsl}` Lists sessions associated with a specified agent. #### Request - Method: GET - URL: `/api/v1/agents/{agent_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&id={session_id}` - Headers: - `'Authorization: Bearer '` ##### Request example ```bash curl --request GET \ --url http://{address}/api/v1/agents/{agent_id}/sessions?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&id={session_id}&user_id={user_id} \ --header 'Authorization: Bearer ' ``` ##### Request Parameters - `agent_id`: (*Path parameter*) The ID of the associated agent. - `page`: (*Filter parameter*), `integer` Specifies the page on which the sessions will be displayed. Defaults to `1`. - `page_size`: (*Filter parameter*), `integer` The number of sessions on each page. Defaults to `30`. - `orderby`: (*Filter parameter*), `string` The field by which sessions should be sorted. Available options: - `create_time` (default) - `update_time` - `desc`: (*Filter parameter*), `boolean` Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `true`. - `id`: (*Filter parameter*), `string` The ID of the agent session to retrieve. - `user_id`: (*Filter parameter*), `string` The optional user-defined ID passed in when creating session. - `dsl`: (*Filter parameter*), `boolean` Indicates whether to include the dsl field of the sessions in the response. Defaults to `true`. #### Response Success: ```json { "code": 0, "data": [{ "agent_id": "e9e2b9c2b2f911ef801d0242ac120006", "dsl": { "answer": [], "components": { "Answer:OrangeTermsBurn": { "downstream": [], "obj": { "component_name": "Answer", "params": {} }, "upstream": [] }, "Generate:SocialYearsRemain": { "downstream": [], "obj": { "component_name": "Generate", "params": { "cite": true, "frequency_penalty": 0.7, "llm_id": "gpt-4o___OpenAI-API@OpenAI-API-Compatible", "message_history_window_size": 12, "parameters": [], "presence_penalty": 0.4, "prompt": "Please summarize the following paragraph. Pay attention to the numbers and do not make things up. The paragraph is as follows:\n{input}\nThis is what you need to summarize.", "temperature": 0.1, "top_p": 0.3 } }, "upstream": [] }, "begin": { "downstream": [], "obj": { "component_name": "Begin", "params": {} }, "upstream": [] } }, "graph": { "edges": [], "nodes": [ { "data": { "label": "Begin", "name": "begin" }, "height": 44, "id": "begin", "position": { "x": 50, "y": 200 }, "sourcePosition": "left", "targetPosition": "right", "type": "beginNode", "width": 200 }, { "data": { "form": { "cite": true, "frequencyPenaltyEnabled": true, "frequency_penalty": 0.7, "llm_id": "gpt-4o___OpenAI-API@OpenAI-API-Compatible", "maxTokensEnabled": true, "message_history_window_size": 12, "parameters": [], "presencePenaltyEnabled": true, "presence_penalty": 0.4, "prompt": "Please summarize the following paragraph. Pay attention to the numbers and do not make things up. 
The paragraph is as follows:\n{input}\nThis is what you need to summarize.",
              "temperature": 0.1,
              "temperatureEnabled": true,
              "topPEnabled": true,
              "top_p": 0.3
            },
            "label": "Generate",
            "name": "Generate Answer_0"
          },
          "dragging": false,
          "height": 105,
          "id": "Generate:SocialYearsRemain",
          "position": {
            "x": 561.3457829707513,
            "y": 178.7211182312641
          },
          "positionAbsolute": {
            "x": 561.3457829707513,
            "y": 178.7211182312641
          },
          "selected": true,
          "sourcePosition": "right",
          "targetPosition": "left",
          "type": "generateNode",
          "width": 200
        },
        {
          "data": {
            "form": {},
            "label": "Answer",
            "name": "Dialogue_0"
          },
          "height": 44,
          "id": "Answer:OrangeTermsBurn",
          "position": {
            "x": 317.2368194777658,
            "y": 218.30635555445093
          },
          "sourcePosition": "right",
          "targetPosition": "left",
          "type": "logicNode",
          "width": 200
        }
      ]
    },
    "history": [],
    "messages": [],
    "path": [],
    "reference": []
  },
  "id": "792dde22b2fa11ef97550242ac120006",
  "message": [
    {
      "content": "Hi! I'm your smart assistant. What can I do for you?",
      "role": "assistant"
    }
  ],
  "source": "agent",
  "user_id": ""
  }]
}
```

Failure:

```json
{
    "code": 102,
    "message": "You don't own the agent ccd2f856b12311ef94ca0242ac1200052."
}
```

---

### Delete agent's sessions

**DELETE** `/api/v1/agents/{agent_id}/sessions`

Deletes sessions of an agent by ID.

#### Request

- Method: DELETE
- URL: `/api/v1/agents/{agent_id}/sessions`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"ids"`: `list[string]`

##### Request example

```bash
curl --request DELETE \
     --url http://{address}/api/v1/agents/{agent_id}/sessions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
          "ids": ["test_1", "test_2"]
     }'
```

##### Request parameters

- `agent_id`: (*Path parameter*)
  The ID of the associated agent.
- `"ids"`: (*Body parameter*), `list[string]`
  The IDs of the sessions to delete. If it is not specified, all sessions associated with the specified agent will be deleted.

#### Response

Success:

```json
{
    "code": 0
}
```

Failure:

```json
{
    "code": 102,
    "message": "The agent doesn't own the session cbd31e52f73911ef93b232903b842af6"
}
```

---

### Generate related questions

**POST** `/api/v1/sessions/related_questions`

Generates five to ten alternative question strings from the user's original query to retrieve more relevant search results. This operation requires a `Bearer Login Token`, which typically expires within 24 hours. You can easily find it in the request headers in your browser, as shown below:

![Image](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/login_token.jpg)

:::tip NOTE
The chat model autonomously determines the number of questions to generate based on the instruction, typically between five and ten.
:::

#### Request

- Method: POST
- URL: `/api/v1/sessions/related_questions`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"question"`: `string`
  - `"industry"`: `string`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/sessions/related_questions \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '
     {
          "question": "What are the key advantages of Neovim over Vim?",
          "industry": "software_development"
     }'
```

##### Request parameters

- `"question"`: (*Body parameter*), `string`
  The original user question.
- `"industry"`: (*Body parameter*), `string`
  The industry of the question.
#### Response

Success:

```json
{
  "code": 0,
  "data": [
    "What makes Neovim superior to Vim in terms of features?",
    "How do the benefits of Neovim compare to those of Vim?",
    "What advantages does Neovim offer that are not present in Vim?",
    "In what ways does Neovim outperform Vim in functionality?",
    "What are the most significant improvements in Neovim compared to Vim?",
    "What unique advantages does Neovim bring to the table over Vim?",
    "How does the user experience in Neovim differ from Vim in terms of benefits?",
    "What are the top reasons to switch from Vim to Neovim?",
    "What features of Neovim are considered more advanced than those in Vim?"
  ],
  "message": "success"
}
```

Failure:

```json
{
  "code": 401,
  "data": null,
  "message": ""
}
```

---

## AGENT MANAGEMENT

---

### List agents

**GET** `/api/v1/agents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&title={agent_name}&id={agent_id}`

Lists agents.

#### Request

- Method: GET
- URL: `/api/v1/agents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&title={agent_name}&id={agent_id}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/agents?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&title={agent_name}&id={agent_id} \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `page`: (*Filter parameter*), `integer`
  Specifies the page on which the agents will be displayed. Defaults to `1`.
- `page_size`: (*Filter parameter*), `integer`
  The number of agents on each page. Defaults to `30`.
- `orderby`: (*Filter parameter*), `string`
  The attribute by which the results are sorted. Available options:
  - `create_time` (default)
  - `update_time`
- `desc`: (*Filter parameter*), `boolean`
  Indicates whether the retrieved agents should be sorted in descending order. Defaults to `true`.
- `id`: (*Filter parameter*), `string`
  The ID of the agent to retrieve.
- `title`: (*Filter parameter*), `string`
  The name of the agent to retrieve.

#### Response

Success:

```json
{
  "code": 0,
  "data": [
    {
      "avatar": null,
      "canvas_type": null,
      "create_date": "Thu, 05 Dec 2024 19:10:36 GMT",
      "create_time": 1733397036424,
      "description": null,
      "dsl": {
        "answer": [],
        "components": {
          "begin": {
            "downstream": [],
            "obj": { "component_name": "Begin", "params": {} },
            "upstream": []
          }
        },
        "graph": {
          "edges": [],
          "nodes": [
            {
              "data": { "label": "Begin", "name": "begin" },
              "height": 44,
              "id": "begin",
              "position": { "x": 50, "y": 200 },
              "sourcePosition": "left",
              "targetPosition": "right",
              "type": "beginNode",
              "width": 200
            }
          ]
        },
        "history": [],
        "messages": [],
        "path": [],
        "reference": []
      },
      "id": "8d9ca0e2b2f911ef9ca20242ac120006",
      "title": "123465",
      "update_date": "Thu, 05 Dec 2024 19:10:56 GMT",
      "update_time": 1733397056801,
      "user_id": "69736c5e723611efb51b0242ac120007"
    }
  ]
}
```

Failure:

```json
{
  "code": 102,
  "message": "The agent doesn't exist."
}
```

---

### Create agent

**POST** `/api/v1/agents`

Creates an agent.

#### Request

- Method: POST
- URL: `/api/v1/agents`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"title"`: `string`
  - `"description"`: `string`
  - `"dsl"`: `object`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/agents \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "title": "Test Agent",
     "description": "A test agent",
     "dsl": {
          // ... Canvas DSL here ...
     }
}'
```

##### Request parameters

- `title`: (*Body parameter*), `string`, *Required*
  The title of the agent.
- `description`: (*Body parameter*), `string`
  The description of the agent. Defaults to `None`.
- `dsl`: (*Body parameter*), `object`, *Required*
  The canvas DSL object of the agent.

#### Response

Success:

```json
{
  "code": 0,
  "data": true,
  "message": "success"
}
```

Failure:

```json
{
  "code": 102,
  "message": "Agent with title test already exists."
}
```

---

### Update agent

**PUT** `/api/v1/agents/{agent_id}`

Updates an agent by ID.

#### Request

- Method: PUT
- URL: `/api/v1/agents/{agent_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"title"`: `string`
  - `"description"`: `string`
  - `"dsl"`: `object`

##### Request example

```bash
curl --request PUT \
     --url http://{address}/api/v1/agents/58af890a2a8911f0a71a11b922ed82d6 \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "title": "Test Agent",
     "description": "A test agent",
     "dsl": {
          // ... Canvas DSL here ...
     }
}'
```

##### Request parameters

- `agent_id`: (*Path parameter*), `string`
  The ID of the agent to be updated.
- `title`: (*Body parameter*), `string`
  The title of the agent.
- `description`: (*Body parameter*), `string`
  The description of the agent.
- `dsl`: (*Body parameter*), `object`
  The canvas DSL object of the agent.

Only specify the parameter you want to change in the request body. If a parameter does not exist or is `None`, it won't be updated.

#### Response

Success:

```json
{
  "code": 0,
  "data": true,
  "message": "success"
}
```

Failure:

```json
{
  "code": 103,
  "message": "Only owner of canvas authorized for this operation."
}
```

---

### Delete agent

**DELETE** `/api/v1/agents/{agent_id}`

Deletes an agent by ID.

#### Request

- Method: DELETE
- URL: `/api/v1/agents/{agent_id}`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request DELETE \
     --url http://{address}/api/v1/agents/58af890a2a8911f0a71a11b922ed82d6 \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{}'
```

##### Request parameters

- `agent_id`: (*Path parameter*), `string`
  The ID of the agent to be deleted.

#### Response

Success:

```json
{
  "code": 0,
  "data": true,
  "message": "success"
}
```

Failure:

```json
{
  "code": 103,
  "message": "Only owner of canvas authorized for this operation."
}
```

---

## SYSTEM MANAGEMENT

---

### Check system health

**GET** `/v1/system/healthz`

Checks the health status of RAGFlow's dependencies (database, Redis, document engine, and object storage).

#### Request

- Method: GET
- URL: `/v1/system/healthz`
- Headers:
  - `'Content-Type: application/json'` (no Authorization required)

##### Request example

```bash
curl --request GET --url http://{address}/v1/system/healthz --header 'Content-Type: application/json'
```

##### Request parameters

- `address`: (*Path parameter*), `string`
  The host and port of the backend service (e.g., `localhost:7897`).

---

#### Responses

- **200 OK** – All services healthy

```http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "db": "ok",
  "redis": "ok",
  "doc_engine": "ok",
  "storage": "ok",
  "status": "ok"
}
```

- **500 Internal Server Error** – At least one service unhealthy

```http
HTTP/1.1 500 INTERNAL SERVER ERROR
Content-Type: application/json

{
  "db": "ok",
  "redis": "nok",
  "doc_engine": "ok",
  "storage": "ok",
  "status": "nok",
  "_meta": {
    "redis": {
      "elapsed": "5.2",
      "error": "Lost connection!"
    }
  }
}
```

Explanation:

- Each service is reported as `"ok"` or `"nok"`.
- The top-level `status` reflects overall health.
- If any service is `"nok"`, detailed error info appears in `_meta`.

---

## FILE MANAGEMENT

---

### Upload file

**POST** `/api/v1/file/upload`

Uploads one or multiple files to the system.

#### Request

- Method: POST
- URL: `/api/v1/file/upload`
- Headers:
  - `'Content-Type: multipart/form-data'`
  - `'Authorization: Bearer '`
- Form:
  - `'file=@{FILE_PATH}'`
  - `'parent_id'`: `string` (optional)

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/upload \
     --header 'Content-Type: multipart/form-data' \
     --header 'Authorization: Bearer ' \
     --form 'file=@./test1.txt' \
     --form 'file=@./test2.pdf' \
     --form 'parent_id={folder_id}'
```

##### Request parameters

- `'file'`: (*Form parameter*), `file`, *Required*
  The file(s) to upload. Multiple files can be uploaded in a single request.
- `'parent_id'`: (*Form parameter*), `string`
  The parent folder ID where the file will be uploaded. If not specified, files will be uploaded to the root folder.

#### Response

Success:

```json
{
  "code": 0,
  "data": [
    {
      "id": "b330ec2e91ec11efbc510242ac120004",
      "name": "test1.txt",
      "size": 17966,
      "type": "doc",
      "parent_id": "527fa74891e811ef9c650242ac120006",
      "location": "test1.txt",
      "create_time": 1729763127646
    }
  ]
}
```

Failure:

```json
{
  "code": 400,
  "message": "No file part!"
}
```

---

### Create file or folder

**POST** `/api/v1/file/create`

Creates a new file or folder in the system.

#### Request

- Method: POST
- URL: `/api/v1/file/create`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"name"`: `string`
  - `"parent_id"`: `string` (optional)
  - `"type"`: `string`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/create \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "name": "New Folder",
     "type": "FOLDER",
     "parent_id": "{folder_id}"
}'
```

##### Request parameters

- `"name"`: (*Body parameter*), `string`, *Required*
  The name of the file or folder to create.
- `"parent_id"`: (*Body parameter*), `string`
  The parent folder ID. If not specified, the file/folder will be created in the root folder.
- `"type"`: (*Body parameter*), `string`
  The type of the file to create. Available options:
  - `"FOLDER"`: Create a folder
  - `"VIRTUAL"`: Create a virtual file

#### Response

Success:

```json
{
  "code": 0,
  "data": {
    "id": "b330ec2e91ec11efbc510242ac120004",
    "name": "New Folder",
    "type": "FOLDER",
    "parent_id": "527fa74891e811ef9c650242ac120006",
    "size": 0,
    "create_time": 1729763127646
  }
}
```

Failure:

```json
{
  "code": 409,
  "message": "Duplicated folder name in the same folder."
}
```

---

### List files

**GET** `/api/v1/file/list?parent_id={parent_id}&keywords={keywords}&page={page}&page_size={page_size}&orderby={orderby}&desc={desc}`

Lists files and folders under a specific folder.

#### Request

- Method: GET
- URL: `/api/v1/file/list?parent_id={parent_id}&keywords={keywords}&page={page}&page_size={page_size}&orderby={orderby}&desc={desc}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url 'http://{address}/api/v1/file/list?parent_id={folder_id}&page=1&page_size=15' \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `parent_id`: (*Filter parameter*), `string`
  The folder ID to list files from. If not specified, the root folder is used by default.
- `keywords`: (*Filter parameter*), `string`
  Search keyword to filter files by name.
- `page`: (*Filter parameter*), `integer`
  Specifies the page on which the files will be displayed. Defaults to `1`.
- `page_size`: (*Filter parameter*), `integer`
  The number of files on each page. Defaults to `15`.
- `orderby`: (*Filter parameter*), `string`
  The field by which files should be sorted. Available options:
  - `create_time` (default)
- `desc`: (*Filter parameter*), `boolean`
  Indicates whether the retrieved files should be sorted in descending order. Defaults to `true`.

#### Response

Success:

```json
{
  "code": 0,
  "data": {
    "total": 10,
    "files": [
      {
        "id": "b330ec2e91ec11efbc510242ac120004",
        "name": "test1.txt",
        "type": "doc",
        "size": 17966,
        "parent_id": "527fa74891e811ef9c650242ac120006",
        "create_time": 1729763127646
      }
    ],
    "parent_folder": {
      "id": "527fa74891e811ef9c650242ac120006",
      "name": "Parent Folder"
    }
  }
}
```

Failure:

```json
{
  "code": 404,
  "message": "Folder not found!"
}
```

---

### Get root folder

**GET** `/api/v1/file/root_folder`

Retrieves the user's root folder information.

#### Request

- Method: GET
- URL: `/api/v1/file/root_folder`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/file/root_folder \
     --header 'Authorization: Bearer '
```

##### Request parameters

No parameters required.

#### Response

Success:

```json
{
  "code": 0,
  "data": {
    "root_folder": {
      "id": "527fa74891e811ef9c650242ac120006",
      "name": "root",
      "type": "FOLDER"
    }
  }
}
```

---

### Get parent folder

**GET** `/api/v1/file/parent_folder?file_id={file_id}`

Retrieves the immediate parent folder information of a specified file.

#### Request

- Method: GET
- URL: `/api/v1/file/parent_folder?file_id={file_id}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url 'http://{address}/api/v1/file/parent_folder?file_id={file_id}' \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `file_id`: (*Filter parameter*), `string`, *Required*
  The ID of the file whose immediate parent folder is to be retrieved.

#### Response

Success:

```json
{
  "code": 0,
  "data": {
    "parent_folder": {
      "id": "527fa74891e811ef9c650242ac120006",
      "name": "Parent Folder"
    }
  }
}
```

Failure:

```json
{
  "code": 404,
  "message": "Folder not found!"
}
```

---

### Get all parent folders

**GET** `/api/v1/file/all_parent_folder?file_id={file_id}`

Retrieves all parent folders of a specified file in the folder hierarchy.

#### Request

- Method: GET
- URL: `/api/v1/file/all_parent_folder?file_id={file_id}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url 'http://{address}/api/v1/file/all_parent_folder?file_id={file_id}' \
     --header 'Authorization: Bearer '
```

##### Request parameters

- `file_id`: (*Filter parameter*), `string`, *Required*
  The ID of the file whose parent folders are to be retrieved.

#### Response

Success:

```json
{
  "code": 0,
  "data": {
    "parent_folders": [
      {
        "id": "527fa74891e811ef9c650242ac120006",
        "name": "Parent Folder 1"
      },
      {
        "id": "627fa74891e811ef9c650242ac120007",
        "name": "Parent Folder 2"
      }
    ]
  }
}
```

Failure:

```json
{
  "code": 404,
  "message": "Folder not found!"
}
```

---

### Delete files

**POST** `/api/v1/file/rm`

Deletes one or multiple files or folders.
#### Request

- Method: POST
- URL: `/api/v1/file/rm`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"file_ids"`: `list[string]`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/rm \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "file_ids": ["file_id_1", "file_id_2"]
}'
```

##### Request parameters

- `"file_ids"`: (*Body parameter*), `list[string]`, *Required*
  The IDs of the files or folders to delete.

#### Response

Success:

```json
{
  "code": 0,
  "data": true
}
```

Failure:

```json
{
  "code": 404,
  "message": "File or Folder not found!"
}
```

---

### Rename file

**POST** `/api/v1/file/rename`

Renames a file or folder.

#### Request

- Method: POST
- URL: `/api/v1/file/rename`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"file_id"`: `string`
  - `"name"`: `string`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/rename \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "file_id": "{file_id}",
     "name": "new_name.txt"
}'
```

##### Request parameters

- `"file_id"`: (*Body parameter*), `string`, *Required*
  The ID of the file or folder to rename.
- `"name"`: (*Body parameter*), `string`, *Required*
  The new name for the file or folder. Note: Changing file extensions is *not* supported.

#### Response

Success:

```json
{
  "code": 0,
  "data": true
}
```

Failure:

```json
{
  "code": 400,
  "message": "The extension of file can't be changed"
}
```

or

```json
{
  "code": 409,
  "message": "Duplicated file name in the same folder."
}
```

---

### Download file

**GET** `/api/v1/file/get/{file_id}`

Downloads a file from the system.

#### Request

- Method: GET
- URL: `/api/v1/file/get/{file_id}`
- Headers:
  - `'Authorization: Bearer '`

##### Request example

```bash
curl --request GET \
     --url http://{address}/api/v1/file/get/{file_id} \
     --header 'Authorization: Bearer ' \
     --output ./downloaded_file.txt
```

##### Request parameters

- `file_id`: (*Path parameter*), `string`, *Required*
  The ID of the file to download.

#### Response

Success: Returns the file content as a binary stream with appropriate Content-Type headers.

Failure:

```json
{
  "code": 404,
  "message": "Document not found!"
}
```

---

### Move files

**POST** `/api/v1/file/mv`

Moves one or multiple files or folders to a specified folder.

#### Request

- Method: POST
- URL: `/api/v1/file/mv`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"src_file_ids"`: `list[string]`
  - `"dest_file_id"`: `string`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/mv \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "src_file_ids": ["file_id_1", "file_id_2"],
     "dest_file_id": "{destination_folder_id}"
}'
```

##### Request parameters

- `"src_file_ids"`: (*Body parameter*), `list[string]`, *Required*
  The IDs of the files or folders to move.
- `"dest_file_id"`: (*Body parameter*), `string`, *Required*
  The ID of the destination folder.

#### Response

Success:

```json
{
  "code": 0,
  "data": true
}
```

Failure:

```json
{
  "code": 404,
  "message": "File or Folder not found!"
}
```

or

```json
{
  "code": 404,
  "message": "Parent Folder not found!"
}
```

---

### Convert files to documents and link them to datasets

**POST** `/api/v1/file/convert`

Converts files to documents and links them to specified datasets.
#### Request

- Method: POST
- URL: `/api/v1/file/convert`
- Headers:
  - `'Content-Type: application/json'`
  - `'Authorization: Bearer '`
- Body:
  - `"file_ids"`: `list[string]`
  - `"kb_ids"`: `list[string]`

##### Request example

```bash
curl --request POST \
     --url http://{address}/api/v1/file/convert \
     --header 'Content-Type: application/json' \
     --header 'Authorization: Bearer ' \
     --data '{
     "file_ids": ["file_id_1", "file_id_2"],
     "kb_ids": ["dataset_id_1", "dataset_id_2"]
}'
```

##### Request parameters

- `"file_ids"`: (*Body parameter*), `list[string]`, *Required*
  The IDs of the files to convert. If a folder ID is provided, all files within that folder will be converted.
- `"kb_ids"`: (*Body parameter*), `list[string]`, *Required*
  The IDs of the target datasets.

#### Response

Success:

```json
{
  "code": 0,
  "data": [
    {
      "id": "file2doc_id_1",
      "file_id": "file_id_1",
      "document_id": "document_id_1"
    }
  ]
}
```

Failure:

```json
{
  "code": 404,
  "message": "File not found!"
}
```

or

```json
{
  "code": 404,
  "message": "Can't find this dataset!"
}
```

---

---
sidebar_position: 5
slug: /python_api_reference
---

# Python API

A complete reference for RAGFlow's Python APIs. Before proceeding, please ensure you [have your RAGFlow API key ready for authentication](https://ragflow.io/docs/dev/acquire_ragflow_api_key).

:::tip NOTE
Run the following command to install the Python SDK:

```bash
pip install ragflow-sdk
```
:::

---

## ERROR CODES

---

| Code | Message               | Description                |
|------|-----------------------|----------------------------|
| 400  | Bad Request           | Invalid request parameters |
| 401  | Unauthorized          | Unauthorized access        |
| 403  | Forbidden             | Access denied              |
| 404  | Not Found             | Resource not found         |
| 500  | Internal Server Error | Server internal error      |
| 1001 | Invalid Chunk ID      | Invalid Chunk ID           |
| 1002 | Chunk Update Failed   | Chunk update failed        |

---

## OpenAI-Compatible API

---

### Create chat completion

Creates a model response for the given historical chat conversation through an OpenAI-compatible API.

#### Parameters

##### model: `str`, *Required*

The model used to generate the response. The server will parse this automatically, so you can set it to any value for now.

##### messages: `list[object]`, *Required*

A list of historical chat messages used to generate the response. This must contain at least one message with the `user` role.

##### stream: `boolean`

Whether to receive the response as a stream. Set this to `false` explicitly if you prefer to receive the entire response in one go instead of as a stream.
#### Returns

- Success: An OpenAI-style response [message](https://platform.openai.com/docs/api-reference/chat/create)
- Failure: `Exception`

#### Examples

```python
from openai import OpenAI

model = "model"
# chat_id identifies the target chat assistant (see CHAT ASSISTANT MANAGEMENT below).
chat_id = "your_chat_id"
client = OpenAI(api_key="ragflow-api-key", base_url=f"http://ragflow_address/api/v1/chats_openai/{chat_id}")

stream = True
reference = True

completion = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am an AI assistant named..."},
        {"role": "user", "content": "Can you tell me how to install neovim"},
    ],
    stream=stream,
    extra_body={"reference": reference}
)

if stream:
    for chunk in completion:
        print(chunk)
        if reference and chunk.choices[0].finish_reason == "stop":
            print(f"Reference:\n{chunk.choices[0].delta.reference}")
            print(f"Final content:\n{chunk.choices[0].delta.final_content}")
else:
    print(completion.choices[0].message.content)
    if reference:
        print(completion.choices[0].message.reference)
```

## DATASET MANAGEMENT

---

### Create dataset

```python
RAGFlow.create_dataset(
    name: str,
    avatar: Optional[str] = None,
    description: Optional[str] = None,
    embedding_model: Optional[str] = "BAAI/bge-large-zh-v1.5@BAAI",
    permission: str = "me",
    chunk_method: str = "naive",
    parser_config: DataSet.ParserConfig = None
) -> DataSet
```

Creates a dataset.

#### Parameters

##### name: `str`, *Required*

The unique name of the dataset to create. It must adhere to the following requirements:

- Maximum 128 characters.
- Case-insensitive.

##### avatar: `str`

Base64 encoding of the avatar. Defaults to `None`.

##### description: `str`

A brief description of the dataset to create. Defaults to `None`.

##### permission: `str`

Specifies who can access the dataset to create. Available options:

- `"me"`: (Default) Only you can manage the dataset.
- `"team"`: All team members can manage the dataset.

##### chunk_method: `str`

The chunking method of the dataset to create. Available options:

- `"naive"`: General (default)
- `"manual"`: Manual
- `"qa"`: Q&A
- `"table"`: Table
- `"paper"`: Paper
- `"book"`: Book
- `"laws"`: Laws
- `"presentation"`: Presentation
- `"picture"`: Picture
- `"one"`: One
- `"email"`: Email

##### parser_config

The parser configuration of the dataset. A `ParserConfig` object's attributes vary based on the selected `chunk_method`:

- `chunk_method`=`"naive"`: `{"chunk_token_num":512,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
- `chunk_method`=`"qa"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"manual"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"table"`: `None`
- `chunk_method`=`"paper"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"book"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"laws"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"picture"`: `None`
- `chunk_method`=`"presentation"`: `{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"one"`: `None`
- `chunk_method`=`"knowledge-graph"`: `{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
- `chunk_method`=`"email"`: `None`

#### Returns

- Success: A `DataSet` object.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.create_dataset(name="kb_1")
```

---

### Delete datasets

```python
RAGFlow.delete_datasets(ids: list[str] | None = None)
```

Deletes datasets by ID.

#### Parameters

##### ids: `list[str]` or `None`, *Required*

The IDs of the datasets to delete. Defaults to `None`.

- If `None`, all datasets will be deleted.
- If an array of IDs, only the specified datasets will be deleted.
- If an empty array, no datasets will be deleted.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
rag_object.delete_datasets(ids=["d94a8dc02c9711f0930f7fbc369eab6d","e94a8dc02c9711f0930f7fbc369eab6e"])
```

---

### List datasets

```python
RAGFlow.list_datasets(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    name: str = None
) -> list[DataSet]
```

Lists datasets.

#### Parameters

##### page: `int`

Specifies the page on which the datasets will be displayed. Defaults to `1`.

##### page_size: `int`

The number of datasets on each page. Defaults to `30`.

##### orderby: `str`

The field by which datasets should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved datasets should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the dataset to retrieve. Defaults to `None`.

##### name: `str`

The name of the dataset to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `DataSet` objects.
- Failure: `Exception`.

#### Examples

##### List all datasets

```python
for dataset in rag_object.list_datasets():
    print(dataset)
```

##### Retrieve a dataset by ID

```python
dataset = rag_object.list_datasets(id="id_1")
print(dataset[0])
```

---

### Update dataset

```python
DataSet.update(update_message: dict)
```

Updates configurations for the current dataset.

#### Parameters

##### update_message: `dict[str, str|int]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"name"`: `str` The revised name of the dataset.
  - Basic Multilingual Plane (BMP) only
  - Maximum 128 characters
  - Case-insensitive
- `"avatar"`: `str` The updated base64 encoding of the avatar.
  - Maximum 65535 characters
- `"embedding_model"`: `str` The updated embedding model name.
  - Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
  - Maximum 255 characters
  - Must follow `model_name@model_factory` format
- `"permission"`: `str` The updated dataset permission. Available options:
  - `"me"`: (Default) Only you can manage the dataset.
  - `"team"`: All team members can manage the dataset.
- `"pagerank"`: `int` Refer to [Set page rank](https://ragflow.io/docs/dev/set_page_rank).
  - Default: `0`
  - Minimum: `0`
  - Maximum: `100`
- `"chunk_method"`: `str` The chunking method for the dataset. Available options:
  - `"naive"`: General (default)
  - `"book"`: Book
  - `"email"`: Email
  - `"laws"`: Laws
  - `"manual"`: Manual
  - `"one"`: One
  - `"paper"`: Paper
  - `"picture"`: Picture
  - `"presentation"`: Presentation
  - `"qa"`: Q&A
  - `"table"`: Table
  - `"tag"`: Tag

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(name="kb_name")
dataset = dataset[0]
# The embedding model name must follow the model_name@model_factory format.
dataset.update({"embedding_model":"BAAI/bge-large-zh-v1.5@BAAI", "chunk_method":"manual"})
```

---

## FILE MANAGEMENT WITHIN DATASET

---

### Upload documents

```python
DataSet.upload_documents(document_list: list[dict])
```

Uploads documents to the current dataset.

#### Parameters

##### document_list: `list[dict]`, *Required*

A list of dictionaries representing the documents to upload, each containing the following keys:

- `"display_name"`: (Optional) The file name to display in the dataset.
- `"blob"`: (Optional) The binary content of the file to upload.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
dataset = rag_object.create_dataset(name="kb_name")
dataset.upload_documents([{"display_name": "1.txt", "blob": ""}, {"display_name": "2.pdf", "blob": ""}])
```

---

### Update document

```python
Document.update(update_message: dict)
```

Updates configurations for the current document.

#### Parameters

##### update_message: `dict[str, str|dict[]]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"display_name"`: `str` The name of the document to update.
- `"meta_fields"`: `dict[str, Any]` The meta fields of the document.
- `"chunk_method"`: `str` The parsing method to apply to the document.
  - `"naive"`: General
  - `"manual"`: Manual
  - `"qa"`: Q&A
  - `"table"`: Table
  - `"paper"`: Paper
  - `"book"`: Book
  - `"laws"`: Laws
  - `"presentation"`: Presentation
  - `"picture"`: Picture
  - `"one"`: One
  - `"email"`: Email
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document. Its attributes vary based on the selected `"chunk_method"`:
  - `"chunk_method"`=`"naive"`: `{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
  - `chunk_method`=`"qa"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"manual"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"table"`: `None`
  - `chunk_method`=`"paper"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"book"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"laws"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"presentation"`: `{"raptor": {"use_raptor": False}}`
  - `chunk_method`=`"picture"`: `None`
  - `chunk_method`=`"one"`: `None`
  - `chunk_method`=`"knowledge-graph"`: `{"chunk_token_num":128,"delimiter":"\\n","entity_types":["organization","person","location","event","time"]}`
  - `chunk_method`=`"email"`: `None`

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(id='id')
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
# update() takes a single dictionary of the attributes to change.
doc.update({"parser_config": {"chunk_token_num": 256}, "chunk_method": "manual"})
```

---

### Download document

```python
Document.download() -> bytes
```

Downloads the current document.

#### Returns

The downloaded document in bytes.
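Because `download()` returns raw `bytes`, the usual pattern is to write them to disk in binary mode. A minimal sketch (hypothetical file name; `doc` is assumed to be a `Document` obtained as in the example below):

```python
# Persist the downloaded bytes; the context manager closes the file even on error.
data = doc.download()
with open("downloaded_copy.bin", "wb") as f:
    f.write(data)
```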
#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(id="id")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
# Write the downloaded bytes to disk in binary mode.
with open("./ragflow.txt", "wb") as file:
    file.write(doc.download())
print(doc)
```

---

### List documents

```python
DataSet.list_documents(
    id: str = None,
    keywords: str = None,
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    create_time_from: int = 0,
    create_time_to: int = 0
) -> list[Document]
```

Lists documents in the current dataset.

#### Parameters

##### id: `str`

The ID of the document to retrieve. Defaults to `None`.

##### keywords: `str`

The keywords used to match document titles. Defaults to `None`.

##### page: `int`

Specifies the page on which the documents will be displayed. Defaults to `1`.

##### page_size: `int`

The maximum number of documents on each page. Defaults to `30`.

##### orderby: `str`

The field by which documents should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved documents should be sorted in descending order. Defaults to `True`.

##### create_time_from: `int`

Unix timestamp for filtering documents created after this time. `0` means no filter. Defaults to `0`.

##### create_time_to: `int`

Unix timestamp for filtering documents created before this time. `0` means no filter. Defaults to `0`.

#### Returns

- Success: A list of `Document` objects.
- Failure: `Exception`.

A `Document` object contains the following attributes:

- `id`: The document ID. Defaults to `""`.
- `name`: The document name. Defaults to `""`.
- `thumbnail`: The thumbnail image of the document. Defaults to `None`.
- `dataset_id`: The dataset ID associated with the document. Defaults to `None`.
- `chunk_method`: The chunking method name. Defaults to `"naive"`.
- `source_type`: The source type of the document. Defaults to `"local"`.
- `type`: Type or category of the document. Defaults to `""`. Reserved for future use.
- `created_by`: `str` The creator of the document. Defaults to `""`.
- `size`: `int` The document size in bytes. Defaults to `0`.
- `token_count`: `int` The number of tokens in the document. Defaults to `0`.
- `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
- `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
- `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
- `process_duration`: `float` Duration of the processing in seconds. Defaults to `0.0`.
- `run`: `str` The document's processing status:
  - `"UNSTART"` (default)
  - `"RUNNING"`
  - `"CANCEL"`
  - `"DONE"`
  - `"FAIL"`
- `status`: `str` Reserved for future use.
- `parser_config`: `ParserConfig` Configuration object for the parser. Its attributes vary based on the selected `chunk_method`:
  - `chunk_method`=`"naive"`: `{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
- `chunk_method`=`"qa"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"manuel"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"table"`: `None` - `chunk_method`=`"paper"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"book"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"laws"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"presentation"`: `{"raptor": {"use_raptor": False}}` - `chunk_method`=`"picure"`: `None` - `chunk_method`=`"one"`: `None` - `chunk_method`=`"email"`: `None` #### Examples ```python from ragflow_sdk import RAGFlow rag_object = RAGFlow(api_key="", base_url="http://:9380") dataset = rag_object.create_dataset(name="kb_1") filename1 = "~/ragflow.txt" blob = open(filename1 , "rb").read() dataset.upload_documents([{"name":filename1,"blob":blob}]) for doc in dataset.list_documents(keywords="rag", page=0, page_size=12): print(doc) ``` --- ### Delete documents ```python DataSet.delete_documents(ids: list[str] = None) ``` Deletes documents by ID. #### Parameters ##### ids: `list[list]` The IDs of the documents to delete. Defaults to `None`. If it is not specified, all documents in the dataset will be deleted. #### Returns - Success: No value is returned. - Failure: `Exception` #### Examples ```python from ragflow_sdk import RAGFlow rag_object = RAGFlow(api_key="", base_url="http://:9380") dataset = rag_object.list_datasets(name="kb_1") dataset = dataset[0] dataset.delete_documents(ids=["id_1","id_2"]) ``` --- ### Parse documents ```python DataSet.async_parse_documents(document_ids:list[str]) -> None ``` Parses documents in the current dataset. #### Parameters ##### document_ids: `list[str]`, *Required* The IDs of the documents to parse. #### Returns - Success: No value is returned. - Failure: `Exception` #### Examples ```python rag_object = RAGFlow(api_key="", base_url="http://:9380") dataset = rag_object.create_dataset(name="dataset_name") documents = [ {'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()}, {'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()}, {'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()} ] dataset.upload_documents(documents) documents = dataset.list_documents(keywords="test") ids = [] for document in documents: ids.append(document.id) dataset.async_parse_documents(ids) print("Async bulk parsing initiated.") ``` --- ### Parse documents (with document status) ```python DataSet.parse_documents(document_ids: list[str]) -> list[tuple[str, str, int, int]] ``` *Asynchronously* parses documents in the current dataset. This method encapsulates `async_parse_documents()`. It awaits the completion of all parsing tasks before returning detailed results, including the parsing status and statistics for each document. If a keyboard interruption occurs (e.g., `Ctrl+C`), all pending parsing tasks will be cancelled gracefully. #### Parameters ##### document_ids: `list[str]`, *Required* The IDs of the documents to parse. #### Returns A list of tuples with detailed parsing results: ```python [ (document_id: str, status: str, chunk_count: int, token_count: int), ... ] ``` - `status`: The final parsing state (e.g., `success`, `failed`, `cancelled`). - `chunk_count`: The number of content chunks created from the document. - `token_count`: The total number of tokens processed. 
---

#### Example

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.create_dataset(name="dataset_name")
documents = dataset.list_documents(keywords="test")
ids = [doc.id for doc in documents]

try:
    finished = dataset.parse_documents(ids)
    for doc_id, status, chunk_count, token_count in finished:
        print(f"Document {doc_id} parsing finished with status: {status}, chunks: {chunk_count}, tokens: {token_count}")
except KeyboardInterrupt:
    print("\nParsing interrupted by user. All pending tasks have been cancelled.")
except Exception as e:
    print(f"Parsing failed: {e}")
```

---

### Stop parsing documents

```python
DataSet.async_cancel_parse_documents(document_ids: list[str]) -> None
```

Stops parsing specified documents.

#### Parameters

##### document_ids: `list[str]`, *Required*

The IDs of the documents for which parsing should be stopped.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.create_dataset(name="dataset_name")
documents = [
    {'display_name': 'test1.txt', 'blob': open('./test_data/test1.txt',"rb").read()},
    {'display_name': 'test2.txt', 'blob': open('./test_data/test2.txt',"rb").read()},
    {'display_name': 'test3.txt', 'blob': open('./test_data/test3.txt',"rb").read()}
]
dataset.upload_documents(documents)
documents = dataset.list_documents(keywords="test")
ids = []
for document in documents:
    ids.append(document.id)
dataset.async_parse_documents(ids)
print("Async bulk parsing initiated.")
dataset.async_cancel_parse_documents(ids)
print("Async bulk parsing cancelled.")
```

---

## CHUNK MANAGEMENT WITHIN DATASET

---

### Add chunk

```python
Document.add_chunk(content: str, important_keywords: list[str] = []) -> Chunk
```

Adds a chunk to the current document.

#### Parameters

##### content: `str`, *Required*

The text content of the chunk.

##### important_keywords: `list[str]`

The key terms or phrases to tag with the chunk.

#### Returns

- Success: A `Chunk` object.
- Failure: `Exception`.

A `Chunk` object contains the following attributes:

- `id`: `str` The chunk ID.
- `content`: `str` The text content of the chunk.
- `important_keywords`: `list[str]` A list of key terms or phrases tagged with the chunk.
- `create_time`: `str` The time when the chunk was created (added to the document).
- `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
- `dataset_id`: `str` The ID of the associated dataset.
- `document_name`: `str` The name of the associated document.
- `document_id`: `str` The ID of the associated document.
- `available`: `bool` The chunk's availability status in the dataset. Value options:
  - `False`: Unavailable
  - `True`: Available (default)

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
datasets = rag_object.list_datasets(id="123")
dataset = datasets[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
```

---

### List chunks

```python
Document.list_chunks(keywords: str = None, page: int = 1, page_size: int = 30, id: str = None) -> list[Chunk]
```

Lists chunks in the current document.

#### Parameters

##### keywords: `str`

The keywords used to match chunk content. Defaults to `None`.

##### page: `int`

Specifies the page on which the chunks will be displayed. Defaults to `1`.

##### page_size: `int`

The maximum number of chunks on each page.
Defaults to `30`.

##### id: `str`

The ID of the chunk to retrieve. Default: `None`

#### Returns

- Success: A list of `Chunk` objects.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
docs = dataset.list_documents(keywords="test", page=1, page_size=12)
for chunk in docs[0].list_chunks(keywords="rag", page=1, page_size=12):
    print(chunk)
```

---

### Delete chunks

```python
Document.delete_chunks(chunk_ids: list[str])
```

Deletes chunks by ID.

#### Parameters

##### chunk_ids: `list[str]`

The IDs of the chunks to delete. Defaults to `None`. If it is not specified, all chunks of the current document will be deleted.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
doc.delete_chunks(["id_1","id_2"])
```

---

### Update chunk

```python
Chunk.update(update_message: dict)
```

Updates content or configurations for the current chunk.

#### Parameters

##### update_message: `dict[str, str|list[str]|int]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"content"`: `str` The text content of the chunk.
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `"available"`: `bool` The chunk's availability status in the dataset. Value options:
  - `False`: Unavailable
  - `True`: Available (default)

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
chunk.update({"content":"sdfx..."})
```

---

### Retrieve chunks

```python
RAGFlow.retrieve(
    question: str = "",
    dataset_ids: list[str] = None,
    document_ids: list[str] = None,
    page: int = 1,
    page_size: int = 30,
    similarity_threshold: float = 0.2,
    vector_similarity_weight: float = 0.3,
    top_k: int = 1024,
    rerank_id: str = None,
    keyword: bool = False,
    cross_languages: list[str] = None,
    metadata_condition: dict = None
) -> list[Chunk]
```

Retrieves chunks from specified datasets.

#### Parameters

##### question: `str`, *Required*

The user query or query keywords. Defaults to `""`.

##### dataset_ids: `list[str]`, *Required*

The IDs of the datasets to search. Defaults to `None`.

##### document_ids: `list[str]`

The IDs of the documents to search. Defaults to `None`. You must ensure all selected documents use the same embedding model. Otherwise, an error will occur.

##### page: `int`

The starting index for the documents to retrieve. Defaults to `1`.

##### page_size: `int`

The maximum number of chunks to retrieve. Defaults to `30`.

##### similarity_threshold: `float`

The minimum similarity score. Defaults to `0.2`.

##### vector_similarity_weight: `float`

The weight of vector cosine similarity. Defaults to `0.3`. If x represents the vector cosine similarity, then (1 - x) is the term similarity weight.

##### top_k: `int`

The number of chunks engaged in vector cosine computation. Defaults to `1024`.

##### rerank_id: `str`

The ID of the rerank model. Defaults to `None`.
##### keyword: `bool`

Indicates whether to enable keyword-based matching:

- `True`: Enable keyword-based matching.
- `False`: Disable keyword-based matching (default).

##### cross_languages: `list[str]`

The target languages into which the question should be translated, enabling keyword retrieval across languages.

##### metadata_condition: `dict`

The filter condition on `meta_fields`.

#### Returns

- Success: A list of `Chunk` objects representing the document chunks.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
dataset = rag_object.list_datasets(name="ragflow")
dataset = dataset[0]
name = 'ragflow_test.txt'
path = './test_data/ragflow_test.txt'
documents = [{"display_name": "test_retrieve_chunks.txt", "blob": open(path, "rb").read()}]
docs = dataset.upload_documents(documents)
doc = docs[0]
doc.add_chunk(content="This is a chunk addition test")
for c in rag_object.retrieve(question="chunk addition", dataset_ids=[dataset.id], document_ids=[doc.id]):
    print(c)
```

---

## CHAT ASSISTANT MANAGEMENT

---

### Create chat assistant

```python
RAGFlow.create_chat(
    name: str,
    avatar: str = "",
    dataset_ids: list[str] = [],
    llm: Chat.LLM = None,
    prompt: Chat.Prompt = None
) -> Chat
```

Creates a chat assistant.

#### Parameters

##### name: `str`, *Required*

The name of the chat assistant.

##### avatar: `str`

Base64 encoding of the avatar. Defaults to `""`.

##### dataset_ids: `list[str]`

The IDs of the associated datasets. Defaults to `[]`.

##### llm: `Chat.LLM`

The LLM settings for the chat assistant to create. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.

An `LLM` object contains the following attributes:

- `model_name`: `str` The chat model name. If it is `None`, the user's default chat model will be used.
- `temperature`: `float` Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses. Defaults to `0.1`.
- `top_p`: `float` Also known as "nucleus sampling", this parameter sets a threshold to select a smaller set of words to sample from. It focuses on the most likely words, cutting off the less probable ones. Defaults to `0.3`.
- `presence_penalty`: `float` This discourages the model from repeating the same information by penalizing words that have already appeared in the conversation. Defaults to `0.2`.
- `frequency_penalty`: `float` Similar to the presence penalty, this reduces the model's tendency to repeat the same words frequently. Defaults to `0.7`.

##### prompt: `Chat.Prompt`

Instructions for the LLM to follow. A `Prompt` object contains the following attributes:

- `similarity_threshold`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted reranking score during retrieval. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
- `keywords_similarity_weight`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`.
- `top_n`: `int` This argument specifies the number of top chunks with similarity scores above the `similarity_threshold` that are fed to the LLM. The LLM will *only* access these 'top N' chunks. The default value is `8`.
- `variables`: `list[dict[]]` This argument lists the variables to use in the 'System' field of **Chat Configurations**. Note that:
  - `knowledge` is a reserved variable, which represents the retrieved chunks.
  - All the variables in 'System' should be curly bracketed.
  - The default value is `[{"key": "knowledge", "optional": True}]`.
- `rerank_model`: `str` If it is not specified, vector cosine similarity will be used; otherwise, reranking score will be used. Defaults to `""`.
- `top_k`: `int` The number of chunks engaged in vector cosine computation. Defaults to `1024`.
- `empty_response`: `str` If nothing is retrieved in the dataset for the user's question, this will be used as the response. To allow the LLM to improvise when nothing is found, leave this blank. Defaults to `None`.
- `opener`: `str` The opening greeting for the user. Defaults to `"Hi! I am your assistant, can I help you?"`.
- `show_quote`: `bool` Indicates whether the source of text should be displayed. Defaults to `True`.
- `prompt`: `str` The prompt content.

#### Returns

- Success: A `Chat` object representing the chat assistant.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
datasets = rag_object.list_datasets(name="kb_1")
dataset_ids = []
for dataset in datasets:
    dataset_ids.append(dataset.id)
assistant = rag_object.create_chat("Miss R", dataset_ids=dataset_ids)
```

---

### Update chat assistant

```python
Chat.update(update_message: dict)
```

Updates configurations for the current chat assistant.

#### Parameters

##### update_message: `dict[str, str|list[str]|dict[]]`, *Required*

A dictionary representing the attributes to update, with the following keys:

- `"name"`: `str` The revised name of the chat assistant.
- `"avatar"`: `str` Base64 encoding of the avatar. Defaults to `""`.
- `"dataset_ids"`: `list[str]` The datasets to update.
- `"llm"`: `dict` The LLM settings:
  - `"model_name"`: `str` The chat model name.
  - `"temperature"`: `float` Controls the randomness of the model's predictions. A lower temperature results in more conservative responses, while a higher temperature yields more creative and diverse responses.
  - `"top_p"`: `float` Also known as "nucleus sampling", this parameter sets a threshold to select a smaller set of words to sample from.
  - `"presence_penalty"`: `float` This discourages the model from repeating the same information by penalizing words that have appeared in the conversation.
  - `"frequency_penalty"`: `float` Similar to the presence penalty, this reduces the model's tendency to repeat the same words.
- `"prompt"`: Instructions for the LLM to follow.
  - `"similarity_threshold"`: `float` RAGFlow employs either a combination of weighted keyword similarity and weighted vector cosine similarity, or a combination of weighted keyword similarity and weighted reranking score during retrieval. This argument sets the threshold for similarities between the user query and chunks. If a similarity score falls below this threshold, the corresponding chunk will be excluded from the results. The default value is `0.2`.
- `"keywords_similarity_weight"`: `float` This argument sets the weight of keyword similarity in the hybrid similarity score with vector cosine similarity or reranking model similarity. By adjusting this weight, you can control the influence of keyword similarity in relation to other similarity measures. The default value is `0.7`. - `"top_n"`: `int` This argument specifies the number of top chunks with similarity scores above the `similarity_threshold` that are fed to the LLM. The LLM will *only* access these 'top N' chunks. The default value is `8`. - `"variables"`: `list[dict[]]` This argument lists the variables to use in the 'System' field of **Chat Configurations**. Note that: - `knowledge` is a reserved variable, which represents the retrieved chunks. - All the variables in 'System' should be curly bracketed. - The default value is `[{"key": "knowledge", "optional": True}]`. - `"rerank_model"`: `str` If it is not specified, vector cosine similarity will be used; otherwise, reranking score will be used. Defaults to `""`. - `"empty_response"`: `str` If nothing is retrieved in the dataset for the user's question, this will be used as the response. To allow the LLM to improvise when nothing is retrieved, leave this blank. Defaults to `None`. - `"opener"`: `str` The opening greeting for the user. Defaults to `"Hi! I am your assistant, can I help you?"`. - `"show_quote`: `bool` Indicates whether the source of text should be displayed Defaults to `True`. - `"prompt"`: `str` The prompt content. #### Returns - Success: No value is returned. - Failure: `Exception` #### Examples ```python from ragflow_sdk import RAGFlow rag_object = RAGFlow(api_key="", base_url="http://:9380") datasets = rag_object.list_datasets(name="kb_1") dataset_id = datasets[0].id assistant = rag_object.create_chat("Miss R", dataset_ids=[dataset_id]) assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top_n": 8}}) ``` --- ### Delete chat assistants ```python RAGFlow.delete_chats(ids: list[str] = None) ``` Deletes chat assistants by ID. #### Parameters ##### ids: `list[str]` The IDs of the chat assistants to delete. Defaults to `None`. If it is empty or not specified, all chat assistants in the system will be deleted. #### Returns - Success: No value is returned. - Failure: `Exception` #### Examples ```python from ragflow_sdk import RAGFlow rag_object = RAGFlow(api_key="", base_url="http://:9380") rag_object.delete_chats(ids=["id_1","id_2"]) ``` --- ### List chat assistants ```python RAGFlow.list_chats( page: int = 1, page_size: int = 30, orderby: str = "create_time", desc: bool = True, id: str = None, name: str = None ) -> list[Chat] ``` Lists chat assistants. #### Parameters ##### page: `int` Specifies the page on which the chat assistants will be displayed. Defaults to `1`. ##### page_size: `int` The number of chat assistants on each page. Defaults to `30`. ##### orderby: `str` The attribute by which the results are sorted. Available options: - `"create_time"` (default) - `"update_time"` ##### desc: `bool` Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`. ##### id: `str` The ID of the chat assistant to retrieve. Defaults to `None`. ##### name: `str` The name of the chat assistant to retrieve. Defaults to `None`. #### Returns - Success: A list of `Chat` objects. - Failure: `Exception`. 
#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
for assistant in rag_object.list_chats():
    print(assistant)
```

---

## SESSION MANAGEMENT

---

### Create session with chat assistant

```python
Chat.create_session(name: str = "New session") -> Session
```

Creates a session with the current chat assistant.

#### Parameters

##### name: `str`

The name of the chat session to create.

#### Returns

- Success: A `Session` object containing the following attributes:
  - `id`: `str` The auto-generated unique identifier of the created session.
  - `name`: `str` The name of the created session.
  - `message`: `list[Message]` The opening message of the created session. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
  - `chat_id`: `str` The ID of the associated chat assistant.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()
```

---

### Update chat assistant's session

```python
Session.update(update_message: dict)
```

Updates the current session of the current chat assistant.

#### Parameters

##### update_message: `dict[str, Any]`, *Required*

A dictionary representing the attributes to update, with only one key:

- `"name"`: `str` The revised name of the session.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session("session_name")
session.update({"name": "updated_name"})
```

---

### List chat assistant's sessions

```python
Chat.list_sessions(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    name: str = None
) -> list[Session]
```

Lists sessions associated with the current chat assistant.

#### Parameters

##### page: `int`

Specifies the page on which the sessions will be displayed. Defaults to `1`.

##### page_size: `int`

The number of sessions on each page. Defaults to `30`.

##### orderby: `str`

The field by which sessions should be sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the chat session to retrieve. Defaults to `None`.

##### name: `str`

The name of the chat session to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Session` objects associated with the current chat assistant.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="", base_url="http://:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
for session in assistant.list_sessions():
    print(session)
```

---

### Delete chat assistant's sessions

```python
Chat.delete_sessions(ids: list[str] = None)
```

Deletes sessions of the current chat assistant by ID.

#### Parameters

##### ids: `list[str]`

The IDs of the sessions to delete. Defaults to `None`. If it is not specified, all sessions associated with the current chat assistant will be deleted.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
assistant.delete_sessions(ids=["id_1", "id_2"])
```

---

### Converse with chat assistant

```python
Session.ask(question: str = "", stream: bool = True, **kwargs) -> Optional[Message, iter[Message]]
```

Asks a specified chat assistant a question to start an AI-powered conversation.

:::tip NOTE
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
:::

#### Parameters

##### question: `str`, *Required*

The question to start an AI-powered conversation. Defaults to `""`.

##### stream: `bool`

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming (default).
- `False`: Disable streaming.

##### **kwargs

Values for the variables defined in the 'System' field of the chat assistant. Each keyword argument must match a key configured in the `"variables"` setting; see the sketch after the example below.

#### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `message` objects (`iter[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

##### id: `str`

The auto-generated message ID.

##### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

##### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str` The chunk ID.
- `content` `str` The content of the chunk.
- `img_id` `str` The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
- `document_id` `str` The ID of the referenced document.
- `document_name` `str` The name of the referenced document.
- `position` `list[str]` The location information of the chunk within the referenced document.
- `dataset_id` `str` The ID of the dataset to which the referenced document belongs.
- `similarity` `float` A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
- `vector_similarity` `float` A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float` A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()

print("\n==================== Miss R =====================\n")
print("Hello. What can I do for you?")

while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")

    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
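As a complement to the interactive loop above, here is a minimal sketch of passing a system-prompt variable through `**kwargs`. It assumes the assistant's `"variables"` setting includes a hypothetical `user_name` key referenced as `{user_name}` in the 'System' field:

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")[0]
session = assistant.create_session()

# "user_name" is a hypothetical variable assumed to be present in the
# assistant's "variables" setting and referenced as {user_name} in 'System'.
reply = session.ask("Summarize my open tickets.", stream=False, user_name="Alice")
print(reply.content)
```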
---

### Create session with agent

```python
Agent.create_session(**kwargs) -> Session
```

Creates a session with the current agent.

#### Parameters

##### **kwargs

Values for the parameters defined in the agent's **Begin** component (see the sketch after the conversation example below).

#### Returns

- Success: A `Session` object containing the following attributes:
  - `id`: `str` The auto-generated unique identifier of the created session.
  - `message`: `list[Message]` The opening message of the created session. Default: `[{"role": "assistant", "content": "Hi! I am your assistant, can I help you?"}]`
  - `agent_id`: `str` The ID of the associated agent.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow, Agent

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
session = agent.create_session()
```

---

### Converse with agent

```python
Session.ask(question: str = "", stream: bool = True) -> Optional[Message, iter[Message]]
```

Asks a specified agent a question to start an AI-powered conversation.

:::tip NOTE
In streaming mode, not all responses include a reference, as this depends on the system's judgement.
:::

#### Parameters

##### question: `str`

The question to start an AI-powered conversation. If the **Begin** component takes parameters, a question is not required.

##### stream: `bool`

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming (default).
- `False`: Disable streaming.

#### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `message` objects (`iter[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

##### id: `str`

The auto-generated message ID.

##### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

##### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str` The chunk ID.
- `content` `str` The content of the chunk.
- `image_id` `str` The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
- `document_id` `str` The ID of the referenced document.
- `document_name` `str` The name of the referenced document.
- `position` `list[str]` The location information of the chunk within the referenced document.
- `dataset_id` `str` The ID of the dataset to which the referenced document belongs.
- `similarity` `float` A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
- `vector_similarity` `float` A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float` A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.

#### Examples

```python
from ragflow_sdk import RAGFlow, Agent

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
session = agent.create_session()

print("\n===== Miss R ====\n")
print("Hello. What can I do for you?")

while True:
    question = input("\n===== User ====\n> ")
    print("\n==== Miss R ====\n")

    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
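The examples above do not show supplying **Begin** component inputs. The following is a minimal, hedged sketch, assuming the agent's **Begin** component defines a hypothetical `lang` parameter:

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent = rag_object.list_agents(id="AGENT_ID")[0]

# "lang" is a hypothetical Begin-component parameter; use the parameter
# names actually defined on your agent's Begin component.
session = agent.create_session(lang="English")
reply = session.ask("", stream=False)  # no question needed when Begin takes parameters
print(reply.content)
```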
---

### List agent sessions

```python
Agent.list_sessions(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "update_time",
    desc: bool = True,
    id: str = None
) -> List[Session]
```

Lists sessions associated with the current agent.

#### Parameters

##### page: `int`

Specifies the page on which the sessions will be displayed. Defaults to `1`.

##### page_size: `int`

The number of sessions on each page. Defaults to `30`.

##### orderby: `str`

The field by which sessions should be sorted. Available options:

- `"create_time"`
- `"update_time"` (default)

##### desc: `bool`

Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the agent session to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Session` objects associated with the current agent.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
sessions = agent.list_sessions()
for session in sessions:
    print(session)
```

---

### Delete agent's sessions

```python
Agent.delete_sessions(ids: list[str] = None)
```

Deletes sessions of an agent by ID.

#### Parameters

##### ids: `list[str]`

The IDs of the sessions to delete. Defaults to `None`. If it is not specified, all sessions associated with the agent will be deleted.

#### Returns

- Success: No value is returned.
- Failure: `Exception`

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
agent_id = "AGENT_ID"
agent = rag_object.list_agents(id=agent_id)[0]
agent.delete_sessions(ids=["id_1", "id_2"])
```

---

## AGENT MANAGEMENT

---

### List agents

```python
RAGFlow.list_agents(
    page: int = 1,
    page_size: int = 30,
    orderby: str = "create_time",
    desc: bool = True,
    id: str = None,
    title: str = None
) -> List[Agent]
```

Lists agents.

#### Parameters

##### page: `int`

Specifies the page on which the agents will be displayed. Defaults to `1`.

##### page_size: `int`

The number of agents on each page. Defaults to `30`.

##### orderby: `str`

The attribute by which the results are sorted. Available options:

- `"create_time"` (default)
- `"update_time"`

##### desc: `bool`

Indicates whether the retrieved agents should be sorted in descending order. Defaults to `True`.

##### id: `str`

The ID of the agent to retrieve. Defaults to `None`.

##### title: `str`

The title of the agent to retrieve. Defaults to `None`.

#### Returns

- Success: A list of `Agent` objects.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
for agent in rag_object.list_agents():
    print(agent)
```

---

### Create agent

```python
RAGFlow.create_agent(
    title: str,
    dsl: dict,
    description: str | None = None
) -> None
```

Creates an agent.

#### Parameters

##### title: `str`

Specifies the title of the agent.

##### dsl: `dict`

Specifies the canvas DSL of the agent.

##### description: `str`

The description of the agent. Defaults to `None`.

#### Returns

- Success: Nothing.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.create_agent(
    title="Test Agent",
    description="A test agent",
    dsl={
        # ... canvas DSL here ...
    }
)
```

---

### Update agent

```python
RAGFlow.update_agent(
    agent_id: str,
    title: str | None = None,
    description: str | None = None,
    dsl: dict | None = None
) -> None
```

Updates an agent.

#### Parameters

##### agent_id: `str`

Specifies the ID of the agent to be updated.

##### title: `str`

Specifies the new title of the agent. `None` if you do not want to update this.

##### dsl: `dict`

Specifies the new canvas DSL of the agent. `None` if you do not want to update this.
##### description: `str`

The new description of the agent. `None` if you do not want to update this.

#### Returns

- Success: Nothing.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.update_agent(
    agent_id="58af890a2a8911f0a71a11b922ed82d6",
    title="Test Agent",
    description="A test agent",
    dsl={
        # ... canvas DSL here ...
    }
)
```

---

### Delete agent

```python
RAGFlow.delete_agent(
    agent_id: str
) -> None
```

Deletes an agent.

#### Parameters

##### agent_id: `str`

Specifies the ID of the agent to be deleted.

#### Returns

- Success: Nothing.
- Failure: `Exception`.

#### Examples

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.delete_agent("58af890a2a8911f0a71a11b922ed82d6")
```

---

---
sidebar_position: 2
slug: /release_notes
---

# Releases

Key features, improvements and bug fixes in the latest releases.

## v0.23.1

Released on December 31, 2025.

### Improvements

- Memory: Enhances the stability of memory extraction when all memory types are selected.
- RAG: Refines the context window extraction strategy for images and tables.

### Fixed issues

- Memory:
  - The RAGFlow server failed to start if an empty memory object existed.
  - Unable to delete a newly created empty Memory.
- RAG: MDX file parsing was not supported.

### Data sources

- GitHub
- GitLab
- Asana
- IMAP

## v0.23.0

Released on December 27, 2025.

### New features

- Memory
  - Implements a **Memory** interface for managing memory.
  - Supports configuring context via the **Retrieval** or **Message** component.
- Agent
  - Improves the **Agent** component's performance by refactoring the underlying architecture.
  - The **Agent** component can now output structured data for use in downstream components.
  - Supports using webhook to trigger agent execution.
  - Supports voice input/output.
  - Supports configuring multiple **Retrieval** components per **Agent** component.
- Ingestion pipeline
  - Supports extracting table of contents in the **Transformer** component to improve long-context RAG performance.
- Dataset
  - Supports configuring context window for images and tables.
  - Introduces parent-child chunking strategy.
  - Supports auto-generation of metadata during file parsing.
- Chat: Supports voice input.

### Improvements

- RAG: Accelerates GraphRAG generation significantly.
- Bumps RAGFlow's document engine, [Infinity](https://github.com/infiniflow/infinity), to v0.6.15 (backward compatible).

### Data sources

- Google Cloud Storage
- Gmail
- Dropbox
- WebDAV
- Airtable

### Model support

- GPT-5.2
- GPT-5.2 Pro
- GPT-5.1
- GPT-5.1 Instant
- Claude Opus 4.5
- MiniMax M2
- GLM-4.7
- A MinerU configuration interface.
- AI Badgr (model provider).

### API changes

#### HTTP API

- [Converse with Agent](./references/http_api_reference.md#converse-with-agent) returns complete execution trace logs.
- [Create chat completion](./references/http_api_reference.md#create-chat-completion) supports metadata-based filtering.
- [Converse with chat assistant](./references/http_api_reference.md#converse-with-chat-assistant) supports metadata-based filtering.

## v0.22.1

Released on November 19, 2025.

### Improvements

- Agent:
  - Supports exporting Agent outputs in Word or Markdown formats.
  - Adds a **List operations** component.
  - Adds a **Variable aggregator** component.
- Data sources:
  - Supports S3-compatible data sources, e.g., MinIO.
  - Adds data synchronization with JIRA.
- Continues the redesign of the **Profile** page layouts.
- Upgrades the Flask web framework from synchronous to asynchronous, increasing concurrency and preventing blocking issues caused by requests to upstream LLM services.

### Fixed issues

- A v0.22.0 issue: Users failed to parse uploaded files or switch the embedding model in a dataset containing parsed files using a built-in model from a `-full` RAGFlow edition.
- An image concatenation issue in Word documents. [#11310](https://github.com/infiniflow/ragflow/pull/11310)
- Mixed images and text were not correctly displayed in the chat history.

### Newly supported models

- Gemini 3 Pro Preview

## v0.22.0

Released on November 12, 2025.

### Breaking Changes

:::danger IMPORTANT
From this release onwards, we ship only the slim edition of the Docker image (without built-in embedding models) and no longer append the `-slim` suffix to the image tag.
:::

### New Features

- Dataset:
  - Supports data synchronization from five online sources (AWS S3, Google Drive, Notion, Confluence, and Discord).
  - RAPTOR can be built across an entire dataset or on individual documents.
- Ingestion pipeline: Supports [Docling document parsing](https://github.com/docling-project/docling) in the **Parser** component.
- Launches a new administrative Web UI dashboard for graphical user management and service status monitoring.
- Agent:
  - Supports structured output.
  - Supports metadata filtering in the **Retrieval** component.
  - Introduces a **Variable aggregator** component with data operation and session variable definition capabilities.

### Improvements

- Agent: Supports visualizing previous components' outputs in the **Await Response** component.
- Revamps the model provider page.
- Upgrades RAGFlow's document engine Infinity to v0.6.5.

### Added Models

- Kimi-K2-Thinking

### New agent templates

- Interactive Agent: Incorporates real-time user feedback to dynamically optimize Agent output.

## v0.21.1

Released on October 23, 2025.

### New features

- Experimental: Adds support for PDF document parsing using MinerU. See [here](./faq.mdx#how-to-use-mineru-to-parse-pdf-documents).

### Improvements

- Enhances UI/UX for the dataset and personal center pages.
- Upgrades RAGFlow's document engine, [Infinity](https://github.com/infiniflow/infinity), to v0.6.1.

### Fixed issues

- An issue with video parsing.

## v0.21.0

Released on October 15, 2025.

### New features

- Orchestratable ingestion pipeline: Supports customized data ingestion and cleansing workflows, enabling users to flexibly design their data flows or directly apply the official data flow templates on the canvas.
- GraphRAG & RAPTOR write process optimized: Replaces the automatic incremental build process with manual batch building, significantly reducing construction overhead.
- Long-context RAG: Automatically generates document-level table of contents (TOC) structures to mitigate context loss caused by inaccurate or excessive chunking, substantially improving retrieval quality. This feature is now available via a TOC extraction template. See [here](./guides/dataset/extract_table_of_contents.md).
- Video file parsing: Expands the system's multimodal data processing capabilities by supporting video file parsing.
- Admin CLI: Introduces a new command-line tool for system administration, allowing users to manage and monitor RAGFlow's service status via command line.

### Improvements

- Redesigns RAGFlow's Login and Registration pages.
- Upgrades RAGFlow's document engine Infinity to v0.6.0.
### Newly supported models

- Tongyi Qwen 3 series
- Claude Sonnet 4.5
- Meituan LongCat-Flash-Thinking

### New agent templates

- Company Research Report Deep Dive Agent: Designed for financial institutions to help analysts quickly organize information, generate research reports, and make investment decisions.
- Orchestratable Ingestion Pipeline Template: Allows users to apply this template on the canvas to rapidly establish standardized data ingestion and cleansing processes.

## v0.20.5

Released on September 10, 2025.

### Improvements

- Agent:
  - Agent performance optimized: Improves planning and reflection speed for simple tasks; optimizes concurrent tool calls for parallelizable scenarios, significantly reducing overall response time.
  - Four framework-level prompt blocks are available in the **System prompt** section, enabling customization and overriding of prompts at the framework level, thereby enhancing flexibility and control. See [here](./guides/agent/agent_component_reference/agent.mdx#system-prompt).
  - **Execute SQL** component enhanced: Replaces the original variable reference component with a text input field, allowing users to write free-form SQL queries and reference variables. See [here](./guides/agent/agent_component_reference/execute_sql.md).
- Chat: Re-enables **Reasoning** and **Cross-language search**.

### Newly supported models

- Meituan LongCat
- Kimi: kimi-k2-turbo-preview and kimi-k2-0905-preview
- Qwen: qwen3-max-preview
- SiliconFlow: DeepSeek V3.1

### Fixed issues

- Dataset: Deleted files remained searchable.
- Chat: Unable to chat with an Ollama model.
- Agent:
  - A **Cite** toggle failure.
  - An Agent in task mode still required a dialogue to trigger.
  - Repeated answers in multi-turn dialogues.
  - Duplicate summarization of parallel execution results.

### API changes

#### HTTP APIs

- Adds a body parameter `"metadata_condition"` to the [Retrieve chunks](./references/http_api_reference.md#retrieve-chunks) method, enabling metadata-based chunk filtering during retrieval. [#9877](https://github.com/infiniflow/ragflow/pull/9877)

#### Python APIs

- Adds a parameter `metadata_condition` to the [Retrieve chunks](./references/python_api_reference.md#retrieve-chunks) method, enabling metadata-based chunk filtering during retrieval. [#9877](https://github.com/infiniflow/ragflow/pull/9877)

## v0.20.4

Released on August 27, 2025.

### Improvements

- Agent component: Completes Chinese localization for the Agent component.
- Introduces the `ENABLE_TIMEOUT_ASSERTION` environment variable to enable or disable timeout assertions for file parsing tasks.
- Dataset:
  - Improves Markdown file parsing, with AST support to avoid unintended chunking.
  - Enhances HTML parsing, supporting bs4-based HTML tag traversal.

### Newly supported models

- ZHIPU GLM-4.5

### New Agent templates

- Ecommerce Customer Service Workflow: A template designed to handle enquiries about product features and multi-product comparisons using the internal dataset, as well as to manage installation appointment bookings.

### Fixed issues

- Dataset:
  - Unable to share resources with the team.
  - Inappropriate restrictions on the number and size of uploaded files.
- Chat:
  - Unable to preview referenced files in responses.
  - Unable to send out messages after file uploads.
- An OAuth2 authentication failure.
- A logical error in multi-conditioned metadata searches within a dataset.
- Citations infinitely increased in multi-turn conversations.

## v0.20.3

Released on August 20, 2025.
### Improvements

- Revamps the user interface for the **Datasets**, **Chat**, and **Search** pages.
- Search and Chat: Introduces document-level metadata filtering, allowing automatic or manual filtering during chats or searches.
- Search: Supports creating search apps tailored to various business scenarios.
- Chat: Supports comparing answer performance of up to three chat model settings on a single **Chat** page.
- Agent:
  - Implements a toggle in the **Agent** component to enable or disable citation.
  - Introduces a drag-and-drop method for creating components.
- Documentation: Corrects inaccuracies in the API reference.

### New Agent templates

- Report Agent: A template for generating summary reports in internal question-answering scenarios, supporting the display of tables and formulae. [#9427](https://github.com/infiniflow/ragflow/pull/9427)

### Fixed issues

- The timeout mechanism introduced in v0.20.0 caused tasks like GraphRAG to halt.
- Predefined opening greeting in the **Agent** component was missing during conversations.
- An automatic line break issue in the prompt editor.
- A memory leak issue caused by PyPDF. [#9469](https://github.com/infiniflow/ragflow/pull/9469)

### API changes

#### Deprecated

- [Create session with agent](./references/http_api_reference.md#create-session-with-agent)

## v0.20.1

Released on August 8, 2025.

### New Features

- The **Retrieval** component now supports the dynamic specification of dataset names using variables.
- The user interface now includes a French language option.

### Newly supported models

- GPT-5
- Claude 4.1

### New agent templates (both workflow and agentic)

- SQL Assistant Workflow: Empowers non-technical teams (e.g., operations, product) to independently query business data.
- Choose Your Knowledge Base Workflow: Lets users select a dataset to query during conversations. [#9325](https://github.com/infiniflow/ragflow/pull/9325)
- Choose Your Knowledge Base Agent: Delivers higher-quality responses with extended reasoning time, suited for complex queries. [#9325](https://github.com/infiniflow/ragflow/pull/9325)

### Fixed Issues

- The **Agent** component was unable to invoke models installed via vLLM.
- Agents could not be shared with the team.
- Embedding an Agent into a webpage was not functioning properly.

## v0.20.0

Released on August 4, 2025.

### Compatibility changes

From v0.20.0 onwards, Agents are no longer compatible with earlier versions, and all existing Agents from previous versions must be rebuilt following the upgrade.

### New features

- Unified orchestration of both Agents and Workflows.
- A comprehensive refactor of the Agent, greatly enhancing its capabilities and usability, with support for Multi-Agent configurations, planning and reflection, and visual functionalities.
- Fully implemented MCP functionality, allowing for MCP Server import, Agents functioning as MCP Clients, and RAGFlow itself operating as an MCP Server.
- Access to runtime logs for Agents.
- Chat histories with Agents available through the management panel.
- Integration of a new, more robust version of Infinity, enabling the auto-tagging functionality with Infinity as the underlying document engine.
- An OpenAI-compatible API that supports file reference information.
- Support for new models, including Kimi K2, Grok 4, and Voyage embedding.
- RAGFlow's codebase is now mirrored on Gitee.
- Introduction of a new model provider, Gitee AI.
### New agent templates introduced

- Multi-Agent based Deep Research: Collaborative Agent teamwork led by a Lead Agent with multiple Subagents, distinct from traditional workflow orchestration.
- An intelligent Q&A chatbot leveraging internal datasets, designed for customer service and training scenarios.
- A resume analysis template used by the RAGFlow team to screen, analyze, and record candidate information.
- A blog generation workflow that transforms raw ideas into SEO-friendly blog content.
- An intelligent customer service workflow.
- A user feedback analysis template that directs user feedback to appropriate teams through semantic analysis.
- Trip Planner: Uses web search and map MCP servers to assist with travel planning.
- Image Lingo: Translates content from uploaded photos.
- An information search assistant that retrieves answers from both internal datasets and the web.

## v0.19.1

Released on June 23, 2025.

### Fixed issues

- A memory leak issue during high-concurrency requests.
- Large file parsing freezes when GraphRAG entity resolution is enabled. [#8223](https://github.com/infiniflow/ragflow/pull/8223)
- A context error occurring when using Sandbox in standalone mode. [#8340](https://github.com/infiniflow/ragflow/pull/8340)
- An excessive CPU usage issue caused by Ollama. [#8216](https://github.com/infiniflow/ragflow/pull/8216)
- A bug in the Code Component. [#7949](https://github.com/infiniflow/ragflow/pull/7949)
- Added support for models installed via Ollama or VLLM when creating a dataset through the API. [#8069](https://github.com/infiniflow/ragflow/pull/8069)
- Enabled role-based authentication for S3 bucket access. [#8149](https://github.com/infiniflow/ragflow/pull/8149)

### Newly supported models

- Qwen 3 Embedding. [#8184](https://github.com/infiniflow/ragflow/pull/8184)
- Voyage Multimodal 3. [#7987](https://github.com/infiniflow/ragflow/pull/7987)

## v0.19.0

Released on May 26, 2025.

### New features

- [Cross-language search](./references/glossary.mdx#cross-language-search) is supported in the Knowledge and Chat modules, enhancing search accuracy and user experience in multilingual environments, such as in Chinese-English datasets.
- Agent component: A new Code component supports Python and JavaScript scripts, enabling developers to handle more complex tasks like dynamic data processing.
- Enhanced image display: Images in Chat and Search now render directly within responses, rather than as external references. Knowledge retrieval testing can retrieve images directly, instead of texts extracted from images.
- Claude 4 and ChatGPT o3: Developers can now use the newly released, most advanced Claude model and OpenAI's latest ChatGPT o3 inference model.

> The following features have been contributed by our community:

- Agent component: Enables tool calling within the Generate Component. Thanks to [notsyncing](https://github.com/notsyncing).
- Markdown rendering: Image references in a markdown file can be displayed after chunking. Thanks to [Woody-Hu](https://github.com/Woody-Hu).
- Document engine support: OpenSearch can now be used as RAGFlow's document engine. Thanks to [pyyuhao](https://github.com/pyyuhao).

### Documentation

#### Added documents

- [Select PDF parser](./guides/dataset/select_pdf_parser.md)
- [Enable Excel2HTML](./guides/dataset/enable_excel2html.md)
- [Code component](./guides/agent/agent_component_reference/code.mdx)

## v0.18.0

Released on April 23, 2025.
### Compatibility changes

From this release onwards, built-in rerank models have been removed because they have minimal impact on retrieval rates but significantly increase retrieval time.

### New features

- MCP server: Enables access to RAGFlow's datasets via MCP.
- DeepDoc: Supports adopting a VLM as the processing pipeline during document layout recognition, enabling in-depth analysis of images in PDF and DOCX files.
- OpenAI-compatible APIs: Agents can be called via OpenAI-compatible APIs.
- User registration control: Administrators can enable or disable user registration through an environment variable.
- Team collaboration: Agents can be shared with team members.
- Agent version control: All updates are continuously logged and can be rolled back to a previous version via export.

![export_agent](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/export_agent_as_json.jpg)

### Improvements

- Enhanced answer referencing: Citation accuracy in generated responses is improved.
- Enhanced question-answering experience: Users can now manually stop streaming output during a conversation.

### Documentation

#### Added documents

- [Set page rank](./guides/dataset/set_page_rank.md)
- [Enable RAPTOR](./guides/dataset/enable_raptor.md)
- [Set variables for your chat assistant](./guides/chat/set_chat_variables.md)
- [Launch RAGFlow MCP server](./develop/mcp/launch_mcp_server.md)

## v0.17.2

Released on March 13, 2025.

### Compatibility changes

- Removes the **Max_tokens** setting from **Chat configuration**.
- Removes the **Max_tokens** setting from **Generate**, **Rewrite**, **Categorize**, **Keyword** agent components.

From this release onwards, if you still see RAGFlow's responses being cut short or truncated, check the **Max_tokens** setting of your model provider.

### Improvements

- Adds OpenAI-compatible APIs.
- Introduces a German user interface.
- Accelerates knowledge graph extraction.
- Enables Tavily-based web search in the **Retrieval** agent component.
- Adds Tongyi-Qianwen QwQ models (OpenAI-compatible).
- Supports CSV files in the **General** chunking method.

### Fixed issues

- Unable to add models via Ollama/Xinference, an issue introduced in v0.17.1.

### API changes

#### HTTP APIs

- [Create chat completion](./references/http_api_reference.md#openai-compatible-api)

#### Python APIs

- [Create chat completion](./references/python_api_reference.md#openai-compatible-api)

## v0.17.1

Released on March 11, 2025.

### Improvements

- Improves English tokenization quality.
- Improves the table extraction logic in Markdown document parsing.
- Updates SiliconFlow's model list.
- Supports parsing XLS files (Excel 97-2003) with improved corresponding error handling.
- Supports Hugging Face rerank models.
- Enables relative time expressions ("now", "yesterday", "last week", "next year", and more) in chat assistant and the **Rewrite** agent component.

### Fixed issues

- A repetitive knowledge graph extraction issue.
- Issues with API calling.
- Options in the **PDF parser**, aka **Document parser**, dropdown were missing.
- A Tavily web search issue.
- Unable to preview diagrams or images in an AI chat.

### Documentation

#### Added documents

- [Use tag set](./guides/dataset/use_tag_sets.md)

## v0.17.0

Released on March 3, 2025.

### New features

- AI chat: Implements Deep Research for agentic reasoning. To activate this, enable the **Reasoning** toggle under the **Prompt engine** tab of your chat assistant dialogue.
- AI chat: Leverages Tavily-based web search to enhance contexts in agentic reasoning. To activate this, enter the correct Tavily API key under the **Assistant settings** tab of your chat assistant dialogue.
- AI chat: Supports starting a chat without specifying datasets.
- AI chat: HTML files can also be previewed and referenced, in addition to PDF files.
- Dataset: Adds a **PDF parser**, aka **Document parser**, dropdown menu to dataset configurations. This includes a DeepDoc model option (time-consuming), a much faster **naive** option (plain text) that skips DLA (Document Layout Analysis), OCR (Optical Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options. See [here](./guides/dataset/select_pdf_parser.md).
- Agent component: **(x)** or a forward slash `/` can be used to insert available keys (variables) in the system prompt field of the **Generate** or **Template** component.
- Object storage: Supports using Aliyun OSS (Object Storage Service) as a file storage option.
- Models: Updates the supported model list for Tongyi-Qianwen (Qwen), adding DeepSeek-specific models; adds ModelScope as a model provider.
- APIs: Document metadata can be updated through an API.

The following diagram illustrates the workflow of RAGFlow's Deep Research:

![Image](https://github.com/user-attachments/assets/f65d4759-4f09-4d9d-9549-c0e1fe907525)

The following is a screenshot of a conversation that integrates Deep Research:

![Image](https://github.com/user-attachments/assets/165b88ff-1f5d-4fb8-90e2-c836b25e32e9)

### API changes

#### HTTP APIs

Adds a body parameter `"meta_fields"` to the [Update document](./references/http_api_reference.md#update-document) method.

#### Python APIs

Adds a key option `"meta_fields"` to the [Update document](./references/python_api_reference.md#update-document) method.

### Documentation

#### Added documents

- [Run retrieval test](./guides/dataset/run_retrieval_test.md)

## v0.16.0

Released on February 6, 2025.

### New features

- Supports DeepSeek R1 and DeepSeek V3.
- GraphRAG refactor: Knowledge graph is dynamically built on an entire dataset rather than on an individual file, and automatically updated when a newly uploaded file starts parsing. See [here](https://ragflow.io/docs/dev/construct_knowledge_graph).
- Adds an **Iteration** agent component and a **Research report generator** agent template. See [here](./guides/agent/agent_component_reference/iteration.mdx).
- New UI language: Portuguese.
- Allows setting metadata for a specific file in a dataset to enhance AI-powered chats. See [here](./guides/dataset/set_metadata.md).
- Upgrades RAGFlow's document engine [Infinity](https://github.com/infiniflow/infinity) to v0.6.0.dev3.
- Supports GPU acceleration for DeepDoc (see [docker-compose-gpu.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose-gpu.yml)).
- Supports creating and referencing a **Tag** dataset as a key milestone towards bridging the semantic gap between query and response.

:::danger IMPORTANT
The **Tag dataset** feature is *unavailable* on the [Infinity](https://github.com/infiniflow/infinity) document engine.
:::

### Documentation

#### Added documents

- [Construct knowledge graph](./guides/dataset/construct_knowledge_graph.md)
- [Set metadata](./guides/dataset/set_metadata.md)
- [Begin component](./guides/agent/agent_component_reference/begin.mdx)
- [Generate component](./guides/agent/agent_component_reference/generate.mdx)
- [Interact component](./guides/agent/agent_component_reference/interact.mdx)
- [Retrieval component](./guides/agent/agent_component_reference/retrieval.mdx)
- [Categorize component](./guides/agent/agent_component_reference/categorize.mdx)
- [Keyword component](./guides/agent/agent_component_reference/keyword.mdx)
- [Message component](./guides/agent/agent_component_reference/message.mdx)
- [Rewrite component](./guides/agent/agent_component_reference/rewrite.mdx)
- [Switch component](./guides/agent/agent_component_reference/switch.mdx)
- [Concentrator component](./guides/agent/agent_component_reference/concentrator.mdx)
- [Template component](./guides/agent/agent_component_reference/template.mdx)
- [Iteration component](./guides/agent/agent_component_reference/iteration.mdx)
- [Note component](./guides/agent/agent_component_reference/note.mdx)

## v0.15.1

Released on December 25, 2024.

### Upgrades

- Upgrades RAGFlow's document engine [Infinity](https://github.com/infiniflow/infinity) to v0.5.2.
- Enhances the log display of document parsing status.

### Fixed issues

This release fixes the following issues:

- The `SCORE not found` and `position_int` errors returned by [Infinity](https://github.com/infiniflow/infinity).
- Once an embedding model in a specific dataset is changed, embedding models in other datasets can no longer be changed.
- Slow response in question-answering and AI search due to repetitive loading of the embedding model.
- Failure to parse documents with RAPTOR.
- Using the **Table** parsing method results in information loss.
- Miscellaneous API issues.

### API changes

#### HTTP APIs

Adds an optional parameter `"user_id"` to the following APIs:

- [Create session with chat assistant](https://ragflow.io/docs/dev/http_api_reference#create-session-with-chat-assistant)
- [Update chat assistant's session](https://ragflow.io/docs/dev/http_api_reference#update-chat-assistants-session)
- [List chat assistant's sessions](https://ragflow.io/docs/dev/http_api_reference#list-chat-assistants-sessions)
- [Create session with agent](https://ragflow.io/docs/dev/http_api_reference#create-session-with-agent)
- [Converse with chat assistant](https://ragflow.io/docs/dev/http_api_reference#converse-with-chat-assistant)
- [Converse with agent](https://ragflow.io/docs/dev/http_api_reference#converse-with-agent)
- [List agent sessions](https://ragflow.io/docs/dev/http_api_reference#list-agent-sessions)

## v0.15.0

Released on December 18, 2024.

### New features

- Introduces additional Agent-specific APIs.
- Supports using page rank score to improve retrieval performance when searching across multiple datasets.
- Offers an iframe in Chat and Agent to facilitate the integration of RAGFlow into your webpage.
- Adds a Helm chart for deploying RAGFlow on Kubernetes.
- Supports importing or exporting an agent in JSON format.
- Supports step run for Agent components/tools.
- Adds a new UI language: Japanese.
- Supports resuming GraphRAG and RAPTOR from a failure, enhancing task management resilience.
- Adds more Mistral models.
- Adds a dark mode to the UI, allowing users to toggle between light and dark themes.

### Improvements

- Upgrades the Document Layout Analysis model in DeepDoc.
- Significantly enhances the retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as the document engine.

### API changes

#### HTTP APIs

- [List agent sessions](https://ragflow.io/docs/dev/http_api_reference#list-agent-sessions)
- [List agents](https://ragflow.io/docs/dev/http_api_reference#list-agents)

#### Python APIs

- [List agent sessions](https://ragflow.io/docs/dev/python_api_reference#list-agent-sessions)
- [List agents](https://ragflow.io/docs/dev/python_api_reference#list-agents)

## v0.14.1

Released on November 29, 2024.

### Improvements

Adds [Infinity's configuration file](https://github.com/infiniflow/ragflow/blob/main/docker/infinity_conf.toml) to facilitate integration and customization of [Infinity](https://github.com/infiniflow/infinity) as a document engine. From this release onwards, updates to Infinity's configuration can be made directly within RAGFlow and will take effect immediately after restarting RAGFlow using `docker compose`. [#3715](https://github.com/infiniflow/ragflow/pull/3715)

### Fixed issues

This release fixes the following issues:

- Unable to display or edit content of a chunk after clicking it.
- A `'Not found'` error in Elasticsearch.
- Chinese text becoming garbled during parsing.
- A compatibility issue with Polars.
- A compatibility issue between Infinity and GraphRAG.

## v0.14.0

Released on November 26, 2024.

### New features

- Supports [Infinity](https://github.com/infiniflow/infinity) or Elasticsearch (default) as document engine for vector storage and full-text indexing. [#2894](https://github.com/infiniflow/ragflow/pull/2894)
- Enhances user experience by adding more variables to the Agent and implementing auto-saving.
- Adds a three-step translation agent template, inspired by [Andrew Ng's translation agent](https://github.com/andrewyng/translation-agent).
- Adds an SEO-optimized blog writing agent template.
- Provides HTTP and Python APIs for conversing with an agent.
- Supports the use of English synonyms during retrieval processes.
- Optimizes term weight calculations, reducing the retrieval time by 50%.
- Improves task executor monitoring with additional performance indicators.
- Replaces Redis with Valkey.
- Adds three new UI languages (*contributed by the community*): Indonesian, Spanish, and Vietnamese.

### Compatibility changes

From this release onwards, **service_config.yaml.template** replaces **service_config.yaml** for configuring backend services. Upon Docker container startup, the environment variables defined in this template file are automatically populated and a **service_config.yaml** is auto-generated from it. [#3341](https://github.com/infiniflow/ragflow/pull/3341)

This approach eliminates the need to manually update **service_config.yaml** after making changes to **.env**, facilitating dynamic environment configurations.

:::danger IMPORTANT
Ensure that you [upgrade **both** your code **and** Docker image to this release](https://ragflow.io/docs/dev/upgrade_ragflow#upgrade-ragflow-to-the-most-recent-officially-published-release) before trying this new approach.
:::

### API changes

#### HTTP APIs

- [Create session with agent](https://ragflow.io/docs/dev/http_api_reference#create-session-with-agent)
- [Converse with agent](https://ragflow.io/docs/dev/http_api_reference#converse-with-agent)

#### Python APIs

- [Create session with agent](https://ragflow.io/docs/dev/python_api_reference#create-session-with-agent)
- [Converse with agent](https://ragflow.io/docs/dev/python_api_reference#converse-with-agent)

### Documentation

#### Added documents

- [Configurations](https://ragflow.io/docs/dev/configurations)
- [Manage team members](./guides/team/manage_team_members.md)
- [Run health check on RAGFlow's dependencies](https://ragflow.io/docs/dev/run_health_check)

## v0.13.0

Released on October 31, 2024.

### New features

- Adds the team management functionality for all users.
- Updates the Agent UI to improve usability.
- Adds support for Markdown chunking in the **General** chunking method.
- Introduces an **invoke** tool within the Agent UI.
- Integrates support for Dify's knowledge base API.
- Adds support for GLM4-9B and Yi-Lightning models.
- Introduces HTTP and Python APIs for dataset management, file management within dataset, and chat assistant management.

:::tip NOTE
To download RAGFlow's Python SDK:

```bash
pip install ragflow-sdk==0.13.0
```

:::

### Documentation

#### Added documents

- [Acquire a RAGFlow API key](./develop/acquire_ragflow_api_key.md)
- [HTTP API Reference](./references/http_api_reference.md)
- [Python API Reference](./references/python_api_reference.md)

## v0.12.0

Released on September 30, 2024.

### New features

- Offers slim editions of RAGFlow's Docker images, which do not include built-in BGE/BCE embedding or reranking models.
- Improves the results of multi-round dialogues.
- Enables users to remove added LLM vendors.
- Adds support for **OpenTTS** and **SparkTTS** models.
- Implements an **Excel to HTML** toggle in the **General** chunking method, allowing users to parse a spreadsheet into either HTML tables or key-value pairs by row.
- Adds agent tools **YahooFinance** and **Jin10**.
- Adds an investment advisor agent template.

### Compatibility changes

From this release onwards, RAGFlow offers slim editions of its Docker images to improve the experience for users with limited Internet access. A slim edition of RAGFlow's Docker image does not include built-in BGE/BCE embedding models and has a size of about 1GB; a full edition of RAGFlow is approximately 9GB and includes two built-in embedding models. The default Docker image edition is `nightly-slim`. The following list clarifies the differences between various editions:

- `nightly-slim`: The slim edition of the most recent tested Docker image.
- `v0.12.0-slim`: The slim edition of the most recent **officially released** Docker image.
- `nightly`: The full edition of the most recent tested Docker image.
- `v0.12.0`: The full edition of the most recent **officially released** Docker image.

See [Upgrade RAGFlow](https://ragflow.io/docs/dev/upgrade_ragflow) for instructions on upgrading.

### Documentation

#### Added documents

- [Upgrade RAGFlow](https://ragflow.io/docs/dev/upgrade_ragflow)

## v0.11.0

Released on September 14, 2024.

### New features

- Introduces an AI search interface within the RAGFlow UI.
- Supports audio output via **FishAudio** or **Tongyi Qwen TTS**.
- Allows the use of Postgres for metadata storage, in addition to MySQL.
- Supports object storage options with S3 or Azure Blob.
- Supports model vendors: **Anthropic**, **Voyage AI**, and **Google Cloud**. - Supports the use of **Tencent Cloud ASR** for audio content recognition. - Adds finance-specific agent components: **WenCai**, **AkShare**, **YahooFinance**, and **TuShare**. - Adds a medical consultant agent template. - Supports running retrieval benchmarking on the following datasets: - [ms_marco_v1.1](https://huggingface.co/datasets/microsoft/ms_marco) - [trivia_qa](https://huggingface.co/datasets/mandarjoshi/trivia_qa) - [miracl](https://huggingface.co/datasets/miracl/miracl) ## v0.10.0 Released on August 26, 2024. ### New features - Introduces a text-to-SQL template in the Agent UI. - Implements Agent APIs. - Incorporates monitoring for the task executor. - Introduces Agent tools **GitHub**, **DeepL**, **BaiduFanyi**, **QWeather**, and **GoogleScholar**. - Supports chunking of EML files. - Supports more LLMs or model services: **GPT-4o-mini**, **PerfXCloud**, **TogetherAI**, **Upstage**, **Novita AI**, **01.AI**, **SiliconFlow**, **PPIO**, **XunFei Spark**, **Jiekou.AI**, **Baidu Yiyan**, and **Tencent Hunyuan**. ## v0.9.0 Released on August 6, 2024. ### New features - Supports GraphRAG as a chunking method. - Introduces Agent component **Keyword** and search tools, including **Baidu**, **DuckDuckGo**, **PubMed**, **Wikipedia**, **Bing**, and **Google**. - Supports speech-to-text recognition for audio files. - Supports model vendors **Gemini** and **Groq**. - Supports inference frameworks, engines, and services including **LM studio**, **OpenRouter**, **LocalAI**, and **Nvidia API**. - Supports using reranker models in Xinference. ## v0.8.0 Released on July 8, 2024. ### New features - Supports Agentic RAG, enabling graph-based workflow construction for RAG and agents. - Supports model vendors **Mistral**, **MiniMax**, **Bedrock**, and **Azure OpenAI**. - Supports DOCX files in the MANUAL chunking method. - Supports DOCX, MD, and PDF files in the Q&A chunking method. ## v0.7.0 Released on May 31, 2024. ### New features - Supports the use of reranker models. - Integrates reranker and embedding models: [BCE](https://github.com/netease-youdao/BCEmbedding), [BGE](https://github.com/FlagOpen/FlagEmbedding), and [Jina](https://jina.ai/embeddings/). - Supports LLMs Baichuan and VolcanoArk. - Implements [RAPTOR](https://arxiv.org/html/2401.18059v1) for improved text retrieval. - Supports HTML files in the GENERAL chunking method. - Provides HTTP and Python APIs for deleting documents by ID. - Supports ARM64 platforms. :::danger IMPORTANT While we also test RAGFlow on ARM64 platforms, we do not maintain RAGFlow Docker images for ARM. If you are on an ARM platform, follow [this guide](./develop/build_docker_image.mdx) to build a RAGFlow Docker image. ::: ### API changes #### HTTP API - [Delete documents](https://ragflow.io/docs/dev/http_api_reference#delete-documents) #### Python API - [Delete documents](https://ragflow.io/docs/dev/python_api_reference#delete-documents) ## v0.6.0 Released on May 21, 2024. ### New features - Supports streaming output. - Provides HTTP and Python APIs for retrieving document chunks. - Supports monitoring of system components, including Elasticsearch, MySQL, Redis, and MinIO. - Supports disabling **Layout Recognition** in the GENERAL chunking method to reduce file chunking time. 
### API changes

#### HTTP API

- [Retrieve chunks](https://ragflow.io/docs/dev/http_api_reference#retrieve-chunks)

#### Python API

- [Retrieve chunks](https://ragflow.io/docs/dev/python_api_reference#retrieve-chunks)

## v0.5.0

Released on May 8, 2024.

### New features

- Supports LLM DeepSeek.

---

---
sidebar_position: 1
slug: /build_docker_image
---

# Build RAGFlow Docker image

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

A guide explaining how to build a RAGFlow Docker image from its source code. By following this guide, you'll be able to create a local Docker image that can be used for development, debugging, or testing purposes.

## Target Audience

- Developers who have added new features or modified the existing code and require a Docker image to view and debug their changes.
- Developers seeking to build a RAGFlow Docker image for an ARM64 platform.
- Testers aiming to explore the latest features of RAGFlow in a Docker image.

## Prerequisites

- CPU ≥ 4 cores
- RAM ≥ 16 GB
- Disk ≥ 50 GB
- Docker ≥ 24.0.0 & Docker Compose ≥ v2.26.1

## Build a Docker image

This image is approximately 2 GB in size and relies on external LLM and embedding services.

:::danger IMPORTANT
- While we also test RAGFlow on ARM64 platforms, we do not maintain RAGFlow Docker images for ARM. However, you can build an image yourself on a `linux/arm64` or `darwin/arm64` host machine as well.
- For ARM64 platforms, please upgrade the `xgboost` version in **pyproject.toml** to `1.6.0` and ensure **unixODBC** is properly installed.
:::

```bash
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
uv run download_deps.py
docker build -f Dockerfile.deps -t infiniflow/ragflow_deps .
docker build -f Dockerfile -t infiniflow/ragflow:nightly .
```

## Launch a RAGFlow service from Docker for macOS

After building the `infiniflow/ragflow:nightly` image, you are ready to launch a fully-functional RAGFlow service with all the required components, such as Elasticsearch, MySQL, MinIO, Redis, and more.

## Example: Apple M2 Pro (Sequoia)

1. Edit the Docker Compose configuration:

   Open the `docker/.env` file. Find the `RAGFLOW_IMAGE` setting and change the image reference from `infiniflow/ragflow:v0.23.1` to `infiniflow/ragflow:nightly` to use the image you just built.

2. Launch the service:

   ```bash
   cd docker
   docker compose -f docker-compose-macos.yml up -d
   ```

3. Access the RAGFlow service:

   Once the setup is complete, open your web browser and navigate to `http://127.0.0.1` or `http://<YOUR_SERVER_IP>` (the default serving port is `80`). You will be directed to the RAGFlow welcome page. Enjoy!🍻

---

---
sidebar_position: 10
slug: /faq
---

# FAQs

Answers to questions about general features, troubleshooting, usage, and more.

---

import TOCInline from '@theme/TOCInline';

## General features

---

### What sets RAGFlow apart from other RAG products?

The "garbage in, garbage out" status quo remains unchanged despite the fact that LLMs have advanced Natural Language Processing (NLP) significantly. In response, RAGFlow introduces two features that set it apart from other Retrieval-Augmented Generation (RAG) products:

- Fine-grained document parsing: Document parsing involves images and tables, with the flexibility for you to intervene as needed.
- Traceable answers with reduced hallucinations: You can trust RAGFlow's responses as you can view the citations and references supporting them.

---

### Which embedding models can be deployed locally?
Starting from `v0.22.0`, we ship only the slim edition, which does not include built-in embedding models, and no longer append the **-slim** suffix to the image tag. To run embedding models locally, deploy them through a local inference service such as Ollama or Xinference; see [Deploy a local LLM](./guides/models/deploy_local_llm.mdx).

---

### Where to find the version of RAGFlow? How to interpret it?

You can find the RAGFlow version number on the **System** page of the UI:

![Image](https://github.com/user-attachments/assets/20cf7213-2537-4e18-a88c-4dadf6228c6b)

If you build RAGFlow from source, the version number is also in the system log:

```
    ____   ___    ______ ______ __
   / __ \ /   |  / ____// ____// /____  _      __
  / /_/ // /| | / / __ / /_   / // __ \| | /| / /
 / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
/_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/

2025-02-18 10:10:43,835 INFO 1445658 RAGFlow version: v0.15.0-50-g6daae7f2
```

Where:

- `v0.15.0`: The officially published release.
- `50`: The number of git commits since the official release.
- `g6daae7f2`: `g` is the prefix, and `6daae7f2` is the first seven characters of the current commit ID.

---

### Why not use other open-source vector databases as the document engine?

Currently, only Elasticsearch and [Infinity](https://github.com/infiniflow/infinity) meet the hybrid search requirements of RAGFlow. Most open-source vector databases have limited support for full-text search, and sparse embedding is not an alternative to full-text search. Additionally, these vector databases lack critical features essential to RAGFlow, such as phrase search and advanced ranking capabilities. These limitations led us to develop [Infinity](https://github.com/infiniflow/infinity), the AI-native database, from the ground up.

---

### Differences between demo.ragflow.io and a locally deployed open-source RAGFlow service?

demo.ragflow.io demonstrates the capabilities of RAGFlow Enterprise. Its DeepDoc models are pre-trained using proprietary data, and it offers much more sophisticated team permission controls. Essentially, demo.ragflow.io serves as a preview of RAGFlow's forthcoming SaaS (Software as a Service) offering.

You can deploy an open-source RAGFlow service and call it from a Python client or through RESTful APIs. However, this is not supported on demo.ragflow.io.

---

### Why does it take longer for RAGFlow to parse a document than LangChain?

We put painstaking effort into document pre-processing tasks like layout analysis, table structure recognition, and OCR (Optical Character Recognition) using our vision models. This contributes to the additional time required.

---

### Why does RAGFlow require more resources than other projects?

RAGFlow has a number of built-in models for document structure parsing, which account for the additional computational resources.

---

### Which architectures or devices does RAGFlow support?

We officially support x86 CPU and NVIDIA GPU. While we also test RAGFlow on ARM64 platforms, we do not maintain RAGFlow Docker images for ARM. If you are on an ARM platform, follow [this guide](./develop/build_docker_image.mdx) to build a RAGFlow Docker image.

---

### Do you offer an API for integration with third-party applications?

The corresponding APIs are now available. See the [RAGFlow HTTP API Reference](./references/http_api_reference.md) or the [RAGFlow Python API Reference](./references/python_api_reference.md) for more information.

---

### Do you support stream output?

Yes, we do. Stream output is enabled by default in the chat assistant and agent. Note that you cannot disable stream output via RAGFlow's UI; see the sketch below.
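For instance, a minimal sketch with the Python SDK, assuming an existing chat assistant named "Miss R"; setting `stream=False` makes `ask()` return a single `Message` instead of an iterator:

```python
from ragflow_sdk import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")[0]
session = assistant.create_session()

# stream=False disables streaming: the full answer is returned in one Message.
reply = session.ask("What is RAGFlow?", stream=False)
print(reply.content)
```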
To disable stream output in responses, use the following Python or RESTful APIs:

Python:

- [Create chat completion](./references/python_api_reference.md#create-chat-completion)
- [Converse with chat assistant](./references/python_api_reference.md#converse-with-chat-assistant)
- [Converse with agent](./references/python_api_reference.md#converse-with-agent)

RESTful:

- [Create chat completion](./references/http_api_reference.md#create-chat-completion)
- [Converse with chat assistant](./references/http_api_reference.md#converse-with-chat-assistant)
- [Converse with agent](./references/http_api_reference.md#converse-with-agent)

---

### Do you support sharing dialogue through URL?

No, this feature is not supported.

---

### Do you support multiple rounds of dialogues, referencing previous dialogues as context for the current query?

Yes, we support enhancing user queries based on the existing context of an ongoing conversation:

1. On the **Chat** page, hover over the desired assistant and select **Edit**.
2. In the **Chat Configuration** popup, click the **Prompt engine** tab.
3. Switch on **Multi-turn optimization** to enable this feature.

---

### Key differences between AI search and chat?

- **AI search**: This is a single-turn AI conversation using a predefined retrieval strategy (a hybrid search of weighted keyword similarity and weighted vector similarity) and the system's default chat model. It does not involve advanced RAG strategies like knowledge graph, auto-keyword, or auto-question. Retrieved chunks will be listed below the chat model's response.
- **AI chat**: This is a multi-turn AI conversation where you can define your retrieval strategy (a weighted reranking score can be used to replace the weighted vector similarity in a hybrid search) and choose your chat model. In an AI chat, you can configure advanced RAG strategies, such as knowledge graphs, auto-keyword, and auto-question, for your specific case. Retrieved chunks are not displayed along with the answer.

When debugging your chat assistant, you can use AI search as a reference to verify your model settings and retrieval strategy.

---

## Troubleshooting

---

### How to build the RAGFlow image from scratch?

See [Build a RAGFlow Docker image](./develop/build_docker_image.mdx).

---

### Cannot access https://huggingface.co

A locally deployed RAGFlow downloads OCR models from the [Hugging Face website](https://huggingface.co) by default. If your machine is unable to access this site, the following error occurs and PDF parsing fails:

```
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res'
```

To fix this issue, use https://hf-mirror.com instead:

1. Stop all containers and remove all related resources:

   ```bash
   cd ragflow/docker/
   docker compose down
   ```

2. Uncomment the following line in **ragflow/docker/.env**:

   ```
   # HF_ENDPOINT=https://hf-mirror.com
   ```

3. Start up the server:

   ```bash
   docker compose up -d
   ```

---

### `MaxRetryError: HTTPSConnectionPool(host='hf-mirror.com', port=443)`

This error suggests that you do not have Internet access or are unable to connect to hf-mirror.com. Try the following:

1. Manually download the resource files from [huggingface.co/InfiniFlow/deepdoc](https://huggingface.co/InfiniFlow/deepdoc) to your local folder **~/deepdoc**.
2. Add a volume mapping to **docker-compose.yml**, for example:

   ```
   - ~/deepdoc:/ragflow/rag/res/deepdoc
   ```

---

### `WARNING: can't find /ragflow/rag/res/borker.tm`

Ignore this warning and continue. All system warnings can be ignored.

---

### `network anomaly There is an abnormality in your network and you cannot connect to the server.`

![anomaly](https://github.com/infiniflow/ragflow/assets/93570324/beb7ad10-92e4-4a58-8886-bfb7cbd09e5d)

You cannot log in to RAGFlow until the server is fully initialized. Run `docker logs -f docker-ragflow-cpu-1`.

*The server has been successfully initialized if your system displays the following:*

```
    ____   ___    ______ ______ __
   / __ \ /   |  / ____// ____// /____  _      __
  / /_/ // /| | / / __ / /_   / // __ \| | /| / /
 / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
/_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/

 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://x.x.x.x:9380
 INFO:werkzeug:Press CTRL+C to quit
```

---

### `Realtime synonym is disabled, since no redis connection`

Ignore this warning and continue. All system warnings can be ignored.

![](https://github.com/infiniflow/ragflow/assets/93570324/ef5a6194-084a-4fe3-bdd5-1c025b40865c)

---

### Why does my document parsing stall at under one percent?

![stall](https://github.com/infiniflow/ragflow/assets/93570324/3589cc25-c733-47d5-bbfc-fedb74a3da50)

Click the red cross beside the 'parsing status' bar, then restart the parsing process to see if the issue remains. If the issue persists and your RAGFlow is deployed locally, try the following:

1. Check the log of your RAGFlow server to see if it is running properly:

   ```bash
   docker logs -f docker-ragflow-cpu-1
   ```

2. Check if the **task_executor.py** process exists.
3. Check if your RAGFlow server can access hf-mirror.com or huggingface.co.

---

### Why does my PDF parsing stall near completion, while the log does not show any error?

Click the red cross beside the 'parsing status' bar, then restart the parsing process to see if the issue remains. If the issue persists and your RAGFlow is deployed locally, the parsing process was likely killed due to insufficient RAM. Try increasing the memory allocation by raising the `MEM_LIMIT` value in **docker/.env**.

:::note
Ensure that you restart your RAGFlow server for your changes to take effect!

```bash
docker compose stop
```

```bash
docker compose up -d
```

:::

![nearcompletion](https://github.com/infiniflow/ragflow/assets/93570324/563974c3-f8bb-4ec8-b241-adcda8929cbb)

---

### `Index failure`

An index failure usually indicates an unavailable Elasticsearch service.

---

### How to check the log of RAGFlow?

```bash
tail -f ragflow/docker/ragflow-logs/*.log
```

---

### How to check the status of each component in RAGFlow?

1.
Check the status of the Elasticsearch Docker container:

   ```bash
   $ docker ps
   ```

   *The following is an example result:*

   ```bash
   5bc45806b680 infiniflow/ragflow:latest "./entrypoint.sh" 11 hours ago Up 11 hours 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:9380->9380/tcp, :::9380->9380/tcp docker-ragflow-cpu-1
   91220e3285dd docker.elastic.co/elasticsearch/elasticsearch:8.11.3 "/bin/tini -- /usr/l…" 11 hours ago Up 11 hours (healthy) 9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp ragflow-es-01
   d8c86f06c56b mysql:5.7.18 "docker-entrypoint.s…" 7 days ago Up 16 seconds (healthy) 0.0.0.0:3306->3306/tcp, :::3306->3306/tcp ragflow-mysql
   cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
   ```

2. Follow [this document](./guides/run_health_check.md) to check the health status of the Elasticsearch service.

:::danger IMPORTANT
The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.
:::

---

### `Exception: Can't connect to ES cluster`

1. Check the status of the Elasticsearch Docker container:

   ```bash
   $ docker ps
   ```

   *The status of a healthy Elasticsearch component should look as follows:*

   ```
   91220e3285dd docker.elastic.co/elasticsearch/elasticsearch:8.11.3 "/bin/tini -- /usr/l…" 11 hours ago Up 11 hours (healthy) 9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp ragflow-es-01
   ```

2. Follow [this document](./guides/run_health_check.md) to check the health status of the Elasticsearch service.

:::danger IMPORTANT
The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.
:::

3. If your container keeps restarting, ensure `vm.max_map_count` >= 262144 as per [this README](https://github.com/infiniflow/ragflow?tab=readme-ov-file#-start-up-the-server). Updating the `vm.max_map_count` value in **/etc/sysctl.conf** is required if you wish to make your change permanent. Note that this configuration works only for Linux.

---

### Can't start ES container and get `Elasticsearch did not exit normally`

This is because you forgot to update the `vm.max_map_count` value in **/etc/sysctl.conf**, and your change to this value was reset after a system reboot.

---

### `{"data":null,"code":100,"message":""}`

Your IP address or port number may be incorrect. If you are using the default configurations, enter `http://<IP_OF_YOUR_MACHINE>` (**NOT 9380, AND NO PORT NUMBER REQUIRED!**) in your browser. This should work.

---

### `Ollama - Mistral instance running at 127.0.0.1:11434 but cannot add Ollama as model in RagFlow`

A correct Ollama IP address and port are crucial when adding Ollama models to RAGFlow:

- If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
- If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.

See [Deploy a local LLM](./guides/models/deploy_local_llm.mdx) for more information.
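A quick connectivity check can rule out most configuration errors. A hedged sketch, where `<ollama-host>` is a placeholder for your Ollama server's LAN address and the container name follows the examples above:

```bash
# From the host, confirm Ollama is up; it should reply "Ollama is running".
curl http://<ollama-host>:11434

# From inside the RAGFlow container, confirm RAGFlow can reach Ollama too.
docker exec -it docker-ragflow-cpu-1 curl http://<ollama-host>:11434
```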
---

### Do you offer examples of using DeepDoc to parse PDF or other files?

Yes, we do. See the Python files under the **rag/app** folder.

---

### `FileNotFoundError: [Errno 2] No such file or directory`

1. Check the status of the MinIO Docker container:

   ```bash
   $ docker ps
   ```

   *The status of a healthy MinIO component should look as follows:*

   ```bash
   cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
   ```

2. Follow [this document](./guides/run_health_check.md) to check the health status of the MinIO service.

:::danger IMPORTANT
The status of a Docker container does not necessarily reflect the status of the service. You may find that your services are unhealthy even when the corresponding Docker containers are up and running. Possible reasons for this include network failures, incorrect port numbers, or DNS issues.
:::

---

## Usage

---

### How to run RAGFlow with a locally deployed LLM?

You can use Ollama or Xinference to deploy a local LLM. See [here](./guides/models/deploy_local_llm.mdx) for more information.

---

### How to add an LLM that is not supported?

If your model is not currently supported but has APIs compatible with those of OpenAI, click **OpenAI-API-Compatible** on the **Model providers** page to configure your model:

![openai-api-compatible](https://github.com/user-attachments/assets/b1e964f2-b86e-41af-8528-fd8a96dc5f6f)

---

### How to integrate RAGFlow with Ollama?

- If RAGFlow is locally deployed, ensure that your RAGFlow and Ollama are in the same LAN.
- If you are using our online demo, ensure that the IP address of your Ollama server is public and accessible.

See [here](./guides/models/deploy_local_llm.mdx) for more information.

---

### How to change the file size limit?

For a locally deployed RAGFlow: the total file size limit per upload is 1GB, with a batch upload limit of 32 files. There is no cap on the total number of files per account. To update this 1GB file size limit:

- In **docker/.env**, uncomment `# MAX_CONTENT_LENGTH=1073741824` and adjust the value as needed. Note that `1073741824` represents 1GB in bytes.
- If you update the value of `MAX_CONTENT_LENGTH` in **docker/.env**, ensure that you update `client_max_body_size` in **nginx/nginx.conf** accordingly.

:::tip NOTE
It is not recommended to manually change the 32-file batch upload limit. However, if you use RAGFlow's HTTP API or Python SDK to upload files, the 32-file batch upload limit is automatically removed.
:::

---

### `Error: Range of input length should be [1, 30000]`

This error occurs because there are too many chunks matching your search criteria. Try reducing the **TopN** value and raising the **Similarity threshold** to fix this issue:

1. Click **Chat** in the middle top of the page.
2. Right-click the desired conversation > **Edit** > **Prompt engine**.
3. Reduce the **TopN** value and/or raise the **Similarity threshold**.
4. Click **OK** to confirm your changes.

![topn](https://github.com/infiniflow/ragflow/assets/93570324/7ec72ab3-0dd2-4cff-af44-e2663b67b2fc)

---

### How to get an API key for integration with third-party applications?

See [Acquire a RAGFlow API key](./develop/acquire_ragflow_api_key.md).

---

### How to upgrade RAGFlow?

See [Upgrade RAGFlow](./guides/upgrade_ragflow.mdx) for more information.

---

### How to switch the document engine to Infinity?
To switch your document engine from Elasticsearch to [Infinity](https://github.com/infiniflow/infinity):

1. Stop all running containers:

   ```bash
   $ docker compose -f docker/docker-compose.yml down -v
   ```

   :::caution WARNING
   `-v` will delete all Docker container volumes, and the existing data will be cleared.
   :::

2. In **docker/.env**, set `DOC_ENGINE=${DOC_ENGINE:-infinity}`.
3. Restart your Docker containers:

   ```bash
   $ docker compose -f docker/docker-compose.yml up -d
   ```

---

### Where are my uploaded files stored in RAGFlow's image?

All uploaded files are stored in MinIO, RAGFlow's object storage solution. For instance, if you upload your file directly to a dataset, it is located at `<dataset_id>/filename`.

---

### How to tune batch size for document parsing and embedding?

You can control the batch size for document parsing and embedding by setting the environment variables `DOC_BULK_SIZE` and `EMBEDDING_BATCH_SIZE`. Increasing these values may improve throughput for large-scale data processing, but will also increase memory usage. Adjust them according to your hardware resources.

---

### How to accelerate the question-answering speed of my chat assistant?

See [here](./guides/chat/best_practices/accelerate_question_answering.mdx).

---

### How to accelerate the question-answering speed of my Agent?

See [here](./guides/agent/best_practices/accelerate_agent_question_answering.md).

### How to use MinerU to parse PDF documents?

From v0.22.0 onwards, RAGFlow includes MinerU (≥ 2.6.3) as an optional PDF parser with multiple backends. Please note that RAGFlow acts only as a *remote client* for MinerU, calling the MinerU API to parse PDFs and reading the returned files.

To use this feature:

1. Prepare a reachable MinerU API service (FastAPI server).
2. In the **.env** file or from the **Model providers** page in the UI, configure RAGFlow as a remote client to MinerU:
   - `MINERU_APISERVER`: The MinerU API endpoint (e.g., `http://mineru-host:8886`).
   - `MINERU_BACKEND`: The MinerU backend:
     - `"pipeline"` (default)
     - `"vlm-http-client"`
     - `"vlm-transformers"`
     - `"vlm-vllm-engine"`
     - `"vlm-mlx-engine"`
     - `"vlm-vllm-async-engine"`
     - `"vlm-lmdeploy-engine"`
   - `MINERU_SERVER_URL`: (optional) The downstream vLLM HTTP server (e.g., `http://vllm-host:30000`). Applicable when `MINERU_BACKEND` is set to `"vlm-http-client"`.
   - `MINERU_OUTPUT_DIR`: (optional) The local directory for holding the outputs of the MinerU API service (zip/JSON) before ingestion.
   - `MINERU_DELETE_OUTPUT`: Whether to delete temporary output when a temporary directory is used:
     - `1`: Delete.
     - `0`: Retain.
3. In the web UI, navigate to your dataset's **Configuration** page and find the **Ingestion pipeline** section:
   - If you decide to use a chunking method from the **Built-in** dropdown, ensure it supports PDF parsing, then select **MinerU** from the **PDF parser** dropdown.
   - If you use a custom ingestion pipeline instead, select **MinerU** in the **PDF parser** section of the **Parser** component.

:::note
All MinerU environment variables are optional. When set, these values are used to auto-provision a MinerU OCR model for the tenant on first use. To avoid auto-provisioning, skip the environment variable settings and only configure MinerU from the **Model providers** page in the UI.
:::

:::caution WARNING
Third-party visual models are marked **Experimental**, because we have not fully tested these models for the aforementioned data extraction tasks.
:::
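Putting the variables together, a minimal **docker/.env** block for a remote MinerU service using the default `pipeline` backend might look like this (hosts, ports, and paths are placeholders drawn from the examples above):

```bash
# Remote MinerU API service.
MINERU_APISERVER=http://mineru-host:8886
# Parsing backend; "pipeline" is the default.
MINERU_BACKEND=pipeline
# Keep MinerU's returned zip/JSON output for debugging (0 = retain).
MINERU_OUTPUT_DIR=/home/ragflow/mineru/output
MINERU_DELETE_OUTPUT=0
```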
---

### How to configure MinerU-specific settings?

The table below summarizes the most frequently used MinerU environment variables for a remote MinerU service:

| Environment variable   | Description                                                                      | Default                            | Example                                                                                                              |
| ---------------------- | -------------------------------------------------------------------------------- | ---------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| `MINERU_APISERVER`     | URL of the MinerU API service                                                     | _unset_                            | `MINERU_APISERVER=http://your-mineru-server:8886`                                                                      |
| `MINERU_BACKEND`       | MinerU parsing backend                                                            | `pipeline`                         | `MINERU_BACKEND=pipeline\|vlm-transformers\|vlm-vllm-engine\|vlm-mlx-engine\|vlm-vllm-async-engine\|vlm-http-client` |
| `MINERU_SERVER_URL`    | URL of the remote vLLM server (for `vlm-http-client`)                             | _unset_                            | `MINERU_SERVER_URL=http://your-vllm-server-ip:30000`                                                                   |
| `MINERU_OUTPUT_DIR`    | Directory for MinerU output files                                                 | System-defined temporary directory | `MINERU_OUTPUT_DIR=/home/ragflow/mineru/output`                                                                        |
| `MINERU_DELETE_OUTPUT` | Whether to delete the MinerU output directory when a temporary directory is used  | `1` (delete temporary output)      | `MINERU_DELETE_OUTPUT=0`                                                                                               |

1. Set `MINERU_APISERVER` to point RAGFlow to your MinerU API server.
2. Set `MINERU_BACKEND` to specify a parsing backend.
3. If using the `"vlm-http-client"` backend, set `MINERU_SERVER_URL` to your vLLM server's URL. The MinerU API expects `backend=vlm-http-client` and `server_url=http://<vllm-host>:30000` in the request body.
4. Set `MINERU_OUTPUT_DIR` to specify where RAGFlow stores MinerU API output; otherwise, a system temporary directory is used.
5. Set `MINERU_DELETE_OUTPUT` to `0` to keep MinerU's temporary output (useful for debugging).

:::tip NOTE
For information about other environment variables natively supported by MinerU, see [here](https://opendatalab.github.io/MinerU/usage/cli_tools/#environment-variables-description).
:::

---

### How to use MinerU with a vLLM server for document parsing?

RAGFlow supports MinerU's `vlm-http-client` backend, enabling you to delegate document parsing tasks to a remote vLLM server while calling MinerU via HTTP.

To configure:

1. Ensure a MinerU API service is reachable (for example, `http://mineru-host:8886`).
2. Set up or point to a vLLM HTTP server (for example, `http://vllm-host:30000`).
3. Configure the following in your **docker/.env** file (or your shell if running from source):
   - `MINERU_APISERVER=http://mineru-host:8886`
   - `MINERU_BACKEND="vlm-http-client"`
   - `MINERU_SERVER_URL="http://vllm-host:30000"`

   MinerU API calls expect `backend=vlm-http-client` and `server_url=http://<vllm-host>:30000` in the request body.
4. Configure `MINERU_OUTPUT_DIR` / `MINERU_DELETE_OUTPUT` as desired to manage the returned zip/JSON before ingestion.

:::tip NOTE
When using the `vlm-http-client` backend, the RAGFlow server requires no GPU, only network connectivity. This enables cost-effective distributed deployment, with multiple RAGFlow instances sharing one remote vLLM server.
:::

---

---
sidebar_position: 2
slug: /agent_component
---

# Agent component

A component equipped with reasoning, tool usage, and multi-agent collaboration capabilities.

---

An **Agent** component configures the LLM and sets its prompts. From v0.20.5 onwards, an **Agent** component can work independently, with the following capabilities:

- Autonomous reasoning with reflection and adjustment based on environmental feedback.
- Use of tools or subagents to complete tasks.

## Scenarios

An **Agent** component is essential when you need the LLM to assist with summarizing, translating, or controlling various tasks.

## Prerequisites

1.
Ensure you have a chat model properly configured:

   ![Set default models](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_default_models.jpg)

2. If your Agent involves dataset retrieval, ensure you [have properly configured your target dataset(s)](../../dataset/configure_knowledge_base.md).

## Quickstart

### 1. Click on an **Agent** component to show its configuration panel

The corresponding configuration panel appears to the right of the canvas. Use this panel to define and fine-tune the **Agent** component's behavior.

### 2. Select your model

Click **Model**, and select a chat model from the dropdown menu.

:::tip NOTE
If no model appears, check whether you have added a chat model on the **Model providers** page.
:::

### 3. Update system prompt (Optional)

The system prompt typically defines your model's role. You can either keep the system prompt as is or customize it to override the default.

### 4. Update user prompt

The user prompt typically defines your model's task. You will find the `sys.query` variable auto-populated. Type `/` or click **(x)** to view or add variables.

In this quickstart, we assume your **Agent** component is used standalone (without tools or sub-Agents beneath). In that case, you may also need to specify retrieved chunks using the `formalized_content` variable:

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/standalone_user_prompt_variable.jpg)

### 5. Skip Tools and Agent

The **+ Add tools** and **+ Add agent** sections are used *only* when you need to configure your **Agent** component as a planner (with tools or sub-Agents beneath). In this quickstart, we assume your **Agent** component is used standalone (without tools or sub-Agents beneath).

### 6. Choose the next component

When necessary, click the **+** button on the **Agent** component to choose the next component in the workflow from the dropdown list.

## Connect to an MCP server as a client

:::danger IMPORTANT
In this section, we assume your **Agent** will be configured as a planner, with a Tavily tool beneath it.
:::

### 1. Navigate to the MCP configuration page

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/mcp_page.jpg)

### 2. Configure your Tavily MCP server

Update your MCP server's name, URL (including the API key), server type, and other necessary settings. When configured correctly, the available tools are displayed.

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/edit_mcp_server.jpg)

### 3. Navigate to your Agent's editing page

### 4. Connect to your MCP server

1. Click **+ Add tools**:

   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/add_tools.jpg)

2. Click **MCP** to show the available MCP servers.
3. Select your MCP server:

   *The target MCP server appears below your Agent component, and your Agent will autonomously decide when to invoke the tools it offers.*

   ![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/choose_tavily_mcp_server.jpg)

### 5. Update system prompt to specify trigger conditions (Optional)

To ensure reliable tool calls, you may specify within the system prompt which tasks should trigger each tool call, as in the sketch below.
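For instance, a trigger-condition passage appended to the system prompt might read as follows. The wording is purely illustrative and assumes a Tavily web-search tool, as in this section's example:

```
You have access to a Tavily web search tool.
- Call the Tavily tool when the user asks about current events, recent releases,
  or anything that requires up-to-date information from the web.
- Answer directly from your own knowledge for definitions and stable facts.
- Do not call any tool more than twice per user query.
```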
### 6. View the available tools of your MCP server

On the canvas, click the newly added Tavily server to view and select its available tools:

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/tavily_mcp_server.jpg)

## Configurations

### Model

Click the dropdown menu of **Model** to show the model configuration window.

- **Model**: The chat model to use.
  - Ensure you set the chat model correctly on the **Model providers** page.
  - You can use different models for different components to increase flexibility or improve overall performance.
- **Creativity**: A shortcut to the **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. Each of the presets, **Improvise**, **Precise**, and **Balance**, corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. This parameter has three options:
  - **Improvise**: Produces more creative responses.
  - **Precise**: (Default) Produces more conservative responses.
  - **Balance**: A middle ground between **Improvise** and **Precise**.
- **Temperature**: The randomness level of the model's output. Defaults to 0.1.
  - Lower values lead to more deterministic and predictable outputs.
  - Higher values lead to more creative and varied outputs.
  - A temperature of zero results in the same output for the same prompt.
- **Top P**: Nucleus sampling.
  - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
  - Defaults to 0.3.
- **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.
  - A higher **presence penalty** value makes the model more likely to generate tokens that have not yet appeared in the generated text.
  - Defaults to 0.4.
- **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
  - A higher **frequency penalty** value makes the model more conservative in its use of repeated tokens.
  - Defaults to 0.7.
- **Max tokens**: Sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). It is disabled by default, allowing the model to determine the number of tokens in its responses.

:::tip NOTE

- It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
- If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Creativity**.

:::

### System prompt

Typically, you use the system prompt to describe the task for the LLM, specify how it should respond, and outline other miscellaneous requirements. We do not elaborate on this topic here, as it can be as extensive as prompt engineering itself. However, be aware that the system prompt is often used in conjunction with keys (variables), which serve as various data inputs for the LLM.

An **Agent** component relies on keys (variables) to specify its data inputs. Its immediate upstream component is *not* necessarily its data input, and the arrows in the workflow indicate *only* the processing sequence. Keys in an **Agent** component are used in conjunction with the system prompt to specify data inputs for the LLM.
Use a forward slash `/` or the **(x)** button to show the keys to use.

#### Advanced usage

From v0.20.5 onwards, four framework-level prompt blocks are available in the **System prompt** field, enabling you to customize and *override* prompts at the framework level. Type `/` or click **(x)** to view them; they appear under the **Framework** entry in the dropdown menu.

- `task_analysis` prompt block
  - This block is responsible for analyzing tasks: either a user task or a task assigned by the lead Agent when the **Agent** component is acting as a sub-Agent.
  - Reference design: [analyze_task_system.md](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/analyze_task_system.md) and [analyze_task_user.md](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/analyze_task_user.md)
  - Available *only* when this **Agent** component is acting as a planner, with either tools or sub-Agents under it.
  - Input variables:
    - `agent_prompt`: The system prompt.
    - `task`: The user prompt for either a lead Agent or a sub-Agent. The lead Agent's user prompt is defined by the user, while a sub-Agent's user prompt is defined by the lead Agent when delegating tasks.
    - `tool_desc`: A description of the tools and sub-Agents that can be called.
    - `context`: The operational context, which stores interactions between the Agent, tools, and sub-Agents; initially empty.
- `plan_generation` prompt block
  - This block creates a plan for the **Agent** component to execute next, based on the task analysis results.
  - Reference design: [next_step.md](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/next_step.md)
  - Available *only* when this **Agent** component is acting as a planner, with either tools or sub-Agents under it.
  - Input variables:
    - `task_analysis`: The analysis result of the current task.
    - `desc`: A description of the tools or sub-Agents currently being called.
    - `today`: Today's date.
- `reflection` prompt block
  - This block enables the **Agent** component to reflect, improving task accuracy and efficiency.
  - Reference design: [reflect.md](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/reflect.md)
  - Available *only* when this **Agent** component is acting as a planner, with either tools or sub-Agents under it.
  - Input variables:
    - `goal`: The goal of the current task. It is the user prompt for either a lead Agent or a sub-Agent. The lead Agent's user prompt is defined by the user, while a sub-Agent's user prompt is defined by the lead Agent.
    - `tool_calls`: The tool-call history:
      - `call.name`: The name of the tool called.
      - `call.result`: The result of the tool call.
- `citation_guidelines` prompt block
  - Reference design: [citation_prompt.md](https://github.com/infiniflow/ragflow/blob/main/rag/prompts/citation_prompt.md)

*The screenshots below show the framework prompt blocks available to an **Agent** component, both as a standalone module and as a planner (with a Tavily tool below):*

![standalone](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/standalone_agent_framework_block.jpg)

![planner](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/planner_agent_framework_blocks.jpg)

### User prompt

The user-defined prompt. Defaults to `sys.query`, the user query.

As a general rule, when using the **Agent** component as a standalone module (not as a planner), you usually need to specify the corresponding **Retrieval** component's output variable (`formalized_content`) here as part of the input to the LLM, as sketched below.
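A minimal user prompt for such a standalone setup might look like the following. Note that the variable tokens are inserted via `/` or **(x)** rather than typed by hand; the braced names below are illustrative placeholders for the tokens the editor generates:

```
Answer the user's question using only the retrieved context.

Question: {sys.query}
Context: {Retrieval@formalized_content}
```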
### Tools

You can use an **Agent** component as a collaborator that reasons and reflects with the aid of other tools; for instance, **Retrieval** can serve as one such tool for an **Agent**.

### Agent

You can use an **Agent** component as a collaborator that reasons and reflects with the aid of sub-Agents or other tools, forming a multi-agent system.

### Message window size

An integer specifying the number of previous dialogue rounds to input into the LLM. For example, if it is set to 12, the tokens from the last 12 dialogue rounds will be fed to the LLM. This feature consumes additional tokens.

:::tip IMPORTANT
This feature is used for multi-turn dialogue *only*.
:::

### Max retries

Defines the maximum number of attempts the agent will make to retry a failed task or operation before stopping or reporting failure.

### Delay after error

The waiting period in seconds that the agent observes before retrying a failed task, helping to prevent immediate repeated attempts and allowing system conditions to improve. Defaults to 1 second.

### Max reflection rounds

Defines the maximum number of reflection rounds of the selected chat model. Defaults to 1 round.

:::tip NOTE
Increasing this value will significantly extend your agent's response time.
:::

### Output

The global variable name for the output of the **Agent** component, which can be referenced by other components in the workflow.

## Frequently asked questions

### Why does it take so long for my Agent to respond?

See [here](../best_practices/accelerate_agent_question_answering.md) for details.

---

---
sidebar_position: 5
slug: /await_response
---

# Await response component

A component that halts the workflow and awaits user input.

---

An **Await response** component halts the workflow, initiating a conversation and collecting key information via predefined forms.

## Scenarios

An **Await response** component is essential where you need to display the agent's responses or require human-computer interaction.

## Configurations

### Guiding question

Whether to show the message defined in the **Message** field.

### Message

The static message to send out. Click **+ Add message** to add message options. When multiple messages are supplied, the component randomly selects one to send.

### Input

You can define global variables within the **Await response** component, which can be either mandatory or optional. Once set, users will need to provide values for these variables when engaging with the agent.

Click **+** to add a global variable, each with the following attributes:

- **Name**: _Required_. A descriptive name providing additional details about the variable.
- **Type**: _Required_. The type of the variable:
  - **Single-line text**: Accepts a single line of text without line breaks.
  - **Paragraph text**: Accepts multiple lines of text, including line breaks.
  - **Dropdown options**: Requires the user to select a value for this variable from a dropdown menu. You are required to set _at least_ one option for the dropdown menu.
  - **File upload**: Requires the user to upload one or multiple files.
  - **Number**: Accepts a number as input.
  - **Boolean**: Requires the user to toggle between on and off.
- **Key**: _Required_. The unique variable name.
- **Optional**: A toggle indicating whether the variable is optional.
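A client can supply values for these variables in the completion request body. A hedged curl sketch, assuming the agent-completion endpoint documented in the HTTP API reference linked in the note below; `AGENT_ID`, `API_KEY`, and the `lang` input key are placeholders:

```bash
curl --request POST \
     --url http://localhost:9380/api/v1/agents/AGENT_ID/completions \
     --header 'Authorization: Bearer API_KEY' \
     --header 'Content-Type: application/json' \
     --data '{
       "question": "Summarize this quarter",
       "stream": false,
       "lang": "English"
     }'
```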
:::tip NOTE
To pass in parameters from a client, call:

- HTTP method [Converse with agent](../../../references/http_api_reference.md#converse-with-agent), or
- Python method [Converse with agent](../../../references/python_api_reference.md#converse-with-agent).

:::

:::danger IMPORTANT
If you set the key type as **file**, ensure the token count of the uploaded file does not exceed your model provider's maximum token limit; otherwise, the plain text in your file will be truncated and incomplete.
:::

---

---
sidebar_position: 1
slug: /begin_component
---

# Begin component

The starting component in a workflow.

---

The **Begin** component sets an opening greeting or accepts inputs from the user. It is automatically populated onto the canvas when you create an agent, whether from a template or from scratch (from a blank template). There should be only one **Begin** component in the workflow.

## Scenarios

A **Begin** component is essential in all cases. Every agent includes a **Begin** component, which cannot be deleted.

## Configurations

Click the component to display its **Configuration** window. Here, you can set an opening greeting and the input parameters (global variables) for the agent.

### Mode

Mode defines how the workflow is triggered:

- **Conversational**: The agent is triggered from a conversation.
- **Task**: The agent starts without a conversation.

### Opening greeting

**Conversational mode only.** An agent in conversational mode begins with an opening greeting, the agent's first message to the user. This can be a welcoming remark or an instruction to guide the user forward.

### Global variables

You can define global variables within the **Begin** component, which can be either mandatory or optional. Once set, users will need to provide values for these variables when engaging with the agent.

Click **+ Add variable** to add a global variable, each with the following attributes:

- **Name**: _Required_. A descriptive name providing additional details about the variable.
- **Type**: _Required_. The type of the variable:
  - **Single-line text**: Accepts a single line of text without line breaks.
  - **Paragraph text**: Accepts multiple lines of text, including line breaks.
  - **Dropdown options**: Requires the user to select a value for this variable from a dropdown menu. You are required to set _at least_ one option for the dropdown menu.
  - **File upload**: Requires the user to upload one or multiple files.
  - **Number**: Accepts a number as input.
  - **Boolean**: Requires the user to toggle between on and off.
- **Key**: _Required_. The unique variable name.
- **Optional**: A toggle indicating whether the variable is optional.

:::tip NOTE
To pass in parameters from a client, call:

- HTTP method [Converse with agent](../../../references/http_api_reference.md#converse-with-agent), or
- Python method [Converse with agent](../../../references/python_api_reference.md#converse-with-agent).

:::

:::danger IMPORTANT
If you set the key type as **file**, ensure the token count of the uploaded file does not exceed your model provider's maximum token limit; otherwise, the plain text in your file will be truncated and incomplete.
:::

:::note
You can tune document parsing and embedding efficiency by setting the environment variables `DOC_BULK_SIZE` and `EMBEDDING_BATCH_SIZE`.
:::
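As an illustration, both variables go in **docker/.env**. The values below are placeholders, not recommendations; tune them against your hardware:

```bash
# docker/.env batch-size tuning (illustrative values only)
DOC_BULK_SIZE=4           # chunks processed per parsing batch
EMBEDDING_BATCH_SIZE=16   # chunks embedded per request
```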
## Frequently asked questions

### Is the uploaded file in a dataset?

No. Files uploaded to an agent as input are not stored in a dataset and hence will not be processed using RAGFlow's built-in OCR, DLR, or TSR models, or chunked using RAGFlow's built-in chunking methods.

### File size limit for an uploaded file

There is no _specific_ file size limit for a file uploaded to an agent. However, note that model providers typically have a default or explicit maximum token setting, which can range from 8192 to 128k: the plain text part of the uploaded file will be passed in as the key value, but if the file's token count exceeds this limit, the string will be truncated and incomplete.

:::tip NOTE
The variables `MAX_CONTENT_LENGTH` in `/docker/.env` and `client_max_body_size` in `/docker/nginx/nginx.conf` set the file size limit for each upload to a dataset or RAGFlow's File system. These settings DO NOT apply in this scenario.
:::

---

---
sidebar_position: 8
slug: /categorize_component
---

# Categorize component

A component that classifies user inputs and applies strategies accordingly.

---

A **Categorize** component is usually placed downstream of an **Interact** component.

## Scenarios

A **Categorize** component is essential when you need the LLM to help you identify user intentions and apply appropriate processing strategies.

## Configurations

### Query variables

*Mandatory*

Select the source for categorization. The **Categorize** component relies on query variables to specify its data inputs (queries). All global variables defined before the **Categorize** component are available in the dropdown list.

### Input

The **Categorize** component relies on input variables to specify its data inputs (queries). Click **+ Add variable** in the **Input** section to add the desired input variables. There are two types of input variables: **Reference** and **Text**.

- **Reference**: Uses a component's output or a user input as the data source. You are required to select from the dropdown menu:
  - A component ID under **Component Output**, or
  - A global variable under **Begin input**, which is defined in the **Begin** component.
- **Text**: Uses fixed text as the query. You are required to enter static text.

### Model

Click the dropdown menu of **Model** to show the model configuration window.

- **Model**: The chat model to use.
  - Ensure you set the chat model correctly on the **Model providers** page.
  - You can use different models for different components to increase flexibility or improve overall performance.
- **Creativity**: A shortcut to the **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. Each of the presets, **Improvise**, **Precise**, and **Balance**, corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. This parameter has three options:
  - **Improvise**: Produces more creative responses.
  - **Precise**: (Default) Produces more conservative responses.
  - **Balance**: A middle ground between **Improvise** and **Precise**.
- **Temperature**: The randomness level of the model's output. Defaults to 0.1.
  - Lower values lead to more deterministic and predictable outputs.
  - Higher values lead to more creative and varied outputs.
  - A temperature of zero results in the same output for the same prompt.
- **Top P**: Nucleus sampling.
  - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
  - Defaults to 0.3.
- **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.
  - A higher **presence penalty** value makes the model more likely to generate tokens that have not yet appeared in the generated text.
  - Defaults to 0.4.
- **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.
  - A higher **frequency penalty** value makes the model more conservative in its use of repeated tokens.
  - Defaults to 0.7.
- **Max tokens**: Sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). It is disabled by default, allowing the model to determine the number of tokens in its responses.

:::tip NOTE

- It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
- If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Creativity**.

:::

### Message window size

An integer specifying the number of previous dialogue rounds to input into the LLM. For example, if it is set to 12, the tokens from the last 12 dialogue rounds will be fed to the LLM. This feature consumes additional tokens. Defaults to 1.

:::tip IMPORTANT
This feature is used for multi-turn dialogue *only*. If your **Categorize** component is not part of a multi-turn dialogue (i.e., it is not in a loop), leave this field as-is.
:::

### Category name

A **Categorize** component must have at least two categories. This field sets the name of the category. Click **+ Add Item** to include the intended categories.

:::tip NOTE
You will notice that the category name is auto-populated. No worries. Each category is assigned a random name upon creation. Feel free to change it to a name that is understandable to the LLM.
:::

#### Description

Description of this category. You can input criteria, situations, or information that may help the LLM determine which inputs belong in this category.

#### Examples

Additional examples that may help the LLM determine which inputs belong in this category.

:::danger IMPORTANT
Examples are more helpful than the description if you want the LLM to classify particular cases into this category.
:::

Once a new category is added, navigate to the **Categorize** component on the canvas, find the **+** button next to the case, and click it to specify the downstream component(s).

#### Output

The global variable name for the output of the component, which can be referenced by other components in the workflow. Defaults to `category_name`.

---

---
sidebar_position: 13
slug: /code_component
---

# Code component

A component that enables users to integrate Python or JavaScript code into their Agent for dynamic data processing.

---

## Scenarios

A **Code** component is essential when you need to integrate complex code logic (Python or JavaScript) into your Agent for dynamic data processing.

## Prerequisites

### 1. Ensure gVisor is properly installed

We use gVisor to isolate code execution from the host system. Please follow [the official installation guide](https://gvisor.dev/docs/user_guide/install/) to install gVisor, ensuring your operating system is compatible before proceeding.
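After installation, you can verify that Docker recognizes the gVisor runtime with the same check used in the [Troubleshooting](#troubleshooting) section below:

```bash
# Should run the hello-world image inside the runsc (gVisor) runtime.
docker run --rm --runtime=runsc hello-world
```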
### 2. Ensure Sandbox is properly installed

RAGFlow Sandbox is a secure, pluggable code execution backend. It serves as the code executor for the **Code** component. Please follow the [instructions here](https://github.com/infiniflow/ragflow/tree/main/sandbox) to install RAGFlow Sandbox.

:::note Docker client version
The executor manager image now bundles Docker CLI `29.1.0` (API 1.44+). Older images shipped Docker 24.x and will fail against newer Docker daemons with `client version 1.43 is too old`. Pull the latest `infiniflow/sandbox-executor-manager:latest` or rebuild it in `./sandbox/executor_manager` if you encounter this error.
:::

:::tip NOTE
If your RAGFlow Sandbox is not working, please be sure to consult the [Troubleshooting](#troubleshooting) section in this document. We assure you that it addresses 99.99% of the issues!
:::

### 3. (Optional) Install necessary dependencies

If you need to import your own Python or JavaScript packages into Sandbox, please follow the commands provided in the [How to import my own Python or JavaScript packages into Sandbox?](#how-to-import-my-own-python-or-javascript-packages-into-sandbox) section to install the additional dependencies.

### 4. Enable Sandbox-specific settings in RAGFlow

Ensure all Sandbox-specific settings are enabled in **ragflow/docker/.env**.

### 5. Restart the service after making changes

Any changes to the configuration or environment *require* a full service restart to take effect.

## Configurations

### Input

You can specify multiple input sources for the **Code** component. Click **+ Add variable** in the **Input variables** section to include the desired input variables.

### Code

This field allows you to enter and edit your source code.

:::danger IMPORTANT
If your code implementation includes defined variables, whether input or output variables, ensure they are also specified in the corresponding **Input** or **Output** sections.
:::

#### A Python code example

```Python
def main(arg1: str, arg2: str) -> dict:
    return {
        "result": arg1 + arg2,
    }
```

#### A JavaScript code example

```JavaScript
const axios = require('axios');

async function main(args) {
  try {
    const response = await axios.get('https://github.com/infiniflow/ragflow');
    console.log('Body:', response.data);
  } catch (error) {
    console.error('Error:', error.message);
  }
}
```

### Return values

You define the output variable(s) of the **Code** component here.

:::danger IMPORTANT
If you define output variables here, ensure they are also defined in your code implementation; otherwise, their values will be `null`. The following are two examples:

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_object_output.jpg)

![](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/set_nested_object_output.png)

:::

### Output

The defined output variable(s) will be auto-populated here.

## Troubleshooting

### `HTTPConnectionPool(host='sandbox-executor-manager', port=9385): Read timed out.`

**Root cause**

- You did not properly install gVisor, and `runsc` was not recognized as a valid Docker runtime.
- You did not pull the required base images for the runners, and no runner was started.

**Solution**

For the gVisor issue:

1. Install [gVisor](https://gvisor.dev/docs/user_guide/install/).
2. Restart Docker.
3. Run the following to double-check:

   ```bash
   docker run --rm --runtime=runsc hello-world
   ```

For the base image issue, pull the required base images:

```bash
docker pull infiniflow/sandbox-base-nodejs:latest
docker pull infiniflow/sandbox-base-python:latest
```
### `docker: Error response from daemon: client version 1.43 is too old. Minimum supported API version is 1.44`

**Root cause**

Your executor manager image includes Docker CLI 24.x (API 1.43), but the host Docker daemon (e.g., Docker 25+ / 29.x) now requires API 1.44+.

**Solution**

Pull the latest executor manager image or rebuild it in `./sandbox/executor_manager` to upgrade the built-in Docker client:

```bash
docker pull infiniflow/sandbox-executor-manager:latest
# or
docker build -t sandbox-executor-manager:latest ./sandbox/executor_manager
```

### `HTTPConnectionPool(host='none', port=9385): Max retries exceeded.`

**Root cause**

`sandbox-executor-manager` is not mapped in `/etc/hosts`.

**Solution**

Add a new entry to `/etc/hosts`:

`127.0.0.1 es01 infinity mysql minio redis sandbox-executor-manager`

### `Container pool is busy`

**Root cause**

All runners are currently in use, executing tasks.

**Solution**

Please try again shortly, or increase the pool size in the configuration to improve availability and reduce waiting times.

## Frequently asked questions

### How to import my own Python or JavaScript packages into Sandbox?

To import your Python packages, update **sandbox_base_image/python/requirements.txt** to install the required dependencies. For example, to add the `openpyxl` package, proceed with the following command lines:

```bash {4,6}
(ragflow) ➜ ragflow/sandbox main ✓ pwd # make sure you are in the right directory
/home/infiniflow/workspace/ragflow/sandbox
(ragflow) ➜ ragflow/sandbox main ✓ echo "openpyxl" >> sandbox_base_image/python/requirements.txt # add the package to requirements.txt
(ragflow) ➜ ragflow/sandbox main ✗ cat sandbox_base_image/python/requirements.txt # make sure the package is added
numpy
pandas
requests
openpyxl # here it is
(ragflow) ➜ ragflow/sandbox main ✗ make # rebuild the Docker image; this rebuilds the image and starts the service immediately. To build the image only, use `make build` instead.
(ragflow) ➜ ragflow/sandbox main ✗ docker exec -it sandbox_python_0 /bin/bash # enter the container to check if the package is installed

# in the container
nobody@ffd8a7dd19da:/workspace$ python # launch a Python shell
Python 3.11.13 (main, Aug 12 2025, 22:46:03) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import openpyxl # import the package to verify installation
>>> # That's okay!
```

To import your JavaScript packages, navigate to `sandbox_base_image/nodejs` and use `npm` to install the required packages. For example, to add the `lodash` package, run the following commands:

```bash
(ragflow) ➜ ragflow/sandbox main ✓ pwd
/home/infiniflow/workspace/ragflow/sandbox
(ragflow) ➜ ragflow/sandbox main ✓ cd sandbox_base_image/nodejs
(ragflow) ➜ ragflow/sandbox/sandbox_base_image/nodejs main ✓ npm install lodash
(ragflow) ➜ ragflow/sandbox/sandbox_base_image/nodejs main ✓ cd ../.. # go back to the sandbox root directory
(ragflow) ➜ ragflow/sandbox main ✗ make # rebuild the Docker image; this rebuilds the image and starts the service immediately. To build the image only, use `make build` instead.
(ragflow) ➜ ragflow/sandbox main ✗ docker exec -it sandbox_nodejs_0 /bin/bash # enter the container to check if the package is installed

# in the container
nobody@dd4bbcabef63:/workspace$ npm list lodash # verify via npm list
/workspace
`-- lodash@4.17.21 extraneous

nobody@dd4bbcabef63:/workspace$ ls node_modules | grep lodash # or verify by listing node_modules
lodash
# That's okay!
```

---

---
sidebar_position: 7
slug: /iteration_component
---

# Iteration component

A component that splits text input into text segments and iterates a predefined workflow for each one.

---

An **Iteration** component can divide text input into text segments and apply its built-in component workflow to each segment.

## Scenario

An **Iteration** component is essential when a workflow loop is required and the loop count is *not* fixed but depends on the number of segments created from the output of specific agent components.

- If, for instance, you plan to feed several paragraphs into an LLM for content generation, each with its own focus, and feeding them to the LLM all at once could create confusion or contradictions, you can use an **Iteration** component, which encapsulates a **Generate** component, to repeat the content generation process for each paragraph.
- Another example: if you wish to use the LLM to translate a lengthy paper into a target language without exceeding its token limit, consider using an **Iteration** component, which encapsulates a **Generate** component, to break the paper into smaller pieces and repeat the translation process for each one.

## Internal components

### IterationItem

Each **Iteration** component includes an internal **IterationItem** component. The **IterationItem** component serves as both the starting point and the input node of the workflow within the **Iteration** component. It manages the loop of the workflow for all text segments created from the input.

:::tip NOTE
The **IterationItem** component is visible *only* to the components encapsulated by the current **Iteration** component.
:::

### Build an internal workflow

You are allowed to pull other components into the **Iteration** component to build an internal workflow, and these "added internal components" are no longer visible to components outside of the current **Iteration** component.

:::danger IMPORTANT
To reference the created text segments from an added internal component, simply add a **Reference** variable that equals **IterationItem** within the **Input** section of that internal component. There is no need to reference the corresponding external component, as the **IterationItem** component manages the loop of the workflow for all created text segments.
:::

:::tip NOTE
An added internal component can reference an external component when necessary.
:::

## Configurations

### Input

The **Iteration** component uses input variables to specify its data inputs, namely the texts to be segmented. You are allowed to specify multiple input sources for the **Iteration** component. Click **+ Add variable** in the **Input** section to include the desired input variables. There are two types of input variables: **Reference** and **Text**.

- **Reference**: Uses a component's output or a user input as the data source. You are required to select from the dropdown menu:
  - A component ID under **Component Output**, or
  - A global variable under **Begin input**, which is defined in the **Begin** component.
- **Text**: Uses fixed text as the query. You are required to enter static text.

### Delimiter

The delimiter to use to split the text input into segments:

- Comma (Default)
- Line break
- Tab
- Underline
- Forward slash
- Dash
- Semicolon

---

---
sidebar_position: 4
slug: /message_component
---

# Message component

A component that sends out a static or dynamic message.
---

As the final component of the workflow, a **Message** component returns the workflow's ultimate data output accompanied by predefined message content. The system selects one message at random if multiple messages are provided.

## Configurations

### Messages

The message to send out. Click `(x)` or type `/` to quickly insert variables.

Click **+ Add message** to add message options. When multiple messages are supplied, the **Message** component randomly selects one to send.

---

---
sidebar_position: 3
slug: /retrieval_component
---

# Retrieval component

A component that retrieves information from specified datasets.

## Scenarios

A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated datasets before being sent to the LLM for content generation.

A **Retrieval** component can operate either as a standalone workflow module or as a tool for an **Agent** component. In the latter role, the **Agent** component has autonomous control over when to invoke it for query and retrieval.

The following screenshot shows a reference design using the **Retrieval** component, where the component serves as a tool for an **Agent** component. You can find it in the **Report Agent Using Knowledge Base** Agent template.

![retrieval_reference_design](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/retrieval_reference_design.jpg)

## Prerequisites

Ensure you [have properly configured your target dataset(s)](../../dataset/configure_knowledge_base.md).

## Quickstart

### 1. Click on a **Retrieval** component to show its configuration panel

The corresponding configuration panel appears to the right of the canvas. Use this panel to define and fine-tune the **Retrieval** component's search behavior.

### 2. Input query variable(s)

The **Retrieval** component depends on query variables to specify its queries.

:::caution IMPORTANT

- If you use the **Retrieval** component as a standalone workflow module, input query variables in the **Input Variables** text box.
- If it is used as a tool for an **Agent** component, input the query variables in the **Agent** component's **User prompt** field.

:::

By default, you can use `sys.query`, which is the user query and the default output of the **Begin** component. All global variables defined before the **Retrieval** component can also be used as query statements. Use the `(x)` button or type `/` to show all the available query variables.

### 3. Select dataset(s) to query

You can specify one or multiple datasets to retrieve data from. If selecting multiple, ensure they use the same embedding model.

### 4. Expand **Advanced Settings** to configure the retrieval method

By default, a combination of weighted keyword similarity and weighted vector cosine similarity is used for retrieval. If a rerank model is selected, a combination of weighted keyword similarity and weighted reranking score will be used instead. As a starter, you can skip this step and stay with the default retrieval method.

:::caution WARNING
Using a rerank model will *significantly* increase the system's response time.
:::

### 5. Enable cross-language search

If your user query differs from the languages of the datasets, you can select the target languages in the **Cross-language search** dropdown menu. The model will then translate queries to ensure accurate matching of semantic meaning across languages.

### 6. Test retrieval results

Click the **Run** button at the top of the canvas to test the retrieval results.
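You can also test retrieval outside the canvas. A hedged curl sketch, assuming the chunk-retrieval endpoint described in the RAGFlow HTTP API Reference; `API_KEY` and `DATASET_ID` are placeholders, and the threshold and weight mirror the defaults discussed under Configurations below:

```bash
curl --request POST \
     --url http://localhost:9380/api/v1/retrieval \
     --header 'Authorization: Bearer API_KEY' \
     --header 'Content-Type: application/json' \
     --data '{
       "question": "What is RAGFlow?",
       "dataset_ids": ["DATASET_ID"],
       "similarity_threshold": 0.2,
       "vector_similarity_weight": 0.3
     }'
```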
### 7. Choose the next component

When necessary, click the **+** button on the **Retrieval** component to choose the next component in the workflow from the dropdown list.

## Configurations

### Query variables

*Mandatory*

Select the query source for retrieval. Defaults to `sys.query`, which is the default output of the **Begin** component.

The **Retrieval** component relies on query variables to specify its queries. All global variables defined before the **Retrieval** component can also be used as queries. Use the `(x)` button or type `/` to show all the available query variables.

### Knowledge bases

Select the dataset(s) to retrieve data from.

- If no dataset is selected, meaning conversations with the agent will not be based on any dataset, ensure that the **Empty response** field is left blank to avoid an error.
- If you select multiple datasets, you must ensure that the datasets you select use the same embedding model; otherwise, an error will occur.

### Similarity threshold

RAGFlow employs a combination of weighted keyword similarity and weighted vector cosine similarity during retrieval. This parameter sets the threshold for similarities between the user query and chunks stored in the datasets. Any chunk with a similarity score below this threshold will be excluded from the results. Defaults to 0.2.

### Vector similarity weight

This parameter sets the weight of vector similarity in the composite similarity score. The total of the two weights must equal 1.0. Its default value is 0.3, which means the weight of keyword similarity in a combined search is 1 - 0.3 = 0.7.

### Top N

This parameter selects the "Top N" chunks from the retrieved ones and feeds them to the LLM. Defaults to 8.

### Rerank model

*Optional*

If a rerank model is selected, a combination of weighted keyword similarity and weighted reranking score will be used for retrieval.

:::caution WARNING
Using a rerank model will *significantly* increase the system's response time.
:::

### Empty response

- Set this as a response if no results are retrieved from the dataset(s) for your query, or
- Leave this field blank to allow the chat model to improvise when nothing is found.

:::caution WARNING
If you do not specify a dataset, you must leave this field blank; otherwise, an error will occur.
:::

### Cross-language search

Select one or more languages for cross-language search. If no language is selected, the system searches with the original query.

### Use knowledge graph

:::caution IMPORTANT
Before enabling this feature, ensure you have properly [constructed a knowledge graph from each target dataset](../../dataset/construct_knowledge_graph.md).
:::

Whether to use knowledge graph(s) in the specified dataset(s) during retrieval for multi-hop question answering. When enabled, this involves iterative searches across entity, relationship, and community report chunks, greatly increasing retrieval time.

### Output

The global variable name for the output of the **Retrieval** component, which can be referenced by other components in the workflow.

## Frequently asked questions

### How to reduce response time?

Go through the checklist below for best performance:

- Leave the **Rerank model** field empty to disable reranking.
- Disable **Use knowledge graph**.

---

---
sidebar_position: 6
slug: /switch_component
---

# Switch component

A component that evaluates whether specified conditions are met and directs the flow of execution accordingly.
## Frequently asked questions

### How to reduce response time?

Go through the checklist below for best performance:

- Leave the **Rerank model** field empty to disable rerank.
- Disable **Use knowledge graph**.

---

---
sidebar_position: 6
slug: /switch_component
---

# Switch component

A component that evaluates whether specified conditions are met and directs the flow of execution accordingly.

---

A **Switch** component evaluates conditions based on the output of specific components, directing the flow of execution accordingly to enable complex branching logic.

## Scenarios

A **Switch** component is essential for condition-based direction of execution flow. While it shares similarities with the [Categorize](./categorize.mdx) component, which is also used in multi-pronged strategies, the key distinction lies in their approach: the **Switch** component's evaluation is rule-based, whereas the **Categorize** component relies on an LLM for decision-making.

## Configurations

### Case n

A **Switch** component must have at least one case, and each case can have multiple conditions. When multiple conditions are specified for a case, you must set the logical relationship between them to either AND or OR.

Once a new case is added, navigate to the **Switch** component on the canvas, find the **+** button next to the case, and click it to specify the downstream component(s).

#### Condition

Evaluates whether the output of specific components meets certain conditions.

:::danger IMPORTANT
When you have added multiple conditions for a specific case, a **Logical operator** field appears, requiring you to set the logical relationship between these conditions as either AND or OR.
:::

- **Operator**: The operator required to form a conditional expression.
  - Equals (default)
  - Not equal
  - Greater than
  - Greater equal
  - Less than
  - Less equal
  - Contains
  - Not contains
  - Starts with
  - Ends with
  - Is empty
  - Not empty
- **Value**: A single value, which can be an integer, float, or string.
  - Delimiters, multiple values, or expressions are *not* supported.

---

---
sidebar_position: 15
slug: /text_processing
---

# Text processing component

A component that merges or splits texts.

---

A **Text processing** component merges or splits texts.

## Configurations

### Method

- Split: Split the text.
- Merge: Merge the text.

### Split_ref

Appears only when you select **Split** as the method. The variable to be split. Type `/` to quickly insert variables.

### Script

Appears only when you select **Merge** as the method. The template for the merge. Type `/` to quickly insert variables.

### Delimiters

The delimiter(s) used to split or merge the text.

### Output

The global variable name for the output of the component, which can be referenced by other components in the workflow.
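As a loose command-line analogy for what the two methods do with delimiters (this is an illustration, not RAGFlow code):

```bash
# Split: break one delimited string into separate items.
echo "alpha,beta,gamma" | tr ',' '\n'

# Merge: join separate items back into one string with a delimiter.
printf 'alpha\nbeta\ngamma\n' | paste -s -d ',' -
```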
---

---
sidebar_position: 1
slug: /accelerate_question_answering
---

# Accelerate answering

import APITable from '@site/src/components/APITable';

A checklist to speed up question answering for your chat assistant.

---

Please note that some of your settings may consume a significant amount of time. If you often find that your question answering is time-consuming, here is a checklist to consider:

- Disabling **Multi-turn optimization** will reduce the time required to get an answer from the LLM.
- Leaving the **Rerank model** field empty will significantly decrease retrieval time.
- Disabling the **Reasoning** toggle will reduce the LLM's thinking time. For a model like Qwen3, you also need to add `/no_think` to the system prompt to disable reasoning (see the sketch at the end of this section).
- When using a rerank model, ensure you have a GPU for acceleration; otherwise, the reranking process will be *prohibitively* slow.

:::tip NOTE
Rerank models are essential in certain scenarios. There is always a trade-off between speed and performance; you must weigh the pros against the cons for your specific case.
:::

- Disabling **Keyword analysis** will reduce the time to receive an answer from the LLM.
- When chatting with your chat assistant, click the light bulb icon above the *current* dialogue and scroll down the popup window to view the time taken for each task:

![enlighten](https://github.com/user-attachments/assets/fedfa2ee-21a7-451b-be66-20125619923c)

| Item name | Description |
| ----------------- | ---------------------------------------------------------------------------------------------- |
| Total | Total time spent on this conversation round, including chunk retrieval and answer generation. |
| Check LLM | Time to validate the specified LLM. |
| Create retriever | Time to create a chunk retriever. |
| Bind embedding | Time to initialize an embedding model instance. |
| Bind LLM | Time to initialize an LLM instance. |
| Tune question | Time to optimize the user query using the context of the multi-turn conversation. |
| Bind reranker | Time to initialize a reranker model instance for chunk retrieval. |
| Generate keywords | Time to extract keywords from the user query. |
| Retrieval | Time to retrieve the chunks. |
| Generate answer | Time to generate the answer. |
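For the Qwen3 `/no_think` tip above, here is a minimal sketch of what that looks like when the model is served locally through Ollama. The model name `qwen3` and the `/no_think` soft switch are model-specific assumptions; check your model's documentation for the exact mechanism.

```bash
# Sketch: disable Qwen3's reasoning via the /no_think soft switch,
# assuming the model is served locally by Ollama as "qwen3".
curl http://localhost:11434/api/chat -d '{
  "model": "qwen3",
  "messages": [
    {"role": "system", "content": "/no_think You are a concise assistant."},
    {"role": "user", "content": "Summarize RAGFlow in one sentence."}
  ],
  "stream": false
}'
```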
---

---
sidebar_position: 3
slug: /autokeyword_autoquestion
---

# Auto-keyword and Auto-question

import APITable from '@site/src/components/APITable';

Use a chat model to generate keywords or questions from each chunk in the dataset.

---

When selecting a chunking method, you can also enable auto-keyword or auto-question generation to increase retrieval rates. This feature uses a chat model to produce a specified number of keywords and questions from each created chunk, generating an "additional layer of information" on top of the original content.

:::caution WARNING
Enabling this feature increases document indexing time and uses extra tokens, as all created chunks will be sent to the chat model for keyword or question generation.
:::

## What is Auto-keyword?

Auto-keyword refers to RAGFlow's auto-keyword generation feature. It uses a chat model to generate a set of keywords or synonyms from each chunk to correct errors and enhance retrieval accuracy. This feature is implemented as a slider under **Page rank** on the **Configuration** page of your dataset.

**Values**:

- 0: (Default) Disabled.
- Between 3 and 5 (inclusive): Recommended if you have chunks of approximately 1,000 characters.
- 30 (maximum)

:::tip NOTE
- If your chunk size increases, you can increase the value accordingly. Note that as the value increases, the marginal benefit decreases.
- An Auto-keyword value must be an integer. If you set it to a non-integer, say 1.7, it will be rounded down to the nearest integer, which in this case is 1.
:::

## What is Auto-question?

Auto-question is a feature of RAGFlow that automatically generates questions from chunks of data using a chat model. These questions (e.g., who, what, and why) also help correct errors and improve the matching of user queries. The feature works well in FAQ retrieval scenarios involving product manuals or policy documents. You can find it as a slider under **Page rank** on the **Configuration** page of your dataset.

**Values**:

- 0: (Default) Disabled.
- 1 or 2: Recommended if you have chunks of approximately 1,000 characters.
- 10 (maximum)

:::tip NOTE
- If your chunk size increases, you can increase the value accordingly. Note that as the value increases, the marginal benefit decreases.
- An Auto-question value must be an integer. If you set it to a non-integer, say 1.7, it will be rounded down to the nearest integer, which in this case is 1.
:::

## Tips from the community

The Auto-keyword and Auto-question values relate closely to the chunk size in your dataset. If you are new to this feature and unsure which value(s) to start with, the following are some settings gathered from our community. While they may not be precise, they provide a starting point at the very least.

| Use cases or typical scenarios | Document volume/length | Auto-keyword (0–30) | Auto-question (0–10) |
| ------------------------------------------------------------------ | ------------------------------- | ---------------------- | ---------------------- |
| Internal process guidance for employee handbook | Small, under 10 pages | 0 | 0 |
| Customer service FAQs | Medium, 10–100 pages | 3–7 | 1–3 |
| Technical white papers: development standards, protocol details | Large, over 100 pages | 2–4 | 1–2 |
| Contracts / regulations / legal clause retrieval | Large, over 50 pages | 2–5 | 0–1 |
| Multi-repository layered new documents + old archive | Many | Adjust as appropriate | Adjust as appropriate |
| Social media comment pool: multilingual & mixed spelling | Very large volume of short text | 8–12 | 0 |
| Operational logs for troubleshooting | Very large volume of short text | 3–6 | 0 |
| Marketing asset library: multilingual product descriptions | Medium | 6–10 | 1–2 |
| Training courses / eBooks | Large | 2–5 | 1–2 |
| Maintenance manual: equipment diagrams + steps | Medium | 3–7 | 1–2 |

---

---
sidebar_position: 1
slug: /accelerate_doc_indexing
---

# Accelerate indexing

import APITable from '@site/src/components/APITable';

A checklist to speed up document parsing and indexing.

---

Please note that some of your settings may consume a significant amount of time. If you often find that document parsing is time-consuming, here is a checklist to consider:

- On the configuration page of your dataset, switch off **Use RAPTOR to enhance retrieval**.
- Extracting a knowledge graph (GraphRAG) is time-consuming; enable it only when necessary.
- Disable **Auto-keyword** and **Auto-question** on the configuration page of your dataset, as both depend on the LLM.
- **v0.17.0+:** If all PDFs in your dataset are plain text and do not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.
---

---
sidebar_position: 2
slug: /deploy_local_llm
---

# Deploy local models

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

Deploy and run local models using Ollama, Xinference, vLLM, SGLang, or other frameworks.

---

RAGFlow supports deploying models locally using Ollama, Xinference, IPEX-LLM, or Jina. If you have locally deployed models to leverage, or wish to enable GPU or CUDA for inference acceleration, you can bind Ollama or Xinference into RAGFlow and use either of them as a local "server" for interacting with your local models.

RAGFlow seamlessly integrates with Ollama and Xinference, without the need for further environment configurations. You can use them to deploy two types of local models in RAGFlow: chat models and embedding models.

:::tip NOTE
This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, check the official sites of Ollama and Xinference.
:::

## Deploy local models using Ollama

[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.

:::note
- For information about downloading Ollama, see [here](https://github.com/ollama/ollama?tab=readme-ov-file#ollama).
- For a complete list of supported models and variants, see the [Ollama model library](https://ollama.com/library).
:::

### 1. Deploy Ollama using Docker

Ollama can be [installed from binaries](https://ollama.com/download) or [deployed with Docker](https://hub.docker.com/r/ollama/ollama). Here are the instructions to deploy with Docker:

```bash
$ sudo docker run --name ollama -p 11434:11434 ollama/ollama
> time=2024-12-02T02:20:21.360Z level=INFO source=routes.go:1248 msg="Listening on [::]:11434 (version 0.4.6)"
> time=2024-12-02T02:20:21.360Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
```

Ensure Ollama is listening on all IP addresses:

```bash
$ sudo ss -tunlp | grep 11434
> tcp LISTEN 0 4096 0.0.0.0:11434 0.0.0.0:* users:(("docker-proxy",pid=794507,fd=4))
> tcp LISTEN 0 4096 [::]:11434 [::]:* users:(("docker-proxy",pid=794513,fd=4))
```

Pull models as you need. We recommend that you start with `llama3.2` (a 3B chat model) and `bge-m3` (a 567M embedding model):

```bash
$ sudo docker exec ollama ollama pull llama3.2
> pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
> success
```

```bash
$ sudo docker exec ollama ollama pull bge-m3
> pulling daec91ffb5dd... 100% ▕████████████████▏ 1.2 GB
> success
```

### 2. Find Ollama URL and ensure it is accessible

- If RAGFlow runs in Docker, the localhost is mapped within the RAGFlow Docker container as `host.docker.internal`. If Ollama runs on the same host machine, the right URL to use for Ollama is `http://host.docker.internal:11434/`, and you should check that Ollama is accessible from inside the RAGFlow container:

```bash
$ sudo docker exec -it docker-ragflow-cpu-1 bash
$ curl http://host.docker.internal:11434/
> Ollama is running
```

- If RAGFlow is launched from source code and Ollama runs on the same host machine as RAGFlow, check if Ollama is accessible from RAGFlow's host machine:

```bash
$ curl http://localhost:11434/
> Ollama is running
```

- If RAGFlow and Ollama run on different machines, check if Ollama is accessible from RAGFlow's host machine:

```bash
$ curl http://${IP_OF_OLLAMA_MACHINE}:11434/
> Ollama is running
```
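Beyond the base-URL check, you can also sanity-check the pulled models through Ollama's REST API before wiring them into RAGFlow. A minimal sketch, assuming the `llama3.2` and `bge-m3` models pulled above (`/api/generate` and `/api/embeddings` are standard Ollama endpoints, but verify against your Ollama version):

```bash
# Ask the chat model for a quick completion:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Say hello in one word.",
  "stream": false
}'

# Request an embedding from the embedding model:
curl http://localhost:11434/api/embeddings -d '{
  "model": "bge-m3",
  "prompt": "hello"
}'
```

If both calls return JSON rather than an error, the models are ready to be added to RAGFlow.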
### 3. Add Ollama

In RAGFlow, click on your logo on the top right of the page **>** **Model providers** and add Ollama to RAGFlow:

![add ollama](https://github.com/infiniflow/ragflow/assets/93570324/10635088-028b-4b3d-add9-5c5a6e626814)

### 4. Complete basic Ollama settings

In the popup window, complete basic settings for Ollama:

1. Ensure that your model name and type match those pulled in step 1 (Deploy Ollama using Docker), for example, (`llama3.2` and `chat`) or (`bge-m3` and `embedding`).
2. Put in the Ollama base URL, i.e., `http://host.docker.internal:11434`, `http://localhost:11434`, or `http://${IP_OF_OLLAMA_MACHINE}:11434`.
3. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.

:::caution WARNING
Improper base URL settings will trigger the following error:
```bash
Max retries exceeded with url: /api/chat (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused'))
```
:::

### 5. Update System Model Settings

Click on your logo **>** **Model providers** **>** **System Model Settings** to update your model:

*You should now be able to find **llama3.2** from the dropdown list under **Chat model**, and **bge-m3** from the dropdown list under **Embedding model**.*

### 6. Update Chat Configuration

Update your model(s) accordingly in **Chat Configuration**.

## Deploy a local model using Xinference

Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) enables you to unleash the full potential of cutting-edge AI models.

:::note
- For information about installing Xinference, see [here](https://inference.readthedocs.io/en/latest/getting_started/).
- For a complete list of supported models, see the [Builtin Models](https://inference.readthedocs.io/en/latest/models/builtin/).
:::

To deploy a local model, e.g., **Mistral**, using Xinference:

### 1. Check firewall settings

Ensure that your host machine's firewall allows inbound connections on port 9997.

### 2. Start an Xinference instance

```bash
$ xinference-local --host 0.0.0.0 --port 9997
```

### 3. Launch your local model

Launch your local model (**Mistral**), ensuring that you replace `${quantization}` with your chosen quantization method:

```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```

### 4. Add Xinference

In RAGFlow, click on your logo on the top right of the page **>** **Model providers** and add Xinference to RAGFlow:

![add xinference](https://github.com/infiniflow/ragflow/assets/93570324/10635088-028b-4b3d-add9-5c5a6e626814)

### 5. Complete basic Xinference settings

Enter an accessible base URL, such as `http://<your_xinference_host>:9997/v1`.

> For a rerank model, use `http://<your_xinference_host>:9997/v1/rerank` as the base URL.

### 6. Update System Model Settings

Click on your logo **>** **Model providers** **>** **System Model Settings** to update your model.

*You should now be able to find **mistral** from the dropdown list under **Chat model**.*

### 7. Update Chat Configuration

Update your chat model accordingly in **Chat Configuration**.
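Because Xinference exposes OpenAI-compatible endpoints, a quick way to confirm that the base URL from step 5 is reachable and that the model launched in step 3 is being served is to list the models. A small sketch; replace `<your_xinference_host>` with your actual host:

```bash
# List models served by Xinference via its OpenAI-compatible API:
curl http://<your_xinference_host>:9997/v1/models
```

The launched model's UID (e.g., `mistral`) should appear in the returned JSON.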
## Deploy a local model using IPEX-LLM

[IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLMs on local Intel CPUs or GPUs (including iGPU or discrete GPUs like Arc, Flex, and Max) with low latency. It supports Ollama on Linux and Windows systems.

To deploy a local model, e.g., **Qwen2**, using IPEX-LLM-accelerated Ollama:

### 1. Check firewall settings

Ensure that your host machine's firewall allows inbound connections on port 11434. For example:

```bash
sudo ufw allow 11434/tcp
```

### 2. Launch Ollama service using IPEX-LLM

#### 2.1 Install IPEX-LLM for Ollama

:::tip NOTE
IPEX-LLM supports Ollama on Linux and Windows systems.
:::

For detailed information about installing IPEX-LLM for Ollama, see [Run llama.cpp with IPEX-LLM on Intel GPU Guide](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md):

- [Prerequisites](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md#0-prerequisites)
- [Install IPEX-LLM cpp with Ollama binaries](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md#1-install-ipex-llm-for-llamacpp)

*After the installation, you should have created a Conda environment, e.g., `llm-cpp`, for running Ollama commands with IPEX-LLM.*

#### 2.2 Initialize Ollama

1. Activate the `llm-cpp` Conda environment and initialize Ollama.

   On Linux:

   ```bash
   conda activate llm-cpp
   init-ollama
   ```

   On Windows, run these commands with *administrator privileges in Miniforge Prompt*:

   ```cmd
   conda activate llm-cpp
   init-ollama.bat
   ```

2. If the installed `ipex-llm[cpp]` requires an upgrade to the Ollama binary files, remove the old binary files and reinitialize Ollama using `init-ollama` (Linux) or `init-ollama.bat` (Windows).

*A symbolic link to Ollama appears in your current directory, and you can use this executable file following standard Ollama commands.*

#### 2.3 Launch Ollama service

1. Set the environment variable `OLLAMA_NUM_GPU` to `999` to ensure that all layers of your model run on the Intel GPU; otherwise, some layers may default to CPU.
2. For optimal performance on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), set the following environment variable before launching the Ollama service:

   ```bash
   export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
   ```

3. Launch the Ollama service.

   On Linux:

   ```bash
   export OLLAMA_NUM_GPU=999
   export no_proxy=localhost,127.0.0.1
   export ZES_ENABLE_SYSMAN=1
   source /opt/intel/oneapi/setvars.sh
   export SYCL_CACHE_PERSISTENT=1
   ./ollama serve
   ```

   On Windows, run the following *in Miniforge Prompt*:

   ```cmd
   set OLLAMA_NUM_GPU=999
   set no_proxy=localhost,127.0.0.1
   set ZES_ENABLE_SYSMAN=1
   set SYCL_CACHE_PERSISTENT=1
   ollama serve
   ```

:::tip NOTE
To enable the Ollama service to accept connections from all IP addresses, use `OLLAMA_HOST=0.0.0.0 ./ollama serve` rather than simply `./ollama serve`.
:::

*The console displays messages similar to the following:*

![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png)

### 3. Pull and Run Ollama model

#### 3.1 Pull Ollama model

With the Ollama service running, open a new terminal and run `./ollama pull <model_name>` (Linux) or `ollama.exe pull <model_name>` (Windows) to pull the desired model, e.g., `qwen2:latest`:

![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png)

#### 3.2 Run Ollama model

On Linux:

```bash
./ollama run qwen2:latest
```

On Windows:

```cmd
ollama run qwen2:latest
```

### 4. Configure RAGFlow

To enable IPEX-LLM accelerated Ollama in RAGFlow, you must also complete the configurations in RAGFlow. The steps are identical to those outlined in the *Deploy local models using Ollama* section:

1. [Add Ollama](#3-add-ollama)
2. [Complete basic Ollama settings](#4-complete-basic-ollama-settings)
3. [Update System Model Settings](#5-update-system-model-settings)
4. [Update Chat Configuration](#6-update-chat-configuration)
### 5. Deploy a local model using vLLM (Ubuntu 22.04/24.04)

Install vLLM:

```bash
pip install vllm
```

### 5.1 Run vLLM

Start the vLLM server in the background and redirect its log:

```bash
nohup vllm serve /data/Qwen3-8B --served-model-name Qwen3-8B-FP8 --dtype auto --port 1025 --gpu-memory-utilization 0.90 --tool-call-parser hermes --enable-auto-tool-choice > /var/log/vllm_startup1.log 2>&1 &
```

Check the startup log:

```bash
tail -f -n 100 /var/log/vllm_startup1.log
```

When you see the following output, the vLLM engine is ready for access:

```bash
Starting vLLM API server 0 on http://0.0.0.0:1025
Started server process [19177]
Application startup complete.
```

### 5.2 Integrate RAGFlow with vLLM chat/embedding/rerank models via the web UI

Go to **Settings** **>** **Model providers**, search for **VLLM**, click **Add**, and configure as follows:

![add vllm](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_vllm.png)

Select the vLLM chat model as the default LLM model:

![chat](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_vllm1.png)

### 5.3 Chat with the vLLM chat model

Create a chat assistant, create a conversation, and chat as follows:

![chat](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/ragflow_vllm2.png)
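Before (or instead of) wiring vLLM into RAGFlow's UI, you can verify the server end to end through its OpenAI-compatible API. A sketch assuming the `vllm serve` command above (port `1025`, served model name `Qwen3-8B-FP8`):

```bash
# Query the vLLM server through its OpenAI-compatible chat endpoint:
curl http://localhost:1025/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen3-8B-FP8",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

Note that the `model` field must match the `--served-model-name` value, not the on-disk model path.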
---

---
sidebar_position: 9
slug: /tracing
---

# Tracing

Observability & Tracing with Langfuse.

---

:::info KUDOS
This document is contributed by our community contributor [jannikmaierhoefer](https://github.com/jannikmaierhoefer). 👏
:::

RAGFlow ships with a built-in [Langfuse](https://langfuse.com) integration so that you can **inspect and debug every retrieval and generation step** of your RAG pipelines in near real-time. Langfuse stores traces, spans and prompt payloads in a purpose-built observability backend and offers filtering and visualisations on top.

:::info NOTE
• RAGFlow **≥ 0.18.0** (contains the Langfuse connector)
• A Langfuse workspace (cloud or self-hosted) with a _Project Public Key_ and _Secret Key_
:::

---

## 1. Collect your Langfuse credentials

1. Sign in to your Langfuse dashboard.
2. Open **Settings ▸ Projects** and either create a new project or select an existing one.
3. Copy the **Public Key** and **Secret Key**.
4. Note the Langfuse **host** (e.g. `https://cloud.langfuse.com`). Use the base URL of your own installation if you self-host.

> The keys are _project-scoped_: one pair of keys is enough for all environments that should write into the same project.

---

## 2. Add the keys to RAGFlow

RAGFlow stores the credentials _per tenant_. You can configure them either via the web UI or the HTTP API.

1. Log in to RAGFlow and click your avatar in the top-right corner.
2. Select **API ▸ Scroll down to the bottom ▸ Langfuse Configuration**.
3. Fill in your Langfuse **Host**, **Public Key** and **Secret Key**.
4. Click **Save**.

![Example RAGFlow trace in Langfuse](https://langfuse.com/images/docs/ragflow/ragflow-configuration.gif)

Once saved, RAGFlow starts emitting traces automatically – no code change required.

---

## 3. Run a pipeline and watch the traces

1. Execute any chat or retrieval pipeline in RAGFlow (e.g. the Quickstart demo).
2. Open your Langfuse project ▸ **Traces**.
3. Filter by **name ~ `ragflow-*`** (RAGFlow prefixes each trace with `ragflow-`).

For every user request you will see:

• a **trace** representing the overall request
• **spans** for retrieval, ranking and generation steps
• the complete **prompts**, **retrieved documents** and **LLM responses** as metadata

![Example RAGFlow trace in Langfuse](https://langfuse.com/images/docs/ragflow/ragflow-trace-frame.png)

([Example trace in Langfuse](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/0bde9629-4251-4386-b583-26101b8e7561?timestamp=2025-05-09T19%3A15%3A37.797Z&display=details&observation=823997d8-ac40-40f3-8e7b-8aa6753b499e))

:::tip NOTE
Use Langfuse's diff view to compare prompt versions or drill down into long-running retrievals to identify bottlenecks.
:::

---

---
sidebar_position: 11
slug: /upgrade_ragflow
---

# Upgrading

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

Upgrade RAGFlow to `nightly` or the latest published release.

:::info NOTE
Upgrading RAGFlow in itself will *not* remove your uploaded/historical data. However, be aware that `docker compose -f docker/docker-compose.yml down -v` will remove Docker container volumes, resulting in data loss.
:::

## Upgrade RAGFlow to `nightly`, the most recent, tested Docker image

`nightly` refers to the RAGFlow Docker image without embedding models. To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker image:

1. Stop the server:

   ```bash
   docker compose -f docker/docker-compose.yml down
   ```

2. Update the local code:

   ```bash
   git pull
   ```

3. Update **ragflow/docker/.env**:

   ```bash
   RAGFLOW_IMAGE=infiniflow/ragflow:nightly
   ```

4. Update the RAGFlow image and restart RAGFlow:

   ```bash
   docker compose -f docker/docker-compose.yml pull
   docker compose -f docker/docker-compose.yml up -d
   ```

## Upgrade RAGFlow to a given release

To upgrade RAGFlow, you must upgrade **both** your code **and** your Docker image:

1. Stop the server:

   ```bash
   docker compose -f docker/docker-compose.yml down
   ```

2. Update the local code:

   ```bash
   git pull
   ```

3. Switch to the latest, officially published release, e.g., `v0.23.1`:

   ```bash
   git checkout -f v0.23.1
   ```

4. Update **ragflow/docker/.env**:

   ```bash
   RAGFLOW_IMAGE=infiniflow/ragflow:v0.23.1
   ```

5. Update the RAGFlow image and restart RAGFlow:

   ```bash
   docker compose -f docker/docker-compose.yml pull
   docker compose -f docker/docker-compose.yml up -d
   ```

## Frequently asked questions

### Do I need to back up my datasets before upgrading RAGFlow?

No, you do not need to. Upgrading RAGFlow in itself will *not* remove your uploaded data or dataset settings. However, be aware that `docker compose -f docker/docker-compose.yml down -v` will remove Docker container volumes, resulting in data loss.

### Upgrade RAGFlow in an offline environment (without Internet access)

1. From an environment with Internet access, pull the required Docker image.
2. Save the Docker image to a **.tar** file:

   ```bash
   docker save -o ragflow.v0.23.1.tar infiniflow/ragflow:v0.23.1
   ```

3. Copy the **.tar** file to the target server.
4. Load the **.tar** file into Docker:

   ```bash
   docker load -i ragflow.v0.23.1.tar
   ```
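After loading, you can confirm the image is present before restarting the stack. This is a small sanity check, not a required step:

```bash
# Verify the loaded image is visible to Docker:
docker images | grep ragflow

# Then restart as usual:
docker compose -f docker/docker-compose.yml up -d
```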
---

---
sidebar_position: 0
slug: /
---

# Get started

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import APITable from '@site/src/components/APITable';

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it is capable of providing truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.

This quick start guide describes a general process from:

- Starting up a local RAGFlow server,
- Creating a dataset,
- Intervening with file parsing, to
- Establishing an AI chat based on your datasets.

:::danger IMPORTANT
We officially support x86 CPU and Nvidia GPU, and this document offers instructions on deploying RAGFlow using Docker on x86 platforms. While we also test RAGFlow on ARM64 platforms, we do not maintain RAGFlow Docker images for ARM. If you are on an ARM platform, follow [this guide](./develop/build_docker_image.mdx) to build a RAGFlow Docker image.
:::

## Prerequisites

- CPU ≥ 4 cores (x86);
- RAM ≥ 16 GB;
- Disk ≥ 50 GB;
- Docker ≥ 24.0.0 & Docker Compose ≥ v2.26.1;
- [gVisor](https://gvisor.dev/docs/user_guide/install/): Required only if you intend to use the code executor ([sandbox](https://github.com/infiniflow/ragflow/tree/main/sandbox)) feature of RAGFlow.

:::tip NOTE
If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
:::
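A quick way to check most of these prerequisites from a Linux shell (illustrative only; the thresholds mirror the list above):

```bash
# CPU cores, memory, and free disk space:
nproc
free -h
df -h .

# Docker and Docker Compose versions:
docker --version
docker compose version
```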
## Start up the server

This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike.

1. Ensure `vm.max_map_count` ≥ 262144.

   This value sets the maximum number of memory map areas a process may have. Its default value is 65530. While most applications require fewer than a thousand maps, reducing this value can result in abnormal behaviors, and the system will throw out-of-memory errors when a process reaches the limit.

   RAGFlow v0.23.1 uses Elasticsearch or [Infinity](https://github.com/infiniflow/infinity) for multiple recall. Setting the value of `vm.max_map_count` correctly is crucial to the proper functioning of the Elasticsearch component.

   If you are on Linux:

   1.1. Check the value of `vm.max_map_count`:

   ```bash
   $ sysctl vm.max_map_count
   ```

   1.2. Reset `vm.max_map_count` to a value of at least 262144 if it is not:

   ```bash
   $ sudo sysctl -w vm.max_map_count=262144
   ```

   :::caution WARNING
   This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
   :::

   1.3. To make your change permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:

   ```bash
   vm.max_map_count=262144
   ```
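On Linux, the apply-now and persist-for-reboots steps above can be combined into a single pair of commands. A convenience sketch; note that `tee -a` appends, so edit **/etc/sysctl.conf** manually instead if it already contains a `vm.max_map_count` entry:

```bash
# Apply immediately and persist across reboots:
sudo sysctl -w vm.max_map_count=262144
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf

# Reload settings from the file to confirm:
sudo sysctl -p
```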
   If you are on macOS with Docker Desktop, run the following command to update `vm.max_map_count`:

   ```bash
   docker run --rm --privileged --pid=host alpine sysctl -w vm.max_map_count=262144
   ```

   :::caution WARNING
   This change will be reset after a system reboot. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
   :::

   To make your change persistent, create a launch daemon with the proper settings:

   1.1. Create and open the file:

   ```shell
   sudo nano /Library/LaunchDaemons/com.user.vmmaxmap.plist
   ```

   1.2. Add the settings:

   ```xml
   <?xml version="1.0" encoding="UTF-8"?>
   <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
   <plist version="1.0">
     <dict>
       <key>Label</key>
       <string>com.user.vmmaxmap</string>
       <key>ProgramArguments</key>
       <array>
         <string>/usr/sbin/sysctl</string>
         <string>-w</string>
         <string>vm.max_map_count=262144</string>
       </array>
       <key>RunAtLoad</key>
       <true/>
     </dict>
   </plist>
   ```

   1.3. After saving the file, load the new daemon:

   ```shell
   sudo launchctl load /Library/LaunchDaemons/com.user.vmmaxmap.plist
   ```

   :::note
   If the above steps do not work, consider using [this workaround](https://github.com/docker/for-mac/issues/7047#issuecomment-1791912053), which employs a container and does not require manual editing of the macOS settings.
   :::

   #### If you are on Windows with Docker Desktop, then you *must* use docker-machine to set `vm.max_map_count`:

   ```bash
   $ docker-machine ssh
   $ sudo sysctl -w vm.max_map_count=262144
   ```

   #### If you are on Windows with Docker Desktop WSL 2 backend, then use docker-desktop to set `vm.max_map_count`:

   1.1. Run the following in WSL:

   ```bash
   $ wsl -d docker-desktop -u root
   $ sysctl -w vm.max_map_count=262144
   ```

   :::caution WARNING
   This change will be reset after you restart Docker. If you forget to update the value the next time you start up the server, you may get a `Can't connect to ES cluster` exception.
   :::

   1.2. If you prefer not to run those commands every time you restart Docker, you can update your `%USERPROFILE%\.wslconfig` as follows to keep your change permanent and global for all WSL distributions:

   ```bash
   [wsl2]
   kernelCommandLine = "sysctl.vm.max_map_count=262144"
   ```

   *This causes all WSL2 virtual machines to have that setting assigned when they start.*

   :::note
   If you are on Windows 11 or Windows 10 version 22H2, and have installed the Microsoft Store version of WSL, you can also update **/etc/sysctl.conf** within the docker-desktop WSL distribution to keep your change permanent:

   ```bash
   $ wsl -d docker-desktop -u root
   $ vi /etc/sysctl.conf
   ```

   ```bash
   # Append a line, which reads:
   vm.max_map_count = 262144
   ```
   :::
2. Clone the repo:

   ```bash
   $ git clone https://github.com/infiniflow/ragflow.git
   $ cd ragflow/docker
   $ git checkout -f v0.23.1
   ```

3. Use the pre-built Docker images and start up the server:

   ```bash
   # Use CPU for DeepDoc tasks:
   $ docker compose -f docker-compose.yml up -d
   ```

   | RAGFlow image tag | Image size (GB) | Stable? |
   | ----------------- | --------------- | ------------------------ |
   | v0.23.1 | ≈2 | Stable release |
   | nightly | ≈2 | _Unstable_ nightly build |

   :::tip NOTE
   The image size shown refers to the size of the *downloaded* Docker image, which is compressed. When Docker runs the image, it unpacks it, resulting in significantly greater disk usage: a Docker image will expand to around 7 GB once unpacked.
   :::

4. Check the server status after having the server up and running:

   ```bash
   $ docker logs -f docker-ragflow-cpu-1
   ```

   _The following output confirms a successful launch of the system:_

   ```bash
        ____   ___   ______ ______ __
       / __ \ /   | / ____// ____// /____  _      __
      / /_/ // /| |/ / __ / /_   / // __ \| | /| / /
     / _, _// ___ / /_/ // __/  / // /_/ /|  |/ |/ /
    /_/ |_|/_/  |_|\____//_/   /_/ \____/ |__/|__/

    * Running on all addresses (0.0.0.0)
   ```

   :::danger IMPORTANT
   If you skip this confirmation step and directly log in to RAGFlow, your browser may prompt a `network anomaly` error because, at that moment, your RAGFlow may not be fully initialized.
   :::

5. In your web browser, enter the IP address of your server and log in to RAGFlow.

   :::caution WARNING
   With the default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**sans** port number), as the default HTTP serving port `80` can be omitted when using the default configurations.
   :::

## Configure LLMs

RAGFlow is a RAG engine and needs to work with an LLM to offer grounded, hallucination-free question-answering capabilities. RAGFlow supports most mainstream LLMs. For a complete list of supported models, please refer to [Supported Models](./references/supported_models.mdx).

:::note
RAGFlow also supports deploying LLMs locally using Ollama, Xinference, or LocalAI, but this part is not covered in this quick start guide.
:::

To add and configure an LLM:

1. Click on your logo on the top right of the page **>** **Model providers**.
2. Click on the desired LLM and update the API key accordingly.
3. Click **System Model Settings** to select the default models:
   - Chat model,
   - Embedding model,
   - Image-to-text model,
   - and more.

> Some models, such as the image-to-text model **qwen-vl-max**, are subsidiary to a specific LLM. You may need to update your API key to access these models.

## Create your first dataset

You can upload files to a dataset in RAGFlow and parse them into chunks. A dataset is essentially a collection of parsed documents. Question answering in RAGFlow can be based on a particular dataset or multiple datasets. File formats that RAGFlow supports include documents (PDF, DOC, DOCX, TXT, MD, MDX), tables (CSV, XLSX, XLS), pictures (JPEG, JPG, PNG, TIF, GIF), and slides (PPT, PPTX).

To create your first dataset:

1. Click the **Dataset** tab in the top middle of the page **>** **Create dataset**.
2. Input the name of your dataset and click **OK** to confirm your changes.

   _You are taken to the **Configuration** page of your dataset._

   ![dataset configuration](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/configure_knowledge_base.jpg)

3. RAGFlow offers multiple chunk templates that cater to different document layouts and file formats. Select the embedding model and chunking method (template) for your dataset.

   :::danger IMPORTANT
   Once you have selected an embedding model and used it to parse a file, you are no longer allowed to change it. The obvious reason is that we must ensure that all files in a specific dataset are parsed using the *same* embedding model (so that they are compared in the same embedding space).
   :::

   _You are taken to the **Dataset** page of your dataset._

4. Click **+ Add file** **>** **Local files** to start uploading a particular file to the dataset.
5. In the uploaded file entry, click the play button to start file parsing:

   ![parse file](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/parse_file.jpg)

   :::caution NOTE
   - If your file parsing gets stuck at below 1%, see [this FAQ](./faq.mdx#why-does-my-document-parsing-stall-at-under-one-percent).
   - If your file parsing gets stuck at near completion, see [this FAQ](./faq.mdx#why-does-my-pdf-parsing-stall-near-completion-while-the-log-does-not-show-any-error).
   :::
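If you prefer to script dataset creation instead of clicking through the UI, RAGFlow's HTTP API exposes an equivalent call. A hedged sketch: the endpoint and body follow the HTTP API reference, and `IP_OF_YOUR_MACHINE` and `YOUR_API_KEY` are placeholders (see *Acquire a RAGFlow API key*); verify the exact schema against that reference.

```bash
# Sketch: create a dataset over RAGFlow's HTTP API.
curl -X POST "http://IP_OF_YOUR_MACHINE/api/v1/datasets" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"name": "my_first_dataset"}'
```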
## Intervene with file parsing

RAGFlow features visibility and explainability, allowing you to view the chunking results and intervene where necessary. To do so:

1. Click on the file that has completed parsing to view the chunking results:

   _You are taken to the **Chunk** page:_

   ![chunks](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/file_chunks.jpg)

2. Hover over each snapshot for a quick view of each chunk.
3. Double-click the chunked texts to add keywords or make *manual* changes where necessary:

   ![update chunk](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/add_keyword_question.jpg)

   :::caution NOTE
   You can add keywords or questions to a file chunk to improve its ranking for queries containing those keywords. This action increases the chunk's keyword weight and can improve its position in the search list.
   :::

4. In Retrieval testing, ask a quick question in **Test text** to double-check if your configurations work:

   _As you can tell from the following, RAGFlow responds with truthful citations:_

   ![retrieval test](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/retrieval_test.jpg)

## Set up an AI chat

Conversations in RAGFlow are based on a particular dataset or multiple datasets. Once you have created your dataset and finished file parsing, you can go ahead and start an AI conversation.

1. Click the **Chat** tab in the middle top of the page **>** **Create chat** to create a chat assistant.
2. Click the created chat app to enter its configuration page.

   > RAGFlow offers the flexibility of choosing a different chat model for each dialogue, while allowing you to set the default models in **System Model Settings**.

3. Update **Chat setting** on the right of the configuration page:

   - Name your assistant and specify your datasets.
   - **Empty response**:
     - If you wish to *confine* RAGFlow's answers to your datasets, leave a response here. Then, when it doesn't retrieve an answer, it *uniformly* responds with what you set here.
     - If you wish RAGFlow to *improvise* when it doesn't retrieve an answer from your datasets, leave it blank, which may give rise to hallucinations.

4. Update **System prompt** or leave it as is for the beginning.
5. Select a chat model in the **Model** dropdown list.
6. Now, let's start the show:

   ![chat_thermal_solution](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/chat_thermal_solution.jpg)

:::tip NOTE
RAGFlow also offers HTTP and Python APIs for you to integrate RAGFlow's capabilities into your applications. Read the following documents for more information:

- [Acquire a RAGFlow API key](./develop/acquire_ragflow_api_key.md)
- [HTTP API reference](./references/http_api_reference.md)
- [Python API reference](./references/python_api_reference.md)
:::
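As a taste of those APIs, here is a hedged sketch of conversing with a chat assistant over HTTP. The endpoint shape and fields follow the HTTP API reference, with `IP_OF_YOUR_MACHINE`, `YOUR_API_KEY`, and `CHAT_ASSISTANT_ID` as placeholders; consult the reference for the exact schema and session handling.

```bash
# Sketch: ask a chat assistant a question over RAGFlow's HTTP API.
curl -X POST "http://IP_OF_YOUR_MACHINE/api/v1/chats/CHAT_ASSISTANT_ID/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"question": "What thermal solution does the dataset describe?", "stream": false}'
```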
---

---
sidebar_position: 0
slug: /glossary
---

# Glossary

Definitions of key terms and basic concepts related to RAGFlow.

---

import TOCInline from '@theme/TOCInline';

---

## C

### Cross-language search

Cross-language search (also known as cross-lingual retrieval) is a feature introduced in version 0.19.0. It enables users to submit queries in one language (for example, English) and retrieve relevant documents written in other languages, such as Chinese or Spanish. This feature is enabled by the system's default chat model, which translates queries to ensure accurate matching of semantic meaning across languages.

By enabling cross-language search, users can effortlessly access a broader range of information regardless of language barriers, significantly enhancing the system's usability and inclusiveness. This feature is available in the retrieval test and chat assistant settings. See [Run retrieval test](../guides/dataset/run_retrieval_test.md) and [Start AI chat](../guides/chat/start_chat.md) for further details.

---

---
sidebar_position: 1
slug: /supported_models
---

# Supported models

import APITable from '@site/src/components/APITable';

A complete list of models supported by RAGFlow, which will continue to expand.

| Provider | LLM | Image2Text | Speech2text | TTS | Embedding | Rerank | OCR |
| --------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |
| Anthropic | :heavy_check_mark: | | | | | | |
| Azure-OpenAI | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | | |
| BaiChuan | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| BaiduYiyan | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| Bedrock | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| Cohere | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| DeepSeek | :heavy_check_mark: | | | | | | |
| Fish Audio | | | | :heavy_check_mark: | | | |
| Gemini | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | | |
| Google Cloud | :heavy_check_mark: | | | | | | |
| GPUStack | :heavy_check_mark: | | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |
| Groq | :heavy_check_mark: | | | | | | |
| HuggingFace | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| Jina | | | | | :heavy_check_mark: | :heavy_check_mark: | |
| LocalAI | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | | |
| LongCat | :heavy_check_mark: | | | | | | |
| LM-Studio | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | | |
| MiniMax | :heavy_check_mark: | | | | | | |
| MinerU | | | | | | | :heavy_check_mark: |
| Mistral | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| ModelScope | :heavy_check_mark: | | | | | | |
| Moonshot | :heavy_check_mark: | :heavy_check_mark: | | | | | |
| NovitaAI | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| NVIDIA | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| Ollama | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | | |
| OpenAI | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | |
| OpenAI-API-Compatible | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| OpenRouter | :heavy_check_mark: | :heavy_check_mark: | | | | | |
| Replicate | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| PPIO | :heavy_check_mark: | | | | | | |
| SILICONFLOW | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| StepFun | :heavy_check_mark: | | | | | | |
| Tencent Hunyuan | :heavy_check_mark: | | | | | | |
| Tencent Cloud | | | :heavy_check_mark: | | | | |
| TogetherAI | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| TokenPony | :heavy_check_mark: | | | | | | |
| Tongyi-Qianwen | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |
| Upstage | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| VLLM | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| VolcEngine | :heavy_check_mark: | | | | | | |
| Voyage AI | | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| Xinference | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | |
| XunFei Spark | :heavy_check_mark: | | | :heavy_check_mark: | | | |
| xAI | :heavy_check_mark: | :heavy_check_mark: | | | | | |
| ZHIPU-AI | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | | |
| DeepInfra | :heavy_check_mark: | | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | |
| 302.AI | :heavy_check_mark: | :heavy_check_mark: | | | :heavy_check_mark: | :heavy_check_mark: | |
| CometAPI | :heavy_check_mark: | | | | :heavy_check_mark: | | |
| DeerAPI | :heavy_check_mark: | :heavy_check_mark: | | :heavy_check_mark: | :heavy_check_mark: | | |
| Jiekou.AI | :heavy_check_mark: | | | | :heavy_check_mark: | :heavy_check_mark: | |

:::danger IMPORTANT
If your model is not listed here but has APIs compatible with those of OpenAI, click **OpenAI-API-Compatible** on the **Model providers** page to configure your model.
:::

## Example: AI Badgr (OpenAI-compatible)

You can use **AI Badgr** with RAGFlow via the existing OpenAI-API-Compatible provider. To configure AI Badgr:

- **Provider**: `OpenAI-API-Compatible`
- **Base URL**: `https://aibadgr.com/api/v1`
- **API Key**: your AI Badgr API key (from the AI Badgr dashboard)
- **Model**: any AI Badgr chat or embedding model ID, as exposed by AI Badgr's OpenAI-compatible APIs

AI Badgr implements OpenAI-compatible endpoints for `/v1/chat/completions`, `/v1/embeddings`, and `/v1/models`, so no additional code changes in RAGFlow are required.

:::note
The list of supported models is extracted from [this source](https://github.com/infiniflow/ragflow/blob/main/rag/llm/__init__.py) and may not be the most current. For the latest supported model list, please refer to the Python file.
:::