feat: Add Milvus vector store backend support for RAG Engine #2026

Open
bangqipropel wants to merge 5 commits into kaito-project:main from bangqipropel:milvus_support

Conversation

@bangqipropel

Collaborator
  • Add MilvusVectorStoreHandler with full CRUD overrides (list, update, delete, dedup, retrieve)
  • Wire Milvus routing in main.py and config.py
  • Add pymilvus and llama-index-vector-stores-milvus dependencies
  • Add unit tests (test_milvus_store.py) using Milvus Lite
  • Add Milvus deployment YAML and RAGEngine example YAML
  • Add Milvus E2E test case and workflow deployment step
  • Update CRD comments, docs (rag.md, rag-api.md) with Milvus info

Reason for Change:

Requirements

  • added unit tests and e2e tests (if applicable).

Issue Fixed:

Notes for Reviewers:

@kaito-pr-agent

kaito-pr-agent Bot commented May 5, 2026

Contributor

Title

Add Milvus vector store backend support for RAG Engine


Description

  • Implement MilvusVectorStoreHandler with full CRUD and index restoration logic.

  • Integrate Milvus routing in main entry point and configuration files.

  • Add unit tests and E2E test cases for Milvus backend validation.

  • Update documentation and provide deployment example manifests.


Changes walkthrough 📝

Relevant files

Configuration changes (3 files)
  • ragengine_types.go: Update CRD comments to include Milvus support (+3/-2)
  • config.py: Update supported vector DB types in comments (+1/-1)
  • ragengine-e2e-workflow.yaml: Add Milvus deployment step in CI (+51/-0)

Tests (2 files)
  • rag_test.go: Add E2E test suite for Milvus backend (+140/-0)
  • test_milvus_store.py: Add unit tests for Milvus store (+170/-0)

Enhancement (2 files)
  • main.py: Add Milvus handler initialization logic (+9/-1)
  • milvus_store.py: Implement Milvus vector store handler (+690/-0)

Documentation (4 files)
  • kaito_ragengine_milvus.yaml: Add RAGEngine example for Milvus (+19/-0)
  • milvus-deployment.yaml: Add Milvus deployment manifest (+212/-0)
  • rag-api.md: Update API comparison table with Milvus (+8/-8)
  • rag.md: Add Milvus setup and usage documentation (+53/-0)

Dependencies (1 file)
  • requirements.txt: Add Milvus Python dependencies (+2/-0)

@kaito-pr-agent

    kaito-pr-agent Bot commented May 5, 2026

    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 Security concerns

    Sensitive information exposure:
    The milvus-deployment.yaml file contains hardcoded MinIO credentials (minioadmin/minioadmin) that should be moved to Kubernetes Secrets. Additionally, the VECTOR_DB_ACCESS_SECRET is referenced but the actual secret creation is not shown in the deployment examples.

    ⚡ Recommended focus areas for review

    Critical Bug - Wrong Backend in Test

    The first Milvus test case (lines 420-475) incorrectly calls createLocalEmbeddingKaitoVLLMRAGEngineWithQdrant instead of createLocalEmbeddingKaitoVLLMRAGEngineWithMilvus. This is a copy-paste error that would test Qdrant instead of Milvus, defeating the purpose of the test.

    ragengineObj := createLocalEmbeddingKaitoVLLMRAGEngineWithQdrant(clusterIP, "v1/completions")

    Performance Concern - Pagination Efficiency
    The _list_documents_in_index method over-fetches entities (batch_size = max(limit * 3, 100)) and deduplicates in Python. For large collections with many chunks per document, this could result in excessive Milvus queries and memory usage. Consider implementing server-side deduplication or more efficient pagination.
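
The cost of this pattern can be sketched with an in-memory stand-in for the collection. Only the `max(limit * 3, 100)` batch size comes from the comment above; `list_unique_docs`, `fetch_page`, and the row shape are hypothetical names used for illustration and are not the handler's actual code.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def list_unique_docs(
    fetch_page: Callable[[int, int], List[Row]],
    limit: int,
    offset: int = 0,
) -> List[str]:
    """Return up to `limit` unique doc_ids, skipping the first `offset` unique ones."""
    batch_size = max(limit * 3, 100)  # over-fetch factor quoted in the review
    seen: Dict[str, None] = {}  # dict keeps first-seen order
    row_offset = 0
    while len(seen) < offset + limit:
        rows = fetch_page(row_offset, batch_size)
        if not rows:
            break  # collection exhausted
        for row in rows:
            seen.setdefault(row["doc_id"], None)  # client-side deduplication
        row_offset += len(rows)
    return list(seen)[offset:offset + limit]


# Hypothetical collection: 50 documents with 40 chunks each.
chunks = [{"doc_id": f"doc-{d}"} for d in range(50) for _ in range(40)]
calls = []


def fetch_page(row_offset: int, size: int) -> List[Row]:
    calls.append((row_offset, size))  # record each simulated query
    return chunks[row_offset:row_offset + size]


page = list_unique_docs(fetch_page, limit=10)
rows_fetched = sum(size for _, size in calls)
```

In this made-up collection, listing ten documents takes four queries and 400 fetched rows, so the number of round trips grows with the chunks-per-document ratio rather than with `limit` alone.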

    Security - Hardcoded Credentials

    The MinIO deployment uses hardcoded credentials (MINIO_ACCESS_KEY: minioadmin, MINIO_SECRET_KEY: minioadmin) in the YAML. These should be stored in Kubernetes Secrets and referenced via envFrom or secretKeyRef to prevent credential exposure in version control.

    - name: MINIO_ACCESS_KEY
      value: minioadmin
    - name: MINIO_SECRET_KEY
      value: minioadmin

    @bangqipropel changed the title from "Add Milvus vector store backend support for RAG Engine" to "feat: Add Milvus vector store backend support for RAG Engine" on May 5, 2026
    @kaito-pr-agent

    kaito-pr-agent Bot commented May 5, 2026

    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Fix incorrect backend function call in Milvus test case

    This test case is named "should create RAG with Milvus vector database backend
    successfully" but incorrectly calls the Qdrant factory function. Replace it with the
    Milvus-specific function to test the correct backend.

    test/rage2e/rag_test.go [442]

    -ragengineObj := createLocalEmbeddingKaitoVLLMRAGEngineWithQdrant(clusterIP, "v1/completions")
    +ragengineObj := createLocalEmbeddingKaitoVLLMRAGEngineWithMilvus(clusterIP, "v1/completions")
    Suggestion importance[1-10]: 9

    Why: The test case is named "should create RAG with Milvus vector database backend successfully" but uses the Qdrant factory function. This is a critical bug where the test code does not match the test intent/name.

    High
    Fix hash comparison to use consistent metadata

    The hash comparison uses empty metadata for the old document but includes metadata
    for the new document. This inconsistency may cause false positives where unchanged
    documents are marked as updated. Use consistent metadata for both hash calculations.

    presets/ragengine/vector_store/milvus_store.py [546-552]

     # Compare hash to skip unchanged documents
     old_text = result[0].get("text", "")
    +old_metadata = result[0].get("metadata", {})
     new_hash = LlamaDocument(text=doc.text, metadata=doc.metadata).hash
    -old_hash = LlamaDocument(text=old_text, metadata={}).hash
    +old_hash = LlamaDocument(text=old_text, metadata=old_metadata).hash
     if new_hash == old_hash:
         unchanged_docs.append(doc)
         continue
    Suggestion importance[1-10]: 2

    Why: While the suggestion correctly identifies a metadata inconsistency in hash comparison, the proposed fix is ineffective because the query in update_documents only requests ["text"], so metadata will not be present in result[0] regardless of the code change.

    Low
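
The false-positive mechanism in the hash suggestion above can be shown with a stand-in hash. This sketch assumes the document hash covers both text and metadata, as the suggestion implies; `content_hash` is a hypothetical helper built on `hashlib` and is not LlamaIndex's implementation.

```python
import hashlib
import json


def content_hash(text: str, metadata: dict) -> str:
    """Stand-in for a document hash that covers both text and metadata."""
    payload = json.dumps({"text": text, "metadata": metadata}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# An unchanged document: stored and incoming copies are identical.
stored = {"text": "hello", "metadata": {"filename": "a.txt"}}
incoming = {"text": "hello", "metadata": {"filename": "a.txt"}}

new_hash = content_hash(incoming["text"], incoming["metadata"])

# Old side hashed with empty metadata, mirroring the original comparison.
mismatched_equal = new_hash == content_hash(stored["text"], {})

# Old side hashed with its own metadata.
consistent_equal = new_hash == content_hash(stored["text"], stored["metadata"])
```

Here `mismatched_equal` is false even though nothing changed, which is the false positive the suggestion describes. As the "Why" note points out, the consistent comparison only helps if the query also returns the stored metadata, not just `["text"]`.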
    General
    Correct nodeSelector workload label for Milvus components

    The nodeSelector uses workload: qdrant for Milvus components (etcd, minio, milvus),
    but this should be workload: milvus to match the intended workload label and avoid
    scheduling conflicts with Qdrant deployments.

    examples/RAG/milvus-deployment.yaml [28-29]

     nodeSelector:
    -  workload: qdrant
    +  workload: milvus
    Suggestion importance[1-10]: 9

    Why: Copy-paste error from Qdrant manifest causes incorrect node scheduling labels, likely preventing deployment if nodes are not labeled qdrant.

    High
    Update retrieval comment to reflect Milvus search behavior

    The comment references Qdrant's hybrid search feature, but this test is for Milvus
    which uses dense vector similarity. Update the comment to accurately reflect
    Milvus's search behavior.

    test/rage2e/rag_test.go [459-461]

    -// Retrieve document - Qdrant uses hybrid search (dense + sparse)
    +// Retrieve document - Milvus uses dense vector similarity
     expectedText := indexDoc["text"].(string)
     err = createAndValidateRetrievalPod(ragengineObj, docID, expectedText)
    Suggestion importance[1-10]: 7

    Why: The comment references Qdrant's hybrid search, but the test is for Milvus. Updating the comment improves accuracy and consistency with the test name.

    Medium
    Align E2E Milvus deployment with production configuration

    The E2E Milvus deployment uses embedded etcd and local storage, which differs from
    the production manifest that uses external etcd and MinIO. This inconsistency may
    cause test failures that don't reflect production behavior. Consider aligning the
    E2E setup with production or documenting the difference.

    .github/workflows/ragengine-e2e-workflow.yaml [195-199]

     env:
    -- name: ETCD_USE_EMBED
    -  value: "true"
    -- name: COMMON_STORAGETYPE
    -  value: "local"
    +- name: ETCD_ENDPOINTS
    +  value: etcd:2379
    +- name: MINIO_ADDRESS
    +  value: minio:9000
    Suggestion importance[1-10]: 5

    Why: Points out configuration inconsistency between E2E and production manifests, which is valid but E2E often prioritizes simplicity.

    Low
    Update Milvus URL to use correct namespace

    The documentation shows Milvus URL using default namespace, but the E2E workflow
    deploys Milvus to the milvus namespace. Update the documentation to reflect the
    correct namespace or clarify that users should adjust the URL based on their
    deployment namespace.

    website/docs/rag.md [229]

    -url: "http://milvus.default.svc.cluster.local:19530"
    +url: "http://milvus.milvus.svc.cluster.local:19530"
    Suggestion importance[1-10]: 5

    Why: Improves consistency between documentation and E2E test configuration, though default namespace is also acceptable for examples.

    Low
    Security
    Replace hardcoded MinIO credentials with Secret references

    The MinIO deployment uses default credentials (minioadmin/minioadmin) which is a
    security risk. Replace these with environment variable references to Kubernetes
    Secrets to avoid exposing credentials in the manifest.

    examples/RAG/milvus-deployment.yaml [102-106]

     env:
       - name: MINIO_ACCESS_KEY
    -    value: minioadmin
    +    valueFrom:
    +      secretKeyRef:
    +        name: minio-credentials
    +        key: access-key
       - name: MINIO_SECRET_KEY
    -    value: minioadmin
    +    valueFrom:
    +      secretKeyRef:
    +        name: minio-credentials
    +        key: secret-key
    Suggestion importance[1-10]: 8

    Why: Hardcoded credentials in example manifests are a security risk. Using Kubernetes Secrets is best practice even for examples.

    Medium

    nodeSelector:
      workload: qdrant
    containers:
    - name: etcd

    @bfoley13 bfoley13 May 7, 2026

    Collaborator

    Are this and MinIO needed here? If so, we might not want the qdrant nodeSelector.

    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # Query all entities from Milvus to rebuild the docstore
    llama_docs = self._query_collection_to_docs(collection_name)

    Collaborator

    I'm curious what kind of latency this will add on startup. We might want some telemetry on this.
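
One lightweight way to get that visibility is to time the restore step and log the elapsed duration. The sketch below uses only the standard library; the `timed` helper, the logger name, and the placeholder workload are assumptions for illustration and do not reflect the handler's real code.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("ragengine.restore")  # hypothetical logger name


@contextmanager
def timed(operation: str, timings: dict):
    """Record and log how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[operation] = elapsed
        logger.info("%s took %.3fs", operation, elapsed)


timings = {}
with timed("restore_index:demo", timings):
    # Placeholder for the real work, e.g. querying every entity in a collection.
    docs = [{"doc_id": str(i)} for i in range(1000)]
```

Keeping the measurement in a dictionary as well as the log makes it straightforward to export the same numbers as metrics later, if log lines alone turn out to be insufficient.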


    milvus_offset += len(entities)

    # Total unique doc count (approximate — use count of unique doc_ids seen)

    Collaborator

    Is the total count actually approximate? It seems like count(*) would take care of this.
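
One way to read this question, assuming the collection stores one entity per chunk: a row-level count is exact, but it counts chunks, while the number of distinct documents depends on the `doc_id` values. The entities below are made up for illustration.

```python
from collections import Counter

# Hypothetical entities: one row per chunk, each tagged with its parent doc_id.
entities = [
    {"id": 1, "doc_id": "doc-a"},
    {"id": 2, "doc_id": "doc-a"},
    {"id": 3, "doc_id": "doc-a"},
    {"id": 4, "doc_id": "doc-b"},
    {"id": 5, "doc_id": "doc-c"},
    {"id": 6, "doc_id": "doc-c"},
]

chunk_count = len(entities)  # what a row-level count would report
chunks_per_doc = Counter(e["doc_id"] for e in entities)
document_count = len(chunks_per_doc)  # requires looking at doc_id values
```

Whichever number the API returns as the total, naming it explicitly (chunks versus documents) avoids the ambiguity raised here.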

    Comment thread test/rage2e/rag_test.go

    clusterIP := service.Spec.ClusterIP

    ragengineObj := createLocalEmbeddingKaitoVLLMRAGEngineWithQdrant(clusterIP, "v1/completions")

    Collaborator

    Looks like this is pointing to Qdrant.

    Comment thread website/docs/rag.md
    See [examples/RAG/kaito_ragengine_milvus.yaml](https://github.com/kaito-project/kaito/blob/main/examples/RAG/kaito_ragengine_milvus.yaml) for the full example.

    :::tip
    Milvus persists data via its own storage layer (etcd + MinIO or local disk). On pod restart, RAGEngine automatically rediscovers existing Milvus collections and restores them as indexes.

    Collaborator

    Ah, I see that etcd and MinIO are used by Milvus in the deployment example, so just remove the nodeSelectors from the example deployment.

    - Remove buggy duplicate Milvus E2E test that incorrectly called Qdrant helper
    - Remove qdrant nodeSelector from etcd/minio/milvus in milvus-deployment.yaml
    - Add note about demo MinIO credentials, recommend Secret refs for production
    - Clarify Milvus _list_documents total_items semantics (exact chunk count)
    - Add restore latency logging for _restore_single_index startup observability
    - Document pagination over-fetch trade-off
    - Pin milvus-lite==2.5.1 to match the version locally validated; newer
      releases have regressed on preserving dynamic metadata fields and
      ref_doc_id when used with pymilvus 2.6.7 in CI environments.
    - Pin setuptools<81 so milvus-lite's pkg_resources import keeps working.
    - Make _entity_to_doc_dict fall back to _node_content JSON for both
      metadata and doc_id, so user fields like filename/branch survive even
      when the Milvus dynamic schema doesn't promote them to top-level
      columns. Fixes test_list_documents_with_filter_index KeyError.
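
The fallback described in the last commit message can be sketched as follows. This is a minimal illustration, not the PR's `_entity_to_doc_dict`: the reserved field list and the JSON layout (`metadata` and `ref_doc_id` keys inside `_node_content`) are assumptions, and the real payload may nest the source document id differently.

```python
import json

# Fields treated as structural rather than user metadata (assumed list).
RESERVED = {"id", "text", "doc_id", "embedding", "_node_content"}


def entity_to_doc_dict(entity: dict) -> dict:
    """Prefer top-level columns, fall back to the serialized node content."""
    raw = entity.get("_node_content") or "{}"
    try:
        node = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        node = {}
    if not isinstance(node, dict):
        node = {}

    # Start from the serialized metadata, then let promoted columns win.
    metadata = dict(node.get("metadata") or {})
    metadata.update({k: v for k, v in entity.items() if k not in RESERVED})

    doc_id = entity.get("doc_id") or node.get("ref_doc_id") or entity.get("id")
    return {"doc_id": doc_id, "text": entity.get("text", ""), "metadata": metadata}


# Entity whose user fields were not promoted to top-level columns.
entity = {
    "id": "chunk-1",
    "text": "hello",
    "_node_content": json.dumps(
        {"metadata": {"filename": "a.txt", "branch": "main"}, "ref_doc_id": "doc-1"}
    ),
}
doc = entity_to_doc_dict(entity)
```

With this shape, a filter on `filename` or `branch` finds the field whether or not the dynamic schema exposed it as a column, which is the failure mode the commit message attributes to `test_list_documents_with_filter_index`.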