Metadata & Multi-Tenancy Filtering with Qdrant

ank · February 20, 2025, 12:57pm

I’m using Qdrant as the vector DB in my SQL Agent and have two questions:

1. How do I add metadata for each point when indexing documents in Qdrant (e.g. adding a user_id)?
2. How do I apply a filter on my knowledge base for multi-tenancy (e.g. user_id)? I want to restrict search results so that each tenant only accesses its own documents based on the metadata.

Below is a simplified version of my SQL Agent code where I initialize the knowledge base:


class SQLAgent:

    def __init__(self, config: SQLAgentConfig) -> None:
        self.config = config
        self.sql_agent: Optional[Agent] = None

    def initialize_knowledge_base(self) -> None:

        try:
            vector_db = Qdrant(
                collection=self.config.qdrant_collection,
                host=self.config.qdrant_host,
                port=str(self.config.qdrant_port),
                embedder=OpenAIEmbedder(
                    id=self.config.embedding_model,
                    api_key=self.config.openai_api_key,
                    dimensions=1536,
                ),
            )
            # TODO: apply a per-tenant filter here, e.g. {"user_id": user_id}
            self.knowledge_base = AgentKnowledge(vector_db=vector_db, num_documents=3)
            logger.info("Support Agent knowledge base initialized successfully.")
        except Exception:
            logger.exception("Failed to initialize Support Agent knowledge base.")
            raise

    def initialize_agent(self) -> Agent:

        try:
            self.sql_agent = Agent(
                name=self.config.agent_name,
                model=OpenAIChat(
                    id=self.config.model,
                    api_key=self.config.openai_api_key,
                ),
                user_id=self.config.user_id,
                session_id=self.config.session_id,
                markdown=self.config.markdown,
                show_tool_calls=False,
                description=self.config.description,
                instructions=self.config.instructions,
                tools=[
                    SQLTools(
                        db_engine=self.config.engine,
                        tables=self.config.allowed_tables,
                        dialect="mysql+pymysql",
                        describe_table=True,
                        list_tables=True,
                    )
                ],
                retries=self.config.retries,
                debug_mode=self.config.debug_mode,
                num_history_responses=self.config.num_history_responses,
                add_history_to_messages=True,
                storage=MySQLAgentStorage(
                    table_name=self.config.conversation_table_name,
                    db_url=self.config.db_url,
                ),
                extra_data=self.config.extra_data,
                read_chat_history=self.config.read_chat_history,
                read_tool_call_history=self.config.read_tool_call_history,
                stream=self.config.stream,
                add_datetime_to_instructions=self.config.add_datetime_to_instructions,
                telemetry=self.config.telemetry,
            )
            logger.info("SQL Agent initialized successfully.")
            return self.sql_agent
        except Exception:
            logger.exception("Failed to initialize SQL Agent.")
            raise

Thanks

Monali · February 21, 2025, 5:36am

Hi @ank
Thanks for reaching out and for using Agno! I’ve looped in the right engineers to help with your question. We usually respond within 24 hours, but if this is urgent, just let us know, and we’ll do our best to prioritize it.
Appreciate your patience—we’ll get back to you soon!

ank · February 21, 2025, 10:39am

Hi @Monali,

Thanks for the quick response. It’s kind of urgent—if you could help with this, it would be great!

I managed to solve the first issue by modifying the CombinedKnowledgeBase as shown below:

from typing import Any, Dict, Iterator, List
from agno.document import Document
from agno.knowledge.agent import AgentKnowledge
from agno.utils.log import logger

class CombinedKnowledgeBase(AgentKnowledge):
    sources: List[AgentKnowledge] = []
    extra_metadata: Dict[str, Any] = {}  # Global metadata to merge into every document

    @property
    def document_lists(self) -> Iterator[List[Document]]:
        """Iterate over all knowledge base sources and yield lists of documents.
        For each Document, merge in the global extra_metadata (document's own values take precedence).
        """
        for kb in self.sources:
            logger.debug(f"Loading documents from {kb.__class__.__name__}")
            for docs in kb.document_lists:
                for doc in docs:
                    # Always add the file name if available.
                    file_meta = {"file_name": doc.name} if doc.name else {}
                    # Merge the global extra metadata with the document's own metadata.
                    doc.meta_data = {
                        **self.extra_metadata,
                        **doc.meta_data,
                        **file_meta,
                    }
                yield docs
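
The merge order in the dictionary above decides which value wins when keys collide: later entries overwrite earlier ones. Here is a minimal sketch of that behaviour with plain dicts. The helper name `merge_metadata` and the sample values are hypothetical and only mirror the three-way merge in the class:

```python
from typing import Any, Dict, Optional


def merge_metadata(
    extra: Dict[str, Any], own: Dict[str, Any], name: Optional[str]
) -> Dict[str, Any]:
    """Merge metadata; dicts unpacked later overwrite earlier ones on key collisions."""
    file_meta = {"file_name": name} if name else {}
    return {**extra, **own, **file_meta}


merged = merge_metadata(
    extra={"user_id": "tenant-a", "source": "global"},
    own={"source": "faq.pdf", "page": 2},
    name="faq.pdf",
)
print(merged)
# {'user_id': 'tenant-a', 'source': 'faq.pdf', 'page': 2, 'file_name': 'faq.pdf'}
```

One consequence worth noting: because the document's own values take precedence, a document that already carries a user_id would override the one in extra_metadata. For tenant isolation it may be safer to unpack the tenant key last.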

Could you please help me with the second question on how to apply a filter for multi-tenancy (e.g. filtering by user_id) so that each tenant only accesses its own documents?

Also, if there’s a better approach to handling metadata in the first solution, I’d love to hear your suggestions.

Thanks so much for your assistance!

Monali · February 24, 2025, 5:15am

Hi @ank
Apologies for the delay. We’ll make sure to get back to you by the end of the day.

monalisha · February 24, 2025, 2:33pm

Hi @ank, thanks for raising this. We are actively working on these features, but as of now they are not supported by our SDK.

ank · February 26, 2025, 9:41am

Hi @monalisha

That’s fine. Do you have any workaround or alternative solution I could try in the meantime?

Thanks for your assistance!

monalisha · February 27, 2025, 1:36pm

Hi @ank, in the meantime you’re welcome to write your own knowledge base and vector DB by extending ours. We are an open-source platform and would love to see such use cases built on top of Agno features.
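
As a rough illustration of what such an extension might pass down to Qdrant, here is a minimal sketch that builds a tenant filter in Qdrant's JSON filter form, plus a client-side fallback that filters results after the search. Both helper names are hypothetical. The payload key `meta_data.user_id` is an assumption about how the documents were stored, so check a stored point's payload before relying on it:

```python
from typing import Any, Dict, List


def tenant_filter(user_id: str, key: str = "meta_data.user_id") -> Dict[str, Any]:
    """Build a Qdrant filter (JSON form) that matches exactly one tenant.

    `key` is an assumed payload path; adjust it to the actual payload layout.
    """
    if not user_id:
        raise ValueError("user_id must be a non-empty string")
    return {"must": [{"key": key, "match": {"value": user_id}}]}


def post_filter(hits: List[Dict[str, Any]], user_id: str) -> List[Dict[str, Any]]:
    """Client-side fallback: keep only hits whose metadata belongs to the tenant."""
    return [h for h in hits if h.get("meta_data", {}).get("user_id") == user_id]


print(tenant_filter("tenant-a"))
# {'must': [{'key': 'meta_data.user_id', 'match': {'value': 'tenant-a'}}]}
```

With qdrant-client, a dict like this can typically be converted with `models.Filter(**tenant_filter("tenant-a"))` and passed as the `query_filter` argument of a search call. Server-side filtering is preferable to `post_filter`, since filtering after the search can return fewer than the requested number of documents.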
