Metadata & Multi-Tenancy Filtering with Qdrant

I’m using Qdrant as the vector DB in my SQL Agent and have two questions:

  1. How do I add metadata for each point when indexing documents in Qdrant (e.g. adding a user_id)?

  2. How do I apply a filter on my knowledge base for multi-tenancy (e.g. user_id)? I want to restrict search results so that each tenant only accesses its own documents based on the metadata.

Below is a simplified version of my SQL Agent code where I initialize the knowledge base:


class SQLAgent:

    def __init__(self, config: SQLAgentConfig) -> None:
        self.config = config
        self.sql_agent: Optional[Agent] = None

    def initialize_knowledge_base(self) -> None:

        try:
            vector_db = Qdrant(
                collection=self.config.qdrant_collection,
                host=self.config.qdrant_host,
                port=str(self.config.qdrant_port),
                embedder=OpenAIEmbedder(
                    id=self.config.embedding_model,
                    api_key=self.config.openai_api_key,
                    dimensions=1536,
                ),
            )
            # TODO: restrict retrieval to the current tenant, e.g. a filter like {"user_id": user_id}
            self.knowledge_base = AgentKnowledge(vector_db=vector_db, num_documents=3)
            logger.info("Support Agent knowledge base initialized successfully.")
        except Exception:
            logger.exception("Failed to initialize Support Agent knowledge base.")
            raise

    def initialize_agent(self) -> Agent:

        try:
            self.sql_agent = Agent(
                name=self.config.agent_name,
                model=OpenAIChat(
                    id=self.config.model,
                    api_key=self.config.openai_api_key,
                ),
                user_id=self.config.user_id,
                session_id=self.config.session_id,
                markdown=self.config.markdown,
                show_tool_calls=False,
                description=self.config.description,
                instructions=self.config.instructions,
                tools=[
                    SQLTools(
                        db_engine=self.config.engine,
                        tables=self.config.allowed_tables,
                        dialect="mysql+pymysql",
                        describe_table=True,
                        list_tables=True,
                    )
                ],
                retries=self.config.retries,
                debug_mode=self.config.debug_mode,
                num_history_responses=self.config.num_history_responses,
                add_history_to_messages=True,
                storage=MySQLAgentStorage(
                    table_name=self.config.conversation_table_name,
                    db_url=self.config.db_url,
                ),
                extra_data=self.config.extra_data,
                read_chat_history=self.config.read_chat_history,
                read_tool_call_history=self.config.read_tool_call_history,
                stream=self.config.stream,
                add_datetime_to_instructions=self.config.add_datetime_to_instructions,
                telemetry=self.config.telemetry,
            )
            logger.info("SQL Agent initialized successfully.")
            return self.sql_agent
        except Exception:
            logger.exception("Failed to initialize SQL Agent.")
            raise

Thanks

Hi @ank
Thanks for reaching out and for using Agno! I’ve looped in the right engineers to help with your question. We usually respond within 24 hours, but if this is urgent, just let us know, and we’ll do our best to prioritize it.
Appreciate your patience—we’ll get back to you soon! :smile:

Hi @Monali,

Thanks for the quick response. It’s kind of urgent—if you could help with this, it would be great!

I managed to solve the first issue by modifying the CombinedKnowledgeBase as shown below:

from typing import Any, Dict, Iterator, List
from agno.document import Document
from agno.knowledge.agent import AgentKnowledge
from agno.utils.log import logger

class CombinedKnowledgeBase(AgentKnowledge):
    sources: List[AgentKnowledge] = []
    extra_metadata: Dict[str, Any] = {}  # Global metadata to merge into every document

    @property
    def document_lists(self) -> Iterator[List[Document]]:
        """Iterate over all knowledge base sources and yield lists of documents.
        For each Document, merge in the global extra_metadata (document's own values take precedence).
        """
        for kb in self.sources:
            logger.debug(f"Loading documents from {kb.__class__.__name__}")
            for docs in kb.document_lists:
                for doc in docs:
                    # Always add the file name if available.
                    file_meta = {"file_name": doc.name} if doc.name else {}
                    # Merge the global extra metadata with the document's own metadata.
                    doc.meta_data = {
                        **self.extra_metadata,
                        **doc.meta_data,
                        **file_meta,
                    }
                yield docs
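
The merge order in the property above can be checked in isolation. The following is a minimal sketch that uses plain dicts as stand-ins for Document.meta_data; the helper name merge_metadata and the sample values are hypothetical, and the None guard is an addition in case a document has no metadata set.

```python
from typing import Any, Dict, Optional


def merge_metadata(
    extra_metadata: Dict[str, Any],
    doc_metadata: Optional[Dict[str, Any]],
    doc_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Merge in the same order as the property above: later entries win."""
    # The file name is only added when the document has one.
    file_meta = {"file_name": doc_name} if doc_name else {}
    # Global defaults first, then the document's own values, then the file name.
    return {**extra_metadata, **(doc_metadata or {}), **file_meta}


merged = merge_metadata(
    {"user_id": "tenant-a", "source": "global"},  # global extra metadata
    {"source": "upload"},                         # document's own metadata
    doc_name="report.pdf",
)
print(merged)
```

Because the document's own values take precedence, a document that already carries a user_id would keep it. For tenant isolation it may be safer to place the tenant key last in the merge so it cannot be overridden.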

Could you please help me with the second question on how to apply a filter for multi-tenancy (e.g. filtering by user_id) so that each tenant only accesses its own documents?

Also, if there’s a better approach to handling metadata in the first solution, I’d love to hear your suggestions.

Thanks so much for your assistance!

Hi @ank
Apologies for the delay. We’ll make sure to get back to you by the end of the day.

Hi @ank, thanks for raising this. We are actively working on these features, but as of now they're not supported by our SDK.

Hi @monalisha

That’s fine. Do you have any workaround or alternative solution I could try in the meantime?

Thanks for your assistance!

Hi @ank, in the meantime you’re welcome to write your own knowledge base and vector DB by extending ours. We are an open-source platform and would love to see such use cases built on top of Agno features.
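
As a starting point for such an extension, here is a minimal sketch of the tenant condition itself, not an Agno API. It builds a filter in Qdrant's JSON filter form (a must clause with a key/match condition) and adds a client-side fallback that drops hits belonging to other tenants. The helper names tenant_filter and only_tenant are hypothetical, and the payload key meta_data.user_id is an assumption: it presumes the metadata is stored as a nested object under meta_data in each point's payload, so check the actual payload layout of your collection first.

```python
from typing import Any, Dict, Iterable, List


def tenant_filter(user_id: str, key: str = "meta_data.user_id") -> Dict[str, Any]:
    """Build a Qdrant filter (JSON form) that matches one tenant's points.

    `key` is an assumption about the payload layout; nested payload fields
    are addressed with dot notation.
    """
    return {"must": [{"key": key, "match": {"value": user_id}}]}


def only_tenant(results: Iterable[Dict[str, Any]], user_id: str) -> List[Dict[str, Any]]:
    """Client-side fallback: keep only hits whose metadata names this tenant."""
    return [
        hit for hit in results
        if (hit.get("meta_data") or {}).get("user_id") == user_id
    ]


# Hypothetical search hits, shaped like payloads with nested metadata.
hits = [
    {"content": "invoice schema", "meta_data": {"user_id": "tenant-a"}},
    {"content": "payroll schema", "meta_data": {"user_id": "tenant-b"}},
    {"content": "untagged note", "meta_data": {}},
]
print(tenant_filter("tenant-a"))
print([h["content"] for h in only_tenant(hits, "tenant-a")])
```

Filtering inside Qdrant is preferable, since a client-side filter can return fewer results than requested and still pulls other tenants' data into the process. With the qdrant-client package, the same condition can be expressed with models.Filter, models.FieldCondition and models.MatchValue and passed as the query filter of a search call from an overridden search method.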