I am adding a local PDF to my ChromaDB vector store. The PDF is added to the vector store successfully, but searching it fails.
from agno.agent import Agent
from agno.vectordb.chroma import ChromaDb
from agno.models.openai import OpenAIChat
from agno.embedder.openai import OpenAIEmbedder
from agno.knowledge.pdf import PDFKnowledgeBase
from agno.document.reader.pdf_reader import PDFReader
from dotenv import load_dotenv
import os
load_dotenv()
if __name__ == "__main__":
    # Persistent Chroma collection that uses OpenAI embeddings
    vector_db = ChromaDb(
        collection="recipes",
        path="tmp/chromadb",
        persistent_client=True,
        embedder=OpenAIEmbedder(api_key=os.environ["OPENAI_API_KEY"]),
    )
    knowledge_base = PDFKnowledgeBase(
        path="path/to/my/pdf",
        vector_db=vector_db,
        reader=PDFReader(chunk=True),
    )
    # Load the PDF into the vector store, recreating the collection
    knowledge_base.load(recreate=True)
    agent = Agent(
        model=OpenAIChat(id="gpt-4o", api_key=os.environ["OPENAI_API_KEY"]),
        knowledge=knowledge_base,
        show_tool_calls=True,
    )
    # Ask a question that should be answered from the knowledge base
    agent.print_response(
        "Instruction : Based on the context answer the question below. Answer in detail with facts from context.\nQuestion : What is the document about?",
        markdown=True,
    )
Thanks for reaching out and for using Agno! I’ve looped in the right engineers to help with your question. We usually respond within 24 hours, but if this is urgent, just let us know, and we’ll do our best to prioritize it.
Appreciate your patience—we’ll get back to you soon!
search_results: List[Document] = []
ids = result.get("ids", [[]])[0]
metadata = result.get("metadatas", [[]])[0]  # type: ignore  # The issue is here: since I have no metadata, this returns [None]
documents = result.get("documents", [[]])[0] # type: ignore
embeddings = result.get("embeddings")
distances = result.get("distances", [[]])[0] # type: ignore
uris = result.get("uris")
data = result.get("data")
metadata["distances"] = distances  # type: ignore  # then we try to add distances to [None], which is not valid
metadata["uris"] = uris # type: ignore
metadata["data"] = data # type: ignore
I fixed this by changing metadata = result.get("metadatas", [[]])[0] to metadata = {}
This got it working in my case, but if there is a more proper solution, please let me know.
Hey @shivamSingh! You are on the right track. ChromaDb needs to be fixed, and I will get to it soon. Hopefully the fix will land in one of the upcoming releases.
Thanks for replying. I would like to add that metadata = result.get("metadatas", [[]])[0] returns a list of dicts instead of a dict, so metadata["distances"] = distances fails with: Error searching for documents: list indices must be integers or slices, not str. So the [None] explanation I gave earlier is not the real cause.
Thanks again for the response. I will use my local patch until new updates are pushed.
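For anyone else hitting this, here is a minimal sketch of one way to handle it without discarding the metadata. It is not the actual Agno fix. It assumes the query result has the nested shape described above (one inner list per query, with metadatas holding one dict or None per hit). The helper name merge_hit_metadata and the sample_result dict are hypothetical, written by hand for illustration, and no chromadb call is made.

```python
from typing import Any, Dict, List


def merge_hit_metadata(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one metadata dict per hit, each carrying that hit's distance."""
    # Results are nested one level per query; take the first query's hits.
    metadatas = (result.get("metadatas") or [[]])[0]
    distances = (result.get("distances") or [[]])[0]

    merged: List[Dict[str, Any]] = []
    for i, meta in enumerate(metadatas):
        # A hit stored without metadata may come back as None, so fall back to {}.
        entry = dict(meta or {})
        if i < len(distances):
            entry["distance"] = distances[i]
        merged.append(entry)
    return merged


# Hand-written stand-in for a query result (assumed shape, not real output).
sample_result = {
    "ids": [["a", "b"]],
    "documents": [["first chunk", "second chunk"]],
    "metadatas": [[{"page": 1}, None]],
    "distances": [[0.12, 0.34]],
}

# Reproduce the reported error: a list cannot be indexed with a string key.
hits = sample_result["metadatas"][0]
try:
    hits["distances"] = sample_result["distances"][0]
except TypeError as exc:
    error_message = str(exc)

merged = merge_hit_metadata(sample_result)
print(error_message)
print(merged)
```

Unlike replacing metadata with {}, this keeps whatever metadata each chunk already has and attaches the distance to the matching hit instead of to the whole list.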