RAG agent is not pulling information from the local knowledge base

Hello,

Can you help here?

Here is the code

# Create a knowledge base from a PDF
pdf_knowledge_base = PDFKnowledgeBase(
    path="C:/prg",
    # Use LanceDB as the vector database
    vector_db=LanceDb(
        table_name="pdf_documents",
        uri="data/lancedb",
        search_type=SearchType.vector,
        embedder=OpenAIEmbedder(model="text-embedding-3-small"),
    ),
    reader=PDFReader(chunk=True),
)

# Comment out after first run as the knowledge base is loaded
pdf_knowledge_base.load(recreate=False)

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    knowledge=pdf_knowledge_base,
    tools=[FileTools()],
    show_tool_calls=True,
    add_context=True,
    search_knowledge=True,
    markdown=True,
)
agent.print_response("List Net Amount Due from the PDF docs", stream=True)

It gives the following output

Also, querying the vector table directly returns a lot of data, which means the PDF content has been extracted and loaded as embeddings.


Hi @gsriniva
Thank you for reaching out and using Phidata! I’ve tagged the relevant engineers to assist you with your query. We aim to respond within 48 hours.
If this is urgent, please feel free to let us know, and we’ll do our best to prioritize it.
Thanks for your patience!

Hi @gsriniva

I was able to replicate your issue. Could you please try setting the use_tools=True param when setting up your Agent?

This should let the agent access the knowledge base.
You also don’t need the tools=[FileTools()]

Let me know if this helps!


It works now, thank you. However, it brings back only 5 records from the local knowledge base even though it holds thousands of records. Does the agent have a configuration parameter to increase this limit?

Hi @gsriniva

I noticed that you did not provide a system prompt using the instructions parameter when creating the Agent. Can you please add this and clearly tell the Agent that it should use its knowledge base when determining answers? Let me know how this goes!

I have issues implementing a RAG knowledge base. Can you please help me with this if you have done it before?

Hello,

I have added the instructions parameter with the following instructions:

instructions=[
    "Use pdf_knowledge_base",
    "use the local lancedb",
    "include all the records from the lancedb",
    "Always include sources in your response",
],

It still picks only 5 records (5 relevant documents), as the debug output shows:

DEBUG Getting 5 relevant documents for query: Total Amount Due by Year with Week Ending calculation

Any thoughts on how to make the search include all the documents and records?

@gsriniva can you try passing num_documents=10 in PDFKnowledgeBase to override it?

It works now. Thank you
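
For anyone landing here with the same symptom: the debug line above ("Getting 5 relevant documents") suggests the limit is applied at the retrieval step, which would explain why changing the instructions had no effect while num_documents did. The sketch below is only a toy illustration of that top-k behaviour. ToyKnowledgeBase, cosine, and the fake 2-dimensional embeddings are hypothetical names made up for this example and are not Phidata's implementation; the default of 5 is chosen to mirror the debug log.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyKnowledgeBase:
    """Hypothetical stand-in for a vector-backed knowledge base (not Phidata's class)."""

    def __init__(self, chunks, num_documents=5):
        self.chunks = chunks  # list of (text, embedding) pairs
        self.num_documents = num_documents  # how many chunks a search may return

    def search(self, query_embedding):
        # Rank every stored chunk by similarity, then keep only the top num_documents.
        ranked = sorted(
            self.chunks,
            key=lambda chunk: cosine(query_embedding, chunk[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[: self.num_documents]]


# 1000 fake chunks with made-up 2-dimensional "embeddings"
chunks = [(f"invoice {i}", (1.0, i / 1000)) for i in range(1000)]
query = (1.0, 0.0)

default_hits = ToyKnowledgeBase(chunks).search(query)
wider_hits = ToyKnowledgeBase(chunks, num_documents=10).search(query)

print(len(default_hits), len(wider_hits))  # prints: 5 10
```

Note that raising the limit widens what the model gets to see, but the search still returns a ranked subset of the table rather than every record.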