Not getting proper responses from Ollama-based agents

I have tried multiple local models (llama, mistral, command-r, nemotron, granite, etc.) using Ollama, but the models fail to understand the instructions.

from agno.agent import Agent
from agno.models.ollama import Ollama
from agno.storage.redis import RedisStorage

llm = Ollama(id="nemotron")
agent = Agent(
    model=llm,
    storage=RedisStorage(),
    add_history_to_messages=False,
    num_history_responses=3,
    knowledge=knowledge_base,  # knowledge base defined elsewhere
    add_references=True,
    search_knowledge=False,
    show_tool_calls=True,
    read_chat_history=True,
    user_id="enterprise-ai",
    session_id="ea523e66-61b6-413e-b726-2bce22353eda",
    instructions=["""
- Handle both negative (expenditure) and positive values appropriately; use absolute values where needed.
- Categorize string values:
  - Entries starting with numbers are subcategories.
  - Entries without a specific name are total values.
- Output values without signs; describe meaning instead.
- Revenue definitions:
  - Revenue = Total company revenue.
  - Sales = Operating revenue.
  - Sale Products = Revenue from product sales.
  - Domestic = Keyword for domestic sales.
  - Export = Keyword for export sales.
  - Revenue from sale of goods = Sum of SALE OF FINISHED GOODS and SALE OF TRADED GOODS.
"""],
)

But it fails to follow the instructions. If I use the gpt-4o model, it works perfectly.

Hey @shivamSingh,
Thanks for reaching out and supporting Agno. I’ve shared this with the team; we’re working through all requests one by one and will get back to you soon.
If it’s urgent, please let us know. We appreciate your patience!

Hey @shivamSingh, yes, if you’re using a smaller open-source model, it might not be that good. A better model always means better results. If you’re using llama 3b, the results won’t be as good as with 7b, and so on.

I have tried almost all the models available on Ollama to run locally (llama, models from NVIDIA, mistral, etc.). I even tried the deepseek 32b model and many other larger models. Only gpt-4o seems to work properly. If possible, suggest a model that I can run locally that gives results equivalent to gpt-4o.

Hi @shivamSingh, you could try using qwen2:72b (if you have the resources) or a smaller model like gemma2:27b. These are particularly good at instruction following.

I also tried qwen, but the responses are still vague; it’s like the model is not even receiving the instructions. I have an H100 GPU, so I can also load bigger models. If you have any more suggestions, please share.

I have the same problem. I built an Agent with local Ollama models (e.g. llama3.2 or qwen2.5:7b) and added a knowledge base to that agent. When using Ollama models, it fails to understand the instructions and fails to use knowledge from the attached knowledge base. When switching to GPT-4o, for example, everything works perfectly: instructions are understood and knowledge is used.

Maybe there is some general issue with the Ollama model adapter? Even with those “smaller” models, I would have expected at least some correct understanding of the prompt and the instructions.

This is exactly what happens with me. I’m currently stuck using GPT-4o.

I even tried deepseek 32b and it still failed to follow the instructions.

Ok, but if even deepseek 32b is failing to deliver proper answers, then maybe something other than the locally running model itself is the root cause of this problem?

Yeah, exactly. I think something in how instructions are passed to the model is wrong in the Ollama module.

Hi @shivamSingh and @CodingTheSmartWay!

Usually base models aren’t “instruction-tuned”. I would suggest trying out Ollama models that are instruction-tuned, like gemma3, phi4, or granite3.3.

Also, you could try embedding your instructions directly as the system message:

agent = Agent(
    model=Ollama(id="nemotron"),
    create_default_system_message=False,
    system_message="instructions follow this",
    # … other settings …
)

Also, try setting the agent debug flag “debug_mode=True” and observing the logs. Hope you get better results now.
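Putting those suggestions together, a minimal sketch might look like the following. The import paths and the “gemma3” model id are assumptions based on a typical Agno setup; adjust them to match your installed version and pulled models.

```python
# Sketch only: assumes agno is installed and `ollama pull gemma3` has been run.
from agno.agent import Agent
from agno.models.ollama import Ollama

agent = Agent(
    # An instruction-tuned model, as suggested above (assumed available locally).
    model=Ollama(id="gemma3"),
    # Skip the framework's default system message and supply your own,
    # embedding the instructions directly.
    create_default_system_message=False,
    system_message=(
        "Handle both negative (expenditure) and positive values appropriately; "
        "use absolute values where needed. Output values without signs; "
        "describe their meaning instead."
    ),
    # Log the exact prompt sent to Ollama so you can verify the
    # instructions actually reach the model.
    debug_mode=True,
)
```

With debug_mode enabled, the logs should show the full system message being sent; if the instructions appear there but the model still ignores them, the issue is the model, not the adapter.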