Not getting proper responses from Ollama-based agents

I have tried multiple local models (llama, mistral, command-r, nemotron, granite, etc.) using Ollama, but the models fail to understand the instructions.

from agno.agent import Agent
from agno.models.ollama import Ollama
from agno.storage.redis import RedisStorage

llm = Ollama(id="nemotron")
agent = Agent(
    model=llm,
    storage=RedisStorage(),
    add_history_to_messages=False,
    num_history_responses=3,
    knowledge=knowledge_base,  # knowledge base defined elsewhere
    add_references=True,
    search_knowledge=False,
    show_tool_calls=True,
    read_chat_history=True,
    user_id="enterprise-ai",
    session_id="ea523e66-61b6-413e-b726-2bce22353eda",
    instructions=["""
- Handle both negative (expenditure) and positive values appropriately; use absolute values where needed.
- Categorize string values:
  - Entries starting with numbers are subcategories.
  - Entries without a specific name are total values.
- Output values without signs; describe meaning instead.
- Revenue definitions:
  - Revenue = Total company revenue.
  - Sales = Operating revenue.
  - Sale Products = Revenue from product sales.
  - Domestic = Keyword for domestic sales.
  - Export = Keyword for export sales.
  - Revenue from sale of goods = Sum of SALE OF FINISHED GOODS and SALE OF TRADED GOODS.
"""],
)

But it fails to follow the instructions. If I use the gpt-4o model, it works perfectly.

Hey @shivamSingh,
Thanks for reaching out and supporting Agno. I’ve shared this with the team; we’re working through all requests one by one and will get back to you soon.
If it’s urgent, please let us know. We appreciate your patience!

Hey @shivamSingh, yes, if you’re using a smaller open-source model, it might not be that good. A better model always means better results. If you’re using llama 3b, the results won’t be as good as with 7b, and so on.

I have tried almost all the models available on Ollama to run locally (llama, models from NVIDIA, mistral, etc.). I even tried the deepseek 32b model and many other larger models. Only gpt-4o seems to work properly. If possible, suggest a model that I can run locally that gives results equivalent to gpt-4o.

Hi @shivamSingh, you could try using qwen2:72b (if you have the resources) or a smaller model like gemma2:27b. These are particularly good at instruction following.

I also tried qwen, but the responses are still vague; it’s like the model is not even receiving the instructions. I have an H100 GPU, so I can also load bigger models. If you have any more suggestions, please share.

I have the same problem. I built an Agent with local Ollama models (e.g. llama3.2 or qwen2.5:7b) and added a knowledge base to that agent. When using Ollama models, it fails to understand the instructions and fails to use knowledge from the attached knowledge base. When switching to GPT-4o, for example, everything works perfectly: instructions are understood and knowledge is used.

Maybe there is some general issue with the Ollama model adapter? Even with those “smaller” models, I would have expected at least some correct understanding of the prompt and the instructions.

This is exactly what happens with me. I’m currently stuck using GPT-4o.

I even tried deepseek 32b and it still failed to follow the instructions.

Ok, but if even deepseek 32b is failing to deliver proper answers, then maybe something other than the locally running model itself is the root cause of this problem?

Yeah, exactly. I think something in how instructions are passed to the model is wrong in the Ollama module.

Hi @shivamSingh and @CodingTheSmartWay!

Usually base models aren’t “instruction-tuned”. I would suggest trying out Ollama models that are instruction-tuned, like gemma3, phi4, or granite3.3.

Also, you could try embedding your instructions directly as the system message:

agent = Agent(
    model=Ollama(id="nemotron"),
    create_default_system_message=False,
    system_message="instructions follow this",
    # … other settings …
)

Also, try setting the agent debug flag “debug_mode=True” and observing the logs. Hope you get better results now.
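Putting those suggestions together, a minimal sketch might look like the following. The import paths and the “gemma3” model id are assumptions based on a typical Agno setup; adjust them to match your installed version and pulled models.

```python
# Sketch only: assumes agno is installed and `ollama pull gemma3` has been run.
from agno.agent import Agent
from agno.models.ollama import Ollama

agent = Agent(
    # An instruction-tuned model, as suggested above (assumed available locally).
    model=Ollama(id="gemma3"),
    # Skip the framework's default system message and supply your own,
    # embedding the instructions directly.
    create_default_system_message=False,
    system_message=(
        "Handle both negative (expenditure) and positive values appropriately; "
        "use absolute values where needed. Output values without signs; "
        "describe their meaning instead."
    ),
    # Log the exact prompt sent to Ollama so you can verify the
    # instructions actually reach the model.
    debug_mode=True,
)
```

With debug_mode enabled, the logs should show the full system message being sent; if the instructions appear there but the model still ignores them, the issue is the model, not the adapter.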