My question is: why don't the Agno agents or teams follow the exact prompt they are given? I have tried several prompts for the same system, but they are not followed consistently; sometimes the agent returns a response it should not give.
If I add more explanation, the prompt just gets bigger and the LLM token count goes up, which sometimes overloads my model. And even when the agent does have the right context, it still doesn't answer or act based on that context.
What should I do in this case? Also, memory and storage reads/writes take too much time.
Hi @dp27, thanks for raising this. What you're seeing is common when working with LLMs in agents and teams.
There are a few reasons why agents don't always follow the exact prompt:
LLMs are probabilistic, so even with the same instructions outputs can vary.
If you keep adding more explanations, the prompt grows larger and sometimes the model starts ignoring parts of it.
When context + memory + storage get big, parts of the instructions may not get prioritized.
What to do:
Keep your system prompts short and specific — put strict rules in the system role and use memory for guidance.
Use response models (schemas) so the model must return a structured output.
If you want less randomness, set the model’s temperature to 0.
If performance is a concern, switch to a lighter storage backend (e.g. SQLite or in-memory) for fast sessions, and only use Postgres/Vector when persistence is really needed.
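The schema idea can be sketched without any framework: make the model return JSON that matches a small contract, and reject anything else. The `TicketTriage` fields below are hypothetical, just to illustrate the pattern:

```python
import json
from dataclasses import dataclass

@dataclass
class TicketTriage:
    # Hypothetical response schema: a category label plus a 1-5 priority.
    category: str
    priority: int

def parse_response(raw: str) -> TicketTriage:
    """Validate a model reply against the schema; raise on any drift."""
    data = json.loads(raw)  # fails fast if the model returned free text
    triage = TicketTriage(category=str(data["category"]),
                          priority=int(data["priority"]))
    if not 1 <= triage.priority <= 5:
        raise ValueError(f"priority out of range: {triage.priority}")
    return triage

# A well-formed reply parses; a chatty free-text reply raises instead,
# so you can retry rather than pass a bad answer downstream.
ok = parse_response('{"category": "billing", "priority": 2}')
```

Frameworks like Agno do this for you when you attach a response model, but the principle is the same: a strict output contract catches drift that a longer prompt would not.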
Bottom line: the best results come from combining short, strict prompts with schemas and selective memory. This avoids overload, keeps responses consistent, and cuts down the storage read/write time.
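Putting that together, a minimal setup might look like the sketch below. Treat it as a sketch, not a definitive implementation: the import paths and parameter names (`OpenAIChat`, `response_model`, `SqliteStorage`) are assumed from recent Agno 1.x releases and vary between versions, so check them against your installed version.

```python
from pydantic import BaseModel

# Import paths assumed from Agno ~1.x; verify against your version.
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.storage.sqlite import SqliteStorage

class Answer(BaseModel):
    # Structured output keeps responses consistent.
    summary: str
    confidence: float

agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini", temperature=0),  # less randomness
    # Short, strict rules instead of one long explanatory prompt.
    instructions=[
        "Answer only from the provided context.",
        "If the context does not contain the answer, say so.",
    ],
    response_model=Answer,  # schema-enforced output (name varies by version)
    # Lightweight SQLite storage for fast session reads/writes.
    storage=SqliteStorage(table_name="sessions", db_file="tmp/agent.db"),
)
```

This is configuration only (it needs an API key to actually run), but it shows all four recommendations in one place: a short strict system prompt, a response schema, temperature 0, and a light storage backend.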