Hello Agno community! I’m encountering an issue with my document review agent.
I’m using an agent that processes document chunks of fixed size and returns results using Pydantic models. I have add_history_to_messages=True and num_history_responses=2 configured, but I’m getting a Claude token limit error:
Error code: 400 - prompt is too long: 210198 tokens > 200000 maximum
What’s strange is that everything works fine if I remove these history settings. Given that I’m processing fixed-size chunks and only keeping the last 2 responses in history, I’d expect the input size to stay roughly constant: each request should contain just the current chunk plus the previous 2 responses.
Is it possible that the history is accumulating recursively? Like, if previous messages also contain their own history, could it create a Fibonacci-like growth pattern where each message contains its previous two, which each contain their previous two, and so on?
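To make the two scenarios concrete, here is a rough back-of-the-envelope sketch in plain Python. It does not call Agno at all and says nothing about what the library actually does internally; `flat_size` and `recursive_size` are hypothetical helpers, the token counts are made-up numbers, and I’m assuming each history entry is a prior chunk plus its response.

```python
def flat_size(chunk, response, n_turns, history=2):
    """Prompt size per turn if history holds only the bare prior exchanges.

    Assumption: each history entry is one earlier chunk plus its response.
    """
    sizes = []
    for turn in range(n_turns):
        kept = min(turn, history)  # at most `history` earlier exchanges
        sizes.append(chunk + kept * (chunk + response))
    return sizes


def recursive_size(chunk, n_turns, history=2):
    """Prompt size per turn if each stored message already embeds its own history.

    Assumption: the stored message for a turn is the full prompt sent on that
    turn, so every new prompt re-includes the previous `history` full prompts.
    """
    sizes = []
    for _ in range(n_turns):
        # Current chunk plus the last `history` full prompts (fewer early on).
        sizes.append(chunk + sum(sizes[-history:]))
    return sizes


if __name__ == "__main__":
    CHUNK = 4_000     # hypothetical tokens per chunk
    RESPONSE = 500    # hypothetical tokens per response
    LIMIT = 200_000   # limit from the error message

    for turn, (flat, rec) in enumerate(
        zip(flat_size(CHUNK, RESPONSE, 10), recursive_size(CHUNK, 10)), start=1
    ):
        flag = "  <-- over limit" if rec > LIMIT else ""
        print(f"turn {turn:2d}: flat={flat:>7,}  recursive={rec:>9,}{flag}")
```

Under these assumptions the flat model levels off after the third turn, while the recursive model follows s(n) = chunk + s(n-1) + s(n-2) and, with a 4,000-token chunk, passes 200,000 tokens on the eighth turn (54 × 4,000 = 216,000). That would be consistent with the error I’m seeing, but it is only a model of my guess, not a confirmation of it.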