Thank you for the response! Here’s my agent configuration:
from agno.agent import Agent
from agno.models.anthropic import Claude
from pydantic import BaseModel, Field
from typing import List, Literal


class Error(BaseModel):
    severity: Literal["Critical", "Major", "Minor"] = Field(description="...")
    error_type: Literal["grammar", "spelling", "formatting", "consistency",
                        "legal", "reference", "structure"] = Field(description="...")


class ChunkReport(BaseModel):
    chunk_index: int = Field(description="...")
    errors: List[Error] = Field(default_factory=list, description="...")
    total_errors: int = Field(default=0, description="...")


chunk_reviewer = Agent(
    name="Chunk Reviewer",
    model=Claude(
        id=CLAUDE_SONNET_MODEL_ID,
        api_key=ANTHROPIC_API_KEY,
        temperature=0.1,
        max_tokens=2048
    ),
    role="Document quality control.",
    description="Review documents for quality control.",
    instructions=[
        "..."
    ],
    response_model=ChunkReport,
    add_history_to_messages=True,
    num_history_responses=2,
)
As you can see, I’m using Claude Sonnet with add_history_to_messages=True and num_history_responses=2. The token limit error occurs only when these history settings are enabled, which makes me suspect that the history is accumulating recursively.
I’m processing documents in fixed-size chunks, calling the reviewer sequentially for each chunk. The first several iterations run fine; the token limit error only shows up around iterations 10-12.
What’s strange is that with num_history_responses=2, each iteration should only include the current chunk plus the last 2 responses, so the input size should level off after the first few iterations. But the error suggests the input keeps growing well beyond that. Could each response in the history be carrying its own history recursively? Is there a way to inspect the actual input messages being sent to Claude? Since Agno is managing the Anthropic client, I can’t see the full prompt to verify whether the history really is accumulating recursively.
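To make the two hypotheses concrete, here is a rough sketch of how the input size would evolve in each case. This is only a model of my reasoning, not Agno's actual behavior: the function names, the per-chunk and per-response token counts, and the idea that a stored run is "chunk plus response" are all assumptions I made up for illustration.

```python
from typing import List


def bounded_input_sizes(n_iters: int, chunk_tokens: int, response_tokens: int,
                        history_runs: int = 2) -> List[int]:
    """Expected case: history holds only the last `history_runs` runs,
    each counted as one chunk message plus one response (assumption)."""
    sizes = []
    for i in range(n_iters):
        runs_in_history = min(i, history_runs)
        sizes.append(chunk_tokens + runs_in_history * (chunk_tokens + response_tokens))
    return sizes


def recursive_input_sizes(n_iters: int, chunk_tokens: int, response_tokens: int,
                          history_runs: int = 2) -> List[int]:
    """Suspected case: each stored run also carries the full input it was
    sent with, so earlier history gets nested inside later history."""
    sizes = []
    stored: List[int] = []  # size of each stored run: its full input plus its response
    for _ in range(n_iters):
        history = sum(stored[-history_runs:])
        input_size = chunk_tokens + history
        sizes.append(input_size)
        stored.append(input_size + response_tokens)
    return sizes


if __name__ == "__main__":
    # Hypothetical sizes: 1000 tokens per chunk, 500 per response.
    print("bounded:  ", bounded_input_sizes(12, 1000, 500))
    print("recursive:", recursive_input_sizes(12, 1000, 500))
```

With these made-up numbers the bounded series levels off at 4000 tokens from the third iteration, while the recursive one keeps growing roughly like a Fibonacci sequence. That kind of growth would be consistent with an error that only appears after several iterations, but I can't confirm it without seeing the real messages.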