Why is my agent's output never long enough, even with 'max_tokens=5000'?

Why is my agent's output never long enough, even with `max_tokens=5000`? The setting doesn't seem to take effect.

Hey @Ray, thank you for your question.

Your agent’s output isn’t long enough because max_tokens=5000 is only an upper bound, not a target. The actual length depends on the model’s context window, the size of your input, and how forcefully you prompt the model to be verbose.
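One quick way to see this in practice is to check *why* the model stopped. Here's a minimal sketch, assuming an OpenAI-style chat completions response where `finish_reason` is `"length"` when the cap was hit and `"stop"` when the model decided it was done (adapt the field names to whatever SDK your agent uses):

```python
def hit_token_limit(response: dict) -> bool:
    """Return True if generation stopped because max_tokens was reached,
    rather than the model finishing naturally (OpenAI-style finish_reason)."""
    return response["choices"][0]["finish_reason"] == "length"

# A response that stopped well before max_tokens, by the model's own choice:
short_reply = {
    "choices": [{
        "finish_reason": "stop",
        "message": {"content": "Short answer."},
    }]
}
print(hit_token_limit(short_reply))  # False: the limit was never the problem
```

If this returns False on your short outputs, the model simply chose to stop early, and raising max_tokens will never help; the prompt is what needs to change.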

How to Fix / Get Longer Outputs

  • Be explicit in the prompt

    “Write at least 3000 tokens of detailed explanation covering X, Y, Z.”

  • Chunk outputs
    If you need something like a long article or report, instruct the agent to generate in sections (part 1, part 2).

  • Check context usage
    Log how many tokens your inputs are using. If you’re already close to the model’s window, responses will be cut short no matter what max_tokens says.

  • Remove unwanted stop sequences
    Look at your config to ensure stop=[] unless you intentionally want cutoffs.

  • Streaming + stitch
    Some users stream the output and concatenate the partial chunks to avoid early truncation.
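On the context-usage point: the real output budget is whatever is left of the context window after your input, no matter what max_tokens says. A rough sketch (the ~4-characters-per-token estimate is a crude heuristic for English; use your provider's tokenizer for real numbers, and the window size here is just an example):

```python
def rough_token_estimate(text: str) -> int:
    # Very rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def remaining_output_budget(prompt: str, context_window: int, max_tokens: int) -> int:
    """The output tokens the model can actually produce: the smaller of
    max_tokens and what's left of the context window after the prompt."""
    return min(max_tokens, context_window - rough_token_estimate(prompt))

# A ~4000-character prompt in a 4096-token window leaves far less than 5000:
budget = remaining_output_budget("x" * 4000, context_window=4096, max_tokens=5000)
print(budget)  # 3096 -- the window, not max_tokens, is the binding limit
```

Logging this number before each call makes it obvious when the window, not max_tokens, is what's cutting you off.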
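The chunking advice can be sketched like this; `generate` here is a hypothetical stand-in for whatever model call your agent actually makes:

```python
def generate_in_sections(outline, generate):
    """Ask for one section at a time and stitch the parts together.
    `generate` is any callable that takes a prompt and returns text."""
    parts = []
    for i, section in enumerate(outline, start=1):
        prompt = (f"Write part {i} of a long report. "
                  f"Cover only this section in detail: {section}")
        parts.append(generate(prompt))
    return "\n\n".join(parts)

# Usage with a stub in place of a real model call:
fake_model = lambda prompt: f"[text for: {prompt}]"
report = generate_in_sections(["Intro", "Methods", "Results"], fake_model)
```

Each call stays comfortably inside the context window, so the stitched result can be far longer than any single response could be.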

I hope this helps!

Thanks for your advice! I solved the problem by chunking outputs, adjusting the structure of the agent's outputs, and choosing a different model. Now I can get outputs beyond 10,000 tokens.

Hey @Ray, that’s awesome to hear! :tada:
Glad you got it working. Chunking and structuring outputs is exactly the right approach for long-form generations, especially when models start hitting context limits. Also, choosing the right model with higher context capacity makes a big difference.

Thanks for sharing your solution — it’ll definitely help others facing the same issue!