Why is my agent's output never long enough, even with max_tokens=5000?

Why is my agent's output always too short, even with `max_tokens=5000`? It seems the setting has no effect.

Hey @Ray, thank you for your question.

Your agent’s output isn’t longer because max_tokens=5000 is only an upper bound, not a target: the model can stop on its own well before reaching it. The actual length depends on the model’s context window, your input size, and how forcefully you prompt the model to be verbose.
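To see why the cap often isn’t reached, here is a minimal sketch of the arithmetic involved (the window and prompt sizes are illustrative numbers, not from your setup):

```python
# max_tokens is only a ceiling: the context window caps prompt + completion
# together, so a large prompt shrinks the space left for the answer.

def effective_output_cap(context_window: int, prompt_tokens: int, max_tokens: int) -> int:
    """Largest completion the API can physically return."""
    headroom = context_window - prompt_tokens
    return max(0, min(max_tokens, headroom))

# A 4096-token-window model with a 3500-token prompt can return at most
# 596 tokens, regardless of max_tokens=5000:
print(effective_output_cap(4096, 3500, 5000))  # -> 596
```

And even within that cap, the model may simply decide it is done and emit a stop token early.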

How to Fix / Get Longer Outputs

  • Be explicit in the prompt

    “Write at least 3000 tokens of detailed explanation covering X, Y, Z.”

  • Chunk outputs
    If you need something long, like an article or report, instruct the agent to generate it in sections (part 1, part 2, …).

  • Check context usage
    Log how many tokens your inputs are using. If you’re already close to the model’s window, responses will be cut short no matter what max_tokens says.

  • Remove unwanted stop sequences
    Look at your config to ensure stop=[] unless you intentionally want cutoffs.

  • Streaming + stitch
    Some setups use streaming mode and concatenate the partial outputs to avoid early truncation.
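The “check context usage” step can be sketched like this. Note that `count_tokens` below is a crude whitespace count standing in for your model’s real tokenizer (e.g. tiktoken for OpenAI models), and the 4096 window is an assumed example value:

```python
# Log input token usage before a call so truncation isn't a surprise.
CONTEXT_WINDOW = 4096  # assumed example; use your model's actual window

def count_tokens(text: str) -> int:
    # Stand-in approximation; swap in the model's real tokenizer in practice.
    return len(text.split())

def check_headroom(prompt: str, max_tokens: int) -> int:
    used = count_tokens(prompt)
    headroom = CONTEXT_WINDOW - used
    print(f"prompt={used} tokens, headroom={headroom}, max_tokens={max_tokens}")
    if headroom < max_tokens:
        print("warning: completion will be cut off before reaching max_tokens")
    return headroom

check_headroom("some long prompt " * 300, max_tokens=5000)
```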
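The chunking and stitching ideas above can be combined into one pattern: ask for the document section by section and join the parts. `call_model` here is a hypothetical stand-in for whatever client or agent you actually use; it just echoes the prompt for illustration:

```python
# Generate a long report in sections and stitch the parts together.

def call_model(prompt: str) -> str:
    # Hypothetical placeholder; replace with a real API/agent call.
    return f"[text for: {prompt}]"

def write_in_sections(topic: str, sections: list[str]) -> str:
    parts = []
    for section in sections:
        prompt = f"Write the '{section}' section of a detailed report on {topic}."
        parts.append(call_model(prompt))
    return "\n\n".join(parts)  # stitch the sections into one document

report = write_in_sections("agent memory", ["Intro", "Design", "Conclusion"])
```

Each section gets the full max_tokens budget to itself, so the combined document can be far longer than any single completion.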

I hope this helps.