Pre Tokenization for Agent and Team

Is there any functionality available to cache the tokens of the agent's role and prompt? Right now all of this is passed along with every query and converted to tokens each time. I would like to pass only the context, the query, and any dynamic details to the agent, while the cached tokens for the system prompt, instructions, and role are reused.

So basically, this is to reduce the TPM (tokens per minute) sent to the LLM.
If there is any solution or functionality available, please let me know.

Hey @dp27, thanks for reaching out and supporting Agno. I've shared this with the team; we're working through all requests one by one and will get back to you soon. If it's urgent, please let us know. We appreciate your patience!

Hello @Monali, it's been a while and I still haven't received a response from your team. Can you give me something I can refer to, so I at least have clarity on whether this functionality has been developed or not?

Hey @dp27, sorry for the delay. This ticket must have slipped through the cracks. We are really sorry. @Ruan will be here to help you ASAP.

In the meantime, could you let us know which model you are using?

Hey @dp27

Apologies for the delay here. The team has been overwhelmed with support requests.

Claude is the only model that has first-class prompt caching support in our API.
You can see it in action here:
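As a rough illustration of the underlying mechanism, here is a minimal sketch of prompt caching at the Anthropic Messages API level (assuming you call Claude directly; Agno's own wrapper may expose this differently, and the model ID and system prompt below are placeholders). The static system prompt, role, and instructions are marked with `cache_control`, so Anthropic caches those tokens and only the dynamic user message is processed fresh on each request:

```python
# Hypothetical static role/instructions that stay identical across calls.
STATIC_SYSTEM_PROMPT = (
    "You are a customer-support agent. Follow the company style guide "
    "and answer concisely."
)

def build_request(query: str, context: str) -> dict:
    """Build a Claude Messages API payload with the static part cache-marked."""
    return {
        "model": "claude-3-5-sonnet-20241022",  # placeholder model ID
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": STATIC_SYSTEM_PROMPT,
                # Marks this block for Anthropic's prompt caching; on cache
                # hits these input tokens are billed at a reduced rate.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Only the dynamic part travels uncached with every call.
        "messages": [
            {"role": "user", "content": f"Context:\n{context}\n\nQuery: {query}"}
        ],
    }

payload = build_request("Where is my order?", "Order #123 shipped yesterday.")
# The payload would then be sent via anthropic.Anthropic().messages.create(**payload).
```

This keeps the per-request token cost down to the dynamic context and query, which is exactly the TPM reduction asked about above.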