I hope this message finds you well. I am writing to seek clarification regarding token calculation, specifically how to retrieve the number of tokens used by the LLM when streaming with team.arun.
Prior to version 1.6, the streaming responses included comprehensive information about all the messages sent and received by the team. This data was extremely useful because it allowed us to calculate the tokens used during the interaction with the LLM. However, after updating to version 1.6, this functionality seems to have disappeared.
This change has brought about significant inconvenience in our work. We rely on accurate token usage data for various purposes, such as cost estimation, resource allocation, and performance optimization. Without this information, it becomes challenging to effectively manage our usage of the LLM.
I would greatly appreciate guidance on how to calculate tokens in version 1.6 when streaming with team.arun. Any relevant code snippets, API calls, or detailed instructions would be extremely helpful.
Thank you very much for your time and assistance. I look forward to your prompt response.
Hi @Dreamer, thanks for reaching out and supporting Agno. I've shared this with the team; we're working through all requests one by one and will get back to you soon.
If it’s urgent, please let us know. We appreciate your patience!
Hi @Dreamer
When running a team, you can access the TeamRunResponse after the run completes (via team.run_response). It includes metrics for that particular run, as well as member_responses, a list of the individual member run responses, each of which carries its own metrics. From there you can also access the individual messages and the metrics on each message.
I will update our docs to make this clearer.
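To illustrate the idea of combining those metrics, here is a minimal sketch of summing token counts across a run and its member responses. It assumes each metrics object behaves like a dict with integer input_tokens / output_tokens / total_tokens entries; the helper name sum_token_metrics is hypothetical, not part of the Agno API.

```python
# Hypothetical helper (not part of the Agno API): sum token counts
# across the team-level run metrics and each member's run metrics.
# Assumes each metrics object is dict-like with integer token counts.
def sum_token_metrics(all_metrics):
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for metrics in all_metrics:
        for key in totals:
            totals[key] += metrics.get(key, 0)
    return totals


# Stand-in values shaped like run metrics, for demonstration only:
team_metrics = {"input_tokens": 120, "output_tokens": 80, "total_tokens": 200}
member_metrics = [
    {"input_tokens": 50, "output_tokens": 30, "total_tokens": 80},
    {"input_tokens": 40, "output_tokens": 20, "total_tokens": 60},
]
print(sum_token_metrics([team_metrics, *member_metrics]))
# {'input_tokens': 210, 'output_tokens': 130, 'total_tokens': 340}
```

Note that depending on how Agno aggregates usage, the team-level metrics may already include member usage, so check for double counting before summing both levels.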
Hey @Dreamer, here is an example of getting token metrics when streaming with team.arun:
import asyncio

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.team.team import Team


async def stream_with_metrics():
    # Set up the team
    team = Team(
        model=OpenAIChat("gpt-4o"),
        members=[
            Agent(
                name="Research Agent",
                model=OpenAIChat("gpt-4o"),
                role="Research information",
            ),
            Agent(
                name="Research Agent 2",
                model=OpenAIChat("gpt-4o"),
                role="Research information 2",
            ),
        ],
    )

    try:
        # Run with streaming
        stream = await team.arun("What is AI?", stream=True)

        # Process stream chunks
        response_content = ""
        async for chunk in stream:
            # Accumulate content from chunks that carry it
            if hasattr(chunk, "content") and chunk.content:
                response_content += str(chunk.content)

        # Get final metrics after the stream completes
        if team.run_response and team.run_response.metrics:
            metrics = team.run_response.metrics
            print("=== Token Usage Metrics ===")
            print(f"Input tokens: {metrics.get('input_tokens', 0)}")
            print(f"Output tokens: {metrics.get('output_tokens', 0)}")
            print(f"Total tokens: {metrics.get('total_tokens', 0)}")

            # Additional metrics that might be available
            if "cached_tokens" in metrics:
                print(f"Cached tokens: {metrics['cached_tokens']}")
            if "cache_write_tokens" in metrics:
                print(f"Cache write tokens: {metrics['cache_write_tokens']}")
            return metrics
        else:
            print("No metrics available")
            return None
    except Exception as e:
        print(f"Error during streaming: {e}")
        return None


# Run the example
metrics = asyncio.run(stream_with_metrics())