Passing images to Tools as an argument

Hello dear friends,

I built a tool that uses a computer vision model to identify Maya glyphs in an image. The goal is that a user can input an image to my LLM, the LLM then uses this Tool to identify the glyphs, and finally proposes an English translation of them.

I can’t manage to pass the image as an argument to the tool when it is called. The Agent fails to call the tool with the actual image, since the LLM is not able to do so. Instead of passing the image, it passes a placeholder string: '<image data>'

I then tried to encode the image as a base64 string and have it passed as a string argument, but that fails too, probably because the string is HUGE and the LLM alters the encoded value when passing it to the tool.
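
To give a sense of the scale involved, here is a rough sketch (plain Python, standard library only) that measures how long the base64 string gets. The 1 MB payload is a made-up stand-in for a real image file, not the actual image from this post.

```python
import base64
import os


def base64_length(raw: bytes) -> int:
    """Return the number of characters in the base64 encoding of raw."""
    return len(base64.b64encode(raw))


# Hypothetical stand-in for an image file: 1 MB of random bytes.
fake_image = os.urandom(1_000_000)
encoded_chars = base64_length(fake_image)

# base64 emits 4 characters for every 3 input bytes (rounded up),
# so the string is about a third larger than the file itself.
print(f"{len(fake_image):,} bytes -> {encoded_chars:,} base64 characters")
```

The model would have to reproduce every one of those characters exactly in the tool-call arguments, and a single altered character can corrupt the decoded image.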

Any ideas?

The Agent:

from pathlib import Path

from agno.agent import Agent
from agno.media import Image
from agno.models.openai import OpenAIChat
from saastun.tools.glyph_identifier import GlyphIdentifier

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    markdown=True,
    debug_mode=True,
    show_tool_calls=True,
    tools=[GlyphIdentifier()],
    instructions=[
        "You are an AI agent that can translate maya glyph texts to English.",
        "Use the Glyph Identifier tool to identify glyphs.",
    ],
)
image_path = Path(__file__).parent.parent.parent.parent.joinpath("data/img/cancuen.png")
agent.print_response(
    "Identify the maya glyphs in the image",
    images=[Image(filepath=image_path)],
    stream=True,
)

The response:

DEBUG ****** Agent ID: e7965c6f-753a-4cb9-9ea6-489a26e72fa5 ******              
DEBUG ***** Session ID: 07c07880-469d-460e-95d0-e1d30bfcb2f0 *****              
DEBUG Processing tools for model                                                
DEBUG Added tool glyph_identifier from glyph_identifier                         
DEBUG ** Agent Run Start: 84a3c6bc-7b35-4a53-97fb-b79392e6affc ***              
DEBUG --------------- OpenAI Response Stream Start ---------------              
DEBUG ---------------------- Model: gpt-4o -----------------------              
DEBUG ========================== system ==========================              
DEBUG <instructions>                                                            
      - You are an AI agent that can translate maya glyph texts to English.     
      - Use the Glyph Identifier tool to identify glyphs.                       
      </instructions>                                                           
                                                                                
      <additional_information>                                                  
      - Use markdown to format your answers.                                    
      </additional_information>                                                 
DEBUG =========================== user ===========================              
DEBUG Identify the maya glyphs in the image                                     
DEBUG Images added: 1                                                           
DEBUG ======================== assistant =========================              
DEBUG Tool Calls:                                                               
        - ID: 'call_nz45NFza7FLPu8T6ORCCqzLU'                                   
          Name: 'glyph_identifier'                                              
          Arguments: 'image: <image data>'                                      
DEBUG ************************  METRICS  *************************              
DEBUG * Tokens:                      input=1217, output=18, total=1235          
DEBUG * Prompt tokens details:       {'audio_tokens': 0, 'cached_tokens': 0}    
DEBUG * Completion tokens details:   {'accepted_prediction_tokens': 0,          
      'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}
DEBUG * Time:                        2.9836s                                    
DEBUG * Tokens per second:           6.0331 tokens/s                            
DEBUG * Time to first token:         2.4331s                                    
DEBUG ************************  METRICS  *************************              
DEBUG Running: glyph_identifier(image=<image data>)                             
INFO Running inference on the given image                                       
INFO <image data>                                                               
WARNING  Could not run function glyph_identifier(image=<image data>)            
ERROR    Failed to open image from path '<image data>': [Errno 2] No such file  
         or directory: '<image data>'                                           
         Traceback (most recent call last):                                     
           File "/Users/bengonzalez/Sites/saastun/src/saastun/tools/saastun.py",
         line 116, in infer_image                                               
             image = Image.open(image_input)                                    
                     ^^^^^^^^^^^^^^^^^^^^^^^                                    
           File                                                                 
         "/Users/bengonzalez/Sites/saastun/.venv/lib/python3.12/site-packages/PI
         L/Image.py", line 3513, in open                                        
             fp = builtins.open(filename, "rb")                                 
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                 
         FileNotFoundError: [Errno 2] No such file or directory: '<image data>' 
                                                                                
         The above exception was the direct cause of the following exception:   
                                                                                
         Traceback (most recent call last):                                     
           File                                                                 
         "/Users/bengonzalez/Sites/saastun/.venv/lib/python3.12/site-packages/ag
         no/tools/function.py", line 647, in execute                            
             result = self.function.entrypoint(**arguments)                     
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                     
           File                                                                 
         "/Users/bengonzalez/Sites/saastun/.venv/lib/python3.12/site-packages/py
         dantic/_internal/_validate_call.py", line 39, in wrapper_function      
             return wrapper(*args, **kwargs)                                    
                    ^^^^^^^^^^^^^^^^^^^^^^^^                                    
           File                                                                 
         "/Users/bengonzalez/Sites/saastun/.venv/lib/python3.12/site-packages/py
         dantic/_internal/_validate_call.py", line 136, in __call__             
             res =                                                              
         self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(ar
         gs, kwargs))                                                           
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         ^^^^^^^^^^^^^^^^^^^^^^                                                 
           File                                                                 
         "/Users/bengonzalez/Sites/saastun/src/saastun/tools/glyph_identifier.py
         ", line 22, in glyph_identifier                                        
             return infer_image(image, codes_only=True)                         
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                         
           File "/Users/bengonzalez/Sites/saastun/src/saastun/tools/saastun.py",
         line 118, in infer_image                                               
             raise ValueError(f"Failed to open image from path '{image_input}': 
         {e}") from e                                                           
         ValueError: Failed to open image from path '<image data>': [Errno 2] No
         such file or directory: '<image data>'                                 
DEBUG =========================== tool ===========================              
DEBUG Tool call Id: call_nz45NFza7FLPu8T6ORCCqzLU                               
DEBUG Failed to open image from path '<image data>': [Errno 2] No such file or  
      directory: '<image data>'                                                 
DEBUG **********************  TOOL METRICS  **********************              
DEBUG * Time:                        1.6613s                                    
DEBUG **********************  TOOL METRICS  **********************              
DEBUG ======================== assistant =========================              
DEBUG It seems there was an issue with processing the image. Could you please   
      try uploading it again?                                                   
DEBUG ************************  METRICS  *************************              
DEBUG * Tokens:                      input=1270, output=20, total=1290          
DEBUG * Prompt tokens details:       {'audio_tokens': 0, 'cached_tokens': 0}    
DEBUG * Completion tokens details:   {'accepted_prediction_tokens': 0,          
      'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}
DEBUG * Time:                        4.5961s                                    
DEBUG * Tokens per second:           4.3515 tokens/s                            
DEBUG * Time to first token:         3.9610s                                    
DEBUG ************************  METRICS  *************************              
DEBUG ---------------- OpenAI Response Stream End ----------------              
DEBUG Added RunResponse to Memory                                               
DEBUG *** Agent Run End: 84a3c6bc-7b35-4a53-97fb-b79392e6affc ****              
┏━ Message ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                              ┃
┃ Identify the maya glyphs in the image                                        ┃
┃                                                                              ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Tool Calls ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                              ┃
┃ • glyph_identifier(image=<image data>)                                       ┃
┃                                                                              ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
┏━ Response (9.2s) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                                                              ┃
┃ It seems there was an issue with processing the image. Could you please try  ┃
┃ uploading it again?                                                          ┃
┃                                                                              ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Process finished with exit code 0

Do you want to provide the image during the tool run, or at the start with the user input?

From your agent run I see you are providing the image at the start with the user input. If you are doing this, then you don't need to provide the image to a tool. Also, tools can't process images; images can only be processed at the start of the agent run. I'm also struggling with something similar. You can see my repo https://github.com/GodBoii/AI-OS and how I use a custom tool to analyze images at https://github.com/GodBoii/AI-OS/blob/master/python-backend/browser_tools.py (a custom tool that my agent uses); see how I process images in the agent run / tool run.

Hello Prajwal!

This was just a demo, but I basically want the image that the user uploads to be passed to the tool. The user can input many images and then the tool should be called.
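
A minimal sketch of one way this could work without the LLM ever handling image bytes: keep the uploaded images in a registry on the Python side and let the model pass only a short ID to the tool. Everything here (`UPLOADED_IMAGES`, `register_image`, `identify_glyphs_by_id`, `build_prompt`) is a hypothetical name made up for illustration, not part of Agno or of my actual tool, and the inference step is left as a placeholder.

```python
import uuid
from pathlib import Path

# Hypothetical in-memory registry: short ID -> path of an uploaded image.
UPLOADED_IMAGES: dict[str, Path] = {}


def register_image(path: Path) -> str:
    """Store an uploaded image path and return a short ID the LLM can copy."""
    image_id = uuid.uuid4().hex[:8]
    UPLOADED_IMAGES[image_id] = path
    return image_id


def identify_glyphs_by_id(image_id: str) -> str:
    """Tool entry point: takes the short ID, not the image itself."""
    path = UPLOADED_IMAGES.get(image_id)
    if path is None:
        # Return a readable message so the model can correct itself.
        known = ", ".join(UPLOADED_IMAGES) or "none"
        return f"Unknown image_id '{image_id}'. Known IDs: {known}"
    # Placeholder: the real tool would run the vision model on `path` here.
    return f"Would run glyph identification on {path}"


def build_prompt(question: str, image_ids: list[str]) -> str:
    """Mention the IDs in the user message so the model knows what to pass."""
    return f"{question}\nUploaded image IDs: {', '.join(image_ids)}"


image_id = register_image(Path("data/img/cancuen.png"))
print(build_prompt("Identify the maya glyphs in the image", [image_id]))
print(identify_glyphs_by_id(image_id))
```

An 8-character ID is short enough for the model to copy verbatim, which is exactly what the base64 string is not. With several uploads, each image gets its own ID and the tool can be called once per ID.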

For now, I basically pass all images through the tool manually when they are uploaded through the Streamlit interface…
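
Roughly, that manual workaround looks like the sketch below: run the identifier on every uploaded image up front and hand the results to the LLM as plain text, so no tool call is needed. `describe_images` and `fake_identify` are hypothetical names for illustration, and the glyph codes are invented placeholders rather than real model output.

```python
from pathlib import Path
from typing import Callable


def describe_images(paths: list[Path], identify: Callable[[Path], list[str]]) -> str:
    """Run the identifier on each image and format the results as plain text."""
    lines = []
    for path in paths:
        codes = identify(path)
        summary = ", ".join(codes) if codes else "no glyphs found"
        lines.append(f"{path.name}: {summary}")
    return "\n".join(lines)


def fake_identify(path: Path) -> list[str]:
    # Hypothetical stand-in for the real vision model; the codes are invented.
    return ["GLYPH_A", "GLYPH_B"]


prompt = (
    "Propose an English translation for these glyph codes.\n"
    + describe_images([Path("cancuen.png")], fake_identify)
)
print(prompt)
```

The downside is that the agent no longer decides when to call the tool; the identification always runs, whether or not the user's question needs it.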

I will check your code tonight! Thanks so much for providing some leads.

Hi @beinir, thank you for your question. Tagging our engineer, @manu to help you.

Do let us know if it's urgent; we will get to it ASAP.