I am developing a document processing agent using Ollama. It works as follows:
- The user uploads a file (image or PDF) to a chat / Playground
- The document processing agent is configured with the Ollama llama3.2 LLM and a custom OCR tool
- The custom OCR tool should receive the uploaded image / PDF and use Surya OCR to convert it into text
- The document processing agent processes the text returned by the custom OCR tool and sends it to the LLM
When I test this via the playground, the custom OCR tool receives its input as the placeholder string [img-0]. It receives neither the binary content nor the file type. Can you guide me on how to get the actual uploaded file into the tool?
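
To make the goal concrete, here is a minimal sketch of what I would like the tool to do once it gets a real file reference instead of the placeholder. Everything in it is an assumption for illustration: the names `custom_ocr_tool`, `sniff_file_type` and `ocr_fn` are hypothetical, the file is assumed to be reachable as a local path, and `ocr_fn` is a stand-in for whatever wraps Surya OCR (no Surya or agent framework API is shown here).

```python
import re
from pathlib import Path
from typing import Callable

# Matches placeholder tokens such as "[img-0]" that stand in for an attachment.
PLACEHOLDER = re.compile(r"^\[img-\d+\]$")

# Magic-byte prefixes for the file types the tool should accept.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_file_type(data: bytes) -> str:
    """Guess the file type from the leading bytes, since no type is passed in."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return "unknown"


def custom_ocr_tool(file_ref: str, ocr_fn: Callable[[bytes, str], str]) -> str:
    """Hypothetical tool body: load the upload, detect its type, run OCR.

    `file_ref` is assumed to be a local path to the uploaded file.
    `ocr_fn(data, kind)` is a placeholder for the Surya OCR wrapper.
    """
    if PLACEHOLDER.match(file_ref.strip()):
        # This is the situation described above: only the token arrived.
        raise ValueError(f"got placeholder {file_ref!r} instead of a file")

    data = Path(file_ref).read_bytes()
    kind = sniff_file_type(data)
    if kind == "unknown":
        raise ValueError("unsupported file type: expected PDF, PNG or JPEG")
    return ocr_fn(data, kind)
```

With this shape, the open question is only how the playground can hand the tool a path (or the bytes) for the upload rather than the [img-0] token.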