Feedback and Suggestions for JinaReaderTools (Async Operation & Configuration)

Hello.

I’m currently using the JinaReaderTools toolkit and have run into a few issues I’d like to raise, along with some suggestions for improvement.

When the agent runs in an asynchronous environment, the toolkit frequently logs the error: “ERROR Error reading URL: The read operation timed out”. This seems to stem from httpx.get() being a synchronous call that blocks the event loop. Locally, switching the implementation to httpx.AsyncClient and awaiting the requests resolved the issue and kept the call non-blocking in async contexts. It would be great if this could be addressed in a future update.

The default and non-configurable headers:

{
    "Accept": "application/json",
    "X-With-Links-Summary": "true",
    "X-With-Images-Summary": "true",
}

can be a bit problematic. For instance, when using read_url to fetch content (e.g., from a Wikipedia page), the response is often heavily padded with link and image summaries ahead of the main content, so the useful text gets truncated by the max_content_length limit. While increasing the truncation limit is an option, it would be more useful to let users configure these headers.

The Jina AI API offers several other useful parameters that are currently not configurable through the JinaReaderTools toolkit.
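Something along these lines would cover both points. This is only a sketch of one possible design: build_reader_headers and its parameters (with_links_summary, with_images_summary, extra_headers) are hypothetical names of mine, not existing toolkit options. The summaries are off by default here, and any other Reader header can be passed through by the user.

```python
from typing import Dict, Optional


def build_reader_headers(
    with_links_summary: bool = False,
    with_images_summary: bool = False,
    extra_headers: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Build the request headers for a Reader call from user-facing options."""
    headers = {"Accept": "application/json"}
    # Only request the summaries when the user explicitly opts in.
    if with_links_summary:
        headers["X-With-Links-Summary"] = "true"
    if with_images_summary:
        headers["X-With-Images-Summary"] = "true"
    # Pass-through for any other header the API supports; user values win.
    if extra_headers:
        headers.update(extra_headers)
    return headers
```

With the defaults, only the Accept header is sent, so the main text is less likely to be pushed past max_content_length. Calling `build_reader_headers(with_links_summary=True, with_images_summary=True)` reproduces the current hard-coded behavior.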

I’ve made some local adjustments to the toolkit for my own use, but I believe that dedicating some time to these improvements would greatly benefit all users of this valuable tool.

Additionally, it would be really cool if you considered adding support for Jina DeepSearch directly into the toolkit as well. (UPD. Okay, forget about DeepSearch, I missed the moment when they removed the 1 free query per minute and made it paid)

Hey @Alex88
Thank you for sharing your input with us.
I have forwarded this to our team, and we will look into it.
Thank you again for sharing this with us.