Question regarding Web Scraping

Tpardhi · September 2, 2025, 8:44pm

Hello,

In the toolkit section, the provided examples for the web scraping tools send a command asking the agent to summarize a URL. If the content at the URL is large, does the agent automatically handle chunking of the data? If not, how does the agent create a summary in that case?

Thank you.

Monali · September 3, 2025, 5:05am

Hey @Tpardhi, thanks for reaching out and supporting Agno. I’ve shared this with the team. We’re working through all requests one by one and will get back to you soon. If it’s urgent, please let us know. We appreciate your patience!

Tpardhi · September 3, 2025, 2:24pm

Yes, it would be great if you could answer the query quickly, as this will help me decide on the approach for my project.

Nancy · September 4, 2025, 3:01am

Hi @Tpardhi,

Agno handles large web content in two ways:

Direct web scraping tools (Newspaper4k, Firecrawl): no automatic chunking. These rely on the model’s context window.

Knowledge system (UrlKnowledge): yes, automatic chunking is enabled by default. It splits content into manageable chunks (5000 characters).

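from agno.agent import Agent
from agno.knowledge.url import UrlKnowledge
from agno.document.chunking.document import DocumentChunking
# Assumed import path for URLReader; check it against your installed Agno version.
from agno.document.reader.url_reader import URLReader

agent = Agent(
    knowledge=UrlKnowledge(
        urls=["https://example.com/long-article"],
        # The reader splits the fetched page using the given chunking strategy.
        reader=URLReader(chunking_strategy=DocumentChunking()),
    )
)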

Examples: agno/cookbook/teams/team_with_knowledge.py at main · agno-agi/agno · GitHub
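
For intuition about what splitting a long page into 5000-character chunks looks like, here is a minimal sketch in plain Python. It is not Agno's implementation: `chunk_text` and `page_text` are hypothetical names used only for illustration, and the only detail taken from the reply above is the 5000-character chunk size.

```python
def chunk_text(text: str, chunk_size: int = 5000) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Illustrative only: a real chunker would usually try to break on
    paragraph or sentence boundaries instead of cutting mid-word.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# 12,000 characters standing in for the text of a scraped page (hypothetical data).
page_text = "word " * 2400
chunks = chunk_text(page_text)
print(len(chunks), [len(c) for c in chunks])
```

With the direct scraping tools, the equivalent of `page_text` would be passed to the model in one piece, so a page larger than the context window would need a step like this (or the knowledge system) before it can be summarized.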
