Website Knowledge Base not working for a specific URL on Mac

I recently moved my codebase to a Mac. The code snippet below works on Windows but not on the Mac. The PubMed link fails to load, while other URLs, such as one from Agno, load fine. My Windows machine handles the same PubMed link with the exact same code, i.e. it loads the knowledge base correctly. On the Mac nothing is loaded into Qdrant, and the log shows 0 documents loaded to the knowledge base.

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.knowledge.website import WebsiteKnowledgeBase
from phi.vectordb.qdrant import Qdrant
from dotenv import load_dotenv

load_dotenv()

vector_db = Qdrant(
    collection="tub",
    url="hidden",
    api_key="hidden"
)

knowledge_base = WebsiteKnowledgeBase(
    urls=["https://pubmed.ncbi.nlm.nih.gov/27188790/"],
    vector_db=vector_db,
)

knowledge_base.load()

agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    knowledge_base=knowledge_base,
    use_tools=True,
    show_tool_calls=True,
    debug_mode=True,
)

agent.print_response("who are the authors", stream=True)

Hi @gurjott
Thanks for reaching out and for using Agno! I’ve looped in the right engineers to help with your question. We usually respond within 48 hours, but if this is urgent, just let us know, and we’ll do our best to prioritize it.
Appreciate your patience—we’ll get back to you soon! :smile:

Hi
Do you see any information in the debug_logs?
The only reason it would not load anything is if the document is already in the DB.
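One way to rule that out is to count the points already stored in the collection before calling load(). Below is a minimal sketch that assumes Qdrant's REST count endpoint (POST /collections/{name}/points/count) is reachable at your cluster URL. `count_points` is a hypothetical helper, not part of Phidata or Agno, and the URL and collection name in the usage comment are placeholders.

```python
import json
import urllib.request


def count_points(base_url, collection, api_key=None, timeout=10):
    """Return the number of points stored in a Qdrant collection (hypothetical helper)."""
    endpoint = f"{base_url.rstrip('/')}/collections/{collection}/points/count"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Qdrant expects the key in the "api-key" header.
        headers["api-key"] = api_key
    request = urllib.request.Request(
        endpoint,
        data=json.dumps({"exact": True}).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["result"]["count"]


# Example usage (placeholder URL and collection name):
# print(count_points("http://localhost:6333", "tub"))
```

If this reports 0 for a fresh collection and load() still logs 0 documents, pre-existing documents are probably not the cause.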

The debug logs show:
(aienv) gurjotsingh@Gurjots-MacBook try % python websitedb_test.py
INFO Loading knowledge base
INFO Loaded 0 documents to knowledge base
DEBUG Debug logs enabled
DEBUG *********** Agent Run Start: bd581d5b-17b7-4260-bdd2-bddb94930d86 ***********
I also tried changing the collection name, with the same result. But when I try another URL, such as “about-us – BLUORNG”, it loads into the knowledge base with the exact same code:
(aienv) gurjotsingh@Gurjots-MacBook try % python websitedb_test.py
INFO Loading knowledge base
INFO Loaded 10 documents to knowledge base
DEBUG Debug logs enabled
DEBUG *********** Agent Run Start: 3d7c6ab4-3348-4fc2-b063-d82c11676676 ***********

I tested it locally. First, I recommend upgrading from Phidata to Agno, since we no longer actively maintain Phidata.
The code would look something like this:

from agno.agent import Agent
from agno.models.openai.chat import OpenAIChat
from agno.knowledge.website import WebsiteKnowledgeBase
from agno.vectordb.qdrant import Qdrant


vector_db = Qdrant(
    collection="dirk-test",
    url="http://localhost:6333"
)

knowledge_base = WebsiteKnowledgeBase(
    urls=["https://pubmed.ncbi.nlm.nih.gov/27188790/"],
    vector_db=vector_db,
)

knowledge_base.load()

agent = Agent(
    model=OpenAIChat(id="gpt-4o-mini"),
    knowledge=knowledge_base,
    show_tool_calls=True,
    debug_mode=True,
)

agent.print_response("who are the authors", stream=True)

When I ran that, I got more informative errors than before. It seems PubMed returns a 403 Forbidden for that URL. We don’t have an immediate solution for this.
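To confirm the 403 outside of Agno, you can request the page directly and look at the status code. This is a minimal sketch using only the standard library. `fetch_status` is a hypothetical helper, and the optional User-Agent argument is only there so you can test whether a different header changes the response. It is not a known fix.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent=None, timeout=10):
    """Return the HTTP status code the server sends for a GET request to url."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx responses; the code is what we want.
        return error.code


# Example usage (run on both machines and compare the results):
# print(fetch_status("https://pubmed.ncbi.nlm.nih.gov/27188790/"))
```

Running this on both the Mac and the Windows machine should show whether the difference comes from the network or environment rather than from the knowledge base code.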

Although it is not ideal, I recommend using our built-in PubMed tool instead.
It would look something like this:


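from agno.agent import Agent
from agno.models.openai.chat import OpenAIChat
from agno.tools.pubmed import PubmedTools

agent = Agent(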
    model=OpenAIChat(id="gpt-4o-mini"),
    tools=[PubmedTools()],
    show_tool_calls=True,
    debug_mode=True,
)

agent.print_response("who are the authors of the article 'https://pubmed.ncbi.nlm.nih.gov/27188790/'", stream=True)