Hello Support Team,
I’m trying to load a JSON knowledge base into PgVector and keep running into an “Unsupported data type” error. Below is a minimal reproducible example of my setup:
from pathlib import Path

from agno.agent import Agent
from agno.knowledge.json import JSONKnowledgeBase, JSONReader
from agno.vectordb.pgvector import PgVector, SearchType
from agno.models.azure import AzureOpenAI
from agno.embedder.azure_openai import AzureOpenAIEmbedder

DATABASE_URL = "{my_database_url}"
agent_description = "You are an agent for searching the Concept table."

knowledge_base = JSONKnowledgeBase(
    path="vocabulary_download_v5/output_only100.json",
    reader=JSONReader(),
    vector_db=PgVector(
        table_name="json_concept",
        db_url=DATABASE_URL,
        search_type=SearchType.vector,
        embedder=AzureOpenAIEmbedder(),
    ),
)

agent = Agent(
    name="concept_agent",
    model=AzureOpenAI(id="gpt-4o"),
    description=agent_description,
    knowledge=knowledge_base,
    tools=[],
    show_tool_calls=True,
    markdown=True
)

if agent.knowledge is not None:
    agent.knowledge.load(recreate=False)

When I run this, I immediately get:
(base) snomed % python3 scripts/main.py
INFO Loading knowledge base
INFO Reading: vocabulary_download_v5/output_only100.json
ERROR Error processing document 'output_only100': Unsupported data type
/Users/kensukemiyake/.pyenv/versions/3.10.11/lib/python3.10/site-packages/agno/vectordb/pgvector/pgvector.py:339: SAWarning: Column 'ai.json_concept.id' is marked as a member of the primary key for table 'ai.json_concept', but has no Python-side or server-side default generator indicated, nor does it indicate 'autoincrement=True' or 'nullable=True', and no explicit value is passed. Primary key columns typically may not store NULL.
sess.execute(insert_stmt, batch_records)
INFO Inserted batch of 0 documents.
ERROR Error processing document 'output_only100': Unsupported data type
For reference, here is a small excerpt from output_only100.json:
[
  {
    "name": "003630641",
    "content": "hemorrhoidal cream 10mg/g / 144mg/g … TOPICAL CREAM",
    "meta_data": {
      "vocabulary_id": "NDC",
      "concept_class_id": "9-digit NDC"
    }
  },
  {
    "name": "167290436",
    "content": "fulvestrant 250mg/5mL INTRAMUSCULAR INJECTION",
    "meta_data": {
      "vocabulary_id": "NDC",
      "concept_class_id": "9-digit NDC"
    }
  }
  // …more entries…
]
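
To help rule out a problem with the file itself, here is a minimal sketch of a shape check that uses only the standard library. The helper name describe_entries is hypothetical and not part of agno; it only reports the top-level type and the Python type of each field, and it says nothing about what JSONReader actually accepts.

```python
import json
from collections import Counter


def describe_entries(text: str) -> dict:
    """Parse JSON text and summarize its top-level type and per-field value types."""
    # json.loads raises json.JSONDecodeError on anything that is not strict JSON,
    # for example a literal // comment inside the array.
    data = json.loads(text)
    summary = {"top_level": type(data).__name__, "fields": Counter()}
    entries = data if isinstance(data, list) else [data]
    for entry in entries:
        if isinstance(entry, dict):
            for key, value in entry.items():
                summary["fields"][(key, type(value).__name__)] += 1
        else:
            # An entry that is not an object (e.g. a bare string or number).
            summary["fields"][("<non-object entry>", type(entry).__name__)] += 1
    return summary


if __name__ == "__main__":
    sample = (
        '[{"name": "167290436", '
        '"content": "fulvestrant 250mg/5mL INTRAMUSCULAR INJECTION", '
        '"meta_data": {"vocabulary_id": "NDC"}}]'
    )
    print(describe_entries(sample))
```

Running this on the real file contents (read with Path(...).read_text()) would show whether every entry has the same name/content/meta_data types as the excerpt.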
- Why does Agno report “Unsupported data type” for what appear to be plain JSON objects?
- Is there a specific file format or reader configuration I need to change (e.g. NDJSON vs. array)?
- Do I need to adjust any JSONReader or JSONKnowledgeBase parameters to accept a top-level array?
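
In case it is useful for testing the NDJSON alternative, here is a minimal sketch that converts a top-level array into one object per line, using only the standard library. It assumes the input is a top-level array; array_to_ndjson is a hypothetical helper, not an agno function, and whether JSONReader handles NDJSON any differently is exactly what is unclear to me.

```python
import json


def array_to_ndjson(text: str) -> str:
    """Convert JSON text holding a top-level array into newline-delimited JSON."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    # ensure_ascii=False keeps characters such as '…' readable in the output.
    return "\n".join(json.dumps(item, ensure_ascii=False) for item in data)


if __name__ == "__main__":
    sample = (
        '[{"name": "167290436", '
        '"content": "fulvestrant 250mg/5mL INTRAMUSCULAR INJECTION"}]'
    )
    print(array_to_ndjson(sample))
```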
Any guidance on how to proceed would be much appreciated!
Thank you in advance,
Kensuke Miyake