When using MarkdownKnowledgeBase with PgVector storage, document IDs are stored as string representations of UUID objects ("{UUID('…')}") instead of clean UUID strings, which causes database query and indexing issues.
Environment
- Agno Version: [Current version from pip]
- Database: PostgreSQL with pgvector extension (Supabase)
- Python: 3.13
- OS: macOS
Reproduction Steps
from agno.knowledge.markdown import MarkdownKnowledgeBase
from agno.vectordb.pgvector import PgVector
from agno.embedder.cohere import CohereEmbedder
from agno.document.chunking.agentic import AgenticChunking
import tempfile

# Create temporary markdown file
with tempfile.NamedTemporaryFile(mode='w', suffix='.md', delete=False) as temp_file:
    temp_file.write("# Test Document\nThis is test content.")
    temp_md_path = temp_file.name

# Set up vector database
vector_db = PgVector(
    table_name="test_documents",
    db_url="postgresql+psycopg://user:pass@host:port/db",
    embedder=CohereEmbedder(id="embed-v4.0")
)

# Create and load knowledge base
knowledge_base = MarkdownKnowledgeBase(
    path=temp_md_path,
    vector_db=vector_db,
    chunking_strategy=AgenticChunking()
)
knowledge_base.load(recreate=True)
Expected Behavior
Document IDs should be stored as clean UUID strings:
SELECT id FROM ai.test_documents;
-- Expected: "885404b5-ebcf-444b-a26d-1928a09419f8"
Actual Behavior
Document IDs are stored as string representations of UUID objects:
SELECT id FROM ai.test_documents;
-- Actual: "{UUID('885404b5-ebcf-444b-a26d-1928a09419f8')}"
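The stored value has the same shape Python produces when a set containing a UUID object is converted to a string. I have not confirmed this against the Agno source, so treat it as a guess at the cause rather than a diagnosis. A minimal standard-library sketch that reproduces the exact string:

```python
import uuid

doc_id = uuid.UUID("885404b5-ebcf-444b-a26d-1928a09419f8")

# Assumption (not verified in Agno): the UUID ends up inside a set,
# and the set itself is stringified before being written to the id column.
buggy = str({doc_id})

# What should be stored: the canonical hyphenated form of the UUID.
clean = str(doc_id)

print(buggy)
print(clean)
```

If something like this is happening, calling str() on the UUID itself (not on a container holding it) before writing the row would give the expected value.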
Impact
- Database Performance: The string representation cannot be stored or indexed as a native UUID type
- Query Complexity: Queries require string parsing to extract the actual UUID
- Integration Issues: External systems expect the clean UUID format
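Until this is fixed, a possible client-side workaround is to pull the UUID out of the malformed value when reading IDs back. The helper below is hypothetical (extract_uuid is my own name, not part of Agno) and uses only the standard library:

```python
import re
import uuid

# Matches the canonical 8-4-4-4-12 hexadecimal UUID layout anywhere in a string.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_uuid(raw: str) -> str:
    """Return the clean UUID string embedded in raw (hypothetical helper).

    Accepts both the malformed "{UUID('...')}" form and an already clean UUID.
    Raises ValueError if no UUID is present.
    """
    match = _UUID_RE.search(raw)
    if match is None:
        raise ValueError(f"no UUID found in {raw!r}")
    # Round-trip through uuid.UUID to validate and normalize to lowercase.
    return str(uuid.UUID(match.group(0)))


print(extract_uuid("{UUID('885404b5-ebcf-444b-a26d-1928a09419f8')}"))
```

This only hides the problem on the reading side; the rows in the table still hold the malformed strings, so the id column still cannot be treated as a UUID type.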
Additional Issue: Document Naming
It would also help to allow a custom document name instead of using the temporary filename.
Thank you for your attention to this issue. The Agno framework is excellent overall, and fixing this UUID handling would make database operations much cleaner.