Sentence Transformers embeddings are all zeros

I’m trying to use Sentence Transformers embeddings (this is my first attempt at building a RAG setup with Agno), but the embedding vectors stored in LanceDb appear to be all zeros. I’ve included the code below; hopefully it is reproducible.

from agno.knowledge.knowledge import Knowledge
from agno.vectordb.lancedb import LanceDb
from agno.knowledge.embedder.sentence_transformer import SentenceTransformerEmbedder
from agno.knowledge.reader.pdf_reader import PDFReader

# Create a vector db
vector_db = LanceDb(
    uri="tmp/lancedb",
    table_name="pdf_docs2",
    embedder=SentenceTransformerEmbedder()
)

# Use page-based chunking for your PDF
pdf_reader = PDFReader(
    name="Page Chunking Reader",
    chunk_by="page",  # chunks by page
)

# Define Knowledge
knowledge = Knowledge(
    vector_db=vector_db,
)

# Add content
knowledge.add_content(
    url="https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
    reader=pdf_reader,
)

# Peek at the vector_db (named `df` to avoid shadowing the usual pandas alias `pd`)
df = vector_db.table.to_pandas()
print(df)
print(df.loc[0]['vector'])

And I get something like this:

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
...

Is it a bug? Or am I missing something?
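
One way to narrow this down is to check whether every stored vector is zero or only some rows are. Below is a minimal sketch in plain Python, assuming the `vector` column holds one numeric sequence per row. `zero_vector_report` is a hypothetical helper written for illustration; it is not part of Agno or LanceDB.

```python
from typing import Iterable, Sequence


def zero_vector_report(vectors: Iterable[Sequence[float]], tol: float = 0.0) -> dict:
    """Count how many vectors consist only of values within `tol` of zero."""
    total = 0
    zero_rows = 0
    for vec in vectors:
        total += 1
        # An empty vector is also counted as a zero row.
        if all(abs(float(x)) <= tol for x in vec):
            zero_rows += 1
    return {
        "total": total,
        "zero_rows": zero_rows,
        "all_zero": total > 0 and zero_rows == total,
    }


# Toy data standing in for the `vector` column of the table.
sample = [[0.0, 0.0, 0.0], [0.1, -0.2, 0.3], [0.0, 0.0, 0.0]]
print(zero_vector_report(sample))
```

With the table from the script above, this could be called as `zero_vector_report(vector_db.table.to_pandas()["vector"])`. If only some rows are zero, the problem is more likely tied to specific chunks than to the embedder as a whole.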

Hey @timmak, this was indeed a bug. We’ve merged the fix, and it will be released soon.

Thanks for raising the issue!

Thanks for getting back so quickly. I also tried HuggingfaceCustomEmbedder instead of SentenceTransformerEmbedder and got the same result. Do you think HuggingfaceCustomEmbedder has a similar bug?

Thanks @timmak for raising this. I will check and let you know.

Hey @timmak, I tested HuggingfaceCustomEmbedder using cookbook/knowledge/embedders/huggingface_embedder.py and was not able to replicate the issue. Could you please share the script/config you used so we can reproduce it?

thanks
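
For a reproduction, it may help to test the embedder on its own, separately from LanceDb, so it is clear whether the zeros come from the embedder or from the storage step. The sketch below assumes the embedder can be wrapped as a callable that maps a string to a list of floats. `check_embedder` and `fake_embed` are hypothetical names used only for illustration; with a real embedder you would pass in whichever of its methods returns the embedding for a single text.

```python
from typing import Callable, List, Sequence


def check_embedder(embed: Callable[[str], Sequence[float]], texts: List[str]) -> List[dict]:
    """Embed each text and report the vector size and whether it is all zeros."""
    results = []
    for text in texts:
        vec = list(embed(text))
        results.append({
            "text": text,
            "dim": len(vec),
            "all_zero": all(float(x) == 0.0 for x in vec),
        })
    return results


def fake_embed(text: str) -> List[float]:
    # Stand-in embedder for illustration only: zeros for blank input,
    # otherwise a small non-zero vector derived from the text length.
    if not text.strip():
        return [0.0, 0.0, 0.0]
    return [len(text) / 10.0, 1.0, 0.0]


for row in check_embedder(fake_embed, ["pad thai", ""]):
    print(row)
```

If the embedder returns non-zero vectors here but the table still contains zeros, the issue is more likely in how the vectors are written to the database.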