⚓️ Retrieval with FastEmbed

This notebook demonstrates how to use FastEmbed to perform vector search and retrieval. It consists of the following sections:

  1. Setup: Installing the necessary packages.
  2. Importing Libraries: Importing FastEmbed and other libraries.
  3. Data Preparation: Example data and embedding generation.
  4. Querying: Defining a function to search documents based on a query.
  5. Running Queries: Running example queries.

Setup

First, we need to install the only dependency, fastembed, which we use to create embeddings and perform retrieval.

# !pip install fastembed --quiet --upgrade

Importing the necessary libraries:

import numpy as np
from fastembed import TextEmbedding

Data Preparation

We initialize the embedding model and generate embeddings for the documents.

💡 Tip: Prefer using query_embed for queries and passage_embed for documents.

# Example list of documents
documents: list[str] = [
    "Maharana Pratap was a Rajput warrior king from Mewar",
    "He fought against the Mughal Empire led by Akbar",
    "The Battle of Haldighati in 1576 was his most famous battle",
    "He refused to submit to Akbar and continued guerrilla warfare",
    "His capital was Chittorgarh, which he lost to the Mughals",
    "He died in 1597 at the age of 57",
    "Maharana Pratap is considered a symbol of Rajput resistance against foreign rule",
    "His legacy is celebrated in Rajasthan through festivals and monuments",
    "He had 11 wives and 17 sons, including Amar Singh I who succeeded him as ruler of Mewar",
    "His life has been depicted in various films, TV shows, and books",
]
# Initialize the TextEmbedding model with the desired parameters
embedding_model = TextEmbedding(model_name="BAAI/bge-small-en")

# We'll use the passage_embed method to get the embeddings for the documents
embeddings: list[np.ndarray] = list(
    embedding_model.passage_embed(documents)
)  # notice that we are casting the generator to a list

print(embeddings[0].shape, len(embeddings))
(384,) 10
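The embedding call above returns a generator, which is why it is wrapped in `list(...)`. The following is a minimal sketch of that pattern using a hypothetical stand-in generator (`fake_embed` is not part of FastEmbed and produces random vectors, not real embeddings). It also shows that the resulting list can be stacked into a single matrix, which is convenient for scoring all documents at once.

```python
import numpy as np


def fake_embed(texts, dim=384):
    """Hypothetical stand-in for an embedding generator: yields one unit vector per text."""
    rng = np.random.default_rng(0)
    for _ in texts:
        vec = rng.normal(size=dim).astype(np.float32)
        yield vec / np.linalg.norm(vec)


texts = [f"document {i}" for i in range(10)]

# A generator can only be consumed once, so materialize it into a list first
vectors = list(fake_embed(texts))

# Stack the per-document vectors into one (num_documents, dim) matrix
matrix = np.stack(vectors)
print(matrix.shape)
```

With a real model, the same `np.stack` call should work on the list of document embeddings, assuming every embedding has the same dimension.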

Querying

We'll define a function to print the top k documents based on a query, and prepare a sample query.

query = "Who was Maharana Pratap?"
query_embedding = list(embedding_model.query_embed(query))[0]
plain_query_embedding = list(embedding_model.embed(query))[0]


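def print_top_k(query_embedding, embeddings, documents, k=5):
    # Dot product between the query and every document embedding; this matches
    # cosine similarity when the embeddings are unit-normalized
    scores = np.dot(embeddings, query_embedding)
    # Document indices sorted by score, highest first
    sorted_indices = np.argsort(scores)[::-1]
    # Print the top k documents (fewer if there are fewer than k documents)
    for rank, idx in enumerate(sorted_indices[:k], start=1):
        print(f"Rank {rank}: {documents[idx]}")


# Compare the first five components of the two query embeddings
query_embedding[:5], plain_query_embedding[:5]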
(array([-0.06002192,  0.04322132, -0.00545516, -0.04419701, -0.00542277],
       dtype=float32),
 array([-0.06002192,  0.04322132, -0.00545516, -0.04419701, -0.00542277],
       dtype=float32))
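The ranking step in `print_top_k` scores documents with a dot product. The sketch below uses small hypothetical 2-D unit vectors (not real embeddings) to illustrate why that is enough when vectors are normalized: the dot product and the cosine similarity give the same score. The helper names `top_k_indices` and `cosine_similarity` are illustrative and not part of FastEmbed.

```python
import numpy as np


def top_k_indices(query_vec, doc_matrix, k=5):
    """Return the indices of the k highest-scoring rows, best first."""
    scores = doc_matrix @ query_vec
    order = np.argsort(scores)[::-1]
    return order[:k]


def cosine_similarity(a, b):
    """Cosine similarity computed explicitly, without assuming unit norm."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy "embeddings": every row has length 1
docs = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [-1.0, 0.0]])
query = np.array([0.8, 0.6])

top = top_k_indices(query, docs, k=2)
print(top.tolist())  # [1, 0]

# For unit vectors the dot product and the cosine similarity agree
print(np.isclose(docs[1] @ query, cosine_similarity(docs[1], query)))  # True
```

Returning indices rather than printing keeps the ranking reusable, for example to compare the order produced by two different query embeddings.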

The query_embed method is designed specifically for queries, which tends to produce more relevant, context-aware results: the retrieved documents align more closely with the query's intent.

In contrast, embed produces a general-purpose representation that might not capture the nuances of the query as effectively. Documents retrieved with plain embeddings might be less relevant, or ordered differently, than those retrieved with query embeddings. Note that for the model and query used here, the first five components printed above are identical for both methods, so how much the two differ depends on the model.
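One way to check whether two embedding methods actually produce different vectors for a given model is to compare them numerically instead of by eye. The sketch below uses only the first five components printed in the output above; `vectors_match` is a hypothetical helper, and a real check would compare the full vectors.

```python
import numpy as np

# First five components copied from the printed output above
query_head = np.array(
    [-0.06002192, 0.04322132, -0.00545516, -0.04419701, -0.00542277],
    dtype=np.float32,
)
plain_head = np.array(
    [-0.06002192, 0.04322132, -0.00545516, -0.04419701, -0.00542277],
    dtype=np.float32,
)


def vectors_match(a, b, atol=1e-6):
    """Return True if both vectors have the same shape and near-equal values."""
    return a.shape == b.shape and bool(np.allclose(a, b, atol=atol))


print(vectors_match(query_head, plain_head))  # True
```

If the check returns True for the full vectors, both methods would rank the documents identically for that query.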

Conclusion: Using query and passage embeddings leads to more relevant and context-aware results.