Semantic search goes beyond traditional keyword-based search by interpreting the intent and contextual meaning behind a query. With the advent of Large Language Models (LLMs) and related transformer models such as OpenAI’s GPT and Google’s BERT, semantic search has become more accessible and effective. In this blog, we’ll explore how to implement semantic search using AI and LLMs.
What is Semantic Search?
Semantic search aims to improve search accuracy by understanding the context and meaning of a query rather than relying solely on matching keywords. For example:
- A keyword search for “apple” might return results about the fruit or the tech company.
- A semantic search understands the context (e.g., “apple iPhone” vs. “apple nutrition”) and returns more relevant results.
Semantic search is widely used in applications like:
- E-commerce product search
- Document retrieval systems
- Chatbots and virtual assistants
- Knowledge base search
How Semantic Search Works with AI and LLMs
Semantic search relies on embeddings: numerical vector representations of text that capture its meaning. Embedding models built on architectures like GPT and BERT generate these vectors. Texts with similar meanings end up close together in the vector space, so the similarity between a query and a document can be computed numerically.
Here’s a high-level overview of the process:
- Text Embedding Generation: Convert text (queries and documents) into vector embeddings using an LLM.
- Similarity Calculation: Use a similarity metric (e.g., cosine similarity) to compare the embeddings of the query and documents.
- Ranking: Rank documents based on their similarity to the query and return the most relevant results.
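The three steps above can be sketched without calling any model. The vectors below are hand-made 3-dimensional stand-ins for real embeddings (an assumption purely for illustration, since real models return hundreds or thousands of dimensions), and the names toy_embeddings and rank are hypothetical.

```python
import numpy as np

# Hand-made stand-ins for embeddings (illustrative values, not model output).
toy_embeddings = {
    "apple iPhone review": np.array([0.9, 0.1, 0.0]),
    "apple nutrition facts": np.array([0.1, 0.9, 0.0]),
    "smartphone battery life": np.array([0.8, 0.0, 0.2]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank(query_vec, embeddings):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Pretend this vector is the embedding of a phone-related query.
query_vec = np.array([1.0, 0.0, 0.0])
ranking = rank(query_vec, toy_embeddings)
for text, score in ranking:
    print(f"{score:.3f}  {text}")
```

With these toy values, the two phone-related texts score close to 1 and the nutrition text scores close to 0, which is the behavior a real embedding model is expected to approximate.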
Step-by-Step Guide to Implementing Semantic Search
Let’s walk through the steps to build a semantic search system using Python and LLMs.

Step 1: Choose an LLM for Embeddings
You can use pre-trained models like:
- OpenAI’s embedding models (e.g., text-embedding-ada-002)
- Sentence Transformers (e.g., all-MiniLM-L6-v2)
- Google’s BERT
For this example, we’ll use OpenAI’s embeddings API.
Step 2: Install Required Libraries
Install the necessary Python libraries:
pip install openai numpy scipy
Step 3: Generate Embeddings
Use the LLM to generate embeddings for your documents and queries.
from openai import OpenAI

# Create a client with your OpenAI API key (client interface of openai>=1.0)
client = OpenAI(api_key="your-openai-api-key")

def get_embedding(text, model="text-embedding-ada-002"):
    # Replace newlines with spaces before sending the text to the API
    text = text.replace("\n", " ")
    response = client.embeddings.create(input=[text], model=model)
    return response.data[0].embedding

# Example documents
documents = [
    "Artificial intelligence is transforming industries.",
    "Machine learning is a subset of AI.",
    "Python is a popular programming language for AI development.",
]

# Generate embeddings for documents
document_embeddings = [get_embedding(doc) for doc in documents]
Step 4: Calculate Similarity
Use cosine similarity to compare the query embedding with document embeddings.
from scipy.spatial.distance import cosine

def cosine_similarity(vec1, vec2):
    # scipy's cosine() returns a distance, so subtract it from 1
    return 1 - cosine(vec1, vec2)

def semantic_search(query, documents, document_embeddings):
    # Get embedding for the query
    query_embedding = get_embedding(query)
    # Calculate similarity scores
    similarity_scores = [
        cosine_similarity(query_embedding, doc_embedding)
        for doc_embedding in document_embeddings
    ]
    # Rank documents by similarity
    ranked_docs = sorted(zip(documents, similarity_scores), key=lambda x: x[1], reverse=True)
    return ranked_docs

# Example query
query = "What is AI?"
results = semantic_search(query, documents, document_embeddings)

# Display results
for doc, score in results:
    print(f"Document: {doc}\nSimilarity Score: {score}\n")
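Scoring documents one at a time in a Python loop gets slow as the collection grows. As a rough sketch, the same cosine ranking can be done in one matrix operation with NumPy: once every vector is scaled to unit length, a dot product equals cosine similarity. The function name top_k and the toy 2-dimensional vectors are assumptions for illustration, not part of the code above.

```python
import numpy as np

def top_k(query_embedding, document_embeddings, k=3):
    """Return (index, score) pairs for the k most similar documents."""
    docs = np.asarray(document_embeddings, dtype=float)
    query = np.asarray(query_embedding, dtype=float)
    # Normalize to unit length so a dot product equals cosine similarity.
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query = query / np.linalg.norm(query)
    scores = docs @ query
    k = min(k, len(scores))
    # argsort is ascending, so reverse it and keep the first k indices.
    best = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in best]

# Toy 2-d vectors standing in for real embeddings.
hits = top_k([1.0, 0.0], [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]], k=2)
print(hits)
```

The returned indices can be used to look up the original texts, for example documents[i] for each (i, score) pair.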
Step 5: Optimize and Scale
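- Indexing: Use a vector database such as Pinecone or Weaviate, or a similarity-search library such as FAISS, to store and efficiently search large collections of embeddings.
- Preprocessing: Clean the text before embedding it (e.g., strip markup and boilerplate, normalize whitespace and encoding). Aggressive steps such as stop-word removal are usually unnecessary for transformer-based embeddings and can remove context the model relies on.
- Fine-tuning: Fine-tune the embedding model on domain-specific data for better performance in specialized applications.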
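For the preprocessing point, a light cleanup pass using only the standard library is often a reasonable starting place. The helper below is a minimal sketch: the name normalize_text is hypothetical, and the regex-based tag removal is deliberately crude (a real pipeline would likely use a proper HTML parser).

```python
import html
import re
import unicodedata

def normalize_text(text):
    """Light cleanup before embedding: drop tags, decode entities, tidy whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)        # crude HTML tag removal
    text = html.unescape(text)                  # "&amp;" -> "&", "&nbsp;" -> no-break space
    text = unicodedata.normalize("NFKC", text)  # unify compatibility characters
    return re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace

cleaned = normalize_text("<p>Machine&nbsp;learning\n is a   subset of AI.</p>")
print(cleaned)
```

Applying the same function to both documents and queries keeps the two sides of the comparison consistent.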
Applications of Semantic Search

- E-commerce: Improve product search by understanding user intent (e.g., “comfortable running shoes” vs. “cheap shoes”).
- Customer Support: Build chatbots that retrieve relevant answers from a knowledge base.
- Content Recommendation: Recommend articles, videos, or products based on semantic similarity.
- Legal and Medical Search: Retrieve relevant documents or case studies based on contextual queries.
Challenges and Considerations
- Computational Cost: Generating embeddings and performing similarity searches can be resource-intensive.
- Latency: Real-time applications may require optimized models and infrastructure.
- Domain-Specificity: Pre-trained models may not perform well in specialized domains without fine-tuning.
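One common way to reduce cost and latency is to avoid embedding the same text twice. The sketch below memoizes embeddings in an in-memory dictionary keyed by a hash of the text. EmbeddingCache and fake_embed are hypothetical names; fake_embed is only a stand-in so the example runs without an API key, and a production system would more likely persist the cache on disk or in a database.

```python
import hashlib

class EmbeddingCache:
    """Memoize embeddings so identical text is only embedded once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # any callable: text -> list of floats
        self._store = {}           # in-memory only; lost when the process exits
        self.misses = 0            # number of times embed_fn was actually called

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]

# Stand-in embedder for demonstration; replace with a real model call.
def fake_embed(text):
    return [float(len(text)), float(text.count(" "))]

cache = EmbeddingCache(fake_embed)
first = cache.get("What is AI?")
second = cache.get("What is AI?")  # served from the cache, no second call
print(cache.misses)
```

Passing a real embedding function in place of fake_embed would let repeated queries and unchanged documents skip the model call entirely.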
Conclusion
Semantic search powered by AI and LLMs can substantially improve information retrieval systems. By comparing the meaning of queries and documents rather than their exact wording, it delivers more accurate and contextually relevant results. With tools like OpenAI’s embeddings API and vector databases, a working prototype takes only a few dozen lines of Python.
Whether you’re building a search engine, a chatbot, or a recommendation system, semantic search can significantly enhance user experience and satisfaction. Start experimenting with the code above, and explore how you can integrate semantic search into your applications!