Semantic Search for E-Commerce: Replacing Keyword Search with pgvector + OpenAI Embeddings

"Gifts under $50 for a coffee lover" returned zero results with keyword search. Here's how I rebuilt a Medusa v2 storefront's search with vector embeddings, reranking, and live inventory awareness.

A Medusa v2 storefront client had a search problem that no amount of keyword tuning was going to fix. Their customers were searching like humans — "gifts under $50 for a coffee lover", "something warm for winter that isn't a jacket", "wireless earbuds that won't fall out when running" — and getting zero results every time.

The existing search was PostgreSQL full-text search with tsvector. It worked fine for exact product name lookups. But e-commerce customers don't search by product name. They search by intent. And intent doesn't live in a keyword index.

I replaced the entire search layer with vector-based semantic search. Queries like "gifts under $50 for a coffee lover" now return pour-over kits, ceramic mugs, and specialty bean subscriptions. Average order value went up 18% over a 6-week A/B test on a related Mercur multi-vendor marketplace.

Why Keyword Search Fails for E-Commerce

Full-text search matches tokens. The query "coffee lover gifts" matches products with "coffee" or "gift" in the title or description. It does not understand:

Synonyms: "warm" → sweaters, fleece, thermal layers
Intent: "for running" → sweat-resistant, secure fit, lightweight
Price constraints: "under $50" is a filter, not a search term
Negation: "isn't a jacket" should exclude jackets

You can patch some of this with synonym dictionaries, manual boosts, and faceted filtering. I've done that on Shopify and WooCommerce stores. It works until someone searches for "anniversary gift for someone who has everything" and you realise you're playing whack-a-mole with natural language.

Semantic search sidesteps the problem entirely. Instead of matching tokens, you match meaning.

The Architecture

User query
    ↓
┌─────────────────────┐
│  Query parser        │ ← extracts filters (price, category)
│  + embedding         │ ← embeds the intent portion
└─────────────────────┘
    ↓
┌─────────────────────┐
│  pgvector search     │ ← cosine similarity on product embeddings
│  + SQL filters       │ ← price range, category, in-stock
└─────────────────────┘
    ↓
┌─────────────────────┐
│  Reranker            │ ← vector similarity + inventory + margin weighting
└─────────────────────┘
    ↓
  Ranked results

Three stages: parse the query to separate intent from filters, search by vector similarity with SQL constraints, then rerank with business logic.

Embedding Products

Each product gets embedded once at index time. The embedding input is a carefully constructed text blob, not just the product title:

def build_product_text(product: dict) -> str:
    parts = [
        product["title"],
        product.get("subtitle", ""),
        product.get("description", ""),
        product.get("material", ""),
    ]
 
    # Add category path for context
    if product.get("categories"):
        parts.append("Category: " + " > ".join(product["categories"]))
 
    # Add key attributes
    for tag in product.get("tags", []):
        parts.append(tag)
 
    return " | ".join(p for p in parts if p)

Why not just embed the title? Because "Classic Mug" means nothing to an embedding model. But "Classic Mug | Handmade ceramic pour-over coffee mug | 12oz capacity | Dishwasher safe | Category: Kitchen > Coffee & Tea | gift, coffee lover, handmade" gives the model enough context to understand that this product is relevant to "gifts for a coffee lover."

I used OpenAI's text-embedding-3-small and stored embeddings in pgvector alongside the product data:

ALTER TABLE product ADD COLUMN search_embedding VECTOR(1536);
 
CREATE INDEX idx_product_search_embedding
    ON product USING ivfflat (search_embedding vector_cosine_ops)
    WITH (lists = 50);

Same PostgreSQL instance Medusa already runs on. No additional infrastructure.

Query Parsing: Separating Intent from Filters

"Gifts under $50 for a coffee lover" contains both semantic intent ("gifts for a coffee lover") and a structured filter ("under $50"). If you embed the whole string including "$50", the embedding gets polluted with price semantics that don't help find relevant products.

I used a lightweight LLM call to parse the query:

from openai import OpenAI
 
client = OpenAI()
 
PARSE_PROMPT = """Extract search intent and filters from this e-commerce query.
Return JSON with: {"intent": "...", "filters": {"max_price": null, "min_price": null, "category": null, "exclude": []}}
 
Query: "{query}"
"""
 
def parse_query(query: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": PARSE_PROMPT.format(query=query)}],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

For "gifts under $50 for a coffee lover", this returns:

{
  "intent": "gifts for a coffee lover",
  "filters": {
    "max_price": 5000,
    "category": null,
    "exclude": []
  }
}

The intent gets embedded for vector search. The filters become SQL WHERE clauses. Clean separation.

Is an LLM call for query parsing expensive? At gpt-4o-mini pricing, it's fractions of a cent per query. The latency adds ~200ms, which I run in parallel with other operations. For the quality improvement it provides, the tradeoff is unambiguous.

Vector Search with SQL Filters

The actual search combines pgvector cosine similarity with standard SQL filtering:

async def search_products(
    intent: str,
    filters: dict,
    limit: int = 20,
) -> list[dict]:
    intent_embedding = await get_embedding(intent)
 
    query = """
        SELECT
            p.id,
            p.title,
            p.description,
            p.price,
            p.thumbnail,
            1 - (p.search_embedding <=> $1::vector) AS similarity
        FROM product p
        JOIN product_variant pv ON pv.product_id = p.id
        WHERE
            p.status = 'published'
            AND pv.inventory_quantity > 0
    """
 
    params = [intent_embedding]
    param_idx = 2
 
    if filters.get("max_price"):
        query += f" AND pv.price <= ${param_idx}"
        params.append(filters["max_price"])
        param_idx += 1
 
    if filters.get("category"):
        query += f" AND p.category_id = ${param_idx}"
        params.append(filters["category"])
        param_idx += 1
 
    query += f"""
        ORDER BY p.search_embedding <=> $1::vector
        LIMIT ${param_idx}
    """
    params.append(limit)
 
    return await db.fetch(query, *params)

Key details:

<=> is pgvector's cosine distance operator. Lower distance = higher similarity.
inventory_quantity > 0 ensures we never return out-of-stock products. Obvious, but I've seen search implementations that return products customers can't actually buy.
Filters are SQL, not post-query filtering. The database handles price and category constraints at the index level, not after retrieving 1000 results and filtering in Python.

The Reranking Layer

Vector similarity alone isn't enough for e-commerce. A pour-over kit and a bag of coffee beans might have identical similarity scores to "gifts for a coffee lover", but the business wants to show different products depending on context.

The reranker combines three signals:

def rerank(results: list[dict], query_context: dict) -> list[dict]:
    for result in results:
        score = result["similarity"] * 0.6  # vector similarity weight
 
        # Boost in-stock items with healthy inventory
        if result["inventory"] > 10:
            score += 0.15
        elif result["inventory"] > 0:
            score += 0.05
 
        # Boost higher-margin products (business objective)
        if result.get("margin_pct", 0) > 40:
            score += 0.1
 
        # Penalise results that match on category but not intent
        if result["similarity"] < 0.3:
            score *= 0.5
 
        result["final_score"] = score
 
    return sorted(results, key=lambda r: r["final_score"], reverse=True)

The weights (0.6 similarity, 0.15 inventory boost, 0.1 margin boost) were tuned by running the reranker against a test set of 50 queries with manually labelled "ideal" result orderings. Not scientific, but effective — the client's merchandising team reviewed the results and adjusted the weights twice before settling on these values.

This is the layer that makes semantic search commercially useful. Pure relevance ranking optimises for the user. Reranking optimises for the business without destroying the user experience.

Keeping Embeddings Fresh

Products change. Prices update, descriptions get rewritten, new items launch daily. Stale embeddings return stale results.

I set up a simple event-driven refresh:

# Medusa v2 subscriber — re-embed on product update
import { SubscriberArgs, type SubscriberConfig } from "@medusajs/framework"
 
export default async function productUpdateHandler({
  event,
  container,
}: SubscriberArgs<{ id: string }>) {
  const productService = container.resolve("product")
  const searchService = container.resolve("semanticSearchService")
 
  const product = await productService.retrieveProduct(event.data.id)
  await searchService.reindexProduct(product)
}
 
export const config: SubscriberConfig = {
  event: ["product.updated", "product.created"],
}

When a product is created or updated in Medusa, the subscriber triggers a re-embedding. The new embedding replaces the old one in pgvector. No batch jobs, no stale windows. The search index is always within seconds of the source of truth.

For bulk operations (initial index, full re-sync), I run a batch script that processes products in parallel with rate limiting to stay within OpenAI's API limits.

Results

On the Medusa v2 storefront:

Metric	Keyword search	Semantic search
Zero-result rate	23%	4%
Search → product click rate	31%	52%
Search → add to cart rate	8%	14%

On a related Mercur multi-vendor marketplace where I paired semantic search with a conversational shopping assistant:

Metric	Before	After (6-week A/B)
Average order value	$47	$55 (+18%)
Search-assisted conversions	12%	28%

The AOV lift came from the combination of semantic search and the conversational assistant surfacing products customers didn't know existed. "Gifts for a coffee lover" returns a curated set instead of nothing — and the assistant can follow up with "Would you prefer something they can use daily or a one-time experience?" to narrow further.

When Not to Use This

Semantic search isn't always the answer:

Exact SKU lookups — if your customers search by part number or model code, keyword search is faster and more precise. Keep it as a fallback.
Tiny catalogues — under 500 products, a well-curated category tree with filters outperforms any search. Don't over-engineer.
No natural-language queries — if your analytics show that 90% of searches are single product names, semantic search adds latency without improving results.

Check your search analytics before building. If your zero-result rate is under 5% and your click-through rate is above 40%, keyword search is working fine. Fix something else.

The core insight is that e-commerce search is a recommendation problem disguised as a search problem. Customers don't know what they want — they know what they mean. Vector embeddings capture meaning. Keyword indexes don't.

pgvector, OpenAI embeddings, and a reranking layer that respects business objectives. That's the whole stack. No dedicated vector database, no complex ML pipeline. Just PostgreSQL doing what it does best, with a 1536-dimensional column added.

Command Palette

Semantic Search for E-Commerce: Replacing Keyword Search with pgvector + OpenAI Embeddings