Hybrid search runs BM25 keyword search and embedding-based semantic search in parallel against the same query, then fuses their result lists into a single ranked output. The two methods fail in complementary ways: BM25 misses paraphrases, embeddings miss exact terms. Combining them covers both blind spots in the same pass. With BM25 alone, searching “staying focused” won’t surface a note titled “Attention and Distraction Management”, because the query and the title share no terms. With embeddings alone, a query for a specific Docker container name can be diluted by semantically similar but wrong results, because embeddings compress meaning and lose exact tokens.

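The standard fusion method is Reciprocal Rank Fusion (RRF), which sidesteps the problem of BM25 and cosine similarity scores being on completely different scales (BM25 scores are unbounded and vary with the corpus and query, often reaching 25 or more; cosine similarity is bounded and usually lands between 0 and 1 for text embeddings). Instead of combining raw scores, RRF uses only rank position: each document’s fused score is the sum of 1 / (k + rank) over every retrieval method that returned it, where k is a smoothing constant, typically 60. With k = 60, a document ranked #1 by both methods scores 2/61 ≈ 0.0328, the highest possible with two methods. A document ranked #1 by one and #50 by the other scores 1/61 + 1/110 ≈ 0.0255, so it still gets credit for its strong showing.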
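The RRF formula above is short enough to sketch directly. This is a minimal illustration, not a reference implementation: the function name `rrf_fuse` and the document IDs are made up for the example, and each retriever is assumed to return an ordered list of IDs with the best match first.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first (hypothetical input shape).
    k: smoothing constant; 60 is the commonly used default.
    Returns (doc_id, fused_score) pairs sorted by fused score, highest first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative rankings only; the IDs are invented for this example.
bm25_ranking = ["note_docker", "note_attention", "note_perm"]
embedding_ranking = ["note_attention", "note_focus", "note_docker"]

fused = rrf_fuse([bm25_ranking, embedding_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Documents that appear in both lists rise to the top, while a document found by only one method keeps a smaller but nonzero score, which is the behavior the paragraph describes.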

Hybrid search shows the largest gains where the corpus mixes jargon-heavy technical content with natural-language prose, where queries vary between exact lookups (“PERM green card”) and conceptual searches (“immigration process for work sponsorship”), or where documents use inconsistent vocabulary for the same concepts (common in personal notes written over months or years).

Hybrid search is a first-stage retriever: its output (typically the top 50-100 candidates) feeds into a reranker for fine-grained scoring. The consensus pipeline as of 2026 is hybrid retrieval, then a reranker, then an LLM for generation. For the original RRF paper, see Cormack, Clarke, and Buettcher (2009), “Reciprocal Rank Fusion Outperforms Condorcet and Individual Rank Learning Methods”.
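The three-stage pipeline can be sketched as plain function composition. Everything here is an assumption made for illustration: `bm25_search`, `embedding_search`, `rerank`, and `generate` are hypothetical callables supplied by the caller, not the API of any particular library, and the two retrievers are assumed to return best-first lists of document IDs.

```python
from concurrent.futures import ThreadPoolExecutor


def hybrid_retrieve(query, bm25_search, embedding_search, top_n=50, k=60):
    """First stage: run both retrievers in parallel, then fuse with RRF."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        bm25_future = pool.submit(bm25_search, query)
        embedding_future = pool.submit(embedding_search, query)
        rankings = [bm25_future.result(), embedding_future.result()]

    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))[:top_n]


def answer(query, bm25_search, embedding_search, rerank, generate,
           top_n=50, final_n=5):
    """Hybrid retrieval, then reranking, then generation."""
    candidates = hybrid_retrieve(query, bm25_search, embedding_search, top_n)
    # The reranker (hypothetical) returns the candidates reordered best-first.
    best = rerank(query, candidates)[:final_n]
    # The generator (hypothetical) receives only the reranked shortlist.
    return generate(query, best)
```

Passing the stages in as callables keeps the sketch independent of any specific search engine, reranker, or LLM client; the `top_n` and `final_n` defaults are placeholders to tune per corpus.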