Why Your AI Search Might Be Missing Obvious Answers (And How RRF Fixes It)
When your AI says "no relevant information found" while the answer is literally in your messages, the problem isn't the AI — it's how hybrid search combines different ranking methods.
We recently encountered a puzzling bug that perfectly illustrates why hybrid search systems can fail in surprisingly obvious ways. A user asked their AI assistant a simple question, and despite having the exact answer in their chat history, the AI confidently claimed no such information existed.
User: "What date was I supposed to put in the agreement?"
AI: "I checked your messages, and there's no mention anywhere of what date you were supposed to put in the agreement — none of the retrieved messages reference a signing date."
The root cause: Score scale mismatch
Our system uses hybrid search — combining two different search methods to get the best of both worlds:
- Vector Search (Semantic): Understands meaning and context. Great for "find messages about contract terms" even if the word "contract" isn't used.
- BM25 (Keyword): Classic text matching. Perfect for finding exact phrases like "1st February" or specific names.
The problem? These methods use completely different scoring scales:
Score Scale Mismatch
| Search Type | Message Found | Raw Score | Final Rank |
|---|---|---|---|
| Vector | "As per the agreement... Exclusivity in Nigeria" | 0.65 | #1 (wrong answer) |
| Vector | "Ok. Will counter sign and send back" | 0.58 | #2 (wrong answer) |
| BM25 | "1st February will be great" | 0.016 | #47 (correct, but buried!) |
The problem: BM25 scores (0.016) are 40x smaller than Vector scores (0.65). When sorted by raw score, keyword matches always lose — even when they're the correct answer!
When the user asked "What date should I put in the agreement?", BM25 correctly found the "1st February" message at rank #1 (it matched the exact keywords). But Vector search ranked it #47 because semantically, talking about a date doesn't strongly relate to general "agreement" discussions.
Since we were sorting by raw score, Vector's 0.65 beat BM25's 0.016 every time. The correct answer got buried at position #47, and only the top 10 results were sent to the AI.
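The failure mode is easy to reproduce. Here is a toy illustration with hypothetical scores mirroring the table above, showing how a raw-score sort always buries the BM25 match:

```javascript
// Merged results from both methods (scores are illustrative, not real output).
const merged = [
  { text: "As per the agreement... Exclusivity in Nigeria", source: "vector", score: 0.65 },
  { text: "Ok. Will counter sign and send back", source: "vector", score: 0.58 },
  { text: "1st February will be great", source: "bm25", score: 0.016 },
];

// Sorting by raw score compares values on two different scales,
// so BM25 matches always sink to the bottom.
merged.sort((a, b) => b.score - a.score);

console.log(merged.map((r) => r.text)); // the correct answer lands last
```

With only the top N results forwarded to the model, the correct answer never even reaches it.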
The fix: Reciprocal Rank Fusion (RRF)
The solution is elegant: instead of comparing raw scores, compare rankings. This is called Reciprocal Rank Fusion (RRF), and the formula is beautifully simple:

RRF_score(d) = Σ 1/(k + rank_i(d))

where rank_i(d) is document d's position in the i-th result list, and k is a constant (typically 60).
Why this works: Being ranked #1 in either search method contributes equally to the final score. The constant 60 (called "k") dampens the effect of lower rankings so that #1 is significantly better than #2, but #47 isn't much different from #48.
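The dampening is easy to check numerically. A quick sketch (assuming k = 60, as above) compares the gap between adjacent ranks near the top against the gap deep in the list:

```javascript
// Each rank contributes 1/(k + rank) to the RRF score.
const k = 60;
const contribution = (rank) => 1 / (k + rank);

const topGap = contribution(1) - contribution(2);    // 1/61 - 1/62
const deepGap = contribution(47) - contribution(48); // 1/107 - 1/108

// One step near the top of the list is worth roughly three steps deep down.
console.log(topGap / deepGap);
```

So moving from #2 to #1 changes the score about three times as much as moving from #48 to #47, which is exactly the behavior we want.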
Before vs After RRF
How RRF calculates the final score
Let's trace through the math for our "1st February" message:
| Search Method | Rank | RRF Contribution |
|---|---|---|
| BM25 (keyword) | #1 | 1/(60+1) = 0.0164 |
| Vector (semantic) | #47 | 1/(60+47) = 0.0093 |
| Total RRF Score | | 0.0257 |
Compare that to the "Exclusivity in Nigeria" message that was previously #1:
| Search Method | Rank | RRF Contribution |
|---|---|---|
| Vector (semantic) | #1 | 1/(60+1) = 0.0164 |
| BM25 (keyword) | Not found | 1/(60+maxRank) ≈ 0 |
| Total RRF Score | | ≈ 0.0164 |
The message found by both methods (0.0257) now outranks the message that was #1 in one search method but didn't appear in the other (0.0164). Fair is fair.
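The two totals can be recomputed in a couple of lines. This sketch models absence from a result list with a hypothetical maxRank of 1000; with any large maxRank the missing-method term is near zero, so the total approaches the table's 0.0164:

```javascript
// Sum 1/(k + rank) across result lists, with k = 60.
const k = 60;
const rrf = (...ranks) => ranks.reduce((sum, r) => sum + 1 / (k + r), 0);

// "1st February": #1 in BM25, #47 in Vector
const february = rrf(1, 47); // 1/61 + 1/107 ≈ 0.0257

// "Exclusivity in Nigeria": #1 in Vector, absent from BM25
// (absence approximated here by maxRank = 1000)
const exclusivity = rrf(1, 1000); // 1/61 + 1/1060 ≈ 0.0173

console.log(february > exclusivity); // the doubly-found message wins
```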
The key insight
"RRF doesn't care about the raw scores. It only cares about the order. Being #1 in keyword search carries the same weight as being #1 in semantic search. Results found by both methods rank highest."
What happens when a result only appears in one search?
For the search method where a result doesn't appear, we assign it a "max rank" (the total number of results + 1). This means:
- Results in both searches: Get contributions from both, rank highest
- Results in one search: Get contribution from one, plus minimal contribution from the other
- Same-method ties: Decided by their ranking in the other method
This elegantly handles the case where semantic search finds conceptually related messages that keyword search misses (and vice versa).
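Putting the pieces together, a fusion step along these lines could look as follows. This is a sketch, not our production code; helper names like `toRankMap` and `fuse` are hypothetical:

```javascript
// Build a Map from document id to its 1-based rank in one result list.
function toRankMap(results) {
  const ranks = new Map();
  results.forEach((r, i) => ranks.set(r.id, i + 1)); // best result gets rank 1
  return ranks;
}

// Fuse two ranked lists with RRF; ids missing from a list get maxRank.
function fuse(vectorResults, bm25Results, k = 60) {
  const vectorRanks = toRankMap(vectorResults);
  const bm25Ranks = toRankMap(bm25Results);
  const maxRank = Math.max(vectorResults.length, bm25Results.length) + 1;

  const ids = new Set([...vectorRanks.keys(), ...bm25Ranks.keys()]);
  return [...ids]
    .map((id) => ({
      id,
      rrfScore:
        1 / (k + (vectorRanks.get(id) ?? maxRank)) +
        1 / (k + (bm25Ranks.get(id) ?? maxRank)),
    }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}
```

For example, `fuse([{id:"a"},{id:"b"}], [{id:"b"},{id:"c"}])` ranks "b" first, because it appears in both lists while "a" and "c" each appear in only one.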
Implementation details
The code change was straightforward. Instead of:
```javascript
// OLD: Sort by raw score (broken)
results.sort((a, b) => b.score - a.score);
```
We now:
```javascript
// NEW: Calculate RRF score from rankings
const RRF_K = 60;
const vectorRank = vectorRanks.get(id) ?? maxRank;
const bm25Rank = bm25Ranks.get(id) ?? maxRank;
const rrfScore = 1 / (RRF_K + vectorRank) + 1 / (RRF_K + bm25Rank);

// Sort by RRF score
results.sort((a, b) => b.rrfScore - a.rrfScore);
```
Why this matters for AI search
Modern search systems often combine multiple retrieval methods:
- Vector embeddings for semantic understanding
- BM25/TF-IDF for exact keyword matches
- Graph-based retrieval for relationship-aware search
- Temporal signals for recency
Each method has different score distributions. RRF provides a principled way to combine them without needing to normalize scores or tune complex weighting schemes.
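Nothing in the formula limits it to two lists: each retriever simply adds its own 1/(k + rank) term. A minimal sketch of N-way fusion (function name `rrfFuse` is hypothetical):

```javascript
// Fuse any number of ranked id lists with RRF.
function rrfFuse(rankedLists, k = 60) {
  // rankedLists: array of arrays of document ids, each sorted best-first.
  const maxRank = Math.max(...rankedLists.map((l) => l.length)) + 1;
  const scores = new Map();

  for (const id of new Set(rankedLists.flat())) {
    let score = 0;
    for (const list of rankedLists) {
      const idx = list.indexOf(id); // -1 if this retriever missed the doc
      const rank = idx === -1 ? maxRank : idx + 1;
      score += 1 / (k + rank);
    }
    scores.set(id, score);
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}
```

A document returned by vector, keyword, and graph retrieval will outscore one found by a single method, with no per-method weight tuning.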
"The beauty of RRF is its simplicity. No hyperparameter tuning beyond the k value (60 works well in practice). No score normalization. Just ranks and a formula."
The result
After implementing RRF, the same query "What date should I put in the agreement?" now correctly retrieves the "1st February" message at the top of results. The AI finds the answer immediately.
Fixed: Messages with exact keyword matches now rank appropriately alongside semantically similar results. The correct answer bubbles to the top.
Key takeaways
- Don't compare raw scores across different search methods — they live on different scales
- RRF converts scores to ranks, making apples-to-apples comparison possible
- The k=60 constant is well-established in research and works across most use cases
- Results in multiple search methods naturally rank higher — which is usually correct
Sometimes the fix for a confusing AI behavior isn't in the AI itself — it's in how you feed it information.
Want smarter AI search for your conversations?
Querygen uses advanced hybrid search with RRF to ensure you never miss important information in your WhatsApp messages.
Try Querygen Free