What is HyDE (Hypothetical Document Embeddings)?
Abbreviation: HyDE
HyDE is a retrieval technique in which the model first generates a hypothetical answer to the query, then embeds that answer for retrieval instead of embedding the raw query. It improves retrieval quality on queries whose wording differs greatly from that of the source documents.
Detail
The intuition: user queries are short and abstract ("what is the refund policy?"), while source documents are long and concrete ("Refunds may be requested within 30 days…"). Embedding the query directly often fails to retrieve the relevant document. HyDE has the LLM generate what an answer might look like, then embeds that text, which tends to match the document better.
HyDE adds a small extra LLM call per query, but it often boosts retrieval recall meaningfully on terse queries.
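The flow described above can be sketched in a few lines of Python. This is a toy illustration, not a reference implementation: `embed` is a bag-of-words stand-in for a real embedding model, `fake_llm` is a hypothetical stub that returns a canned answer instead of calling an LLM, and the three documents are invented for the example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(text: str, docs: List[str], k: int = 1) -> List[str]:
    """Baseline: embed the given text and return the k most similar docs."""
    vec = embed(text)
    return sorted(docs, key=lambda d: cosine(vec, embed(d)), reverse=True)[:k]


def hyde_retrieve(query: str, docs: List[str],
                  generate: Callable[[str], str], k: int = 1) -> List[str]:
    """HyDE: generate a hypothetical answer, then retrieve with that text."""
    hypothetical_answer = generate(query)  # the one extra LLM call per query
    return retrieve(hypothetical_answer, docs, k)


def fake_llm(query: str) -> str:
    """Hypothetical stub for an LLM; a real system would prompt a model here."""
    return "Refunds may be requested within 30 days of purchase for a full refund."


# Invented example corpus.
docs = [
    "Refunds may be requested within 30 days of the original purchase date.",
    "Our policy on data privacy is described in the privacy notice.",
    "Shipping takes 5 to 7 business days.",
]

query = "what is the refund policy?"
baseline_top = retrieve(query, docs)[0]
hyde_top = hyde_retrieve(query, docs, fake_llm)[0]
print("raw query ->", baseline_top)
print("HyDE      ->", hyde_top)
```

In this toy setup the raw query shares only surface words such as "policy" with the privacy document and so ranks it first, while the hypothetical answer shares most of its wording with the refund document. With a real embedding model the gap is usually less stark, and the generated answer does not need to be factually correct; it only needs to resemble the target document.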