Embeddings · Lucas Herrero

Embeddings are how text becomes searchable by meaning. A document goes in, a vector comes out. The vector is a compressed numerical representation of what the document is about - close to other documents that mean similar things, far from documents that mean different things. RAG depends on this. Every retrieval system that surfaces "relevant" documents has an embedding model deciding what "relevant" means.
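To make "close" and "far" concrete, here is a minimal sketch of cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors and the names doc_dividends, doc_payouts and doc_recipes are invented for illustration; real embeddings have hundreds or thousands of dimensions and come from a model, not from hand-typed numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors standing in for real embedding output.
doc_dividends = [0.9, 0.1, 0.0]
doc_payouts = [0.8, 0.2, 0.1]
doc_recipes = [0.0, 0.1, 0.9]

similar = cosine_similarity(doc_dividends, doc_payouts)
different = cosine_similarity(doc_dividends, doc_recipes)
print(f"similar topics: {similar:.3f}, different topics: {different:.3f}")
```

The two documents about similar things score near 1; the unrelated one scores near 0. That ordering is all "searchable by meaning" amounts to at this level.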

The embedding model is upstream of everything else in the retrieval pipeline. The vector DB does math on whatever vectors it gets. The similarity threshold is a knob on top of those vectors. Neither can recover a distinction the embedding never encoded, so retrieval quality is bounded by how well the embedding model captures the distinctions that matter for your domain.
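A rough sketch of what "math on whatever vectors it gets" plus a threshold looks like, assuming cosine similarity as the metric. The retrieve function, the doc_a/doc_b/doc_c ids and the toy vectors are hypothetical; a real vector DB uses an approximate index instead of this linear scan, but the dependency on the vectors is the same.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, corpus, threshold=0.8, top_k=3):
    """Score every document, drop those below the threshold, return the best top_k."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus.items()]
    kept = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Made-up vectors: doc_a and doc_b were embedded close together, doc_c far away.
corpus = {
    "doc_a": [0.9, 0.3, 0.0],
    "doc_b": [0.85, 0.35, 0.05],
    "doc_c": [0.0, 0.1, 0.95],
}
query_vec = [0.9, 0.3, 0.0]

loose_hits = retrieve(query_vec, corpus, threshold=0.8)
strict_hits = retrieve(query_vec, corpus, threshold=0.999)
print([doc_id for doc_id, _ in loose_hits])
print([doc_id for doc_id, _ in strict_hits])
```

The threshold only decides where to cut a ranking the embedding already produced. If doc_a and doc_b should have been far apart, no threshold setting fixes that cleanly.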

Most practitioners default to one of OpenAI's text-embedding-3 models or whatever the current open-source champion is. That works for general text. It doesn't necessarily work for finance, where a covered call and a cash-secured put are different strategies and the difference matters. A generic embedding might cluster both under "options strategies" and lose it.
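One way to check this before committing to a model is a small probe: embed two paraphrases of the same strategy and one description of the other strategy, then compare similarities. The sketch below is an assumption about how such a probe could look, not an established benchmark. The name separation_margin and the example sentences are made up, and toy_embed is a bag-of-words stand-in that exists only so the code runs; swap in the embedding call of the model under test.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def separation_margin(embed, same_pair, cross_pair):
    """Within-concept similarity minus cross-concept similarity.

    A margin near zero or below suggests the embedding treats the two
    concepts as interchangeable.
    """
    within = cosine_similarity(embed(same_pair[0]), embed(same_pair[1]))
    cross = cosine_similarity(embed(cross_pair[0]), embed(cross_pair[1]))
    return within - cross


# Hypothetical probe sentences.
covered_call_a = "sell a call against shares you own"
covered_call_b = "own shares and sell a call on them"
cash_secured_put = "sell a put backed by cash"

# Stand-in embedder (word counts over a fixed vocabulary), NOT a real model.
VOCAB = sorted(set(" ".join([covered_call_a, covered_call_b, cash_secured_put]).split()))


def toy_embed(text):
    words = text.split()
    return [float(words.count(term)) for term in VOCAB]


margin = separation_margin(
    toy_embed,
    same_pair=(covered_call_a, covered_call_b),
    cross_pair=(covered_call_a, cash_secured_put),
)
print(f"separation margin: {margin:.3f}")
```

A handful of pairs like this, drawn from the distinctions that matter in the corpus, gives a cheap signal on whether a general-purpose model is good enough or a domain-tuned one is worth the engineering cost.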

I default to a strong general-purpose embedding for the wiki because most of the wiki is general-purpose context. For domain-specific corpora - earnings transcripts, options chains - I'd reach for a domain-tuned embedding. The ROI of switching is real but the engineering cost is also real. I haven't pulled the trigger on a finance-specific embedding yet; the wiki isn't dense enough in finance specifics to justify it.

The embedding is the prior. Choose accordingly.