Vector DB

Storage and similarity search for embeddings. Most of the engineering decisions here are about scale: pick the simplest option that fits the corpus and move when you have to.

AI

A vector DB stores embeddings and answers similarity queries fast. A document gets embedded once, into a vector. Thousands of documents make thousands of vectors. When the agent needs context for a query, it embeds the query into a vector and asks the DB for the nearest matches. That's the entire purpose.
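A minimal sketch of that flow, under loud assumptions: `embed` here is a toy letter-count function standing in for a real embedding model, and `TinyVectorStore` is a hypothetical in-memory list standing in for the DB. Neither is what a real system would use; the point is only the shape of embed once, store, embed the query, rank by similarity.

```python
import math

def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model (assumption): letter counts, L2-normalized."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

class TinyVectorStore:
    """Hypothetical in-memory store: embed once on add, scan everything on query."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, list[float]]] = []

    def add(self, doc: str) -> None:
        # Each document is embedded exactly once, at insert time.
        self.rows.append((doc, embed(doc)))

    def nearest(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), doc) for doc, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]

store = TinyVectorStore()
for doc in ["postgres index tuning", "sourdough bread recipe", "postgres vacuum settings"]:
    store.add(doc)
top = store.nearest("postgres tuning", k=1)
```

Swapping the toy `embed` for a real model changes the quality of the matches, not the structure of the code.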

Most of the engineering decisions around a vector DB are about scale. At a few thousand vectors, you can hold the whole corpus in memory and brute-force the search: every query touches every vector and the cost is negligible. At a few million, brute force gets too slow for interactive queries and you need approximate nearest-neighbor (ANN) indexing such as HNSW or IVF, which trades a little recall for a large speedup. At a few billion, you need infrastructure that doesn't fit on one box.
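The scale tiers can be made concrete with back-of-envelope arithmetic. The sketch below assumes 1536-dimensional float32 embeddings (an assumption, not something fixed by any particular model or store) and counts one multiply-add per dimension per stored vector for an exact scan.

```python
def brute_force_cost(num_vectors: int, dim: int = 1536, bytes_per_float: int = 4) -> dict[str, float]:
    """Rough per-query work and resident memory for an exact (brute-force) scan.

    dim and bytes_per_float are illustrative defaults, not measured values.
    """
    return {
        "multiply_adds_per_query": float(num_vectors * dim),
        "memory_gib": num_vectors * dim * bytes_per_float / 2**30,
    }

# One representative corpus size per tier: thousands, millions, billions.
for n in (5_000, 5_000_000, 5_000_000_000):
    cost = brute_force_cost(n)
    print(
        f"{n:>13,} vectors: {cost['multiply_adds_per_query']:.1e} multiply-adds/query, "
        f"{cost['memory_gib']:,.2f} GiB"
    )
```

Under these assumptions, five thousand vectors take about 30 MB and a few million multiply-adds per query, five million take roughly 29 GiB and billions of multiply-adds per query, and five billion take roughly 28 TiB, which is where a single box stops being an option.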

I run pgvector on a small Postgres instance for the wiki. The corpus is in the thousands, not the millions, so brute force is fine, and the operational simplicity of using the database I already run outweighs the marginal performance gain of a dedicated vector store. If the wiki crossed a hundred thousand entries, I'd reconsider; Qdrant or LanceDB would be the natural step up.
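For a sense of what that looks like, here is a sketch of the SQL involved, held in Python strings and not executed. The table name, column names, and the 3-dimensional vector are all hypothetical, and the `%s` placeholders assume a psycopg-style driver. With no ANN index on the column, pgvector falls back to an exact sequential scan, which is the brute-force case described above.

```python
def to_pgvector_literal(vec: list[float]) -> str:
    """Format a Python list in pgvector's text input form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"

# Hypothetical schema: names are illustrative; 3 dims only to keep the example short.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS wiki_entries (
    id bigserial PRIMARY KEY,
    body text NOT NULL,
    embedding vector(3)
);
"""

# No index is created, so Postgres scans every row: exact nearest-neighbor search.
# `<=>` is pgvector's cosine-distance operator; smaller means closer.
NEAREST_SQL = """
SELECT id, body
FROM wiki_entries
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

# Parameters a driver would bind to NEAREST_SQL: the query vector and k.
params = (to_pgvector_literal([0.1, 0.2, 0.3]), 5)
```

Moving to approximate search later would mean adding an index on `embedding`; the query itself would stay the same.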

The temptation in this space is to over-engineer. There's a parade of vendors claiming a vector DB is the most important infrastructure decision an AI app makes. It usually isn't. The most important decision is what's in the corpus — see RAG. The vector DB is the index on top of the asset; the index can be primitive and the system still works.

Pick the simplest one that fits the corpus size. Move when you have to.