If you've spent any time around AI infrastructure in the last couple of years, you've heard the phrase "vector database" thrown around like it's the answer to everything. Sometimes it actually is. Sometimes a Postgres table would do just fine. The trick is knowing the difference, and to know the difference you need to understand what these things actually do under the hood.
So let's talk about it like normal people.
The one-sentence definition
A vector database (also called a vector store or vector search engine) is a database that stores and retrieves embeddings of data in vector space [Source 1]. That's the whole idea. Instead of rows of strings and numbers you look up by exact match, you have a pile of high-dimensional vectors, and you ask the database "what's near this other vector?"
That shift, from exact lookup to nearness, is the entire point. Traditional databases primarily look up records by exact match [Source 1]. Vector databases primarily look up records by similarity. Different question, different tool.
What's an embedding, exactly?
An embedding is a vector, just a long list of floating-point numbers, that represents some piece of data: a sentence, an image, an audio clip, a product, a user. The numbers themselves are meaningless to a human. What matters is the geometry. Things that are semantically similar end up close together in vector space. Things that are unrelated end up far apart.
A model produces these vectors. You feed in "a golden retriever puppy" and out comes something like [0.021, -0.443, 0.119, ...] with a few hundred or a few thousand dimensions. Feed in "a labrador puppy" and you get a different vector, but one that lives in roughly the same neighborhood. Feed in "quarterly tax filing deadlines" and you land somewhere very far away.
The vector database's job is to store all those vectors and answer questions about which ones are near each other.
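"Near" here usually means cosine similarity or Euclidean distance. Here's a minimal sketch of cosine similarity with hand-made toy vectors standing in for real embeddings (a real model would produce hundreds or thousands of dimensions, and the specific numbers below are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means
    'pointing the same way', close to 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors. Real embeddings are much longer, but the
# geometry works the same way.
golden_retriever = [0.9, 0.8, 0.1, 0.0]
labrador = [0.85, 0.75, 0.2, 0.05]
tax_deadlines = [0.0, 0.1, 0.9, 0.95]

print(cosine_similarity(golden_retriever, labrador))       # high (same neighborhood)
print(cosine_similarity(golden_retriever, tax_deadlines))  # low (far apart)
```

The two puppy vectors score near 1.0; the tax vector scores near 0. That gap is the geometry the database exploits.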
Why "approximate" nearest neighbor?
Here's where it gets interesting. If you have a thousand vectors, finding the closest one to a query is easy. Compare the query to all thousand, sort by distance, done. That's exact nearest neighbor.
Now imagine you have a hundred million vectors, each with 1536 dimensions. Comparing the query to every single one for every search is a non-starter. You'd burn a CPU just to answer one question.
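The easy case, exact nearest neighbor, really is just a full scan. A sketch, with toy 2-d vectors:

```python
import math

def exact_nearest(query, vectors, k=1):
    """Exact k-NN: compare the query against every stored vector.
    O(n * d) per search -- fine for a thousand vectors, hopeless
    for a hundred million."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))
    return ranked[:k]

vectors = [[float(i), float(i)] for i in range(1000)]  # 1,000 toy vectors
print(exact_nearest([499.4, 499.4], vectors, k=3))  # → [499, 500, 498]
```

Swap the thousand for a hundred million and that `sorted` over every vector is exactly the full scan you can't afford.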
This is why vector databases typically implement approximate nearest neighbor algorithms [Source 1]. The word "approximate" is doing real work there. Instead of guaranteeing you the exact closest match, these algorithms guarantee something close to the closest match, very fast. You trade a tiny bit of accuracy for orders of magnitude in speed. For most applications (search, recommendations, RAG) that tradeoff is invisible to the user and absolutely necessary at scale.
The specific algorithms have names like HNSW, IVF, and product quantization. You don't need to know the internals to use a vector database any more than you need to know B-tree internals to use Postgres. But it's worth knowing they exist, because index choice affects recall, latency, and memory.
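To make "approximate" concrete, here is a toy version of the IVF idea: partition vectors into clusters, then at query time scan only the few clusters nearest the query. This is a sketch of the concept, not any library's actual implementation; real IVF uses trained k-means centroids, while this one just samples centroids from the data:

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVF:
    """Toy inverted-file (IVF) index: bucket each vector under its
    nearest centroid, then search only the closest `nprobe` buckets."""

    def __init__(self, vectors, n_clusters=10, seed=0):
        random.seed(seed)
        self.centroids = random.sample(vectors, n_clusters)
        self.buckets = [[] for _ in self.centroids]
        for idx, v in enumerate(vectors):
            nearest = min(range(n_clusters), key=lambda c: dist(v, self.centroids[c]))
            self.buckets[nearest].append((idx, v))

    def search(self, query, k=1, nprobe=2):
        # Rank clusters by centroid distance, scan only the top `nprobe`.
        order = sorted(range(len(self.centroids)), key=lambda c: dist(query, self.centroids[c]))
        candidates = [pair for c in order[:nprobe] for pair in self.buckets[c]]
        candidates.sort(key=lambda pair: dist(query, pair[1]))
        return [idx for idx, _ in candidates[:k]]

vectors = [[float(i)] for i in range(1000)]
index = ToyIVF(vectors, n_clusters=20)
print(index.search([123.2], k=3, nprobe=2))  # probably the true neighbors, not guaranteed
```

Raising `nprobe` trades speed back for recall; probe every cluster and you're doing exact search again. That knob is the accuracy/speed tradeoff in miniature.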
What do people actually use this for?
The headline use cases are similarity search, semantic search, multi-modal search, recommendation engines, object detection, and retrieval-augmented generation [Source 1]. Let's unpack a few of those, because the names sound like marketing until you see them in action.
Semantic search. Old-school search matched keywords. If a user typed "car" and your document said "automobile," you got nothing. Embed both, and they land near each other in vector space. Search the vectors, and "automobile" comes back as a top hit even though the literal word "car" never appears. That's semantic search.
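In miniature, with hand-made toy vectors standing in for a real embedding model's output (the numbers are invented; the point is only that "car" and "automobile" land close together):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy vectors. A real system would get these from an embedding model.
doc_vectors = {
    "Our automobile insurance covers collisions.": [0.9, 0.1, 0.0],
    "Quarterly tax filing deadlines for 2024.":    [0.0, 0.1, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend this is embed("car insurance")

best = max(doc_vectors, key=lambda doc: cosine(query_vector, doc_vectors[doc]))
print(best)  # the automobile doc wins despite zero keyword overlap with "car"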
Multi-modal search. Some embedding models embed text and images into the same vector space. You can search a photo library with a text query like "sunset over mountains" and get matching images, because the text vector and the image vectors live in the same neighborhood [Source 1]. Same idea, different modalities.
Recommendations. Embed every product. Embed every user (or every user's recent activity). Find the products whose vectors are closest to the user's vector. That's a recommendation engine in roughly fifteen lines of pseudocode [Source 1].
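Here's that sketch, give or take: toy vectors standing in for embeddings of purchase history and product descriptions (the products and numbers are invented for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recommend(user_vector, product_vectors, k=3):
    """Rank products by similarity to the user's taste vector."""
    ranked = sorted(product_vectors,
                    key=lambda pid: cosine(user_vector, product_vectors[pid]),
                    reverse=True)
    return ranked[:k]

# Toy 2-d vectors; a real system would embed descriptions, clicks, purchases.
products = {"hiking boots": [0.9, 0.1], "tent": [0.8, 0.3], "blender": [0.1, 0.9]}
outdoorsy_user = [0.95, 0.2]
print(recommend(outdoorsy_user, products, k=2))  # → ['hiking boots', 'tent']
```

The vector database's contribution is making that `sorted` over every product fast when "every product" is millions of items.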
Object detection. Vision models often produce embeddings for detected regions of an image. Comparing those embeddings against a database of known-object embeddings lets you classify or match what you're seeing [Source 1].
Retrieval-augmented generation (RAG). This is the one that made vector databases famous. An LLM has a fixed knowledge cutoff and a finite context window. If you want it to answer questions about your company's internal docs, you can't fine-tune it every time someone updates a wiki page. Instead, you embed all the docs, store the vectors, and at query time embed the user's question, fetch the top few relevant chunks from the vector database, and stuff them into the LLM's prompt. The LLM gets fresh, specific context. The vector database is the retrieval half of retrieval-augmented generation [Source 1].
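The retrieval half can be sketched end to end. Everything below is a toy stand-in: hand-made vectors instead of an embedding model, a Python list instead of a real vector store, and the prompt just gets printed instead of sent to an LLM:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy in-memory "vector store": (vector, text chunk) pairs.
store = [
    ([0.9, 0.1, 0.0], "Employees accrue 20 vacation days per year."),
    ([0.1, 0.9, 0.0], "The VPN requires two-factor authentication."),
    ([0.0, 0.2, 0.9], "Expense reports are due by the 5th of each month."),
]

def retrieve(query_vector, k=2):
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question, query_vector):
    """The 'augmented' part of RAG: retrieved chunks become prompt context."""
    context = "\n".join(retrieve(query_vector))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Pretend [0.85, 0.2, 0.1] is embed("How many vacation days do I get?")
print(build_prompt("How many vacation days do I get?", [0.85, 0.2, 0.1]))
```

The vacation-policy chunk ranks first, lands in the prompt, and the LLM answers from it. Update the wiki page, re-embed that one chunk, and the next query sees the new text with no fine-tuning anywhere.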
RAG is why every infra company suddenly added a vector product in 2023. It's the connective tissue between your data and the model.
How is this different from a regular database?
A regular relational database is built around exact-match and range queries on structured columns. Find the user with id 42. Find all orders placed after Tuesday. Join customers to invoices. The query planner is optimizing for those access patterns, and the indexes (B-trees, hash indexes) are built for them.
A vector database is built around "find the K nearest vectors to this one." The indexes are entirely different beasts. The storage layout is different. The query language often looks different too, though many vector databases now bolt on filters so you can say "find the nearest vectors to this query, but only among documents tagged 'engineering' and updated in the last 30 days." That hybrid pattern is everywhere now.
The distinction in [Source 1] is clean: traditional databases look up by exact match, vector databases look up by semantic similarity. Both are valid. Most real systems end up using both.
Do you actually need a dedicated vector database?
Fair question, and not always yes.
Postgres has the pgvector extension. SQLite has vector extensions. Elasticsearch and OpenSearch have vector fields. Redis has a vector module. If your dataset is small (say, under a million vectors) and your latency budget is generous, an extension on a database you already run is often the right call. One less moving piece, one less thing to operationalize.
Dedicated vector databases (Pinecone, Weaviate, Milvus, Qdrant, and so on) start to pull ahead when you have tens or hundreds of millions of vectors, when you need very low latency at high QPS, when you want sophisticated filtering combined with vector search, or when you want managed sharding and replication tuned specifically for ANN workloads.
The rule of thumb: start with what you have. Move to a dedicated system when the existing one starts hurting. Don't adopt new infrastructure because a blog post said you should.
A mental model for how a query works
Imagine the inside of a vector database during a search:
1. You send a query vector, maybe with some metadata filters.
2. The database uses its ANN index to quickly identify a candidate set of vectors that are probably close to the query [Source 1]. "Probably" is the operative word. The index has structure (a graph, a set of clusters, a tree) that lets it skip the vast majority of the dataset.
3. It computes exact distances between the query and that candidate set.
4. It applies any metadata filters ("only documents from this user," "only items in stock").
5. It returns the top K results, sorted by distance.
The whole thing typically completes in single-digit milliseconds even on large datasets. That's the magic of approximate nearest neighbor [Source 1]. You're not searching everything. You're searching a smartly chosen slice and trusting the index to have done its job.
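The pipeline above can be sketched in miniature. The "index" here is a fake that just drops far-away records; a real one (an HNSW graph, IVF clusters) does that pruning with actual structure, and everything else is invented toy data:

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Each record: (id, vector, metadata).
records = [
    (1, [0.1, 0.1], {"team": "engineering"}),
    (2, [0.2, 0.1], {"team": "sales"}),
    (3, [0.15, 0.12], {"team": "engineering"}),
    (4, [0.9, 0.9], {"team": "engineering"}),
]

def ann_candidates(query):
    # Stand-in for the ANN index: prune records far from the query's region.
    return [r for r in records if dist(query, r[1]) < 0.5]

def search(query, k=2, filters=None):
    candidates = ann_candidates(query)            # candidate set from the "index"
    scored = [(dist(query, vec), rid, meta)       # exact distances on candidates
              for rid, vec, meta in candidates]
    if filters:                                   # metadata filters
        scored = [s for s in scored if all(s[2].get(f) == v for f, v in filters.items())]
    scored.sort()                                 # top K, sorted by distance
    return [rid for _, rid, _ in scored[:k]]

print(search([0.12, 0.1], k=2, filters={"team": "engineering"}))  # → [1, 3]
```

Record 4 never even gets a distance computed; the candidate step threw it out. That skipped work, multiplied across millions of records, is where the milliseconds come from.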
What you store alongside the vectors
A vector by itself isn't useful. You need to know what it represents. So vector databases also store metadata: a document ID, the original text chunk, a URL, timestamps, tags, user IDs, whatever your application needs.
When the database returns the top K nearest vectors, you usually want the metadata too. "Here are the five document chunks most similar to the user's question, along with their source URLs and the text itself." That payload is what you hand to the LLM in a RAG pipeline, or what you render in a search UI.
Good vector databases let you filter on this metadata efficiently, combined with the vector search. That combination is harder than it sounds. Filtering before the ANN search can break the index's assumptions. Filtering after can leave you with too few results. Different systems solve it differently, and it's worth reading the docs of whichever one you pick.
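The post-filtering failure mode is easy to see in miniature. Assume the index has already returned its top 5 for some query (the records below are invented), and the filter runs afterwards:

```python
# Post-filtering: the ANN index returns the k nearest first, the metadata
# filter runs second. A selective filter can leave you short, even though
# matching records exist further out in the index.
nearest_first = [  # pretend this is the index's top-5 for some query
    {"id": 1, "tag": "sales"},
    {"id": 2, "tag": "sales"},
    {"id": 3, "tag": "engineering"},
    {"id": 4, "tag": "sales"},
    {"id": 5, "tag": "sales"},
]

k = 3
post_filtered = [r for r in nearest_first if r["tag"] == "engineering"][:k]
print(len(post_filtered))  # 1 result where the caller asked for 3
```

Pre-filtering avoids that, but restricting the dataset first can invalidate the shortcuts the ANN index relies on. Systems that do "filtered vector search" well are solving exactly this tension.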
A few things to watch out for
Embedding model choice matters more than database choice. The quality of your search is bounded by the quality of your embeddings. A great vector database with mediocre embeddings will give you mediocre results. Spend time picking a model that fits your domain.
Dimensionality has costs. A 3072-dimensional embedding takes more memory and compute than a 768-dimensional one. Some applications need the extra fidelity. Many don't.
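The cost is easy to estimate. A back-of-envelope calculation for raw float32 storage (4 bytes per dimension), ignoring index overhead, which adds more on top:

```python
def raw_size_gb(n_vectors, dims, bytes_per_dim=4):
    """Raw float32 vector storage, in gigabytes. Index structures and
    metadata are extra."""
    return n_vectors * dims * bytes_per_dim / 1e9

# 100 million vectors at two common embedding widths:
print(raw_size_gb(100_000_000, 768))   # → 307.2 GB
print(raw_size_gb(100_000_000, 3072))  # → 1228.8 GB
```

Quadrupling the dimensions quadruples the memory bill (and the distance-computation cost), so the extra fidelity has to earn its keep.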
Re-embedding is expensive. If you switch embedding models, every vector in your database becomes incompatible with the new ones. You have to re-embed your entire corpus. Plan for this.
Recall isn't guaranteed. Approximate means approximate [Source 1]. Tune your index parameters and measure recall against a ground-truth set if accuracy matters for your use case.
Wrapping up
A vector database is a database optimized for one specific question: what's near this vector? It uses approximate nearest neighbor algorithms to answer that question fast, even on huge datasets [Source 1]. It powers semantic search, multi-modal search, recommendations, object detection, and the retrieval half of RAG [Source 1].
It's not a replacement for your existing database. It's a complement. Think of it as the right tool when your access pattern is "find me things like this" instead of "find me the thing with this exact ID." Once you internalize that distinction, the rest is engineering details: which index, which embedding model, which provider, how to combine vector search with metadata filters.
Build something with one. That's the fastest way to actually understand it. Embed a few hundred documents, put them in a vector store, and ask questions. The first time semantic search gives you back a result that contains none of your query's keywords but is exactly what you meant, you'll get it.