7 Building Blocks Framework

What Is a Vector Database? A Plain Explanation (and When You Actually Need One)

Every AI product that chats with your documents has one underneath it. Here is what a vector database actually is, how it works, and when the job genuinely needs one — without the math.

A few years ago, almost nobody outside a handful of search teams had heard the words "vector database." Now they are everywhere. Every AI product that "chats with your documents," every "find me more like this" recommendation, every semantic search box that understands what you meant instead of the exact words you typed, has one of these sitting underneath it.

So it is worth answering the question plainly: what is a vector database, and why did it suddenly become essential?

Most explanations reach for math you do not need yet. High-dimensional spaces, cosine distance, approximate nearest-neighbor indexes. You can understand the whole idea without any of that, and you should, because the point is not the math. The point is knowing when this is the right block to reach for and when it is not.

This is a deep-dive on one of the 7 building blocks: the small set of reusable parts that every system, from Instagram to Stripe to the AI app you are about to build, is made of. The vector database is the newest of the seven, and the one the AI era made essential. It was still one of the original seven, though: it was named before the AI boom made it unavoidable, not added in response to it. Here is what it is, how it works, and the judgment call that actually matters.

What a vector database is, plainly

A vector database is storage that answers one specific question: "what is most similar to this?"

That sounds small. It is not. Notice what it is not doing. It is not answering "what is the exact value for this key." It is not answering "give me every record where the price is under twenty dollars." It answers "find me the things that are most like this thing," ranked by how close they are.

The cleanest way to understand it is to put it next to the other storage blocks, because each one answers a different shape of question.

A Key-Value Store answers "what is the value for this exact key?" You hand it user:1042, it hands back that user's record, in under a millisecond. It is fast and it is literal. It only knows exact keys.
A Relational Database answers "what records match these conditions, and how do they relate?" Users, posts, who follows whom. It is built for structured records and the relationships between them. You can ask it complex, exact questions across many records.
A File Store answers "give me the big file at this address." Photos, videos, PDFs. It stores and serves large blobs. It cannot tell you anything about what is inside them.
A Vector Database answers "what is most similar to this, by meaning?" Not by exact match. Not by a filter. By closeness of meaning.

That last one is the gap the other three leave open. Your relational database can find the document titled exactly "Refund Policy." It cannot find the document that is about refunds but never uses that word. Your key-value store can find the product with the exact ID you asked for. It cannot find the products that are like it. Similarity by meaning is a question none of the older blocks were built to answer, and that gap is the entire reason vector databases exist.
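To see that gap concretely, here is a minimal sketch using only the Python standard library. A dict stands in for the Key-Value Store and an in-memory SQLite table stands in for the Relational Database. The record, table name, and document texts are made up for illustration.

```python
import sqlite3

# Key-Value Store stand-in: a plain dict. It only knows exact keys.
users = {"user:1042": {"name": "Ada"}}  # hypothetical record

# Relational Database stand-in: an in-memory SQLite table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents (title, body) VALUES (?, ?)",
    [
        ("Refund Policy", "Refunds are issued within 14 days."),
        # About refunds, but never uses the word "refund".
        ("Getting Your Money Back", "Return the item and we will reimburse you."),
    ],
)

# Exact key: found instantly. Any other key: nothing.
exact_key = users.get("user:1042")
missing_key = users.get("user:1043")

# Exact title match: found.
exact_title = conn.execute(
    "SELECT title FROM documents WHERE title = ?", ("Refund Policy",)
).fetchall()

# Keyword match: still only finds rows that literally contain "refund".
keyword_hits = conn.execute(
    "SELECT title FROM documents WHERE title LIKE ? OR body LIKE ?",
    ("%refund%", "%refund%"),
).fetchall()
```

The second document is about refunds, yet neither query surfaces it. That missing row is the question a vector database exists to answer.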

How a vector database works (no heavy math)

Here is the whole idea in three steps.

Step one: turn things into vectors. You take your data, a paragraph of text, an image, an audio clip, and you run it through a model called an embedding model. That model gives you back a list of numbers, usually a few hundred to a few thousand of them. That list is called an embedding, or a vector. The important property is this: things that mean similar things get similar lists of numbers. The embedding for "dog" lands near the embedding for "puppy," and both land far from the embedding for "tax return." Meaning gets turned into numbers.
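A toy illustration of that property, assuming hand-picked three-number vectors rather than output from a real embedding model (a real one would return hundreds or thousands of numbers). The values below are invented so that "dog" and "puppy" sit close together; straight-line distance is used here as the simplest notion of "near."

```python
import math

# Hypothetical, hand-made "embeddings". A real embedding model would
# produce these numbers; nothing here is computed from the words.
toy_embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "tax return": [0.1, 0.05, 0.95],
}


def distance(a: str, b: str) -> float:
    """Straight-line (Euclidean) distance between two toy embeddings."""
    return math.dist(toy_embeddings[a], toy_embeddings[b])


dog_to_puppy = distance("dog", "puppy")
dog_to_tax = distance("dog", "tax return")

# Similar meanings land near each other; unrelated ones land far apart.
print(dog_to_puppy < dog_to_tax)
```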

Step two: store the vectors. The vector database holds all of those number-lists, each one tagged with an ID and some metadata (which user it belongs to, when it was created, what document it came from). In the actual code for this building block, storing a vector is exactly that: an ID, the list of numbers, and an optional bag of metadata. The code even checks that every vector has the same length, because you cannot compare lists of numbers that are different sizes. All your vectors have to live in the same space to be comparable.
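A minimal sketch of that storage step might look like the following. This is not the building block's actual code; the names `VectorRecord`, `TinyVectorStore`, and `upsert` are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class VectorRecord:
    """One stored item: an ID, the list of numbers, optional metadata."""
    id: str
    vector: List[float]
    metadata: Dict[str, Any] = field(default_factory=dict)


class TinyVectorStore:
    """Hypothetical in-memory store; every vector must share one dimension."""

    def __init__(self, dimension: int) -> None:
        self.dimension = dimension
        self.records: Dict[str, VectorRecord] = {}

    def upsert(
        self,
        id: str,
        vector: List[float],
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Vectors of different lengths cannot be compared, so reject them.
        if len(vector) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} numbers, got {len(vector)}"
            )
        self.records[id] = VectorRecord(id, list(vector), dict(metadata or {}))


store = TinyVectorStore(dimension=3)
store.upsert("doc-1", [0.1, 0.2, 0.3], {"source": "handbook.pdf"})
```

Calling `store.upsert("doc-2", [0.1, 0.2])` would raise a `ValueError`, because a two-number vector does not live in the same space as the three-number ones.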

Step three: search by closeness. When you want to find similar things, you turn your query into a vector the same way, then you ask the database for the stored vectors closest to it. "Closest" is measured geometrically. The reference implementation uses cosine similarity, which is just a way of asking "are these two lists of numbers pointing in the same direction?" The database compares your query against the stored vectors, ranks them, and hands back the top few: the nearest neighbors. (At scale, production databases use specialized indexes to do this approximately, trading a little accuracy for speed, but the idea is the same.) That is the search. That is the whole trick: meaning becomes numbers, and "find similar" becomes "find the closest numbers."
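Here is a small sketch of that search step, assuming a brute-force comparison against every stored vector (the exact version, not the approximate indexes production databases use). The function names and the toy vectors are illustrative, not taken from any particular library.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """1.0 means same direction, 0.0 means unrelated (perpendicular)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    query: List[float], stored: Dict[str, List[float]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the top few."""
    scored = [
        (cosine_similarity(query, vector), item_id)
        for item_id, vector in stored.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Hand-made toy vectors, standing in for real embeddings.
stored = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "tax return": [0.1, 0.05, 0.95],
}
results = nearest_neighbors([0.88, 0.85, 0.1], stored, top_k=2)
```

With these toy numbers, the two results are "dog" and "puppy," and "tax return" is left out. Comparing against every vector is fine for a few thousand items; the approximate indexes exist because it stops being fine at millions.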

You never have to compute any of that by hand. But now you know what is happening when an AI product "understands" your search. It is not understanding anything. It is turning your words into a point and handing you back the nearest points.

When to reach for a vector database (and when not to)

This is the part that matters, because the skill is not "knowing what a vector database is." It is knowing when the job actually needs one.

Reach for it when the question is "find similar," not "find exact." A few clear cases:

Semantic search. People search by concept, not by exact keyword. They type "how do I get my money back" and you need to surface the refund policy that never says "money back."
Retrieval for AI chat (RAG). This is the big one and the reason the block went mainstream. When you want an AI to answer questions over your documents, you do not stuff every document into the prompt. You find the handful of most relevant ones first, with a similarity search, and feed only those to the model. The vector database is the "find the relevant ones" step.
Recommendations. "Show me more like this." Turn every item and every user's taste into a vector, then "find things like the ones they liked" becomes a closeness query.
Deduplication and similarity. Finding near-duplicate records, clustering related items, flagging the one thing that does not look like the others.
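The retrieval step behind semantic search and RAG can be sketched in a few lines. The big assumption here is `toy_embed`: it is a hypothetical stand-in for a real embedding model and simply counts words from hand-picked concept lists. The passages, IDs, and prompt wording are also invented for illustration.

```python
import math
from typing import Dict, List

# Hypothetical stand-in for an embedding model: one number per concept,
# counting how many of that concept's words appear in the text.
CONCEPTS = {
    "refunds": {"refund", "refunds", "money", "back", "reimburse"},
    "shipping": {"shipping", "delivery", "ship", "arrive"},
    "accounts": {"password", "login", "account"},
}


def toy_embed(text: str) -> List[float]:
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(sum(w in vocab for w in words)) for vocab in CONCEPTS.values()]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


passages: Dict[str, str] = {
    "refund-policy": "We reimburse returned items within 14 days.",
    "shipping-times": "Standard delivery takes 3 to 5 days to arrive.",
    "password-reset": "Reset your password from the login page.",
}
# Index ahead of time: one vector per passage.
index = {pid: toy_embed(text) for pid, text in passages.items()}


def retrieve(question: str, top_k: int = 1) -> List[str]:
    """Return the IDs of the passages most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda pid: cosine_similarity(q, index[pid]), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, top_k: int = 1) -> str:
    """Feed only the retrieved passages to the model, not every document."""
    context = "\n".join(passages[pid] for pid in retrieve(question, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


top = retrieve("How do I get my money back?")
```

The question never says "reimburse" and the passage never says "money back," yet the refund passage is the one retrieved, because both map to the same concept. A real embedding model learns that mapping instead of having it written by hand.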

Do not reach for it when an older block already answers the question. This is where people overbuild.

If you are looking something up by an exact key, that is a Key-Value Store, not a vector database. Do not turn a user-ID lookup into a similarity search.
If you are filtering structured records, by date, by price, by status, that is a Relational Database. Exact filters and joins are what it is built for. A vector database ranks by closeness and is not designed for either.
If you only ever need exact matches, you do not need this block at all. Adding it is cost and complexity for a question you were never going to ask.

The honest rule: a vector database is narrow-purpose. It is excellent at one thing, similarity by meaning, and it does not replace your other storage. Most real AI products use a vector database and a relational database together, each doing the job it is built for. Knowing which block the job needs is the actual engineering judgment. The block is the answer. The shape of your question is what tells you which block.

Where it fits in the 7 building blocks

A vector database almost never works alone. It is one storage block in a small cast, and the way it pairs with the others is itself a reusable pattern.

A Service sits in front of it: it receives the user's query, turns that query into a vector, and asks the vector database for the nearest neighbors. The user is waiting, so this part has to be fast.
A Worker keeps it fed. Generating embeddings is slow background work, and nobody is waiting on it, so it does not belong on the request path. When a new document arrives, a Worker (usually pulling from a Queue) generates its embedding and writes it into the vector database. The data gets indexed ahead of time, in the background.
A source of truth lives alongside it. The vector database holds the meaning of your data for similarity search, but the real records, the document text, the product details, the user accounts, live in a Relational Database or a File Store. You search by similarity in one block, then fetch the full record from another.

That is the RAG and recommendations shape in one sentence: a Service to ask the question, a vector database to find what is similar, a Worker and Queue to keep it indexed, and a relational or file store holding the truth. Same blocks you already know, arranged for a new kind of question.
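That shape can be sketched with standard-library pieces. Everything here is a stand-in under stated assumptions: dicts play the source of truth and the vector database, `queue.Queue` plays the Queue, the Worker is drained by hand instead of running in the background, and `toy_embed` is a hypothetical word-counting substitute for a real embedding model.

```python
import math
import queue
from typing import Dict, List

documents: Dict[str, str] = {}             # source of truth stand-in
vector_index: Dict[str, List[float]] = {}  # vector database stand-in
embedding_jobs: "queue.Queue[str]" = queue.Queue()  # Queue block

# Hypothetical embedding: one count per hand-picked concept.
CONCEPTS = [
    {"refund", "reimburse", "money", "back"},
    {"delivery", "shipping", "arrive"},
]


def toy_embed(text: str) -> List[float]:
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(sum(w in vocab for w in words)) for vocab in CONCEPTS]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ingest(doc_id: str, text: str) -> None:
    """Save the real record now; leave the slow embedding for the Worker."""
    documents[doc_id] = text
    embedding_jobs.put(doc_id)


def run_worker() -> None:
    """Worker: drain the queue, embed each document, write to the index."""
    while not embedding_jobs.empty():
        doc_id = embedding_jobs.get()
        vector_index[doc_id] = toy_embed(documents[doc_id])
        embedding_jobs.task_done()


def handle_query(question: str, top_k: int = 1) -> List[str]:
    """Service: embed the query, find neighbors, fetch the full records."""
    q = toy_embed(question)
    ranked = sorted(
        vector_index,
        key=lambda doc_id: cosine_similarity(q, vector_index[doc_id]),
        reverse=True,
    )
    return [documents[doc_id] for doc_id in ranked[:top_k]]


ingest("refund-policy", "We reimburse returned items within 14 days.")
ingest("shipping-times", "Standard delivery takes 3 to 5 days to arrive.")
before = handle_query("How do I get my money back?")  # nothing indexed yet
run_worker()
after = handle_query("How do I get my money back?")
```

Before the Worker runs, the search finds nothing even though the documents are already saved. That delay is the cost of keeping embedding work off the request path, and it is usually the right trade.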

The block is new. The skill is the same.

The vector database is the one building block the AI era genuinely made essential. It is new, it is everywhere, and it is worth understanding for its own sake.

But notice that understanding it did not require learning a new way of thinking. You asked the same questions you would ask of any block. What shape is this data? What question am I actually asking of it, exact or similar? Which part of the system is waiting, and which part can happen in the background? Those questions are what told you when to reach for a vector database and when a relational database was already the right answer.

That is the whole skill, and it does not churn when the tools do. New blocks will keep arriving. The vector database was one. There will be others. AI can write the code for any of them. The judgment about which block a job actually needs, and why, is the part that lasts, and it is yours to build.

What to Explore Next

The vector database sits alongside the other storage blocks: the three storage extremes that decide File Store vs. Relational vs. Key-Value, and the Service and Worker split plus the Queue that keeps it indexed in the background. For real systems traced end to end, see how Instagram actually works and how Uber works, which leans on the storage blocks you just met for speed instead of similarity. And when you want to choose the right blocks for your own design, the decision framework pulls all of it into a practical guide.

Frequently Asked Questions

What is a vector database?

A vector database is a type of storage built to answer one question: "what is most similar to this?" Instead of looking data up by an exact key or filtering it by exact conditions, it finds items by closeness of meaning. It does this by storing data as vectors, which are lists of numbers (called embeddings) produced by an AI model, where things that mean similar things get similar numbers. When you search, the database returns the stored vectors closest to your query. It is the storage block used for semantic search, recommendations, and AI systems that answer questions over your own documents.

How does a vector database work?

It works in three steps. First, an embedding model turns each piece of data (text, an image, audio) into a vector: a list of a few hundred to a few thousand numbers that captures its meaning, so similar things get similar numbers. Second, the database stores those vectors, each with an ID and optional metadata, all in the same fixed dimension so they can be compared. Third, to search, your query is turned into a vector too, and the database finds the stored vectors closest to it, usually measured by cosine similarity (whether two vectors point in the same direction), and returns the top few nearest neighbors. At scale, production databases find those nearest neighbors approximately, using specialized indexes for speed, but the idea is unchanged. In short: meaning becomes numbers, and finding similar things becomes finding the closest numbers.

When do you need a vector database?

You need one when your core question is "find things similar to this," not "find this exact thing." The clearest cases are semantic search (people search by concept, not exact keywords), retrieval for AI chat over your own documents (finding the most relevant passages to feed a model, the pattern called RAG), recommendations ("show me more like this"), and deduplication or clustering by similarity. You do not need one for exact-key lookups (use a Key-Value Store) or for filtering structured records by date, price, or status (use a Relational Database). If you only ever need exact matches, adding a vector database is cost and complexity for a question you will never ask.

Vector database vs regular database: what is the difference?

A regular database, whether relational or key-value, answers exact questions: "give me the record with this ID" or "give me every order over fifty dollars." It matches on precise keys and conditions. A vector database answers a similarity question: "give me the items whose meaning is closest to this one," ranked by how close they are. A relational database would never find a document that is about refunds but never uses the word "refund"; a vector database is built to find exactly that. They are not competitors. Most real systems use both, the regular database as the source of truth for exact records, the vector database as the index for similarity search, each doing the job it was built for.

Don't just read it — build with it

You can't read your way to judgment. Put this block into practice: map your own app with Design with Blocks, or play the building blocks game with Instagram, Netflix, and Uber.
