Vector databases for retrieval — AIF-C01
Learn how vector databases power RAG pipelines for the AWS AIF-C01 exam: semantic search, embeddings, AWS services, and the fine-tuning misconception.
WHAT IT IS
A vector database is a purpose-built data store that saves data as high-dimensional numerical vectors (embeddings) and enables fast lookup of the nearest neighbors in that high-dimensional space. In a retrieval-augmented generation (RAG) system, it serves as the external knowledge layer: an embedding model converts your source documents into vectors, those vectors are stored and indexed in the vector database, and at query time the user's question is converted to a vector and matched against the index to surface the most semantically relevant passages.
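The index-then-query flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any AWS service is implemented: `embed` is a hypothetical stand-in for a real embedding model (it simply counts words per hand-picked concept), and `TinyVectorStore` is an invented in-memory class that does a brute-force scan instead of using a real index.

```python
import math

# Hypothetical stand-in for an embedding model: each dimension counts words
# that belong to one concept, so synonyms land on the same axis.
CONCEPTS = [
    {"car", "automobile", "vehicle", "sedan"},
    {"refund", "reimbursement", "return"},
    {"password", "login", "credentials"},
]

def embed(text: str) -> list[float]:
    """Turn text into a toy 3-dimensional vector (one count per concept)."""
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(sum(w in concept for w in words)) for concept in CONCEPTS]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class TinyVectorStore:
    """Illustrative in-memory store: keeps (text, vector) pairs and scans them all."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, text: str) -> None:
        # Indexing step: embed the document and keep the vector next to the text.
        self._items.append((text, embed(text)))

    def query(self, question: str, k: int = 1) -> list[str]:
        # Query step: embed the question, then rank stored vectors by similarity.
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("Reimbursement requests are processed within five days.")
store.add("Reset your credentials from the account page.")
store.add("Our sedan lineup ships next spring.")

top = store.query("How do I get a refund?", k=1)
print(top)
```

The question never uses the word "reimbursement," yet that passage ranks first because both texts map to the same toy concept dimension. A production system would likely use a learned embedding model and an approximate nearest-neighbor index rather than a full scan.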
Mental model
Think of a library where every book has been distilled into a point on a giant map. Books about similar topics cluster together. When you ask a question, the system plots your question on that same map and hands you the books whose points are closest — regardless of whether they share any keywords with your question. That "find the closest points" operation is what a vector database does.
When to use it
The exam tests whether you can distinguish vector-database-backed RAG from fine-tuning as approaches to grounding a foundation model in domain-specific knowledge.
| Dimension | RAG with a vector database | Fine-tuning |
|---|---|---|
| Knowledge update | Add or replace documents in the vector database; no model retraining required | Requires a new training run to incorporate new information |
| Data that changes | Well-suited for frequently updated content | Less suited; model weights encode a fixed snapshot |
| Data stays external | Source documents remain outside the model | Knowledge is baked into model weights |
| Cost tradeoff | Storage and retrieval compute at inference time | Upfront training compute; lower per-query overhead once deployed |
| Reduces hallucinations | Responses are grounded in retrieved passages, which can be cited | Does not inherently provide per-query grounding |
The blueprint also lists AWS services for storing embeddings: Amazon OpenSearch Service, Amazon Aurora, Amazon Neptune, Amazon DocumentDB (with MongoDB compatibility), and Amazon RDS for PostgreSQL.
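The "Knowledge update" row in the table above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `index`, `upsert`, and `nearest` are hypothetical names, the two-dimensional vectors are hand-written rather than produced by an embedding model, and real services expose their own APIs for the same idea.

```python
import math

# Hypothetical in-memory index: document ID -> (text, embedding vector).
index: dict[str, tuple[str, list[float]]] = {}

def upsert(doc_id: str, text: str, vector: list[float]) -> None:
    """Add a document, or replace it if the ID already exists."""
    index[doc_id] = (text, vector)

def nearest(query_vector: list[float]) -> str:
    """Return the text whose vector has the smallest Euclidean distance to the query."""
    text, _ = min(index.values(), key=lambda item: math.dist(query_vector, item[1]))
    return text

# Hand-written 2-dimensional vectors stand in for real embeddings.
upsert("pricing", "The Basic plan costs $10 per month.", [1.0, 0.0])
upsert("support", "Support is available on weekdays.", [0.0, 1.0])
before = nearest([0.9, 0.1])

# The price changes: replace the stored document. No training run is involved,
# and the next query immediately retrieves the new text.
upsert("pricing", "The Basic plan costs $12 per month.", [1.0, 0.0])
after = nearest([0.9, 0.1])

print(before)
print(after)
```

With fine-tuning, the same price change would require another training run before the model could reflect it.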
COMMON MISCONCEPTION
Vector databases do not perform keyword (lexical) search — they perform similarity search over embeddings.
A traditional relational or keyword-search database matches exact or fuzzy text strings. In vector search, an embedding model converts both the stored content and the incoming query into numerical vectors, and the vector database uses a similarity or distance metric, such as cosine similarity, to rank results by semantic closeness rather than literal word overlap. As a result, two passages that share no words can still rank as highly relevant if their embeddings are close together in vector space.
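To make the contrast concrete, the sketch below scores the same query and passage two ways: by word overlap and by cosine similarity. The three-dimensional vectors are made up for illustration (a real embedding model produces far more dimensions), and `keyword_overlap` is a deliberately simple lexical score, not the ranking function of any particular search engine.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def keyword_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets: a simple lexical score."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

query = "how to fix a flat tire"
passage = "repairing punctured bicycle wheels"

# Made-up embeddings: the query and passage point in a similar direction,
# while the unrelated vector points elsewhere.
query_vec = [0.9, 0.1, 0.2]
passage_vec = [0.8, 0.2, 0.1]
unrelated_vec = [0.1, 0.9, 0.3]

lexical = keyword_overlap(query, passage)                # no shared words
semantic = cosine_similarity(query_vec, passage_vec)     # close in vector space
off_topic = cosine_similarity(query_vec, unrelated_vec)  # far apart

print(f"lexical={lexical:.2f} semantic={semantic:.2f} off_topic={off_topic:.2f}")
```

The lexical score is zero because the two strings share no words, while the cosine score is high because the assumed vectors point in nearly the same direction.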
Candidates sometimes assume that any database holding text can serve as a RAG retrieval layer. The distinguishing requirement is that the store must support k-nearest-neighbor (k-NN) vector search. Some AWS services (for example, Amazon OpenSearch Service) support hybrid search on both keywords and vectors, but the vector search capability is what makes them suitable as a RAG retrieval backend.
A second misconception is treating RAG and fine-tuning as equivalent approaches to the same problem. RAG uses a vector database to supply the model with retrieved context at inference time; fine-tuning changes the model's weights. The blueprint explicitly lists both as distinct customization approaches with different cost tradeoffs.
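The "retrieved context at inference time" half of that distinction can be sketched as plain prompt assembly. `build_prompt` and its prompt wording are hypothetical, and the retrieved passage is invented for the example; the point is only that the model's weights are untouched and the new knowledge arrives as text in the prompt.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt string."""
    # Number the passages so a response could refer back to them.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# In a real pipeline these passages would come from the vector database lookup.
retrieved = ["Employees accrue 1.5 vacation days per month."]
prompt = build_prompt("How many vacation days do I earn each month?", retrieved)
print(prompt)
```

Fine-tuning has no equivalent per-query step: whatever the model learned during training is fixed until the next training run.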
How it shows up on the exam
Task 3.1 asks candidates to define RAG, describe its business applications, and identify AWS services that store embeddings in vector databases. Questions on this topic tend to test whether you understand the role of the vector database within the RAG pipeline, not low-level algorithmic implementation details.
Candidates often confuse the following:
- Semantic search vs. keyword search: a scenario may describe a case where exact-match search fails on paraphrased queries and ask which component resolves this — the answer points to embedding-based similarity search, which is the function a vector database provides.
- RAG vs. fine-tuning: a scenario describing a need to keep an LLM current with rapidly changing internal documents points toward RAG (update the vector database) rather than frequent fine-tuning runs.
- Where embeddings live: the embedding model creates vectors; the vector database stores and indexes them. These are separate components in the pipeline.
Signal phrases to watch for: "semantic search," "embeddings," "similarity search," "knowledge base," "retrieval," "external data source," "reduce hallucinations," and "store vectors."
Sources
Every claim on this page traces to the public exam blueprint and official documentation: