Applications of Foundation Models · AIF-C01 · Task 3.1

Vector databases for retrieval — AIF-C01

Learn how vector databases power RAG pipelines, as covered on the AWS AIF-C01 exam: semantic search, embeddings, AWS services, and the fine-tuning misconception.

WHAT IT IS

A vector database is a purpose-built data store that saves data as high-dimensional numerical vectors (embeddings) and enables fast lookup of the nearest neighbors in that high-dimensional space. In a retrieval-augmented generation (RAG) system, it serves as the external knowledge layer: an embedding model converts your source documents into vectors, those vectors are stored and indexed in the vector database, and at query time the user's question is converted to a vector and matched against the index to surface the most semantically relevant passages.
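The ingest-then-query flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any AWS service is implemented: `ToyVectorStore`, its `upsert` and `query` methods, the document IDs, and the three-dimensional vectors are all hypothetical, and the hand-written vectors stand in for the output of a real embedding model. The brute-force scan is also a simplification; production vector databases typically build an index so they do not compare the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = {}  # doc_id -> (vector, text)

    def upsert(self, doc_id, vector, text):
        # Store a new entry or overwrite an existing one.
        self._items[doc_id] = (vector, text)

    def query(self, query_vector, k=1):
        # Brute-force nearest-neighbor search: score every entry, keep the top k.
        scored = [
            (cosine_similarity(query_vector, vector), doc_id, text)
            for doc_id, (vector, text) in self._items.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


# Ingest: in a real pipeline an embedding model would produce these vectors.
store = ToyVectorStore()
store.upsert("refund-policy", [0.9, 0.1, 0.0], "Customers may return items within 30 days.")
store.upsert("shipping", [0.1, 0.9, 0.1], "Orders ship within two business days.")
store.upsert("careers", [0.0, 0.1, 0.9], "We are hiring engineers.")

# Query: ASSUMED embedding of "How do I get my money back?"
query_vector = [0.8, 0.2, 0.1]
top = store.query(query_vector, k=1)
print(top[0][1], "->", top[0][2])
```

In a RAG system, the text of the top-ranked entries would then be placed into the model's prompt as context for answering the user's question.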

Mental model

Think of a library where every book has been distilled into a point on a giant map. Books about similar topics cluster together. When you ask a question, the system plots your question on that same map and hands you the books whose points are closest — regardless of whether they share any keywords with your question. That "find the closest points" operation is what a vector database does.

When to use it

The exam tests whether you can distinguish vector-database-backed RAG from fine-tuning as approaches to grounding a foundation model in domain-specific knowledge.

Dimension | RAG with a vector database | Fine-tuning
Knowledge update | Add or replace documents in the vector database; no model retraining required | Requires a new training run to incorporate new information
Data that changes | Well-suited for frequently updated content | Less suited; model weights encode a fixed snapshot
Data stays external | Source documents remain outside the model | Knowledge is baked into model weights
Cost tradeoff | Storage and retrieval compute at inference time | Upfront training compute; lower per-query overhead once deployed
Reduces hallucinations | Grounded responses cite retrieved passages | Does not inherently provide per-query grounding
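The "Knowledge update" row can be illustrated with a small sketch. Everything here is hypothetical: the `index` dictionary, the `retrieve` helper, the two-dimensional vectors, and the policy text are invented for illustration. In a real pipeline the revised document would be re-embedded before being stored; the sketch reuses the old vector to keep the example short.

```python
# Hypothetical in-memory index: doc_id -> (embedding, text).
index = {
    "pto-policy": ([0.9, 0.1], "Employees receive 15 days of paid leave."),
    "expense-policy": ([0.1, 0.9], "Submit expenses within 30 days."),
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vector):
    """Return the text of the entry whose vector scores highest against the query."""
    best_id = max(index, key=lambda doc_id: dot(index[doc_id][0], query_vector))
    return index[best_id][1]


# ASSUMED embedding of "How much vacation do I get?"
query_vector = [1.0, 0.0]
before = retrieve(query_vector)

# The policy changes: overwrite the stored entry. No model weights are touched.
index["pto-policy"] = ([0.9, 0.1], "Employees receive 20 days of paid leave.")
after = retrieve(query_vector)

print(before)
print(after)
```

The same query surfaces the new text immediately after the write, which is the property the table contrasts with a fine-tuning run.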

The blueprint also lists AWS services for storing embeddings: Amazon OpenSearch Service, Amazon Aurora, Amazon Neptune, Amazon DocumentDB (with MongoDB compatibility), and Amazon RDS for PostgreSQL.

COMMON MISCONCEPTION

A vector database's core operation is not keyword (lexical) search; it is similarity search over embeddings.

A traditional relational or keyword-search database matches exact or fuzzy text strings. In a vector search system, an embedding model converts both the stored content and the incoming query into numerical vectors, and the vector database uses a similarity or distance measure, such as cosine similarity, to rank results by semantic closeness rather than literal word overlap. As a result, two passages that share no words can still rank as highly relevant if their embeddings are close together in vector space.
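For reference, cosine similarity has a standard definition. The symbols below are notation introduced here for illustration (they do not come from the exam blueprint): q is the query embedding, d is a document embedding, and n is the number of dimensions.

```latex
\operatorname{cos\_sim}(\mathbf{q}, \mathbf{d})
  = \frac{\mathbf{q} \cdot \mathbf{d}}{\lVert \mathbf{q} \rVert \, \lVert \mathbf{d} \rVert}
  = \frac{\sum_{i=1}^{n} q_i d_i}{\sqrt{\sum_{i=1}^{n} q_i^2} \, \sqrt{\sum_{i=1}^{n} d_i^2}}
```

As a small worked case with made-up two-dimensional vectors: for q = (1, 0) and d1 = (1, 1), the dot product is 1 and the norms are 1 and √2, so the similarity is 1/√2 ≈ 0.707. For d2 = (0, 1) the dot product is 0, so the similarity is 0 and d1 ranks above d2. The exam is unlikely to require this arithmetic; it is shown only to make "semantic closeness" concrete.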

Candidates sometimes assume that any database holding text can serve as a RAG retrieval layer. The distinguishing requirement is that the store must support k-nearest-neighbor (k-NN) vector search. Some AWS services (for example, Amazon OpenSearch Service) support hybrid search on both keywords and vectors, but the vector search capability is what makes them suitable as a RAG retrieval backend.

A second misconception is treating RAG and fine-tuning as equivalent approaches to the same problem. RAG uses a vector database to supply the model with retrieved context at inference time; fine-tuning changes the model's weights. The blueprint explicitly lists both as distinct customization approaches with different cost tradeoffs.

How it shows up on the exam

Task 3.1 asks candidates to define RAG, describe its business applications, and identify AWS services that store embeddings in vector databases. Questions on this topic tend to test whether you understand the role of the vector database within the RAG pipeline, not low-level algorithmic implementation details.

Candidates often confuse the following:

  • Semantic search vs. keyword search: a scenario may describe a case where exact-match search fails on paraphrased queries and ask which component resolves this — the answer points to embedding-based similarity search, which is the function a vector database provides.
  • RAG vs. fine-tuning: a scenario describing a need to keep an LLM current with rapidly changing internal documents points toward RAG (update the vector database) rather than frequent fine-tuning runs.
  • Where embeddings live: the embedding model creates vectors; the vector database stores and indexes them. These are separate components in the pipeline.
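The semantic-versus-keyword distinction in the first bullet can be sketched as follows. This is a toy comparison under loud assumptions: the passages, the query, and the two-dimensional vectors are invented, and the vectors are hand-written stand-ins for what an embedding model would produce. The `words` and `cosine_similarity` helpers are hypothetical names, not part of any AWS API.

```python
import math

# Two stored passages (illustrative text).
documents = {
    "returns": "Customers may send back purchases within 30 days for a refund.",
    "shipping": "Orders leave the warehouse within two business days.",
}

# ASSUMPTION: hand-written 2-D vectors standing in for embedding-model output.
doc_vectors = {"returns": [0.9, 0.2], "shipping": [0.2, 0.9]}

query_text = "item return window"
query_vector = [0.85, 0.3]  # ASSUMPTION: pretend embedding of query_text


def words(text):
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Keyword search: keep passages that share at least one whole word with the query.
keyword_hits = [
    doc_id for doc_id, text in documents.items() if words(query_text) & words(text)
]

# Similarity search: pick the passage whose vector is closest to the query vector.
semantic_best = max(
    doc_vectors, key=lambda doc_id: cosine_similarity(query_vector, doc_vectors[doc_id])
)

print(keyword_hits)   # []
print(semantic_best)  # returns
```

The query shares no whole word with either passage, so the keyword match comes back empty, while the vector comparison still surfaces the returns passage. The vectors are supplied separately from the search step, which mirrors the third bullet: creating embeddings and storing or searching them are different components.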

Signal phrases to watch for: "semantic search," "embeddings," "similarity search," "knowledge base," "retrieval," "external data source," "reduce hallucinations," and "store vectors."

CutScore is an independent study tool and is not affiliated with, authorized by, endorsed by, or sponsored by Amazon Web Services. “AWS” and “AWS Certified AI Practitioner” are trademarks of Amazon.com, Inc. or its affiliates. All content is independently authored from the public exam blueprint and official documentation — no real exam content is used.
