CutScore | Fine-tuning foundation models

WHAT IT IS

Fine-tuning is a model customization method that further trains a pre-trained foundation model on additional data, changing the model's weights in the process. The result is a modified model adjusted to perform better on the tasks or domain represented by the training data.

Amazon Bedrock describes supervised fine-tuning as providing labeled data (a training dataset of labeled examples) so that the model learns which kinds of outputs to generate for which kinds of inputs. Amazon SageMaker JumpStart frames fine-tuning as "an affordable way to take advantage of [foundation models'] broad capabilities while customizing a model on your own small corpus."

Both services document two primary variants:

Domain adaptation fine-tuning — uses domain-specific text data to adapt the model to industry jargon, technical terms, or specialized vocabulary. Data can be plain text, CSV, or JSON.
Instruction-based fine-tuning — uses labeled prompt–response pairs to improve performance on a specific task. Data must be structured as prompt and completion examples.
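
As a rough illustration of the two data shapes described above, the sketch below writes a tiny instruction-style dataset as JSON Lines and a small domain-adaptation corpus as plain text. The `prompt` and `completion` field names mirror the prompt and completion structure mentioned above; the helper names, file names, and sample records are hypothetical, and the exact schema a given service or model expects should be checked against its documentation.

```python
import json
import tempfile
from pathlib import Path


def write_instruction_dataset(pairs, path):
    """Write (prompt, completion) pairs as JSON Lines, one labeled example per line."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, completion in pairs:
            record = {"prompt": prompt, "completion": completion}
            f.write(json.dumps(record) + "\n")


def write_domain_corpus(passages, path):
    """Write unlabeled domain text, one passage per line."""
    Path(path).write_text("\n".join(passages) + "\n", encoding="utf-8")


# Hypothetical sample data, for illustration only.
INSTRUCTION_PAIRS = [
    ("Summarize this claim note: water damage reported in unit 4B.",
     "Claim type: property. Cause: water damage. Location: unit 4B."),
    ("Summarize this claim note: rear-end collision, no injuries.",
     "Claim type: auto. Cause: rear-end collision. Injuries: none."),
]
DOMAIN_PASSAGES = [
    "Subrogation allows an insurer to recover claim costs from a liable third party.",
    "An indemnity clause shifts specified losses from one party to another.",
]

out_dir = Path(tempfile.mkdtemp())
instruction_path = out_dir / "instruction_train.jsonl"  # hypothetical file name
domain_path = out_dir / "domain_corpus.txt"             # hypothetical file name

write_instruction_dataset(INSTRUCTION_PAIRS, instruction_path)
write_domain_corpus(DOMAIN_PASSAGES, domain_path)
```

The contrast to notice is that the instruction file carries an explicit target output for every input, while the domain corpus is unlabeled text.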

Mental model

Think of a foundation model as a generalist employee hired from a large talent pool. They know how to do many things adequately.

Prompt engineering is giving that employee a detailed briefing before each task — no retraining required, but the guidance lives only in that conversation.

RAG is handing the employee a reference binder to look things up during each task — the employee's underlying knowledge has not changed, but they can retrieve current or specialized facts on demand.

Fine-tuning is sending that employee back to school for a focused training program. Their knowledge and behavior are updated in a lasting way. Each program costs time and money up front, but every later task in that area benefits without the skill having to be re-explained in a briefing or looked up in a binder.

When to use it

The SageMaker documentation describes a clear decision order: try prompt engineering first; if that is not sufficient, consider fine-tuning or RAG.

Dimension	Prompt engineering	RAG	Fine-tuning
Changes model weights?	No	No	Yes
Best for	Steering tone, style, format; general tasks	Grounding answers in up-to-date or proprietary documents	Teaching domain-specific language, style, or task behavior the model currently lacks
Data needed	None beyond the prompt	A retrievable knowledge source (document store, vector DB)	A labeled or domain-specific training corpus
Knowledge updates	Per-prompt	Update the knowledge source; no retraining needed	Requires a new training run to incorporate new information
Recommended starting point	Yes — try this first	When prompt engineering is not enough and factual grounding from a knowledge library is needed	When prompt engineering is not enough and the model needs to internalize new language or behavior
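
The decision order in the table can be sketched as a small helper. This is an illustrative simplification, not an official decision procedure: the `Scenario` fields and the `recommend` function are hypothetical names, and the fallback for scenarios with no clear signal is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical summary of a customization requirement."""
    prompt_engineering_sufficient: bool
    needs_grounding_in_documents: bool    # current or proprietary facts from a knowledge library
    needs_new_language_or_behavior: bool  # jargon, style, or task behavior the model lacks


def recommend(scenario):
    """Return candidate methods, following the 'prompt engineering first' order."""
    if scenario.prompt_engineering_sufficient:
        return ["prompt engineering"]
    methods = []
    if scenario.needs_grounding_in_documents:
        methods.append("RAG")
    if scenario.needs_new_language_or_behavior:
        methods.append("fine-tuning")
    # Assumption: with no clearer signal, keep iterating on the prompt.
    return methods or ["prompt engineering"]


# A support bot that must answer from frequently updated policy documents.
policy_bot = Scenario(False, True, False)
# A model that must write fluently in specialized clinical terminology.
clinical_writer = Scenario(False, False, True)
```

Under these assumptions, `recommend(policy_bot)` returns `["RAG"]` and `recommend(clinical_writer)` returns `["fine-tuning"]`.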

COMMON MISCONCEPTION

The trap: fine-tuning is how you give a model access to new or private information.

This is the specific confusion the exam surfaces. Candidates often think that because fine-tuning uses custom data, it is the right approach whenever proprietary or up-to-date knowledge is involved.

The distinction the docs draw is sharp: RAG is for "customizing your model with information from a knowledge library without any retraining." Fine-tuning is suited to making the model work with "domain-specific language, such as industry jargon, technical terms, or other specialized vocabulary" and to improving performance on specific tasks — not to making facts retrievable.

If the requirement is retrieving current facts, company documents, or knowledge that changes frequently, RAG is the appropriate approach — the model itself does not need to change. If the requirement is that the model handle specialized terminology fluently or reliably complete a task type it currently handles poorly, fine-tuning is appropriate.

A second misconception: fine-tuning replaces prompt engineering. The official documentation frames them as sequential steps, not alternatives. Prompt engineering should be exhausted before fine-tuning is considered.

How it shows up on the exam

Questions in this area test whether you can tell which customization method (prompt engineering, fine-tuning, or RAG) fits a described business scenario. They typically describe a need and ask which approach is appropriate.

Signal phrases that point toward fine-tuning: "domain-specific language," "industry jargon," "specialized vocabulary," "consistent output format," "task-specific behavior," "model does not perform well on this task type," or "model weights should be updated."

Signal phrases that suggest fine-tuning is not the right answer: "up-to-date information," "recent documents," "knowledge base," "no retraining," "retrieval," or "information changes frequently."

A common candidate error is selecting fine-tuning whenever proprietary data appears in the scenario, without checking whether the goal is behavioral change (fine-tuning) or factual retrieval (RAG).
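
As a study aid, the signal phrases above can be turned into a naive keyword check. This is only a sketch: the function names are hypothetical, the matching is a plain substring test, and real exam questions will paraphrase rather than quote these phrases. Note that the word "proprietary" is deliberately absent from both lists, since it does not by itself indicate either method.

```python
FINE_TUNING_SIGNALS = [
    "domain-specific language",
    "industry jargon",
    "specialized vocabulary",
    "consistent output format",
    "task-specific behavior",
    "model does not perform well on this task type",
    "model weights should be updated",
]

RAG_SIGNALS = [
    "up-to-date information",
    "recent documents",
    "knowledge base",
    "no retraining",
    "retrieval",
    "information changes frequently",
]


def count_signals(text, phrases):
    """Count how many of the given phrases occur in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(phrase in lowered for phrase in phrases)


def likely_method(scenario_text):
    """Return 'fine-tuning', 'RAG', or 'unclear' based on which signals dominate."""
    fine_tuning_hits = count_signals(scenario_text, FINE_TUNING_SIGNALS)
    rag_hits = count_signals(scenario_text, RAG_SIGNALS)
    if fine_tuning_hits > rag_hits:
        return "fine-tuning"
    if rag_hits > fine_tuning_hits:
        return "RAG"
    return "unclear"
```

For example, a scenario that mentions only "proprietary data" comes back as `"unclear"`, which mirrors the point above: the deciding factor is whether the goal is behavioral change or factual retrieval.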

Fine-tuning foundation models — AIF-C01
