Applications of Foundation Models · AIF-C01 · Task 3.3

Model customization approaches — AIF-C01

Master Amazon Bedrock model customization: supervised fine-tuning, reinforcement fine-tuning, and distillation — and when to use each for AIF-C01.

WHAT IT IS

Model customization is the process of providing training data to a model to improve its performance for specific use cases. Amazon Bedrock provides three customization methods: supervised fine-tuning, reinforcement fine-tuning, and distillation. Each method adjusts a foundation model's parameters, producing a privately owned custom model that only your AWS account can access.

Mental model

Think of the three methods in terms of what you already have and what you want to end up with:

  • You have labeled examples (input → correct output): use supervised fine-tuning.
  • You can measure quality but can't enumerate correct answers: use reinforcement fine-tuning.
  • You want a smaller, cheaper model that performs like a larger one: use distillation.

The key question is always: what kind of signal can you provide?
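The mental model above can be written down as a small decision helper. This is only a study sketch: `choose_method`, its three boolean parameters, and the order in which the checks run are assumptions made for illustration, not part of any AWS API.

```python
from enum import Enum
from typing import Optional


class Method(Enum):
    SUPERVISED_FINE_TUNING = "supervised fine-tuning"
    REINFORCEMENT_FINE_TUNING = "reinforcement fine-tuning"
    DISTILLATION = "distillation"


def choose_method(has_labeled_pairs: bool,
                  can_score_outputs: bool,
                  wants_smaller_model: bool) -> Optional[Method]:
    """Map the available training signal to a customization method.

    Hypothetical helper. The priority order is an assumption: a goal of
    "smaller, cheaper model" is checked first because distillation can
    also accept optional labeled pairs.
    """
    if wants_smaller_model:
        return Method.DISTILLATION
    if has_labeled_pairs:
        return Method.SUPERVISED_FINE_TUNING
    if can_score_outputs:
        return Method.REINFORCEMENT_FINE_TUNING
    # No usable signal: none of the three methods applies as described.
    return None


print(choose_method(has_labeled_pairs=False,
                    can_score_outputs=True,
                    wants_smaller_model=False).value)
```

Real scenarios can satisfy more than one condition, so treat the ordering as a prompt for thinking rather than a rule.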

When to use it

Method | Input data required | Model parameters change? | Primary goal
Supervised fine-tuning | Labeled prompt–response pairs | Yes | Improve performance on specific tasks with known correct outputs
Reinforcement fine-tuning | Prompts + reward functions (not labeled pairs) | Yes | Optimize for measurable quality criteria; useful when correct answers are hard to define upfront
Distillation | Prompts (with optional labeled pairs); teacher model generates responses | Yes (student model) | Transfer capability from a larger teacher model to a smaller, faster, cost-efficient student model

Supervised fine-tuning is the right choice when you have a labeled dataset and want the model to learn the association between inputs and specific output types.
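To make "labeled dataset" concrete, here is a minimal sketch of prompt–response pairs serialized as JSON Lines (one JSON object per line). The field names `prompt` and `completion`, the sample records, and the helpers `validate` and `to_jsonl` are assumptions for illustration; the schema a given model expects may differ, so check the Amazon Bedrock documentation for that model.

```python
import json

# Hypothetical labeled examples: each input is paired with its correct output.
labeled_pairs = [
    {"prompt": "Classify the sentiment: 'The checkout flow was painless.'",
     "completion": "positive"},
    {"prompt": "Classify the sentiment: 'My order arrived two weeks late.'",
     "completion": "negative"},
]


def validate(records):
    """Reject records that lack a non-empty prompt or completion string."""
    for i, record in enumerate(records):
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"record {i}: missing or empty {key!r}")


def to_jsonl(records):
    """Serialize records as JSON Lines text."""
    return "\n".join(json.dumps(record) for record in records)


validate(labeled_pairs)
jsonl_text = to_jsonl(labeled_pairs)
print(jsonl_text)
```

The point to take from the sketch is that every record carries the correct answer, which is exactly what supervised fine-tuning needs.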

Reinforcement fine-tuning fits when output quality can be objectively measured — for example, code correctness, mathematical reasoning, or structured outputs — and especially when collecting high-quality labeled examples is expensive or impractical.
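A reward function is easiest to picture for the structured-output case. The sketch below scores a response by whether it is a JSON object containing a set of required keys. The `reward` function, the `REQUIRED_KEYS` schema, and the 0.5/0.5 weighting are all hypothetical; this is not the interface Amazon Bedrock expects, only an illustration of scoring without labeled answers.

```python
import json

# Hypothetical schema: keys every acceptable response must contain.
REQUIRED_KEYS = {"order_id", "status"}


def reward(response: str) -> float:
    """Score a model response in [0.0, 1.0]; higher is better.

    0.0 if the response is not a JSON object; otherwise 0.5 for being
    well formed plus up to 0.5 for the fraction of required keys present.
    """
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(parsed, dict):
        return 0.0
    present = REQUIRED_KEYS.intersection(parsed)
    return 0.5 + 0.5 * len(present) / len(REQUIRED_KEYS)


print(reward('{"order_id": 17, "status": "shipped"}'))  # 1.0
print(reward('{"order_id": 17}'))                        # 0.75
print(reward("Your order has shipped."))                 # 0.0
```

Note that the function never sees a "correct" response. It only measures quality, which is the signal reinforcement fine-tuning learns from.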

Distillation is the right choice when you want to achieve the accuracy of a larger model at lower inference cost: you select a teacher model and a student model, provide prompts, and Amazon Bedrock generates teacher responses to fine-tune the student.
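The data flow in distillation can be sketched in a few lines: prompts go to the teacher, and the teacher's responses become the training targets for the student. In the sketch below `teacher_model` is a stand-in function and `build_student_training_set` is a hypothetical helper; in Amazon Bedrock the service performs this step for you, so the code illustrates the idea rather than anything you would write yourself.

```python
from typing import Callable, Dict, List


def teacher_model(prompt: str) -> str:
    """Stand-in for a large teacher model (hypothetical, returns a canned string)."""
    return f"[teacher answer to: {prompt}]"


def build_student_training_set(
    prompts: List[str],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each prompt with the teacher's response.

    The resulting pairs are what the student model is fine-tuned on;
    the teacher itself is only queried, never modified.
    """
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


prompts = [
    "Summarize the refund policy in one sentence.",
    "List the steps to reset a password.",
]
training_set = build_student_training_set(prompts, teacher_model)
for example in training_set:
    print(example)
```

Only prompts are supplied by hand here. The labels come from the teacher, which is why distillation does not require a labeled dataset up front.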

COMMON MISCONCEPTION

A common misconception is that reinforcement fine-tuning requires labeled input–output pairs just like supervised fine-tuning. It does not. Reinforcement fine-tuning explicitly replaces labeled pairs with reward functions that evaluate response quality. The model learns iteratively from feedback scores, not from pre-labeled examples. Conflating these two methods — and treating labeled data as a universal requirement for all fine-tuning — is a trap that scenario-based exam questions are designed to surface.

A second misconception is that distillation is simply running inference on a large model. Distillation is a training process: Amazon Bedrock uses the teacher model's responses to fine-tune the student model's parameters. The student model is changed; the teacher model is not.

How it shows up on the exam

This topic tests whether you can pick the right customization method from a scenario's constraints. Candidates with only a surface-level understanding often confuse the three methods by fixating on the word "fine-tuning" and missing what kind of data or signal each requires.

Watch for scenario language like:

  • "…has labeled prompt–response pairs and wants to improve accuracy on a specific task" — points toward supervised fine-tuning, where labeled data trains the model to associate input types with output types.
  • "…can write a function to score responses but does not have labeled examples" — points toward reinforcement fine-tuning, where reward functions replace labeled pairs.
  • "…needs a smaller, faster model that performs as well as a larger model on their use case" — points toward distillation, where a teacher model's knowledge is transferred to a student model.
  • "…model's parameters are adjusted" — all three customization methods adjust model parameters; this phrase alone does not distinguish between them.

The official documentation is explicit that reinforcement fine-tuning improves alignment "through feedback-based learning" and that "instead of providing labeled input-output pairs, you define reward functions." Exam scenarios describing a reward-function or scoring approach signal reinforcement fine-tuning, not supervised fine-tuning.

Related concepts

  • AI Agents — Agents orchestrate tool use and multi-step reasoning at inference time; customization changes model weights at training time. These are complementary, not interchangeable.
  • Bedrock Knowledge Bases — Knowledge bases ground a model's responses in external data at inference time via retrieval; they do not adjust model parameters.
  • RAG Design Considerations — Understanding when retrieval-augmented generation is sufficient versus when parameter-level customization is warranted is a key exam decision boundary.

Sources

Every claim on this page traces to the public exam blueprint and official documentation.

CutScore is an independent study tool and is not affiliated with, authorized by, endorsed by, or sponsored by Amazon Web Services. “AWS” and “AWS Certified AI Practitioner” are trademarks of Amazon.com, Inc. or its affiliates. All content is independently authored from the public exam blueprint and official documentation — no real exam content is used.
