CutScore | Prompt engineering best practices

WHAT IT IS

Prompt engineering is the practice of optimizing textual input to a Large Language Model (LLM) to obtain desired responses. A prompt is a natural language text that requests the generative AI to perform a specific task. Because the quality of prompts can impact the quality of a model's responses, prompt engineering gives practitioners a systematic way to shape model behavior without changing the model's underlying weights.

A single prompt may combine one or more components: the instruction (what the model should do), the context (domain or background information), demonstration examples, and the input text the model should act on.

Mental model

Think of a prompt as a precise work order handed to a contractor. A vague order ("make it better") produces unpredictable results. A clear order with the right context, a concrete example of the deliverable, and an explicit scope constraint produces reliable, repeatable output. Prompt engineering is the discipline of writing better work orders — and iterating until the contractor reliably delivers what you need.

When to use it

The exam regularly tests which prompting approach fits a given scenario. The three most-tested techniques are zero-shot, few-shot (in-context learning), and chain-of-thought.

Technique	What it is	Best fit
Zero-shot	No example input-output pairs in the prompt; the model relies on its training alone	Task is well-defined; examples are unavailable or unnecessary
Few-shot (in-context learning)	A small number of paired example inputs and desired outputs are included in the prompt to help the model calibrate its output	Output format or classification schema must match a specific pattern the model might not infer from instructions alone
Chain-of-thought	Breaks a complex question into smaller, logical parts that mimic a train of thought	Multi-step reasoning tasks where intermediate steps improve answer accuracy
Prompt template	A reusable prompt structure with exchangeable content slots; described as "recipes" for using LLMs for different use cases	Standardizing prompts across a team or application; consistent formatting at scale

COMMON MISCONCEPTION

Prompt engineering is not fine-tuning. A common misconception is that when a model consistently produces the wrong output format or reasoning, the solution is always to fine-tune (retrain) the model. In practice, the official AWS guidance makes clear that prompt quality directly drives response quality, and that techniques such as few-shot prompting — providing paired example inputs and desired outputs inside the prompt itself — can calibrate model output without any weight updates. Fine-tuning changes model parameters; prompt engineering changes only the input. The two are distinct levers, and the exam tests whether candidates can distinguish them.

A second trap: candidates sometimes assume that once a prompt works, no further effort is needed. The official guidance explicitly identifies experimentation and refinement as an essential best practice — effective prompt engineering is iterative, not a one-time activity.

A third trap involves the stateless nature of LLM API calls. Models accessed via API do not recall prior prompts or previous requests unless the prior interaction is explicitly included in the current prompt. Treating the model as if it maintains conversational memory across separate API calls is a design error, not a model capability.

How it shows up on the exam

Task Statement 3.2 asks candidates to choose effective prompt engineering techniques — the cognitive target is selection and application, not recall of definitions alone.

Questions in this area tend to present a scenario (a task type, an output quality problem, or a risk) and ask which technique or practice addresses it. Candidates often confuse which technique fits which scenario:

Scenarios describing inconsistent output format are often best addressed by few-shot examples, because providing demonstration pairs helps the model calibrate to a specific pattern.
Scenarios describing complex, multi-step reasoning errors point toward chain-of-thought prompting.
Scenarios describing hallucinations or inaccurate outputs connect to prompt optimization and, where relevant, to retrieval approaches — the official Bedrock guidance notes that refining prompts is one path to reducing hallucinations.

The blueprint also explicitly lists risks and limitations as in-scope: prompt exposure, poisoning, hijacking, and jailbreaking. A candidate who understands prompt engineering only as a quality tool — and not also as a security surface — may miss questions in this area.

Signal phrases that suggest a prompt engineering question: "without retraining the model," "improve response quality," "provide examples in the prompt," "break the problem into steps," "the model is not following the format," "guardrails," "specificity and concision."

Related concepts

AI Agents — Agents execute multi-step tasks and rely on well-engineered prompts to guide tool use and reasoning steps; prompt design choices directly affect agent reliability.
Bedrock Knowledge Bases — Knowledge bases supply retrieved context that is injected into prompts; understanding prompt structure clarifies how retrieved passages fit into the instruction-context-input pattern.
RAG Design Considerations — Retrieval-Augmented Generation is one approach the official AWS guidance identifies for reducing hallucinations alongside prompt refinement; knowing both helps candidates select the right lever for a given problem.

Prompt engineering best practices — AIF-C01