Data encryption for AI workloads — AIF-C01
AIF-C01 reference: encryption at rest vs. in transit for AI workloads, customer responsibilities, and key management under the AWS shared responsibility model.
What it is
Data encryption scrambles data, making it unreadable to any person, service, or device without the key to unlock its content. For AI workloads, encryption applies to two distinct states: data sitting in storage (at rest) and data moving across a network (in transit). Customers are responsible for managing their data, including encryption options, under the AWS shared responsibility model.
Mental model
Think of encryption as a lock-and-key system that travels with the data regardless of where that data lives. The lock changes form depending on whether the data is in a warehouse (at rest) or on a delivery truck (in transit), but in both cases only the holder of the correct key can read what is inside. Key management is therefore just as important as the encryption itself: a lost key leaves the data unreadable even to its owner, and a leaked key renders the lock meaningless.
When to use it
The exam often asks candidates to match a scenario (training datasets stored in object storage, model artifacts in a repository, API calls carrying inference payloads) to the correct encryption type. The table below captures the distinction that matters:
| Scenario | State | Encryption type to apply |
|---|---|---|
| Training data stored in an S3 bucket | At rest | Encrypt the stored objects; manage keys through a key management service |
| Model artifacts saved after a training job | At rest | Encrypt the storage volume or object store where artifacts reside |
| Inference request sent from an application to a model endpoint | In transit | Encrypt the network channel (e.g., using TLS) so data is unreadable while traveling |
| Data pipeline moving raw data from a source to a feature store | In transit | Encrypt the transfer channel; data becomes readable only at the authorized destination |
| End-to-end scenario (data stored and then retrieved for inference) | Both | Apply at-rest encryption in storage and in-transit encryption on every network hop |
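
The in-transit rows above can be sketched with Python's standard `ssl` module. This is a minimal illustration, not a prescribed configuration: the helper names `make_tls_context` and `require_https` and the endpoint URL are hypothetical, and the TLS 1.2 floor is an assumption rather than a requirement stated on this page.

```python
import ssl
from urllib.parse import urlparse


def make_tls_context() -> ssl.SSLContext:
    """Return a client-side TLS context that verifies the server certificate."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def require_https(url: str) -> str:
    """Return the URL unchanged, or raise ValueError if it is not https."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"refusing unencrypted endpoint: {url}")
    return url


# Hypothetical inference endpoint, used only for illustration.
endpoint = require_https("https://inference.example.com/invoke")
context = make_tls_context()
```

An HTTP client that accepts an `ssl.SSLContext` can then be given `context`, so the inference payload is only sent over a verified, encrypted channel.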
A key management service, such as AWS Key Management Service (AWS KMS), lets customers create, manage, and control the cryptographic keys used for at-rest encryption, keeping key control in the customer's hands even when AWS manages the underlying storage infrastructure.
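As a rough sketch of what "customer-controlled keys for at-rest encryption" can look like in practice, the helper below assembles the parameters for an Amazon S3 `put_object` request that asks for server-side encryption with a KMS key. The helper name `build_encrypted_put_args`, the bucket name, the object key, and the key ARN are all hypothetical placeholders; the actual upload is left as a comment because it assumes `boto3` is installed and credentials are configured.

```python
from typing import Any, Dict


def build_encrypted_put_args(
    bucket: str, key: str, body: bytes, kms_key_id: str
) -> Dict[str, Any]:
    """Build S3 put_object parameters that request SSE with a KMS key."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Ask S3 to encrypt the object at rest using AWS KMS.
        "ServerSideEncryption": "aws:kms",
        # The customer-controlled key that S3 should use.
        "SSEKMSKeyId": kms_key_id,
    }


# All identifiers below are placeholders for illustration only.
put_args = build_encrypted_put_args(
    bucket="example-training-data",
    key="datasets/train.csv",
    body=b"feature,label\n1,0\n",
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)

# With boto3 available and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**put_args)
```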
Common misconceptions
A common misconception is that because AWS secures the underlying infrastructure, customers do not need to configure or manage encryption for their AI data. This conflates two separate layers. AWS is responsible for protecting the infrastructure that runs AWS Cloud services: the hardware, software, networking, and facilities. Customers remain responsible for managing their data, including encryption options. Some services encrypt by default (Amazon S3, for example, applies server-side encryption to new objects), but enabling an AWS managed storage or compute service does not guarantee that your data is encrypted in every configuration, and a default may not match your key-control requirements; customers must make deliberate choices about encryption settings and key management. For AI workloads, this means training data, fine-tuned model weights, and inference inputs and outputs may all require explicit encryption decisions.
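One deliberate choice a customer might make is to reject any unencrypted request to a bucket that holds training data. The sketch below builds an S3 bucket policy that denies requests when the `aws:SecureTransport` condition key is false. The function name `deny_insecure_transport_policy` and the bucket name are hypothetical, and this is only one possible control, not a complete policy for a real workload.

```python
import json


def deny_insecure_transport_policy(bucket: str) -> dict:
    """Return a bucket policy that denies any request not sent over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # aws:SecureTransport is "false" for plain HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# Placeholder bucket name for illustration only.
policy = deny_insecure_transport_policy("example-training-data")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON string is what a customer would attach to the bucket; attaching it is the customer's action, which is the point of the shared responsibility split described above.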
A second misconception is that encryption in transit is sufficient on its own. Data at rest in storage — including large training datasets and model checkpoints — requires separate protection. Encrypting only the network channel leaves stored data exposed if storage access controls are compromised.
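To make the "both states" point concrete, here is a small illustrative check that flags any data asset missing one of the two protections. The `DataAsset` record, the `find_encryption_gaps` function, and the sample asset names are invented for this sketch and do not correspond to any AWS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataAsset:
    """A piece of AI workload data and the protections applied to it."""
    name: str
    encrypted_at_rest: bool
    encrypted_in_transit: bool


def find_encryption_gaps(assets: List[DataAsset]) -> List[str]:
    """List every asset/state combination that lacks encryption."""
    gaps = []
    for asset in assets:
        if not asset.encrypted_at_rest:
            gaps.append(f"{asset.name}: not encrypted at rest")
        if not asset.encrypted_in_transit:
            gaps.append(f"{asset.name}: not encrypted in transit")
    return gaps


# Hypothetical inventory: the dataset travels over TLS but sits unencrypted.
assets = [
    DataAsset("training-dataset", encrypted_at_rest=False, encrypted_in_transit=True),
    DataAsset("model-checkpoints", encrypted_at_rest=True, encrypted_in_transit=True),
]
gaps = find_encryption_gaps(assets)
print(gaps)  # ['training-dataset: not encrypted at rest']
```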
How it shows up on the exam
Questions in this area test the ability to identify which encryption type addresses a described scenario and to recognize where customer responsibility begins under the shared responsibility model. Candidates often confuse "AWS secures the cloud" with "AWS encrypts my data for me." A question may describe an AI pipeline and ask which action the customer is responsible for. The step that matters is recognizing that data encryption options fall on the customer side of the shared responsibility line.
Signal phrases to watch for: "data stored in," "data at rest," "data in transit," "encryption in transit," "key management," "customer responsibility for data," and "securing training data." When a scenario involves both storage and network transfer, look for answers that address both states rather than just one.
Sources
Every claim on this page traces to the public exam blueprint and official documentation: