Hashing — SY0-701
CompTIA Security+ SY0-701 reference: hashing definitions, one-way and collision-resistance properties, integrity use cases, and common exam misconceptions.
WHAT IT IS
A hash function maps a bit string of arbitrary length to a fixed-length bit string. The fixed-length output is called a message digest (also: hash value, hash output). The output depends entirely on the input: changing even a single bit of the input produces, with overwhelming probability, a different message digest.
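The fixed-length and bit-sensitivity behavior described above can be seen in a minimal sketch. It assumes SHA-256 from Python's standard `hashlib` module as the example hash function; the helper names `sha256_hex` and `differing_bits` are illustrative and not taken from any standard.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (illustrative helper)."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# A 1-byte input and a 1,000,000-byte input both yield a 256-bit digest,
# which is 64 hexadecimal characters.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
print(len(short_digest), len(long_digest))

# Flip exactly one bit of the input: 'a' is 0x61, and 0x61 XOR 0x02 is 0x63 ('c').
original = sha256_hex(b"a")
flipped = sha256_hex(bytes([ord("a") ^ 0b10]))
print(original == flipped)
print(differing_bits(original, flipped), "of 256 digest bits differ")
```

With a well-designed hash function, roughly half of the digest bits are expected to change for a one-bit input change, which is why a visual or programmatic comparison of digests is a practical integrity check.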
Approved cryptographic hash functions satisfy three properties (sourced from NIST):
| Property | What it means |
|---|---|
| One-way (preimage resistance) | Given a randomly chosen message digest, it is computationally infeasible to find any input that maps to it. |
| Collision resistance | It is computationally infeasible to find any two distinct inputs that map to the same output. |
| Second preimage resistance | Given one input, it is computationally infeasible to find a different input that produces the same output. |
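To make "computationally infeasible" concrete, the sketch below deliberately weakens a hash and then finds a collision. It assumes a toy function that keeps only the first 2 bytes (16 bits) of a SHA-256 digest; `truncated_digest` and `find_collision` are hypothetical helpers for illustration, and the toy function is not an approved hash function.

```python
import hashlib
from itertools import count


def truncated_digest(data: bytes, n_bytes: int = 2) -> bytes:
    """Toy weak hash: the first n_bytes of SHA-256 (illustration only)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Hash distinct inputs until two of them share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    for i in count():
        msg = str(i).encode()
        digest = truncated_digest(msg, n_bytes)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


first, second, attempts = find_collision()
print(first, second, attempts)
print(truncated_digest(first) == truncated_digest(second))
```

With only 16 bits of output, a collision typically appears after a few hundred attempts. The same search against a full 256-bit digest would need on the order of 2^128 attempts, which is what makes collisions infeasible to find in practice even though they must exist.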
Mental model
Think of hashing as a fingerprint machine: feed in any document, from a single byte to a multi-gigabyte file, and the machine always stamps out a fixed-size fingerprint. Two identical documents always produce the same fingerprint. Because fingerprints are fixed-size and documents are not, two different documents can in principle share a fingerprint, but with an approved hash function finding such a pair is computationally infeasible. You cannot reconstruct the document from the fingerprint alone, and that one-way behavior is part of what makes the fingerprint useful.
When to use it
Hashing is the right tool when the goal is verifying that data has not changed — that is, protecting integrity. NIST defines integrity as the property that sensitive data has not been modified or deleted in an unauthorized and undetected manner since it was created, transmitted, or stored.
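Hashing does not provide confidentiality: it is not designed to conceal data, and the original cannot be recovered from the digest even by the intended recipient. Nor does it, by itself, prove who produced the digest, because anyone can compute the hash of any data.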
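The integrity check described above can be sketched as a file verification routine. This is a minimal example assuming SHA-256 and a digest published by the file's provider; the function names, the file name `download.bin`, and the payload are all hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path: str, published_hex: str) -> bool:
    """Compare the file's digest with a digest obtained from a trusted source."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())


# Demo with a temporary file standing in for a download.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "download.bin")
    with open(path, "wb") as f:
        f.write(b"example payload")
    published = hashlib.sha256(b"example payload").hexdigest()

    ok_before = matches_published_digest(path, published)

    # Simulate corruption or tampering by appending one byte.
    with open(path, "ab") as f:
        f.write(b"!")
    ok_after = matches_published_digest(path, published)

print(ok_before, ok_after)  # True False
```

The check only says whether the file matches the published digest. It says nothing about who published that digest, which is why the source of the digest matters.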
| Goal | Tool | Why |
|---|---|---|
| Verify a file has not been altered | Hash function | Fixed-length digest changes if any bit changes |
| Keep data secret from observers | Encryption (symmetric or asymmetric) | Encryption can be reversed with the correct key; a hash cannot be reversed by anyone and is not designed to conceal |
| Prove who signed a message and protect integrity | Digital signature | Defined as a hash function followed by a signature function; provides origin authentication, integrity, and non-repudiation |
| Verify integrity and authenticity with a shared secret | Message authentication code (MAC) | A cryptographic checksum using a symmetric key; detects accidental and intentional modifications |
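The MAC row in the table above can be illustrated with HMAC, one widely used MAC construction that combines a hash function with a symmetric key. The sketch uses Python's standard `hmac` module with SHA-256; the key, the messages, and the helper names `make_tag` and `verify_tag` are made up for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over message using the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"demo-key-not-for-production"  # hypothetical shared secret
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                        # True
print(verify_tag(shared_key, b"transfer 900 to account 42", tag))  # False
print(verify_tag(b"wrong-key", message, tag))                      # False
```

Unlike a plain hash, the tag cannot be recomputed by someone who lacks the key. Because both parties hold the same key, however, a MAC does not provide non-repudiation the way a digital signature does.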
COMMON MISCONCEPTION
"Hashing encrypts data." It does not. Encryption is a reversible transformation: the ciphertext can be decrypted back to plaintext with the correct key. Hashing is intentionally one-way: preimage resistance means it is computationally infeasible to recover the input from the output, and there is no "decryption key" for a hash. That does not make a digest a way to keep data secret. If the input is short or guessable, anyone holding the digest can hash candidate inputs and compare the results.
A related trap: "A matching hash guarantees authenticity." It does not — it guarantees integrity (the data has not changed) only when the hash itself is delivered through a trusted channel or protected by an additional mechanism such as a digital signature or MAC. A hash value transmitted alongside tampered data can itself be replaced.
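The replacement attack described above can be sketched in a few lines. The example assumes SHA-256 and a hypothetical installer; `sha256_hex` and `verify` are illustrative helpers, and the "trusted" digest stands for one obtained through a separate, trusted channel.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (illustrative helper)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, claimed_digest: str) -> bool:
    """Check data against whatever digest the verifier was given."""
    return sha256_hex(data) == claimed_digest


original = b"installer v1.0"
trusted_digest = sha256_hex(original)  # assumed to arrive over a trusted channel

# An attacker who controls the download replaces BOTH the file and the
# digest that is shipped next to it.
tampered = b"installer v1.0 + malware"
attacker_digest = sha256_hex(tampered)

print(verify(tampered, attacker_digest))  # True: the bundled digest matches
print(verify(tampered, trusted_digest))   # False: the trusted digest exposes it
```

The hash function behaves correctly in both checks. What differs is whether the digest used for comparison can be trusted, which is the gap that signatures and MACs close.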
How it shows up on the exam
The exam tests whether a candidate can distinguish the specific security service a mechanism provides. Candidates often confuse hashing with encryption because both transform data and produce an output that is not the original plaintext. The key distinctions to recognize are:
- Hashing protects integrity; encryption protects confidentiality.
- Hashing is one-way; encryption is reversible (given the correct key).
- A digital signature, as NIST defines it, applies a hash function first — but the signature scheme as a whole adds origin authentication and non-repudiation, which hashing alone cannot provide.
Signal phrases to recognize in a scenario: "verify the file has not been modified," "confirm the download is uncorrupted," "detect unauthorized changes," "produce a fixed-length representation." These point toward a hash function. Phrases that additionally require proving source identity point toward signatures; phrases that require keeping contents secret point toward encryption.
Candidates who understand the three NIST-defined properties — one-way, collision resistance, second preimage resistance — are equipped to reason about why a hash function does or does not satisfy a particular security requirement in a given scenario, rather than memorizing surface-level associations.
Related concepts
- Public Key Infrastructure — PKI uses digital signatures, which apply a hash function as the first step before the asymmetric signing operation.
- Symmetric Encryption — Symmetric algorithms provide confidentiality through reversible transformations; they are distinct from the one-way nature of hashing.
- Asymmetric Encryption — Asymmetric algorithms underpin digital signatures; understanding the hash-then-sign model requires distinguishing what hashing contributes versus what the asymmetric operation contributes.
Sources
Every claim on this page traces to the public exam blueprint and official documentation: