CutScore | Computer vision use cases

What it is

Computer vision is a technology that enables machines to automatically recognize images and describe them accurately and efficiently. It uses machine learning — specifically deep learning — to train computers on large volumes of visual data, so they can identify patterns and apply that learned knowledge to recognize new, unseen images and video.

The core insight is that computer vision does not alter an image; it interprets what it sees and then carries out a task, such as labeling an object or raising an alert.

Mental model

Think of computer vision as giving a machine a pair of eyes and a trained brain. The eyes are sensors (cameras, scanners, medical imaging devices). The brain is a deep learning model, typically a convolutional neural network (CNN), that has learned from a large set of labeled examples what different visual patterns mean. When new visual input arrives, the model applies those learned patterns to it and produces an output: a label, a bounding box, a classification, or a decision.

This is distinct from image processing, which filters or transforms pixels (sharpening a photo, adjusting contrast). Computer vision leaves the pixels alone and produces meaning.
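The contrast can be made concrete with a toy sketch. Everything here is an illustrative assumption: the 3x3 "image" is a nested list, and the threshold rule in `contains_bright_spot` is only a hypothetical placeholder for a trained model. The point is the type of the output: the first function returns pixels, the second returns a label.

```python
# Toy 3x3 grayscale image: each value is a pixel brightness from 0 to 255.
image = [
    [10, 20, 30],
    [40, 200, 60],
    [70, 80, 90],
]


def increase_brightness(img, amount):
    """Image processing: build a new image whose pixels are transformed."""
    return [[min(255, px + amount) for px in row] for row in img]


def contains_bright_spot(img, threshold=128):
    """Stand-in for computer vision: return a label, leave pixels untouched.

    A real system would call a trained model here; this threshold rule is
    a hypothetical placeholder used only to show the shape of the output.
    """
    has_spot = any(px > threshold for row in img for px in row)
    return "bright_spot" if has_spot else "no_bright_spot"


brighter = increase_brightness(image, 50)  # output is pixels
label = contains_bright_spot(image)        # output is meaning
```

If a scenario's output is another image, it is probably image processing; if the output is a label, a location, or a decision, it is probably computer vision.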

When to use it

Use the table below when an exam question describes a scenario and you need to choose the right AI/ML approach.

Scenario	Right fit	Why
A factory camera flags defective products on a production line	Computer vision	Detecting and localizing visual defects in images is an object detection task
A hospital system interprets X-rays and MRIs	Computer vision	Analyzing medical images for anomalies (e.g., spotting a possible tumor in a scan or in a photo of a mole or skin lesion) is an image classification or segmentation task
A self-driving system identifies pedestrians and road signs in real time	Computer vision	Recognizing pedestrians and road signs in camera feeds in real time is an object detection task; the same feeds can also be used to build 3D maps of the surroundings
A security system scans employees' faces to restrict access to a server room	Computer vision	Facial recognition for employee authentication is a computer vision task
A voice assistant transcribes spoken commands	Not computer vision	This is a speech/NLP task — there is no visual input
A recommendation engine predicts which product a customer will buy	Not computer vision	This is tabular/structured-data ML — no image or video input

Common misconception

The trap: candidates often assume that any "image" task is a computer vision task — for example, that resizing, compressing, or color-correcting a photo is computer vision. It is not. Those are image processing operations that transform pixels without interpreting them.

Computer vision is specifically about making sense of visual content: identifying what is in an image, where objects are located, or what is happening across a sequence of frames. The distinction the official documentation draws is sharp: computer vision does not change an image; it labels, detects, tracks, or segments what it sees.

A second misconception is that computer vision and object detection are the same thing. Object detection — identifying and localizing objects within an image — is one task within computer vision. Computer vision also encompasses image classification (assigning a category label to a whole image), object tracking (following an identified object across video frames), and image segmentation (dividing an image into pixel regions that correspond to distinct objects or areas).
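One way to keep the four task types apart is to look at the shape of what each one returns. The sketch below is a study aid under assumed names: the classes, the `Box` alias, and `task_name` are hypothetical and do not correspond to any particular library or service.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class ClassificationResult:
    label: str  # one category for the whole image


@dataclass
class DetectionResult:
    label: str  # what the object is
    box: Box    # where it is in the image


@dataclass
class TrackingResult:
    object_id: int             # same id across frames
    boxes_by_frame: List[Box]  # one box per video frame


@dataclass
class SegmentationResult:
    mask: List[List[int]]  # a region id for every pixel


def task_name(result) -> str:
    """Map a result shape back to the task type that produces it."""
    names = {
        ClassificationResult: "image classification",
        DetectionResult: "object detection",
        TrackingResult: "object tracking",
        SegmentationResult: "image segmentation",
    }
    return names[type(result)]


defect = DetectionResult(label="dent", box=(12, 40, 30, 18))
print(task_name(defect))
```

Read a scenario for the output it asks for: a single label suggests classification, a label plus a location suggests detection, a location over time suggests tracking, and a per-pixel map suggests segmentation.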

How it shows up on the exam

The exam task is recall and application: given a business scenario, identify whether it calls for computer vision and, if so, which specific CV task type (classification, detection, tracking, segmentation) fits the description.

Candidates often confuse computer vision with image processing (transforming pixels vs. interpreting them) or with NLP tasks when a scenario involves reading text from a document image. A scenario describing "reading text from a scanned invoice" could involve optical character recognition, which sits at the intersection of computer vision and NLP — the visual component is CV; the text understanding that follows is NLP.
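The two-stage split in the invoice example can be sketched as follows. Both functions are hypothetical stand-ins: `ocr_stub` pretends to be the computer vision step by returning text already stored with the page, and `extract_total` uses a simple regular expression in place of real language understanding. The field names and the invoice text are invented for illustration.

```python
import re


def ocr_stub(scanned_page):
    """Computer vision step (stand-in): pixels in, raw text out.

    A real OCR model would read the text from the pixels; this stub just
    returns text that the toy page already carries.
    """
    return scanned_page["embedded_text"]


def extract_total(text):
    """Text understanding step (stand-in): raw text in, structured field out."""
    match = re.search(r"Total:\s*\$?([0-9]+(?:\.[0-9]{2})?)", text)
    return float(match.group(1)) if match else None


page = {"pixels": [[0, 255], [255, 0]], "embedded_text": "Invoice 17\nTotal: $42.50"}
raw_text = ocr_stub(page)        # stage 1: vision
total = extract_total(raw_text)  # stage 2: text understanding
```

When a stem asks only about getting text off the page, the vision stage is what is being tested; when it asks what the text means, the second stage is.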

Signal phrases in stems that point toward computer vision:

"camera feed," "image," "video," "scan," "X-ray," "visual inspection"
"detect," "identify," "recognize," "locate," "classify" applied to objects in images
"autonomous vehicle," "medical imaging," "quality defect," "facial recognition," "crop disease"

Signal phrases that point away from computer vision (toward other AI/ML domains):

"predict a numeric value," "classify customer reviews," "transcribe audio," "recommend products"
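As a study aid, the signal phrases above can be turned into a rough keyword check. This is a minimal sketch, not a reliable classifier: the function name `triage` and its return strings are assumptions, and the verbs "detect", "identify", "recognize", "locate", and "classify" are left out on purpose because they only point to computer vision when applied to objects in images.

```python
CV_SIGNALS = [
    "camera feed", "image", "video", "scan", "x-ray", "visual inspection",
    "autonomous vehicle", "medical imaging", "quality defect",
    "facial recognition", "crop disease",
]

NON_CV_SIGNALS = [
    "predict a numeric value", "classify customer reviews",
    "transcribe audio", "recommend products",
]


def triage(stem):
    """Return a rough guess based only on which signal phrases appear."""
    text = stem.lower()
    cv_hits = [phrase for phrase in CV_SIGNALS if phrase in text]
    other_hits = [phrase for phrase in NON_CV_SIGNALS if phrase in text]
    if cv_hits and not other_hits:
        return "likely computer vision"
    if other_hits and not cv_hits:
        return "likely not computer vision"
    return "unclear, read the scenario closely"


print(triage("A camera feed is used for visual inspection of parts"))
print(triage("Transcribe audio from support calls"))
```

Substring matching is deliberately loose (for example, "scan" also matches "scanned"), so treat the result as a prompt to reread the stem rather than as an answer.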

Computer vision use cases — AIF-C01
