Semantic Gap
There’s always a semantic gap between how humans see an image (as meaningful content) and how computers see it — just a grid of numbers.
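To make the gap concrete, here is a minimal sketch of what the computer actually receives (the array values are random stand-ins for a real photo):

```python
import numpy as np

# A "photo" to a computer: just an H x W x 3 grid of integers in [0, 255].
image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
print(image.shape)   # (32, 32, 3): height, width, RGB channels
print(image[0, 0])   # one pixel, e.g. [ 12 200  47]
```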
Data-Driven Approach
A classic ML approach: collect a dataset, label it, train a model, then evaluate its performance. The better the data, the better the results.
Nearest Neighbor Algorithm
Finds the most similar image in the training set and copies its label. Simple, but slow at test time: every prediction requires comparing against the entire training set, O(N) per query (training, by contrast, is free: just memorize the data). And relying on a single neighbor can fail when that neighbor happens to be an outlier.
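A minimal NumPy sketch of nearest-neighbor prediction, assuming flattened images and illustrative array names (`train_images`, `train_labels`):

```python
import numpy as np

def nn_predict(test_image, train_images, train_labels):
    """Copy the label of the single closest training image.

    train_images: (N, D) flattened training images
    train_labels: (N,)   labels
    test_image:   (D,)   flattened query image
    """
    # L1 distance to every training image: the O(N) cost per prediction
    distances = np.sum(np.abs(train_images - test_image), axis=1)
    return train_labels[np.argmin(distances)]
```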
K-Nearest Neighbors
Solves the outlier problem by looking at multiple neighbors (K), then taking a majority vote — more robust and less sensitive to noise.
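The same sketch extended to K neighbors with a majority vote (assumes integer class labels so `np.bincount` applies):

```python
import numpy as np

def knn_predict(test_image, train_images, train_labels, k=5):
    """Majority vote over the k closest training images."""
    distances = np.sum(np.abs(train_images - test_image), axis=1)
    nearest = np.argsort(distances)[:k]          # indices of the k closest
    votes = np.bincount(train_labels[nearest])   # count each label's votes
    return np.argmax(votes)                      # most common label wins
```

K is a hyperparameter: larger K smooths out noise but can blur class boundaries, so it is usually chosen on a validation set.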
L1 vs. L2 Distances
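For two flattened images I1 and I2 with pixels indexed by p, the two distances are:

```latex
d_1(I_1, I_2) = \sum_p \left| I_1^p - I_2^p \right|
\qquad\qquad
d_2(I_1, I_2) = \sqrt{\sum_p \left( I_1^p - I_2^p \right)^2}
```

L1 sums absolute pixel differences (Manhattan distance); L2 is the ordinary Euclidean distance.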
When to Use L1 or L2
Use L1 when individual feature dimensions carry meaning on their own; because L1 depends on the coordinate frame, axis-aligned differences matter. Use L2 when the vector is a generic point in space and only overall similarity matters; L2 is unchanged by rotations of the coordinate frame.
Cross Validation
Instead of relying on a single fixed train/validation split, cross-validation divides the training data into k folds, trains on k-1 of them, and validates on the held-out fold, rotating through every fold and averaging the results. This gives a less noisy estimate of how well hyperparameters generalize (the test set still stays untouched until the end). Both approaches, a single validation split and cross-validation, are valid; cross-validation is more common when data is scarce.
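A compact sketch of k-fold cross-validation in NumPy; `train_and_eval` is a hypothetical user-supplied function that trains a model and returns validation accuracy:

```python
import numpy as np

def k_fold_score(X, y, train_and_eval, k=5):
    """Average validation accuracy across k folds.

    train_and_eval(X_tr, y_tr, X_val, y_val) -> accuracy (hypothetical)
    """
    folds = np.array_split(np.random.permutation(len(X)), k)
    scores = []
    for i in range(k):
        val_idx = folds[i]                      # held-out fold
        tr_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(train_and_eval(X[tr_idx], y[tr_idx],
                                     X[val_idx], y[val_idx]))
    return np.mean(scores)
```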
Limitations of KNN
KNN struggles to capture complex differences between images because it relies purely on raw pixel distances, which track low-level appearance rather than content: a shifted, darkened, or slightly cropped copy of an image can end up far from the original (see the sketch below). It is also slow to predict, since every query is compared against all training samples.
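A tiny demonstration of the pixel-distance problem, using a toy 8x8 "image" of a white square:

```python
import numpy as np

img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0                    # a white square
shifted = np.roll(img, 1, axis=1)      # identical square, moved one pixel

print(np.sum(np.abs(img - shifted)))   # 8.0: half the square's total mass,
                                       # even though the content is the same
```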
Curse of Dimensionality
In high dimensions, nearest-neighbor search breaks down: to keep the space densely covered, the number of training points must grow exponentially with the number of dimensions. For example, keeping just 5 samples per axis takes 5 points in 1D, 25 in 2D, 125 in 3D, and nearly 10 million in 10D.
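The blow-up in numbers:

```python
# Training points needed to keep 5 samples per axis as dimensions grow.
for d in (1, 2, 3, 10):
    print(d, 5 ** d)   # 5, 25, 125, 9765625
```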
Parametric Approach
Linear models learn a parameter (weight) for each feature, plus a per-class bias. Prediction is then a single computation, f(x) = Wx + b, instead of a search through the training set, which makes it fast.
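A minimal sketch of the linear score function f(x) = Wx + b:

```python
import numpy as np

def linear_scores(x, W, b):
    """One score per class for a single flattened image.

    W: (C, D) weights, one row per class
    b: (C,)   per-class bias
    x: (D,)   flattened input image
    """
    return W @ x + b   # highest score is the predicted class
```

Everything learned from training is compressed into W and b, so prediction never touches the training set again.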
Parametric vs. KNN
Unlike KNN, a linear classifier doesn’t need the training dataset at test time. It has already distilled what it learned into the weights (W) — efficient and scalable.
Bias in Linear Models
The bias vector lets class scores shift independently of the input. If the training set has more cat photos than dog photos, the learned bias for "cat" will typically end up higher, encoding that data-independent preference. Note that this bias parameter is not the same thing as bias in the bias-variance sense: it does not by itself prevent overfitting or underfitting.
Problems with Linear Classification
Linear models can only draw straight lines (or hyperplanes) to separate classes, which is limiting for complex, non-linear data. Classic failure cases include XOR-style patterns (see the check below) and image categories whose pixel patterns are more abstract than any single linear template can capture.
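A brute-force check of the classic XOR failure case: no linear rule sign(w·x + b) labels all four points correctly, so even a large random search never exceeds 3 of 4 (the search itself is just an illustrative sketch, not a proof):

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])             # XOR labels

rng = np.random.default_rng(0)
best = 0
for _ in range(100_000):
    w = rng.normal(size=2)             # random line orientation
    b = rng.normal()                   # random offset
    preds = (X @ w + b > 0).astype(int)
    best = max(best, int((preds == y).sum()))

print(best)   # 3: XOR is not linearly separable
```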
From KNN’s simple “find your friends” idea to linear models’ efficient parametric learning, both approaches laid the groundwork for today’s deep CNNs — pushing us closer to teaching machines to truly see.