🧩 Dense vs. Sparse Representations
Dense Representation
- Every value in the feature vector is stored explicitly, and most entries are typically non-zero.
- Example: Pixel values in an image — each pixel contributes to the prediction.
Sparse Representation
- Most feature values are zeros — you only store the non-zero ones.
- Example: Text data (like emails for spam filtering). You don’t store every word in the English language, just the words that actually appear.
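The contrast above can be sketched in a few lines of Python. The vocabulary size, word indices, and counts below are illustrative assumptions, not from the notes:

```python
# Hypothetical sketch: the same feature vector stored densely and sparsely.
VOCAB_SIZE = 10_000  # assume a 10,000-word vocabulary

# Dense: one slot per vocabulary word; almost all zeros for a short email.
dense = [0] * VOCAB_SIZE
dense[42] = 3   # e.g. "free" appears 3 times (index chosen arbitrarily)
dense[977] = 1  # e.g. "winner" appears once

# Sparse: store only the non-zero (index, count) pairs.
sparse = {42: 3, 977: 1}

print(len(dense))   # 10000 values stored
print(len(sparse))  # 2 values stored
```

Both encode identical information, but the sparse form stores two entries instead of ten thousand, which is why it is the natural choice for text features.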
📉 Different Loss Functions
0/1 Loss
- Used for binary classification problems.
- The model incurs a loss of 0 if it predicts correctly and 1 if it is wrong, so the average 0/1 loss over a dataset is just the error rate.
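A minimal sketch of this (the function name and sample labels are my own, not from the notes):

```python
def zero_one_loss(y_true, y_pred):
    """0 if the prediction matches the label, 1 otherwise."""
    return 0 if y_true == y_pred else 1

# Averaging the 0/1 loss over a batch gives the error rate.
labels = [1, 0, 1, 1]
preds  = [1, 1, 1, 0]
error_rate = sum(zero_one_loss(t, p) for t, p in zip(labels, preds)) / len(labels)
print(error_rate)  # 0.5: two of the four predictions are wrong
```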
Squared Loss
- A common choice for regression tasks.
- Small errors contribute little, but the penalty grows quadratically, so large errors dominate the loss.
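The quadratic penalty is easy to see numerically (the function name and values here are illustrative):

```python
def squared_loss(y_true, y_pred):
    """Squared error: small misses barely register, big misses dominate."""
    return (y_true - y_pred) ** 2

print(squared_loss(10, 10.5))  # 0.25
print(squared_loss(10, 15))    # 25.0: a 10x larger error costs 100x more
```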
Absolute Loss
- Use when every unit of error matters equally: the penalty grows linearly with the size of the mistake.
- Good for regression when you want robustness to outliers, since a single extreme error does not dominate the loss the way it does under squared loss.
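The difference in outlier sensitivity can be shown by averaging both losses over the same residuals. The residual values below are made up for illustration:

```python
def absolute_loss(y_true, y_pred):
    """Absolute error: each unit of error costs the same."""
    return abs(y_true - y_pred)

# Three small residuals and one outlier.
residuals = [0.5, 0.5, 0.5, 20]

mean_abs = sum(abs(r) for r in residuals) / len(residuals)
mean_sq  = sum(r ** 2 for r in residuals) / len(residuals)

print(mean_abs)  # 5.375: the outlier shifts the average linearly
print(mean_sq)   # 100.1875: the outlier dominates the squared average
```

Under absolute loss the outlier contributes 20 of the 21.5 total; under squared loss it contributes 400 of 400.75, swamping everything else.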
🧪 Train-Test Split & Generalization