1. 📊 Multinomial Naive Bayes

The Multinomial Naive Bayes model is well-suited to discrete count features, such as word frequencies in spam detection.

🔍 Formula

We estimate the likelihood of a word w given a class c (e.g. spam or not spam) using:

θ_{w|c} = (count(w in class c) + 1) / (total words in class c + d)

✅ Why Add +1 in the Numerator?

This is called Laplace smoothing, and it handles the zero-frequency problem: without it, a word that never appears in the training data for class c would receive probability zero and zero out the whole product of likelihoods at prediction time.

✅ Why Add +d in the Denominator?

d is the number of distinct words in the vocabulary. Since the +1 smoothing adds 1 to each of the d word counts, adding d to the denominator compensates exactly, so the θ_{w|c} values still sum to 1 over the vocabulary and remain a valid probability distribution.
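
To make the two adjustments concrete, here is a minimal from-scratch sketch of the estimator; the function name `smoothed_likelihoods` and the toy documents are illustrative, not from a library:

```python
from collections import Counter

def smoothed_likelihoods(class_docs, vocabulary):
    """Estimate θ_{w|c} for every word w, given the tokenized docs of one class c."""
    counts = Counter(word for doc in class_docs for word in doc)
    total_words = sum(counts.values())   # total words in class c
    d = len(vocabulary)                  # number of distinct words overall
    # +1 in the numerator: no word ever gets probability exactly zero.
    # +d in the denominator: the d added ones cancel out, so the
    # probabilities below still sum to 1 over the vocabulary.
    return {w: (counts[w] + 1) / (total_words + d) for w in vocabulary}

# Toy corpus: "cheap" never occurs in ham, yet still gets a nonzero estimate.
spam_docs = [["cheap", "pills", "buy"], ["buy", "cheap", "now"]]
ham_docs = [["meeting", "notes", "attached"]]
vocab = {w for doc in spam_docs + ham_docs for w in doc}  # d = 7

theta_ham = smoothed_likelihoods(ham_docs, vocab)
print(theta_ham["cheap"])  # (0 + 1) / (3 + 7) = 0.1, not 0
```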

✉️ Use Case: Spam Classification
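
A short end-to-end sketch of this use case, assuming scikit-learn is available; the toy messages and labels are made up for illustration, not a reference implementation:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (labels: 1 = spam, 0 = not spam)
messages = [
    "win cash now", "cheap pills buy now",
    "meeting at noon", "project notes attached",
]
labels = [1, 1, 0, 0]

# CountVectorizer produces the word-count features MultinomialNB expects;
# alpha=1.0 is exactly the +1 Laplace smoothing described above.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(messages, labels)

print(model.predict(["buy cheap cash"]))          # [1] -> spam
print(model.predict(["notes from the meeting"]))  # [0] -> not spam
```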


2. 🌐 Gaussian Naive Bayes

Gaussian Naive Bayes is used when features follow a normal (Gaussian) distribution, which makes it suitable for continuous data.

🧪 Probability Density Function (PDF)

P(x_i | y) = (1 / √(2πσ_y²)) · exp(−(x_i − μ_y)² / (2σ_y²))

where μ_y and σ_y are the mean and standard deviation of feature x_i estimated from the training samples of class y.
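
As a quick sanity check of the formula, here is a minimal NumPy sketch; the helper `gaussian_pdf` and the sample values are illustrative (in practice a class such as scikit-learn's `GaussianNB` estimates μ_y and σ_y per feature and class for you):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a single feature value x under N(mu, sigma**2)."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * sigma**2)
    return coef * np.exp(-((x - mu) ** 2) / (2.0 * sigma**2))

# Fit mu and sigma from one class's training values for a single feature.
x_train = np.array([3.8, 4.9, 5.1, 6.2])
mu, sigma = x_train.mean(), x_train.std()

# Density of a new observation; x = 5.0 equals the fitted mean,
# so this is the peak of the bell curve.
print(gaussian_pdf(5.0, mu, sigma))
```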

📌 Why Use Gaussian?