A complete summary based on the Machinfy Data Preprocessing PDF. All notes are preserved, with clear formatting and examples.


📚 What You Will Learn


❓ What Is Data Preprocessing?

Data preprocessing is the step in data mining and data analysis that transforms raw data into a format that computers and machine learning models can understand and analyze.

📌 Preprocessing is everywhere: it's one of the most important steps in Data Science.


🔢 What Are Data Types?

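| Type | Description | Examples |
|---|---|---|
| Numerical | Made of numbers | Age, weight, shoe size |
| Categorical | Made of words (labels) | Eye color, gender, blood type |
| Discrete | Numerical with countable, separate values | Number of children |
| Continuous | Numerical with any value within a range | Age, weight |
| Nominal | Categorical with no hierarchy (no natural order) | Eye color, blood type |
| Ordinal | Categorical with a hierarchy (natural order) | Rating, mood, degree |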
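
To make the data types concrete, here is a minimal Python sketch. The sample records, the column names, the `describe_column` helper, and the `RATING_ORDER` list are all hypothetical and only for illustration. Guessing a type from Python values is a rough heuristic; whether a categorical column is nominal or ordinal cannot be read from the values and has to come from knowledge of the data.

```python
# Hypothetical sample records; column names and values are illustrative only.
records = [
    {"age": 34.5, "children": 2, "eye_color": "brown", "rating": "good"},
    {"age": 27.0, "children": 0, "eye_color": "blue", "rating": "poor"},
    {"age": 41.2, "children": 3, "eye_color": "green", "rating": "excellent"},
]


def describe_column(values):
    """Rough guess at a column's data type based on its Python values."""
    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "numerical (discrete)"
    if all(is_number(v) for v in values):
        return "numerical (continuous)"
    return "categorical"


# An ordinal column has a hierarchy, so its labels can be mapped to ranks.
# The order itself is an assumption supplied by whoever knows the data.
RATING_ORDER = ["poor", "good", "excellent"]


def encode_ordinal(value, order):
    """Return the rank of a label within its ordered list of categories."""
    return order.index(value)


column_types = {
    name: describe_column([row[name] for row in records])
    for name in records[0]
}
rating_ranks = [encode_ordinal(row["rating"], RATING_ORDER) for row in records]

print(column_types)
print(rating_ranks)
```

A nominal column such as `eye_color` has no meaningful rank, so it should not be encoded this way; mapping it to ranks would invent an order that does not exist.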

🧱 Data Preprocessing Steps

1. 📥 Data Gathering

Collecting raw data from multiple sources.
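
As a rough illustration of gathering, the sketch below collects rows from two sources into one list using only the Python standard library. The CSV and JSON contents, the `gather` function, and the added `source` field are hypothetical; real sources would typically be files, databases, or APIs rather than inline strings.

```python
import csv
import io
import json

# Two hypothetical raw sources, inlined as strings so the sketch is self-contained.
CSV_SOURCE = "name,age\nAda,36\nLinus,28\n"
JSON_SOURCE = '[{"name": "Grace", "age": 45}]'


def gather(csv_text, json_text):
    """Collect rows from both sources into one list, tagging each row with its origin."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({**row, "source": "csv"})
    for row in json.loads(json_text):
        rows.append({**row, "source": "json"})
    return rows


raw_rows = gather(CSV_SOURCE, JSON_SOURCE)
for row in raw_rows:
    print(row)
```

Note that the gathered data is still raw: `age` arrives as text from the CSV source but as a number from the JSON source. Mismatches like this are what the later preprocessing steps are meant to catch.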


2. 🧪 Data Quality Assessment

Evaluating if data is: