A complete summary based on the Machinfy Data Preprocessing PDF. All notes are preserved, with clear formatting and examples.
Data preprocessing is a step in the data mining and data analysis process that transforms raw data into a format that computers and machine learning models can understand and analyze.
Preprocessing is everywhere: it's one of the most important steps in Data Science.
Type | Description | Examples |
---|---|---|
Numerical | Made of numbers | Age, weight, shoe size |
Categorical | Made of words | Eye color, gender, blood type |
Discrete | Countable values | Number of children |
Continuous | Any value within a range | Age, weight |
Nominal | No hierarchy | Eye color, blood type |
Ordinal | Has hierarchy | Rating, mood, degree |
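
The taxonomy in the table above can be sketched in a few lines of plain Python. This is only an illustrative heuristic, not a method from the source: the function names `describe_column` and `encode_ordinal`, the sample values, and the `rating_order` ranking are all assumptions made for the example. The numeric check relies on how values are stored, so an age recorded in whole years would be labeled discrete even though the table treats age as continuous.

```python
def describe_column(values, order=None):
    """Label a column with the table's taxonomy (hypothetical helper, heuristic only)."""
    # Numerical: every value is a number (bool is excluded on purpose).
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        # Whole-number storage is treated as discrete, anything else as continuous.
        kind = "discrete" if all(isinstance(v, int) for v in values) else "continuous"
        return f"numerical/{kind}"
    # Categorical: ordinal only when the caller supplies a ranking of the categories.
    return "categorical/ordinal" if order else "categorical/nominal"


def encode_ordinal(values, order):
    """Map each category to its rank so the hierarchy survives as integers."""
    rank = {label: position for position, label in enumerate(order)}
    return [rank[v] for v in values]


# Made-up sample columns, loosely mirroring the examples in the table.
children = [0, 2, 1, 3]
weight = [61.5, 72.0, 80.3]
eye_color = ["brown", "blue", "green"]
rating = ["low", "high", "medium"]
rating_order = ["low", "medium", "high"]  # assumed ranking, lowest first

print(describe_column(children))
print(describe_column(weight))
print(describe_column(eye_color))
print(describe_column(rating, order=rating_order))
print(encode_ordinal(rating, rating_order))
```

Nominal columns such as eye color have no ranking to pass in, which is why integer ranks like these are usually reserved for ordinal data.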
Collecting raw data from multiple sources.
Evaluating if data is: