Regression Techniques (Linear & Polynomial)

Regression Definition

Regression is a statistical method, used in finance, investing, and other disciplines, that estimates the strength and character of the relationship between one dependent variable (usually denoted Y) and one or more other variables (known as independent variables).

Linear Equation

A linear equation is an equation in which the highest power of the variable is 1.
Formula: y = ax + b
Where:
- y is the predicted output
- x is the input variable
- a is the slope and b is the intercept; together these constants determine the line
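
As a quick illustration of the formula above, the sketch below evaluates y = ax + b for a few inputs. The function name `linear` and the values a = 2, b = 1 are arbitrary choices made for this example, not part of the original notes.

```python
def linear(x, a, b):
    """Return y = a*x + b for a single input x."""
    return a * x + b

# Illustrative constants (assumed for this example): slope a = 2, intercept b = 1
a, b = 2.0, 1.0
xs = [0, 1, 2, 3]
ys = [linear(x, a, b) for x in xs]
print(ys)  # [1.0, 3.0, 5.0, 7.0]
```

Each step of 1 in x raises y by a, and the line crosses the y-axis at b.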

Linear Regression

Linear regression is a supervised machine learning model that finds the best-fit straight line describing the relationship between the independent and dependent variables.

# Simple Linear Regression Implementation
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

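# Load dataset (assumes the last column of the CSV is the target, i.e. salary)
df = pd.read_csv('Salary_Data.csv')
X = df.iloc[:, :-1].values  # every column except the last, kept 2-D for scikit-learn
y = df.iloc[:, -1].values   # last column is the dependent variable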

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Train model
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predictions
y_pred = regressor.predict(X_test)

# Visualization
plt.scatter(X_train, y_train, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
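
The script above computes y_pred but does not inspect it. One possible way to check the fit is sketched below. Because Salary_Data.csv is not reproduced here, the sketch uses synthetic data as a stand-in; the generating rule (salary = 9000 * years + 25000 plus noise) and the choice of metrics are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the salary data (assumed): salary = 9000 * years + 25000 + noise
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(60, 1))
y = 9000 * X.ravel() + 25000 + rng.normal(0, 2000, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
regressor = LinearRegression().fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# The fitted slope and intercept correspond to a and b in y = ax + b
print("a (slope):", regressor.coef_[0])
print("b (intercept):", regressor.intercept_)

# Error metrics on the held-out test set
print("MAE:", mean_absolute_error(y_test, y_pred))
print("R^2:", r2_score(y_test, y_pred))
```

Measuring error on the test split rather than the training split gives a fairer picture of how the line generalizes to unseen points.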

Multiple Regression

Multiple regression uses two or more independent variables to predict a single dependent variable.

General form: y = a₁x₁ + a₂x₂ + a₃x₃ + ... + aₙxₙ + b
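
A minimal sketch of the general form with two independent variables follows. It reuses scikit-learn's LinearRegression, which accepts any number of feature columns. The data is synthetic and noise-free, and the coefficients a₁ = 3, a₂ = -2, b = 5 are assumptions chosen so the result is easy to verify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (assumed): y = 3*x1 - 2*x2 + 5
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 2))   # columns are x1 and x2
y = 3 * X[:, 0] - 2 * X[:, 1] + 5

model = LinearRegression().fit(X, y)

# coef_ holds one coefficient per feature (a1, a2); intercept_ holds b
print("coefficients a1, a2:", model.coef_)
print("intercept b:", model.intercept_)
```

With noise-free data the fitted coefficients should match the generating values up to floating-point error; real data would give estimates instead.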

Cost Function

A cost function measures how well a machine learning model performs by quantifying the difference between predicted and actual outputs.

The goal of training is to minimize this function by adjusting the model parameters.

For linear regression with hypothesis h(x) = θ₀ + θ₁x, the cost function is:

J(θ₀, θ₁) = (1 / (2m)) · Σᵢ₌₁ᵐ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

Where:

m is the number of training examples
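
The cost function above can be evaluated directly with NumPy. The sketch below is one way to write it, assuming the sum runs over all m training examples and is divided by 2m; the function name `cost` and the toy data are hypothetical and chosen so the results can be checked by hand.

```python
import numpy as np

def cost(theta0, theta1, x, y):
    """J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2), with h(x) = theta0 + theta1 * x."""
    m = len(x)
    residuals = (theta0 + theta1 * x) - y
    return np.sum(residuals ** 2) / (2 * m)

# Toy data (assumed): points lying exactly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

print(cost(1.0, 2.0, x, y))  # 0.0, the line passes through every point
print(cost(0.0, 2.0, x, y))  # 0.5, every prediction is off by 1
```

In the second call each of the four residuals is -1, so the sum of squares is 4 and J = 4 / (2 · 4) = 0.5. Lower values of J mean the line fits the data more closely.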