The perceptron tries to find a separating hyperplane that makes no mistakes on the training data.
Classification rule: ŷ = sign(wᵗx). A point (x, y), with label y ∈ {−1, +1}, is misclassified when y(wᵗx) ≤ 0; in that case the hyperplane is adjusted by the update w → w + kyx for some step size k > 0, which pushes wᵗx toward the correct sign. Otherwise, w stays the same.
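The rule above can be sketched as a short training loop; this is a minimal NumPy version (the function name, the fixed step size k, and the toy data are my own, and the bias is folded in as a constant feature):

```python
import numpy as np

def perceptron_train(X, y, k=1.0, max_epochs=100):
    """Perceptron with a fixed step size k.
    Misclassified point (y * w.x <= 0): w <- w + k * y * x."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:       # misclassified (or on the boundary)
                w = w + k * yi * xi      # adjust the hyperplane toward xi
                mistakes += 1
        if mistakes == 0:                # a full pass with no adjustments
            break
    return w

# Toy linearly separable data; last column is a constant 1 for the bias.
X = np.array([[ 1.0,  2.0, 1.0], [ 2.0,  3.0, 1.0],
              [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]])
y = np.array([1, 1, -1, -1])
w = perceptron_train(X, y)
```

On separable data like this, the loop exits as soon as a full pass makes no update, i.e. every training point satisfies y(wᵗx) > 0.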
To correct the current point in a single step, k must be large enough that the updated score has the right sign: y((w + kyx)ᵗx) = y(wᵗx) + k(xᵗx) > 0 (using y² = 1), which gives

k > −y(wᵗx) / (xᵗx)

When the data really is linearly separable, the perceptron convergence theorem guarantees that only finitely many adjustments are ever made, so you won't endlessly adjust the hyperplane.
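A minimal sketch of choosing k this way for a single misclassified point (the function name and the tiny positive nudge past the threshold are my own): solving y(wᵗx) + k(xᵗx) > 0 for k gives the smallest step that flips the point to the correct side.

```python
import numpy as np

def correcting_step(w, x, y, eps=1e-9):
    """If (x, y) is misclassified, pick k just past -y*(w.x)/(x.x),
    so the updated w classifies x correctly in one step."""
    score = y * (w @ x)
    if score > 0:
        return w                       # already correct: no adjustment
    k = -score / (x @ x) + eps         # threshold value plus a small nudge
    return w + k * y * x

w = np.zeros(3)
x = np.array([1.0, 2.0, 1.0])
w = correcting_step(w, x, +1)          # afterwards, w @ x > 0
```

Variants of the perceptron differ mainly in this choice of k: a fixed constant also converges on separable data, while the per-point choice above guarantees the current mistake is fixed immediately.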