# Support Vector Machine

Suppose a dataset of points $x_i\in\mathbb{R}^n$ with labels $y_i\in\{+1,-1\}$; find a weight vector $w\in\mathbb{R}^n$ and bias $b\in\mathbb{R}$ such that

$$ \begin{cases} x_i \cdot w + b \ge +1, & y_i = +1 \\ x_i \cdot w + b \le -1, & y_i = -1 \end{cases} $$

which leads to

$$ y_i(x_i \cdot w + b) \ge +1. $$

The distance between the two marginal hyperplanes is $2/||w||$, so maximizing the margin is equivalent to minimizing $||w||$.
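To see where the $2/||w||$ comes from: the two marginal hyperplanes are $x\cdot w+b=+1$ and $x\cdot w+b=-1$, and the distance between two parallel hyperplanes $x\cdot w = c_1$ and $x\cdot w = c_2$ is $|c_1-c_2|/||w||$, so here

$$ d = \frac{|(1-b)-(-1-b)|}{||w||} = \frac{2}{||w||}. $$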

Primal problem: min $L_p(w,b,\alpha_i)$ over $w$ and $b$,

$$ L_p(w,b,\alpha_i) = \frac{1}{2} ||w||^{2} - \sum_{i}^n \alpha_i y_i(x_i\cdot w+b) + \sum_i^n \alpha_i $$
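This is the Lagrangian of the constrained problem

$$ \min_{w,b}\ \frac{1}{2} ||w||^{2} \quad \text{s.t.} \quad y_i(x_i\cdot w+b) \ge 1,\ i=1,\dots,n, $$

with one multiplier $\alpha_i \ge 0$ per constraint. Setting its derivatives to zero gives the stationarity conditions below.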

$$ \frac{\partial L_p(w,b,\alpha_i)}{\partial w} = 0 $$

leads to

$$ w = \sum_i^n \alpha_i y_i x_i $$

with $\alpha_i \ge 0$ for all $i$; similarly,

$$ \frac{\partial L_p(w,b,\alpha_i)}{\partial b} = 0 $$

leads to

$$ \sum_i^n \alpha_i y_i = 0. $$
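Substituting $w = \sum_i^n \alpha_i y_i x_i$ back into $L_p$ eliminates $w$, and the $b$ term vanishes by $\sum_i^n \alpha_i y_i = 0$:

$$ L_p = \frac{1}{2} \sum_{i,j}^n \alpha_i \alpha_j y_i y_j (x_i\cdot x_j) - \sum_{i,j}^n \alpha_i \alpha_j y_i y_j (x_i\cdot x_j) - b \sum_i^n \alpha_i y_i + \sum_i^n \alpha_i = \sum_i^n \alpha_i - \frac{1}{2} \sum_{i,j}^n \alpha_i \alpha_j y_i y_j (x_i\cdot x_j), $$

which is exactly the dual objective $L_d$ below.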

Dual problem: max $L_d(\alpha_i)$,

$$ L_d(\alpha_i) =\sum_i^n \alpha_i - \frac{1}{2} \sum_{i,j}^n \alpha_i \alpha_j y_i y_j (x_i\cdot x_j) $$

s.t. $\forall i,\alpha_i \ge 0$,

$$ \sum_i^n \alpha_i y_i = 0. $$
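The dual is a quadratic program in $\alpha$, so for small problems any general-purpose constrained solver works. Below is a minimal sketch using `scipy.optimize.minimize` (SLSQP); the helper name `solve_dual` and the `1e-6` support-vector threshold are illustrative choices, not part of the note.

```python
import numpy as np
from scipy.optimize import minimize

def solve_dual(X, y):
    """Maximize L_d by minimizing -L_d subject to alpha_i >= 0 and sum_i alpha_i y_i = 0."""
    n = X.shape[0]
    # K[i, j] = y_i y_j (x_i . x_j)
    K = (y[:, None] * X) @ (y[:, None] * X).T

    def neg_Ld(a):
        return 0.5 * a @ K @ a - a.sum()

    cons = {"type": "eq", "fun": lambda a: a @ y}      # sum_i alpha_i y_i = 0
    bounds = [(0.0, None)] * n                         # alpha_i >= 0
    res = minimize(neg_Ld, np.zeros(n), method="SLSQP",
                   bounds=bounds, constraints=[cons])
    alpha = res.x
    w = (alpha * y) @ X                                # w = sum_i alpha_i y_i x_i
    sv = alpha > 1e-6                                  # support vectors
    b = np.mean(y[sv] - X[sv] @ w)                     # from y_k (x_k . w + b) = 1
    return w, b, alpha
```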

Only the support vectors (the points that lie exactly on the marginal hyperplanes) have nonzero coefficients $\alpha_i$.
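A new point $x$ is therefore classified using only the support vectors (a standard consequence, not spelled out in the note):

$$ f(x) = \operatorname{sign}\Big(\sum_i^n \alpha_i y_i (x_i\cdot x) + b\Big), $$

where the sum effectively runs over the support vectors, since all other $\alpha_i$ vanish.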

Approaches: sub-gradient descent and coordinate descent.
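A minimal sketch of the sub-gradient approach, applied to the soft-margin form of the objective $\frac{1}{2}||w||^{2} + C\sum_i \max(0,\ 1 - y_i(x_i\cdot w + b))$; the penalty $C$, the learning rate, and the epoch count are assumptions for illustration and are not part of the note.

```python
import numpy as np

def svm_subgradient(X, y, C=1.0, lr=0.01, epochs=200):
    """Sub-gradient descent on (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (x_i . w + b))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                               # points violating the margin
        # sub-gradient: w minus C * sum of y_i x_i over violating points
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# toy usage on a linearly separable set
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = svm_subgradient(X, y)
print(np.sign(X @ w + b))   # should reproduce y for this toy set
```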