Logistic Regression

Probability

$$\hat{p} = h_{\theta}(x) = \sigma(x^{T}\theta)$$

Sigmoid function

$$\sigma(t) = \dfrac{1}{1+\exp(-t)}$$
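As a minimal sketch of these two formulas (NumPy assumed; the theta values and the instance are made up for illustration, not fitted to anything):

import numpy as np

def sigmoid(t):
    # logistic (sigmoid) function: squashes any real t into (0, 1)
    return 1 / (1 + np.exp(-t))

theta = np.array([-4.0, 2.5])   # hypothetical parameters [bias weight, petal-width weight]
x = np.array([1.0, 1.7])        # one instance with a bias term of 1 prepended
p_hat = sigmoid(x @ theta)      # estimated probability of the positive class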

Classification

$$\hat{y} = \begin{cases} 0, & \text{if}\ \hat{p} < 0.5 \\ 1, & \text{if}\ \hat{p} \geq 0.5 \end{cases}$$
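Continuing the sketch above, the hard prediction just thresholds the estimated probability at 0.5:

y_hat = int(p_hat >= 0.5)   # predict class 1 when the estimated probability is at least 0.5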

Cost Function

$$J(\theta) = - \dfrac{1}{m} \sum_{i=1}^{m}\left[y^{(i)}\log(\hat{p}^{(i)}) + (1-y^{(i)})\log(1-\hat{p}^{(i)})\right]$$
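A vectorized sketch of this cost (log loss), assuming NumPy arrays y of 0/1 labels and p_hat of predicted probabilities, both of length m; the clipping epsilon is an added numerical safeguard, not part of the formula:

def log_loss(y, p_hat, eps=1e-15):
    # average negative log-likelihood; clip to keep log() away from 0
    p_hat = np.clip(p_hat, eps, 1 - eps)
    return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))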

Decision Boundaries

In [7]:
import numpy as np

from sklearn import datasets
iris = datasets.load_iris()

list(iris.keys())

X = iris['data'][:, 3:]                # petal width only, shape (150, 1)
Y = (iris['target'] == 2).astype(int)  # 1 if Iris virginica, else 0, shape (150,)
In [10]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(solver='lbfgs')  # specify the solver explicitly to avoid the 0.21 FutureWarning
model.fit(X, Y)
Out[10]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
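With the model fitted, a quick sanity check on a couple of petal widths (values picked near where the classes overlap; run it to see which side of the 0.5 threshold each falls on):

model.predict([[1.7], [1.5]])        # hard class labels (1 = Iris virginica)
model.predict_proba([[1.7], [1.5]])  # corresponding per-class probabilities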
In [13]:
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
Y_proba = model.predict_proba(X_new) # (1000, 2)
In [15]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.plot(X_new, Y_proba[:, 1], 'g-', label='Iris virginica')
ax.plot(X_new, Y_proba[:, 0], 'b--', label='Not Iris virginica')
Out[15]:
[<matplotlib.lines.Line2D at 0x1a1d938e10>]
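Since this section is about decision boundaries, one way to locate and draw the boundary is to find the smallest petal width whose predicted probability of virginica reaches 0.5 (a sketch; the axis labels and legend are added here for readability):

decision_boundary = X_new[Y_proba[:, 1] >= 0.5][0, 0]  # first width classified as virginica

ax.axvline(decision_boundary, color='k', linestyle=':', label='Decision boundary')
ax.set_xlabel('Petal width (cm)')
ax.set_ylabel('Probability')
ax.legend()
fig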