Supervised Neural Network

Multi-layer Perceptron (MLP)

  • Non-linear function approximator for either classification or regression

  • Pros

    • Capability to learn non-linear models
  • Cons

    • The loss function is non-convex, so there exists more than one local minimum; different random weight initializations can lead to different validation accuracy
    • Requires tuning a number of hyperparameters, such as the number of hidden neurons, layers, and iterations
    • Sensitive to feature scaling (see the scaling sketch after this list)

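Because of that scaling sensitivity, a minimal sketch of standardizing the features with StandardScaler, fitting on the training split only to avoid leakage (train_X/test_X refer to splits like the ones created below):

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
train_X_scaled = scaler.fit_transform(train_X)  # learn mean/std from the training data only
test_X_scaled = scaler.transform(test_X)        # apply the same statistics to the test data
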
MLP Classification

In [1]:
from sklearn.datasets import load_digits
digits = load_digits()
X = digits.data
y = digits.target
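
The digits data holds 1,797 samples of 8x8 images, flattened to 64 features each:

X.shape, y.shape  # ((1797, 64), (1797,))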
In [2]:
from sklearn.model_selection import train_test_split
# stratify=y keeps the class proportions equal in both splits
train_X, test_X, train_Y, test_Y = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
In [11]:
from sklearn.neural_network import MLPClassifier
# A deliberately small network: two hidden layers with 5 and 2 units
model = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2),
                      random_state=1, max_iter=10000)

model.fit(train_X, train_Y)

y_pred = model.predict(test_X)
In [13]:
from sklearn.metrics import confusion_matrix
confusion_matrix(test_Y, y_pred)
Out[13]:
array([[33,  1,  0,  0,  0,  0,  1,  1,  0,  0],
       [ 0, 22,  1,  0,  0,  0,  2,  0, 11,  0],
       [ 1,  0, 22,  0,  0,  3,  0,  8,  1,  0],
       [ 0,  0,  0, 33,  0,  4,  0,  0,  0,  0],
       [ 0,  0,  0,  0, 30,  0,  6,  0,  0,  0],
       [ 0,  0,  4,  7,  0, 16,  1,  3,  2,  4],
       [ 0,  1,  0,  0,  2,  0, 31,  0,  2,  0],
       [ 1,  0, 13,  0,  0,  3,  0, 19,  0,  0],
       [ 0,  7,  4,  0,  0,  1,  0,  0, 22,  1],
       [ 0,  0,  0,  1,  2,  0,  0,  0,  3, 30]])
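
Rows are true digit classes and columns are predictions; the many off-diagonal counts reflect the deliberately small (5, 2) hidden-layer network. As a quick summary of the matrix, overall accuracy and per-class metrics can be computed (a sketch using the same test split):

from sklearn.metrics import accuracy_score, classification_report
print(accuracy_score(test_Y, y_pred))         # overall fraction of correct predictions
print(classification_report(test_Y, y_pred)) # per-class precision, recall, and F1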

MLP Regression

  • Use one output unit
  • Remove the activation function at the output layer (i.e., use the identity activation)
  • Change the cost function to the quadratic (squared-error) cost
  • One hidden layer is a good starting point; the number of hidden units depends on the number of features and the complexity of the problem (see the sketch after this list)
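
As a sketch of that one-hidden-layer starting point, the layer size can be set explicitly (50 units here is an arbitrary, untuned assumption):

from sklearn.neural_network import MLPRegressor

# One hidden layer as a starting point; 50 units is an arbitrary, untuned choice
regr_small = MLPRegressor(hidden_layer_sizes=(50,), max_iter=10000, random_state=1)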
In [21]:
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=1000, n_features=10, n_targets=1, random_state=1)
In [24]:
from sklearn.model_selection import train_test_split
train_X, test_X, train_Y, test_Y = train_test_split(X, y, test_size=0.2, random_state=42)
In [30]:
from sklearn.neural_network import MLPRegressor
# The default architecture is a single hidden layer of 100 units
regr = MLPRegressor(random_state=1, max_iter=10000)
regr.fit(train_X, train_Y)

y_pred = regr.predict(test_X)
In [34]:
from sklearn.metrics import mean_squared_error
MSE = mean_squared_error(test_Y, y_pred)

import numpy as np
RMSE = np.sqrt(MSE)  # RMSE is in the same units as the target
RMSE
Out[34]:
1.5929596923885676
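
In scikit-learn 1.4 and later, the same value is available directly via root_mean_squared_error (assuming such a version is installed):

from sklearn.metrics import root_mean_squared_error
RMSE = root_mean_squared_error(test_Y, y_pred)  # equivalent to np.sqrt(mean_squared_error(...))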