Cross Decomposition

  • Finds the fundamental relations between two matrices (X and Y)
  • Projects X and Y into low-dimensional spaces so that the covariance between the projected matrices is maximal
  • Enables dimensionality reduction that takes the targets Y into account

Partial Least Squares Canonical (PLSCanonical)

  • Dimensionality reduction
  • Regression
In [59]:
import numpy as np
from sklearn.cross_decomposition import PLSCanonical
X = np.array([[0., 0., 1.], [1., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
Y = np.array([[0.1, -0.2], [0.9, 1.1], [6.2, 5.9], [11.9, 12.3]])
In [60]:
plsca = PLSCanonical(n_components=2)
plsca.fit(X, Y)
Out[60]:
PLSCanonical(algorithm='nipals', copy=True, max_iter=500, n_components=2,
             scale=True, tol=1e-06)
In [61]:
# Predict targets of given samples
plsca.predict(X[:2, :])
Out[61]:
array([[-0.17050568, -1.29053068],
       [-0.50933055,  0.66811955]])
In [62]:
# Dimensionality reduction
X_c, Y_c = plsca.transform(X, Y)
In [63]:
# Map the reduced data back to the original space
# (approximate, since n_components < n_features)
plsca.inverse_transform(X_c)
Out[63]:
array([[ 0.39787189, -0.33790545,  0.62699746],
       [ 0.84311677,  0.13323811,  0.1470771 ],
       [ 1.41749463,  2.49471135,  2.54609533],
       [ 2.34151671,  4.70995599,  3.67983011]])
In [64]:
plsca.coef_
Out[64]:
array([[ 2.40432756,  1.68695162],
       [-0.44704791,  5.41523294],
       [ 4.86740938, -0.33590654]])

PLSSVD

  • Simplified version of PLSCanonical
In [66]:
from sklearn.cross_decomposition import PLSSVD
pls = PLSSVD(n_components=2).fit(X, Y)
In [67]:
# PLSSVD does not implement predict; this reuses the fitted
# PLSCanonical model from above
plsca.predict(X[:2, :])
Out[67]:
array([[-0.17050568, -1.29053068],
       [-0.50933055,  0.66811955]])
In [69]:
X_c, Y_c = pls.transform(X, Y)
X_c, Y_c
Out[69]:
(array([[-1.39700475, -0.10283021],
        [-1.19678754,  0.17159333],
        [ 0.56032252, -0.10849725],
        [ 2.03346977,  0.03973413]]),
 array([[-1.22601804, -0.01930121],
        [-0.9602955 ,  0.04015847],
        [ 0.32491535, -0.04311171],
        [ 1.86139819,  0.02225445]]))

PLSRegression

  • Known as PLS1 (a single target) and PLS2 (multiple targets)
  • A form of regularized linear regression
In [74]:
from sklearn.cross_decomposition import PLSRegression
pls2 = PLSRegression(n_components=2)

pls2.fit(X, Y)
Out[74]:
PLSRegression(copy=True, max_iter=500, n_components=2, scale=True, tol=1e-06)
In [76]:
pls2.predict(X[:2, :])
Out[76]:
array([[0.26087869, 0.15302213],
       [0.60667302, 0.45634164]])

Canonical Correlation Analysis (CCA)

  • Unstable if the number of features or targets is greater than the number of samples
In [81]:
from sklearn.cross_decomposition import CCA
cca = CCA(n_components=2)

cca.fit(X, Y)
Out[81]:
CCA(copy=True, max_iter=500, n_components=2, scale=True, tol=1e-06)
In [82]:
cca.predict(X[:2, :])
Out[82]:
array([[-1.51106526, -2.12247471],
       [-0.43537494,  0.32314375]])
In [83]:
X_c, Y_c = cca.transform(X, Y)
X_c, Y_c
Out[83]:
(array([[-1.14979915,  0.07023102],
        [-0.95304207, -0.16529138],
        [ 0.35047354,  0.17359282],
        [ 1.75236768, -0.07853247]]),
 array([[-0.85511537,  0.0249032 ],
        [-0.70878547, -0.05861063],
        [ 0.26065014,  0.06155424],
        [ 1.3032507 , -0.02784681]]))