Want to learn machine learning? Use my machine learning flashcards.

# Adding Interaction Terms

## Preliminaries

```
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures
import warnings
# Suppress Warning
warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")
```

## Load Boston Housing Dataset

```
# Load the data with only two features
boston = load_boston()
X = boston.data[:,0:2]
y = boston.target
```

## Add Interaction Term

Interaction effects can be account for by including a new feature comprising the product of corresponding values from the interacting features:

$$\hat_{y} = \hat_{{\beta_{0}} + \hat_{\beta_{1}}x_{1}+ \hat_{\beta_{2}}x_{2} + \hat_{\beta_{3}}x_{1}x_{2} + \epsilon$$

where $x_{1}$ and $ x_{2}$ are the values of the two features, respectively and $x_{1}x_{2}$ represents the interaction between the two. It can be useful to use scikit-learn’s `PolynomialFeatures`

to creative interaction terms for all combination of features. We can then use model selection strategies to identify the combination of features and interaction terms which produce the best model.

```
# Create interaction term (not polynomial features)
interaction = PolynomialFeatures(degree=3, include_bias=False, interaction_only=True)
X_inter = interaction.fit_transform(X)
```

## Fit Linear Regression

```
# Create linear regression
regr = LinearRegression()
# Fit the linear regression
model = regr.fit(X_inter, y)
```