# Adding Interaction Terms

## Preliminaries

# Load libraries from sklearn.linear_model import LinearRegression from sklearn.datasets import load_boston from sklearn.preprocessing import PolynomialFeatures import warnings # Suppress Warning warnings.filterwarnings(action="ignore", module="scipy", message="^internal gelsd")

## Load Boston Housing Dataset

# Load the data with only two features boston = load_boston() X = boston.data[:,0:2] y = boston.target

## Add Interaction Term

Interaction effects can be account for by including a new feature comprising the product of corresponding values from the interacting features:

$$\hat y = \hat\beta_{0} + \hat\beta_{1}x_{1}+ \hat\beta_{2}x_{2} + \hat\beta_{3}x_{1}x_{2} + \epsilon$$

where \(x_{1}\) and \( x_{2}\) are the values of the two features, respectively and \(x_{1}x_{2}\) represents the interaction between the two. It can be useful to use scikit-learn's `PolynomialFeatures`

to creative interaction terms for all combination of features. We can then use model selection strategies to identify the combination of features and interaction terms which produce the best model.

# Create interaction term (not polynomial features) interaction = PolynomialFeatures(degree=3, include_bias=False, interaction_only=True) X_inter = interaction.fit_transform(X)

## Fit Linear Regression

# Create linear regression regr = LinearRegression() # Fit the linear regression model = regr.fit(X_inter, y)