Lasso Regression In Scikit-Learn

Often we want to conduct a process called regularization, wherein we penalize the size of a model's coefficients in order to keep only the most important features. This can be particularly important when you have a dataset with 100,000+ features.

Lasso regression is a common modeling technique for regularization. The math behind it is pretty interesting, but practically, what you need to know is that Lasso regression comes with a parameter, alpha, and the higher the alpha, the more feature coefficients are zero.

That is, when alpha is 0, Lasso regression produces the same coefficients as an ordinary linear regression. When alpha is very large, all coefficients are zero.
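The two extremes described above can be checked with a small sketch. Everything here is an assumption for illustration: the data is synthetic (from `make_regression`), and the names `X_demo`, `y_demo`, and `alpha_max` do not appear elsewhere in this tutorial. The sketch relies on scikit-learn's Lasso objective, `(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1`, under which every coefficient is zero once alpha reaches `max|X^T y| / n_samples` for centered data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration, not the tutorial's dataset)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=4, noise=5.0, random_state=0
)
X_demo = StandardScaler().fit_transform(X_demo)

# Ordinary least squares serves as the "alpha = 0" reference
ols_coef = LinearRegression().fit(X_demo, y_demo).coef_

# A tiny alpha (with a tight tolerance) should land very close to least squares
small_alpha_coef = Lasso(alpha=1e-6, tol=1e-10, max_iter=100_000).fit(X_demo, y_demo).coef_

# Smallest alpha at which all coefficients are exactly zero,
# valid here because X_demo is centered and the intercept is fitted
alpha_max = np.max(np.abs(X_demo.T @ (y_demo - y_demo.mean()))) / len(y_demo)

# Any alpha above alpha_max zeroes out every coefficient
large_alpha_coef = Lasso(alpha=2 * alpha_max).fit(X_demo, y_demo).coef_

# Largest gap between the tiny-alpha lasso and least squares coefficients
print(np.max(np.abs(small_alpha_coef - ols_coef)))

# Number of non-zero coefficients left at the large alpha
print(np.count_nonzero(large_alpha_coef))
```

Passing `alpha=0` to `Lasso` directly is discouraged by scikit-learn, which is why the sketch uses a very small positive value instead.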

In this tutorial, I run three lasso regressions, with varying levels of alpha, and show the resulting effect on the coefficients.

Preliminaries

from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an earlier version
from sklearn.datasets import load_boston
import pandas as pd

Load Data

boston = load_boston()
scaler = StandardScaler()
X = scaler.fit_transform(boston["data"])
Y = boston["target"]
names = boston["feature_names"]
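On scikit-learn 1.2 or later, `load_boston` no longer exists, so the loading step above will fail. As a hedged alternative, the same preparation can be sketched with `load_diabetes`, a different regression dataset that ships with scikit-learn. This is a stand-in chosen for illustration, not the dataset used in this tutorial, so the coefficients it produces will not match the table shown later.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (an assumption): bundled with scikit-learn, no download needed
diabetes = load_diabetes()

# Standardize the features so the penalty treats every coefficient on the same scale
scaler = StandardScaler()
X = scaler.fit_transform(diabetes["data"])
Y = diabetes["target"]
names = diabetes["feature_names"]

# Confirm the shape and that each column now has mean 0 and standard deviation 1
print(X.shape)
print(np.round(X.mean(axis=0), 6))
print(np.round(X.std(axis=0), 6))
```

Because the variables keep the names `X`, `Y`, and `names`, the rest of the tutorial's code can run unchanged on this stand-in data.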

Run Three Lasso Regressions, Varying Alpha Levels

# Create a function called lasso,
def lasso(alphas):
    '''
    Takes in a list of alphas. Outputs a dataframe containing the coefficients of lasso regressions from each alpha.
    '''
    # Create an empty data frame
    df = pd.DataFrame()

    # Create a column of feature names
    df['Feature Name'] = names

    # For each alpha value in the list of alpha values,
    for alpha in alphas:
        # Create a lasso regression with that alpha value
        # (named model so it does not shadow the function lasso)
        model = Lasso(alpha=alpha)

        # Fit the lasso regression
        model.fit(X, Y)

        # Create a column name for that alpha value
        column_name = 'Alpha = %f' % alpha

        # Create a column of coefficient values
        df[column_name] = model.coef_

    # Return the dataframe
    return df

# Run the function called lasso
lasso([.0001, .5, 10])

    Feature Name  Alpha = 0.000100  Alpha = 0.500000  Alpha = 10.000000
 0          CRIM         -0.920130         -0.106977               -0.0
 1            ZN          1.080498          0.000000                0.0
 2         INDUS          0.142027         -0.000000               -0.0
 3          CHAS          0.682235          0.397399                0.0
 4           NOX         -2.059250         -0.000000               -0.0
 5            RM          2.670814          2.973323                0.0
 6           AGE          0.020680         -0.000000               -0.0
 7           DIS         -3.104070         -0.169378                0.0
 8           RAD          2.656950         -0.000000               -0.0
 9           TAX         -2.074110         -0.000000               -0.0
10       PTRATIO         -2.061921         -1.599574               -0.0
11             B          0.856553          0.545715                0.0
12         LSTAT         -3.748470         -3.668884               -0.0

Notice that as the alpha value increases, more features have a coefficient of 0.
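To make that observation concrete, here is a small sketch that counts the zero coefficients in each alpha column. The values are copied by hand from the output table above rather than recomputed, and the names `coefs` and `zero_counts` are introduced only for this illustration.

```python
import pandas as pd

# Coefficients copied from the output table above (not recomputed here)
coefs = pd.DataFrame({
    "Feature Name": ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                     "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"],
    "Alpha = 0.000100": [-0.920130, 1.080498, 0.142027, 0.682235, -2.059250,
                         2.670814, 0.020680, -3.104070, 2.656950, -2.074110,
                         -2.061921, 0.856553, -3.748470],
    "Alpha = 0.500000": [-0.106977, 0.0, -0.0, 0.397399, -0.0, 2.973323, -0.0,
                         -0.169378, -0.0, -0.0, -1.599574, 0.545715, -3.668884],
    "Alpha = 10.000000": [-0.0, 0.0, -0.0, 0.0, -0.0, 0.0, -0.0,
                          0.0, -0.0, -0.0, -0.0, 0.0, -0.0],
})

# Count how many coefficients are exactly zero in each alpha column
# (-0.0 compares equal to 0, so the sign of a zero does not matter)
zero_counts = (coefs.drop(columns="Feature Name") == 0).sum()
print(zero_counts)
```

For the table above this gives 0, 6, and 13 zero coefficients for alpha values of 0.0001, 0.5, and 10 respectively.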