# Logistic Regression On Very Large Data

scikit-learn's `LogisticRegression`

offers a number of techniques for training a logistic regression, called solvers. Most of the time scikit-learn will select the best solver automatically for us or warn us that you cannot do some thing with that solver. However, there is one particular case we should be aware of.

While an exact explanation is beyond the bounds of this book, stochastic average gradient descent allows us to train a model much faster than other solvers when our data is very large. However, it is also very sensitive to feature scaling to standardizing our features is particularly important. We can set our learning algorithm to use this solver by setting `solver='sag'`

.

## Preliminaries

# Load libraries from sklearn.linear_model import LogisticRegression from sklearn import datasets from sklearn.preprocessing import StandardScaler

## Load Iris Flower Data

# Load data iris = datasets.load_iris() X = iris.data y = iris.target

## Standardize Features

# Standarize features scaler = StandardScaler() X_std = scaler.fit_transform(X)

## Train Logistic Regression Using SAG solver

# Create logistic regression object using sag solver clf = LogisticRegression(random_state=0, solver='sag') # Train model model = clf.fit(X_std, y)