Want to learn machine learning? Use my machine learning flashcards.

# Logistic Regression On Very Large Data

scikit-learn’s `LogisticRegression`

offers a number of techniques for training a logistic regression, called solvers. Most of the time scikit-learn will select the best solver automatically for us or warn us that you cannot do some thing with that solver. However, there is one particular case we should be aware of.

While an exact explanation is beyond the bounds of this book, stochastic average gradient descent allows us to train a model much faster than other solvers when our data is very large. However, it is also very sensitive to feature scaling to standardizing our features is particularly important. We can set our learning algorithm to use this solver by setting `solver='sag'`

.

## Preliminaries

```
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
```

## Load Iris Flower Data

```
# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target
```

## Standardize Features

```
# Standarize features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
```

## Train Logistic Regression Using SAG solver

```
# Create logistic regression object using sag solver
clf = LogisticRegression(random_state=0, solver='sag')
# Train model
model = clf.fit(X_std, y)
```