# Mini-Batch k-Means Clustering

Mini-batch k-means works similarly to the k-means algorithm discussed in the last recipe. The key difference is that mini-batch k-means performs its most computationally costly step, assigning observations to their nearest cluster center and updating the centers, on only a random sample of observations rather than on all of them. This can significantly reduce the time the algorithm needs to converge (i.e., fit the data), at only a small cost in quality.
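To make the idea concrete, here is a rough NumPy sketch of a mini-batch update loop. It is only an illustration under simplifying assumptions (random initialization, a fixed number of iterations, no convergence check) and is not scikit-learn's actual implementation; the function name `mini_batch_kmeans_sketch` and its arguments are hypothetical.

```python
import numpy as np

def mini_batch_kmeans_sketch(X, n_clusters, batch_size, n_iter, seed=0):
    """Illustrative mini-batch k-means loop (hypothetical helper, not sklearn's code)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from randomly chosen observations
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)
    # Number of observations each center has absorbed so far
    counts = np.zeros(n_clusters)
    for _ in range(n_iter):
        # Draw a random batch instead of using every observation
        batch = X[rng.choice(len(X), size=batch_size, replace=False)]
        # Costly step: distances are computed for the batch only
        distances = np.linalg.norm(batch[:, None, :] - centers[None, :, :], axis=2)
        nearest = distances.argmin(axis=1)
        # Move each assigned center toward its observation with a
        # per-center learning rate that shrinks as the center sees more data
        for x, c in zip(batch, nearest):
            counts[c] += 1
            eta = 1.0 / counts[c]
            centers[c] = (1 - eta) * centers[c] + eta * x
    return centers

# Usage on synthetic data (made up for illustration)
X_demo = np.random.default_rng(1).normal(size=(500, 4))
centers_demo = mini_batch_kmeans_sketch(X_demo, n_clusters=3, batch_size=50, n_iter=20)
```

Each iteration touches only `batch_size` rows, which is where the time savings over standard k-means would come from.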

## Preliminaries

```
# Load libraries
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MiniBatchKMeans
```

## Load Iris Flower Dataset

```
# Load data
iris = datasets.load_iris()
X = iris.data
```

## Standardize Features

```
# Standardize features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
```
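As a quick sanity check, the standardized features should each have a mean of roughly 0 and a standard deviation of roughly 1. The snippet below is a small self-contained sketch of that check; the variable names `feature_means` and `feature_stds` are not part of the recipe.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

# Reload and standardize the data so this snippet runs on its own
X = datasets.load_iris().data
X_std = StandardScaler().fit_transform(X)

# Per-feature mean and (population) standard deviation after scaling
feature_means = X_std.mean(axis=0)
feature_stds = X_std.std(axis=0)

print(np.round(feature_means, 6))
print(np.round(feature_stds, 6))
```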

## Conduct Mini-Batch k-Means Clustering

`MiniBatchKMeans` works similarly to `KMeans`, with one significant difference: the `batch_size` parameter. `batch_size` controls the number of randomly selected observations in each batch. The larger the batch, the more computationally costly the training process.
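One way to get a feel for the trade-off is to fit the model with a few batch sizes and compare the resulting inertia (the within-cluster sum of squared distances, where lower means tighter clusters) against standard `KMeans`. This is only a sketch: the batch sizes tried and the explicit `n_init` values are assumptions chosen for illustration, and on a dataset as small as Iris the differences may be minor.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans, MiniBatchKMeans

# Load and standardize the data so this snippet runs on its own
X_std = StandardScaler().fit_transform(datasets.load_iris().data)

# Baseline: standard k-means uses every observation in each iteration
# (n_init is set explicitly here; this is an assumption, not from the recipe)
full_inertia = KMeans(n_clusters=3, random_state=0, n_init=10).fit(X_std).inertia_

# Mini-batch k-means with a few illustrative batch sizes
batch_inertias = {}
for size in (10, 50, 100):
    mbk = MiniBatchKMeans(n_clusters=3, random_state=0, batch_size=size, n_init=3)
    batch_inertias[size] = mbk.fit(X_std).inertia_

print("KMeans inertia:", full_inertia)
for size, inertia in batch_inertias.items():
    print("batch_size =", size, "inertia:", inertia)
```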

```
# Create mini-batch k-means object
clustering = MiniBatchKMeans(n_clusters=3, random_state=0, batch_size=100)
# Train model
model = clustering.fit(X_std)
```
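Once the model is fit, the cluster assignments and centers can be read from the `labels_` and `cluster_centers_` attributes, and `predict` assigns new observations to the nearest center. The snippet below is a minimal self-contained sketch; the new flower's measurements are made up, and `n_init=3` is set explicitly as an assumption rather than something taken from the recipe.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import MiniBatchKMeans

# Rebuild the fitted model so this snippet runs on its own
iris = datasets.load_iris()
scaler = StandardScaler()
X_std = scaler.fit_transform(iris.data)
clustering = MiniBatchKMeans(n_clusters=3, random_state=0, batch_size=100, n_init=3)
model = clustering.fit(X_std)

# Cluster assigned to each training observation
print(model.labels_[:10])

# Coordinates of the cluster centers (in standardized feature space)
print(model.cluster_centers_)

# Assign a new observation (hypothetical measurements) to a cluster;
# it must be scaled with the same scaler used for training
new_observation = [[5.0, 3.5, 1.5, 0.2]]
predicted_cluster = model.predict(scaler.transform(new_observation))
print(predicted_cluster)
```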