One-Hot Encode Nominal Categorical Features

Preliminaries

# Load libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

Create Data With One Nominal Feature

# Create NumPy array
x = np.array([['Texas'], 
              ['California'], 
              ['Texas'], 
              ['Delaware'], 
              ['Texas']])

One-Hot Encode Data (Method 1)

# Create OneHotEncoder object
one_hot = OneHotEncoder()

# One-hot encode data
one_hot.fit_transform(x)
<5x3 sparse matrix of type '<class 'numpy.float64'>'
    with 5 stored elements in Compressed Sparse Row format>
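The encoder returns a SciPy sparse matrix, so the encoded values are not visible in the output above. As a minimal sketch (the variable names `encoded` and `dense` are illustrative, not part of the original), the matrix can be converted to a dense NumPy array for inspection:

```python
# Load libraries
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Same data as above: one nominal feature with three categories
x = np.array([['Texas'],
              ['California'],
              ['Texas'],
              ['Delaware'],
              ['Texas']])

# Fit the encoder and transform the data (result is a sparse matrix)
one_hot = OneHotEncoder()
encoded = one_hot.fit_transform(x)

# Convert the sparse matrix to a dense array to view the 0/1 values
dense = encoded.toarray()
print(dense)
```

Converting to dense is fine for small data like this; for features with many categories the sparse form is usually kept to save memory.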

View Column Headers

# View the categories learned for each feature (the column order of the encoded matrix)
one_hot.categories_
[array(['California', 'Delaware', 'Texas'], dtype='<U10')]
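Since `categories_` holds one array per input feature, its first entry can serve as column headers for the encoded matrix. The following is a sketch under that assumption, with `columns` and `df` as illustrative names:

```python
# Load libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Same data as above
x = np.array([['Texas'],
              ['California'],
              ['Texas'],
              ['Delaware'],
              ['Texas']])

# Fit the encoder and transform the data
one_hot = OneHotEncoder()
encoded = one_hot.fit_transform(x)

# categories_ has one array per feature; we have a single feature
columns = one_hot.categories_[0]

# Label the dense encoded values with the category names
df = pd.DataFrame(encoded.toarray(), columns=columns)
print(df)
```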

One-Hot Encode Data (Method 2)

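# Create dummy variables from the feature column
# (dtype=int gives 0/1 values; newer pandas versions return booleans by default)
pd.get_dummies(x[:,0], dtype=int)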

   California  Delaware  Texas
0           0         0      1
1           1         0      0
2           0         0      1
3           0         1      0
4           0         0      1
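Both methods should yield the same encoding for this data. A quick check, offered as a sketch rather than part of the original walkthrough (the names `dummies`, `same_values`, and `same_columns` are illustrative):

```python
# Load libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Same data as above
x = np.array([['Texas'],
              ['California'],
              ['Texas'],
              ['Delaware'],
              ['Texas']])

# Method 1: scikit-learn encoder, converted to a dense array
one_hot = OneHotEncoder()
encoded = one_hot.fit_transform(x).toarray()

# Method 2: pandas dummy variables as integers
dummies = pd.get_dummies(x[:, 0], dtype=int)

# Compare the encoded values and the column order
same_values = np.array_equal(dummies.to_numpy(), encoded)
same_columns = list(dummies.columns) == list(one_hot.categories_[0])
print(same_values, same_columns)
```

One practical difference: a fitted `OneHotEncoder` can be reused to transform new data with the same columns, whereas `pd.get_dummies` derives its columns from whatever values are present in each call.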