One-Hot Encode Features With Multiple Labels

Preliminaries

# Load libraries
from sklearn.preprocessing import MultiLabelBinarizer
import numpy as np

Create Data

# Create a list of tuples, one tuple of labels per observation
y = [('Texas', 'Florida'),
    ('California', 'Alabama'),
    ('Texas', 'Florida'),
    ('Delaware', 'Florida'),
    ('Texas', 'Alabama')]

One-Hot Encode Data

# Create MultiLabelBinarizer object
one_hot = MultiLabelBinarizer()

# One-hot encode data (one row per observation, one column per label)
one_hot.fit_transform(y)
array([[0, 0, 0, 1, 1],
       [1, 1, 0, 0, 0],
       [0, 0, 0, 1, 1],
       [0, 0, 1, 1, 0],
       [1, 0, 0, 0, 1]])

View Column Headers

# View classes (sorted; column i of the encoded matrix corresponds to classes_[i])
one_hot.classes_
array(['Alabama', 'California', 'Delaware', 'Florida', 'Texas'], dtype=object)
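
Read Encoded Rows Back As Labels

The column order given by classes_ is what makes the 0/1 matrix readable. As a minimal sketch, the snippet below pairs each row with its active labels and then uses inverse_transform to recover the label tuples. The `samples` list and the `encoded`, `active`, and `decoded` names are illustrative assumptions, not part of the example above.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Small illustrative sample (an assumption, not the data above):
# each tuple holds the labels of one observation
samples = [('Texas', 'Florida'),
           ('California', 'Alabama'),
           ('Texas', 'Alabama')]

# Fit the binarizer and encode the sample
one_hot = MultiLabelBinarizer()
encoded = one_hot.fit_transform(samples)

# Column i of the matrix corresponds to classes_[i],
# so zipping a row with classes_ recovers its active labels
for row in encoded:
    active = [label for label, flag in zip(one_hot.classes_, row) if flag]
    print(row, active)

# inverse_transform converts indicator rows back into tuples of labels
decoded = one_hot.inverse_transform(encoded)
print(decoded)
```

The recovered tuples list labels in the order of classes_, not in the order of the original input, so a round trip preserves which labels are present but not how they were written.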