Loading Features From Dictionaries

Preliminaries

from sklearn.feature_extraction import DictVectorizer

Create A List Of Dictionaries

staff = [{'name': 'Steve Miller', 'age': 33.},
         {'name': 'Lyndon Jones', 'age': 12.},
         {'name': 'Baxter Morth', 'age': 18.}]

Convert Dictionary To Feature Matrix

# Create an object for our dictionary vectorizer
vec = DictVectorizer()
# Fit vec to the staff records and transform them, then convert the sparse result to a dense array
vec.fit_transform(staff).toarray()
array([[ 33.,   0.,   0.,   1.],
       [ 12.,   0.,   1.,   0.],
       [ 18.,   1.,   0.,   0.]])
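
By default, DictVectorizer returns a SciPy sparse matrix, which is why the call above ends with .toarray(). As a minimal sketch, assuming a dense NumPy array is wanted directly, the sparse=False option can be used instead. The names dense_vec and features are illustrative and not part of the original example.

```python
from sklearn.feature_extraction import DictVectorizer

staff = [{'name': 'Steve Miller', 'age': 33.},
         {'name': 'Lyndon Jones', 'age': 12.},
         {'name': 'Baxter Morth', 'age': 18.}]

# sparse=False makes fit_transform return a dense NumPy array,
# so no .toarray() call is needed
dense_vec = DictVectorizer(sparse=False)
features = dense_vec.fit_transform(staff)

# One row per record, one column per feature
print(features.shape)
```

Each string value becomes its own one-hot column (name=...), while the numeric age value is passed through unchanged, which gives three rows and four columns here.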

View Feature Names

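# Get feature names (get_feature_names() was removed in scikit-learn 1.2;
# get_feature_names_out() is its replacement and returns an array)
list(vec.get_feature_names_out())
['age', 'name=Baxter Morth', 'name=Lyndon Jones', 'name=Steve Miller']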
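
Once fitted, the same vectorizer can encode new records with transform. The sketch below assumes a hypothetical new staff member, 'Jane Doe', who was not in the fitted data. DictVectorizer silently ignores features it did not see during fitting, so every name column for this row stays at zero.

```python
from sklearn.feature_extraction import DictVectorizer

staff = [{'name': 'Steve Miller', 'age': 33.},
         {'name': 'Lyndon Jones', 'age': 12.},
         {'name': 'Baxter Morth', 'age': 18.}]

# Fit on the original records
vec = DictVectorizer(sparse=False)
vec.fit(staff)

# 'Jane Doe' is a hypothetical record that was not seen during fit
new_staff = [{'name': 'Jane Doe', 'age': 40.}]

# Columns keep the fitted order: age first, then the three known names
new_features = vec.transform(new_staff)
print(new_features)
```

Because unseen categories are dropped rather than raising an error, it can be worth checking new data for unexpected values before transforming it.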