Variance Thresholding For Feature Selection

Preliminaries

from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold

Load Data

# Load iris data
iris = datasets.load_iris()

# Create features and target
X = iris.data
y = iris.target

Conduct Variance Thresholding

# Create a VarianceThreshold object with a variance threshold of 0.5
thresholder = VarianceThreshold(threshold=.5)

# Conduct variance thresholding
X_high_variance = thresholder.fit_transform(X)
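
To see why a given feature was kept or dropped, it can help to inspect the fitted thresholder. The following is a minimal sketch that refits the same thresholder on the iris data and prints each feature's variance next to the selection mask. It relies on the `variances_` attribute and the `get_support()` method of `VarianceThreshold`; the names `variances` and `kept_mask` are introduced here for illustration only.

```python
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold

# Load iris data
iris = datasets.load_iris()
X = iris.data

# Fit the thresholder with the same threshold used above
thresholder = VarianceThreshold(threshold=0.5)
thresholder.fit(X)

# Per-feature variances computed during fitting
variances = thresholder.variances_

# Boolean mask: True for each feature that passes the threshold
kept_mask = thresholder.get_support()

# Show each feature's variance and whether it was kept
for name, variance, kept in zip(iris.feature_names, variances, kept_mask):
    print(f"{name}: variance={variance:.3f}, kept={kept}")
```

On the iris data, sepal width is the only feature whose variance falls below 0.5, so three of the four columns remain.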

View High Variance Features

# View the first five rows of the features whose variance exceeds the threshold
X_high_variance[0:5]
array([[ 5.1,  1.4,  0.2],
       [ 4.9,  1.4,  0.2],
       [ 4.7,  1.3,  0.2],
       [ 4.6,  1.5,  0.2],
       [ 5. ,  1.4,  0.2]])
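
The selection itself is simple enough to reproduce by hand. The sketch below assumes the thresholder compares each column's population variance (NumPy's default, `ddof=0`) against the threshold and keeps columns strictly above it; `column_variances` and `X_manual` are illustrative names, not part of scikit-learn.

```python
import numpy as np
from sklearn import datasets
from sklearn.feature_selection import VarianceThreshold

# Load iris features
X = datasets.load_iris().data

# Population variance (ddof=0) of each column
column_variances = X.var(axis=0)

# Keep only the columns whose variance is greater than 0.5
X_manual = X[:, column_variances > 0.5]

# Compare against scikit-learn's result
X_sklearn = VarianceThreshold(threshold=0.5).fit_transform(X)
print(np.array_equal(X_manual, X_sklearn))
```

Because variance depends on each feature's units, the threshold is only meaningful when the features are on comparable scales.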