Gaussian Naive Bayes Classifier

Gaussian naive Bayes assumes that, within each class, every feature follows a normal distribution. It is therefore best suited to data in which all features are continuous.
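To make the normality assumption concrete, here is a minimal sketch that scores one observation by hand: for each class it adds the log prior to the sum of per-feature normal log densities, then normalizes. The helper `gaussian_log_density` and the variable names are illustrative assumptions, not part of scikit-learn. The sketch uses class frequencies as priors and skips the small variance smoothing that `GaussianNB` applies, so treat it as an approximation of what the library does.

```python
import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X, y = iris.data, iris.target

def gaussian_log_density(x, mean, var):
    """Log of the normal density, evaluated feature by feature (hypothetical helper)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

x_new = np.array([4.0, 4.0, 4.0, 0.4])
classes = np.unique(y)

# Unnormalized log posterior per class:
# log prior (class frequency) + sum of per-feature log densities
log_posterior = np.array([
    np.log(np.mean(y == c))
    + gaussian_log_density(x_new, X[y == c].mean(axis=0), X[y == c].var(axis=0)).sum()
    for c in classes
])

# Normalize in a numerically stable way so the values sum to one
posterior = np.exp(log_posterior - log_posterior.max())
posterior /= posterior.sum()
manual_prediction = classes[np.argmax(posterior)]

# Compare against scikit-learn with its default (frequency-based) priors
sklearn_prediction = GaussianNB().fit(X, y).predict([x_new])[0]
print(manual_prediction, sklearn_prediction)
```

Summing log densities across features is where the "naive" independence assumption enters: each feature contributes separately, with no covariance terms.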

Preliminaries

# Load libraries
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

Load Iris Flower Dataset

# Load data
iris = datasets.load_iris()
X = iris.data
y = iris.target

Train Gaussian Naive Bayes Classifier

# Create Gaussian Naive Bayes object with prior probabilities of each class
clf = GaussianNB(priors=[0.25, 0.25, 0.5])

# Train model
model = clf.fit(X, y)
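After fitting, the learned parameters can be inspected. The short sketch below refits the same model in a self-contained way and looks at two fitted attributes, `class_prior_` and `theta_`. It assumes the three-class Iris data used here. Note that the `priors` list needs one entry per class and the entries must sum to 1.

```python
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
model = GaussianNB(priors=[0.25, 0.25, 0.5]).fit(iris.data, iris.target)

# The priors are stored as supplied rather than estimated from class frequencies
print(model.class_prior_)

# Per-class feature means: one row per class, one column per feature
print(model.theta_.shape)
```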

Create Previously Unseen Observation

# Create new observation
new_observation = [[4, 4, 4, 0.4]]

Predict Class

# Predict class
model.predict(new_observation)
array([1])

Note: the raw predicted probabilities from Gaussian naive Bayes (returned by predict_proba) are not calibrated, so they should not be read as reliable confidence estimates. To obtain useful predicted probabilities, calibrate them with isotonic regression or a related method.
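One way to do this in scikit-learn is to wrap the classifier in `CalibratedClassifierCV` with `method="isotonic"`. The sketch below is one possible setup, not a recommendation: the choice of five folds is an assumption, and the custom priors are left out to keep the example small. Isotonic regression generally wants a fair amount of data, so results on a dataset as small as Iris should be taken as illustrative only.

```python
from sklearn import datasets
from sklearn.calibration import CalibratedClassifierCV
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Wrap the classifier; the isotonic calibrator is fit on held-out folds
calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5)
calibrated.fit(X, y)

# Calibrated class probabilities for the new observation
new_observation = [[4, 4, 4, 0.4]]
calibrated_probabilities = calibrated.predict_proba(new_observation)
print(calibrated_probabilities)
```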