Theory
The Naive Bayes assumption states that each attribute is conditionally independent of the others given the class: P(x_1, ..., x_n | c) = P(x_1 | c) · P(x_2 | c) · ... · P(x_n | c).
Naive Bayes uses a generative model that describes how the training data is generated: to generate a document, first sample a class c from P(c), then sample each word independently from P(w | c).
Specifically, we have a prior probability P(c) for each class c, and a conditional probability P(w | c) for each word w given each class c.
We can estimate these probabilities using frequencies in the data. To avoid probabilities being zero for words that never appear with a class, we apply add-one (Laplace) smoothing.
To predict the class of a given document, we return the class with the highest posterior probability under the model.
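The generative story above can be sketched as follows. The classes, words, and probability values here are made-up toy numbers for illustration, not part of the model described in this document:

```python
import random

# Hypothetical toy parameters (illustrative only).
priors = {"spam": 0.4, "ham": 0.6}                      # P(c)
likelihoods = {                                          # P(w | c)
    "spam": {"buy": 0.5, "now": 0.3, "hello": 0.2},
    "ham":  {"buy": 0.1, "now": 0.2, "hello": 0.7},
}

def generate_document(length, rng=random):
    """Sample a class from P(c), then sample each word independently from P(w | c)."""
    classes = list(priors)
    c = rng.choices(classes, weights=[priors[k] for k in classes])[0]
    words = list(likelihoods[c])
    weights = [likelihoods[c][w] for w in words]
    doc = [rng.choices(words, weights=weights)[0] for _ in range(length)]
    return c, doc
```

Classification simply runs this story in reverse: given the words, infer which class most plausibly generated them.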
Model
Our model consists of a prior probability P(c) for each class c, and a conditional probability P(w | c) for each word w in the vocabulary V given each class c.
Training
Given training data of N labeled documents:
For each label c, estimate the prior P(c) = N_c / N, where N_c is the number of documents with label c.
For each word w and label c, estimate P(w | c) = (count(w, c) + 1) / (count(c) + |V|), where count(w, c) is the number of occurrences of w in documents labeled c, count(c) is the total number of word occurrences in those documents, and |V| is the vocabulary size. The +1 and +|V| terms are the Laplace smoothing.
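The two estimation steps can be sketched in Python. The function name `train` and the representation of documents as lists of tokens are our assumptions, not from the text:

```python
from collections import Counter

def train(documents, labels):
    """Estimate P(c) and P(w | c) with add-one (Laplace) smoothing.

    documents: list of token lists; labels: parallel list of class labels.
    Returns (priors, likelihoods, vocabulary).
    """
    n = len(labels)
    class_counts = Counter(labels)
    # Prior: P(c) = N_c / N
    priors = {c: class_counts[c] / n for c in class_counts}

    vocab = {w for doc in documents for w in doc}
    word_counts = {c: Counter() for c in class_counts}
    for doc, c in zip(documents, labels):
        word_counts[c].update(doc)

    # Likelihood with Laplace smoothing:
    # P(w | c) = (count(w, c) + 1) / (count(c) + |V|)
    likelihoods = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        likelihoods[c] = {
            w: (word_counts[c][w] + 1) / (total + len(vocab)) for w in vocab
        }
    return priors, likelihoods, vocab
```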
Prediction
Given a document d with words w_1, ..., w_n:
Return the class ĉ = argmax_c P(c) · P(w_1 | c) · ... · P(w_n | c).
During implementation, we instead maximize the logarithm of this expression to avoid floating-point underflow: ĉ = argmax_c [log P(c) + log P(w_1 | c) + ... + log P(w_n | c)].
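A minimal log-space implementation of the prediction rule, assuming `priors` maps class → P(c) and `likelihoods` maps class → {word: P(w | c)}; skipping out-of-vocabulary words is one common convention, our choice here:

```python
import math

def predict(document, priors, likelihoods):
    """Return argmax_c [log P(c) + sum_i log P(w_i | c)]."""
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for w in document:
            if w in likelihoods[c]:   # ignore words unseen in training
                score += math.log(likelihoods[c][w])
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```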
Info
For document classification, the prediction step assumes there is no information in the words we did not observe. A standard probabilistic model would, for each word in the vocabulary, also include the probability of that word not being observed, instead of scoring only the words that appear in the input document.
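For contrast, a Bernoulli-style score for one class includes a term for every vocabulary word: present words contribute log P(w | c) and absent words contribute log(1 − P(w | c)). This sketch is illustrative; the parameter layout is our assumption:

```python
import math

def bernoulli_score(document, prior, word_probs):
    """Score one class where every vocabulary word contributes:
    present words via log P(w | c), absent words via log(1 - P(w | c)).
    word_probs maps each vocabulary word to P(w present | c)."""
    present = set(document)
    score = math.log(prior)
    for w, p in word_probs.items():
        score += math.log(p) if w in present else math.log(1.0 - p)
    return score
```

Under this model, not seeing a word is itself evidence, which the word-presence-only rule above deliberately ignores.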