ashishraste / Training.py
Created February 18, 2012 12:29 — forked from rohitdholakia/Training.py
This is for training the Bayesian model
# Training: calculate all word probabilities and store them in a vector.
# Saving the vector to a file makes later access easier.
from __future__ import division
import os
import sys
'''
1. Spam and non-spam each make up 50% of the data, so both priors default to 0.5.
2. Next, calculate the probability of each word in spam and in non-spam separately.
2.1 Keep two dictionaries (defaultdicts), one for spam and one for non-spam.
2.2 When it is time to calculate probabilities, just substitute the values.
'''
from collections import *
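
# --- Illustrative sketch (not part of the original gist) ---
# A minimal sketch of steps 2.1 and 2.2 above, assuming each message has
# already been tokenized into a list of words. The names sketch_count_words
# and sketch_word_probability are hypothetical. The add-one (Laplace)
# smoothing is also an assumption, added so that unseen words do not get a
# probability of zero; the notes above do not specify any smoothing.
from collections import defaultdict


def sketch_count_words(messages):
    """Return (counts, total) for an iterable of tokenized messages."""
    counts = defaultdict(int)  # word -> number of occurrences in this class
    total = 0                  # total number of word occurrences seen
    for words in messages:
        for word in words:
            counts[word] += 1
            total += 1
    return counts, total


def sketch_word_probability(word, counts, total, vocabulary_size):
    """Estimate P(word | class) with add-one smoothing (an assumption)."""
    # counts.get avoids creating a new key in the defaultdict on lookup.
    return (counts.get(word, 0) + 1) / float(total + vocabulary_size)

# Usage idea: build one (counts, total) pair from the spam messages and one
# from the non-spam messages, then look up each word in both.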