Jun-24-2019, 10:36 AM
Hello
I am using nltk.NaiveBayesClassifier for opinion analysis and have run into a problem.
What I do:
1. Take lists of negative and positive words and shuffle them.
2. Use the NLTK movie_reviews corpus
from nltk.corpus import movie_reviews

docs = [(list(movie_reviews.words(fileid)), category)
        for category in movie_reviews.categories()
        for fileid in movie_reviews.fileids(category)]
3. Function to represent a text as a vector of features
def vector(doc):
    doc_words = set(doc)
    vect = {}
    for w in words:  # words = pos_words + neg_words
        vect[w] = (w in doc_words)
    return vect
4. Take all labelled reviews and represent them as vectors of features ( { vector : label } )
5. Train classifier
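Steps 3 and 4 can be sketched on toy data as follows. The word lists and the two tiny "reviews" are made-up stand-ins, since the real lists are not shown here; only the vector function follows the post. The result is a list of (feature dict, label) pairs, which is the shape nltk.NaiveBayesClassifier.train accepts.

```python
# Hypothetical toy data: the real pos_words, neg_words and corpus are not shown.
pos_words = ["good", "outstanding"]
neg_words = ["bad", "ludicrous"]
words = pos_words + neg_words

docs = [
    (["a", "good", "and", "outstanding", "film"], "pos"),
    (["a", "bad", "and", "ludicrous", "plot"], "neg"),
]


def vector(doc):
    """Map a tokenised document to {word: bool} over the fixed word list."""
    doc_words = set(doc)
    return {w: (w in doc_words) for w in words}


# Step 4: one (feature dict, label) pair per labelled review.
train_set = [(vector(doc), label) for doc, label in docs]

for features, label in train_set:
    print(label, features)
```

Every feature dict has the same keys (one per entry in words), with True or False as the value.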
>>> classifier.show_most_informative_features()
Most Informative Features
    astounding  = 1    pos : neg = 12.3 : 1.0
    outstanding = 1    pos : neg = 11.5 : 1.0
    ludicrous   = 1    neg : pos = 11.0 : 1.0
    fascination = 1    pos : neg = 11.0 : 1.0
    insulting   = 1    neg : pos = 11.0 : 1.0
    sucks       = 1    neg : pos = 10.6 : 1.0
    seamless    = 1    pos : neg = 10.3 : 1.0
    hatred      = 1    pos : neg = 10.3 : 1.0
    dread       = 1    pos : neg =  9.7 : 1.0
    accessible  = 1    pos : neg =  9.7 : 1.0
TEST:
sent1 = { 'good' : 1 }  # just one word "good"
>>> classifier.classify(sent1)
'neg'
Fail! What is wrong?
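One thing that may be worth checking (a sketch under assumptions, not a diagnosis): the hand-built test dict has a single key, while the training vectors have one key per entry in words. The toy word list below is hypothetical; vector is the function from step 3.

```python
# Hypothetical toy word list; the real pos_words + neg_words are not shown.
words = ["good", "outstanding", "bad", "ludicrous"]


def vector(doc):
    """Same feature function as in step 3."""
    doc_words = set(doc)
    return {w: (w in doc_words) for w in words}


hand_built = {"good": 1}       # the test input from the post
via_vector = vector(["good"])  # the same text, built like the training data

# Is the test word part of the feature word list at all?
in_vocabulary = "good" in words

# Keys present in a training-style vector but absent from the hand-built dict.
missing_keys = sorted(set(via_vector) - set(hand_built))

# In Python, True == 1, so the value 1 compares equal to the training value True.
same_value = via_vector["good"] == hand_built["good"]

print(in_vocabulary)
print(missing_keys)
print(same_value)
```

With the real classifier, classifier.prob_classify(sent1).prob('pos') returns the probability behind the label, which shows how close the decision was.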
