Sep-08-2021, 02:12 AM
I am completely new to RandomForest and Machine Learning. Some help will be appreciated! Thank you!
Example of DataSet
**ID | sentiment | review | source |**
'5' | 0 | lousy movie | twitter |
'6' | 1 | excellent acting | website |
'7' | 0 | bad script, but wonderful actors | feedback |

I create a bag-of-words (BOW) from the review column:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
file_location = 'C:/Desktop/test.xlsx'
xlsx=pd.ExcelFile(file_location, engine='openpyxl')
df=xlsx.parse('Sheet1',header=0)
bow=df['review']
Y_train=df['sentiment']
vect = CountVectorizer()
bow = vect.fit_transform(bow)

I created another df and added both the BOW and source as columns.
Is this correct? Can I add the sparse matrix of bow into a df?

df1 = pd.DataFrame(bow)
df1['source']=df['source']
X_train=df1.values
print(X_train)

Output of print(X_train):
[[<1x16 sparse matrix of type '<class 'numpy.int64'>'
with 6 stored elements in Compressed Sparse Row format>
'twitter']
[<1x16 sparse matrix of type '<class 'numpy.int64'>'
with 5 stored elements in Compressed Sparse Row format>
'website']
[<1x16 sparse matrix of type '<class 'numpy.int64'>'
with 2 stored elements in Compressed Sparse Row format>
'feedback']

Train the RandomForest Model:

forest = RandomForestClassifier(n_estimators = 100)
forest = forest.fit( X_train, Y_train)

Error:
ValueError: setting an array element with a sequence
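
The printed X_train suggests what goes wrong: each cell of the first column holds a whole 1x16 sparse row as a single object instead of 16 numeric values, and source is still a plain string, so the array cannot be converted to numbers. One possible way around this, as a minimal sketch and not necessarily the best approach, is to skip the DataFrame, one-hot encode source, and stack the two sparse blocks side by side with scipy.sparse.hstack. The inline DataFrame below is a hypothetical stand-in for the Excel sheet, built from the three example rows (so its vocabulary is smaller than the 16 columns in the real output), and random_state=0 is an assumption added only for repeatability.

```python
import pandas as pd
from scipy.sparse import hstack
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for 'Sheet1': the three example rows from the post.
df = pd.DataFrame({
    'ID': ['5', '6', '7'],
    'sentiment': [0, 1, 0],
    'review': ['lousy movie', 'excellent acting',
               'bad script, but wonderful actors'],
    'source': ['twitter', 'website', 'feedback'],
})
Y_train = df['sentiment']

# Bag-of-words for the review text: a sparse matrix, one row per review.
vect = CountVectorizer()
bow = vect.fit_transform(df['review'])

# One-hot encode the categorical source column so it becomes numeric.
# The double brackets pass a 2-D frame, which OneHotEncoder expects.
enc = OneHotEncoder(handle_unknown='ignore')
source_encoded = enc.fit_transform(df[['source']])

# Stack the two feature blocks column-wise; the result stays sparse.
X_train = hstack([bow, source_encoded]).tocsr()

# random_state is an assumption, added only to make runs repeatable.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest = forest.fit(X_train, Y_train)
print(X_train.shape)
```

If a DataFrame is still wanted for the BOW, pandas has pd.DataFrame.sparse.from_spmatrix, which expands a SciPy sparse matrix into one sparse column per feature; the source column would still need encoding before fitting.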
