Jun-10-2022, 07:59 PM
(This post was last modified: Jun-10-2022, 08:12 PM by Led_Zeppelin.)
When I try to run the following code, I get an error.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
import time
from tqdm import tqdm
from scipy import stats
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import RFE
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import f1_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve, auc
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from xgboost import XGBClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
df = pd.read_csv("sensor.csv")
print('here')
df.head()
# Find Duplicate Values
# Results will be the list of duplicate values
# If no duplicate values, nothing will list.
df[df['timestamp'].duplicated(keep=False)]
df.isnull().sum()
df['machine_status'].value_counts()
# Convert timestamp column into data type into datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])
# Create a Series
time_period = pd.Series([])
# Assign values to series
for i in tqdm(range(df.shape[0])):
    if (df["timestamp"][i].hour >= 4) and (df[timestamp][i].hour < 10):
        time_period[i]="Morning"
    elif (df["timestamp"][i].hour >= 10) and (df[timestamp][i].hour < 16):
        time_period[i]="Noon"
    elif (df["timestamp"][i].hour >= 16) and (df[timestamp][i].hour < 22):
        time_period[i]="Evening"
    else:
        time_period[i]="Night"
# Insert new column time period
df.Insert(2, 'time_period', time_period)
I get an error. The error is:
Error:
C:\Users\james\AppData\Local\Temp\ipykernel_24076\1118037779.py:50: FutureWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
time_period = pd.Series([])
0%| | 240/220320 [00:00<01:59, 1848.32it/s]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Input In [1], in <cell line: 53>()
52 # Assign values to series
53 for i in tqdm(range(df.shape[0])):
---> 54 if (df["timestamp"][i].hour >= 4) and (df[timestamp][i].hour < 10):
55 time_period[i]="Morning"
56 elif (df["timestamp"][i].hour >= 10) and (df[timestamp][i].hour < 16):
NameError: name 'timestamp' is not defined
Now it says timestamp is not defined, but I think it is. This is not my code but somebody else's. I am not sure how to correct it; I believe it has something to do with these lines:
# Convert timestamp column into data type into datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])
How can I fix it?
Any help appreciated.
Respectfully,
LZ
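
A possible fix, sketched on a small made-up frame because sensor.csv is not available here. The NameError most likely comes from df[timestamp] on the right-hand side of each comparison: without quotes, Python looks for a variable named timestamp instead of the column 'timestamp'. Two other things in the posted code would probably fail or warn next: df.Insert should be the lowercase df.insert, and pd.Series([]) triggers the FutureWarning shown above. The sample rows, the sensor_00 column, and the period_of_day helper below are hypothetical names used only for illustration.

```python
import pandas as pd

# Synthetic stand-in for sensor.csv (assumption: the real file has a
# 'timestamp' column; 'sensor_00' is a hypothetical second column).
df = pd.DataFrame({
    "timestamp": ["2018-04-01 03:00:00", "2018-04-01 05:00:00",
                  "2018-04-01 12:00:00", "2018-04-01 18:00:00",
                  "2018-04-01 23:00:00"],
    "sensor_00": [2.4, 2.5, 2.4, 2.6, 2.5],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])


def period_of_day(hour):
    """Map an hour (0-23) to the same four labels as the original loop."""
    if 4 <= hour < 10:
        return "Morning"
    elif 10 <= hour < 16:
        return "Noon"
    elif 16 <= hour < 22:
        return "Evening"
    return "Night"


# The column name is quoted, and .dt.hour works on the whole column at once,
# so no empty Series and no row-by-row loop are needed.
time_period = df["timestamp"].dt.hour.map(period_of_day)

# DataFrame.insert is lowercase; 'Insert' would raise AttributeError.
df.insert(2, "time_period", time_period)
print(df)
```

If the original loop has to stay, quoting the column name in every comparison (df["timestamp"][i].hour on both sides) and creating the series with pd.Series([], dtype="object") should be enough to get past the NameError and the warning.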
