ValueError: Found input variables with inconsistent numbers of samples: [0, 3]

ayaz786amd · Nov-26-2018, 08:00 AM

Hi

import pandas as pd
import quandl, math
import numpy as np
import sys
from sklearn import preprocessing, model_selection, svm   # model_selection replaces cross_validation
from sklearn.linear_model import LinearRegression
sys.path.insert(0, r'C:\Program Files\Scripts')   # raw string so the backslashes are taken literally
df = quandl.get('FINRA/FORF_TLLTD')

df['PCT']= df['ShortVolume']/df['TotalVolume']*100
df = df[['ShortVolume','TotalVolume', 'PCT']]
forecast_col = 'ShortVolume'
df.fillna(-99999, inplace= True)
forecast_out = int(math.ceil(0.01*len(df)))
df['label'] = df[forecast_col].shift(-forecast_out)
df.dropna(inplace=True)

X = np.array(df.drop(['label'], axis=1))
y = np.array(df['label'])
X= preprocessing.scale(X)
X= X[:-forecast_out+1]
y = np.array(df['label'])
X_train,X_test,y_train,y_test= model_selection.train_test_split(X, y, test_size=0.2)
clf = LinearRegression()
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(accuracy)

Error:
Traceback (most recent call last):
  File "C:/Program Files/PycharmProjects/code examples/code practice.py", line 39, in <module>
    X_train,X_test,y_train,y_test= model_selection.train_test_split(X, y, test_size=0.2)
  File "C:\Users\Ayaz\AppData\Roaming\Python\Python37\site-packages\sklearn\model_selection\_split.py", line 2184, in train_test_split
    arrays = indexable(*arrays)
  File "C:\Users\Ayaz\AppData\Roaming\Python\Python37\site-packages\sklearn\utils\validation.py", line 260, in indexable
    check_consistent_length(*result)
  File "C:\Users\Ayaz\AppData\Roaming\Python\Python37\site-packages\sklearn\utils\validation.py", line 235, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [0, 3]

**Larz60+** · Nov-26-2018, 01:25 PM

The error occurs on line 39, but you are only showing 27 lines of code.

ayaz786amd · Nov-27-2018, 07:12 AM

I have some notes in the lines above that are not relevant to this code.
The code works when I remove:

X= X[:-forecast_out+1]
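
A likely reason removing that line helps, sketched under the assumption that the downloaded DataFrame had only about four rows (the traceback only confirms that y had 3 samples): with so few rows, forecast_out evaluates to 1, so -forecast_out+1 is 0 and X[:0] is an empty slice. That would match the [0, 3] in the error message. The sketch below uses plain Python lists as hypothetical stand-ins for X and y; NumPy arrays follow the same slicing rule along the first axis. The helper name forecast_horizon is made up for illustration.

```python
import math


def forecast_horizon(n_rows, fraction=0.01):
    """Same formula as the posted code: int(math.ceil(0.01 * len(df)))."""
    return int(math.ceil(fraction * n_rows))


# Hypothetical stand-in data: assume 4 rows before shift(), so 3 remain after dropna().
rows = [[1.0], [2.0], [3.0]]    # plays the role of X (3 samples)
labels = [10.0, 20.0, 30.0]     # plays the role of y (3 samples)

forecast_out = forecast_horizon(4)   # ceil(0.04) == 1
stop = -forecast_out + 1             # -1 + 1 == 0
sliced = rows[:stop]                 # rows[:0] keeps nothing

# Sample counts that train_test_split would be asked to reconcile.
print([len(sliced), len(labels)])
```

With a larger forecast_out the slice would not be empty, but it would still drop forecast_out - 1 rows from X while y keeps all of its rows, so the two lengths would still disagree. Printing X.shape and y.shape just before train_test_split is a quick way to confirm this on the real data.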
