Python Forum
ValueError: Found input variables with inconsistent numbers of samples
I'm trying to write a bounding box regression training script with Keras and TensorFlow for object detection. I have a dataset of 3153 images (.jpg files) and a txt file of bounding box annotations with 6430 lines (some images have multiple bounding boxes). Here is part of the txt file, to show what it looks like:

2007_000027 101 174 351 349
2007_000032 180 195 229 213
2007_000032 189 26 238 44
2007_000129 1 74 462 272
2007_000129 19 252 487 334
2007_000170 91 3 206 43
2007_000170 28 4 372 461
2007_000272 71 25 500 304
2007_000323 3 277 375 500
2007_000323 3 12 375 305
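
Because one image name can appear on several lines, the line count and the image count differ. The following is a minimal sketch, assuming each line is "name startX startY endX endY" separated by whitespace as in the sample above. The helper group_boxes is hypothetical and is not part of the training script.

```python
from collections import defaultdict

# Sample rows copied from the annotation excerpt above.
rows = [
    "2007_000027 101 174 351 349",
    "2007_000032 180 195 229 213",
    "2007_000032 189 26 238 44",
    "2007_000129 1 74 462 272",
    "2007_000129 19 252 487 334",
    "2007_000170 91 3 206 43",
    "2007_000170 28 4 372 461",
    "2007_000272 71 25 500 304",
    "2007_000323 3 277 375 500",
    "2007_000323 3 12 375 305",
]


def group_boxes(rows):
    """Hypothetical helper: map each image name to its list of boxes."""
    boxes = defaultdict(list)
    for row in rows:
        filename, start_x, start_y, end_x, end_y = row.split()
        boxes[filename].append(
            (int(start_x), int(start_y), int(end_x), int(end_y))
        )
    return dict(boxes)


boxes_by_image = group_boxes(rows)
print(len(rows), "annotation lines for", len(boxes_by_image), "images")
for name, boxes in boxes_by_image.items():
    print(name, len(boxes))
```

Grouping like this only counts boxes per image; it does not change the annotation file.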
I created a configuration file, which stores the file paths and training settings:
import os

BASE_PATH = "dataset"
IMAGES_PATH = os.path.sep.join([BASE_PATH, "images"])
ANNOTS_PATH = os.path.sep.join([BASE_PATH, "bboxes.txt"])

BASE_OUTPUT = "output"
MODEL_PATH = os.path.sep.join([BASE_OUTPUT, "detector.h5"])
PLOT_PATH = os.path.sep.join([BASE_OUTPUT, "plot.png"])
TEST_FILENAMES = os.path.sep.join([BASE_OUTPUT, "test_images.txt"])

INIT_LR = 1e-4
NUM_EPOCHS = 25
BATCH_SIZE = 32
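The second file contains the code to train on my data:
import os

import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import img_to_array, load_img

import config  # assumed module name for the configuration file above

print("INFO - loading dataset...")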
rows = open(config.ANNOTS_PATH).read().strip().split("\n")
data = []
targets = []
filenames = []

for row in rows: 
    row = row.split(' ')
    (filename, startX, startY, endX, endY) = row
    suffix = ".jpg"
    imagePath = os.path.sep.join([config.IMAGES_PATH, filename+suffix])
    image = cv2.imread(imagePath)
    (h, w) = image.shape[:2]

    startX = float(startX) / w
    startY = float(startY) / h
    endX = float(endX) / w
    endY = float(endY) / h

    image = load_img(imagePath, target_size=(224, 224))
    image = img_to_array(image)

    data.append(image)
    targets.append((startX, startY, endX, endY))
    filenames.append

data = np.array(data, dtype="float32") / 255.0
targets = np.array(targets, dtype="float32")

split = train_test_split(data, targets, filenames, test_size=0.10, random_state=42)

(trainImages, testImages) = split[:2]
(trainTargets, testTargets) = split[2:4]
(trainFilenames, testFilenames) = split[4:]

print("INFO - saving testing filenames...")
with open(config.TEST_FILENAMES, "w") as f:
    f.write("\n".join(testFilenames))

vgg = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))
vgg.trainable = False

flatten = vgg.output
flatten = Flatten()(flatten)

bboxHead = Dense(128, activation="relu")(flatten)
bboxHead = Dense(64, activation="relu")(bboxHead)
bboxHead = Dense(32, activation="relu")(bboxHead)
bboxHead = Dense(4, activation="sigmoid")(bboxHead)

model = Model(inputs=vgg.input, outputs=bboxHead)

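# learning_rate replaces the deprecated lr argument
opt = Adam(learning_rate=config.INIT_LR)
model.compile(loss="mse", optimizer=opt)
# summary() prints the table itself and returns None
model.summary()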

print("INFO - training bounding box regressor...")
H = model.fit(
    trainImages, trainTargets, 
    validation_data=(testImages, testTargets), 
    batch_size=config.BATCH_SIZE, epochs=config.NUM_EPOCHS, verbose=1)

print("INFO - saving objects detector model...")
model.save(config.MODEL_PATH, save_format="h5")

N = config.NUM_EPOCHS
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.title("Bounding box regression loss on training set")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend(loc="lower left")
plt.savefig(config.PLOT_PATH)
When I run my code I get the following error:
Traceback (most recent call last):
  File "/Users/username/Downloads/od/train.py", line 47, in <module>
    split = train_test_split(data, targets, filenames, test_size=0.10, random_state=42)
  File "/usr/local/lib/python3.9/site-packages/sklearn/model_selection/_split.py", line 2430, in train_test_split
    arrays = indexable(*arrays)
  File "/usr/local/lib/python3.9/site-packages/sklearn/utils/validation.py", line 433, in indexable
    check_consistent_length(*result)
  File "/usr/local/lib/python3.9/site-packages/sklearn/utils/validation.py", line 387, in check_consistent_length
    raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [6430, 6430, 0]

I understand that the number of annotation lines is not equal to the number of images, but I can't change the data in the txt file. Can someone help me correct this code so that it trains on my data properly?

Thanks!
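
A note on the traceback above: the reported sample counts are [6430, 6430, 0], so data and targets already have matching lengths and only the third argument, filenames, is empty. That is consistent with the bare filenames.append line in the loop, which names the method but never calls it. Below is a minimal sketch of that behaviour, assuming three sample rows from the annotation file and skipping the image loading (a simplification, not the full script).

```python
# Three sample rows copied from the annotation excerpt in the post.
rows = [
    "2007_000027 101 174 351 349",
    "2007_000032 180 195 229 213",
    "2007_000032 189 26 238 44",
]

targets = []
filenames = []
for row in rows:
    filename, startX, startY, endX, endY = row.split(" ")
    targets.append((float(startX), float(startY), float(endX), float(endY)))
    filenames.append  # bare attribute access: nothing is appended

lengths_without_call = [len(targets), len(filenames)]

# Same loop, but the method is actually called with the filename.
filenames = []
for row in rows:
    filename = row.split(" ")[0]
    filenames.append(filename)

lengths_with_call = [len(targets), len(filenames)]

print(lengths_without_call)
print(lengths_with_call)
```

With the call in place, the three lists have one entry per annotation line, which is the condition train_test_split checks before splitting.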