Using the tutorial on multiclass AdaBoost, I'm trying to classify some images that fall into two classes (I don't see why the algorithm shouldn't work when the problem is binary). Later I'm going to extend my samples to include other classes.

My current test is quite small: only 17 images in all, 10 for training and 7 for testing. (I'll add more samples later; right now I just want to get the code up and running.)

For now I have two classes: 0 (no vehicle) and 1 (vehicle present). I used integer labels because, according to the example in the link above, the training data consists of integer-based labels.

I've edited the provided example only a bit, to include my own image files, but I'm getting the following error:

Traceback (most recent call last):
  File "C:\Users\app\Documents\Python Scripts\carclassify.py", line 66, in <module>
    bdt_discrete.fit(X_train, y_train)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 389, in fit
    return super(AdaBoostClassifier, self).fit(X, y, sample_weight)
  File "C:\Users\app\Anaconda\lib\site-packages\sklearn\ensemble\weight_boosting.py", line 99, in fit
    X = np.ascontiguousarray(array2d(X), dtype=DTYPE)
  File "C:\Users\app\Anaconda\lib\site-packages\numpy\core\numeric.py", line 408, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
ValueError: setting an array element with a sequence.

The relevant part of my code is below:

f = open("PATH_TO_SAMPLES\\samples.txt",'r')
out = f.read().splitlines()
import numpy as np

imgs = []
tmp_hogs = []
# 13 of the images are with vehicles, 4 are without
labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0]

for file in out:
        filepath = "C:\PATH_TO_SAMPLE_IMAGES\\" + file
        curr_img = color.rgb2gray(io.imread(filepath))
        imgs.append(resize(curr_img,(60,40)))
        fd, hog_image = hog(curr_img, orientations=8, pixels_per_cell=(16, 16),
                 cells_per_block=(1, 1), visualise=True)
        tmp_hogs.append(fd)

img_hogs = np.array(tmp_hogs)
n_split = 10
X_train, X_test = img_hogs[:n_split], img_hogs[n_split:] # all first ten images with vehicles
y_train, y_test = labels[:n_split], labels[n_split:] # 3 images with vehicles, 4 without

#now all the code below is straight off the example on scikit-learn's website

bdt_real = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1)

bdt_discrete = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),
    n_estimators=600,
    learning_rate=1.5,
    algorithm="SAMME")

bdt_real.fit(X_train, y_train)
bdt_discrete.fit(X_train, y_train)

real_test_errors = []
discrete_test_errors = []

for real_test_predict, discrete_train_predict in zip(
        bdt_real.staged_predict(X_test), bdt_discrete.staged_predict(X_test)):
    real_test_errors.append(
        1. - accuracy_score(real_test_predict, y_test))
    discrete_test_errors.append(
        1. - accuracy_score(discrete_train_predict, y_test))

n_trees = xrange(1, len(bdt_discrete) + 1)

What am I doing wrong, and what needs to be fixed?

Edit

Here are the samples, in order to replicate the error:

  1. positive sample 1

  2. positive sample 2

  3. positive sample 3

  4. positive sample 4

  5. positive sample 5

  6. positive sample 6

  7. positive sample 7

  8. positive sample 8

  9. positive sample 9

  10. positive sample 10

  11. positive sample 11

  12. positive sample 12

  13. positive sample 13

  14. negative sample 1

  15. negative sample 2

  16. negative sample 3

  17. negative sample 4

asked Apr 11 '14 at 09:28

asaaki

edited Apr 13 '14 at 03:26


One Answer:

Could you provide a sample so that I can try and reproduce the error?

The error itself means NumPy could not pack your data into a regular numeric array. Typical causes are rows of a matrix with different lengths, or elements of mixed type (float, int, string).
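A minimal sketch of how this error can arise, assuming the rows handed to NumPy have different lengths. The row values and the helper `try_build_matrix` are made up for illustration and are not taken from the question's data.

```python
import numpy as np

# Hypothetical feature vectors of unequal length, standing in for per-image descriptors.
ragged_rows = [[0.1, 0.2, 0.3], [0.4, 0.5]]


def try_build_matrix(rows):
    """Return (array, None) on success or (None, error message) on failure."""
    try:
        # Forcing a float dtype makes NumPy insist on a rectangular layout.
        return np.array(rows, dtype=np.float64), None
    except ValueError as exc:
        return None, str(exc)


matrix, error = try_build_matrix(ragged_rows)
print(error)  # the message mentions "setting an array element with a sequence"
```

Without an explicit float dtype, older NumPy versions may build an object array from ragged rows instead of failing, which would be consistent with the traceback only appearing once `fit` converts `X` to a float array.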

answered Apr 11 '14 at 22:48

Daniel E Margolis

Sure, I've edited the question to add the links to all 17 images. I tried uploading a zipped file so I could just link you to that, but unfortunately that didn't work out. So I'm afraid you'll have to download these individually!

So I used the first ten images for training and the rest for testing. I know a real classifier will need tons more images but like I said I'll add them later.

(Apr 12 '14 at 03:17) asaaki

Before you actually run any of your classifiers, check the length of each feature vector in your training set. In other words, run something like:

X_train, X_test = img_hogs[:n_split], img_hogs[n_split:]

for x in X_train:
    print(len(x))

Do they match?
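The same check can be made automatic. The sketch below assumes the features are 1-D NumPy vectors; `stack_features` is a hypothetical helper name and the vectors are dummy data.

```python
import numpy as np


def stack_features(feature_vectors):
    """Stack 1-D feature vectors into a 2-D matrix, failing early on mismatched lengths."""
    lengths = sorted({len(v) for v in feature_vectors})
    if len(lengths) != 1:
        raise ValueError("feature vectors have differing lengths: %s" % lengths)
    return np.vstack(feature_vectors)


# Hypothetical descriptors: three vectors of length 4.
features = [np.zeros(4), np.ones(4), np.full(4, 0.5)]
X = stack_features(features)
print(X.shape)  # (3, 4)
```

Raising here reports the offending lengths directly, rather than leaving the mismatch to surface later inside the classifier's `fit`.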

(Apr 12 '14 at 09:39) Daniel E Margolis

I've done that, and they don't match. Some have length 1728, others 1344, others 7600, and so on. But I've already resized the images to the same size, so what's wrong?

(Apr 13 '14 at 03:23) asaaki

Do you get the same errors if you run my code with the given samples?

(Apr 13 '14 at 07:40) asaaki

Well, there were a few lines I had to add (you didn't include some of the imported libraries), but I do get an error with your samples. When I only use those samples with the same size, I don't get an error.

(Apr 13 '14 at 09:39) Daniel E Margolis

But how is this happening? Isn't resize(curr_img,(60,40)) supposed to take care of that?

(Apr 13 '14 at 14:53) asaaki
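
One thing visible in the question's code: `hog` is called on `curr_img`, while the resized copy is only appended to `imgs`. The descriptor length would therefore follow each original image's size. The sketch below assumes the HOG length is (number of blocks) x (cells per block) x (orientations), with cells counted by floor division. `hog_feature_length` is a hypothetical helper, not part of scikit-image, and the example sizes are invented; they merely reproduce two of the lengths reported above.

```python
def hog_feature_length(image_shape, orientations=8,
                       pixels_per_cell=(16, 16), cells_per_block=(1, 1)):
    """Estimate the HOG descriptor length for a 2-D image of the given shape."""
    rows, cols = image_shape
    # Whole cells that fit along each axis; leftover pixels are ignored.
    cells_r = rows // pixels_per_cell[0]
    cells_c = cols // pixels_per_cell[1]
    # Blocks slide over the cell grid one cell at a time.
    blocks_r = cells_r - cells_per_block[0] + 1
    blocks_c = cells_c - cells_per_block[1] + 1
    return (blocks_r * blocks_c
            * cells_per_block[0] * cells_per_block[1] * orientations)


# Length for an image resized to (60, 40), as in the question.
resized_length = hog_feature_length((60, 40))

# Invented original sizes, chosen only to illustrate differing lengths.
example_sizes = [(192, 288), (400, 608)]
original_lengths = [hog_feature_length(size) for size in example_sizes]
print(resized_length, original_lengths)
```

Under that assumption, passing the resized array to `hog` instead of `curr_img` would give every image a descriptor of the same length.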

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.