Monday, April 30, 2012

The First Set of Features

  This weekend I have been measuring the similarity between the training and testing data. Each data vector holds the mean values of the columns taken from either the odd or the even quarters of an image. It was pointed out that the two sets were too similar when I simply took the odd and even columns, so I switched to odd and even quarters, which makes the two sets differ more, as shown below.
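To make the split concrete, here is a minimal Python sketch. It rests on an assumption the post does not spell out: that "quarters" means four equal-width vertical strips, with strips 1 and 3 feeding the training vector and strips 2 and 4 feeding the testing vector. The function names `column_means` and `split_odd_even_quarters` are made up for illustration, and this is not the Matlab code I actually ran.

```python
def column_means(rows):
    """Mean of each column of a 2-D list whose rows all have equal length."""
    n_rows = len(rows)
    return [sum(col) / n_rows for col in zip(*rows)]


def split_odd_even_quarters(image):
    """Return (train_vector, test_vector) of column means.

    ASSUMPTION: the image is cut into four equal-width vertical strips;
    strips 1 and 3 (odd) form the training data, strips 2 and 4 (even)
    form the testing data. Leftover columns on the right are ignored.
    """
    width = len(image[0])
    q = width // 4
    strips = [[row[i * q:(i + 1) * q] for row in image] for i in range(4)]
    # Glue the two odd strips (and the two even strips) side by side.
    train = [a + b for a, b in zip(strips[0], strips[2])]
    test = [a + b for a, b in zip(strips[1], strips[3])]
    return column_means(train), column_means(test)
```

If "quarters" actually means something else (for example, a 2x2 grid of blocks), only the slicing inside `split_odd_even_quarters` would change; the column averaging stays the same.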
   I used Matlab to calculate the L2 norm (not squared) of the difference between each testing vector and each training vector, and plotted the results as the histogram below. It represents the distance matrix.
  The first 10 columns show the distances between the first testing vector and all of the training vectors. We can see that the first testing vector matches the first training vector (the distance is almost zero). Likewise, the second group of 10 columns shows that the second testing vector matches the second training vector, and so on.
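The distance matrix and the matching rule described above can be sketched in a few lines of Python. This is an illustration only, not the Matlab code behind the histogram; the function names and the toy feature vectors are invented for the example.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors, not squared."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(test_features, train_features):
    """Row i holds the distances from testing vector i to every training vector."""
    return [[l2_distance(t, r) for r in train_features] for t in test_features]


def nearest_training_index(dist_row):
    """Index of the training vector with the smallest distance in one row."""
    return min(range(len(dist_row)), key=dist_row.__getitem__)


# Toy data (hypothetical): three "materials", two features each.
train = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
test = [[0.1, 0.0], [3.0, 4.1], [9.0, 10.0]]

dists = distance_matrix(test, train)
matches = [nearest_training_index(row) for row in dists]

# Flattening row by row gives the bar order used in the histogram:
# the first len(train) bars belong to the first testing vector, and so on.
bars = [d for row in dists for d in row]
```

With 10 materials the flattened list would have 100 bars, and a correct match shows up as the near-zero bar at position i inside the i-th group of 10.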
  Interestingly, the histogram shows that the distance between materials 2 and 5 is small enough that we could confuse them if we used only this histogram to categorize them. Let's look at the mean values of their columns below: they are really similar. But the original pictures are clearly different (they are actually paper and metal). So it must be the averaging step that makes them hard to tell apart. In conclusion, mean values are useful, but not enough.
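A tiny Python example shows why averaging can hide the difference. The two patches below are invented (they are not the paper and metal images): one is flat gray and the other has horizontal stripes, yet their column means are identical, so the L2 distance between their mean vectors is zero. The column standard deviation appears here only as one possible extra statistic that does separate them; it is not a feature I have tested yet.

```python
import math


def column_means(rows):
    """Mean of each column of a 2-D list."""
    n_rows = len(rows)
    return [sum(col) / n_rows for col in zip(*rows)]


def column_stds(rows):
    """Population standard deviation of each column (hypothetical extra feature)."""
    means = column_means(rows)
    n_rows = len(rows)
    return [
        math.sqrt(sum((v - m) ** 2 for v in col) / n_rows)
        for col, m in zip(zip(*rows), means)
    ]


# Two made-up 4x4 patches with very different appearance.
flat = [[5, 5, 5, 5] for _ in range(4)]
striped = [[0, 0, 0, 0], [10, 10, 10, 10], [0, 0, 0, 0], [10, 10, 10, 10]]

same_means = column_means(flat) == column_means(striped)   # means cannot tell them apart
different_stds = column_stds(flat) != column_stds(striped)  # the spread can
```

Any texture whose variation cancels out within each column behaves like the striped patch here, which is consistent with two visually different materials ending up with nearly the same mean vector.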
