Monday, June 4, 2012

Final Post

About the course

  Time goes by, and I'm about to finish this quarter and go back to my home university. I have gained a lot from this course, not just about the project and computer vision, but also about challenging myself and thinking for myself. I still remember the days when I was hesitating and didn't know what to do. But eventually I chose to believe in myself and finished this project. I'm grateful that this course gave me the opportunity to trust myself and work hard.
  During this quarter, the instructor helped a lot through feedback in class and by email. I'm thankful that he always gave good and sincere advice and always replied in time. It was also interesting and helpful to see the progress of my classmates in class and learn from it. Everyone's project is amazing and enlightening. I'm glad to have had such brilliant and interesting people accompany me during this quarter.

About the project

  So far I have posted all my steps on this blog. Below is a list of the milestones.
 

1. Software and Datasets

                                   Figure 0. The device I was using
  We use Matlab for the whole project. The datasets consist of pictures taken manually using an existing laser scanner.
   We have two datasets collected in different periods. The first dataset includes 10 kinds of material: (1) mouse pad (synthetic plastic), (2) paper, (3) tape, (4) skin, (5) metal, (6) foam grip, (7) plastic, (8) foam pad, (9) denim, (10) wood. All the pictures were taken in two exposures, so there are 20 pictures in all. The picture below shows the objects in the first dataset.


  The second dataset includes 24 pictures of 10 materials: (1) bread, (2) butter, (3) fiber, (4) denim (jeans), (5) metal, (6) paper, (7) plastic, (8) silk, (9) skin, (10) wood. All were taken in two exposures, and the dataset also includes two deformed pictures, of paper and denim.

2. L2 Distance

  In the project, I split the pictures in the datasets into training and testing data, and first used the L2 distance between them as the feature for matching. With the L2 distance method, I get 2 errors in 10 kinds of material.
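The matching idea can be sketched roughly as follows. This is only an illustration in Python, not the Matlab code used in the project: the function names and the three-number feature vectors are made up, and the real vectors would come from the scanner pictures. Each testing vector is assigned to the training material with the smallest L2 distance.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_material(test_vec, training):
    """Return the label of the training vector closest to test_vec."""
    return min(training, key=lambda label: l2_distance(test_vec, training[label]))

# Hypothetical toy feature vectors, one per material (not the project's data).
training = {
    "paper": [0.9, 0.8, 0.7],
    "denim": [0.2, 0.3, 0.1],
    "wood": [0.5, 0.5, 0.6],
}
test_vec = [0.25, 0.28, 0.15]  # hypothetical testing picture

predicted = nearest_material(test_vec, training)
print(predicted)
```

A testing picture counts as an error whenever the predicted label differs from its true material.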
   Originally the result was better, as shown in Figure 1 (the L2 distances between all ten pairs of training and testing data). In each group of 10 comparisons, the smallest distance is supposed to fall at positions 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 respectively, but in the fifth comparison the second distance is actually the shortest. We corrected this error once we found the special property of metal (mentioned before), and recognized metal by variance rather than by L2 distance.
                                      Figure 1: The chart of the first result
  After we modified our method, the confusion matrix showed a perfect result; see Figure 2.
Figure 2: The second result, after we modified the metal recognition
  But then we realised we needed to rectify the pictures to reduce the effect of position. After the rectification, the confusion matrix shows two errors; see Figure 3. Although the error count is worse, the result is more trustworthy because it depends less on position.
Figure 3: The third result, after we did the rectification
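Reading the number of errors off a confusion matrix can be sketched like this. It is a small Python illustration, not the project's Matlab code, and the predicted labels are invented for the example rather than taken from the figures. Rows are the true materials, columns are the recognized materials, and every count off the diagonal is an error.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Build an n_classes x n_classes matrix; rows are true, columns are predicted."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def count_errors(matrix):
    """Sum every off-diagonal entry, i.e. every wrong recognition."""
    return sum(
        count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j
    )

# Ten materials, indexed 0..9, one testing picture each.
true_labels = list(range(10))
# Hypothetical predictions with two mistakes (index 4 -> 1 and index 8 -> 9).
predicted_labels = [0, 1, 2, 3, 1, 5, 6, 7, 9, 9]

matrix = confusion_matrix(true_labels, predicted_labels, 10)
errors = count_errors(matrix)
print(errors)
```

A perfect result is a matrix with all counts on the diagonal, which gives an error count of zero.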


   Since variance and width could not improve the result any further, we end up with 2 errors in 10 kinds of material.


  We have also substituted pictures of the same material from the second dataset as testing data:
The result has one additional error, a confusion between materials 1 and 2. The replaced materials are 2 (paper), 9 (denim) and 10 (wood). This means the pictures of material 10 (wood) in the two datasets can be matched to each other.
 Figure 4: The result after mixing the data


3. K-means

   I also used the built-in function kmeans in Matlab to solve this problem. Using the k-means algorithm, the result on the original dataset is 3 errors in 10, as shown in Figure 5. The value of indx1 holds the cluster index of each point, and above it is the difference between the training and testing data's cluster indices. A nonzero difference means the training and testing pictures of a material fell into different clusters, so there are 3 errors.
Figure 5: The chart of the k-means result
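The comparison of cluster indices can be sketched as below. This is a plain Python stand-in written only for illustration: it is a basic version of Lloyd's algorithm with fixed starting centroids, not Matlab's kmeans, and the 2-D points are made up. The idea is the same as in the figure: cluster the training and testing points together, then subtract the indices pairwise and count the nonzero differences.

```python
def kmeans(points, centroids, max_iter=100):
    """Basic Lloyd's algorithm with caller-supplied initial centroids.

    Returns (cluster index of each point, final centroids).
    """
    centroids = [list(c) for c in centroids]
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid (squared L2 distance).
        new_assignment = [
            min(
                range(len(centroids)),
                key=lambda k: sum((x - c) ** 2 for x, c in zip(p, centroids[k])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical 2-D features for three materials (not the project's data).
training = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
testing = [[0.5, 0.2], [5.2, 4.7], [5.5, 4.0]]  # third one drifts toward material 2

indices, _ = kmeans(training + testing, centroids=training)
train_idx = indices[: len(training)]
test_idx = indices[len(training):]

# A nonzero difference means a training/testing pair landed in different clusters.
diff = [a - b for a, b in zip(train_idx, test_idx)]
errors = sum(1 for d in diff if d != 0)
print(diff, errors)
```

Starting from the training points as centroids keeps the sketch deterministic; a library k-means typically picks its starting points differently, so its cluster numbering can vary from run to run.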

4. Appendix

The new dataset is used for the rectification. The results are shown below:
The deformed pictures are on the left, and the pictures after rectification are on the right.


paper
denim

5. Conclusion

  As I said in the beginning, I am really happy that I had the chance to participate in this course and complete this project. Since I am an electronic engineering student, I really learned a lot about computer vision from this course, including some basic concepts of computer vision and how to work on a computer vision project. I also improved myself and met a lot of nice people in this course. Thanks to all of you who have helped me.
The end


Rectification and Other Details

Complement:


   This weekend I worked on supplementing my code by implementing the rectification and mixing the data from the two datasets.


1. Rectification


   I thought it would be nice if I could make the deformed pictures regular. Below are the pictures before and after rectification.

paper
denim
2. Data combination

Since the same kinds of material, paper and denim, appear in both datasets, I felt I should run the recognition across the two datasets. I replaced the original testing data with those two pictures from the second dataset, and got the results below.

Original result.
Current result (one additional error, a confusion between materials 1 and 2). The replaced materials are 2 (paper), 9 (denim) and 10 (wood). This means the pictures of material 10 (wood) in the two datasets can be matched to each other.


 The pictures in the two datasets were taken in different exposures, so this might be one of the reasons why the result has one more error than the original one. Also, the materials are slightly different, because different objects were chosen to represent each material in the two datasets. That is to say, although they are all denim or paper or wood, they might be different kinds of denim, paper or wood.

So it would be better to use the same exposure, take more pictures of the same object, and then do the recognition. This might be done in future work.