Monday, June 4, 2012

Final Post

About the course

  Time has passed quickly; this quarter is ending and I will go back to my home university. I have gained a lot from this course, not just about the project and computer vision, but also about challenging myself and thinking for myself. I still remember the days when I was hesitant and didn't know what to do, but eventually I chose to believe in myself and finished the project. I'm grateful that this course gave me the opportunity to trust myself and work hard.
  During this quarter, the instructor helped a lot through in-class feedback and emails. I'm thankful that he always gave good, sincere advice and replied promptly. It was also interesting and helpful to see other classmates' progress in class and learn from it. Everyone's project was amazing and enlightening. I'm glad that such brilliant and interesting people accompanied me through the quarter.

About the project

  So far I have posted all my steps on this blog. Below is a list of the milestones.
 

1. Software and Datasets

Figure 0: The device I was using
  We used Matlab for the whole project, and the dataset consists of pictures taken manually with an existing laser scanner.
  We have two datasets, collected in different periods. The first dataset includes 10 kinds of material: (1) mouse pad (synthetic plastic), (2) paper, (3) tape, (4) skin, (5) metal, (6) foam grip, (7) plastic, (8) foam pad, (9) denim, (10) wood. All the pictures were taken at two exposures, so there are 20 pictures in all. The picture below shows the objects in the first dataset.


  The second dataset includes 24 pictures of 10 materials: (1) bread, (2) butter, (3) fiber, (4) denim (jeans), (5) metal, (6) paper, (7) plastic, (8) silk, (9) skin, (10) wood. All were taken at two exposures, and the set also includes two deformed pictures, of paper and denim.

2. L2 Distance

  In the project, I split the pictures in the datasets into training and testing data, and first used the L2 distance between them as the matching criterion. The L2 distance method ends with 2 errors in 10 kinds of material.
  Originally the result was better. As shown in Figure 1 (the L2 distances between all ten pairs of training and testing data), the smallest distance in each group of 10 comparisons should fall on the matching material (positions 1, 2, ..., 10 in turn), but in the fifth comparison the second distance is actually the smallest. We fixed this error after finding the special property of metal (mentioned before), recognizing metal by its variance instead of by L2 distance; a sketch of the matching step follows Figure 1.
Figure 1: The chart of the first result
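To be concrete, here is a minimal sketch of the matching step in Matlab, assuming each picture has already been reduced to a feature vector (the variable names train and test are hypothetical):

% Minimal sketch of the L2 matching step (variable names hypothetical).
% train, test: 10 x D matrices, one row of features per material.
nMat = size(train, 1);
pred = zeros(nMat, 1);
for i = 1:nMat
    % L2 distance from test sample i to every training sample
    d = sqrt(sum((train - repmat(test(i, :), nMat, 1)).^2, 2));
    [~, pred(i)] = min(d);          % nearest training sample wins
end
errors = sum(pred ~= (1:nMat)');    % count mismatched materials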
  After we modified our method, the confusion matrix became perfect; see Figure 2.
Figure 2: The second result, after modifying the metal recognition
  But then we realized we needed to rectify the pictures to remove the influence of the stripe's position. After the rectification, the confusion matrix shows two errors; see Figure 3. Although the numbers got worse, the evaluation is more trustworthy.
Figure 3: The third result, after rectification


   Since neither variance nor width improved the result further, we ended up with 2 errors in 10 kinds of material.


  We also substituted pictures of the same materials from the second dataset into the test set:
The result has one more error, a confusion between materials 1 and 2; the replaced materials were 2 (paper), 9 (denim), and 10 (wood). This means the pictures of material 10 (wood) in the two datasets can still be matched to each other.
 Figure 4: The result after mixing up the data


3. K-means

   I also used Matlab's built-in kmeans function on this problem. With the k-means algorithm, the result on the original dataset is 3 errors in 10, as shown in Figure 5. The value of indx1 holds the cluster index of each point, and the difference between the training and testing data's cluster indices, shown above it, reveals the 3 errors.
Figure 5: The chart of the k-means result

4. Appendix

The new dataset was used to test the rectification. The results are shown below:
The deformed pictures are on the left, and the rectified pictures are on the right.


paper
denim

5. Conclusion

  As I said in the beginning, I am really happy that I had the chance to take this course and complete this project. As an electronic engineering student, I learned a lot about computer vision, from basic concepts to how to carry out a computer vision project. I also grew through the experience and met a lot of nice people. Thanks to all of you who helped me.
The end


Rectification and Other Details

Supplement:


   This weekend I worked on extending my code: implementing the rectification and mixing the data from the two datasets.


1. Rectification


   I thought it would be nice if I could make the deformed pictures regular. Below are the pictures before and after rectification, followed by a sketch of the per-column approach.

paper
denim
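The rectification works column by column. Here is a minimal sketch of the idea, assuming img is a grayscale double image of the laser stripe (the variable names are hypothetical):

% Per-column rectification sketch: center the stripe in every column.
[rows, cols] = size(img);
rect = zeros(rows, cols);
for c = 1:cols
    [~, pk] = max(img(:, c));          % row of the brightest pixel
    shift = round(rows / 2) - pk;      % move the peak to the center row
    rect(:, c) = circshift(img(:, c), shift);
end
% A deformed stripe becomes a straight horizontal line, so its
% intensity profile can be compared with the regular pictures.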
2. Data combination

Since the same kinds of material, paper and denim, appear in both datasets, I wanted to try recognition across the two datasets. I replaced the corresponding original testing data with the pictures from the second dataset, and got the results below.

Original result.
Current result (with one added error, a confusion between materials 1 and 2); the replaced materials were 2 (paper), 9 (denim), and 10 (wood). This means the pictures of material 10 (wood) in the two datasets can be matched to each other.


 The pictures in the two datasets were taken at different exposures, which might be one reason why the result has one more error than the original. The materials are also slightly different, because the objects chosen to represent each material differ between the two datasets. That is, although they are all denim, paper, or wood, they may be different kinds of denim, paper, or wood.

So it would be better to use the same exposure and take more pictures of the same objects before doing the recognition. This is left for future work.

Tuesday, May 29, 2012

Variance and Width

1. Variance
  In the last post, I mentioned that my next step was to use variance to fix some errors in material recognition. It turns out that this doesn't work. For some specific materials, such as metal, variance is a good measurement; but for error correction, variance would need to be reliable across all materials, and it isn't, as the picture below shows. I sorted the variances of all the training and testing data, and ind1 and ind2 show the resulting orderings; for the last 6 materials, the variances of the training and testing data do not correspond to each other.
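Here is a minimal sketch of that check, assuming the features live in trainFeat and testFeat (hypothetical names), one row per material:

% Sort the variances of both halves and compare the orderings.
vTrain = var(trainFeat, 0, 2);      % one variance per material
vTest  = var(testFeat, 0, 2);
[~, ind1] = sort(vTrain);
[~, ind2] = sort(vTest);
% If variance were a reliable cue, ind1 and ind2 would be identical;
% in my data the last 6 entries disagree.
disp([ind1 ind2]);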
2. Width
   When I used width to help recognize the materials, the errors increased to 5 in 10 (originally 2 in 10), probably because widths are not distinctive across materials.


  So I am going to stop at 2 errors in 10 materials, and start working on rectifying each picture's deformation, which is the main problem with the new dataset.

Monday, May 28, 2012

Clustering using K-means algorithm

  This weekend I tried to find a new way to cluster the materials. I looked into VLFeat, and among its algorithms k-means seemed a good fit, so I simply used the built-in kmeans function in Matlab to recognize the materials. Below are the results I got.
  As in the original method, I divided each picture (13 pictures, including 3 deformed ones) into training and testing data, and tried to cluster all 26 samples into 10 centers.
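A minimal sketch of that setup (trainFeat and testFeat are hypothetical feature matrices; kmeans is in the Statistics Toolbox):

% Cluster the 26 samples (training and testing halves of the 13
% pictures) into 10 centers.
X = [trainFeat; testFeat];                % 26 x D feature matrix
indx1 = kmeans(X, 10, 'Replicates', 5);   % rerun to avoid bad local minima
% A pair is consistent when both halves land in the same cluster;
% mismatches between the two halves count as errors.
n = size(trainFeat, 1);
errors = sum(indx1(1:n) ~= indx1(n+1:end));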
  This result shows the clustering results of the training and testing data:
This result shows the distance between every center and every point (each line represents a center).


This result shows the differences between the clustering results of the training and testing data.

This time it gets confused between butter and denim. Again, I'm fairly sure this can be solved by adding variance as a factor. The failure to recognize the deformed pictures also reminds me that last time I only rectified the peak position of each picture as a whole; I still need to rectify all the peaks within a picture (split the picture into small columns and rectify each column), so that deformed pictures become regular ones.
So those are my next two steps.


Wednesday, May 23, 2012

New Dataset and Some Experiments

  Last week I collected my new dataset with 10 kinds of materials (some already existed in the original dataset; some are brand new, like bread and butter). For each material we took at least 2 pictures, at high and low exposures:

1. Paper 4 (including 2 deformed ones)
2. Silk 2
3. Denim 6 (including 2 deformed ones; because the pictures were taken from jeans, I was advised to take 2 more pictures in the orthogonal direction)
4. Wood 2
5. Bread 2
6. Butter 2
7. Plastic 2
8. Fibre 2
9. Metal 2
10. Skin 2


Below are pictures of some of the materials:


1. Bread (low, high)
2. Butter (low, high)
3. Deformed denim (low, high)
4. Denim (low, high)
5. Paper (low, high)
6. Deformed paper (low, high)
7. Fibre (low, high)
 I have also worked on my code to align the peak of each column to the center of the image; the result on the first dataset (2 errors in 10):

 I also ran the code on the new dataset (4 errors in 10):

 So I still need to figure out some ways to improve my code.

Monday, May 14, 2012

Improving the Original Code

1. Revising the "confusing matrix" into a confusion matrix
   I studied the confusionmat function in Matlab and used it to represent the result more clearly:
From the confusion matrix above we can see the wrong prediction of material 5 (based on the low-exposure pictures).
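The usage is simple; a minimal sketch, assuming pred holds the predicted labels from the L2 matcher (hypothetical name):

% Build the confusion matrix; confusionmat is in the Statistics Toolbox.
trueLabels = (1:10)';                % one test sample per material
C = confusionmat(trueLabels, pred);
% Rows are true materials, columns are predictions; off-diagonal
% entries mark errors, such as the one for material 5.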


2. Multi-classification
  I used variance to recognize metal, and it fixes the wrong prediction above (based on the low-exposure pictures).
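A minimal sketch of the override, assuming metal's stripe response has an unusually high variance (the threshold value and the direction of the comparison are my assumptions here, not measured values):

% Variance-based metal override; material 5 is metal in this dataset.
metalThresh = 0.05;                        % hypothetical threshold
for i = 1:numel(pred)
    if var(testFeat(i, :)) > metalThresh   % variance flags metal
        pred(i) = 5;
    end
end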




3. Trying to use two exposures
  Last week I thought I could subtract the low-exposure pictures from the high-exposure pictures to get pictures without background. But now I find that the low-exposure pictures alone are good enough for this problem: they already contain no background information (as below).


 More importantly, when the low-exposure pictures are subtracted from the high-exposure pictures, the important information is lost too, as in the pictures below. So this is not a good way to suppress the ambient light.


But since the low-exposure pictures carry much less information than the high-exposure ones, I'm still trying to figure out a way to combine the two. (Simply averaging the two pictures just makes the high-exposure pictures darker.)

Any suggestions?
There is also a serious alignment problem. Many low-exposure pictures are not aligned, and I suspect the good results in parts 1 and 2 partly come from these location differences. My next step is to align between pictures, or even within a picture (like the one shown below). Several groups in the class are working on alignment, so it may be best to ask them first.

Any suggestions?
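One option I might try for between-picture alignment is plain cross-correlation, e.g. normxcorr2 from the Image Processing Toolbox. A sketch, assuming the misalignment is a pure translation (the patch coordinates are hypothetical):

% Pick a patch from the low-exposure image as a template, then find
% its offset in the high-exposure image by normalized cross-correlation.
tpl = imLow(101:200, 101:300);             % hypothetical patch
xc  = normxcorr2(tpl, imHigh);
[~, idx] = max(xc(:));
[r, c]   = ind2sub(size(xc), idx);
rowShift = (r - size(tpl, 1)) - 100;       % offset relative to patch origin
colShift = (c - size(tpl, 2)) - 100;
aligned  = circshift(imLow, [rowShift, colShift]);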


To sum up:
  • Questions: 1. How to combine two exposures? 2. How to align pictures? (Is there an existing function?)
  • Next steps: 1. Combination and alignment. 2. Collecting the next dataset.

Monday, May 7, 2012

A related paper

  A week ago, the professor gave me a related paper as an important reference, recently published at PROCAMS 2012. The link is http://www.cs.cmu.edu/~ILIM/publications/PDFs/MKSN-PROCAMS12.pdf. I have read it in detail and found that the paper is built on a device more sophisticated than ours, which can sweep red, green, and blue lasers at high rates (18 kHz horizontally and 60 Hz vertically) and can therefore use a filter to easily block unwanted ambient light. Although they use a different device for different goals, their methods give me many cues for conducting my experiment.
The structure of the paper is as follows:
  •  types
 "We discuss how the line-striping acts as a kind of “light-probe”, creating distinctive patterns of light scattered by different types of materials.
We investigate visual features that can be computed from these patterns and can reliably identify the dominant material characteristic of a scene, i.e. where most of the objects consist of either diffuse (wood), translucent (wax), reflective (metal) or transparent (glass) materials."
 
The types they are looking into:
  1. diffuse (wood): Lambertian materials
  2. translucent (wax): dispersive media and subsurface-scattering materials
  3. reflective (metal): reflective surfaces
  4. transparent (glass): refractive surfaces
  •  goals
  1. The first is low-power and low-cost reconstruction of diffuse scenes under strong ambient lighting (e.g. direct sunlight).
  2. The other application of our sensor relates to the scene’s material properties.

  •  method we can use for reference
  1. Ambient light suppression
  "Lastly the background can be suppressed by taking an image with the projector on and one with the projector off.  It is not  actually necessary to shut the projector off; instead, we choose a different trigger delay which effectively moves the location of the projected line. In this way, one gets two images with the same back-ground but with different projected lines. Subtracting one from the other and keeping only the positive values gives us a single line-stripe."

 I can similarly use two pictures, with different exposures at different positions, to subtract the background from the response.
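A minimal sketch of this suppression idea in Matlab (the file names are placeholders):

% Subtract two images that share the same background but have the
% stripe in different places; keep only the positive residual.
imA = im2double(imread('stripe_posA.png'));   % placeholder file names
imB = im2double(imread('stripe_posB.png'));
stripe = max(imA - imB, 0);                   % background cancels out
imshow(stripe);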
   2. Fast per-frame analysis
"Fig. 6 shows a per-frame analysis of a scene with milky
water bottles and another with glass objects. Our method has five steps:
(1) For each column, find the maximum intensity pixel
(2) At this pixel, apply two filters (see figure inset),
(3) If filter 1’s response is greater than a threshold,it is glass
(4) Otherwise, if the response to the second filter
is greater than a second threshold, label as milk. If there is no labeling, then it is a diffuse material. In the figure, we have marked glass as red, milk as blue and diffuse as green.
The biggest errors are for clear glass when the camera sees
mostly the background. This is a fast classification, since
for each column the filters are only applied once."

I can use a similar method to classify glass and other materials.
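As a starting point, here is a rough sketch of that per-column scheme; the filters and thresholds below are placeholders, not the paper's actual values:

% Per-column classification sketch (f1/f2 and the thresholds are
% hypothetical, not the paper's actual filters).
f1 = [1 -2 1];  t1 = 0.3;           % "glass" filter and threshold
f2 = [1  0 -1]; t2 = 0.2;           % "milk" filter and threshold
labels = zeros(1, size(img, 2));    % 0 = diffuse, 1 = glass, 2 = milk
for c = 1:size(img, 2)
    [~, r] = max(img(:, c));                  % brightest pixel in column
    win = img(max(r-1, 1):min(r+1, end), c)'; % small window at the peak
    win(end+1:3) = 0;                         % pad at image borders
    if abs(win * f1') > t1
        labels(c) = 1;                        % glass
    elseif abs(win * f2') > t2
        labels(c) = 2;                        % milk
    end
end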


 3. Full scan analysis
  • diffuse (plastic)
"First we point out that simple detection of a single, sharp peak in a scene point’s appearance profile
[13] strongly suggests lambertian/diffuse reflectance. If the
profile has no peak, then the projector is not illuminating this pixel and therefore it is in shadow. "


They identify the diffuse material by looking at the number of intensity maxima, which may not be useful in our experiment.
  • scattering and subsurface scattering (wax)
"Figure 8. We take the power spectrum of the three dimensional Fourier transform of each scan video, and integrate the time frequency dimension. The resulting 2D matrix is mostly sparse. Low non-zero support gives an indication of scattering and subsurface scattering."

  We may not be able to compute such spectra from our dataset, but I may try variance to detect this feature.

  •  distinguishing between reflective and refractive surfaces (metal and glass)
  "We have empirically found that the number of intensity maxima in the appearance profile at each pixel can be very discriminative. An intuitive explanation is that since reflective caustics are caused by opaque objects, the number of observed caustics at each scene point is less than in a refractive material, where the camera can view through the material onto the diffuse background, allowing the observance of many more caustics."


Glass shows more caustics than metal, so I can use this feature too.
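Counting the maxima is easy to prototype. A sketch, assuming profile is a 1-D intensity trace for one scene point (findpeaks is in the Signal Processing Toolbox; the threshold is a placeholder):

% Count intensity maxima in a pixel's appearance profile.
[pks, locs] = findpeaks(profile, 'MinPeakHeight', 0.1);
nMaxima = numel(pks);
% Few maxima hints at a reflective (metal) surface; many maxima
% hints at a refractive (glass) surface.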
 "In (c) we show the raw features obtained from a low-res histogram of gradients (HOG). The top three discriminative features (d) for metal and glass show promise, but we believe more data is needed before a
discriminative hyperplane can be learned."
Another way to discriminate metal and glass, but maybe can't be used in my experiment.
  •  methods my experiment can use
  1. We need to do ambient light suppression.
  2. We can use filters to detect the main differences.
  3. The number of intensity maxima can also be an important feature.